Example output
Analysis of Customer Purchase Behavior Dataset
Dataset Description:
The "Customer Purchase Behavior" dataset contains 1,000 data points from an e-commerce company. It includes information about customer demographics, purchase history, and website interaction data. The variables of interest are age, gender, income, purchase frequency, and time spent on the website.
Research Question:
We aim to investigate whether there is a significant correlation between customer age and purchase frequency, and whether this relationship is influenced by customer income and time spent on the website.
Statistical Techniques:
Data Preprocessing:
Missing values in numerical features were imputed with the feature mean.
Outliers were identified and treated using z-scores.
Data was transformed to normalize variables.
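The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the pipeline actually used for the dataset. The sample ages are hypothetical, the cutoff of 3 standard deviations is an assumed convention, dropping outliers is only one possible way of "treating" them, and z-score standardization is assumed as the normalizing transformation.

```python
import statistics

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in values]

def z_scores(values):
    """Return (x - mean) / sample standard deviation for each value."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def drop_outliers(values, threshold=3.0):
    """Keep values whose |z-score| is at most the threshold (assumed cutoff)."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) <= threshold]

# Hypothetical ages with one missing entry and one extreme value
ages = [25, 31, None, 38, 42, 29, 35, 33, 40, 27, 36, 30, 120]

imputed = impute_mean(ages)          # fill the missing age with the mean
cleaned = drop_outliers(imputed)     # remove the extreme value
standardized = z_scores(cleaned)     # rescale to mean 0, standard deviation 1

print(len(imputed), len(cleaned))
```

Mean imputation keeps the sample size intact but shrinks the variance slightly, which is worth keeping in mind when many values are missing.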
Exploratory Data Analysis (EDA):
Summary statistics showed an average customer age of 35 years, with a standard deviation of 10 years.
The average purchase frequency is 2.5 times per month, with a wide distribution.
A scatterplot showed a positive correlation between age and purchase frequency.
Income distribution is right-skewed, with a median income of $50,000.
Time spent on the website varies widely, with an average of 30 minutes per session.
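Summary statistics of this kind can be computed with a few standard-library calls. The sketch below uses hypothetical income values (not taken from the dataset) and relies on a simple rule of thumb: a mean well above the median suggests a right-skewed distribution.

```python
import statistics

# Hypothetical incomes in dollars (illustrative only, not the dataset's values)
incomes = [32000, 41000, 45000, 48000, 50000, 52000, 58000, 75000, 140000]

mean_income = statistics.fmean(incomes)
median_income = statistics.median(incomes)
sd_income = statistics.stdev(incomes)   # sample standard deviation

# A mean pulled above the median by a few large values indicates right skew
is_right_skewed = mean_income > median_income

print(f"mean={mean_income:.0f} median={median_income:.0f} sd={sd_income:.0f}")
print("right-skewed:", is_right_skewed)
```

For a skewed variable such as income, the median is usually the more representative summary, which is why it is reported above instead of the mean.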
Hypothesis Testing:
Null Hypothesis (H0): There is no correlation between age and purchase frequency.
Alternative Hypothesis (H1): There is a nonzero correlation between age and purchase frequency.
The test returned a p-value below 0.05, so the null hypothesis was rejected at the 5% significance level.
A statistically significant positive correlation (r = 0.35) was found.
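To see why a correlation of 0.35 is significant at this sample size, the result can be checked with the Fisher z-transformation, a standard large-sample approximation. This is a sketch, not necessarily the test used in the analysis. It assumes all 1,000 data points entered the test (that is, none were removed during preprocessing), and the function name `fisher_z_test` is made up for illustration.

```python
import math
from statistics import NormalDist

def fisher_z_test(r, n):
    """Two-sided test of H0: rho = 0 via the Fisher z-transformation.

    Returns (z statistic, approximate p-value). Requires n > 3.
    """
    # atanh(r) is approximately normal with standard error 1 / sqrt(n - 3)
    z = math.atanh(r) * math.sqrt(n - 3)
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p

# r from the report; n assumed to be the full 1,000 data points
z_stat, p_value = fisher_z_test(r=0.35, n=1000)
print(f"z = {z_stat:.2f}, significant at 5%: {p_value < 0.05}")
```

With 1,000 observations even a moderate correlation produces a very large test statistic, so statistical significance here says little about the strength of the relationship. An r of 0.35 corresponds to about 12% of variance shared (0.35 squared).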
Regression Analysis:
Multiple linear regression was performed with purchase frequency as the dependent variable and age, income, and time spent on the website as independent variables.
The regression equation is: Purchase Frequency = 1.2 + 0.03 * Age + 0.002 * Income - 0.01 * Time Spent on Website.
The assumptions of linearity, independence of errors, homoscedasticity, and normality of residuals were met.
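As a worked example, the regression equation can be evaluated for a typical customer. The units of the predictors are not stated, so the sketch below assumes income is measured in thousands of dollars and time on the website in minutes per session. With income in raw dollars, the income term alone would contribute 100 purchases per month for a $50,000 income, which is why thousands are assumed here. The function name is hypothetical.

```python
def predict_purchase_frequency(age, income, time_on_site):
    """Evaluate the fitted regression equation from the report.

    ASSUMED units: income in thousands of dollars, time_on_site in
    minutes per session (the report does not state them).
    """
    return 1.2 + 0.03 * age + 0.002 * income - 0.01 * time_on_site

# Customer at the reported averages: age 35, income $50,000, 30 minutes per session
baseline = predict_purchase_frequency(age=35, income=50, time_on_site=30)

# Same customer, ten years older: prediction rises by 10 * 0.03
older = predict_purchase_frequency(age=45, income=50, time_on_site=30)

print(f"baseline: {baseline:.2f} purchases/month")
print(f"ten years older: {older:.2f} purchases/month")
```

Each coefficient is the expected change in purchase frequency for a one-unit increase in that predictor with the others held constant, so the two predictions differ only through the age term.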
Data Visualization:
Scatterplots with regression lines show the positive relationship between age and purchase frequency.
Box plots display income distribution.
Bar charts compare purchase frequency across different age groups.
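The aggregation behind a bar chart of purchase frequency by age group can be sketched as follows. The ten-year bin width, the function name, and the sample customers are all assumptions made for illustration. The resulting dictionary holds one bar height per age group and could be passed to any plotting library.

```python
from collections import defaultdict
from statistics import fmean

def mean_frequency_by_age_group(customers, bin_width=10):
    """Average purchase frequency within age bins of the given width."""
    groups = defaultdict(list)
    for age, freq in customers:
        lower = (age // bin_width) * bin_width   # lower edge of the age bin
        groups[lower].append(freq)
    # Sort numerically by lower edge, then build readable labels
    return {
        f"{lower}-{lower + bin_width - 1}": fmean(freqs)
        for lower, freqs in sorted(groups.items())
    }

# Hypothetical (age, purchases per month) pairs
customers = [(22, 1.5), (27, 2.0), (34, 2.5), (38, 2.7), (45, 3.0), (49, 3.4)]

bar_heights = mean_frequency_by_age_group(customers)
for label, height in bar_heights.items():
    print(f"{label}: {height:.2f}")
```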
Interpretation:
Age is significantly and positively associated with purchase frequency.
In the regression model, income has a positive coefficient and time spent on the website has a negative coefficient.
The company can use these findings to target specific age groups in its marketing strategies and to optimize website engagement to increase purchase frequency.