Dataset Link: https://www.kaggle.com/datasets/devarajv88/target-dataset?select=products.csv
Project Overview: This project involves a comprehensive analysis of an ecommerce dataset to understand various business metrics and customer behaviors. The dataset was sourced from Kaggle and stored in an ecommerce database. The analysis was performed using Jupyter Notebook.
Objectives:
- List all unique cities where customers are located.
- Count the number of orders placed in 2017.
- Find the total sales per category.
- Calculate the percentage of orders that were paid in installments.
- Count the number of customers from each state.
- Calculate the number of orders per month in 2018.
- Find the average number of products per order, grouped by customer city.
- Calculate the percentage of total revenue contributed by each product category.
- Identify the correlation between product price and the number of times a product has been purchased.
- Calculate the total revenue generated by each seller, and rank them by revenue.
- Calculate the moving average of order values for each customer over their order history.
- Calculate the cumulative sales per month for each year.
- Calculate the year-over-year growth rate of total sales.
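A few of the simpler objectives above can be sketched in Pandas as follows. This is a minimal illustration on toy data, not the project's actual queries: the table layout, the column names (`order_purchase_timestamp`, `payment_installments`, `payment_value`, `category`), and the rule that an order counts as "paid in installments" when it has more than one installment are all assumptions.

```python
import pandas as pd

# Toy data; every table and column name here is an assumption for illustration.
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "order_purchase_timestamp": pd.to_datetime(
        ["2017-03-01", "2017-11-20", "2018-01-05", "2018-02-10"]
    ),
})
payments = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "payment_installments": [1, 3, 1, 6],
    "payment_value": [100.0, 300.0, 50.0, 550.0],
})
items = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4"],
    "category": ["toys", "furniture", "toys", "furniture"],
})

# Count the number of orders placed in 2017.
orders_2017 = int((orders["order_purchase_timestamp"].dt.year == 2017).sum())

# Percentage of orders paid in installments
# (assumed definition: more than one installment on the order).
max_installments = payments.groupby("order_id")["payment_installments"].max()
pct_installments = float((max_installments > 1).mean() * 100)

# Total sales per category and each category's share of total revenue.
sales = items.merge(payments, on="order_id")
category_sales = sales.groupby("category")["payment_value"].sum()
category_share = (category_sales / category_sales.sum() * 100).round(2)

print(orders_2017, pct_installments)
print(category_share)
```

The same results could equally be computed in SQL before the data reaches Pandas; which side does the aggregation is a design choice that depends on table size.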
Methodology:
- Data Storage: The dataset was stored in an ecommerce database to facilitate efficient data retrieval and manipulation.
- Data Retrieval: Data was fetched from the database using SQL queries.
- Data Analysis: The analysis was conducted using Python in a Jupyter Notebook. Key Python libraries used include Pandas for data manipulation, Matplotlib and Seaborn for data visualization, and SQLAlchemy for database interaction.
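The retrieval step, running a SQL query and loading the result into a DataFrame, might look like the sketch below. It uses an in-memory `sqlite3` database purely as a self-contained stand-in for the project's database, and the `customers` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database as a stand-in; the real project connects to its
# own ecommerce database. Table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id TEXT, customer_city TEXT, customer_state TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("c1", "city_a", "SP"),
        ("c2", "city_b", "SP"),
        ("c3", "city_c", "RJ"),
    ],
)
conn.commit()

# Count the number of customers from each state, largest first.
query = """
    SELECT customer_state, COUNT(customer_id) AS customer_count
    FROM customers
    GROUP BY customer_state
    ORDER BY customer_count DESC
"""
state_counts = pd.read_sql_query(query, conn)
conn.close()

print(state_counts)
```

With a different database, only the connection object would need to change; the query text and the `pd.read_sql_query` call would stay largely the same.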
Key Insights:
- Year-on-Year Growth: The analysis revealed significant growth trends, identifying peak periods and potential areas for improvement.
- Customer Order Values: By calculating the moving average of order values, we identified loyal customers with consistent order patterns.
- Cumulative Sales: The cumulative sales analysis highlighted the most profitable months and seasons, aiding in inventory and marketing planning.
- Average Order Products: Grouping orders by customer city and other factors revealed variations in order sizes and preferences across regions.
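The calculations behind the moving-average, cumulative-sales, and year-over-year insights can be sketched in Pandas as below. The data is a toy example, the column names (`customer_id`, `order_date`, `payment_value`) are assumptions, and the window of 2 orders is chosen only for illustration; the project's own window size is not stated.

```python
import pandas as pd

# Toy order history; column names and values are assumptions for illustration.
orders = pd.DataFrame({
    "customer_id": ["c1", "c1", "c1", "c2", "c2"],
    "order_date": pd.to_datetime(
        ["2017-01-10", "2017-02-15", "2018-01-20", "2017-02-05", "2018-03-01"]
    ),
    "payment_value": [100.0, 200.0, 300.0, 50.0, 150.0],
}).sort_values(["customer_id", "order_date"])

# Moving average of order value per customer over their order history
# (window of 2 orders is an assumed choice).
orders["moving_avg"] = orders.groupby("customer_id")["payment_value"].transform(
    lambda s: s.rolling(window=2, min_periods=1).mean()
)

# Cumulative sales per month, restarting at the beginning of each year.
monthly = (
    orders.assign(
        year=orders["order_date"].dt.year,
        month=orders["order_date"].dt.month,
    )
    .groupby(["year", "month"], as_index=False)["payment_value"]
    .sum()
)
monthly["cumulative_sales"] = monthly.groupby("year")["payment_value"].cumsum()

# Year-over-year growth rate of total sales, in percent.
yearly = orders.groupby(orders["order_date"].dt.year)["payment_value"].sum()
yoy_growth = yearly.pct_change() * 100

print(orders[["customer_id", "order_date", "moving_avg"]])
print(monthly)
print(yoy_growth)
```

The first year has no prior year to compare against, so its growth value is `NaN`.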
Tools and Technologies:
- Database: SQL-based ecommerce database
- Programming Language: Python
- Libraries: Pandas, Matplotlib, Seaborn, mysql.connector
- Environment: Jupyter Notebook