Company XYZ owns a supermarket chain across the country. Its three major branches, located in three different cities, each recorded sales information for three months. The data is meant to help the company understand sales trends and plan its growth as competition among supermarkets increases.
- Automated reading and combining the CSV files from the three major branches using glob
- Performed exploratory analysis on the data
- Converted the date and time columns to a datetime data type and engineered new features from them
- Performed more analysis on the categorical variables
- Used groupby to derive more insights from the data
- Used different plots to generate more insights from the data
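
The loading and feature-engineering steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `Date` and `Time` column names, and the `%H:%M` time format are all assumptions.

```python
import glob
import os

import pandas as pd


def load_branch_files(folder: str) -> pd.DataFrame:
    """Read every CSV file in `folder` and stack them into one DataFrame."""
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"no CSV files found in {folder!r}")
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index gives the combined frame a fresh 0..n-1 index
    return pd.concat(frames, ignore_index=True)


def add_datetime_features(df: pd.DataFrame) -> pd.DataFrame:
    """Parse the assumed `Date` and `Time` columns and derive new features."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    # Assumed format: times stored as "HH:MM" strings
    out["Time"] = pd.to_datetime(out["Time"], format="%H:%M")
    out["Day"] = out["Date"].dt.day
    out["Month"] = out["Date"].dt.month
    out["Year"] = out["Date"].dt.year
    out["Hour"] = out["Time"].dt.hour
    return out
```

Calling `add_datetime_features(load_branch_files("data"))` would then return one combined table with the extra columns, where `"data"` stands for whatever folder holds the branch files.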
After analysing the datasets, I discovered that:
- All products sold in the three major branches of company XYZ are grouped into six categories: Electronic accessories, Fashion accessories, Food and beverages, Health and beauty, Home and lifestyle, and Sports and travel.
- Among the three cities, Port Harcourt generates the most revenue for Electronic accessories, Fashion accessories, and Food and beverages; Abuja generates the most for Health and beauty and Sports and travel; and Lagos generates the most for Home and lifestyle products.
- Port Harcourt has the highest gross income and Abuja the lowest, with Lagos in between.
- Branch A, which is situated in Lagos, has the highest sales record.
- Port Harcourt generated the highest overall revenue out of the three cities.
- Most customers buying Food and beverages pay by card; most customers buying Fashion accessories, Home and lifestyle, and Health and beauty products pay electronically; and most customers buying Electronic accessories and Sports and travel products pay in cash.
- Out of the three branches, branch B has the lowest rating from customers.
- Across the three branches, females bought a larger quantity of Food and beverages, Fashion accessories, Sports and travel, and Home and lifestyle products than males.
- Across the three branches, males bought more Health and beauty products than females.
- Females generated more of the revenue from Food and beverages, Home and lifestyle, and Fashion accessories, while males generated more of the revenue from Health and beauty.
- The revenue generated by males and females differs only slightly for Electronic accessories and Sports and travel.
- Fashion accessories and Sports and travel have the highest unit prices, followed by Food and beverages and Home and lifestyle, which have almost equal unit prices, then Health and beauty; Electronic accessories has the lowest unit price.
- Electronic accessories and Home and lifestyle are the products customers purchased most, followed by Health and beauty, Sports and travel, and Food and beverages; Fashion accessories is the least purchased.
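
Findings such as the top-earning city per product category come from grouped aggregations. The sketch below shows one way to do this with groupby. The column names `City`, `Product line`, and `Total`, the function name, and the tiny sample are assumptions made up for illustration; they are not the project's data or results.

```python
import pandas as pd


def top_city_per_product(df: pd.DataFrame) -> pd.Series:
    """Return, for each product line, the city with the highest summed revenue."""
    revenue = df.groupby(["Product line", "City"])["Total"].sum()
    # Rows become product lines, columns become cities; missing pairs count as 0
    by_city = revenue.unstack("City", fill_value=0)
    return by_city.idxmax(axis=1)


# Tiny made-up sample, only to show the shape of the input
sample = pd.DataFrame(
    {
        "City": ["Lagos", "Abuja", "Lagos", "Abuja", "Port Harcourt"],
        "Product line": [
            "Home and lifestyle",
            "Home and lifestyle",
            "Health and beauty",
            "Health and beauty",
            "Health and beauty",
        ],
        "Total": [120.0, 80.0, 30.0, 90.0, 40.0],
    }
)
print(top_city_per_product(sample))
```

The same pattern, with a different grouping column such as gender or payment method in place of `City`, would cover the other comparisons in the list.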
Build a machine learning model that can predict which product will generate the most revenue in each of the three branches of company XYZ at a particular point in time.
After analysing the sales trend in each of the three branches in fifteen-day periods, I discovered that, on average:
- Branch B experienced a decrease in sales in the first fifteen days of January and throughout February, but there was a slight increase in sales in March.
- Branch A had the highest revenue only in the first fifteen days of January.
- Branch C experienced a decrease in sales only in the first fifteen days of January and February.
- Branch B sells more in the morning and afternoon than at night.
- Branch A sells the most in the afternoon, somewhat less in the morning, and the least at night.
- Branch C sells the most at night and the least in the morning.
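
A rough sketch of how fifteen-day and time-of-day comparisons like these could be computed is shown below. Everything in it is an assumption: the column names (`Branch`, `Date`, `Hour`, `Total`), the choice to start the first window at the earliest date in the data, and the hour cut-offs for morning, afternoon, and night, which the analysis above does not specify.

```python
import pandas as pd


def mean_sales_by_window(df: pd.DataFrame, days: int = 15) -> pd.Series:
    """Average `Total` per branch in consecutive windows of `days` days."""
    # Window 0 starts at the earliest date in the data (an assumption)
    elapsed = (df["Date"] - df["Date"].min()).dt.days
    window = (elapsed // days).rename("Window")
    return df.groupby(["Branch", window])["Total"].mean()


def part_of_day(hour: int) -> str:
    """Hypothetical cut-offs: before 12 is morning, before 17 is afternoon."""
    if hour < 12:
        return "morning"
    if hour < 17:
        return "afternoon"
    return "night"


def sales_by_part_of_day(df: pd.DataFrame) -> pd.Series:
    """Total sales per branch for each part of the day."""
    period = df["Hour"].map(part_of_day).rename("Period")
    return df.groupby(["Branch", period])["Total"].sum()


# Made-up rows for illustration only
sample = pd.DataFrame(
    {
        "Branch": ["A", "A", "A", "B"],
        "Date": pd.to_datetime(
            ["2019-01-01", "2019-01-10", "2019-01-20", "2019-01-03"]
        ),
        "Hour": [10, 14, 19, 11],
        "Total": [10.0, 20.0, 30.0, 40.0],
    }
)
print(mean_sales_by_window(sample))
print(sales_by_part_of_day(sample))
```

Comparing consecutive window values for a branch then shows whether its average sales rose or fell from one fifteen-day period to the next.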