Giter VIP home page Giter VIP logo

cprasad01 / sql-challenge-case-study-2--pizza-runner Goto Github PK

View Code? Open in Web Editor NEW
0.0 1.0 0.0 13 KB

Data cleaning and SQL queries analyze pizza delivery data, cleaning tables such as customer_orders and runner_orders, and extracting insights like pizza order counts, customer preferences, runner performance, delivery times, and ingredient optimization from the dataset.

dataanalytics datacleaning datatransformation sql

sql-challenge-case-study-2--pizza-runner's Introduction

SQL-Challenge-Case-Study-2--Pizza-Runner

🍕 Case Study #2 Pizza Runner

Image

📚 Table of Contents


Business Task

Danny is expanding his new Pizza Empire and at the same time, he wants to Uberize it, so Pizza Runner was launched!

Danny started by recruiting “runners” to deliver fresh pizza from Pizza Runner Headquarters (otherwise known as Danny’s house) and also maxed out his credit card to pay freelance developers to build a mobile app to accept orders from customers.

Entity Relationship Diagram

Pizza Runner

🧼 Data Cleaning & Transformation

🔨 Table: customer_orders

Looking at the customer_orders table below, we can see that there are

  • In the exclusions column, there are missing/ blank spaces ' ' and null values.
  • In the extras column, there are missing/ blank spaces ' ' and null values.

image

Our course of action to clean the table:

  • Create a temporary table with all the columns
  • Remove null values in exlusions and extras columns and replace with blank space ' '.
CREATE TEMP TABLE customer_orders_temp AS
SELECT 
  order_id, 
  customer_id, 
  pizza_id, 
  CASE
	  WHEN exclusions IS null OR exclusions LIKE 'null' THEN ' '
	  ELSE exclusions
	  END AS exclusions,
  CASE
	  WHEN extras IS NULL or extras LIKE 'null' THEN ' '
	  ELSE extras
	  END AS extras,
	order_time
FROM pizza_runner.customer_orders;

This is how the clean customers_orders_temp table looks like and we will use this table to run all our queries.

image


🔨 Table: runner_orders

Looking at the runner_orders table below, we can see that there are

  • In the exclusions column, there are missing/ blank spaces ' ' and null values.
  • In the extras column, there are missing/ blank spaces ' ' and null values

image

Our course of action to clean the table:

  • In pickup_time column, remove nulls and replace with blank space ' '.
  • In distance column, remove "km" and nulls and replace with blank space ' '.
  • In duration column, remove "minutes", "minute" and nulls and replace with blank space ' '.
  • In cancellation column, remove NULL and null and and replace with blank space ' '.
CREATE TEMP TABLE runner_orders_temp AS
SELECT 
  order_id, 
  runner_id,  
  CASE
	  WHEN pickup_time LIKE 'null' THEN ' '
	  ELSE pickup_time
	  END AS pickup_time,
  CASE
	  WHEN distance LIKE 'null' THEN ' '
	  WHEN distance LIKE '%km' THEN TRIM('km' from distance)
	  ELSE distance 
    END AS distance,
  CASE
	  WHEN duration LIKE 'null' THEN ' '
	  WHEN duration LIKE '%mins' THEN TRIM('mins' from duration)
	  WHEN duration LIKE '%minute' THEN TRIM('minute' from duration)
	  WHEN duration LIKE '%minutes' THEN TRIM('minutes' from duration)
	  ELSE duration
	  END AS duration,
  CASE
	  WHEN cancellation IS NULL or cancellation LIKE 'null' THEN ' '
	  ELSE cancellation
	  END AS cancellation
FROM pizza_runner.runner_orders;

Then, we alter the pickup_time, distance and duration columns to the correct data type.

ALTER TABLE runner_orders_temp
ALTER COLUMN pickup_time DATETIME,
ALTER COLUMN distance FLOAT,
ALTER COLUMN duration INT;

This is how the clean runner_orders_temp table looks like and we will use this table to run all our queries.

image


Solution

A. Pizza Metrics

1. How many pizzas were ordered?

SELECT COUNT(*) AS pizza_order_count
FROM customer_orders_temp;

Answer:

1*Ma9L4y6O_zhln6Wy7CdWMQ

  • Total of 14 pizzas were ordered.

2. How many unique customer orders were made?

SELECT COUNT(DISTINCT order_id) AS unique_order_count
FROM customer_orders_temp;

Answer:

image

  • There are 10 unique customer orders.

3. How many successful orders were delivered by each runner?

SELECT 
  runner_id, 
  COUNT(order_id) AS successful_orders
FROM #runner_orders
WHERE distance != 0
GROUP BY runner_id;

Answer:

image

  • Runner 1 has 4 successful delivered orders.
  • Runner 2 has 3 successful delivered orders.
  • Runner 3 has 1 successful delivered order.

4. How many of each type of pizza was delivered?

SELECT 
  p.pizza_name, 
  COUNT(c.pizza_id) AS delivered_pizza_count
FROM #customer_orders AS c
JOIN #runner_orders AS r
  ON c.order_id = r.order_id
JOIN pizza_names AS p
  ON c.pizza_id = p.pizza_id
WHERE r.distance != 0
GROUP BY p.pizza_name;

Answer:

image

  • There are 9 delivered Meatlovers pizzas and 3 Vegetarian pizzas.

5. How many Vegetarian and Meatlovers were ordered by each customer?**

SELECT 
  c.customer_id, 
  p.pizza_name, 
  COUNT(p.pizza_name) AS order_count
FROM #customer_orders AS c
JOIN pizza_names AS p
  ON c.pizza_id= p.pizza_id
GROUP BY c.customer_id, p.pizza_name
ORDER BY c.customer_id;

Answer:

image

  • Customer 101 ordered 2 Meatlovers pizzas and 1 Vegetarian pizza.
  • Customer 102 ordered 2 Meatlovers pizzas and 2 Vegetarian pizzas.
  • Customer 103 ordered 3 Meatlovers pizzas and 1 Vegetarian pizza.
  • Customer 104 ordered 1 Meatlovers pizza.
  • Customer 105 ordered 1 Vegetarian pizza.

6. What was the maximum number of pizzas delivered in a single order?

WITH pizza_count_cte AS
(
  SELECT 
    c.order_id, 
    COUNT(c.pizza_id) AS pizza_per_order
  FROM #customer_orders AS c
  JOIN #runner_orders AS r
    ON c.order_id = r.order_id
  WHERE r.distance != 0
  GROUP BY c.order_id
)

SELECT 
  MAX(pizza_per_order) AS pizza_count
FROM pizza_count_cte;

Answer:

image

  • Maximum number of pizza delivered in a single order is 3 pizzas.

7. For each customer, how many delivered pizzas had at least 1 change and how many had no changes?

SELECT 
  c.customer_id,
  SUM(
    CASE WHEN c.exclusions <> ' ' OR c.extras <> ' ' THEN 1
    ELSE 0
    END) AS at_least_1_change,
  SUM(
    CASE WHEN c.exclusions = ' ' AND c.extras = ' ' THEN 1 
    ELSE 0
    END) AS no_change
FROM #customer_orders AS c
JOIN #runner_orders AS r
  ON c.order_id = r.order_id
WHERE r.distance != 0
GROUP BY c.customer_id
ORDER BY c.customer_id;

Answer:

image

  • Customer 101 and 102 likes his/her pizzas per the original recipe.
  • Customer 103, 104 and 105 have their own preference for pizza topping and requested at least 1 change (extra or exclusion topping) on their pizza.

8. How many pizzas were delivered that had both exclusions and extras?

SELECT  
  SUM(
    CASE WHEN exclusions IS NOT NULL AND extras IS NOT NULL THEN 1
    ELSE 0
    END) AS pizza_count_w_exclusions_extras
FROM #customer_orders AS c
JOIN #runner_orders AS r
  ON c.order_id = r.order_id
WHERE r.distance >= 1 
  AND exclusions <> ' ' 
  AND extras <> ' ';

Answer:

image

  • Only 1 pizza delivered that had both extra and exclusion topping. That’s one fussy customer!

9. What was the total volume of pizzas ordered for each hour of the day?

SELECT 
  DATEPART(HOUR, [order_time]) AS hour_of_day, 
  COUNT(order_id) AS pizza_count
FROM #customer_orders
GROUP BY DATEPART(HOUR, [order_time]);

Answer:

image

  • Highest volume of pizza ordered is at 13 (1:00 pm), 18 (6:00 pm) and 21 (9:00 pm).
  • Lowest volume of pizza ordered is at 11 (11:00 am), 19 (7:00 pm) and 23 (11:00 pm).

10. What was the volume of orders for each day of the week?

SELECT 
  FORMAT(DATEADD(DAY, 2, order_time),'dddd') AS day_of_week, -- add 2 to adjust 1st day of the week as Monday
  COUNT(order_id) AS total_pizzas_ordered
FROM #customer_orders
GROUP BY FORMAT(DATEADD(DAY, 2, order_time),'dddd');

Answer:

image

  • There are 5 pizzas ordered on Friday and Monday.
  • There are 3 pizzas ordered on Saturday.
  • There is 1 pizza ordered on Sunday.

B. Runner and Customer Experience

1. How many runners signed up for each 1 week period? (i.e. week starts 2021-01-01)

SELECT 
  DATEPART(WEEK, registration_date) AS registration_week,
  COUNT(runner_id) AS runner_signup
FROM runners
GROUP BY DATEPART(WEEK, registration_date);

Answer:

image

  • On Week 1 of Jan 2021, 2 new runners signed up.
  • On Week 2 and 3 of Jan 2021, 1 new runner signed up.

2. What was the average time in minutes it took for each runner to arrive at the Pizza Runner HQ to pickup the order?

WITH time_taken_cte AS
(
  SELECT 
    c.order_id, 
    c.order_time, 
    r.pickup_time, 
    DATEDIFF(MINUTE, c.order_time, r.pickup_time) AS pickup_minutes
  FROM #customer_orders AS c
  JOIN #runner_orders AS r
    ON c.order_id = r.order_id
  WHERE r.distance != 0
  GROUP BY c.order_id, c.order_time, r.pickup_time
)

SELECT 
  AVG(pickup_minutes) AS avg_pickup_minutes
FROM time_taken_cte
WHERE pickup_minutes > 1;

Answer:

image

  • The average time taken in minutes by runners to arrive at Pizza Runner HQ to pick up the order is 15 minutes.

3. Is there any relationship between the number of pizzas and how long the order takes to prepare?

WITH prep_time_cte AS
(
  SELECT 
    c.order_id, 
    COUNT(c.order_id) AS pizza_order, 
    c.order_time, 
    r.pickup_time, 
    DATEDIFF(MINUTE, c.order_time, r.pickup_time) AS prep_time_minutes
  FROM #customer_orders AS c
  JOIN #runner_orders AS r
    ON c.order_id = r.order_id
  WHERE r.distance != 0
  GROUP BY c.order_id, c.order_time, r.pickup_time
)

SELECT 
  pizza_order, 
  AVG(prep_time_minutes) AS avg_prep_time_minutes
FROM prep_time_cte
WHERE prep_time_minutes > 1
GROUP BY pizza_order;

Answer:

image

  • On average, a single pizza order takes 12 minutes to prepare.
  • An order with 3 pizzas takes 30 minutes at an average of 10 minutes per pizza.
  • It takes 16 minutes to prepare an order with 2 pizzas which is 8 minutes per pizza — making 2 pizzas in a single order the ultimate efficiency rate.

4. Is there any relationship between the number of pizzas and how long the order takes to prepare?

SELECT 
  c.customer_id, 
  AVG(r.distance) AS avg_distance
FROM #customer_orders AS c
JOIN #runner_orders AS r
  ON c.order_id = r.order_id
WHERE r.duration != 0
GROUP BY c.customer_id;

Answer:

image

(Assuming that distance is calculated from Pizza Runner HQ to customer’s place)

  • Customer 104 stays the nearest to Pizza Runner HQ at average distance of 10km, whereas Customer 105 stays the furthest at 25km.

5. What was the difference between the longest and shortest delivery times for all orders?

Firstly, I'm going to filter results with non-null duration first just to have a feel. You can skip this step and go straight to the answer

SELECT 
  order_id, duration
FROM #runner_orders
WHERE duration not like ' ';

image

SELECT MAX(duration::NUMERIC) - MIN(duration::NUMERIC) AS delivery_time_difference
FROM runner_orders2
where duration not like ' ';

Answer:

image

  • The difference between longest (40 minutes) and shortest (10 minutes) delivery time for all orders is 30 minutes.

6. What was the average speed for each runner for each delivery and do you notice any trend for these values?

SELECT 
  r.runner_id, 
  c.customer_id, 
  c.order_id, 
  COUNT(c.order_id) AS pizza_count, 
  r.distance, (r.duration / 60) AS duration_hr , 
  ROUND((r.distance/r.duration * 60), 2) AS avg_speed
FROM #runner_orders AS r
JOIN #customer_orders AS c
  ON r.order_id = c.order_id
WHERE distance != 0
GROUP BY r.runner_id, c.customer_id, c.order_id, r.distance, r.duration
ORDER BY c.order_id;

Answer:

image

(Average speed = Distance in km / Duration in hour)

  • Runner 1’s average speed runs from 37.5km/h to 60km/h.
  • Runner 2’s average speed runs from 35.1km/h to 93.6km/h. Danny should investigate Runner 2 as the average speed has a 300% fluctuation rate!
  • Runner 3’s average speed is 40km/h

7. What is the successful delivery percentage for each runner?

SELECT 
  runner_id, 
  ROUND(100 * SUM(
    CASE WHEN distance = 0 THEN 0
    ELSE 1 END) / COUNT(*), 0) AS success_perc
FROM #runner_orders
GROUP BY runner_id;

Answer:

image

  • Runner 1 has 100% successful delivery.
  • Runner 2 has 75% successful delivery.
  • Runner 3 has 50% successful delivery

(It’s not right to attribute successful delivery to runners as order cancellations are out of the runner’s control.)


C. Ingredient Optimisation

1. What are the standard ingredients for each pizza?

2. What was the most commonly added extra?

WITH toppings_cte AS (
SELECT
  pizza_id,
  REGEXP_SPLIT_TO_TABLE(toppings, '[,\s]+')::INTEGER AS topping_id
FROM pizza_runner.pizza_recipes)

SELECT 
  t.topping_id, pt.topping_name, 
  COUNT(t.topping_id) AS topping_count
FROM toppings_cte t
INNER JOIN pizza_runner.pizza_toppings pt
  ON t.topping_id = pt.topping_id
GROUP BY t.topping_id, pt.topping_name
ORDER BY topping_count DESC;

Solution

image

sql-challenge-case-study-2--pizza-runner's People

Contributors

cprasad01 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.