# πŸ† Project Employees I - LeetCode 1075

## πŸ“Œ Problem Statement
You are given two tables: **Project** and **Employee**.

### Project Table
| Column Name | Type |
| ----------- | ---- |
| project_id  | int  |
| employee_id | int  |

- `(project_id, employee_id)` is the primary key of this table.
- `employee_id` is a foreign key referencing the `Employee` table.

### Employee Table
| Column Name      | Type    |
| ---------------- | ------- |
| employee_id      | int     |
| name             | varchar |
| experience_years | int     |

- `employee_id` is the primary key.
- `experience_years` is guaranteed to be **NOT NULL**.

The task is to **return the average experience years of all employees for each project, rounded to 2 decimal places**.

---

## πŸ“Š Example 1:
### Input:
**Project Table**
| project_id | employee_id |
| ---------- | ----------- |
| 1          | 1           |
| 1          | 2           |
| 1          | 3           |
| 2          | 1           |
| 2          | 4           |

**Employee Table**
| employee_id | name   | experience_years |
| ----------- | ------ | ---------------- |
| 1           | Khaled | 3                |
| 2           | Ali    | 2                |
| 3           | John   | 1                |
| 4           | Doe    | 2                |

### Output:
| project_id | average_years |
| ---------- | ------------- |
| 1          | 2.00          |
| 2          | 2.50          |

### Explanation:
- **Project 1:** `(3 + 2 + 1) / 3 = 2.00`
- **Project 2:** `(3 + 2) / 2 = 2.50`

---

## πŸ–₯ SQL Solutions

### 1️⃣ Standard MySQL Solution
#### Explanation:
- We **join** the `Project` and `Employee` tables on `employee_id` (every `employee_id` in `Project` references `Employee`, so a `LEFT JOIN` and an inner join return the same rows here).
- We **calculate the average** of `experience_years` for each `project_id`.
- We **round** the result to **two decimal places**.

```sql
SELECT project_id, ROUND(AVG(experience_years), 2) AS average_years
FROM Project AS p
LEFT JOIN Employee AS e
ON p.employee_id = e.employee_id
GROUP BY project_id;
```

---

### 2️⃣ Window Function (SQL) Solution
#### Explanation:
- A **window function** computes `AVG(experience_years)` **partitioned by** `project_id`, attaching the per-project average to every joined row; `DISTINCT` then collapses those rows to one per project.

```sql
SELECT DISTINCT project_id,
       ROUND(AVG(experience_years) OVER (PARTITION BY project_id), 2) AS average_years
FROM Project AS p
JOIN Employee AS e
ON p.employee_id = e.employee_id;
```

---
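### πŸ§ͺ Seed Script (Optional)
To try either query locally, here is a minimal seed script built from the example data above (MySQL syntax; the `VARCHAR` length is an assumption, since the problem does not specify one):

```sql
-- Seed script for the example data (assumed schema, MySQL syntax).
CREATE TABLE Employee (
    employee_id      INT PRIMARY KEY,
    name             VARCHAR(30),   -- length not given by the problem; chosen arbitrarily
    experience_years INT NOT NULL
);

CREATE TABLE Project (
    project_id  INT,
    employee_id INT,
    PRIMARY KEY (project_id, employee_id)
);

INSERT INTO Employee (employee_id, name, experience_years) VALUES
    (1, 'Khaled', 3),
    (2, 'Ali',    2),
    (3, 'John',   1),
    (4, 'Doe',    2);

INSERT INTO Project (project_id, employee_id) VALUES
    (1, 1), (1, 2), (1, 3),
    (2, 1), (2, 4);

-- Both queries above should return (1, 2.00) and (2, 2.50).
```

---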
## 🐍 Pandas Solution (Python)
#### Explanation:
- We read both tables into Pandas **DataFrames**.
- We merge the tables on `employee_id`.
- We group by `project_id` and compute the mean.
- We round the output to 2 decimal places.

```python
import pandas as pd

def project_average_experience(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:
    # Join each project row to its employee (inner join on employee_id)
    df = project.merge(employee, on="employee_id")
    # Average experience per project, rounded to 2 decimal places
    result = df.groupby("project_id")["experience_years"].mean().round(2).reset_index()
    result.columns = ["project_id", "average_years"]
    return result
```

---

## πŸ“ File Structure
```
πŸ“‚ Project-Employees-I
│── πŸ“œ README.md
│── πŸ“œ solution.sql
│── πŸ“œ solution_window.sql
│── πŸ“œ solution_pandas.py
│── πŸ“œ test_cases.sql
```

---

## πŸ”— Useful Links
- πŸ“– [LeetCode Problem](https://leetcode.com/problems/project-employees-i/)
- πŸ“š [SQL Joins Explanation](https://www.w3schools.com/sql/sql_join.asp)
- 🐍 [Pandas Documentation](https://pandas.pydata.org/docs/)

---

# πŸ“Š User Activity for the Past 30 Days I - LeetCode 1141

## πŸ“Œ Problem Statement
You are given the **Activity** table that records user activities on a social media website.

### Activity Table
| Column Name   | Type |
| ------------- | ---- |
| user_id       | int  |
| session_id    | int  |
| activity_date | date |
| activity_type | enum |

- The `activity_type` column is an ENUM of **('open_session', 'end_session', 'scroll_down', 'send_message')**.
- Each session belongs to exactly **one user**.
- The table **may have duplicate rows**.

### Task:
Find the **daily active user count** for the period of **30 days ending 2019-07-27** inclusive.
- A user is considered **active on a given day** if they made at least **one activity** on that day.
- Ignore days with **zero active users**.

---

## πŸ“Š Example 1:
### Input:
**Activity Table**
| user_id | session_id | activity_date | activity_type |
| ------- | ---------- | ------------- | ------------- |
| 1       | 1          | 2019-07-20    | open_session  |
| 1       | 1          | 2019-07-20    | scroll_down   |
| 1       | 1          | 2019-07-20    | end_session   |
| 2       | 4          | 2019-07-20    | open_session  |
| 2       | 4          | 2019-07-21    | send_message  |
| 2       | 4          | 2019-07-21    | end_session   |
| 3       | 2          | 2019-07-21    | open_session  |
| 3       | 2          | 2019-07-21    | send_message  |
| 3       | 2          | 2019-07-21    | end_session   |
| 4       | 3          | 2019-06-25    | open_session  |
| 4       | 3          | 2019-06-25    | end_session   |

### Output:
| day        | active_users |
| ---------- | ------------ |
| 2019-07-20 | 2            |
| 2019-07-21 | 2            |

### Explanation:
- **2019-07-20**: Users **1 and 2** were active.
- **2019-07-21**: Users **2 and 3** were active.
- **Days with zero active users are ignored** (user 4's activity on 2019-06-25 falls outside the 30-day window).

---
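## πŸ§ͺ Seed Script (Optional)
To reproduce this example locally before running the solutions below, a possible seed script (MySQL syntax; the `ENUM` definition follows the problem statement):

```sql
-- Seed script for the example data (assumed schema, MySQL syntax).
CREATE TABLE Activity (
    user_id       INT,
    session_id    INT,
    activity_date DATE,
    activity_type ENUM('open_session', 'end_session', 'scroll_down', 'send_message')
);

INSERT INTO Activity (user_id, session_id, activity_date, activity_type) VALUES
    (1, 1, '2019-07-20', 'open_session'),
    (1, 1, '2019-07-20', 'scroll_down'),
    (1, 1, '2019-07-20', 'end_session'),
    (2, 4, '2019-07-20', 'open_session'),
    (2, 4, '2019-07-21', 'send_message'),
    (2, 4, '2019-07-21', 'end_session'),
    (3, 2, '2019-07-21', 'open_session'),
    (3, 2, '2019-07-21', 'send_message'),
    (3, 2, '2019-07-21', 'end_session'),
    (4, 3, '2019-06-25', 'open_session'),
    (4, 3, '2019-06-25', 'end_session');

-- Expected output: ('2019-07-20', 2) and ('2019-07-21', 2).
```

---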
## πŸ–₯ SQL Solutions

### 1️⃣ Standard MySQL Solution
#### Explanation:
- **Filter records** to the **30-day window** ending on `2019-07-27`.
- Use `COUNT(DISTINCT user_id)` to count **unique active users per day**.
- Days with **zero active users** never appear, because `GROUP BY` only produces groups for dates that have rows.

```sql
SELECT
    activity_date AS day,
    COUNT(DISTINCT user_id) AS active_users
FROM
    Activity
WHERE
    DATEDIFF('2019-07-27', activity_date) < 30
    AND DATEDIFF('2019-07-27', activity_date) >= 0
GROUP BY activity_date;
```

---

### 2️⃣ Alternative Solution Using `BETWEEN`
#### Explanation:
- This solution filters the date range using `BETWEEN` instead of `DATEDIFF`; `DATE_SUB('2019-07-27', INTERVAL 29 DAY)` evaluates to `2019-06-28`, the first day of the 30-day window.

```sql
SELECT
    activity_date AS day,
    COUNT(DISTINCT user_id) AS active_users
FROM
    Activity
WHERE
    activity_date BETWEEN DATE_SUB('2019-07-27', INTERVAL 29 DAY) AND '2019-07-27'
GROUP BY activity_date;
```

---

## 🐍 Pandas Solution (Python)
#### Explanation:
- Filter activity records to the **30-day window** (`2019-06-28` through `2019-07-27`).
- **Group by `activity_date`** and count **unique `user_id`s**.
- Days with zero active users are absent from the grouped result by construction.

```python
import pandas as pd

def daily_active_users(activity: pd.DataFrame) -> pd.DataFrame:
    # Keep rows within the 30-day window ending on '2019-07-27'
    filtered = activity[(activity["activity_date"] >= "2019-06-28") & (activity["activity_date"] <= "2019-07-27")]

    # Group by day and count unique users
    result = filtered.groupby("activity_date")["user_id"].nunique().reset_index()

    # Rename columns to match the expected output
    result.columns = ["day", "active_users"]
    return result
```

---

## πŸ“ File Structure
```
πŸ“‚ User-Activity-Past-30-Days
│── πŸ“œ README.md
│── πŸ“œ solution.sql
│── πŸ“œ solution_between.sql
│── πŸ“œ solution_pandas.py
│── πŸ“œ test_cases.sql
```

---

## πŸ”— Useful Links
- πŸ“– [LeetCode Problem](https://leetcode.com/problems/user-activity-for-the-past-30-days-i/)
- πŸ“š [SQL Date Functions](https://www.w3schools.com/sql/sql_dates.asp)
- 🐍 [Pandas Documentation](https://pandas.pydata.org/docs/)

---

# **1164. Product Price at a Given Date**

## **Problem Statement**
You are given the **Products** table, which keeps track of price changes.

### **Products Table**
```
+------------+-----------+-------------+
| product_id | new_price | change_date |
+------------+-----------+-------------+
| int        | int       | date        |
+------------+-----------+-------------+
```
- `(product_id, change_date)` is the **primary key**.
- Each row represents a price update for a product on a specific date.

### **Task:**
Find the price of all products on **2019-08-16**.
Assume the **initial price of all products is 10** before any change occurs.

---

## **Example 1:**

### **Input:**
**Products Table**
```
+------------+-----------+-------------+
| product_id | new_price | change_date |
+------------+-----------+-------------+
| 1          | 20        | 2019-08-14  |
| 2          | 50        | 2019-08-14  |
| 1          | 30        | 2019-08-15  |
| 1          | 35        | 2019-08-16  |
| 2          | 65        | 2019-08-17  |
| 3          | 20        | 2019-08-18  |
+------------+-----------+-------------+
```

### **Output:**
```
+------------+-------+
| product_id | price |
+------------+-------+
| 2          | 50    |
| 1          | 35    |
| 3          | 10    |
+------------+-------+
```

### **Explanation:**
- **Product 1:** Last change on or before `2019-08-16` β†’ **35**
- **Product 2:** Last change on or before `2019-08-16` β†’ **50**
- **Product 3:** **No price change on or before 2019-08-16**, so the default price **10** applies

---

## **SQL Solutions**

### **1️⃣ Standard MySQL Solution**
```sql
SELECT
    p.product_id,
    COALESCE((
        SELECT new_price
        FROM Products
        WHERE product_id = p.product_id
          AND change_date <= '2019-08-16'
        ORDER BY change_date DESC
        LIMIT 1
    ), 10) AS price
FROM
    (SELECT DISTINCT product_id FROM Products) p;
```
#### **Explanation:**
1. **Find the last price on or before `2019-08-16`**
   - `ORDER BY change_date DESC LIMIT 1` returns the most recent price change up to that date.
2. **Use `COALESCE()`**
   - If no price change exists, fall back to the default price **10**.
3. **Use `DISTINCT product_id`**
   - Ensures every product appears in the result exactly once.

---

### **2️⃣ CTE Solution (Latest Change per Product)**
```sql
WITH
    T AS (SELECT DISTINCT product_id FROM Products),
    P AS (
        SELECT product_id, new_price AS price
        FROM Products
        WHERE
            (product_id, change_date) IN (
                SELECT product_id, MAX(change_date) AS change_date
                FROM Products
                WHERE change_date <= '2019-08-16'
                GROUP BY 1
            )
    )
SELECT product_id, IFNULL(price, 10) AS price
FROM
    T
    LEFT JOIN P USING (product_id);
```
#### **Explanation:**
1. **CTE `T`** collects every distinct `product_id`.
2. **CTE `P`** keeps, for each product, the price whose `change_date` is the latest one on or before `2019-08-16` (found with `MAX(change_date)` per product and matched via the row-value `IN`).
3. **`LEFT JOIN ... USING (product_id)`** keeps products with no qualifying change, and **`IFNULL(price, 10)`** supplies the default **10** for them.

---

## **Pandas Solution (Python)**
```python
import pandas as pd

# Sample Data
products_data = {
    'product_id': [1, 2, 1, 1, 2, 3],
    'new_price': [20, 50, 30, 35, 65, 20],
    'change_date': ['2019-08-14', '2019-08-14', '2019-08-15', '2019-08-16', '2019-08-17', '2019-08-18']
}

# Create DataFrame
products_df = pd.DataFrame(products_data)
products_df['change_date'] = pd.to_datetime(products_df['change_date'])  # Convert to datetime

# Filter for changes before or on '2019-08-16'
valid_prices = products_df[products_df['change_date'] <= '2019-08-16']

# Get the latest price for each product on or before '2019-08-16'
latest_prices = valid_prices.sort_values(by=['product_id', 'change_date']).groupby('product_id').last().reset_index()

# Rename column
latest_prices = latest_prices[['product_id', 'new_price']].rename(columns={'new_price': 'price'})

# Get all unique products
all_products = products_df[['product_id']].drop_duplicates()

# Merge with latest prices and fill missing values with 10
final_prices = all_products.merge(latest_prices, on='product_id', how='left').fillna({'price': 10})

print(final_prices)
```

### **Explanation:**
1. **Convert `change_date` to datetime**
   - Ensures proper date comparison.
2. **Filter for prices on or before `2019-08-16`**
   - Excludes future price changes.
3. **Get the latest price per product (`groupby().last()` after sorting by date)**
   - Retrieves the most recent price change.
4. **Merge with all products and set missing prices to `10`**
   - Ensures all products are included.
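
---

## **Test Data (Optional)**
For local testing, the example data can be loaded with a script along these lines (MySQL syntax, schema inferred from the problem statement):

```sql
-- Seed script for the example data (assumed schema, MySQL syntax).
CREATE TABLE Products (
    product_id  INT,
    new_price   INT,
    change_date DATE,
    PRIMARY KEY (product_id, change_date)
);

INSERT INTO Products (product_id, new_price, change_date) VALUES
    (1, 20, '2019-08-14'),
    (2, 50, '2019-08-14'),
    (1, 30, '2019-08-15'),
    (1, 35, '2019-08-16'),
    (2, 65, '2019-08-17'),
    (3, 20, '2019-08-18');

-- Expected output: (1, 35), (2, 50), (3, 10).
```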

---

## **File Structure**
```
LeetCode1164/
β”œβ”€β”€ problem_statement.md   # Contains the problem description and constraints.
β”œβ”€β”€ sql_solution.sql       # Contains the SQL solutions (Standard + CTE).
β”œβ”€β”€ pandas_solution.py     # Contains the Pandas solution.
β”œβ”€β”€ README.md              # Overview of the problem and available solutions.
```

---

## **Useful Links**
- [LeetCode Problem 1164](https://leetcode.com/problems/product-price-at-a-given-date/)
- [SQL COALESCE Documentation](https://www.w3schools.com/sql/sql_coalesce.asp)
- [Pandas GroupBy Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html)

---

# **1174. Immediate Food Delivery II**

## **Problem Statement**
You are given a table `Delivery` that records food deliveries made to customers. Each row represents an order with the date it was placed and the customer's preferred delivery date.

---

## **Delivery Table**
```
+-----------------------------+------+------------------------------------+
| Column Name                 | Type | Description                        |
+-----------------------------+------+------------------------------------+
| delivery_id                 | int  | Unique identifier for the delivery |
| customer_id                 | int  | Identifier for the customer        |
| order_date                  | date | Date when the order was placed     |
| customer_pref_delivery_date | date | Customer's preferred delivery date |
+-----------------------------+------+------------------------------------+
```
- `delivery_id` is the **primary key**.
- Each customer specifies a preferred delivery date, which can be the same as or after the order date.

---

## **Task:**
Calculate the **percentage** of customers whose **first order** is **immediate** (i.e., the order date is the same as the customer's preferred delivery date).
- A customer's **first order** is the order with the **earliest `order_date`** for that customer.
- The result should be **rounded to 2 decimal places**.
- Return the percentage as `immediate_percentage`.

---

## **Example 1:**

### **Input:**
**Delivery Table**
```
+-------------+-------------+------------+-----------------------------+
| delivery_id | customer_id | order_date | customer_pref_delivery_date |
+-------------+-------------+------------+-----------------------------+
| 1           | 1           | 2019-08-01 | 2019-08-02                  |
| 2           | 2           | 2019-08-02 | 2019-08-02                  |
| 3           | 1           | 2019-08-11 | 2019-08-12                  |
| 4           | 3           | 2019-08-24 | 2019-08-24                  |
| 5           | 3           | 2019-08-21 | 2019-08-22                  |
| 6           | 2           | 2019-08-11 | 2019-08-13                  |
| 7           | 4           | 2019-08-09 | 2019-08-09                  |
+-------------+-------------+------------+-----------------------------+
```

### **Output:**
```
+----------------------+
| immediate_percentage |
+----------------------+
| 50.00                |
+----------------------+
```

### **Explanation:**
- **Customer 1:** First order is on **2019-08-01** (preferred: 2019-08-02) β†’ **Scheduled**
- **Customer 2:** First order is on **2019-08-02** (preferred: 2019-08-02) β†’ **Immediate**
- **Customer 3:** First order is on **2019-08-21** (preferred: 2019-08-22) β†’ **Scheduled**
- **Customer 4:** First order is on **2019-08-09** (preferred: 2019-08-09) β†’ **Immediate**

Out of 4 customers, 2 have immediate first orders.
Percentage = (2 / 4) * 100 = **50.00**

---

## **SQL Solutions**

### **1️⃣ Standard MySQL Solution**
```sql
SELECT
    ROUND(100 * SUM(CASE
        WHEN first_orders.order_date = first_orders.customer_pref_delivery_date THEN 1
        ELSE 0
    END) / COUNT(*), 2) AS immediate_percentage
FROM (
    -- Get the first order (earliest order_date) for each customer
    SELECT customer_id, order_date, customer_pref_delivery_date
    FROM Delivery
    WHERE (customer_id, order_date) IN (
        SELECT customer_id, MIN(order_date)
        FROM Delivery
        GROUP BY customer_id
    )
) AS first_orders;
```

#### **Explanation:**
- **Subquery:** Retrieves the first order for each customer by selecting the minimum `order_date`.
- **Outer Query:**
  - Uses a `CASE` statement to check whether the `order_date` equals `customer_pref_delivery_date` (i.e., an immediate order).
  - Calculates the percentage of immediate first orders.
  - Rounds the result to 2 decimal places.

---

### **2️⃣ Window Function (SQL) Solution**
```sql
WITH RankedOrders AS (
    SELECT
        customer_id,
        order_date,
        customer_pref_delivery_date,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date) AS rn
    FROM Delivery
)
SELECT
    ROUND(100 * SUM(CASE WHEN order_date = customer_pref_delivery_date THEN 1 ELSE 0 END) / COUNT(*), 2) AS immediate_percentage
FROM RankedOrders
WHERE rn = 1;
```

#### **Explanation:**
- **CTE `RankedOrders`:**
  - Uses `ROW_NUMBER()` to rank each customer's orders by `order_date`.
- **Final SELECT:**
  - Filters to each customer's first order (`rn = 1`).
  - Computes the percentage of first orders that are immediate.
  - Rounds the result to 2 decimal places.
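
---

### **Test Data (Optional)**
To sanity-check either query, the example data can be loaded with a script along these lines (MySQL syntax, schema inferred from the problem statement):

```sql
-- Seed script for the example data (assumed schema, MySQL syntax).
CREATE TABLE Delivery (
    delivery_id                 INT PRIMARY KEY,
    customer_id                 INT,
    order_date                  DATE,
    customer_pref_delivery_date DATE
);

INSERT INTO Delivery (delivery_id, customer_id, order_date, customer_pref_delivery_date) VALUES
    (1, 1, '2019-08-01', '2019-08-02'),
    (2, 2, '2019-08-02', '2019-08-02'),
    (3, 1, '2019-08-11', '2019-08-12'),
    (4, 3, '2019-08-24', '2019-08-24'),
    (5, 3, '2019-08-21', '2019-08-22'),
    (6, 2, '2019-08-11', '2019-08-13'),
    (7, 4, '2019-08-09', '2019-08-09');

-- Expected output: immediate_percentage = 50.00.
```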

---

## **Pandas Solution (Python)**
```python
import pandas as pd

def immediate_food_delivery_percentage(delivery: pd.DataFrame) -> pd.DataFrame:
    # Ensure order_date and customer_pref_delivery_date are in datetime format
    delivery['order_date'] = pd.to_datetime(delivery['order_date'])
    delivery['customer_pref_delivery_date'] = pd.to_datetime(delivery['customer_pref_delivery_date'])

    # Get the first order date for each customer
    first_order = delivery.groupby('customer_id')['order_date'].min().reset_index()
    first_order = first_order.rename(columns={'order_date': 'first_order_date'})

    # Merge to get the corresponding preferred delivery date for the first order
    merged = pd.merge(delivery, first_order, on='customer_id', how='inner')
    first_orders = merged[merged['order_date'] == merged['first_order_date']]

    # Calculate immediate orders
    immediate_count = (first_orders['order_date'] == first_orders['customer_pref_delivery_date']).sum()
    total_customers = first_orders['customer_id'].nunique()
    immediate_percentage = round(100 * immediate_count / total_customers, 2)

    return pd.DataFrame({'immediate_percentage': [immediate_percentage]})

# Example usage:
# df = pd.read_csv('delivery.csv')
# print(immediate_food_delivery_percentage(df))
```

#### **Explanation:**
- **Convert Dates:**
  - Convert `order_date` and `customer_pref_delivery_date` to datetime for accurate comparison.
- **Determine First Order:**
  - Group by `customer_id` to find the minimum `order_date` as the first order.
  - Merge with the original DataFrame to obtain details of the first order.
- **Calculate Percentage:**
  - Count how many first orders are immediate (where `order_date` equals `customer_pref_delivery_date`).
  - Compute the percentage and round to 2 decimal places.

---

## **File Structure**
```
LeetCode1174/
β”œβ”€β”€ problem_statement.md        # Contains the problem description and constraints.
β”œβ”€β”€ sql_standard_solution.sql   # Contains the Standard MySQL solution.
β”œβ”€β”€ sql_window_solution.sql     # Contains the Window Function solution.
β”œβ”€β”€ pandas_solution.py          # Contains the Pandas solution.
β”œβ”€β”€ README.md                   # Overview of the problem and available solutions.
```

---

## **Useful Links**
- [LeetCode Problem 1174](https://leetcode.com/problems/immediate-food-delivery-ii/)
- [SQL GROUP BY Documentation](https://www.w3schools.com/sql/sql_groupby.asp)
- [SQL Window Functions](https://www.w3schools.com/sql/sql_window.asp)
- [Pandas GroupBy Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html)
- [Pandas Merge Documentation](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html)