131 changes: 131 additions & 0 deletions LeetCode SQL 50 Solution/1075. Project Employees I/readme.md
@@ -0,0 +1,131 @@

# 🏆 Project Employees I - LeetCode 1075

## 📌 Problem Statement
You are given two tables: **Project** and **Employee**.

### Project Table
| Column Name | Type |
| ----------- | ---- |
| project_id | int |
| employee_id | int |

- `(project_id, employee_id)` is the primary key of this table.
- `employee_id` is a foreign key referencing the `Employee` table.

### Employee Table
| Column Name | Type |
| ---------------- | ------- |
| employee_id | int |
| name | varchar |
| experience_years | int |

- `employee_id` is the primary key.
- `experience_years` is guaranteed to be **NOT NULL**.

The task is to **return the average experience years of all employees for each project, rounded to 2 decimal places**.

---

## 📊 Example 1:
### Input:
**Project Table**
| project_id | employee_id |
| ---------- | ----------- |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 4 |

**Employee Table**
| employee_id | name | experience_years |
| ----------- | ------ | ---------------- |
| 1 | Khaled | 3 |
| 2 | Ali | 2 |
| 3 | John | 1 |
| 4 | Doe | 2 |

### Output:
| project_id | average_years |
| ---------- | ------------- |
| 1 | 2.00 |
| 2 | 2.50 |

### Explanation:
- **Project 1:** `(3 + 2 + 1) / 3 = 2.00`
- **Project 2:** `(3 + 2) / 2 = 2.50`

---

## 🖥 SQL Solutions

### 1️⃣ Standard MySQL Solution
#### Explanation:
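- We **LEFT JOIN** `Project` to `Employee` on `employee_id`. Because `employee_id` is a foreign key, every project row has a matching employee, so an inner join returns the same result.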
- We **calculate the average** of `experience_years` for each `project_id`.
- We **round** the result to **two decimal places**.

```sql
SELECT
    p.project_id,
    ROUND(AVG(e.experience_years), 2) AS average_years
FROM project AS p
LEFT JOIN employee AS e
    ON p.employee_id = e.employee_id
GROUP BY p.project_id;
```

---

### 2️⃣ Window Function (SQL) Solution
#### Explanation:
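- A **window function** computes `AVG(experience_years)` for every joined row, **partitioned** by `project_id`, so each row of a project carries the same average.
- `SELECT DISTINCT` then collapses those repeated rows to one row per project.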

```sql
SELECT DISTINCT
    p.project_id,
    ROUND(AVG(e.experience_years) OVER (PARTITION BY p.project_id), 2) AS average_years
FROM project AS p
JOIN employee AS e
    ON p.employee_id = e.employee_id;
```
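
For readers more at home in pandas, the window-function idea can be sketched with `groupby(...).transform("mean")`, which returns one value per input row (much like `AVG(...) OVER (PARTITION BY ...)`), followed by `drop_duplicates()` in place of `DISTINCT`. This is only an illustrative analog, not part of the original solutions; the function name `project_average_experience_window` is made up for this sketch, and the data is Example 1 from above.

```python
import pandas as pd

def project_average_experience_window(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical helper: a pandas analog of the window-function query.
    df = project.merge(employee, on="employee_id")
    # transform("mean") keeps one row per input row, like AVG(...) OVER (PARTITION BY project_id).
    df["average_years"] = (
        df.groupby("project_id")["experience_years"].transform("mean").round(2)
    )
    # drop_duplicates() plays the role of SELECT DISTINCT.
    return df[["project_id", "average_years"]].drop_duplicates().reset_index(drop=True)

# Example 1 input, copied from the tables above.
project = pd.DataFrame({
    "project_id": [1, 1, 1, 2, 2],
    "employee_id": [1, 2, 3, 1, 4],
})
employee = pd.DataFrame({
    "employee_id": [1, 2, 3, 4],
    "name": ["Khaled", "Ali", "John", "Doe"],
    "experience_years": [3, 2, 1, 2],
})

window_result = project_average_experience_window(project, employee)
print(window_result)
```

The extra de-duplication step is the price of the window approach: the plain `GROUP BY` version produces one row per project directly.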

---

## 🐍 Pandas Solution (Python)
#### Explanation:
- Both tables arrive as Pandas **DataFrames**.
- We merge the tables on `employee_id`.
- We group by `project_id` and compute the mean.
- We round the output to 2 decimal places.

```python
import pandas as pd

def project_average_experience(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:
    # Attach each employee's experience to the project rows
    df = project.merge(employee, on="employee_id")
    # Average per project, rounded to 2 decimal places
    result = df.groupby("project_id")["experience_years"].mean().round(2).reset_index()
    result.columns = ["project_id", "average_years"]
    return result
```
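
As a quick sanity check, the function can be run against the Example 1 tables. The snippet below repeats the function so it runs on its own; the variable name `example_result` is just for this sketch.

```python
import pandas as pd

def project_average_experience(project: pd.DataFrame, employee: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the solution above, repeated so this snippet is self-contained.
    df = project.merge(employee, on="employee_id")
    result = df.groupby("project_id")["experience_years"].mean().round(2).reset_index()
    result.columns = ["project_id", "average_years"]
    return result

# Example 1 input, copied from the tables above.
project = pd.DataFrame({
    "project_id": [1, 1, 1, 2, 2],
    "employee_id": [1, 2, 3, 1, 4],
})
employee = pd.DataFrame({
    "employee_id": [1, 2, 3, 4],
    "name": ["Khaled", "Ali", "John", "Doe"],
    "experience_years": [3, 2, 1, 2],
})

example_result = project_average_experience(project, employee)
print(example_result)
```

The averages should match the expected output table: 2.00 for project 1 and 2.50 for project 2 (pandas prints them as `2.0` and `2.5`).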

---

## 📁 File Structure
```
📂 Project-Employees-I
│── 📜 README.md
│── 📜 solution.sql
│── 📜 solution_window.sql
│── 📜 solution_pandas.py
│── 📜 test_cases.sql
```

---

## 🔗 Useful Links
- 📖 [LeetCode Problem](https://leetcode.com/problems/project-employees-i/)
- 📚 [SQL Joins Explanation](https://www.w3schools.com/sql/sql_join.asp)
- 🐍 [Pandas Documentation](https://pandas.pydata.org/docs/)

@@ -0,0 +1,135 @@

# 📊 User Activity for the Past 30 Days I - LeetCode 1141

## 📌 Problem Statement
You are given the **Activity** table that records user activities on a social media website.

### Activity Table
| Column Name | Type |
| ------------- | ---- |
| user_id | int |
| session_id | int |
| activity_date | date |
| activity_type | enum |

- The `activity_type` column is an ENUM of **('open_session', 'end_session', 'scroll_down', 'send_message')**.
- Each session belongs to exactly **one user**.
- The table **may have duplicate rows**.

### Task:
Find the **daily active user count** for a period of **30 days ending 2019-07-27 inclusively**.
- A user is considered **active on a given day** if they made at least **one activity**.
- Ignore days with **zero active users**.

---

## 📊 Example 1:
### Input:
**Activity Table**
| user_id | session_id | activity_date | activity_type |
| ------- | ---------- | ------------- | ------------- |
| 1 | 1 | 2019-07-20 | open_session |
| 1 | 1 | 2019-07-20 | scroll_down |
| 1 | 1 | 2019-07-20 | end_session |
| 2 | 4 | 2019-07-20 | open_session |
| 2 | 4 | 2019-07-21 | send_message |
| 2 | 4 | 2019-07-21 | end_session |
| 3 | 2 | 2019-07-21 | open_session |
| 3 | 2 | 2019-07-21 | send_message |
| 3 | 2 | 2019-07-21 | end_session |
| 4 | 3 | 2019-06-25 | open_session |
| 4 | 3 | 2019-06-25 | end_session |

### Output:
| day | active_users |
| ---------- | ------------ |
| 2019-07-20 | 2 |
| 2019-07-21 | 2 |

### Explanation:
- **2019-07-20**: Users **1 and 2** were active.
- **2019-07-21**: Users **2 and 3** were active.
- **Days with zero active users are ignored**.

---

## 🖥 SQL Solutions

### 1️⃣ Standard MySQL Solution
#### Explanation:
- **Filter records** for the last **30 days** (ending on `2019-07-27`).
- Use `COUNT(DISTINCT user_id)` to count **unique active users per day**.
- **Days with zero active users** need no extra handling: `GROUP BY` only emits dates that appear in the filtered rows.

```sql
SELECT
    activity_date AS day,
    COUNT(DISTINCT user_id) AS active_users
FROM
    Activity
WHERE
    DATEDIFF('2019-07-27', activity_date) < 30
    AND DATEDIFF('2019-07-27', activity_date) >= 0
GROUP BY activity_date;
```

---

### 2️⃣ Alternative Solution Using `BETWEEN`
#### Explanation:
- This solution filters the date range using `BETWEEN` instead of `DATEDIFF`.

```sql
SELECT
    activity_date AS day,
    COUNT(DISTINCT user_id) AS active_users
FROM
    Activity
WHERE
    activity_date BETWEEN DATE_SUB('2019-07-27', INTERVAL 29 DAY) AND '2019-07-27'
GROUP BY activity_date;
```
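
The two filters are easy to get wrong by one day, so it can help to check the window arithmetic outside SQL. The sketch below mirrors both conditions with Python's `datetime` module; the helper names `in_window_datediff` and `in_window_between` are invented for this illustration and are not part of either solution.

```python
from datetime import date, timedelta

end_day = date(2019, 7, 27)
# Mirrors DATE_SUB('2019-07-27', INTERVAL 29 DAY): 29 days back gives a 30-day inclusive window.
start_day = end_day - timedelta(days=29)

def in_window_datediff(d: date) -> bool:
    # Mirrors: DATEDIFF('2019-07-27', activity_date) < 30 AND ... >= 0
    diff = (end_day - d).days
    return 0 <= diff < 30

def in_window_between(d: date) -> bool:
    # Mirrors: activity_date BETWEEN start_day AND end_day (both ends inclusive)
    return start_day <= d <= end_day

# Compare the two filters over a range that comfortably covers the window.
candidates = [date(2019, 6, 1) + timedelta(days=i) for i in range(90)]
filters_agree = all(in_window_datediff(d) == in_window_between(d) for d in candidates)
window_days = [d for d in candidates if in_window_between(d)]

print(start_day, len(window_days), filters_agree)
```

Subtracting 29 days rather than 30 is what keeps the window at exactly 30 calendar days once both endpoints are counted.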

---

## 🐍 Pandas Solution (Python)
#### Explanation:
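- Filter activity records to the **30-day window** from `2019-06-28` through `2019-07-27`, inclusive.
- **Group by `activity_date`** and count **unique `user_id`s** with `nunique()`, which also absorbs duplicate rows.
- **Days with zero active users** never appear, because `groupby` only produces dates present in the filtered data.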

```python
import pandas as pd

def daily_active_users(activity: pd.DataFrame) -> pd.DataFrame:
    # Filter data within the last 30 days (ending on '2019-07-27')
    filtered = activity[(activity["activity_date"] >= "2019-06-28") & (activity["activity_date"] <= "2019-07-27")]

    # Group by day and count unique users
    result = filtered.groupby("activity_date")["user_id"].nunique().reset_index()

    # Rename columns
    result.columns = ["day", "active_users"]
    return result
```
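
To see the function on the Example 1 data, here is a self-contained sketch. It assumes `activity_date` is a datetime column and, as a small variation on the solution above, compares against explicit `pd.Timestamp` bounds instead of date strings; the names `start`, `end`, and `activity_result` exist only in this example.

```python
import pandas as pd

def daily_active_users(activity: pd.DataFrame) -> pd.DataFrame:
    # Variation for illustration: explicit Timestamp bounds for the 30-day window.
    start, end = pd.Timestamp("2019-06-28"), pd.Timestamp("2019-07-27")
    filtered = activity[(activity["activity_date"] >= start) & (activity["activity_date"] <= end)]
    # nunique() counts each user once per day, even with several activities or duplicate rows.
    result = filtered.groupby("activity_date")["user_id"].nunique().reset_index()
    result.columns = ["day", "active_users"]
    return result

# Example 1 input, copied from the Activity table above.
activity = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4],
    "session_id": [1, 1, 1, 4, 4, 4, 2, 2, 2, 3, 3],
    "activity_date": pd.to_datetime([
        "2019-07-20", "2019-07-20", "2019-07-20",
        "2019-07-20", "2019-07-21", "2019-07-21",
        "2019-07-21", "2019-07-21", "2019-07-21",
        "2019-06-25", "2019-06-25",
    ]),
    "activity_type": [
        "open_session", "scroll_down", "end_session",
        "open_session", "send_message", "end_session",
        "open_session", "send_message", "end_session",
        "open_session", "end_session",
    ],
})

activity_result = daily_active_users(activity)
print(activity_result)
```

User 4's activity on 2019-06-25 falls before the window starts, so only 2019-07-20 and 2019-07-21 should appear, each with 2 active users.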

---

## 📁 File Structure
```
📂 User-Activity-Past-30-Days
│── 📜 README.md
│── 📜 solution.sql
│── 📜 solution_between.sql
│── 📜 solution_pandas.py
│── 📜 test_cases.sql
```

---

## 🔗 Useful Links
- 📖 [LeetCode Problem](https://leetcode.com/problems/user-activity-for-the-past-30-days-i/)
- 📚 [SQL Date Functions](https://www.w3schools.com/sql/sql_dates.asp)
- 🐍 [Pandas Documentation](https://pandas.pydata.org/docs/)
