|
| 1 | + |
| 2 | +# 📊 User Activity for the Past 30 Days I - LeetCode 1141 |
| 3 | + |
| 4 | +## 📌 Problem Statement |
| 5 | +You are given the **Activity** table that records user activities on a social media website. |
| 6 | + |
| 7 | +### Activity Table |
| 8 | +| Column Name | Type | |
| 9 | +| ------------- | ---- | |
| 10 | +| user_id | int | |
| 11 | +| session_id | int | |
| 12 | +| activity_date | date | |
| 13 | +| activity_type | enum | |
| 14 | + |
| 15 | +- The `activity_type` column is an ENUM of **('open_session', 'end_session', 'scroll_down', 'send_message')**. |
| 16 | +- Each session belongs to exactly **one user**. |
| 17 | +- The table **may have duplicate rows**. |
| 18 | + |
| 19 | +### Task: |
| 20 | +Find the **daily active user count** for a period of **30 days ending 2019-07-27 inclusively**. |
| 21 | +- A user is considered **active on a given day** if they made at least **one activity**. |
| 22 | +- Ignore days with **zero active users**. |
| 23 | + |
| 24 | +--- |
| 25 | + |
| 26 | +## 📊 Example 1: |
| 27 | +### Input: |
| 28 | +**Activity Table** |
| 29 | +| user_id | session_id | activity_date | activity_type | |
| 30 | +| ------- | ---------- | ------------- | ------------- | |
| 31 | +| 1 | 1 | 2019-07-20 | open_session | |
| 32 | +| 1 | 1 | 2019-07-20 | scroll_down | |
| 33 | +| 1 | 1 | 2019-07-20 | end_session | |
| 34 | +| 2 | 4 | 2019-07-20 | open_session | |
| 35 | +| 2 | 4 | 2019-07-21 | send_message | |
| 36 | +| 2 | 4 | 2019-07-21 | end_session | |
| 37 | +| 3 | 2 | 2019-07-21 | open_session | |
| 38 | +| 3 | 2 | 2019-07-21 | send_message | |
| 39 | +| 3 | 2 | 2019-07-21 | end_session | |
| 40 | +| 4 | 3 | 2019-06-25 | open_session | |
| 41 | +| 4 | 3 | 2019-06-25 | end_session | |
| 42 | + |
| 43 | +### Output: |
| 44 | +| day | active_users | |
| 45 | +| ---------- | ------------ | |
| 46 | +| 2019-07-20 | 2 | |
| 47 | +| 2019-07-21 | 2 | |
| 48 | + |
| 49 | +### Explanation: |
| 50 | +- **2019-07-20**: Users **1 and 2** were active. |
| 51 | +- **2019-07-21**: Users **2 and 3** were active. |
| 52 | +- **Days with zero active users are ignored**. |
| 53 | + |
| 54 | +--- |
| 55 | + |
| 56 | +## 🖥 SQL Solutions |
| 57 | + |
| 58 | +### 1️⃣ Standard MySQL Solution |
| 59 | +#### Explanation: |
| 60 | +- **Filter records** for the last **30 days** (ending on `2019-07-27`). |
| 61 | +- Use `COUNT(DISTINCT user_id)` to count **unique active users per day**. |
| 62 | +- Ignore **days with zero active users**. |
| 63 | + |
| 64 | +```sql |
| 65 | +SELECT |
| 66 | + activity_date AS day, |
| 67 | + COUNT(DISTINCT user_id) AS active_users |
| 68 | +FROM |
| 69 | + Activity |
| 70 | +WHERE |
| 71 | + DATEDIFF('2019-07-27', activity_date) < 30 |
| 72 | + AND DATEDIFF('2019-07-27', activity_date) >= 0 |
| 73 | +GROUP BY activity_date; |
| 74 | +``` |
| 75 | + |
| 76 | +--- |
| 77 | + |
| 78 | +### 2️⃣ Alternative Solution Using `BETWEEN` |
| 79 | +#### Explanation: |
| 80 | +- This solution filters the date range using `BETWEEN` instead of `DATEDIFF`. |
| 81 | + |
| 82 | +```sql |
| 83 | +SELECT |
| 84 | + activity_date AS day, |
| 85 | + COUNT(DISTINCT user_id) AS active_users |
| 86 | +FROM |
| 87 | + Activity |
| 88 | +WHERE |
| 89 | + activity_date BETWEEN DATE_SUB('2019-07-27', INTERVAL 29 DAY) AND '2019-07-27' |
| 90 | +GROUP BY activity_date; |
| 91 | +``` |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## 🐍 Pandas Solution (Python) |
| 96 | +#### Explanation: |
| 97 | +- Filter activity records for the **last 30 days**. |
| 98 | +- **Group by `activity_date`** and count **unique `user_id`s**. |
| 99 | +- **Ignore days with zero active users**. |
| 100 | + |
| 101 | +```python |
| 102 | +import pandas as pd |
| 103 | + |
| 104 | +def daily_active_users(activity: pd.DataFrame) -> pd.DataFrame: |
| 105 | + # Filter data within the last 30 days (ending on '2019-07-27') |
| 106 | + filtered = activity[(activity["activity_date"] >= "2019-06-28") & (activity["activity_date"] <= "2019-07-27")] |
| 107 | + |
| 108 | + # Group by day and count unique users |
| 109 | + result = filtered.groupby("activity_date")["user_id"].nunique().reset_index() |
| 110 | + |
| 111 | + # Rename columns |
| 112 | + result.columns = ["day", "active_users"] |
| 113 | + return result |
| 114 | +``` |
| 115 | + |
| 116 | +--- |
| 117 | + |
| 118 | +## 📁 File Structure |
| 119 | +``` |
| 120 | +📂 User-Activity-Past-30-Days |
| 121 | +│── 📜 README.md |
| 122 | +│── 📜 solution.sql |
| 123 | +│── 📜 solution_between.sql |
| 124 | +│── 📜 solution_pandas.py |
| 125 | +│── 📜 test_cases.sql |
| 126 | +``` |
| 127 | + |
| 128 | +--- |
| 129 | + |
| 130 | +## 🔗 Useful Links |
| 131 | +- 📖 [LeetCode Problem](https://leetcode.com/problems/user-activity-for-the-past-30-days-i/) |
| 132 | +- 📚 [SQL Date Functions](https://www.w3schools.com/sql/sql_dates.asp) |
| 133 | +- 🐍 [Pandas Documentation](https://pandas.pydata.org/docs/) |
| 134 | + |
| 135 | +## Let me know if you need any changes! 🚀 |
0 commit comments