Perfect ✅ Here's GitHub-style SQL Exam Prep Cheatsheet that’s geared toward Data Engineering Certification.
It includes:
- 📌 Core SQL Concepts (syntax + quick examples)
- 📊 Data Engineering–Specific SQL Tasks
- 💡 Real-world business case problems (with step-by-step solutions)
- 🛠 Performance tips for big datasets
- 📝 Practice challenge section
(Prepare for Datacamp's Data Engineer Certification Exam)
🚀 A complete SQL quick-reference for Data Engineers preparing for DataCamp / Data Engineering certification. Includes syntax, examples, and real-world challenges.
-- Select specific columns
SELECT column1, column2
FROM table;
-- Filter rows
SELECT *
FROM employees
WHERE department = 'Sales'
AND salary > 50000;
-- Sort results
ORDER BY salary DESC, name ASC;
-- Limit output
LIMIT 10;-- INNER JOIN: Only matching rows
SELECT o.order_id, c.customer_name
FROM orders o
INNER JOIN customers c
ON o.customer_id = c.customer_id;
-- LEFT JOIN: All from left, match from right
SELECT c.customer_name, o.order_id
FROM customers c
LEFT JOIN orders o
ON c.customer_id = o.customer_id;
-- Multi-table join
SELECT o.order_id, c.customer_name, p.product_name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN products p ON o.product_id = p.product_id;💡 Tip: Use table aliases (o, c, p) to keep queries clean.
SELECT department, COUNT(*) AS emp_count, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000
ORDER BY avg_salary DESC;-- Rank top products by sales per category
SELECT product_id, category,
RANK() OVER (PARTITION BY category ORDER BY SUM(sales) DESC) AS rank_in_cat
FROM sales_data
GROUP BY product_id, category;-- Common Table Expression
WITH high_value_orders AS (
SELECT order_id, customer_id, total_amount
FROM orders
WHERE total_amount > 1000
)
SELECT customer_id, COUNT(*) AS big_orders
FROM high_value_orders
GROUP BY customer_id;
-- Subquery example
SELECT name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);-- Handle NULLs
SELECT COALESCE(middle_name, 'N/A') AS middle_name_fixed
FROM employees;
-- Remove spaces
UPDATE employees
SET name = TRIM(name);
-- Change formats
SELECT TO_CHAR(order_date, 'YYYY-MM') AS order_month
FROM orders;SELECT order_id,
EXTRACT(YEAR FROM order_date) AS year,
EXTRACT(MONTH FROM order_date) AS month,
CURRENT_DATE AS today
FROM orders;- ✅ Use
EXPLAINto check query plans. - ✅ Select only needed columns (avoid
SELECT *in production). - ✅ Index join keys & filter columns.
- ✅ Use CTEs for step-by-step logic in complex pipelines.
Q1: Find the top 3 products by revenue in 2024. Q2: Identify customers who made no purchases in the last 6 months. Q3: Get the month-over-month growth rate of orders.
Dataset: orders, order_details, products, customers
Goal: Return:
- Customer name
- Total orders in 2024
- Total spend in 2024 Sort by total spend DESC.
💡 Hint:
SELECT c.customer_name,
COUNT(o.order_id) AS total_orders,
SUM(od.quantity * p.price) AS total_spend
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_details od ON o.order_id = od.order_id
JOIN products p ON od.product_id = p.product_id
WHERE EXTRACT(YEAR FROM o.order_date) = 2024
GROUP BY c.customer_name
ORDER BY total_spend DESC;Do you want me to add that section?