This repository contains Jupyter notebooks, code snippets, and practical examples featured in my articles published on Medium and Towards Data Science. The goal is to bridge the gap between theory and implementation with production-ready examples.
- Plan: Building a Mini Query Engine in Python. The chapter-by-chapter plan for a draft-first series on implementing a mini query engine in Python.
- Practical Introduction to Polars. High-performance data wrangling with Python and Polars.
- Improving Code Quality During Data Transformation with Polars. Best practices for clean, testable, and reusable Polars code.
- ClickHouse as Part of the ETL/ELT Process. Using ClickHouse for efficient analytics in modern data stacks.
- The Hidden Treasure of _delta_log. Unlocking performance insights from Delta Lake internals.
- How to Update ClickHouse Tables in Production Without Downtime. Techniques for safe schema evolution in ClickHouse.

Topics covered:
- Polars, ClickHouse, Delta Lake, Temporal, Apache Arrow
- ETL/ELT design, Data Lake Architecture, Data Quality & Observability
- Performance optimization, Data modeling, Streaming & Batch processing
Each folder in this repository corresponds to a specific article and contains fully reproducible code.
This repository serves as a hands-on companion to my writing. Dive in, run the examples, and build better data systems.
More content: