Stars
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Efficient, check-pointed data loading for deep learning with massive data sets.
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphic…
A tiny scalar-valued autograd engine and a neural net library on top of it with PyTorch-like API
Python tool for converting files and office documents to Markdown.
D2 is a modern diagram scripting language that turns text to diagrams.
Starting code for the GildedRose Refactoring Kata in many programming languages.
A sophisticated exploration of dbt macro capabilities, pushing the boundaries of what's possible with dbt's macro system.
NumPy aware dynamic Python compiler using LLVM
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Software design principles for machine learning applications
🦆 A curated list of awesome DuckDB resources
A faster, open-license alternative to Microsoft TrueSkill
Container auto-configurations for Spring Boot based integration tests
Run, mock and test fake Snowflake databases locally.
Notes on books I read, talks I watch, articles I study, and papers I love
Open-source IDE for exploring and testing APIs (lightweight alternative to Postman/Insomnia)
The data-validation toolkit for enhanced dbt (data build tool) PR review
21 lessons to get started building with generative AI
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement …
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
Statistical Rethinking Course for Jan-Mar 2023





