Starred repositories
Reference implementation for Deep Unsupervised Learning using Nonequilibrium Thermodynamics
An autonomous agent for deep financial research
LLM agents built for control. Designed for real-world use. Deployed in minutes.
A Distributed, Fault-Tolerant Message Queue from Scratch. Inspired by Apache Kafka
Tongyi Deep Research, the Leading Open-source Deep Research Agent
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
A curated list of insanely awesome libraries, packages and resources for systematic trading. Crypto, Stock, Futures, Options, CFDs, FX, and more | Quantitative trading | Quantitative investing
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
🐙 Guides, papers, lessons, notebooks and resources for prompt engineering, context engineering, RAG, and AI Agents.
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.
A framework for few-shot evaluation of language models.
Amazon Textract Code Samples
Repository to track the progress in Vietnamese Natural Language Processing, including the datasets and the current state-of-the-art for the most common Vietnamese NLP tasks.
A collection of Vietnamese Natural Language Processing resources.
A framework for training and evaluating AI models on a variety of openly available dialogue datasets.
Machine Learning Interviews from FAANG, Snapchat, LinkedIn. I have offers from Snapchat, Coupang, Stitchfix etc. Blog: mlengineer.io.
Code for "Multi-Time Attention Networks for Irregularly Sampled Time Series", ICLR 2021.
Code of the paper "Neighborhood Contrastive Learning Applied to Online Patient Monitoring"
BioBERT: a pre-trained biomedical language representation model for biomedical text mining


