View Simba2017's full-sized avatar
  • china,jiangsu


🚀🚀 "Large Models": Train a small 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python 36,904 4,382 Updated Jan 7, 2026

Building an MoE large language model as an individual: a complete walkthrough from pretraining to DPO

Python 2,218 162 Updated Dec 30, 2025

Community maintained hardware plugin for vLLM on Ascend

Python 1,553 719 Updated Jan 11, 2026

Fully open reproduction of DeepSeek-R1

Python 25,802 2,407 Updated Nov 24, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,407 337 Updated Jan 5, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1 Updated Mar 6, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,901 312 Updated Mar 10, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,950 288 Updated May 15, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,297 916 Updated Jan 7, 2026

The PYthoN General UnIt Test geNerator is a test-generation tool for Python

Python 1,347 94 Updated Dec 9, 2025
Python 4,281 465 Updated Jul 31, 2025

Notes on experience using LLaMA-Factory

Jupyter Notebook 41 5 Updated Aug 26, 2024
Python 1,067 49 Updated Jan 10, 2026

🙌 OpenHands: AI-Driven Development

Python 66,471 8,232 Updated Jan 11, 2026

verl: Volcano Engine Reinforcement Learning for LLMs

Python 18,216 2,997 Updated Jan 11, 2026

Minimal reproduction of DeepSeek R1-Zero

Python 12,588 1,544 Updated Apr 24, 2025

Everything about the SmolLM and SmolVLM family of models

Python 3,548 256 Updated Nov 20, 2025

Simple RL training for reasoning

Python 3,824 283 Updated Dec 23, 2025

🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…

Python 65,262 6,577 Updated Nov 11, 2025

21 Lessons, Get Started Building with Generative AI

Jupyter Notebook 104,961 56,002 Updated Jan 5, 2026

DataComp for Language Models

HTML 1,405 129 Updated Sep 9, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,865 587 Updated May 3, 2024

The Open Cookbook for Top-Tier Code Large Language Model

Python 1,984 115 Updated Dec 8, 2024

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,725 313 Updated Jan 9, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 82,722 12,435 Updated Jan 10, 2026

✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024

Python 184 11 Updated Aug 16, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,256 994 Updated Jul 1, 2024

Code for the paper "Language Models are Unsupervised Multitask Learners"

Python 24,544 5,849 Updated Aug 14, 2024

llama3 implementation one matrix multiplication at a time

Jupyter Notebook 15,232 1,288 Updated May 23, 2024