v4if

🎯

Focusing

RiseAI-Sys

41 followers · 185 following

Stars

43 stars written in Python

jingyaogong / minimind

🚀🚀 [LLM] Train a 26M-parameter GPT completely from scratch in just 2 hours! 🌏

Python · 38,262 stars · 4,586 forks · Updated Jan 18, 2026

microsoft / agent-lightning

The absolute trainer to light up AI agents.

Python · 11,886 stars · 973 forks · Updated Jan 27, 2026

GeeeekExplorer / nano-vllm

Nano vLLM

Python · 11,210 stars · 1,468 forks · Updated Nov 3, 2025

vwxyzjn / cleanrl

High-quality single-file implementations of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python · 8,979 stars · 974 forks · Updated Jul 8, 2025

OpenPipe / ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Python · 8,452 stars · 696 forks · Updated Jan 29, 2026

kernc / backtesting.py

🔎 📈 🐍 💰 Backtest trading strategies in Python.

Python · 7,856 stars · 1,388 forks · Updated Dec 20, 2025

p-christ / Deep-Reinforcement-Learning-Algorithms-with-PyTorch

PyTorch implementations of deep reinforcement learning algorithms and environments

Python · 5,921 stars · 1,211 forks · Updated Jul 25, 2024

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python · 5,606 stars · 494 forks · Updated Oct 27, 2025

xlite-dev / Awesome-LLM-Inference

📚 A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc. 🎉

Python · 4,950 stars · 337 forks · Updated Jan 18, 2026

KellerJordan / modded-nanogpt

NanoGPT (124M) in 2 minutes

Python · 4,507 stars · 593 forks · Updated Jan 29, 2026

skyzh / tiny-llm

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python · 3,733 stars · 262 forks · Updated Dec 18, 2025

THUDM / slime

slime is an LLM post-training framework for RL Scaling.

Python · 3,577 stars · 471 forks · Updated Jan 29, 2026

XinJingHao / DRL-Pytorch

Clean, Robust, and Unified PyTorch implementations of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python · 3,257 stars · 389 forks · Updated Jun 11, 2025

seungeunrho / minimalRL

Implementations of basic RL algorithms in minimal lines of code! (PyTorch-based)

Python · 3,132 stars · 485 forks · Updated Apr 22, 2023

alibaba / ROLL

An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models

Python · 2,728 stars · 207 forks · Updated Jan 29, 2026

huggingface / picotron

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python · 2,037 stars · 163 forks · Updated Aug 26, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python · 1,442 stars · 70 forks · Updated Feb 8, 2025

NVIDIA-NeMo / RL

Scalable toolkit for efficient model reinforcement

Python · 1,270 stars · 231 forks · Updated Jan 29, 2026

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python · 1,199 stars · 56 forks · Updated Aug 27, 2025

volcengine / veScale

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python · 923 stars · 54 forks · Updated Nov 27, 2025

TideDra / lmm-r1

Extend OpenRLHF to support LMM RL training for reproduction of DeepSeek-R1 on multimodal tasks.

Python · 839 stars · 53 forks · Updated May 14, 2025

Osilly / Vision-R1

[ICLR2026] This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incen…

Python · 753 stars · 20 forks · Updated Jan 26, 2026