Starred repositories
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Making large AI models cheaper, faster and more accessible
You like pytorch? You like micrograd? You love tinygrad! ❤️
SGLang is a high-performance serving framework for large language models and multimodal models.
Machine Learning Engineering Open Book
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
🏔️ A knowledge base of academic papers in the social sciences, economics, mathematics, game theory, philosophy, and systems engineering from National Taiwan University, the National University of Singapore, Waseda University, the University of Tokyo, Academia Sinica (Taiwan), and key Chinese universities and research institutions.
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
FlashInfer: Kernel Library for LLM Serving
Performance-optimized AI inference on your GPUs: unlock it by selecting and tuning the optimal inference engine for your model.
A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, to provide better performance…
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Several simple examples showing popular neural network toolkits calling custom CUDA operators.
Automatically collect PoC or EXP code from GitHub by CVE ID.
FlagGems is an operator library for large language models implemented in the Triton Language.
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
Visual Studio Code project/compile_commands.json generator for Linux kernel sources and out-of-tree modules
GLake: optimizing GPU memory management and IO transmission.
