Anyone may include a project in this list if it uses huggingface/candle.
Official Candle extensions for more specialized kernels, typically without backward equivalents but faster than raw Candle expressions
CublasLt matmul operation for the Candle ML framework with support for bias and Relu/Gelu fusing
Cross-platform browser ML framework leveraging WebGPU for inference with support for Whisper, Phi models, and quantization
Fused Layer Norm operation adapted from Flash Attention with support for dropout, residual, RMSNorm, and hidden dimensions up to 8192
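The RMSNorm variant such a fused kernel computes can be sketched in plain scalar Rust. This is a hypothetical illustration of the formula only, not the project's API, and it omits the dropout/residual fusion the kernel provides:

```rust
/// RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * weight_i
/// (scalar reference sketch; a fused kernel computes this in one pass on GPU)
fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

fn main() {
    println!("{:?}", rms_norm(&[1.0, 2.0, 3.0], &[1.0, 1.0, 1.0], 1e-5));
}
```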
Optimized rotary embeddings implementation adapted from vLLM project for efficient positional encoding in transformers
Blazingly fast LLM inference platform with all-in-one multimodal workflow support for text, vision, audio, speech, image, and embeddings
Efficient platform for inference and serving local LLMs with OpenAI compatible API server
Efficient and ergonomic LoRA implementation for Candle with out-of-the-box support for many models
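The core LoRA idea is to keep the frozen weight W and add a low-rank update scaled by alpha/r, so the adapted forward pass is y = Wx + (alpha/r)·B(Ax). A minimal scalar Rust sketch of that formula (hypothetical helper names, not this library's API):

```rust
/// LoRA forward for one linear layer: y = W x + (alpha / r) * B (A x)
/// Shapes: W is [out, in], A is [r, in], B is [out, r].
fn lora_forward(
    w: &[Vec<f32>], a: &[Vec<f32>], b: &[Vec<f32>],
    alpha: f32, x: &[f32],
) -> Vec<f32> {
    let scale = alpha / a.len() as f32; // a.len() == rank r
    // A x : the low-rank projection, shape [r].
    let ax: Vec<f32> = a.iter()
        .map(|row| row.iter().zip(x).map(|(p, q)| p * q).sum())
        .collect();
    w.iter().zip(b)
        .map(|(wrow, brow)| {
            let base: f32 = wrow.iter().zip(x).map(|(p, q)| p * q).sum();
            let delta: f32 = brow.iter().zip(&ax).map(|(p, q)| p * q).sum();
            base + scale * delta
        })
        .collect()
}

fn main() {
    // With B initialized to zero (standard LoRA init), y == W x.
    let w = vec![vec![1.0, 0.0], vec![0.0, 1.0]];
    let a = vec![vec![1.0, 1.0]];
    let b = vec![vec![0.0], vec![0.0]];
    println!("{:?}", lora_forward(&w, &a, &b, 16.0, &[3.0, 4.0]));
}
```

Because B starts at zero, the adapter is a no-op at initialization and only the small A/B matrices are trained.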
Sampling techniques for Candle including multinomial, top-k, top-p, logprobs, repeat penalty, and logit bias
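Top-k sampling, one of the techniques listed, keeps only the k largest logits, renormalizes with a temperature-scaled softmax, and draws from the result. A minimal self-contained Rust sketch (the uniform value is passed in for determinism; this is an illustration of the algorithm, not the crate's interface):

```rust
/// Top-k sampling: keep the k largest logits, softmax with temperature,
/// then sample by inverse CDF using a uniform value u in [0, 1).
fn sample_top_k(logits: &[f32], k: usize, temperature: f32, u: f32) -> usize {
    // Sort token indices by descending logit and keep the top k.
    let mut idx: Vec<usize> = (0..logits.len()).collect();
    idx.sort_by(|&a, &b| logits[b].partial_cmp(&logits[a]).unwrap());
    idx.truncate(k);
    // Temperature-scaled softmax over the kept logits (max-subtracted for stability).
    let max = logits[idx[0]];
    let exps: Vec<f32> = idx.iter()
        .map(|&i| ((logits[i] - max) / temperature).exp())
        .collect();
    let total: f32 = exps.iter().sum();
    // Inverse-CDF draw over the k candidates.
    let mut acc = 0.0;
    for (j, &i) in idx.iter().enumerate() {
        acc += exps[j] / total;
        if u < acc {
            return i;
        }
    }
    *idx.last().unwrap()
}

fn main() {
    // With k = 1 this degenerates to greedy decoding.
    println!("{}", sample_top_k(&[0.0, 3.0, 1.0], 1, 1.0, 0.5));
}
```

Top-p (nucleus) sampling works the same way except the cutoff is the smallest prefix of the sorted distribution whose cumulative probability exceeds p, rather than a fixed k.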
Extension library adding PyTorch functions not currently available in Candle
Runtime for quantized ML inference using WGPU to run models on any accelerator natively or in the browser
Collection of optimizers including SGD with momentum, AdaGrad, AdaDelta, AdaMax, Adam, AdamW, NAdam, RAdam, and RMSprop
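The simplest of these, SGD with momentum, maintains a velocity buffer per parameter: v ← μ·v + g, then p ← p − lr·v. A minimal Rust sketch of one update step (hypothetical function, shown only to illustrate the update rule):

```rust
/// One SGD-with-momentum step: v <- mu * v + g ; p <- p - lr * v
fn sgd_momentum_step(
    params: &mut [f32], velocity: &mut [f32], grads: &[f32],
    lr: f32, momentum: f32,
) {
    for ((p, v), g) in params.iter_mut().zip(velocity.iter_mut()).zip(grads) {
        *v = momentum * *v + g; // accumulate exponentially decayed gradient
        *p -= lr * *v;          // descend along the velocity
    }
}

fn main() {
    let mut p = [1.0_f32];
    let mut v = [0.0_f32];
    sgd_momentum_step(&mut p, &mut v, &[0.5], 0.1, 0.9);
    println!("{:?}", p); // first step equals plain SGD since v started at 0
}
```

Adaptive methods such as Adam and AdamW add per-parameter second-moment estimates on top of this basic scheme.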
The atoma-infer repository provides optimized infrastructure for serving Large Language Model (LLM) inference. It relies on highly optimized KV cache memory management through block pagination, as in PagedAttention and FlashAttention-2. The codebase is written mostly in Rust, enabling safe and highly optimized inference request scheduling.
Candle backend for the Burn deep learning framework, enabling Burn to leverage Candle's performance
Simple CUDA or CPU powered library for creating vector embeddings using Candle and Hugging Face models
Candle-based sentence embedder library and server with OpenAI-compatible API for sentence transformers
Distributed LLM and Stable Diffusion inference framework for mobile, desktop and server with support for LLaMA3
Port of Candle ML framework to Enflame GCU platform for deep learning inference on Enflame hardware accelerators
Large language model inference and chat service framework for Enflame GCU built on Candle-GCU and candle-vllm
Rust native library for Gen AI workflows providing seamless access to RAG pipelines and embeddings
RWKV models and inference implementation with quantization support in Candle for efficient recurrent neural networks
Integration layer between ONNX Runtime and Candle for hardware-accelerated inference with cross-platform support
Kyutai's Moshi speech-text foundation model implementation in Rust/Candle with int8 and bf16 quantization for full-duplex spoken dialogue
Text To Speech interface implemented in pure Rust using Candle over Axum with a Tauri/Leptos WASM frontend
LLM chat interface implemented in pure Rust using Candle over Axum WebSockets with an SQL database and Tauri/Leptos WASM frontend
candle-video is a Rust-native implementation of video generation models, targeting deployment scenarios where startup time, binary size, and memory efficiency matter. It provides inference for state-of-the-art text-to-video models without requiring a Python runtime.
Tutorial project demonstrating Llama inference on GPU using Candle with GGUF format support
Core infrastructure for confidential computing in distributed AI systems using Candle for inference operations
Demo projects showcasing Candle capabilities on GPU instances (AWS Deep Learning AMI)
24/7 local screen and audio recording application using Candle for OCR, voice activity detection, and AI-powered analysis
Basic RWKV implementation in Rust supporting 32, 8 and 4 bit quantized evaluation with PyTorch and SafeTensors model loading
Almost-pure Rust TTS engine with experimental Candle/Torch/Tract support for loading pre-trained SpeedySpeech models
Detailed tutorial showing how to convert PyTorch models to Candle
This project aims to provide Rust code that follows the incredible text, Build An LLM From Scratch by Sebastian Raschka. The book provides arguably the clearest step-by-step walkthrough for building a GPT-style LLM. Listed below are the titles of the book's seven chapters.