
A curated list of Awesome projects, models, tutorials, and tools for Rust machine learning related to HuggingFace/Candle!

danielclough/awesome-candle


Awesome Candle


Anyone may include a project in this list if it uses huggingface/candle.

Inclusion in the list is not an endorsement.

Related Hugging Face External Libraries

candle-extensions

Official Candle extensions for more specialized kernels, typically without backward equivalents but faster than raw Candle expressions

candle-cublaslt

CublasLt matmul operation for the Candle ML framework with support for bias and Relu/Gelu fusing
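To illustrate what a fused epilogue buys, here is the unfused reference computation, y = gelu(x·W + b), written in plain Rust. A CublasLt epilogue performs all three steps in one kernel launch; this sketch is for illustration only and is not the candle-cublaslt API.

```rust
// Reference (unfused) computation that a CublasLt bias + GeLU epilogue
// collapses into a single kernel launch: y = gelu(x * W + b).

fn gelu(x: f32) -> f32 {
    // tanh approximation of GeLU, as commonly used by fused kernels
    0.5 * x
        * (1.0
            + ((2.0_f32 / std::f32::consts::PI).sqrt() * (x + 0.044_715 * x.powi(3))).tanh())
}

/// x: m x k (row-major), w: k x n, b: n  ->  m x n
fn matmul_bias_gelu(x: &[f32], w: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut y = vec![0.0_f32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = b[j];
            for p in 0..k {
                acc += x[i * k + p] * w[p * n + j];
            }
            y[i * n + j] = gelu(acc);
        }
    }
    y
}

fn main() {
    // 1x2 input times 2x2 identity weight, zero bias
    let y = matmul_bias_gelu(&[1.0, 2.0], &[1.0, 0.0, 0.0, 1.0], &[0.0, 0.0], 1, 2, 2);
    assert!((y[0] - gelu(1.0)).abs() < 1e-6);
    assert!((y[1] - gelu(2.0)).abs() < 1e-6);
}
```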

ratchet

Cross-platform browser ML framework leveraging WebGPU for inference with support for Whisper, Phi models, and quantization

candle-layer-norm

Fused Layer Norm operation adapted from Flash Attention with support for dropout, residual, RMSNorm, and hidden dimensions up to 8192
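For reference, the RMSNorm variant this kernel fuses computes y_i = x_i / sqrt(mean(x²) + eps) · weight_i. A plain-Rust sketch of that formula (not the candle-layer-norm API):

```rust
// What a fused RMSNorm kernel computes, written out in plain Rust:
// y_i = x_i / sqrt(mean(x^2) + eps) * weight_i.

fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(v, w)| v * inv_rms * w).collect()
}

fn main() {
    let y = rms_norm(&[3.0, 4.0], &[1.0, 1.0], 0.0);
    // mean(x^2) = (9 + 16) / 2 = 12.5, so y_0 = 3 / sqrt(12.5)
    assert!((y[0] - 3.0 / 12.5_f32.sqrt()).abs() < 1e-6);
}
```

The fused kernel's advantage is doing the reduction, normalization, dropout, and residual add in one pass over memory rather than several.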

candle-rotary

Optimized rotary embeddings implementation adapted from vLLM project for efficient positional encoding in transformers
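The underlying math: each channel pair (x1, x2) at position `pos` is rotated by θ_i = pos · base^(−2i/d). A plain-Rust sketch using interleaved pairing (pairing conventions differ between model families; this is illustration, not the candle-rotary API):

```rust
// Rotary positional embedding applied to one vector, interleaved-pair style:
// pair (x[2i], x[2i+1]) is rotated by theta_i = pos * base^(-2i/d).

fn apply_rotary(x: &mut [f32], pos: usize, base: f32) {
    let d = x.len();
    for i in 0..d / 2 {
        let theta = pos as f32 * base.powf(-2.0 * i as f32 / d as f32);
        let (sin, cos) = theta.sin_cos();
        let (x1, x2) = (x[2 * i], x[2 * i + 1]);
        x[2 * i] = x1 * cos - x2 * sin;
        x[2 * i + 1] = x1 * sin + x2 * cos;
    }
}

fn main() {
    let mut x = [1.0_f32, 0.0];
    apply_rotary(&mut x, 0, 10_000.0);
    // position 0 rotates by zero: the vector is unchanged
    assert!((x[0] - 1.0).abs() < 1e-6 && x[1].abs() < 1e-6);
}
```

Because the rotation angle depends only on position, relative offsets between tokens fall out of the dot product in attention.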

Third-Party Frameworks and Libraries

mistral.rs

Blazingly fast LLM inference platform with all-in-one multimodal workflow support for text, vision, audio, speech, image, and embeddings

candle-vllm

Efficient platform for inference and serving local LLMs with OpenAI compatible API server

candle-lora

Efficient and ergonomic LoRA implementation for Candle with out-of-the-box support for many models
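The core LoRA idea, sketched in plain Rust (this is the textbook formulation, not the candle-lora API): the frozen weight W is augmented with a trainable low-rank update, y = x·W + (α/r)·x·A·B, where A is k×r and B is r×n with r much smaller than k and n.

```rust
// Row-vector times row-major matrix: x (len k) * m (k x n) -> len n.
fn vecmat(x: &[f32], m: &[f32], k: usize, n: usize) -> Vec<f32> {
    (0..n)
        .map(|j| (0..k).map(|p| x[p] * m[p * n + j]).sum::<f32>())
        .collect()
}

/// LoRA forward pass: y = x*W + (alpha / r) * x*A*B.
fn lora_forward(
    x: &[f32], w: &[f32], a: &[f32], b: &[f32],
    k: usize, n: usize, r: usize, alpha: f32,
) -> Vec<f32> {
    let base = vecmat(x, w, k, n);       // frozen path
    let xa = vecmat(x, a, k, r);         // down-projection to rank r
    let xab = vecmat(&xa, b, r, n);      // up-projection back to n
    let scale = alpha / r as f32;
    base.iter().zip(&xab).map(|(y, d)| y + scale * d).collect()
}

fn main() {
    // With A = 0 the adapter is inert and the output equals x*W.
    let y = lora_forward(&[1.0, 2.0], &[1.0, 0.0, 0.0, 1.0], &[0.0, 0.0], &[0.0, 0.0], 2, 2, 1, 2.0);
    assert!((y[0] - 1.0).abs() < 1e-6 && (y[1] - 2.0).abs() < 1e-6);
}
```

Only A and B are trained, which is why LoRA fine-tuning touches a tiny fraction of the model's parameters.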

candle-sampling

Sampling techniques for Candle including multinomial, top-k, top-p, logprobs, repeat penalty, and logit bias
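As an example of one of these techniques, here is a minimal top-p (nucleus) filter over logits in plain Rust: keep the smallest set of tokens whose cumulative probability reaches p. A sketch of the idea, not the candle-sampling API:

```rust
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Indices of the smallest probability-sorted prefix whose mass >= p.
fn top_p_indices(logits: &[f32], p: f32) -> Vec<usize> {
    let probs = softmax(logits);
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].partial_cmp(&probs[a]).unwrap());
    let mut kept = Vec::new();
    let mut cum = 0.0;
    for i in idx {
        kept.push(i);
        cum += probs[i];
        if cum >= p {
            break;
        }
    }
    kept
}

fn main() {
    // One dominant logit: top-p with p = 0.5 keeps only that token.
    let kept = top_p_indices(&[10.0, 0.0, 0.0], 0.5);
    assert_eq!(kept, vec![0]);
}
```

Sampling then draws only from the kept set (after renormalizing), which trims the unreliable low-probability tail.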

candle-ext

Extension library adding PyTorch functions not currently available in Candle

floneum

Runtime for quantized ML inference using WGPU to run models on any accelerator natively or in the browser

candle-optimisers

Collection of optimizers including SGD with momentum, AdaGrad, AdaDelta, AdaMax, Adam, AdamW, NAdam, RAdam, and RMSprop
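To show what such an optimizer implements, here is a single AdamW update step in plain Rust, following the standard decoupled-weight-decay rule (a sketch of the algorithm, not the candle-optimisers API):

```rust
// Minimal AdamW: Adam's bias-corrected moment estimates plus weight decay
// applied directly to the parameter rather than folded into the gradient.

struct AdamW {
    lr: f32, beta1: f32, beta2: f32, eps: f32, weight_decay: f32,
    t: i32, m: Vec<f32>, v: Vec<f32>,
}

impl AdamW {
    fn new(n: usize) -> Self {
        AdamW {
            lr: 1e-3, beta1: 0.9, beta2: 0.999, eps: 1e-8, weight_decay: 0.01,
            t: 0, m: vec![0.0; n], v: vec![0.0; n],
        }
    }

    fn step(&mut self, params: &mut [f32], grads: &[f32]) {
        self.t += 1;
        for i in 0..params.len() {
            // exponential moving averages of the gradient and its square
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * grads[i];
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * grads[i] * grads[i];
            let m_hat = self.m[i] / (1.0 - self.beta1.powi(self.t));
            let v_hat = self.v[i] / (1.0 - self.beta2.powi(self.t));
            // decoupled weight decay: shrinks the parameter directly
            params[i] -= self.lr * (m_hat / (v_hat.sqrt() + self.eps) + self.weight_decay * params[i]);
        }
    }
}

fn main() {
    let mut p = [1.0_f32];
    let mut opt = AdamW::new(1);
    opt.step(&mut p, &[0.5]);
    // a positive gradient moves the parameter down
    assert!(p[0] < 1.0);
}
```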

atoma-infer

Optimized serving infrastructure for large language models, relying on block-paged KV cache memory management in the style of PagedAttention together with FlashAttention2 kernels, with inference request scheduling written largely in Rust for safety and performance
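The block-pagination idea behind PagedAttention can be sketched with a toy block table: the KV cache lives in fixed-size physical blocks, and a per-sequence table maps a token's logical position to a (physical block, offset) pair, so sequences need not be stored contiguously. Illustration only; the names here are hypothetical, not atoma-infer code.

```rust
// Toy block table in the spirit of PagedAttention's KV cache paging.

const BLOCK_SIZE: usize = 16;

struct BlockTable {
    // physical block ids, in logical order for one sequence
    blocks: Vec<usize>,
}

impl BlockTable {
    /// Map a logical token position to (physical block id, offset in block).
    fn locate(&self, token_pos: usize) -> (usize, usize) {
        (self.blocks[token_pos / BLOCK_SIZE], token_pos % BLOCK_SIZE)
    }
}

fn main() {
    // Logical blocks 0 and 1 happen to live in physical blocks 7 and 2.
    let table = BlockTable { blocks: vec![7, 2] };
    assert_eq!(table.locate(3), (7, 3));
    assert_eq!(table.locate(20), (2, 4));
}
```

Because blocks are allocated on demand and freed when a sequence finishes, memory fragmentation stays low even with many concurrent requests of different lengths.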

burn-candle

Candle backend for the Burn deep learning framework, enabling Burn to leverage Candle's performance

candle_embed

Simple CUDA or CPU powered library for creating vector embeddings using Candle and Hugging Face models

glowrs

Candle-based sentence embedder library and server with OpenAI-compatible API for sentence transformers

cake

Distributed LLM and Stable Diffusion inference framework for mobile, desktop and server with support for LLaMA3

candle-gcu

Port of Candle ML framework to Enflame GCU platform for deep learning inference on Enflame hardware accelerators

candle-vllm-gcu

Large language model inference and chat service framework for Enflame GCU built on Candle-GCU and candle-vllm

rag-toolchain

Rust native library for Gen AI workflows providing seamless access to RAG pipelines and embeddings


candle-rwkv

RWKV models and inference implementation with quantization support in Candle for efficient recurrent neural networks

ort-candle

Integration layer between ONNX Runtime and Candle for hardware-accelerated inference with cross-platform support

Standalone Applications

moshi-candle

Kyutai's Moshi speech-text foundation model implementation in Rust/Candle with int8 and bf16 quantization for full-duplex spoken dialogue

vibevoice-rs

Text To Speech interface implemented in pure Rust using Candle over Axum with a Tauri/Leptos WASM frontend

fireside-chat

LLM chat interface implemented in pure Rust using Candle over Axum WebSockets with an SQL database and Tauri/Leptos WASM frontend

candle-video

Rust-native implementation of video generation models targeting deployments where startup time, binary size, and memory efficiency matter, providing inference for state-of-the-art text-to-video models without a Python runtime

llama-candle-rs

Tutorial project demonstrating Llama inference on GPU using Candle with GGUF format support

atoma-node

Core infrastructure for confidential computing in distributed AI systems using Candle for inference operations

rust-candle-demos

Demo projects showcasing Candle capabilities on GPU instances (AWS Deep Learning AMI)

screenpipe

24/7 local screen and audio recording application using Candle for OCR, voice activity detection, and AI-powered analysis

smolrsrwkv

Basic RWKV implementation in Rust supporting 32, 8 and 4 bit quantized evaluation with PyTorch and SafeTensors model loading

xd-tts

Almost-pure Rust TTS engine with experimental Candle/Torch/Tract support for loading pre-trained SpeedySpeech models

Tutorials

Convert PyTorch to Candle

Detailed tutorial showing how to convert PyTorch models to Candle

LLMs from scratch - Rust

Rust code following Sebastian Raschka's Build a Large Language Model (From Scratch), arguably the clearest step-by-step walkthrough of building a GPT-style LLM, organized around the book's seven chapters
