Stars
Research work aimed at addressing the problem of modeling infinite-length context
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Findings of ACL'25 DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
[arXiv] Discrete Diffusion in Large Language and Multimodal Models: A Survey
This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive A…
An Autonomous Agentic Framework for Reflective PowerPoint Generation
[NeurIPS 2025] 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Ongoing research project for continual pre-training of LLMs (dense mode)
Ongoing research project for Mixture-of-Experts models
Official PyTorch implementation for ICLR2025 paper "Scaling up Masked Diffusion Models on Text"
Repo housing the open-source code for the Ai2 Scholar QA app and its corresponding library
Repository for "TESS-2: A Large-Scale, Generalist Diffusion Language Model"
Official PyTorch implementation for "Large Language Diffusion Models"
Toolkit for linearizing PDFs for LLM datasets/training
Source code for "Discrete Dictionary-based Decomposition Layer for Structured Representation Learning"
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
[ICLR2025] DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models
OLMoE: Open Mixture-of-Experts Language Models
A simple, performant and scalable Jax LLM!
Modeling, training, eval, and inference code for OLMo