DanielLin94144

A real-time and multilingual speech translation model

Python 128 14 Updated Feb 13, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 194,306 33,477 Updated Feb 14, 2026

PaperBanana: Automating Academic Illustration For AI Scientists

JavaScript 3,507 165 Updated Feb 2, 2026

fd-sds

Python 11 Updated Feb 2, 2026

SoTA open-source TTS

Python 22,610 2,964 Updated Feb 3, 2026

TTS model capable of streaming conversational audio in realtime.

Python 1,067 87 Updated Nov 29, 2025

Python 26 2 Updated Aug 21, 2025

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 7,681 962 Updated Feb 6, 2026

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 7,160 814 Updated Mar 5, 2025

Python 11 Updated Jan 21, 2026

PersonaPlex code.

Python 4,986 738 Updated Feb 9, 2026

Open-Source Frontier Voice AI

Python 23,243 2,543 Updated Feb 7, 2026

X-Talk is an open-source full-duplex cascaded spoken dialogue system framework enabling low-latency, interruptible, and human-like speech interaction with a lightweight, pure-Python, production-rea…

Python 176 17 Updated Feb 11, 2026

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 3,305 278 Updated Jan 5, 2026

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,859 118 Updated Sep 27, 2024

Post-training with Tinker

Python 2,836 319 Updated Feb 11, 2026

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Python 2,641 234 Updated Dec 30, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 18,775 2,318 Updated Dec 2, 2025

Realtime demo, Streaming and Finetuning code for CSM

Python 443 68 Updated Sep 17, 2025

Fast and memory-efficient exact attention

Python 22,249 2,382 Updated Feb 14, 2026

Qwen2.5-Omni fine-tuned on MNV-17 dataset for nonverbal vocalization recognition

HTML 28 1 Updated Nov 13, 2025

🟣 LLMs interview questions and answers to help you prepare for your next machine learning and data science interview in 2026.

879 106 Updated Feb 12, 2026

Official code of ICML 2025 paper "NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Prediction"

Python 135 22 Updated Oct 27, 2025

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 7,773 1,398 Updated Nov 28, 2025

Mini-Omni-Reasoner: a real-time speech reasoning framework that interleaves silent reasoning tokens with spoken response tokens (“thinking-in-speaking”), exploiting the LLM–audio throughput gap to …

162 19 Updated Aug 26, 2025

End-to-end realtime stack for connecting humans and AI

Go 17,069 1,738 Updated Feb 14, 2026

Cartesia Line SDK for voice agents.

Python 92 34 Updated Feb 13, 2026

Foundational model for human-like, expressive TTS

Python 4,191 690 Updated Jul 30, 2024

EMO-Reasoning: Benchmarking Emotional Reasoning Capabilities in Spoken Dialogue Systems

Python 6 Updated Aug 27, 2025

Python 4 Updated Jan 6, 2026