ywdong

Starred repositories

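🎬 Kaka Subtitle Assistant | VideoCaptioner - An LLM-based intelligent subtitle assistant covering the full pipeline of video subtitle generation, sentence segmentation, correction, and subtitle translation. An LLM-powered tool for easy and efficient video subtitling.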

Python 12,618 1,009 Updated Dec 11, 2025

Anthropic's Interactive Prompt Engineering Tutorial

Jupyter Notebook 28,016 2,674 Updated Jul 11, 2024

A free, open source, and extensible speech-to-text application that works completely offline.

TypeScript 10,069 694 Updated Jan 4, 2026

We write your reusable computer vision tools. 💜

Python 36,260 3,069 Updated Dec 22, 2025

Reference PyTorch implementation and models for DINOv3

Jupyter Notebook 9,157 677 Updated Nov 20, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,794 390 Updated Dec 31, 2025

[NeurIPS 2025] Official implementation of "XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation".

Python 617 46 Updated Oct 22, 2025

A unified inference and post-training framework for accelerated video generation.

Python 2,895 233 Updated Jan 4, 2026

An inference and training framework for multiple image input in Flux Kontext dev

Jupyter Notebook 429 31 Updated Sep 1, 2025

A PyTorch native platform for training generative AI models

Python 4,921 658 Updated Jan 4, 2026

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.

Python 5,148 393 Updated Apr 21, 2025

Collection of ComfyUI Workflows

23 1 Updated Jul 24, 2025

The ultimate training toolkit for finetuning diffusion models

Python 8,726 1,017 Updated Jan 2, 2026
Python 1,596 214 Updated Jan 3, 2026

[ICCV 2025] Official implementations for paper: VACE: All-in-One Video Creation and Editing

Python 3,547 243 Updated Oct 17, 2025

This node preserves image quality by selectively merging only the changed regions from AI-generated edits back into the original image.

Python 91 7 Updated Aug 12, 2025
Python 382 11 Updated Jul 13, 2025

Simple, Efficient, and Effective Negative Guidance in Few-Step Image Generation Models By Value Sign Flip

Jupyter Notebook 36 1 Updated Jan 4, 2026

Repo for SeedVR2 & SeedVR (CVPR2025 Highlight)

Python 880 55 Updated Jul 2, 2025

Use Claude Code as the foundation for coding infrastructure, allowing you to decide how to interact with the model while enjoying updates from Anthropic.

TypeScript 24,646 1,938 Updated Jan 4, 2026

ComfyUI implementation for NAG

Python 297 27 Updated Nov 3, 2025

LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨ (ICCV 2025 Highlight)

Python 807 50 Updated Jul 24, 2025

Context engineering is the new vibe coding - it's the way to actually make AI coding assistants work. Claude Code is the best for this so that's what this repo is centered around, but you can apply…

Python 12,047 2,554 Updated Nov 16, 2025

[NeurIPS' 2025] JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Python 762 29 Updated Jan 1, 2026

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 89,646 10,347 Updated Jan 4, 2026

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,708 313 Updated Nov 28, 2025

[WWW 2025] Official PyTorch Code for "CTR-Driven Advertising Image Generation with Multimodal Large Language Models"

Python 61 5 Updated Aug 3, 2025

[AAAI 2026] VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Python 359 12 Updated Mar 26, 2025

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

Python 1,615 83 Updated Oct 29, 2025

CLIP+MLP Aesthetic Score Predictor

Python 1,240 112 Updated Jul 1, 2024