
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 6,961 404 Updated Dec 31, 2025

Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models

863 36 Updated Dec 4, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 13,536 1,605 Updated Dec 17, 2025

Bring portraits to life!

Python 17,596 1,828 Updated Nov 16, 2025

A MemAgent framework that can be extrapolated to 3.5M, along with a training framework for RL training of any agent workflow.

Python 852 58 Updated Jul 31, 2025

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 3,824 310 Updated Aug 14, 2025

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 4,299 369 Updated Dec 4, 2025

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Python 11,900 878 Updated Jul 18, 2024

Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Python 5,728 313 Updated Jan 12, 2026

Text Normalization & Inverse Text Normalization

Python 718 95 Updated Dec 1, 2025

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 18,922 2,107 Updated Jan 12, 2026

PyTorch implementation of some attentions for Deep Learning Researchers.

Python 547 73 Updated Mar 4, 2022

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

Jupyter Notebook 447 31 Updated Apr 13, 2025

More relighting!

Python 8,340 525 Updated Feb 20, 2025

AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures. It generates the cleaned image/video files without any loss of resolution, and runs locally with no third-party API required.

Python 9,197 1,152 Updated Dec 3, 2025

APISR: Anime Production Inspired Real-World Anime Super-Resolution (CVPR 2024)

Python 1,068 68 Updated Oct 16, 2025

[WIP] Layer Diffusion for WebUI (via Forge)

Python 4,109 351 Updated Aug 30, 2024

Transparent Image Layer Diffusion using Latent Transparency

2,186 36 Updated Jun 16, 2024

Unofficial implementation of I2VGenXL for ComfyUI

Python 114 10 Updated May 22, 2024

[ICML 2024] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (RPG)

Jupyter Notebook 1,841 102 Updated Feb 1, 2025

PhotoMaker [CVPR 2024]

Jupyter Notebook 10,108 823 Updated Oct 31, 2024

Video editing with Python

Python 14,227 2,016 Updated Sep 25, 2025

Translate manga/image: one-click translation of text in all kinds of images. https://cotrans.touhou.ai/ (no longer working)

Python 9,203 897 Updated Dec 17, 2025

A state-of-the-art-level open visual language model | multimodal pretrained model

Python 6,715 449 Updated May 29, 2024

Official implementation of AnimateDiff.

Python 11,981 1,037 Updated Jul 31, 2024

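This resource supports the author's Python image processing articles on CSDN. It mainly contains Python implementations of algorithms for image processing, image recognition, image classification, and related tasks. I hope you find it helpful; let's keep at it together.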

Jupyter Notebook 2,088 474 Updated Apr 12, 2024

The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.

Jupyter Notebook 6,405 412 Updated Jun 28, 2024

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code

Python 2,659 300 Updated Oct 18, 2024

Focus on prompting and generating

Python 47,497 7,746 Updated Dec 1, 2025