- University of Chang'an
- Xi'an
- https://github.com/JackieJoe1021
Stars
PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.
Open3D: A Modern Library for 3D Data Processing
OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
3D detection and tracking viewer (visualization) for the KITTI and Waymo datasets
[ICLR 2025] Official implementation of Posterior-Mean Rectified Flow: Towards Minimum MSE Photo-Realistic Image Restoration
The source code and pre-trained models for Motion Matters: Neural Motion Transfer for Better Camera Physiological Sensing (WACV 2024, Oral).
rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
This repository demonstrates instance segmentation using YOLOv8 (smart) model on Triton Inference Server
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.
SkyReels-V2: Infinite-length Film Generative model
[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
This repository features an Energy Optimization System (EOS) that optimizes energy distribution and usage for batteries, heat pumps, and household devices. It includes predictive models for electricity pr…
Official Algorithm Codebase for the Paper "BEHAVIOR Robot Suite: Streamlining Real-World Whole-Body Manipulation for Everyday Household Activities"
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Official PyTorch Implementation of SMIRK: 3D Facial Expressions through Analysis-by-Neural-Synthesis (CVPR 2024)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
A GPU cluster manager that configures and orchestrates inference engines like vLLM and SGLang for high-performance AI model deployment.
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
Integrate the DeepSeek API into popular software
A privacy-first, self-hosted, fully open source personal knowledge management software, written in TypeScript and Go.
50+ tutorials and implementations for Generative AI Agent techniques, from basic conversational bots to complex multi-agent systems.
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
[CVPR 2025 Oral] SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding