- University of Chang'an
- Xi'an
- https://github.com/JackieJoe1021
Stars
An opinionated list of Python frameworks, libraries, tools, and resources
All Algorithms implemented in Python
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
The world's simplest facial recognition API for Python and the command line
ChatGLM-6B: An Open Bilingual Dialogue Language Model
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.
Open-Sora: Democratizing Efficient Video Production for All
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
Netflix-level subtitle cutting, translation, alignment, and even dubbing: a one-click, fully automated AI video subtitle team
Wan: Open and Advanced Large-Scale Video Generative Models
ChatGLM2-6B: An Open Bilingual Chat LLM
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
An open-source tool-augmented conversational language model from Fudan University
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
ImageBind: One Embedding Space to Bind Them All
Effortless data labeling with AI support from Segment Anything and other awesome models.
Real-Time and Accurate Full-Body Multi-Person Pose Estimation & Tracking System
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Real-Time High-Resolution Background Matting