Stars
GIM: Learning Generalizable Image Matcher From Internet Videos (ICLR 2024 Spotlight)
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
An open source implementation of CLIP.
Refine high-quality datasets and visual AI models
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
SwanLab Local Visualization Python Package Plugin
SwanLab Official Documentation
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / U…
Use DINOv3’s powerful, self-supervised visual features + YOLOv12’s blazing-fast detection, all in one repo. Whether you have only a few hundred labeled images or a medium-sized dataset, DINOV3-YOLO…
[TGRS 2024] Towards Dense Moving Infrared Small Target Detection: New Datasets and Baseline
LimitIRSTD Dataset for ICPR 2024 Resource-Limited Infrared Small Target Detection Challenge
A Foundation Model for SAR Target Recognition
Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture
YOLO-FaceV2: A Scale and Occlusion Aware Face Detector
[ICRA 2025] Official repository for "UASTHN: Uncertainty-Aware Deep Homography Estimation for UAV Satellite-Thermal Geo-localization"
[RA-L 2024] Official repository for "STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery"
[IROS 2023] Official repository for "Long-range UAV Thermal Geo-localization with Satellite Imagery"
[NeurIPS 2025] Official repository for "ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation"
[DEIMv2] Real Time Object Detection Meets DINOv3
Official implementation of "Visual Instruction Pretraining for Domain-Specific Foundation Models"
A repository that modifies YOLOv5 to use various SwinTransformer blocks.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Distilled coarse part of LoFTR adapted for compatibility with TensorRT and embedded devices
fabio-sim / LightGlue-ONNX
Forked from cvg/LightGlue
ONNX-compatible LightGlue: Local Feature Matching at Light Speed. Supports TensorRT, OpenVINO
