Stars
Smoke Detection with Deep Learning.
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
SD-Trainer. LoRA & Dreambooth training scripts & GUI using kohya-ss's trainer, for diffusion models.
Using Low-rank adaptation to quickly fine-tune diffusion models.
The definitive Web UI for local AI, with powerful features and easy setup.
Must-have resource for anyone who wants to experiment with and build on the OpenAI vision API 🔥
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models
A curated list of recent diffusion models for video generation, editing, and various other applications.
[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
A PyTorch implementation of EfficientNet
Implementation of EfficientNet model. Keras and TensorFlow Keras.
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding b…
Models and examples built with TensorFlow
A resource for learning about Machine Learning & Deep Learning
Best Practices, code samples, and documentation for Computer Vision.
Real-Time 3D Semantic Reconstruction using Kimera Semantics and Image Segmentation in PyTorch
Real-Time 3D Semantic Reconstruction from 2D data
Implementation for PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation (CVPR 2020)
[ICCV 2023] Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
Command-line program to download videos from YouTube.com and other video sites
A list of Image Aesthetics papers and resources.
Classification of Blurred and Non-Blurred Images
Code samples used on cloud.google.com