Stars
[NeurIPS 2025] This is the official repository for VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set
Codebase for the paper "Projecting Assumptions: The Duality Between Sparse Autoencoders and Concept Geometry"
Official repo of "MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents". It can be used to evaluate a GUI agent in a hierarchical manner across multiple platforms, includi…
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People
Commonly used Bilibili API calls. Supports video, anime series, user, channel, audio, and other features. Original repository: https://github.com/MoyuScript/bilibili-api
This repository contains the implementation of all methods evaluated in the paper "Learning a Thousand Tasks in a Day". We provide model architectures, training scripts, and deployment examples.
OpenICL is an open-source framework to facilitate research, development, and prototyping of in-context learning.
Easily create large video datasets from video URLs
Explainable Multimodal Emotion Reasoning (EMER), OV-MER (ICML), and AffectGPT (ICML, Oral)
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Implementation of π₀, the robotic foundation model architecture proposed by Physical Intelligence
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
NVIDIA Isaac GR00T N1.6 - A Foundation Model for Generalist Robots.
openvla / openvla
Forked from TRI-ML/prismatic-vlms
OpenVLA: An open-source vision-language-action model for robotic manipulation.
An open source implementation of CLIP.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A simulation platform for versatile Embodied AI research and development.
Deep Reinforcement Learning for Autonomous Drone Navigation
This is the repository for the collection of Graph-based Deep Learning for Communication Networks.
PyBullet Gymnasium environments for single and multi-agent reinforcement learning of quadcopter control
Python implementation of DDQN multi-UAV data harvesting
Code for the ACL 2023 paper: Reasoning Implicit Sentiment with Chain-of-Thought Prompting
