JumpingRain

Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.

Python 293 16 Updated Dec 21, 2025

This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]

Python 536 48 Updated Dec 20, 2025

MTEB: Massive Text Embedding Benchmark

Python 3,054 529 Updated Jan 3, 2026

A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.

Python 446 14 Updated Dec 2, 2025

Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay

Python 140 9 Updated May 29, 2025

[MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning.

Python 339 26 Updated Oct 18, 2025

OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.

Jupyter Notebook 337 7 Updated Jun 1, 2025

Awesome Unified Multimodal Models

1,013 31 Updated Aug 17, 2025

Training VLM agents with multi-turn reinforcement learning

Python 363 40 Updated Jan 2, 2026

An Open-source RL System from ByteDance Seed and Tsinghua AIR

Python 1,694 76 Updated May 11, 2025

This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reas…

Python 747 20 Updated Sep 10, 2025

MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning

Python 765 28 Updated Sep 7, 2025

[NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies

Python 224 4 Updated Apr 14, 2025
Python 79 7 Updated Mar 11, 2025

🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"

Python 951 48 Updated Mar 19, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,989 2,942 Updated Jan 3, 2026

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Jupyter Notebook 151 7 Updated Jan 13, 2025

The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]

Python 24 1 Updated Dec 28, 2024

Train transformer language models with reinforcement learning.

Python 16,846 2,401 Updated Jan 2, 2026

This repo contains the code for "MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]

Python 77 7 Updated Jul 1, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 66,757 12,350 Updated Jan 4, 2026

A collection of OCR-related datasets

202 7 Updated Sep 7, 2022

This repository contains datasets and baselines for benchmarking Chinese text recognition.

Python 502 51 Updated Dec 2, 2022

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …

Python 11,945 1,099 Updated Jan 3, 2026

One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks

Python 3,519 467 Updated Dec 31, 2025

Arena-Hard-Auto: An automatic LLM benchmark.

Python 977 138 Updated Jun 21, 2025

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python 1,427 84 Updated Sep 22, 2025
Jupyter Notebook 10 Updated Feb 16, 2024

A flexible and efficient training framework for large-scale alignment tasks

Python 447 39 Updated Oct 23, 2025

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 39,347 4,780 Updated Jun 2, 2025