Stars
Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025]
MTEB: Massive Text Embedding Benchmark
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay
[MathCoder, MathCoder-VL] Family of LLMs/LMMs for mathematical reasoning.
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
Awesome Unified Multimodal Models
Training VLM agents with multi-turn reinforcement learning
An Open-source RL System from ByteDance Seed and Tsinghua AIR
This is the first paper to explore how to effectively use R1-like RL for MLLMs and introduce Vision-R1, a reasoning MLLM that leverages cold-start initialization and RL training to incentivize reas…
MM-EUREKA: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning
[NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
verl: Volcano Engine Reinforcement Learning for LLMs
DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]
Train transformer language models with reinforcement learning.
This repo contains the code for "MEGA-Bench Scaling Multimodal Evaluation to over 500 Real-World Tasks" [ICLR 2025]
A high-throughput and memory-efficient inference and serving engine for LLMs
This repository contains datasets and baselines for benchmarking Chinese text recognition.
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Arena-Hard-Auto: An automatic LLM benchmark.
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
A flexible and efficient training framework for large-scale alignment tasks
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.


