Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.
Awesome Unified Multimodal Models
Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"
MMR1: Enhancing Multimodal Reasoning with Variance-Aware Sampling and Open Resources
[NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies
Code for the paper "Harnessing Webpage UIs for Text-Rich Visual Understanding"
Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.
Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]
AIDC-AI / AutoGPTQ
Forked from AutoGPTQ/AutoGPTQ. An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Agentic ADK is an Agent application development framework launched by Alibaba International AI Business, based on Google-ADK and Ali-LangEngine.
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors
Use PEFT or full-parameter training to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
[ECCV 2024] Official Implementation of An Incremental Unified Framework for Small Defect Inspection
Programmer's guide to cooking at home (content in Simplified Chinese only).
[ICML 2024] | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
[NeurIPS 2024] MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities.
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
When do we not need larger vision models?
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"