Institute of Information Engineering, Beijing

Stars
Code and implementations for the ACL 2025 paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.
The paper list accompanying the 86-page SCIS cover paper "The Rise and Potential of Large Language Model Based Agents: A Survey" by Zhiheng Xi et al.
Use PEFT or full-parameter training for CPT/SFT/DPO/GRPO on 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…
A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
The awesome collection of OpenClaw skills: 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry. 🦞
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
[CVPR 2025] A Comprehensive Benchmark for Document Parsing and Evaluation
Online read-aloud tool for New Concept English: click any sentence to hear it, with continuous playback; supports EN / EN+CN / CN modes.
Practice English: one keystroke, one step forward.
A curated collection of stunning images and their prompts generated with Seedream 4.0.
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
Multilingual Document Layout Parsing in a Single Vision-Language Model
The simplest, fastest repository for training/finetuning medium-sized GPTs.
MMaDA - Open-Sourced Multimodal Large Diffusion Language Models (dLLMs with block diffusion, mixed-CoT, unified RL)
Seed-Coder is a family of lightweight open-source code LLMs comprising base, instruct and reasoning models, developed by ByteDance Seed.
Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning, achieving state-of-the-art performance on 38 out of 60 public benchmarks.