- Palo Alto, CA
- https://jianguoz.github.io/
- @JianguoZhang3
Stars
τ²-Bench-Verified is a corrected and verified version of the original τ²-bench benchmark. This release addresses issues discovered in the original dataset where task definitions, expected actions, …
Some reading notes edited in LaTeX.
MCP-based Agent Deep Evaluation System
Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
τ²-Bench: Evaluating Conversational Agents in a Dual-Control Environment
An extremely fast Python package and project manager, written in Rust.
Designing Multi-Agent Systems with Zero Supervision
xLAM: A Family of Large Action Models to Empower AI Agent Systems
An extensible benchmark for evaluating large language models on planning
A reading list on LLM-based Synthetic Data Generation 🔥
Chat with any codebase in under two minutes | Fully local or via third-party APIs
Official code for Paper "Mantis: Multi-Image Instruction Tuning" [TMLR 2024 Best Paper]
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
🐫 CAMEL: The first and best multi-agent framework. Finding the Scaling Law of Agents. https://www.camel-ai.org
AndroidWorld is an environment and benchmark for autonomous agents
m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback" (loading sketch after this list)
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings [NeurIPS 2023 Oral]
[ACL 2024 Findings] Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing, and logging (usage sketch after this list). [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
Chat language model that can use tools and interpret the results
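
From the preference-data entry above: a minimal loading sketch, assuming the `datasets` library and the `Anthropic/hh-rlhf` dataset ID on the Hugging Face Hub.

```python
# Minimal sketch: load Anthropic's HH-RLHF human preference data.
# Assumes the `datasets` library and the `Anthropic/hh-rlhf` Hub dataset ID.
from datasets import load_dataset

ds = load_dataset("Anthropic/hh-rlhf", split="train")

# Each record pairs a preferred ("chosen") and a dispreferred ("rejected")
# human-assistant dialogue for the same prompt.
example = ds[0]
print(example["chosen"][:200])
print(example["rejected"][:200])
```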
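
From the LiteLLM entry above: a minimal sketch of one OpenAI-format call routed through LiteLLM. The model name is an arbitrary example, and a provider API key is assumed to be set in the environment.

```python
# Minimal sketch: one OpenAI-format call routed through LiteLLM.
# Assumes a provider API key (e.g. OPENAI_API_KEY) is set in the environment;
# the model name below is an arbitrary example.
from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize tool calling in one sentence."}],
)
print(response.choices[0].message.content)
```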