PARE is a Python research framework for evaluating proactive AI assistants through active user simulation. Built on top of Meta-ARE, it provides a realistic mobile-phone simulation environment where a proactive assistant must observe user behavior, infer goals, and intervene helpfully -- without being asked.
- Paper: PARE: Simulating Active Users to Evaluate Proactive Assistants
- Documentation: deepakn97.github.io/pare
Proactive assistants need to decide when to help and what to do -- all from passively observing user activity. Evaluating this requires simulating realistic users in realistic environments, which is what PARE provides:
- 9 domain apps modeled as finite state machines: Apartment, Cab, Calendar, Contacts, Email, Messaging, Note, Reminder, and Shopping
- 2 core system apps: `HomeScreenSystemApp` for navigation (open, switch, go home) and `PAREAgentUserInterface` for proposal management (accept/reject)
- 143 benchmark scenarios spanning multi-app orchestration, goal inference, and intervention timing
- Observe-Execute agent architecture with configurable models per stage
- Oracle validation to automatically verify task completion
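To make the finite-state-machine idea concrete, here is a minimal toy sketch of a screen-based app. All names here (`ToyEmailApp`, the screens, the tools) are invented for illustration and are not PARE's actual API:

```python
# Toy sketch of an app modeled as a finite state machine.
# The current screen is the FSM state; each tool is a transition.
class ToyEmailApp:
    def __init__(self):
        self.screen = "inbox"  # current FSM state
        # state -> {tool_name: next_state}
        self.transitions = {
            "inbox": {"open_compose": "compose", "open_email": "reading"},
            "compose": {"send": "inbox", "discard": "inbox"},
            "reading": {"back": "inbox"},
        }

    def available_tools(self):
        """Only tools valid on the current screen are exposed."""
        return sorted(self.transitions[self.screen])

    def use(self, tool):
        if tool not in self.transitions[self.screen]:
            raise ValueError(f"{tool!r} not available on {self.screen!r}")
        self.screen = self.transitions[self.screen][tool]

app = ToyEmailApp()
print(app.available_tools())  # tools exposed on the inbox screen
app.use("open_compose")
print(app.screen)
```

Because invalid transitions raise, a simulated user can only take actions a real user could take on the current screen.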
PARE orchestrates a two-agent simulation: a user agent that navigates the phone realistically, and a proactive agent that observes and intervenes.
The key insight is asymmetric interfaces. The user agent sees only the tools available on the current screen (just like a real user tapping through apps), while the proactive agent gets flat API access to all apps for efficient task execution. This forces realistic user behavior without handicapping the assistant.
Sending a message requires navigating through screens for the user (right), but a single API call for the assistant (left).
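The asymmetry can be sketched in a few lines of Python. This is a toy illustration with invented names, not PARE's real interfaces: the user agent's path stands in for a sequence of screen-scoped tool calls, while the assistant gets one flat API call.

```python
# Toy illustration of asymmetric interfaces (invented names; not PARE's API).
class ToyMessagingApp:
    def __init__(self):
        self.outbox = []

    # Flat API call, as exposed to the proactive agent:
    def send_message(self, to, body):
        self.outbox.append((to, body))

# Screen-scoped navigation, as the user agent experiences it:
def user_agent_send(app, to, body):
    # Each entry stands in for one screen-scoped tool call.
    trace = ["home.open_messaging", "messaging.select_contact",
             "compose.type_body", "compose.tap_send"]
    app.send_message(to, body)  # effect of the final tap
    return trace

app = ToyMessagingApp()
steps = user_agent_send(app, "Alice", "Running late")  # four UI steps
app.send_message("Bob", "On my way")                   # one API call
print(len(steps), len(app.outbox))  # 4 2
```

Both paths produce the same state change; only the number of actions differs, which is exactly what makes user behavior realistic without slowing down the assistant.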
- Python 3.12+
- uv package manager
```bash
git clone git@github.com:deepakn97/pare.git
cd pare
make install
```

Copy the example environment file and fill in your API keys:

```bash
cp .env.example .env
```

Edit `.env` with the keys for the providers you plan to use:
```bash
# Required for GPT models (gpt-5, gpt-5-mini, gpt-4o, etc.)
OPENAI_API_KEY=your_openai_api_key_here

# Required for Hugging Face model access
HF_TOKEN=your_hf_token_here

# Required for AWS Bedrock models (llama-4-scout, llama-4-maverick, etc.)
AWS_ACCESS_KEY_ID=aws_access_key_id
AWS_SECRET_ACCESS_KEY=secret_access_key_id
AWS_REGION_NAME="us-east-1"
# Or use the new Bedrock API key:
AWS_BEARER_TOKEN_BEDROCK=new_aws_api_key_here

# Scenario configuration (defaults to benchmark/)
PARE_SCENARIOS_DIR=benchmark

# Path to environment augmentation data (relative to project root)
ENV_AUGMENTATION_DATA_PATH="data/metaare_augmentation_data.json"
```

Run a single scenario:

```bash
pare benchmark sweep -s email_notification -om gpt-5 -em gpt-5
```

Run the full benchmark split:

```bash
pare benchmark sweep --split full -om gpt-5 -em gpt-5 --runs 3
```

Model pairs are zipped (not crossed). Each `--observe-model` (`-om`) is paired with the corresponding `--execute-model` (`-em`):
```bash
pare benchmark sweep --split full \
  -om gpt-5 -om claude-4.5-sonnet \
  -em gpt-5 -em claude-4.5-sonnet \
  --runs 3
```

Results are saved in a structured directory under `results/`:
```
results/
  {experiment}_{split}_user_{model}_mt_{turns}_umi_..._omi_..._emi_.../
    obs_{model}_exec_{model}_..._result.json
    obs_{model}_exec_{model}_..._report.txt
    combined_report.txt
```
Use `pare benchmark sweep --help` for the full list of configuration options.
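The zipped (rather than crossed) pairing of model lists described above can be sketched in plain Python. This is a toy illustration of the pairing semantics, not PARE's actual code:

```python
from itertools import product

# Model lists as passed via repeated -om / -em flags.
observe_models = ["gpt-5", "claude-4.5-sonnet"]
execute_models = ["gpt-5", "claude-4.5-sonnet"]

# Zipped: positional pairing -> 2 sweep configurations.
zipped = list(zip(observe_models, execute_models))
print(zipped)  # [('gpt-5', 'gpt-5'), ('claude-4.5-sonnet', 'claude-4.5-sonnet')]

# A cross product would instead run every combination -> 4 configurations.
crossed = list(product(observe_models, execute_models))
print(len(crossed))  # 4
```

So to sweep mixed pairs (e.g. observe with one model, execute with another), list them at matching positions rather than expecting every combination to run.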
```bash
pare annotation sample -t <traces_dir> -n <size>  # Sample decision points for human eval
pare annotation launch                            # Launch annotation UI
pare cache status                                 # Show cache location and entry count
pare cache invalidate                             # Clear cached results
```

> [!NOTE]
> macOS users: the `--executor-type process` option may fail due to a known Python multiprocessing issue with the `spawn` start method on macOS. Use the default `--executor-type thread` instead.
Full API reference and architecture docs are available at deepakn97.github.io/pare.
See CONTRIBUTING.md for development setup, code style guidelines, and how to submit pull requests.
This project is licensed under the terms of the MIT License.
If you use PARE in your research, please cite:
@misc{nathani2026proactiveagentresearchenvironment,
title={Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants},
author={Deepak Nathani and Cheng Zhang and Chang Huan and Jiaming Shan and Yinfei Yang and Alkesh Patel and Zhe Gan and William Yang Wang and Michael Saxon and Xin Eric Wang},
year={2026},
eprint={2604.00842},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2604.00842},
}
