This repository runs automated conversations between two frontier language models over multiple turns and saves both:
- a readable transcript
- a structured state file for transparency and partial reproducibility
It currently supports:
- Anthropic Claude models
- Alibaba GLM models via DashScope
- Alibaba Qwen models via DashScope
- Direct Z.AI GLM models
- Google Gemini models
- Moonshot Kimi models
- OpenRouter Kimi and GLM models
- OpenAI GPT models
- xAI Grok models
The project is shared as an experiment framework rather than a polished product.
For each run, the program:
- starts a conversation between Model A and Model B
- gives Model A an initial facilitator message
- alternates turns between the two models
- optionally asks each model a final facilitator question
- prints the exchange live in the terminal
- saves a `.txt` transcript for readability
- saves a `.state.json` snapshot with structured run state
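The turn-alternation loop described above can be sketched roughly as follows. The names here (`run_conversation`, `call_model`) are illustrative, not the repo's actual API:

```python
# Hypothetical sketch of the alternating turn loop: Model A gets the
# facilitator's kickoff, then the two models take turns replying.
def run_conversation(call_model, start_message, max_turns):
    """call_model(speaker, history) -> reply string; speaker is 'a' or 'b'."""
    history = [("facilitator", start_message)]
    transcript = []
    for turn in range(max_turns):
        speaker = "a" if turn % 2 == 0 else "b"
        reply = call_model(speaker, history)
        history.append((speaker, reply))
        transcript.append(f"Model {speaker.upper()}: {reply}")
    return transcript
```

The real loop also handles streaming output, state snapshots, and the optional final facilitator question.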
Each run produces files in transcripts/:
- `...txt`
- `...state.json`
The transcript is the readable artifact.
The state file is the transparency artifact. It includes:
- run status and phase
- system prompts
- run configuration
- provider/model metadata
- usage and cost metrics
- `convo_a` and `convo_b`, the full API-formatted conversation snapshots each model would see at that checkpoint
- git commit hash
- error details for failed runs
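A state file can be inspected with a few lines of Python. The field names used below (`status`, `convo_a`, `convo_b`, `usage`) are assumptions based on the description above, not a documented schema:

```python
import json

def summarize_state(state):
    """Summarize an already-parsed .state.json dict (keys assumed)."""
    return {
        "status": state.get("status"),
        "turns_a": len(state.get("convo_a", [])),
        "turns_b": len(state.get("convo_b", [])),
        "usage": state.get("usage"),
    }

def load_state(path):
    """Load a .state.json snapshot from transcripts/."""
    with open(path) as f:
        return json.load(f)
```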
- Create and activate a virtual environment:

  ```
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install dependencies:

  ```
  pip install -r requirements.txt
  ```

- Create a `.env` file with whichever provider keys you need:

  ```
  ANTHROPIC_API_KEY=...
  OPENAI_API_KEY=...
  GOOGLE_API_KEY=...
  MOONSHOT_API_KEY=...
  OPEN_ROUTER_API_KEY=...
  ZAI_API_KEY=...
  DASHSCOPE_API_KEY=...
  # or ALI_BABA_API_KEY=...
  # or BAILIAN_API_KEY=...
  XAI_API_KEY=...
  ```

Only the keys for the models you actually run are required.
For DashScope Qwen and GLM, DASHSCOPE_API_KEY, ALI_BABA_API_KEY, or BAILIAN_API_KEY all work.
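One way to implement that fallback is a simple ordered lookup; the repo's actual lookup order may differ:

```python
import os

def dashscope_key():
    """Return the first DashScope-compatible API key found in the environment."""
    for name in ("DASHSCOPE_API_KEY", "ALI_BABA_API_KEY", "BAILIAN_API_KEY"):
        value = os.environ.get(name)
        if value:
            return value
    raise RuntimeError("no DashScope-compatible API key found")
```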
For OpenRouter Kimi/GLM, use OPEN_ROUTER_API_KEY (or OPENROUTER_API_KEY).
The default OpenRouter base URL used by the repo is:
https://openrouter.ai/api/v1
For direct Z.AI GLM, use ZAI_API_KEY.
The default Z.AI base URL used by the repo is:
https://api.z.ai/api/paas/v4/
For DashScope Qwen, the default OpenAI-compatible base URL used by the repo is:
https://dashscope-intl.aliyuncs.com/compatible-mode/v1
For glm-5, the default base URL used by the repo is:
https://dashscope.aliyuncs.com/compatible-mode/v1
Alibaba documents glm-5 as a Chinese Mainland deployment-mode model, so it uses the mainland DashScope endpoint rather than the default international Qwen endpoint.
You can override the Qwen endpoint with DASHSCOPE_BASE_URL and the GLM endpoint with GLM_BASE_URL.
Run with the defaults from params.py:
```
python3 main.py
```

See all CLI options:

```
python3 main.py --help
```

List models:

```
python3 list_models.py
python3 list_models.py --google --details
```

The repository includes a public prompt directory:
prompts/public/
The default shipped prompt is:
prompts/public/standard.txt
If you want local-only prompts, put them in:
prompts/private/
That directory is intended for private prompts and is ignored by git.
The easiest way to change defaults is to edit params.py.
Key settings include:
- `MODEL_A` / `MODEL_B`
- `SYSTEM_PROMPT_A` / `SYSTEM_PROMPT_B`
- `MAX_TURNS`
- `TEMPERATURE_A` / `TEMPERATURE_B`
- `THINKING_BUDGET_A` / `THINKING_BUDGET_B`
- `START_MESSAGE_A` / `START_MESSAGE_B`
- `FINAL_QUESTION_A` / `FINAL_QUESTION_B`
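Put together, a params.py configuration might look like the following. The model names and values are placeholders, not the repo's shipped defaults; use list_models.py to find real model IDs:

```python
# Illustrative params.py fragment; values are placeholders.
MODEL_A = "example-model-a"
MODEL_B = "example-model-b"
SYSTEM_PROMPT_A = "You are Model A."
SYSTEM_PROMPT_B = "You are Model B."
MAX_TURNS = 10
TEMPERATURE_A = 1.0
TEMPERATURE_B = 1.0
THINKING_BUDGET_A = 0
THINKING_BUDGET_B = 0
START_MESSAGE_A = "You are about to speak with another LLM. Please begin the conversation."
START_MESSAGE_B = None
FINAL_QUESTION_A = None
FINAL_QUESTION_B = None
```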
The default kickoff currently sends Model A:
```
You are about to speak with another LLM. Please begin the conversation.
```
Thinking behavior is provider-specific.
Current notable cases:
- Anthropic:
- most thinking-capable Claude models use a fixed thinking budget
- `claude-opus-4-6` uses adaptive thinking, with the repo mapping the budget setting onto effort levels
- Google Gemini:
- Gemini 3 models use `thinkingLevel=HIGH` with thoughts included
- OpenRouter Kimi / GLM:
- use OpenRouter's `reasoning` parameter
- the repo captures streamed `reasoning_details` and carries them forward in assistant history
- Direct Z.AI GLM:
- uses Z.AI's OpenAI-compatible API
- captures `reasoning_content`
- OpenAI GPT-5 family:
- uses the Responses API
- default setup uses medium reasoning effort
- `temperature` is not sent for GPT-5-family models in this path
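A hedged sketch of this kind of provider-specific request shaping. The function and key names are illustrative, not the repo's actual code:

```python
def request_kwargs(model, temperature, reasoning_effort="medium"):
    """Build request parameters: GPT-5-family requests skip temperature
    and carry a reasoning-effort setting instead."""
    kwargs = {"model": model}
    if model.startswith("gpt-5"):
        kwargs["reasoning"] = {"effort": reasoning_effort}
    else:
        kwargs["temperature"] = temperature
    return kwargs
```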
This repo can estimate per-turn and total run cost when usage data is available.
To enable that:
- Copy the pricing template:
  ```
  cp pricing.example.json pricing.json
  ```

- Fill in the models you care about.

- Run with:

  ```
  python3 main.py --pricing-file pricing.json
  ```

If no pricing file is provided, or a model is missing from the file, token usage may still appear but cost will show as N/A.
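The N/A fallback amounts to a lookup that degrades gracefully. The per-million-token rate schema shown here is an assumption; check pricing.example.json for the actual field names:

```python
def estimate_cost(pricing, model, input_tokens, output_tokens):
    """Return estimated USD cost, or "N/A" if the model isn't priced.
    Assumes per-million-token rates (schema hypothetical)."""
    rates = pricing.get(model)
    if rates is None:
        return "N/A"
    return (input_tokens * rates["input_per_mtok"]
            + output_tokens * rates["output_per_mtok"]) / 1_000_000
```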
If you want to publish selected transcripts, use a curated folder rather than the full working archive.
This repo includes:
shared_transcripts/
That folder is intended for transcripts you are comfortable publishing.
Core files:
- `main.py` - conversation loop, transcript/state output, CLI
- `api_client.py` - provider routing and API calls
- `conversation.py` - history shaping utilities
- `costing.py` - token normalization and cost estimation
- `params.py` - default run configuration
Other useful files:
- `requirements.txt`
- `pricing.example.json`
- `list_models.py`
- `list_gemini_models.py`
- The repo is designed for experimentation, not strict determinism.
- Even with the same prompts and models, runs can diverge because sampling and provider behavior are stochastic.
- Some transcripts and state files may include failures, quota errors, or other operational artifacts by design.
MIT