Idea Generation | Proposal Evaluation | Demo Deployment

- Public Access: https://rateyourproposal.ai/
- NUS Intranet: https://rproposal.comp.nus.edu.sg/
**Multi-Agent Collaboration** is designed to facilitate AI-powered academic discussions and research collaboration using Large Language Models (LLMs). The framework enables researchers to simulate realistic academic conversations between AI agents with different expertise levels and to generate structured research proposals from collaborative discussions.

## Features

- Multi-Agent Academic Discussions: Simulate realistic conversations between AI researchers with different expertise levels
- Research Proposal Generation: Automatically synthesize discussions into structured, citable research proposals
- Literature Integration: Built-in Semantic Scholar API integration for real paper citations and analysis
- Flexible LLM Support: DeepSeek V3, OpenAI GPT-4, O1-mini, and custom model integration
- YAML-Based Configuration: Easy-to-customize discussion scenarios and agent behaviors
- Multiple Collaboration Patterns: Horizontal, Vertical, Interdisciplinary, and Leader-led discussion types
- Advanced Memory Management: Sophisticated chat history and context-aware memory systems
- Extensible Tool System: Integrated paper search, analysis, and research tools
## Table of Contents

- Features
- Installation
- Environment Variables
- Quick Start
- Available Configurations
- Configuration System
- Advanced Usage
- Output Structure
- Troubleshooting
- Contributing
## Installation

Make sure you have Python >= 3.9, then:

```bash
git clone <your-repository-url>
cd Multi-Agent-Collaboration
pip install -r requirements.txt
```

## Environment Variables

You need to export your API keys as follows:
```bash
# For DeepSeek (Recommended)
export DEEPSEEK_API_KEY="your_deepseek_api_key_here"
export DEEPSEEK_BASE_URL="https://api.deepseek.com/v1"

# For OpenAI (Optional)
export OPENAI_API_KEY="your_openai_api_key_here"

# For Semantic Scholar Literature Search (Optional)
export SEMANTIC_SCHOLAR_API_KEY="your_semantic_scholar_key_here"
```
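If a run fails immediately, the most common cause is a missing key. The snippet below is a small, optional sanity check (not part of the framework) that verifies the variables above are visible from Python before you launch a discussion:

```python
# Optional sanity check: confirm the API keys exported above are visible.
# Variable names follow the export commands in this README.
import os

required = ["DEEPSEEK_API_KEY", "DEEPSEEK_BASE_URL"]       # needed for DeepSeek runs
optional = ["OPENAI_API_KEY", "SEMANTIC_SCHOLAR_API_KEY"]  # OpenAI models / literature search

missing = [name for name in required if not os.getenv(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")

for name in optional:
    if not os.getenv(name):
        print(f"Note: {name} is not set; the corresponding feature will be unavailable.")

print("All required API keys are set.")
```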
## Quick Start

Choose any configuration and run a discussion on your topic:

```bash
# Horizontal collaboration (peer-level researchers)
cd agentverse/tasks/simulation/Horizontal_Collaboration
python run_dynamic_topic.py --topic "machine learning interpretability"

# Vertical collaboration (mixed hierarchy levels)
cd agentverse/tasks/simulation/Vertical_Collaboration
python run_dynamic_topic.py --topic "quantum computing applications"

# Interdisciplinary collaboration (cross-domain experts)
cd agentverse/tasks/simulation/Interdisciplinary_Collaboration
python run_dynamic_topic.py --topic "AI for healthcare"

# Leader-led collaboration (senior-guided discussion)
cd agentverse/tasks/simulation/Leader_Led_Collaboration
python run_dynamic_topic.py --topic "federated learning privacy"

# Multi-agent collaboration (general framework)
cd agentverse/tasks/simulation/Multi_Collaboration
python run_dynamic_topic.py --topic "neural architecture search"

# Individual reflection with DeepSeek
cd agentverse/tasks/simulation/Solitary_Ideation_deepseek_v3
python run_dynamic_topic.py --topic "AI ethics"

# Individual reflection with O1-mini
cd agentverse/tasks/simulation/Solitary_Ideation_o1_mini
python run_dynamic_topic.py --topic "transformer architectures"
```
## Available Configurations

The `agentverse/tasks/simulation/` directory contains various pre-configured discussion scenarios.

Collaborative configurations:

| Configuration | Description | Best Use Case |
|---|---|---|
| `Horizontal_Collaboration` | Peer-level researchers with equal expertise | Equal-level expert discussions, peer reviews |
| `Vertical_Collaboration` | Mixed hierarchy with different seniority levels | Academic mentoring, student-supervisor interactions |
| `Interdisciplinary_Collaboration` | Cross-domain experts from different fields | Multi-domain problem solving, interdisciplinary research |
| `Leader_Led_Collaboration` | Senior researcher guiding junior participants | Research leadership, guided team discussions |
| `Multi_Collaboration` | General multi-agent discussion framework | Flexible group discussions, custom scenarios |
Solitary (single-agent) configurations:

| Configuration | Description | Model Used |
|---|---|---|
| `Solitary_Ideation_deepseek_v3` | Single researcher self-reflection | DeepSeek V3 |
| `Solitary_Ideation_o1_mini` | Single researcher self-reflection | OpenAI O1-mini |
## Configuration System

The framework uses YAML configuration files to define discussion scenarios. Each configuration contains:

```yaml
prompts:               # Role definitions and behavioral guidelines
environment:           # Discussion rules and turn management
agents:                # Participant configurations and LLM settings
tools:                 # Research tools (Semantic Scholar, etc.)
ai_researcher_config:  # Literature search parameters
```

| Section | Purpose | Key Settings |
|---|---|---|
| Prompts | Define agent personalities and expertise levels | Role descriptions, behavioral guidelines |
| Environment | Control discussion flow | Max turns, order type, visibility rules |
| Agents | Configure participants | LLM type, temperature, memory settings |
| Tools | Enable research capabilities | Paper search, citation analysis |
To create a custom configuration:

1. Copy an existing configuration:

   ```bash
   cp Horizontal_Collaboration/config.yaml my_custom_config.yaml
   ```

2. Modify key parameters:
   - Change `max_turns` for discussion length
   - Adjust `temperature` for creativity levels
   - Modify prompts for different expertise
   - Add/remove tools as needed

3. Test your configuration:

   ```bash
   python run_dynamic_topic.py --config my_custom_config.yaml --topic "test topic"
   ```
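For scripted experiments, the same edits can be made programmatically. The sketch below is illustrative only: it assumes the key paths `environment.max_turns` and a per-agent `llm.temperature` field based on the skeleton above, which may not match the actual `config.yaml` layout exactly.

```python
# Illustrative sketch: copy a config and tweak a few knobs with PyYAML.
# The key paths (environment.max_turns, agents[*].llm.temperature) are
# assumptions based on the config skeleton above; verify against the real file.
import yaml

with open("Horizontal_Collaboration/config.yaml") as f:
    config = yaml.safe_load(f)

environment = config.get("environment") or {}
environment["max_turns"] = 12            # shorter discussion
config["environment"] = environment

for agent in config.get("agents", []):
    llm = agent.get("llm") or {}
    llm["temperature"] = 0.9             # more exploratory replies
    agent["llm"] = llm

with open("my_custom_config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```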
## Advanced Usage

```bash
# Compare different collaboration types on the same topic
topic="federated learning privacy"
for config in Horizontal_Collaboration Vertical_Collaboration Leader_Led_Collaboration; do
    cd agentverse/tasks/simulation/$config
    python run_dynamic_topic.py --topic "$topic"
    cd ../../../..
done

# Test multiple topics with the same configuration
for topic in "NLP transformers" "Computer Vision" "Robotics control"; do
    python run_dynamic_topic.py --topic "$topic"
done

# Compare different LLMs on the same topic
topic="quantum machine learning"
cd agentverse/tasks/simulation/
cd Solitary_Ideation_deepseek_v3 && python run_dynamic_topic.py --topic "$topic" && cd ..
cd Solitary_Ideation_o1_mini && python run_dynamic_topic.py --topic "$topic" && cd ..
```

## Output Structure

Each discussion generates structured outputs:
```
outputs/
├── {topic}_run{n}_{timestamp}.txt        # Complete conversation log
├── logs/
│   └── {topic}_run{n}_{timestamp}.log    # Debug and execution info
└── research_proposals/
    └── {topic}_proposal.txt              # Synthesized research proposal
```
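If you want to post-process runs, a few lines of Python are enough to collect the synthesized proposals. The paths below simply mirror the tree above and may need adjusting for your runs:

```python
# Collect synthesized proposals from the outputs tree shown above.
# Paths mirror that tree; adjust if your runs write elsewhere.
from pathlib import Path

proposal_dir = Path("outputs/research_proposals")
for path in sorted(proposal_dir.glob("*_proposal.txt")):
    print(f"{path.name}  ({path.stat().st_size} bytes)")
```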
The framework automatically synthesizes discussions into structured proposals containing:
- Title - Research question formulation
- Problem Statement - Current limitations and knowledge gaps
- Motivation & Hypothesis - Research rationale and expected outcomes
- Proposed Method - Technical approach and methodology
- Experiment Plan - Step-by-step experimental design
- References - Verified citations from Semantic Scholar integration
## Troubleshooting

- API Key Errors: Ensure environment variables are properly set
- Import Errors: Install missing dependencies with `pip install semanticscholar`
- Memory Issues: Reduce `max_tokens` or `max_turns` in the configuration
- Network Issues: Check internet connectivity for the Semantic Scholar API
Note: This framework is designed for academic research simulation and collaboration. Ensure proper attribution when using generated content for actual research purposes.
# Proposal Evaluation

An intelligent system for evaluating research proposals using advanced AI models with structured assessment criteria.
This project provides a comprehensive framework for evaluating research proposals through automated AI-driven analysis. It offers both batch processing capabilities for multiple proposals and a web-based interface for individual proposal evaluation.
- Structured Evaluation: 8-dimensional assessment framework with detailed scoring criteria
- Batch Processing: Process multiple proposals from directories efficiently
- Web Interface: User-friendly web application for individual proposal evaluation
- Ensemble Review: Multiple AI reviewers for robust evaluation
- Self-Reflection: Iterative improvement through AI self-reflection
- Meta-Review: Synthesis of multiple reviews into final assessment
Project structure:

```
├── app.py                  # Web application for individual proposal evaluation
├── predict_proposal.py     # Batch processing script for multiple proposals
├── ai_scientist/           # Core evaluation modules
├── examples_*/             # Input directories for batch processing
├── results_*/              # Output directories for batch results
└── templates/              # Web interface templates
```
## Batch Processing

Process multiple research proposals from directories:

```bash
python predict_proposal.py
```

Features:
- Processes all `.txt` files in the specified example directories
- Supports multiple input formats (triple-quoted blocks, Python lists)
- Generates detailed JSON reviews and summary files
- Multi-threaded processing for efficiency

Input Directories:
- `examples/`

Output:
- JSON review files with detailed scores and justifications
- Summary text files with key metrics
- Comprehensive logging
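For reference, the core of the batch workflow can be approximated in a few lines. This is a minimal sketch, not the actual `predict_proposal.py`: it assumes the `perform_structured_review` helper shown later in this README, an OpenAI-compatible client configured for your provider, and a hypothetical `results/` output directory; the real script adds multi-threading, logging, and summary files.

```python
# Minimal sketch of a batch loop (illustrative, not the real predict_proposal.py).
# Assumes perform_structured_review (shown later in this README) returns a
# JSON-serializable dict; the "results/" output directory is hypothetical.
import json
from pathlib import Path

from openai import OpenAI
from predict_proposal import perform_structured_review

client = OpenAI()  # set api_key / base_url for your provider as needed

out_dir = Path("results")
out_dir.mkdir(exist_ok=True)

for proposal_path in sorted(Path("examples").glob("*.txt")):
    review = perform_structured_review(
        proposal_path.read_text(),
        model="deepseek-v3",
        client=client,
        temperature=0.1,
        num_reviews_ensemble=3,
        num_reflections=3,
    )
    (out_dir / f"{proposal_path.stem}_review.json").write_text(json.dumps(review, indent=2))
```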
## Web Interface

Deploy a web interface for individual proposal evaluation:

```bash
python app.py
```

Features:
- Real-time proposal evaluation
- Interactive scoring display
- Detailed criteria explanations
- HTTPS support for secure access

Access:
- Local: `http://localhost:4090`
- HTTPS: `https://localhost:4090` (with SSL)
## Evaluation Criteria

The system evaluates research proposals across 8 core dimensions, each scored from 1.0 to 10.0:
- **Novelty**: Assesses the originality and paradigm-modifying potential of the research idea.
- **Workability**: Evaluates the feasibility and implementability of the proposed research plan.
- **Relevance**: Assesses how well the proposal applies to the stated research problem.
- **Specificity**: Evaluates the clarity and thoroughness of the proposal articulation.
- **Integration Depth**: Assesses how well diverse concepts and methodologies are integrated.
- **Strategic Vision**: Evaluates long-term potential and forward-looking perspective.
- **Methodological Rigor**: Assesses the soundness and appropriateness of research methods.
- **Argumentative Cohesion**: Evaluates logical flow and coherence of the argument.
- **Overall Quality**: Synthesizes all eight dimensions to evaluate overall proposal quality and potential impact.
## Output Format

Each evaluation produces a structured JSON response:
```json
{
  "Novelty": "8.5/10",
  "Workability": "7.2/10",
  "Relevance": "9.1/10",
  "Specificity": "8.8/10",
  "Integration_Depth": "7.9/10",
  "Strategic_Vision": "8.3/10",
  "Methodological_Rigor": "8.7/10",
  "Argumentative_Cohesion": "8.0/10",
  "Overall_Quality": "8.3/10",
  "Decision": "Accept",
  "Weaknesses": [
    "Limited discussion of potential ethical concerns",
    "Could benefit from more detailed timeline"
  ],
  "Justifications": {
    "Novelty": "Proposes a novel approach to...",
    "Workability": "The methodology is well-defined..."
    // ... other justifications
  }
}
```
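Since the dimension scores are returned as `"x.y/10"` strings, downstream analysis usually starts by converting them to numbers. A small illustrative parser follows; the file path is hypothetical, and the system's own aggregation (ensemble weighting, meta-review) may differ from a plain mean.

```python
# Illustrative: parse "x.y/10" score strings from a review JSON into floats
# and compute a simple unweighted mean. The file path is hypothetical, and the
# system's own aggregation (e.g. ensemble weighting) may differ.
import json

DIMENSIONS = [
    "Novelty", "Workability", "Relevance", "Specificity",
    "Integration_Depth", "Strategic_Vision",
    "Methodological_Rigor", "Argumentative_Cohesion",
]

with open("results/example_review.json") as f:  # hypothetical path
    review = json.load(f)

scores = {dim: float(review[dim].split("/")[0]) for dim in DIMENSIONS}
print(scores)
print("Unweighted mean:", round(sum(scores.values()) / len(scores), 2))
```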
Requirements:
- Python 3.8+
- OpenAI API access
- Required Python packages (see requirements.txt)
Installation:

```bash
git clone <repository-url>
cd Proposal_Evaluation
pip install -r requirements.txt
```

Setup:
- Set up your OpenAI API credentials in the respective scripts
- Configure model parameters as needed
- Prepare input directories for batch processing
Python API usage:

```python
from predict_proposal import perform_structured_review

review = perform_structured_review(
    proposal_text,
    model="deepseek-v3",
    client=openai_client,
    temperature=0.1,
    num_reviews_ensemble=3,
    num_reflections=3
)
```

Ensemble review process:
- Multiple Independent Reviews: Generate 3 independent evaluations
- Meta-Review: Synthesize reviews into comprehensive assessment
- Self-Reflection: Iterative improvement through AI self-reflection
- Final Synthesis: Weighted average of ensemble scores
The system considers the proposal's origin context:
- Leader-guided: Highly curated, expected highest quality
- Multi-Person: Broad consensus approach
- Single-Person: Individual perspective
This project builds upon advanced AI evaluation techniques and research proposal assessment frameworks.
## Citation

```bibtex
@misc{chen2025brainstorm,
  title={Beyond Brainstorming: What Drives High-Quality Scientific Ideas? Lessons from Multi-Agent Collaboration},
  author={Nuo Chen and Yicheng Tong and Jiaying Wu and Minh Duc Duong and Qian Wang and Qingyun Zou and Bryan Hooi and Bingsheng He},
  year={2025},
  eprint={2508.04575},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.04575},
}
```