# Multi-Modal Data Analysis API

FastAPI-based REST API that orchestrates LLM workflows for multi-modal data analysis (text, CSV, images, code generation, etc.). This is a generic, shareable README template with no personal data.
Warning: This is a fallback version, not my main version. I may use it if my main version stops working, or I may use a different fallback altogether.
Note: This work is only a minor modification of https://github.com/MuthuPeru/DataAnalystAgent (this is much better than what I could come up with and is really helping me learn. I really wanted to understand how we could use LangChain here, and despite getting stuck in random errors, I kept trying. Now that I have Muthu's code to learn from, I will continue working to deepen my understanding. The original copyright and MIT license belong to Muthu Peru.)
## Features

- REST API built with FastAPI
- Supports multi-file uploads; `questions.txt` (or similar) is required for analysis requests
- Multiple workflow types (data analysis, image analysis, code generation, EDA, predictive modeling, etc.)
- Synchronous processing for short-running tasks (service-level timeout / limits should be enforced separately)
- Designed to be run locally or in a container
- Multiple file upload support (one required `questions.txt` plus optional files)
- Intelligent workflow detection (configurable; see the sketch after this list)
- Multi-modal processing (text, images, structured files)
- Enhanced logging and error handling
- Prebuilt workflows for common analytical tasks
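This README does not specify how workflow detection works internally. As a purely hypothetical illustration of what a configurable keyword-to-workflow mapping could look like (all names below are invented for the sketch):

```python
# Hypothetical sketch of configurable workflow detection -- the real service
# may use an LLM classifier or entirely different rules.
WORKFLOW_KEYWORDS = {
    "data_analysis": ["analyze", "summary", "insight"],
    "image_analysis": ["image", "photo", "picture"],
    "predictive_modeling": ["predict", "forecast", "model"],
}

def detect_workflow(task_description: str, default: str = "data_analysis") -> str:
    """Return the first workflow whose keywords appear in the task text."""
    text = task_description.lower()
    for workflow, keywords in WORKFLOW_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return workflow
    return default
```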
## Requirements

- Python 3.10+
- Recommended: virtual environment (venv / conda)
- See `requirements.txt` or `pyproject.toml` for the full dependency list
## Environment Variables

Create a `.env` from the template `.env.template` and set required values. Example environment variables (do not commit secrets):
- `OPENAI_API_KEY` — API key for LLM provider (if used)
- `LANGCHAIN_TRACING_V2` — enable tracing (optional)
- `LANGCHAIN_API_KEY` — LangChain tracing key (optional)
- Other provider-specific keys as required by your configuration
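For illustration, a filled-in `.env` might look like this (placeholder values only; never commit real keys):

```
OPENAI_API_KEY=<your-openai-key>
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=<your-langchain-key>
```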
## API Endpoints

- `POST /api/` — Submit analysis tasks (multipart/form-data; `questions_txt` required)
- `GET /health` — Health check (returns 200 OK if service is healthy)
- `GET /` — API info / landing page
- `POST /api/analyze` — JSON-based analysis request
- `POST /api/workflow` — Execute a specific workflow by name
- `POST /api/pipeline` — Execute a multi-step pipeline
- `GET /api/tasks/{id}/status` — Check task status
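The `GET /api/tasks/{id}/status` endpoint suggests a poll-until-done pattern for longer tasks. A minimal client-side sketch, assuming the response JSON carries a `status` field with values like `pending` or `running` (these field names and values are assumptions, not documented here):

```python
import time
import requests

def wait_for_task(task_id: str, base_url: str = "http://localhost:8000",
                  poll_seconds: float = 2.0, timeout_seconds: float = 300.0) -> dict:
    """Poll GET /api/tasks/{id}/status until the task leaves a pending state."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        resp = requests.get(f"{base_url}/api/tasks/{task_id}/status", timeout=10)
        resp.raise_for_status()
        payload = resp.json()
        # "status" and its values are hypothetical; adjust to the real schema.
        if payload.get("status") not in ("pending", "running"):
            return payload
        time.sleep(poll_seconds)
    raise TimeoutError(f"Task {task_id} did not finish within {timeout_seconds}s")
```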
curl "http://localhost:8000/api/" \
-F "questions_txt=@questions.txt" \
-F "files=@dataset.csv" \
-F "files=@image.png"import requests
resp = requests.post(
"http://localhost:8000/api/analyze",
json={
"task_description": "Analyze customer churn data",
"workflow_type": "data_analysis",
"dataset_info": {
"description": "Customer dataset",
"columns": ["age", "tenure", "charges", "churn"],
"sample_size": 1000
}
}
)
## Supported Workflow Types

- `data_analysis` — General analysis & recommendations
- `image_analysis` — Image processing / CV pipelines
- `text_analysis` — NLP & text analytics
- `code_generation` — Generate executable Python code for analysis
- `exploratory_data_analysis` — EDA plan & execution
- `predictive_modeling` — Model guidance & training suggestions
- `data_visualization` — Chart/plot suggestions or generation
- `statistical_analysis` — Statistical tests & inference
- `web_scraping` — Web data extraction guidance
- `database_analysis` — SQL / DuckDB assistance
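Any of these can presumably be invoked by name via `POST /api/workflow`. A minimal sketch; the JSON field names mirror the `/api/analyze` example above and are assumptions, since the exact schema is not documented in this README:

```python
import requests

# Field names below mirror the /api/analyze example; the real /api/workflow
# schema may differ, so treat this as an illustration only.
resp = requests.post(
    "http://localhost:8000/api/workflow",
    json={
        "workflow_type": "statistical_analysis",
        "task_description": "Test whether churn rate differs by tenure group",
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```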
## Installation

- Clone the repository:

  ```bash
  git clone <REPO_URL>
  cd <REPO_DIR>
  ```

- Create & activate a virtual environment:

  ```bash
  python -m venv .venv
  source .venv/bin/activate   # Linux/macOS
  .venv\Scripts\activate      # Windows (PowerShell)
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create `.env` from the template and set variables:

  ```bash
  cp .env.template .env
  # edit .env to add required keys (do not commit)
  ```

- Run the server (development):

  ```bash
  uvicorn main:app --reload --host 0.0.0.0 --port 8000
  ```

- Visit the API docs at http://localhost:8000/docs
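Once the server is running, a quick sanity check is to call the health endpoint. A minimal sketch using `requests` (only the 200 status is documented above, so nothing is assumed about the response body):

```python
import requests

# Expect HTTP 200 from GET /health when the service is up.
resp = requests.get("http://localhost:8000/health", timeout=5)
print(resp.status_code)
```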
## Docker

Build and run using the provided scripts or manually:

```bash
# build
docker build -t data-analysis-api .
# run (example)
docker run -d --name data-analysis-api -p 8000:80 --env-file .env data-analysis-api
```

## Testing

Simple tests can be run with the included test scripts:

```bash
python test_api.py
python test_file_upload_api.py
```

(Adapt names to your test files.)
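If those scripts are absent, a minimal smoke test along the following lines exercises the multipart endpoint documented above (file names are placeholders):

```python
import requests

BASE = "http://localhost:8000"

def test_multipart_upload() -> None:
    # questions.txt is required; dataset.csv is an optional placeholder file.
    with open("questions.txt", "rb") as q, open("dataset.csv", "rb") as d:
        resp = requests.post(
            f"{BASE}/api/",
            files=[("questions_txt", q), ("files", d)],
            timeout=120,
        )
    assert resp.status_code == 200, resp.text

if __name__ == "__main__":
    test_multipart_upload()
    print("smoke test passed")
```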