the0king0mina/Multimodal-Medical-AI-System


🩺 Multimodal Medical Product Analysis System

An end-to-end multimodal system that extracts text from medical product images with OCR and passes it to a Large Language Model (LLM) to generate structured medical insights in both English and Arabic.

This project demonstrates practical skills in Computer Vision, NLP, and AI system design, with a focus on real-world medical use cases.

Demo

🚀 Key Features

  • OCR-based text extraction from medical product images
  • Text preprocessing and normalization
  • LLM-powered medical analysis and decision generation
  • Bilingual output (English & Arabic)
  • Interactive web interface using Gradio
  • Modular and scalable pipeline design
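The text preprocessing and normalization step could look something like the sketch below. It is illustrative only and not taken from the repository: the function name `normalize_ocr_text` and the specific cleanup rules (Unicode folding, rejoining hyphenated line breaks, dropping control characters, collapsing whitespace) are assumptions about what a reasonable OCR cleanup pass might do.

```python
import re
import unicodedata


def normalize_ocr_text(raw: str) -> str:
    """Clean raw OCR output before it is placed into an LLM prompt.

    Hypothetical helper: the cleanup rules are assumptions, not the
    project's actual preprocessing.
    """
    # NFKC folds compatibility forms (ligatures, full-width digits,
    # Arabic presentation forms) into their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words hyphenated across a line break: "Para-\ncetamol".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)
    # Drop control characters (e.g. the form feed some OCR engines emit
    # at page end), keeping newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces/tabs, strip each line, and drop empty lines.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)
```

For example, `normalize_ocr_text("Para-\ncetamol   500 mg\n\n  Tablets ")` yields two clean lines, `Paracetamol 500 mg` and `Tablets`.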

🧠 System Pipeline

  1. Image Input
  2. OCR Extraction
  3. Text Preprocessing
  4. Prompt Engineering
  5. LLM-based Analysis
  6. Bilingual Output
  7. Gradio Interface
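The steps above can be sketched as a small orchestration function. This is a minimal sketch under stated assumptions, not the repository's code: `AnalysisResult`, `PROMPT_TEMPLATE`, `build_prompt`, and `analyze_image` are hypothetical names, and the OCR engine and LLM are passed in as plain callables so the flow can be exercised without either one installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AnalysisResult:
    """Hypothetical container for the pipeline's bilingual output."""
    extracted_text: str
    english: str
    arabic: str


# Assumed prompt wording; the project's real prompt is not shown in the README.
PROMPT_TEMPLATE = (
    "You are assisting with educational analysis of a medical product label.\n"
    "Label text extracted by OCR:\n"
    "---\n{label_text}\n---\n"
    "Summarize the product, its likely use, and any warnings. "
    "Answer in {language}."
)


def build_prompt(label_text: str, language: str) -> str:
    """Step 4: embed the cleaned OCR text in an instruction for the LLM."""
    return PROMPT_TEMPLATE.format(label_text=label_text, language=language)


def analyze_image(
    image_path: str,
    ocr: Callable[[str], str],
    llm: Callable[[str], str],
    preprocess: Callable[[str], str] = str.strip,
) -> AnalysisResult:
    """Run steps 1-6: image -> OCR -> preprocessing -> prompt -> LLM -> EN/AR."""
    raw_text = ocr(image_path)            # steps 1-2: image input and OCR
    text = preprocess(raw_text)           # step 3: text preprocessing
    if not text:
        # Skip the LLM entirely when OCR found nothing to analyze.
        return AnalysisResult("", "No text detected.", "لم يتم العثور على نص.")
    english = llm(build_prompt(text, "English"))  # steps 4-5
    arabic = llm(build_prompt(text, "Arabic"))    # step 6: second language
    return AnalysisResult(text, english, arabic)
```

Because `ocr` and `llm` are injected, a call such as `analyze_image("label.png", ocr=lambda p: "Aspirin 100 mg", llm=lambda prompt: "stub answer")` runs the whole flow with stand-ins, which is handy for testing before wiring in Tesseract, a Transformers model, or the Gradio interface.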

🛠️ Technologies Used

  • Python
  • OpenCV
  • Tesseract OCR
  • Hugging Face Transformers
  • Large Language Models
  • Gradio
  • NumPy
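As an illustration of how OpenCV and Tesseract typically fit together, the sketch below binarizes an image and passes it to Tesseract through the `pytesseract` wrapper. The README does not say which Python binding the project uses, so `pytesseract` is an assumption, as are the helper names `build_tesseract_config` and `extract_text`, the Otsu thresholding step, and the `eng+ara` language setting (which requires the Arabic Tesseract language data to be installed).

```python
def build_tesseract_config(psm: int = 6, oem: int = 3) -> str:
    """Build the Tesseract option string.

    --psm sets the page segmentation mode (6 = a single uniform block of text);
    --oem sets the OCR engine mode (3 = default, based on what is available).
    """
    return f"--oem {oem} --psm {psm}"


def extract_text(image_path: str, lang: str = "eng+ara") -> str:
    """Read an image, binarize it, and return the text Tesseract finds."""
    # Imported lazily so this module can be loaded without the OCR stack.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        # cv2.imread returns None instead of raising when the file is unreadable.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(
        binary, lang=lang, config=build_tesseract_config()
    )
```

Whether thresholding helps depends on the photos: it tends to work well on flat, evenly lit labels and less well on curved or glossy packaging, so treat it as a starting point rather than a fixed choice.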

▶️ How to Run

pip install -r requirements.txt
python app/app.py

⚠️ Medical Disclaimer

This project is for educational and research purposes only and does not replace professional medical advice.

👤 Author

Mina Nabil
