An end-to-end multimodal AI system that analyzes medical product images with OCR and Large Language Models (LLMs) and generates structured medical insights in both English and Arabic.
This project demonstrates practical skills in Computer Vision, NLP, and AI system design, with a focus on real-world medical use cases.
- OCR-based text extraction from medical product images
- Text preprocessing and normalization
- LLM-powered medical analysis and decision generation
- Bilingual output (English & Arabic)
- Interactive web interface using Gradio
- Modular and scalable pipeline design
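
To illustrate the text preprocessing and normalization feature, here is a minimal sketch using only the Python standard library. The function name `normalize_ocr_text` and the specific cleanup rules are assumptions chosen for illustration, not necessarily what this project implements.

```python
import re
import unicodedata


def normalize_ocr_text(raw: str) -> str:
    """Clean raw OCR output before it is placed into an LLM prompt.

    Hypothetical helper: the rules below are illustrative assumptions.
    """
    # NFKC folds compatibility forms (ligatures, full-width digits,
    # non-breaking spaces) into their plain equivalents.
    text = unicodedata.normalize("NFKC", raw)

    cleaned_lines = []
    for line in text.splitlines():
        # Drop control characters (category "Cc") except tab, which is
        # collapsed to a space in the next step.
        line = "".join(
            ch for ch in line if ch == "\t" or unicodedata.category(ch) != "Cc"
        )
        # Collapse runs of spaces and tabs, then trim the line.
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Skip lines that are empty after cleanup.
        if line:
            cleaned_lines.append(line)

    return "\n".join(cleaned_lines)
```

Only control characters are removed, so Arabic letters and Unicode format characters pass through unchanged. NFKC does fold Arabic presentation forms into their base letters, which is usually desirable for OCR output but is worth confirming against real labels.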
- Image Input
- OCR Extraction
- Text Preprocessing
- Prompt Engineering
- LLM-based Analysis
- Bilingual Output
- Gradio Interface
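
The stages listed above can be sketched as a small function that takes the OCR and LLM steps as plain callables. This is an illustrative outline under stated assumptions: `BilingualReport`, `build_prompt`, `analyze_label`, and the prompt wording are hypothetical names and text, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BilingualReport:
    """Hypothetical container for the two language outputs."""
    english: str
    arabic: str


def build_prompt(label_text: str, language: str) -> str:
    # Illustrative prompt wording; the real prompt may differ.
    return (
        "You are assisting with reading a medical product label.\n"
        f"Answer in {language}.\n"
        "Summarize the product name, active ingredients, and usage warnings "
        "found in the text below. If something is not on the label, say so.\n\n"
        f"Label text:\n{label_text}"
    )


def analyze_label(
    image: object,
    ocr: Callable[[object], str],
    llm: Callable[[str], str],
) -> BilingualReport:
    """Run image -> OCR -> preprocessing -> prompt -> LLM for both languages."""
    raw_text = ocr(image)
    # Minimal preprocessing for the sketch: collapse all whitespace runs.
    cleaned = " ".join(raw_text.split())
    return BilingualReport(
        english=llm(build_prompt(cleaned, "English")),
        arabic=llm(build_prompt(cleaned, "Arabic")),
    )
```

Passing the stages in as callables keeps the sketch runnable without any model installed and lets each stage be swapped or tested on its own. In a real run, `ocr` might wrap `pytesseract.image_to_string` and `llm` might wrap a Hugging Face Transformers text-generation pipeline; both wirings are assumptions here.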
- Python
- OpenCV
- Tesseract OCR
- Hugging Face Transformers
- Large Language Models
- Gradio
- NumPy
pip install -r requirements.txt
python app/app.py
This project is for educational and research purposes only and does not replace professional medical advice.
Mina Nabil
