GA Drawing Analysis

Description

Analyze General Arrangement (GA) drawings with deep learning models that automatically detect and extract tables, nozzles, notes, and views.

Tech Stack

YOLOv8 (nano): For cropping regions of interest (notes, tables, nozzles, views)
PaddleOCR (OCR + PPStructureV3): For parsing tables and nozzle symbols with high accuracy
DONUT: Transformer-based model for extracting handwritten/typed notes from images

Resources

GPU: RTX 4070 Ti (12GB VRAM)
CPU: Intel Core i9-14900KF (24 cores, 32 threads)
RAM: 64GB

Note: All models currently run on CPU, which takes about 30 seconds per image.

Methodology

| Task | Training Images (after augmentation) | Validation Images | Resolution | mAP50 | mAP | Notes/Dice (ED) |
|---|---|---|---|---|---|---|
| YOLOv8n - Notes & Tables | 33 (from 11) | 1 | 2048 | 99.5% | 93.9% | |
| YOLOv8 - Views Detection | 33 (from 11) | 1 | 1536 | 99.5% | 95.1% | |
| YOLOv8 - Nozzles Detection | 40 | 2 | 1024 | 96% | 85% | |
| DONUT - Notes Extraction | 11 (runtime augmentations) | 1 | 1280 | | | ED: 0.031 |
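
The ED figure reported for DONUT is not defined in this README. One common reading, assumed here, is a normalized edit distance between the predicted and reference note text, where 0 means an exact match. The sketch below shows how such a score could be computed; the function names are illustrative and are not taken from this repository.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(prediction: str, reference: str) -> float:
    """Edit distance divided by the longer string's length (assumed normalization)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(prediction, reference) / longest
```

Under this assumption, a score of 0.031 would mean roughly three character edits per hundred characters of note text.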

Output JSON Schema

{
  "tables": [{ "name": "str", "rows": [["..."]], "bbox": [x, y, w, h] }],
  "nozzles": [{ "bbox": [x, y, w, h], "view": "str", "text": "str" }],
  "notes": { "notes": { "key": "val" }, "bbox": [x, y, w, h] }
}
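
To make the schema easier to consume, here is a minimal sketch of a client-side check written against the shape shown above. The helper names (`validate_response`, `bbox_to_corners`) are hypothetical, and the reading of `bbox` as pixel values with `x, y` at the top-left corner is an assumption, not something this README states.

```python
from __future__ import annotations

from typing import Any


def bbox_to_corners(bbox: list[float]) -> tuple[float, float, float, float]:
    """Convert [x, y, w, h] to (left, top, right, bottom). Assumes x, y is the top-left corner."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def _is_bbox(value: Any) -> bool:
    """True for a 4-element sequence of numbers (booleans excluded)."""
    return (
        isinstance(value, (list, tuple))
        and len(value) == 4
        and all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value)
    )


def validate_response(payload: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the shape matches."""
    problems: list[str] = []

    tables = payload.get("tables")
    if not isinstance(tables, list):
        problems.append("tables: expected a list")
        tables = []
    for i, table in enumerate(tables):
        if not isinstance(table, dict):
            problems.append(f"tables[{i}]: expected an object")
            continue
        if not isinstance(table.get("name"), str):
            problems.append(f"tables[{i}].name: expected a string")
        rows = table.get("rows")
        if not (isinstance(rows, list) and all(isinstance(r, list) for r in rows)):
            problems.append(f"tables[{i}].rows: expected a list of lists")
        if not _is_bbox(table.get("bbox")):
            problems.append(f"tables[{i}].bbox: expected [x, y, w, h]")

    nozzles = payload.get("nozzles")
    if not isinstance(nozzles, list):
        problems.append("nozzles: expected a list")
        nozzles = []
    for i, nozzle in enumerate(nozzles):
        if not isinstance(nozzle, dict):
            problems.append(f"nozzles[{i}]: expected an object")
            continue
        for key in ("view", "text"):
            if not isinstance(nozzle.get(key), str):
                problems.append(f"nozzles[{i}].{key}: expected a string")
        if not _is_bbox(nozzle.get("bbox")):
            problems.append(f"nozzles[{i}].bbox: expected [x, y, w, h]")

    notes = payload.get("notes")
    if not isinstance(notes, dict):
        problems.append("notes: expected an object")
    else:
        if not isinstance(notes.get("notes"), dict):
            problems.append("notes.notes: expected an object of key/value pairs")
        if not _is_bbox(notes.get("bbox")):
            problems.append("notes.bbox: expected [x, y, w, h]")

    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to see every mismatch in a single pass when the API output changes.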

Setup Instructions

Clone the repository:
```
git clone https://www.github.com/acen20/ga-analysis
cd ga-analysis
```
Download the det_models/ and ocr_model/ folders from the Google Drive folder.

Download YOLOv8 checkpoint files and place them in the det_models/ directory:

det_models/
├── view/
│   └── best.pt
├── section/
│   └── best.pt
└── nozzle/
    └── best.pt

Download DONUT checkpoint files and place them in the ocr_model/ directory:
```
ocr_model/
└── {all extracted files here}
```
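
Before building the containers, it can help to confirm that the checkpoints are where the layout above expects them. The following is a small sketch, not part of the repository: the function name `missing_checkpoints` is hypothetical, and because the exact file names inside ocr_model/ are not listed here, it only checks that the directory exists and is not empty.

```python
from __future__ import annotations

from pathlib import Path

# Sub-directories of det_models/ taken from the layout shown above.
EXPECTED_DETECTORS = ("view", "section", "nozzle")


def missing_checkpoints(repo_root: str | Path = ".") -> list[str]:
    """Return the expected checkpoint paths that are absent under repo_root."""
    root = Path(repo_root)
    missing: list[str] = []

    # Each detector is expected to ship a single best.pt file.
    for name in EXPECTED_DETECTORS:
        path = root / "det_models" / name / "best.pt"
        if not path.is_file():
            missing.append(path.relative_to(root).as_posix())

    # The DONUT files are not enumerated in the README, so only check for content.
    ocr_dir = root / "ocr_model"
    if not ocr_dir.is_dir() or not any(ocr_dir.iterdir()):
        missing.append("ocr_model/ (absent or empty)")

    return missing
```

Calling `missing_checkpoints()` from the repository root and printing the result gives a quick list of anything still to download.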
Build and start the containers:
```
docker-compose up --build
```
Access the API endpoint:
```
http://localhost:8000/detect
```

Note: The first startup takes longer because PaddleOCR downloads its OCR models.

Usage

The API will be exposed at:
http://localhost:8000/detect
Send a POST request with the PDF file attached to the endpoint to get detections and extracted information.

Example Request (using `curl`):

curl -X POST "http://localhost:8000/detect" \
  -F "file=@path/to/your/file.pdf"
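
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the form field is named `file`, as in the curl command above, and that the endpoint replies with JSON. The function names `build_multipart` and `detect` are illustrative, not part of this project.

```python
from __future__ import annotations

import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "http://localhost:8000/detect"


def build_multipart(
    field: str, filename: str, data: bytes, content_type: str = "application/pdf"
) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def detect(pdf_path: str | Path, url: str = API_URL, timeout: float = 300.0) -> dict:
    """POST a PDF to the detect endpoint and return the decoded JSON reply."""
    path = Path(pdf_path)
    # "file" mirrors the field name used in the curl example (assumption).
    body, header = build_multipart("file", path.name, path.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the containers to be running):
# result = detect("path/to/your/file.pdf")
```

The generous default timeout is a deliberate choice, since CPU inference can take tens of seconds per image.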

Limitations

Limited Dataset Size
With only 12 documents available for training/testing, the model may not generalize well to variations in document structure, terminology, or formatting.
Edge Case – Split Notes Section
In some documents, the Notes section is split into two parts. The model does not detect this layout because the dataset contains no such examples.
- Potential fix: Oversample relevant cases to train for this pattern. This can be attempted in future iterations.
Complex Table Structures
Some table layouts are too complex for PaddleOCR’s PPStructureV3 to accurately distinguish cell boundaries. A more sophisticated table parsing strategy is required for improved accuracy.
