OCR Template Tools

This directory contains tools and utilities for creating OCR templates for Archparser.

Files Overview

Template Creator Tools

  • standalone-template-creator.html - ⭐ Recommended - Standalone tool that works offline with any PDF

Documentation

  • TEMPLATE-CREATOR-GUIDE.md - Complete guide for using both creator tools

Quick Start

Creating a New Template

  1. Open the standalone creator:

    open standalone-template-creator.html
  2. Load your PDF drawing

  3. Set viewport scale to match your backend (default: 2.0)

  4. Draw regions by clicking and dragging on the PDF

  5. Export JSON and save the file

  6. Import into Archparser via the frontend UI

Using Templates

Templates can be used when uploading drawing sets in the Archparser frontend. Select your template from the dropdown during upload.

Standalone

  • ✅ Works completely offline
  • ✅ No backend required
  • ✅ Can be shared with team members
  • ✅ Production users can use it
  • ✅ Converts PDFs directly in browser
  • ✅ Can be hosted on any web server

Use when:

  • Creating templates for production use
  • Sharing with team members
  • Working with new PDFs

Viewport Scale Settings

The viewport scale determines the resolution at which PDFs are rendered. This must match between:

  1. The template creator tool
  2. Your Archparser backend configuration

Common Scales

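| Scale | DPI  | Quality   | Use Case                          |
|-------|------|-----------|-----------------------------------|
| 2.0   | ~150 | Good      | Recommended default               |
| 4.0   | ~300 | Excellent | Medium - average processing speed |
| 6.0   | ~450 | Overkill  | Very large drawings only          |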
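
The DPI figures above are approximate. As a rough sketch, assuming scale 1.0 renders one pixel per PDF point (72 per inch, which is how PDF user space is defined), the relationship can be computed as below. The helper names `scaleToDpi` and `renderedSizePx` are hypothetical and not part of Archparser.

```typescript
// Assumption: scale 1.0 = 72 DPI (one pixel per PDF point).
const PDF_POINTS_PER_INCH = 72;

// Approximate DPI for a given viewport scale.
function scaleToDpi(viewportScale: number): number {
  return viewportScale * PDF_POINTS_PER_INCH;
}

// Pixel size of a sheet (given in inches) when rendered at a viewport scale.
function renderedSizePx(
  widthIn: number,
  heightIn: number,
  viewportScale: number
): { width: number; height: number } {
  const dpi = scaleToDpi(viewportScale);
  return {
    width: Math.round(widthIn * dpi),
    height: Math.round(heightIn * dpi),
  };
}

// Example: a landscape Arch D sheet (36 x 24 in) at the default scale of 2.0.
const archD = renderedSizePx(36, 24, 2.0);
console.log(scaleToDpi(2.0), archD.width, archD.height);
```

Under this assumption, scale 2.0 gives 144 DPI, which the table rounds to ~150.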

Finding Your Backend Scale

Check the Archparser backend configuration:

// backend/src/drawing-processing/ocr-config.ts
export const DEFAULT_OCR_CONFIG: OcrProcessingConfig = {
  viewportScale: 2.0, // ← This value
  // ...
};
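
Once you know the backend value, it can help to compare it with a template before importing. The following is only a sketch: `checkScale` is a hypothetical helper, and it assumes nothing about `OcrProcessingConfig` beyond the `viewportScale` field shown above.

```typescript
// Hypothetical helper: returns null when the scales agree,
// otherwise a human-readable warning.
function checkScale(templateScale: number, backendScale: number): string | null {
  // Compare with a small tolerance to avoid floating-point surprises.
  if (Math.abs(templateScale - backendScale) < 1e-9) {
    return null;
  }
  return (
    `Template was drawn at scale ${templateScale}, ` +
    `but the backend renders at ${backendScale}`
  );
}

// Example: a template exported at 4.0 against a backend configured for 2.0.
const warning = checkScale(4.0, 2.0);
if (warning !== null) {
  console.warn(warning);
}
```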

Template Format

Templates are JSON files with this structure:

{
  "name": "Template Name",
  "type": "standard",
  "viewportScale": 2.0,
  "regions": {
    "title": { "x": 100, "y": 200, "width": 500, "height": 80 },
    "drawing": { "x": 100, "y": 300, "width": 400, "height": 60 }
  }
}
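
For reference, the structure above could be described and checked in TypeScript roughly as follows. This is a minimal sketch based only on the fields shown in the example; the type names and `validateTemplate` are hypothetical, and Archparser's own importer may apply different or stricter rules.

```typescript
// Types inferred from the example JSON above (assumed, not taken from Archparser).
interface TemplateRegion { x: number; y: number; width: number; height: number }
interface OcrTemplate {
  name: string;
  type: string;
  viewportScale: number;
  regions: Record<string, TemplateRegion>;
}

function isFiniteNumber(v: unknown): v is number {
  return typeof v === "number" && isFinite(v);
}

// Returns a list of problems; an empty list means the shape looks right.
function validateTemplate(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null) {
    return ["template must be a JSON object"];
  }
  const errors: string[] = [];
  const t = raw as Record<string, unknown>;
  const name = t["name"];
  const scale = t["viewportScale"];
  const regions = t["regions"];
  if (typeof name !== "string" || name.trim() === "") {
    errors.push("name must be a non-empty string");
  }
  if (typeof t["type"] !== "string") {
    errors.push("type must be a string");
  }
  if (!isFiniteNumber(scale) || scale <= 0) {
    errors.push("viewportScale must be a positive number");
  }
  if (typeof regions !== "object" || regions === null) {
    errors.push("regions must be an object");
    return errors;
  }
  const regionMap = regions as Record<string, unknown>;
  for (const key of Object.keys(regionMap)) {
    const value = regionMap[key];
    if (typeof value !== "object" || value === null) {
      errors.push(`regions.${key} must be an object`);
      continue;
    }
    const r = value as Record<string, unknown>;
    for (const field of ["x", "y", "width", "height"]) {
      if (!isFiniteNumber(r[field])) {
        errors.push(`regions.${key}.${field} must be a number`);
      }
    }
  }
  return errors;
}
```

A typical use would be `validateTemplate(JSON.parse(fileText))` on an exported file before sharing it.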

Standard Regions

Common region names:

  • title - Drawing title
  • drawing - Drawing number
  • revision - Revision information
  • date - Drawing date
  • project - Project name
  • architect - Architect name
  • contractor - Contractor name
  • engineer - Engineer name
  • drawingType - Type of drawing
  • drawingCategory - Category
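
Because region names are plain JSON keys, a typo such as `Drawing` instead of `drawing` is easy to miss. The sketch below flags keys that are not in the list above. It assumes names are case-sensitive and that other names may still be valid, so treat the result as a hint rather than an error; `nonStandardRegionNames` is a hypothetical helper.

```typescript
// The common region names listed above.
const STANDARD_REGION_NAMES = [
  "title", "drawing", "revision", "date", "project",
  "architect", "contractor", "engineer", "drawingType", "drawingCategory",
] as const;

type StandardRegionName = (typeof STANDARD_REGION_NAMES)[number];

function isStandardRegionName(name: string): name is StandardRegionName {
  return (STANDARD_REGION_NAMES as readonly string[]).indexOf(name) !== -1;
}

// Returns the region keys that are not in the common list (possible typos).
function nonStandardRegionNames(regions: Record<string, unknown>): string[] {
  return Object.keys(regions).filter((name) => !isStandardRegionName(name));
}

// Example: "Drawing" is flagged because of its capital letter.
console.log(nonStandardRegionNames({ title: {}, Drawing: {}, drawingType: {} }));
```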

Coordinate System

Coordinates are in pixels from the top-left corner:

(0,0) ────────────────► X
  │
  │     ┌─────────┐
  │     │ Region  │ ← (x, y, width, height)
  │     └─────────┘
  │
  ▼
  Y
  • x - Distance from left edge
  • y - Distance from top edge
  • width - Region width
  • height - Region height

All values are in pixels at the specified viewport scale.
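
Since the coordinates are tied to the viewport scale, converting a region to another scale is a matter of multiplying by the ratio of the two scales. The sketch below illustrates this; `rescaleRegion` is a hypothetical helper, and rounding to whole pixels is an assumption that can shift an edge by one pixel when scaling down.

```typescript
interface TemplateRegion { x: number; y: number; width: number; height: number }

// Convert a region drawn at fromScale into the pixel space of toScale.
function rescaleRegion(
  region: TemplateRegion,
  fromScale: number,
  toScale: number
): TemplateRegion {
  if (fromScale <= 0 || toScale <= 0) {
    throw new Error("viewport scales must be positive");
  }
  const factor = toScale / fromScale;
  return {
    x: Math.round(region.x * factor),
    y: Math.round(region.y * factor),
    width: Math.round(region.width * factor),
    height: Math.round(region.height * factor),
  };
}

// Example: the "title" region from the template above, moved from 2.0 to 4.0.
const title = { x: 100, y: 200, width: 500, height: 80 };
console.log(rescaleRegion(title, 2.0, 4.0));
```

Redrawing the regions in the template creator at the target scale remains the safer option when precision matters.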

Sharing Templates

Within Your Organization

  1. Create template with standalone tool
  2. Export JSON file
  3. Share file with team members
  4. They import via Archparser UI

Public Sharing

Consider creating a repository of common templates:

  • Generic Arch D titleblock
  • Standard ANSI formats
  • Common AEC firm formats (with permission)

Hosting the Tool

You can host these HTML files on any web server:

# Simple Python server
cd template-creator
python3 -m http.server 8000

# Or use any static hosting service
# Netlify, Vercel, GitHub Pages, etc.

Then share the URL with your team.

Troubleshooting

"Out of bounds" errors

Cause: The template's viewport scale doesn't match the backend's, so region coordinates can fall outside the rendered page

Solution:

  1. Check backend viewport scale
  2. Recreate template at matching scale
  3. Or update backend to match template
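
To see which regions are affected, you can check each one against the page size in pixels at the backend's scale. This is a sketch of that idea only; `findOutOfBounds` and `PageSize` are hypothetical, and the backend's actual bounds check may differ.

```typescript
interface TemplateRegion { x: number; y: number; width: number; height: number }
interface PageSize { width: number; height: number }

// Returns the names of regions that extend past the page edges.
function findOutOfBounds(
  regions: Record<string, TemplateRegion>,
  page: PageSize
): string[] {
  return Object.keys(regions).filter((name) => {
    const r = regions[name];
    return (
      r.x < 0 ||
      r.y < 0 ||
      r.x + r.width > page.width ||
      r.y + r.height > page.height
    );
  });
}

// Example: a page rendered at 5184 x 3456 px, with one region whose
// coordinates were drawn at a larger scale.
const page = { width: 5184, height: 3456 };
const regions = {
  title: { x: 100, y: 200, width: 500, height: 80 },
  drawing: { x: 9000, y: 6000, width: 800, height: 120 },
};
console.log(findOutOfBounds(regions, page));
```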

No text extracted

Cause: Regions positioned incorrectly

Solution:

  1. Process a test drawing
  2. Check where title block actually appears
  3. Adjust regions accordingly

Partial extraction

Cause: Regions too small or positioned slightly off

Solution:

  1. Make regions slightly larger
  2. Use template creator's zoom feature for precision
  3. Test with multiple drawings to verify
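
If you prefer to enlarge regions in the exported JSON rather than redrawing them, a small margin can be added on every side. The following is a sketch under that assumption; `padRegion` is a hypothetical helper, and the result is clamped to the page so the padding does not cause an "Out of bounds" error.

```typescript
interface TemplateRegion { x: number; y: number; width: number; height: number }
interface PageSize { width: number; height: number }

// Grow a region by `margin` pixels on each side, clamped to the page.
function padRegion(
  region: TemplateRegion,
  margin: number,
  page: PageSize
): TemplateRegion {
  const x = Math.max(0, region.x - margin);
  const y = Math.max(0, region.y - margin);
  const right = Math.min(page.width, region.x + region.width + margin);
  const bottom = Math.min(page.height, region.y + region.height + margin);
  return { x, y, width: right - x, height: bottom - y };
}

// Example: add 10 px around the "title" region from the template above.
const padded = padRegion(
  { x: 100, y: 200, width: 500, height: 80 },
  10,
  { width: 5184, height: 3456 }
);
console.log(padded);
```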

Support

For issues or questions:

  • Check documentation in /guides directory
  • Review backend logs for coordinate warnings
  • Test with standalone tool to verify coordinates
  • Compare viewport scales between tool and backend

Best Practices

  1. Name templates clearly - Include firm, drawing type, and version
  2. Match viewport scale - Always verify backend settings first
  3. Test thoroughly - Process multiple drawings to verify accuracy
  4. Document quirks - Note any special considerations
  5. Version control - Keep backups of working templates
  6. Share with team - Make templates available to all users
