
Labeling Tools Research: Engineering Drawing Annotation

Source file: Labeling Tools for Technical Engineering Drawings and GD&T Annotation.docx
Last synthesized: March 2026

Executive Summary

This research evaluates labeling and annotation tools suitable for preparing training data for vision models that analyze engineering drawings and GD&T (Geometric Dimensioning & Tolerancing) callouts.

Recommended approach: Combination of open-source CVAT (for general bounding-box annotation) with custom GD&T-specific labeling schema (JSON-based metadata tagging).

Current operational note (March 2026): This page remains the tool-survey reference. The active RapidDraft labeling schema and friend-facing visual examples now live in RapidDraft > Drawing Analysis, where the current Label Studio class definitions and reference examples are maintained.


What We Need to Label

Drawing Elements

  1. GD&T Callouts — Positional tolerance, profile, perpendicularity, runout symbols
  2. Surface Finish — Ra, Rz, machining marks (ISO 1302 symbols)
  3. Dimensions — Linear, angular, with tolerance ranges
  4. Material Specs — Material grades, hardness, plating specs
  5. Title Block Info — Part number, revision, date, tolerances
  6. Process Notes — Text annotations ("deburr all edges", "break sharp corners")
  7. Fastener Callouts — Screw sizes, torque specs, thread types
  8. Notes & References — Secondary information, cross-references

Annotation Format

  • Bounding box: Locate symbol or text in image
  • Classification: What type of callout (tolerance, surface finish, etc.)
  • Value extraction: Numeric value or symbolic code (e.g., "M10", "Ra 1.6", "⊥ 0.1")
  • Confidence: How clear is the callout? (clear, partial, unclear)

Evaluated Tools

1. CVAT (Computer Vision Annotation Tool)

Description: Open-source annotation platform originally developed by Intel; widely used for computer vision datasets

Strengths:
  • ✅ Free and open-source (MIT license)
  • ✅ Self-hosted (no cloud dependency)
  • ✅ Excellent bounding-box annotation UI
  • ✅ Multi-user support (collaborative)
  • ✅ Export to multiple formats (COCO JSON, PASCAL VOC, YOLO TXT)
  • ✅ Version control for annotations
  • ✅ Large community, good documentation
  • ✅ Keyboard shortcuts for speed

Limitations:
  • ⚠️ Generic bounding-box + classification (not GD&T-specific)
  • ⚠️ Text extraction requires manual entry (no native OCR)
  • ⚠️ Deployment requires Docker Compose; not lightweight (GPU needed only for automatic annotation)
  • ⚠️ No built-in GD&T symbol library (custom setup needed)

Best for: Core annotation workflow; export to COCO format for training

Cost: Free (self-hosted)

Setup:

# CVAT runs as a multi-container stack; the documented setup uses Docker Compose:
git clone https://github.com/cvat-ai/cvat
cd cvat
docker compose up -d
# Create an admin account:
docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
# Access at http://localhost:8080


2. Labelbox (Commercial SaaS)

Description: Cloud-based annotation platform with enterprise focus

Strengths:
  • ✅ Excellent UI/UX (intuitive for annotators)
  • ✅ Built-in OCR (Tesseract backend)
  • ✅ Quality assurance workflows (review, consensus)
  • ✅ Collaboration tools (comments, assignments)
  • ✅ Model-assisted labeling (pre-fills some annotations)
  • ✅ Easy team scaling
  • ✅ Integrations with ML pipelines

Limitations:
  • ❌ Expensive: $500-5000+/month depending on volume
  • ❌ Cloud-based (data leaves your machine)
  • ⚠️ Not GD&T-specific (requires custom schema)
  • ⚠️ Data export/ownership terms can be restrictive

Best for: Teams with large datasets (1000+ images); enterprise with budget

Cost: Per-image pricing or subscription ($500/month minimum)


3. Prodigy (Active Learning)

Description: Lightweight annotation tool by Explosion AI; focused on active learning

Strengths:
  • ✅ Lightweight (runs on a laptop)
  • ✅ Active learning (suggests examples to label)
  • ✅ Fast annotation interface
  • ✅ Good for NLP + image tasks
  • ✅ Python API (easy integration)

Limitations:
  • ⚠️ Expensive: $3000/year for a single user
  • ⚠️ Better for NLP than vision
  • ⚠️ Limited multi-user support
  • ❌ Not specifically designed for GD&T

Best for: Small teams with active learning workflow; NLP focus

Cost: $3000/year (educational discounts available)


4. Custom Web-Based Annotation (Build Your Own)

Description: Build a lightweight custom annotation UI for your specific schema

Strengths:
  • ✅ Fully customizable for GD&T symbols
  • ✅ No licensing costs
  • ✅ Own your data
  • ✅ Can optimize for your specific workflow

Limitations:
  • ❌ High upfront cost (2-4 weeks to build)
  • ❌ No collaboration/QA features (requires additional work)
  • ⚠️ Maintenance burden
  • ⚠️ Team training required

Best for: If you have specific GD&T schema not covered by off-the-shelf tools


5. VGG Image Annotator (VIA) — Lightweight

Description: Minimal, browser-based annotation tool (runs locally, no installation)

Strengths:
  • ✅ Zero installation (pure HTML/JS)
  • ✅ Works offline
  • ✅ Export to JSON/CSV
  • ✅ Free and open-source

Limitations:
  • ❌ Very basic UI (single-user)
  • ❌ No collaboration or QA workflow
  • ⚠️ Limited to simple shapes (rectangles, polygons)
  • ❌ Not suitable for large teams

Best for: Quick prototyping or small datasets (<100 images)

Cost: Free (open-source)


6. Roboflow (ML-Focused)

Description: Vision-first annotation platform with built-in model training

Strengths:
  • ✅ Annotation + training in one platform
  • ✅ Model-assisted labeling (faster)
  • ✅ Version control for datasets
  • ✅ Integration with popular frameworks (YOLOv8, etc.)
  • ✅ Good documentation

Limitations:
  • ❌ Expensive for large datasets
  • ❌ Cloud-based (data privacy concerns)
  • ⚠️ Not designed for text/symbol-heavy drawings
  • ⚠️ GD&T extraction would require custom training

Best for: Vision teams wanting integrated annotation + training pipeline

Cost: Pay-per-upload ($0.25 per 1K images minimum)


Comparison Matrix

| Criterion | CVAT | Labelbox | Prodigy | Custom | VIA | Roboflow |
|---|---|---|---|---|---|---|
| Cost | ✅ Free | ❌ $500+/mo | ⚠️ $3K/yr | Medium | ✅ Free | ⚠️ Pay-per-use |
| GD&T support | ⚠️ Generic | ⚠️ Custom | ⚠️ Custom | ✅ Custom | ⚠️ Generic | ⚠️ Custom |
| Self-hosted | ✅ Yes | ❌ Cloud | ✅ Yes | ✅ Yes | ✅ Yes | ❌ Cloud |
| Collaboration | ✅ Good | ✅ Excellent | ⚠️ Limited | Depends | ❌ None | ✅ Good |
| QA workflow | ✅ Yes | ✅ Excellent | ⚠️ Limited | Depends | ❌ None | ✅ Yes |
| Training export | ✅ COCO JSON | ✅ COCO JSON | ✅ Python | Depends | ✅ JSON | ✅ YOLOv8 |
| Ease of setup | ⚠️ Docker | ✅ Browser | ✅ Python | ❌ High | ✅ Trivial | ✅ Browser |
| Data privacy | ✅ Own infra | ❌ Cloud | ✅ Local | ✅ Own infra | ✅ Local | ❌ Cloud |
| OCR/text extraction | ❌ No | ✅ Yes | ⚠️ Limited | Depends | ❌ No | ⚠️ Yes |

Recommendation for TextCAD

Phased Approach

Phase 1: Prototype (Months 1-2)

Use: CVAT + Custom GD&T Schema

  1. Deploy CVAT locally (Docker Compose)
  2. Create custom class definitions:
       • tolerance_positional
       • tolerance_profile
       • tolerance_perpendicular
       • surface_finish
       • material_spec
       • process_note
       • etc.
  3. Bounding-box annotation: Localize each callout in drawing images
  4. Export: COCO JSON format
  5. Metadata tagging: Add custom JSON fields for values (e.g., "tolerance_value: 0.1 mm")
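The metadata-tagging step can be a small post-processing pass over CVAT's COCO export. A minimal sketch, assuming a side-car metadata file keyed by annotation id (the file names and side-car layout are illustrative assumptions, not a CVAT convention):

```python
# Sketch: attach GD&T metadata (value, datum refs, ...) to a COCO export.
import json

def merge_gdt_metadata(coco_path: str, meta_path: str, out_path: str) -> None:
    """Merge per-annotation GD&T fields into a COCO JSON file."""
    with open(coco_path) as f:
        coco = json.load(f)
    with open(meta_path) as f:
        # Assumed side-car layout: {"1": {"value": "Ø0.1", "datum_refs": ["A", "B"]}}
        meta = json.load(f)
    for ann in coco["annotations"]:
        extra = meta.get(str(ann["id"]))
        if extra:
            # COCO readers tolerate extra keys; keep ours under "attributes"
            ann.setdefault("attributes", {}).update(extra)
    with open(out_path, "w") as f:
        json.dump(coco, f, ensure_ascii=False, indent=2)
```

Keeping the GD&T fields under a single "attributes" key means standard COCO tooling still parses the file unchanged.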

Expected dataset: 200-500 annotated drawing images from various CAD tools

Cost: $0 (self-hosted CVAT)

Team: 1-2 annotators + 1 engineer (schema design + export pipeline)


Phase 2: Scale (Months 3-6)

Enhance Phase 1 with OCR + Value Extraction

  1. Add OCR layer: Tesseract or cloud OCR to pre-fill symbol values
  2. Improve schema: Validate extracted values against tolerance standards (ISO 1101 codes)
  3. Quality assurance: Second pass (review annotations for accuracy)
  4. Training dataset: Target 1000+ labeled images
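The value-validation step can start as simple pattern checks on OCR output. A sketch under stated assumptions: the class names and regexes below are illustrative, not a full ISO 1101 grammar:

```python
# Sketch: flag OCR'd callout values that don't match an expected pattern.
import re

# Illustrative patterns only; real validation needs the full symbol set.
PATTERNS = {
    # Optional diameter symbol, a numeric tolerance, up to three datum letters.
    "tolerance_positional": re.compile(r"^[⌀Ø]?\d+(\.\d+)?(\s+[A-Z]){0,3}$"),
    # Ra/Rz roughness values, e.g. "Ra 1.6".
    "surface_finish": re.compile(r"^R[az]\s?\d+(\.\d+)?$"),
}

def validate_value(cls: str, text: str) -> bool:
    """Return True if the extracted text looks plausible for the class."""
    pat = PATTERNS.get(cls)
    return bool(pat and pat.match(text.strip()))
```

Records that fail validation would be routed to the second-pass QA queue rather than dropped.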

Cost: Minimal (Tesseract is free; custom review pipeline)


Phase 3: Production (Months 6-12)

If dataset and model quality justify it:

Consider Labelbox for: - Larger annotation team (10+ people) - Advanced QA workflows - Active learning (model suggests uncertain examples for annotation)

Cost: $1000-3000/month (if justified by volume)


Custom GD&T Annotation Schema

Recommended JSON format:

{
  "image_id": "drawing_12345.png",
  "annotations": [
    {
      "id": 1,
      "bbox": [100, 200, 150, 250],
      "class": "tolerance_positional",
      "symbol_type": "position_hole",
      "value": "Ø0.1",
      "datum_refs": ["A", "B"],
      "confidence": "high",
      "notes": "Hole position tolerance",
      "ocr_text": "⌀0.1 A B C"
    },
    {
      "id": 2,
      "bbox": [300, 400, 350, 425],
      "class": "surface_finish",
      "roughness": "Ra 1.6",
      "production_method": "machined",
      "confidence": "high",
      "notes": "Surface finish callout"
    },
    {
      "id": 3,
      "bbox": [50, 500, 200, 530],
      "class": "process_note",
      "text": "Deburr all edges",
      "ocr_text": "Deburr all edges",
      "confidence": "high"
    }
  ]
}
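Records in this format can be sanity-checked before export. A minimal sketch; the required-field set mirrors the example above, and the bbox convention ([x1, y1, x2, y2] in pixels) is an assumption to pin down in the annotation guidelines:

```python
# Sketch: structural check for one annotation record in the schema above.
REQUIRED_FIELDS = {"id", "bbox", "class", "confidence"}

def check_annotation(ann: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - ann.keys())]
    bbox = ann.get("bbox")
    if bbox is not None and len(bbox) != 4:
        problems.append("bbox must have 4 coordinates")
    return problems
```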

Implementation Roadmap

| Phase | Timeline | Tool | Team | Output |
|---|---|---|---|---|
| Phase 1 | 2 months | CVAT + custom schema | 2 people | 200-500 labeled images, COCO JSON |
| Phase 2 | 3 months | CVAT + OCR | 2 people | 1000+ labeled images, refined schema |
| Phase 3 | Optional | Labelbox (if scale needed) | 5-10 people | 5000+ labeled images, production model |

Training Pipeline Integration

  1. Annotated images (COCO JSON)
  2. Vision model training (PyTorch; YOLOv8 for localization)
  3. GD&T symbol classifier (separate model for classification)
  4. Value extraction (OCR + entity linking)
  5. Integration into DFM pipeline (vision findings)
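The hand-off from COCO JSON to YOLO-style training data is a small coordinate transform. A sketch, assuming COCO's [x, y, width, height] pixel convention for boxes:

```python
# Sketch: convert one COCO bbox to a YOLO txt line
# ("class_idx cx cy w h", all normalized to [0, 1]).
def coco_to_yolo(bbox, img_w: int, img_h: int, class_idx: int) -> str:
    x, y, w, h = bbox                     # COCO: top-left corner + size, pixels
    cx = (x + w / 2) / img_w              # YOLO: box center, normalized
    cy = (y + h / 2) / img_h
    return f"{class_idx} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"
```

For example, `coco_to_yolo([100, 200, 50, 50], 1000, 1000, 0)` yields `"0 0.125000 0.225000 0.050000 0.050000"`.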

Risks & Mitigation

| Risk | Severity | Mitigation |
|---|---|---|
| Annotation quality inconsistency | Medium | Second-pass QA; inter-annotator agreement checks |
| GD&T symbols hard to standardize | Medium | Clear annotation guidelines; symbol library examples |
| OCR fails on handwritten notes | Low | Accept as manual; log for user review |
| Different drawing standards (ISO, ASME, JIS) | Medium | Document target standard; tag by standard |

Conclusion

Phase 1 recommendation: Use CVAT with custom GD&T schema.

  • Minimal cost (free, self-hosted)
  • Suitable for prototype dataset (200-500 images)
  • Flexible schema allows GD&T-specific metadata
  • Clear upgrade path to commercial tools if scale demands it
  • Export format (COCO JSON) compatible with modern vision frameworks

Start with 200 representative drawings from diverse CAD tools; measure inter-annotator agreement; iterate on schema if needed.
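The inter-annotator agreement measurement can be grounded in box overlap. A minimal sketch, assuming boxes as [x1, y1, x2, y2] (matched pairs above an IoU threshold would then be compared on class labels):

```python
# Sketch: intersection-over-union between two boxes, the usual building
# block for checking whether two annotators drew "the same" callout.
def iou(a, b) -> float:
    """Boxes as [x1, y1, x2, y2]; returns overlap in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```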

If team size grows or annotation speed becomes a bottleneck, migrate to Labelbox for enterprise features (QA, active learning, collaboration).
