
Labeling Tools Research: Engineering Drawing Annotation

Source file: Labeling Tools for Technical Engineering Drawings and GD&T Annotation.docx
Last synthesized: March 2026

Executive Summary

This research evaluates labeling and annotation tools suitable for preparing training data for vision models that analyze engineering drawings and GD&T (Geometric Dimensioning & Tolerancing) callouts.

Recommended approach: Combination of open-source CVAT (for general bounding-box annotation) with custom GD&T-specific labeling schema (JSON-based metadata tagging).

Current operational note (March 2026): This page remains the tool-survey reference. The active RapidDraft labeling schema and friend-facing visual examples now live in RapidDraft > Drawing Analysis, where the current Label Studio class definitions and reference examples are maintained.


What We Need to Label

Drawing Elements

  1. GD&T Callouts — Positional tolerance, profile, perpendicularity, runout symbols
  2. Surface Finish — Ra, Rz, machining marks (ISO 1302 symbols)
  3. Dimensions — Linear, angular, with tolerance ranges
  4. Material Specs — Material grades, hardness, plating specs
  5. Title Block Info — Part number, revision, date, tolerances
  6. Process Notes — Text annotations ("deburr all edges", "break sharp corners")
  7. Fastener Callouts — Screw sizes, torque specs, thread types
  8. Notes & References — Secondary information, cross-references

Annotation Format

  • Bounding box: Locate symbol or text in image
  • Classification: What type of callout (tolerance, surface finish, etc.)
  • Value extraction: Numeric value or symbolic code (e.g., "M10", "Ra 1.6", "⊥ 0.1")
  • Confidence: How clear is the callout? (clear, partial, unclear)

Evaluated Tools

1. CVAT (Computer Vision Annotation Tool)

Description: Open-source annotation platform originally developed by Intel; widely used for computer vision datasets

Strengths:
  • ✅ Free and open-source (MIT license)
  • ✅ Self-hosted (no cloud dependency)
  • ✅ Excellent bounding-box annotation UI
  • ✅ Multi-user support (collaborative)
  • ✅ Export to multiple formats (COCO JSON, PASCAL VOC, YOLO TXT)
  • ✅ Version control for annotations
  • ✅ Large community, good documentation
  • ✅ Keyboard shortcuts for speed

Limitations:
  • ⚠️ Generic bounding-box + classification (not GD&T-specific)
  • ⚠️ Text extraction requires manual entry (no native OCR)
  • ⚠️ Deployment requires Docker Compose; not lightweight (GPU needed only for automatic annotation)
  • ⚠️ No built-in GD&T symbol library (custom setup needed)

Best for: Core annotation workflow; export to COCO format for training

Cost: Free (self-hosted)

Setup:

# CVAT runs as a multi-container stack; the documented setup uses Docker Compose:
git clone https://github.com/cvat-ai/cvat
cd cvat
docker compose up -d
# Create an admin account:
docker exec -it cvat_server bash -ic 'python3 ~/manage.py createsuperuser'
# Access at http://localhost:8080


2. Labelbox (Commercial SaaS)

Description: Cloud-based annotation platform with enterprise focus

Strengths:
  • ✅ Excellent UI/UX (intuitive for annotators)
  • ✅ Built-in OCR (Tesseract backend)
  • ✅ Quality assurance workflows (review, consensus)
  • ✅ Collaboration tools (comments, assignments)
  • ✅ Model-assisted labeling (pre-fills some annotations)
  • ✅ Easy team scaling
  • ✅ Integrations with ML pipelines

Limitations:
  • ❌ Expensive: $500-5000+/month depending on volume
  • ❌ Cloud-based (data leaves your machine)
  • ⚠️ Not GD&T-specific (requires custom schema)
  • ⚠️ Data export/ownership terms can be restrictive

Best for: Teams with large datasets (1000+ images); enterprise with budget

Cost: Per-image pricing or subscription ($500/month minimum)


3. Prodigy (Active Learning)

Description: Lightweight annotation tool by Explosion AI; focused on active learning

Strengths:
  • ✅ Lightweight (runs on a laptop)
  • ✅ Active learning (suggests examples to label)
  • ✅ Fast annotation interface
  • ✅ Good for NLP + image tasks
  • ✅ Python API (easy integration)

Limitations:
  • ⚠️ Expensive: $3000/year for a single user
  • ⚠️ Better for NLP than vision
  • ⚠️ Limited multi-user support
  • ❌ Not specifically designed for GD&T

Best for: Small teams with active learning workflow; NLP focus

Cost: $3000/year (educational discounts available)


4. Custom Web-Based Annotation (Build Your Own)

Description: Build a lightweight custom annotation UI for your specific schema

Strengths:
  • ✅ Fully customizable for GD&T symbols
  • ✅ No licensing costs
  • ✅ Own your data
  • ✅ Can optimize for your specific workflow

Limitations:
  • ❌ High upfront cost (2-4 weeks to build)
  • ❌ No collaboration/QA features (requires additional work)
  • ⚠️ Maintenance burden
  • ⚠️ Team training required

Best for: If you have specific GD&T schema not covered by off-the-shelf tools


5. VGG Image Annotator (VIA) — Lightweight

Description: Minimal, browser-based annotation tool (runs locally, no installation)

Strengths:
  • ✅ Zero installation (pure HTML/JS)
  • ✅ Works offline
  • ✅ Export to JSON/CSV
  • ✅ Free and open-source

Limitations:
  • ❌ Very basic UI (single-user)
  • ❌ No collaboration or QA workflow
  • ⚠️ Limited to simple shapes (rectangles, polygons)
  • ❌ Not suitable for large teams

Best for: Quick prototyping or small datasets (<100 images)

Cost: Free (open-source)


6. Roboflow (ML-Focused)

Description: Vision-first annotation platform with built-in model training

Strengths:
  • ✅ Annotation + training in one platform
  • ✅ Model-assisted labeling (faster)
  • ✅ Version control for datasets
  • ✅ Integration with popular frameworks (YOLOv8, etc.)
  • ✅ Good documentation

Limitations:
  • ❌ Expensive for large datasets
  • ❌ Cloud-based (data privacy concerns)
  • ⚠️ Not designed for text/symbol-heavy drawings
  • ⚠️ GD&T extraction would require custom training

Best for: Vision teams wanting integrated annotation + training pipeline

Cost: Pay-per-upload ($0.25 per 1K images minimum)


Comparison Matrix

| Criterion | CVAT | Labelbox | Prodigy | Custom | VIA | Roboflow |
|---|---|---|---|---|---|---|
| Cost | ✅ Free | ❌ $500+/mo | ⚠️ $3K/yr | Medium | ✅ Free | ⚠️ Pay-per-use |
| GD&T support | ⚠️ Generic | ⚠️ Custom | ⚠️ Custom | ✅ Custom | ⚠️ Generic | ⚠️ Custom |
| Self-hosted | ✅ Yes | ❌ Cloud | ✅ Yes | ✅ Yes | ✅ Yes | ❌ Cloud |
| Collaboration | ✅ Good | ✅ Excellent | ⚠️ Limited | Depends | ❌ None | ✅ Good |
| QA workflow | ✅ Yes | ✅ Excellent | ⚠️ Limited | Depends | ❌ None | ✅ Yes |
| Training export | ✅ COCO JSON | ✅ COCO JSON | ✅ Python | Depends | ✅ JSON | ✅ YOLOv8 |
| Ease of setup | ⚠️ Docker | ✅ Browser | ✅ Python | ❌ High | ✅ Trivial | ✅ Browser |
| Data privacy | ✅ Own infra | ❌ Cloud | ✅ Local | ✅ Own infra | ✅ Local | ❌ Cloud |
| OCR/text extraction | ❌ No | ✅ Yes | ⚠️ Limited | Depends | ❌ No | ⚠️ Yes |

Recommendation for TextCAD

Phased Approach

Phase 1: Prototype (Months 1-2)

Use: CVAT + Custom GD&T Schema

  1. Deploy CVAT locally (Docker Compose)
  2. Create custom class definitions:
       • tolerance_positional
       • tolerance_profile
       • tolerance_perpendicular
       • surface_finish
       • material_spec
       • process_note
       • etc.
  3. Bounding-box annotation: Localize each callout in drawing images
  4. Export: COCO JSON format
  5. Metadata tagging: Add custom JSON fields for values (e.g., "tolerance_value: 0.1 mm")
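The metadata-tagging step can be a small post-processing pass over CVAT's COCO export. A minimal sketch, assuming a side-car metadata file keyed by annotation id (the file names and side-car layout are illustrative assumptions, not a CVAT convention):

```python
# Sketch: attach GD&T metadata (value, datum refs, ...) to a COCO export.
import json

def merge_gdt_metadata(coco_path: str, meta_path: str, out_path: str) -> None:
    """Merge per-annotation GD&T fields into a COCO JSON file."""
    with open(coco_path) as f:
        coco = json.load(f)
    with open(meta_path) as f:
        # Assumed side-car layout: {"1": {"value": "Ø0.1", "datum_refs": ["A", "B"]}}
        meta = json.load(f)
    for ann in coco["annotations"]:
        extra = meta.get(str(ann["id"]))
        if extra:
            # COCO readers tolerate extra keys; keep ours under "attributes"
            ann.setdefault("attributes", {}).update(extra)
    with open(out_path, "w") as f:
        json.dump(coco, f, ensure_ascii=False, indent=2)
```

Keeping the GD&T fields under a single "attributes" key means standard COCO tooling still parses the file unchanged.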

Expected dataset: 200-500 annotated drawing images from various CAD tools

Cost: $0 (self-hosted CVAT)

Team: 1-2 annotators + 1 engineer (schema design + export pipeline)


Phase 2: Scale (Months 3-6)

Enhance Phase 1 with OCR + Value Extraction

  1. Add OCR layer: Tesseract or cloud OCR to pre-fill symbol values
  2. Improve schema: Validate extracted values against tolerance standards (ISO 1101 codes)
  3. Quality assurance: Second pass (review annotations for accuracy)
  4. Training dataset: Target 1000+ labeled images
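The value-validation step can start as simple pattern checks on OCR output. A sketch under stated assumptions: the class names and regexes below are illustrative, not a full ISO 1101 grammar:

```python
# Sketch: flag OCR'd callout values that don't match an expected pattern.
import re

# Illustrative patterns only; real validation needs the full symbol set.
PATTERNS = {
    # Optional diameter symbol, a numeric tolerance, up to three datum letters.
    "tolerance_positional": re.compile(r"^[⌀Ø]?\d+(\.\d+)?(\s+[A-Z]){0,3}$"),
    # Ra/Rz roughness values, e.g. "Ra 1.6".
    "surface_finish": re.compile(r"^R[az]\s?\d+(\.\d+)?$"),
}

def validate_value(cls: str, text: str) -> bool:
    """Return True if the extracted text looks plausible for the class."""
    pat = PATTERNS.get(cls)
    return bool(pat and pat.match(text.strip()))
```

Records that fail validation would be routed to the second-pass QA queue rather than dropped.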

Cost: Minimal (Tesseract is free; custom review pipeline)


Phase 3: Production (Months 6-12)

If dataset and model quality justify it:

Consider Labelbox for: - Larger annotation team (10+ people) - Advanced QA workflows - Active learning (model suggests uncertain examples for annotation)

Cost: $1000-3000/month (if justified by volume)


Custom GD&T Annotation Schema

Recommended JSON format:

{
  "image_id": "drawing_12345.png",
  "annotations": [
    {
      "id": 1,
      "bbox": [100, 200, 150, 250],
      "class": "tolerance_positional",
      "symbol_type": "position_hole",
      "value": "Ø0.1",
      "datum_refs": ["A", "B"],
      "confidence": "high",
      "notes": "Hole position tolerance",
      "ocr_text": "⌀0.1 A B C"
    },
    {
      "id": 2,
      "bbox": [300, 400, 350, 425],
      "class": "surface_finish",
      "roughness": "Ra 1.6",
      "production_method": "machined",
      "confidence": "high",
      "notes": "Surface finish callout"
    },
    {
      "id": 3,
      "bbox": [50, 500, 200, 530],
      "class": "process_note",
      "text": "Deburr all edges",
      "ocr_text": "Deburr all edges",
      "confidence": "high"
    }
  ]
}
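Records in this format can be sanity-checked before export. A minimal sketch; the required-field set mirrors the example above, and the bbox convention ([x1, y1, x2, y2] in pixels) is an assumption to pin down in the annotation guidelines:

```python
# Sketch: structural check for one annotation record in the schema above.
REQUIRED_FIELDS = {"id", "bbox", "class", "confidence"}

def check_annotation(ann: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - ann.keys())]
    bbox = ann.get("bbox")
    if bbox is not None and len(bbox) != 4:
        problems.append("bbox must have 4 coordinates")
    return problems
```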

Implementation Roadmap

| Phase | Timeline | Tool | Team | Output |
|---|---|---|---|---|
| Phase 1 | 2 months | CVAT + custom schema | 2 people | 200-500 labeled images, COCO JSON |
| Phase 2 | 3 months | CVAT + OCR | 2 people | 1000+ labeled images, refined schema |
| Phase 3 | Optional | Labelbox (if scale needed) | 5-10 people | 5000+ labeled images, production model |

Training Pipeline Integration

  1. Annotated images (COCO JSON)
  2. Vision model training (PyTorch; YOLOv8 for localization)
  3. GD&T symbol classifier (separate model for classification)
  4. Value extraction (OCR + entity linking)
  5. Integration into DFM pipeline (vision findings)
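The hand-off from COCO JSON to YOLO-style training data is a small coordinate transform. A sketch, assuming COCO's [x, y, width, height] pixel convention for boxes:

```python
# Sketch: convert one COCO bbox to a YOLO txt line
# ("class_idx cx cy w h", all normalized to [0, 1]).
def coco_to_yolo(bbox, img_w: int, img_h: int, class_idx: int) -> str:
    x, y, w, h = bbox                     # COCO: top-left corner + size, pixels
    cx = (x + w / 2) / img_w              # YOLO: box center, normalized
    cy = (y + h / 2) / img_h
    return f"{class_idx} {cx:.6f} {cy:.6f} {w / img_w:.6f} {h / img_h:.6f}"
```

For example, `coco_to_yolo([100, 200, 50, 50], 1000, 1000, 0)` yields `"0 0.125000 0.225000 0.050000 0.050000"`.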

Risks & Mitigation

| Risk | Severity | Mitigation |
|---|---|---|
| Annotation quality inconsistency | Medium | Second-pass QA; inter-annotator agreement checks |
| GD&T symbols hard to standardize | Medium | Clear annotation guidelines; symbol library examples |
| OCR fails on handwritten notes | Low | Accept as manual; log for user review |
| Different drawing standards (ISO, ASME, JIS) | Medium | Document target standard; tag by standard |

Conclusion

Phase 1 recommendation: Use CVAT with custom GD&T schema.

  • Minimal cost (free, self-hosted)
  • Suitable for prototype dataset (200-500 images)
  • Flexible schema allows GD&T-specific metadata
  • Clear upgrade path to commercial tools if scale demands it
  • Export format (COCO JSON) compatible with modern vision frameworks

Start with 200 representative drawings from diverse CAD tools; measure inter-annotator agreement; iterate on schema if needed.
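The inter-annotator agreement measurement can be grounded in box overlap. A minimal sketch, assuming boxes as [x1, y1, x2, y2] (matched pairs above an IoU threshold would then be compared on class labels):

```python
# Sketch: intersection-over-union between two boxes, the usual building
# block for checking whether two annotators drew "the same" callout.
def iou(a, b) -> float:
    """Boxes as [x1, y1, x2, y2]; returns overlap in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```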

If team size grows or annotation speed becomes a bottleneck, migrate to Labelbox for enterprise features (QA, active learning, collaboration).
