Label Studio Schema

Source of truth: engineering_drawing_label_schema_pipeline_v1.xlsx
Role: Authoritative v1 labeling schema for RapidDraft drawing-analysis training

Summary

The workbook defines a two-layer system:

  • 18 visual object classes for first-pass detection
  • 5 error labels for review logic and relation checking

That separation is deliberate. The first detector should learn drawing structure rather than try to infer nuanced drafting mistakes directly from pixels.

Label Studio Mapping

Use Label Studio to capture the dataset in this shape:

  • Use rectangle labels for the 18 object classes.
  • Store sheet metadata at task level: standard_family, drawing_type, units, source_type, image_quality.
  • Keep the 5 error types as review tags, relation annotations, or external JSON linked to the same sheet.
  • For synthetic error samples, keep a reviewer note explaining why the sample is wrong.
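
The mapping above can be sketched as a small script that builds a labeling config and one import task. This is a minimal sketch, not the project's actual setup: the control names "image" and "objects", the helper names, and the example URL are assumptions for illustration. The class names are copied from the Object Classes table.

```python
import xml.etree.ElementTree as ET

# The 18 object class names, copied from the Object Classes table.
OBJECT_CLASSES = [
    "title_block", "revision_block", "parts_list_bom", "notes_block",
    "general_tolerance_note", "drawing_view", "section_cutting_plane",
    "detail_callout", "item_balloon_find_number", "dimension_cluster",
    "leader_callout", "hole_callout", "thread_callout", "datum_symbol",
    "feature_control_frame", "surface_finish_symbol", "weld_symbol",
    "section_hatch_region",
]


def build_label_config(classes):
    """Return a labeling config string with one rectangle label per class."""
    view = ET.Element("View")
    # "image" and "objects" are assumed control names, not from the workbook.
    ET.SubElement(view, "Image", name="image", value="$image")
    rect = ET.SubElement(view, "RectangleLabels", name="objects", toName="image")
    for cls in classes:
        ET.SubElement(rect, "Label", value=cls)
    return ET.tostring(view, encoding="unicode")


def build_task(image_url, metadata):
    """Return one import task; sheet metadata is stored in the task's data dict."""
    return {"data": {"image": image_url, **metadata}}


config = build_label_config(OBJECT_CLASSES)
task = build_task(
    "https://example.com/sheets/sheet_0001.png",  # placeholder URL (assumption)
    {
        "standard_family": "ASME",
        "drawing_type": "detail",
        "units": "inch",
        "source_type": "synthetic",
        "image_quality": "clean_export",
    },
)
```

Keeping the metadata in the task data rather than in per-box annotations means every rectangle drawn on the sheet inherits the same sheet-level context.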

Object Classes

ID | Class | Group | Priority | Annotation unit | Include | Exclude
1 | title_block | region | P0 | Whole title block area as one box | Title/name, drawing number, company, scale, units, approvals grouped in one block | Do not box each field separately in v1
2 | revision_block | region | P0 | Whole revision table/block | Vertical or horizontal revision table | Do not annotate single revision letters outside the block unless they are the only revision indicator
3 | parts_list_bom | region | P0 | Entire BOM / parts list table | Tabular material or item list | Do not split rows in v1
4 | notes_block | region | P0 | Whole general notes area | Ordered notes, process notes, flag-note lists | Exclude isolated callout text tied to one feature
5 | general_tolerance_note | region | P1 | General tolerance statement as one box | "Unless otherwise specified" style note near the title block | Exclude specific tolerance callouts attached to a feature
6 | drawing_view | region | P0 | Boundary of a single orthographic or pictorial view | Front, top, side, isometric, or enlarged view | Exclude title block, notes, and tables
7 | section_cutting_plane | symbol | P1 | Entire cutting-plane indicator including arrows and label | A-A, B-B section indicators | Do not label hatch only
8 | detail_callout | symbol | P1 | Circular/detail identifier with leader | Detail A, enlarged-view markers, view reference markers | Exclude section cutting plane
9 | item_balloon_find_number | symbol | P0 | Balloon and number together | Assembly item identifiers / find numbers | Exclude note numbers and zone numbers
10 | dimension_cluster | symbol_group | P0 | Dimension text, arrows, and line as one object | Linear, angular, radius, and diameter dimension groups | Do not box every digit or arrow separately
11 | leader_callout | symbol_group | P1 | Leader line plus attached note or callout text | Feature notes, flag notes, item callouts tied to geometry | Exclude BOM balloons and dimension clusters
12 | hole_callout | text_symbol_group | P1 | Complete hole note string | Ø, THRU, CSK, CBORE, depth, quantity callouts | Exclude pure thread-only notes if not hole related
13 | thread_callout | text_symbol_group | P1 | Complete thread specification string | M6x1, 1/4-20 UNC, thread standards note | Exclude generic note-block text
14 | datum_symbol | symbol | P0 | Datum feature symbol with attached leader/triangle if present | A, B, C datum identifiers | Exclude plain letters that are not datums
15 | feature_control_frame | symbol | P0 | Entire GD&T frame as one box | Single- or multi-segment FCF | Do not split frame cells in v1
16 | surface_finish_symbol | symbol | P1 | Whole surface texture symbol and value | Ra / roughness symbols | Exclude general note text describing finish without a symbol
17 | weld_symbol | symbol | P1 | Weld symbol plus tail/reference where connected | Fillet, groove, and related weld annotations | Exclude geometry lines without a weld symbol
18 | section_hatch_region | region | P2 | Continuous hatch area inside a sectioned part | Crosshatch indicating cut material | Exclude shading used only for pictorial emphasis

Error Labels

These are not the first-pass detector classes. They are better treated as review outputs, relation labels, or downstream rule-engine results.

ID | Error | Level | Severity | Definition | Evidence needed | Annotation rule
1 | undefined_centerline_reference | relation | High | A feature or dimension is located from a centerline or axis that is not clearly defined | dimension_cluster + hole_callout + datum_symbol or center reference nearby | Box the affected dimension or hole note and mark the relation to the missing/offending reference
2 | missing_angular_orientation | relation | High | Pattern, hole, or slot location lacks clocking or tertiary datum when angular orientation is needed | hole_callout + dimension_cluster + datum_symbol or feature_control_frame | Label the affected feature group, not the whole sheet
3 | ambiguous_or_inaccessible_datum | relation | High | Datum feature is unclear, poorly chosen, or inaccessible for inspection/use | datum_symbol + feature_control_frame + nearby feature | Annotate the datum and affected FCF or feature together
4 | note_missing_acceptance_criteria | object | Medium | A note gives instruction but not enough acceptance/rejection criteria or is ambiguous | notes_block or leader_callout | Box only the ambiguous note unless the whole notes block is defective
5 | bom_or_findnumber_mismatch | sheet_or_relation | Medium | BOM or parts list does not agree with balloons/find numbers, or a listed part is not identified in the field | parts_list_bom + item_balloon_find_number | Prefer a sheet-level flag plus optional boxes on offending rows or balloons
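
As an illustration of treating an error label as a downstream rule result, the bom_or_findnumber_mismatch check can be sketched as a set comparison. This is a simplified sketch under assumptions: the function name and output keys are hypothetical, and the inputs are assumed to be item numbers already extracted by OCR from parts_list_bom rows and item_balloon_find_number detections.

```python
def bom_findnumber_mismatch(bom_items, balloon_numbers):
    """Compare BOM item numbers with balloon find numbers from one sheet.

    Both arguments are iterables of item numbers (assumed already parsed).
    """
    bom = set(bom_items)
    balloons = set(balloon_numbers)
    return {
        # Parts listed in the BOM that no balloon identifies in the field.
        "listed_but_not_ballooned": sorted(bom - balloons),
        # Balloons whose find number has no matching BOM row.
        "ballooned_but_not_listed": sorted(balloons - bom),
        # Sheet-level flag, matching the annotation rule in the table.
        "flag": bom != balloons,
    }


result = bom_findnumber_mismatch([1, 2, 3, 4], [1, 2, 4, 5])
# result["listed_but_not_ballooned"] is [3]
# result["ballooned_but_not_listed"] is [5]
# result["flag"] is True
```

The two lists point at the offending rows or balloons, so a reviewer can add the optional boxes without rereading the whole sheet.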

Sheet Metadata

Field | Allowed values | Why it matters
standard_family | ASME, ISO, NASA-tailored, unknown | Standards change note expectations and interpretation rules.
drawing_type | detail, assembly, inseparable_assembly, schematic, other | Drawing type changes what content is required.
units | metric, inch, mixed, unknown | Improves OCR normalization and tolerance parsing.
source_type | real_production, book_sample, synthetic, web | Helps measure the domain gap between training and production data.
image_quality | clean_export, scanned_clean, scanned_noisy, photo | Helps target augmentation and failure analysis.
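
The allowed values above can be enforced before tasks are imported. The following is a minimal validation sketch; the function name and message wording are assumptions, while the field names and allowed values are copied from the table.

```python
# Allowed values per sheet metadata field, copied from the table above.
ALLOWED = {
    "standard_family": {"ASME", "ISO", "NASA-tailored", "unknown"},
    "drawing_type": {"detail", "assembly", "inseparable_assembly",
                     "schematic", "other"},
    "units": {"metric", "inch", "mixed", "unknown"},
    "source_type": {"real_production", "book_sample", "synthetic", "web"},
    "image_quality": {"clean_export", "scanned_clean", "scanned_noisy",
                      "photo"},
}


def validate_sheet_metadata(meta):
    """Return a list of problems; an empty list means the metadata is valid."""
    problems = []
    for field, allowed in ALLOWED.items():
        if field not in meta:
            problems.append(f"missing field: {field}")
        elif meta[field] not in allowed:
            problems.append(f"invalid {field}: {meta[field]!r}")
    for field in meta:
        if field not in ALLOWED:
            problems.append(f"unexpected field: {field}")
    return problems


good = {
    "standard_family": "ISO",
    "drawing_type": "assembly",
    "units": "metric",
    "source_type": "real_production",
    "image_quality": "scanned_noisy",
}
bad = dict(good, units="mm")  # "mm" is not an allowed value for units
```

Here validate_sheet_metadata(good) returns an empty list, and validate_sheet_metadata(bad) reports the invalid units value.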

Pipeline

  1. Fix scope first: v1 targets mechanical detail and assembly drawings only.
  2. Capture metadata per sheet so results can be sliced by standard family, drawing type, units, and scan quality.
  3. Start with a pilot of 50 sheets to test class stability and frequency.
  4. Rasterize large drawings to PNG and tile with overlap while keeping splits by original sheet.
  5. Annotate object classes first and train a detector on structure before trying to label every error.
  6. Run OCR after detection for title blocks, notes, BOMs, hole/thread callouts, and FCFs.
  7. Add relation links on a smaller subset, such as balloon-to-BOM row and FCF-to-datum.
  8. Label the 5 error types on a smaller curated set, with reviewer notes explaining why each case is wrong.
  9. Train a v1 detector on the 18 classes and evaluate per class, not only overall mAP.
  10. Build a rule layer that converts detections, OCR, and relations into sheet-level checks.
  11. Review false alarms and convert each one into a relabel, a new synthetic sample, or a schema change.
  12. Scale carefully and postpone thin-line primitives like hidden lines, centerlines, extension lines, and arrowheads to phase 2.
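
Step 4 (tiling with overlap while keeping the original sheet identity) can be sketched as follows. The tile size of 1024 pixels, the overlap of 128 pixels, and the function and key names are assumptions for illustration; the workbook does not specify them. The sketch only computes tile windows and leaves rasterization and cropping to whatever image library is in use.

```python
def tile_origins(length, tile, overlap):
    """Return start offsets along one axis so tiles of size `tile` cover `length`."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be at least 0 and smaller than tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    # Place the last tile flush with the far edge so nothing is cut off.
    origins.append(length - tile)
    return origins


def tile_sheet(sheet_id, width, height, tile=1024, overlap=128):
    """Return tile windows for one rasterized sheet, each tagged with sheet_id."""
    tiles = []
    for y in tile_origins(height, tile, overlap):
        for x in tile_origins(width, tile, overlap):
            tiles.append({
                "sheet_id": sheet_id,  # kept so splits can follow the sheet
                "x": x,
                "y": y,
                "w": min(tile, width),
                "h": min(tile, height),
            })
    return tiles


tiles = tile_sheet("sheet_0001", width=2500, height=1200)
```

Because the final tile is snapped to the far edge, its overlap with the previous tile can be larger than the requested value, which keeps every tile the same size.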

Consistency Rules

  • Box the whole title block, revision block, BOM, and notes block; do not decompose them into every field or row in v1.
  • Treat a dimension_cluster as one named object, not many tiny boxes on digits, arrows, and line segments.
  • Treat a feature_control_frame as one box, even if it has multiple cells.
  • Keep hole_callout and thread_callout as complete strings so OCR and downstream parsing see the full instruction.
  • Use drawing_view boxes for view boundaries only; do not let those boxes swallow title blocks or tables.
  • Use section_cutting_plane for the indicator with arrows and label; use section_hatch_region only for the cut-material hatch area.
  • Keep error labels out of the base detector vocabulary unless a future experiment proves a visual-only error class is stable enough.
  • Split data by original drawing or sheet, never by crop or tile, to avoid train/validation leakage.
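
The last rule can be sketched as a deterministic split keyed on the sheet identifier, so every tile from one sheet lands in the same split. This is one possible approach, not necessarily the one the pipeline uses; the function names, the "tile_id" and "sheet_id" keys, and the 20% validation fraction are assumptions.

```python
import hashlib


def split_for_sheet(sheet_id, val_fraction=0.2):
    """Map a sheet ID to "train" or "val" using a stable hash of the ID."""
    digest = hashlib.sha256(sheet_id.encode("utf-8")).digest()
    # Turn the first 8 bytes into a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "val" if bucket < val_fraction else "train"


def assign_splits(tiles, val_fraction=0.2):
    """Assign each tile the split of its parent sheet, never its own."""
    return {
        t["tile_id"]: split_for_sheet(t["sheet_id"], val_fraction)
        for t in tiles
    }


demo_tiles = [
    {"tile_id": f"sheet_A_tile_{i}", "sheet_id": "sheet_A"} for i in range(4)
]
demo_splits = assign_splits(demo_tiles)
```

Hashing the sheet ID instead of shuffling tiles means the assignment stays the same when new sheets are added later, and all four demo tiles share one split.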

Where The Visual Examples Live

Use the Visual Label Reference page when annotators need a quick visual answer about region boundaries. If the visual example and the schema ever appear to disagree, this schema page wins.
