Label Studio Schema
Source of truth: `engineering_drawing_label_schema_pipeline_v1.xlsx`
Role: Authoritative v1 labeling schema for RapidDraft drawing-analysis training
Summary
The workbook defines a two-layer system:
- 18 visual object classes for first-pass detection
- 5 error labels for review logic and relation checking
That separation is important. The first detector should learn drawing structure, not try to infer nuanced drafting mistakes directly from pixels on day one.
Label Studio Mapping
Use Label Studio to capture the dataset in this shape:
- Use rectangle labels for the 18 object classes.
- Store sheet metadata at task level: `standard_family`, `drawing_type`, `units`, `source_type`, `image_quality`.
- Keep the 5 error types as review tags, relation annotations, or external JSON linked to the same sheet.
- For synthetic error samples, keep a reviewer note explaining why the sample is wrong.
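As a rough sketch of the mapping above, the labeling config can be generated from the class list rather than written by hand. `View`, `Image`, `RectangleLabels`, and `Label` are standard Label Studio config tags. The control names `image` and `object_class`, the `$image` data key, and the helpers `build_label_config` and `build_task` are assumptions made for illustration and do not come from the workbook.

```python
import xml.etree.ElementTree as ET

# The 18 object classes from the schema, in ID order.
OBJECT_CLASSES = [
    "title_block", "revision_block", "parts_list_bom", "notes_block",
    "general_tolerance_note", "drawing_view", "section_cutting_plane",
    "detail_callout", "item_balloon_find_number", "dimension_cluster",
    "leader_callout", "hole_callout", "thread_callout", "datum_symbol",
    "feature_control_frame", "surface_finish_symbol", "weld_symbol",
    "section_hatch_region",
]


def build_label_config(classes=OBJECT_CLASSES):
    """Return a Label Studio config string with one rectangle label per class."""
    view = ET.Element("View")
    # "$image" assumes each task stores its sheet image under data["image"].
    ET.SubElement(view, "Image", name="image", value="$image")
    labels = ET.SubElement(
        view, "RectangleLabels", name="object_class", toName="image"
    )
    for cls in classes:
        ET.SubElement(labels, "Label", value=cls)
    return ET.tostring(view, encoding="unicode")


def build_task(image_url, **metadata):
    """Return a task dict that keeps sheet metadata next to the image."""
    return {"data": {"image": image_url, **metadata}}


config_xml = build_label_config()
task = build_task(
    "sheet_001.png",
    standard_family="ASME",
    drawing_type="detail",
    units="inch",
    source_type="synthetic",
    image_quality="clean_export",
)
```

Error labels are deliberately left out of this config so that the rectangle vocabulary stays limited to the 18 structural classes.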
Object Classes
| ID | Class | Group | Priority | Annotation unit | Include | Exclude |
|---|---|---|---|---|---|---|
| 1 | `title_block` | region | P0 | Whole title block area as one box | Title/name, drawing number, company, scale, units, approvals grouped in one block | Do not box each field separately in v1 |
| 2 | `revision_block` | region | P0 | Whole revision table/block | Vertical or horizontal revision table | Do not annotate single revision letters outside the block unless they are the only revision indicator |
| 3 | `parts_list_bom` | region | P0 | Entire BOM / parts list table | Tabular material or item list | Do not split rows in v1 |
| 4 | `notes_block` | region | P0 | Whole general notes area | Ordered notes, process notes, flag-note lists | Exclude isolated callout text tied to one feature |
| 5 | `general_tolerance_note` | region | P1 | General tolerance statement as one box | "Unless otherwise specified" style note near the title block | Exclude specific tolerance callouts attached to a feature |
| 6 | `drawing_view` | region | P0 | Boundary of a single orthographic or pictorial view | Front, top, side, isometric, or enlarged view | Exclude title block, notes, and tables |
| 7 | `section_cutting_plane` | symbol | P1 | Entire cutting-plane indicator including arrows and label | A-A, B-B section indicators | Do not label hatch only |
| 8 | `detail_callout` | symbol | P1 | Circular/detail identifier with leader | Detail A, enlarged-view markers, view reference markers | Exclude section cutting plane |
| 9 | `item_balloon_find_number` | symbol | P0 | Balloon and number together | Assembly item identifiers / find numbers | Exclude note numbers and zone numbers |
| 10 | `dimension_cluster` | symbol_group | P0 | Dimension text, arrows, and line as one object | Linear, angular, radius, and diameter dimension groups | Do not box every digit or arrow separately |
| 11 | `leader_callout` | symbol_group | P1 | Leader line plus attached note or callout text | Feature notes, flag notes, item callouts tied to geometry | Exclude BOM balloons and dimension clusters |
| 12 | `hole_callout` | text_symbol_group | P1 | Complete hole note string | Ø, THRU, CSK, CBORE, depth, quantity callouts | Exclude pure thread-only notes if not hole related |
| 13 | `thread_callout` | text_symbol_group | P1 | Complete thread specification string | M6x1, 1/4-20 UNC, thread standards note | Exclude generic note-block text |
| 14 | `datum_symbol` | symbol | P0 | Datum feature symbol with attached leader/triangle if present | A, B, C datum identifiers | Exclude plain letters that are not datums |
| 15 | `feature_control_frame` | symbol | P0 | Entire GD&T frame as one box | Single- or multi-segment FCF | Do not split frame cells in v1 |
| 16 | `surface_finish_symbol` | symbol | P1 | Whole surface texture symbol and value | Ra / roughness symbols | Exclude general note text describing finish without a symbol |
| 17 | `weld_symbol` | symbol | P1 | Weld symbol plus tail/reference where connected | Fillet, groove, and related weld annotations | Exclude geometry lines without a weld symbol |
| 18 | `section_hatch_region` | region | P2 | Continuous hatch area inside a sectioned part | Crosshatch indicating cut material | Exclude shading used only for pictorial emphasis |
Error Labels
These are not the first-pass detector classes. They are better treated as review outputs, relation labels, or downstream rule-engine results.
| ID | Error | Level | Severity | Definition | Evidence needed | Annotation rule |
|---|---|---|---|---|---|---|
| 1 | `undefined_centerline_reference` | relation | High | A feature or dimension is located from a centerline or axis that is not clearly defined | `dimension_cluster` + `hole_callout` + `datum_symbol` or center reference nearby | Box the affected dimension or hole note and mark the relation to the missing/offending reference |
| 2 | `missing_angular_orientation` | relation | High | Pattern, hole, or slot location lacks clocking or tertiary datum when angular orientation is needed | `hole_callout` + `dimension_cluster` + `datum_symbol` or `feature_control_frame` | Label the affected feature group, not the whole sheet |
| 3 | `ambiguous_or_inaccessible_datum` | relation | High | Datum feature is unclear, poorly chosen, or inaccessible for inspection/use | `datum_symbol` + `feature_control_frame` + nearby feature | Annotate the datum and affected FCF or feature together |
| 4 | `note_missing_acceptance_criteria` | object | Medium | A note gives instruction but not enough acceptance/rejection criteria or is ambiguous | `notes_block` or `leader_callout` | Box only the ambiguous note unless the whole notes block is defective |
| 5 | `bom_or_findnumber_mismatch` | sheet_or_relation | Medium | BOM or parts list does not agree with balloons/find numbers, or a listed part is not identified in the field | `parts_list_bom` + `item_balloon_find_number` | Prefer a sheet-level flag plus optional boxes on offending rows or balloons |
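
To illustrate why these labels suit a downstream rule engine better than a pixel detector, here is a minimal sketch of a `bom_or_findnumber_mismatch` check. It assumes OCR has already turned the `parts_list_bom` rows and `item_balloon_find_number` boxes into item numbers. The function name and the result keys are hypothetical and are not defined by the workbook.

```python
def check_bom_findnumber_mismatch(bom_items, balloon_numbers):
    """Compare BOM item numbers against balloon find numbers for one sheet."""
    bom = set(bom_items)
    balloons = set(balloon_numbers)
    missing_balloons = sorted(bom - balloons)    # listed in BOM, never ballooned
    unlisted_balloons = sorted(balloons - bom)   # ballooned, no matching BOM row
    return {
        "error": "bom_or_findnumber_mismatch",
        # Sheet-level flag, as the annotation rule prefers.
        "flagged": bool(missing_balloons or unlisted_balloons),
        # Optional pointers to the offending rows or balloons.
        "bom_items_without_balloon": missing_balloons,
        "balloons_without_bom_row": unlisted_balloons,
    }


result = check_bom_findnumber_mismatch(
    bom_items=[1, 2, 3, 4],
    balloon_numbers=[1, 2, 4, 7],
)
```

Here item 3 has no balloon and balloon 7 has no BOM row, so the sheet is flagged and both offenders are reported for optional boxing.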
Sheet Metadata
| Field | Allowed values | Why it matters |
|---|---|---|
| `standard_family` | ASME, ISO, NASA-tailored, unknown | Standards change note expectations and interpretation rules. |
| `drawing_type` | detail, assembly, inseparable_assembly, schematic, other | Drawing type changes what content is required. |
| `units` | metric, inch, mixed, unknown | Improves OCR normalization and tolerance parsing. |
| `source_type` | real_production, book_sample, synthetic, web | Helps measure the domain gap between training and production data. |
| `image_quality` | clean_export, scanned_clean, scanned_noisy, photo | Helps target augmentation and failure analysis. |
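
One way to keep these fields clean is to validate each task's metadata before import. The sketch below encodes the allowed values from the table. The `validate_sheet_metadata` helper and its message format are assumptions for illustration, not part of the workbook.

```python
# Allowed values per metadata field, copied from the table above.
ALLOWED_METADATA = {
    "standard_family": {"ASME", "ISO", "NASA-tailored", "unknown"},
    "drawing_type": {
        "detail", "assembly", "inseparable_assembly", "schematic", "other",
    },
    "units": {"metric", "inch", "mixed", "unknown"},
    "source_type": {"real_production", "book_sample", "synthetic", "web"},
    "image_quality": {"clean_export", "scanned_clean", "scanned_noisy", "photo"},
}


def validate_sheet_metadata(metadata):
    """Return a list of problems; an empty list means the metadata is valid."""
    problems = []
    for field, allowed in ALLOWED_METADATA.items():
        if field not in metadata:
            problems.append(f"missing field: {field}")
        elif metadata[field] not in allowed:
            problems.append(f"invalid {field}: {metadata[field]!r}")
    return problems


good = {
    "standard_family": "ISO",
    "drawing_type": "assembly",
    "units": "metric",
    "source_type": "real_production",
    "image_quality": "scanned_clean",
}
bad = {**good, "units": "mm"}
del bad["image_quality"]
```

Calling `validate_sheet_metadata(good)` returns an empty list, while `bad` yields one problem for the unrecognized `units` value and one for the missing `image_quality` field.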
Pipeline
- Fix scope first: v1 targets mechanical detail and assembly drawings only.
- Capture metadata per sheet so results can be sliced by standard family, drawing type, units, and scan quality.
- Start with a pilot of 50 sheets to test class stability and frequency.
- Rasterize large drawings to PNG and tile with overlap while keeping splits by original sheet.
- Annotate object classes first and train a detector on structure before trying to label every error.
- Run OCR after detection for title blocks, notes, BOMs, hole/thread callouts, and FCFs.
- Add relation links on a smaller subset, such as balloon-to-BOM row and FCF-to-datum.
- Label the 5 error types on a smaller curated set, with reviewer notes explaining why each case is wrong.
- Train a v1 detector on the 18 classes and evaluate per class, not only overall mAP.
- Build a rule layer that converts detections, OCR, and relations into sheet-level checks.
- Review false alarms and convert each one into a relabel, a new synthetic sample, or a schema change.
- Scale carefully and postpone thin-line primitives like hidden lines, centerlines, extension lines, and arrowheads to phase 2.
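The tiling step can be sketched as follows. The tile size of 1024 px and overlap of 128 px are placeholder assumptions, since the schema does not fix these values, and `tile_origins` and `tile_boxes` are hypothetical helper names. The code only computes crop boxes, so it is independent of whichever imaging library does the rasterizing.

```python
def tile_origins(length, tile, overlap):
    """Start offsets along one axis so tiles cover `length` with `overlap`."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # final tile sits flush with the far edge
    return origins


def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (left, top, right, bottom) boxes covering the whole raster."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


boxes = tile_boxes(width=2000, height=1024)
```

Placing the last tile flush with the edge keeps every tile full-sized on large sheets, at the cost of a larger overlap on the final row or column. Tiles should carry their parent sheet ID so that splits can still be made by original sheet.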
Consistency Rules
- Box the whole title block, revision block, BOM, and notes block; do not decompose them into every field or row in v1.
- Treat a `dimension_cluster` as one named object, not many tiny boxes on digits, arrows, and line segments.
- Treat a `feature_control_frame` as one box, even if it has multiple cells.
- Keep `hole_callout` and `thread_callout` as complete strings so OCR and downstream parsing see the full instruction.
- Use `drawing_view` boxes for view boundaries only; do not let those boxes swallow title blocks or tables.
- Use `section_cutting_plane` for the indicator with arrows and label; use `section_hatch_region` only for the cut-material hatch area.
- Keep error labels out of the base detector vocabulary unless a future experiment proves a visual-only error class is stable enough.
- Split data by original drawing or sheet, never by crop or tile, to avoid train/validation leakage.
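The last rule can be enforced mechanically by deriving each tile's split from its parent sheet. The sketch below uses a stable hash of the sheet ID. The hashing approach, the 20% validation fraction, and the names `split_for_sheet` and `split_tiles` are illustrative assumptions rather than the project's actual method.

```python
import hashlib


def split_for_sheet(sheet_id, val_fraction=0.2):
    """Assign a sheet to 'train' or 'val' from a stable hash of its ID."""
    digest = hashlib.sha256(sheet_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return "val" if bucket < val_fraction else "train"


def split_tiles(tiles):
    """Map each (sheet_id, tile_name) pair to the split of its parent sheet."""
    return {tile_name: split_for_sheet(sheet_id) for sheet_id, tile_name in tiles}


tiles = [
    ("sheet_001", "sheet_001_tile_0.png"),
    ("sheet_001", "sheet_001_tile_1.png"),
    ("sheet_002", "sheet_002_tile_0.png"),
]
assignment = split_tiles(tiles)
```

Because the split depends only on the sheet ID, every crop or tile from one drawing lands on the same side, and re-running the assignment after adding new sheets does not move existing ones.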
Where The Visual Examples Live
Use the Visual Label Reference page when annotators need a quick visual answer about region boundaries. If the visual example and the schema ever appear to disagree, this schema page wins.