
Drawing Ensemble v1

Why this page exists

This page captures how the combined CVAT auto-annotation pass was built, why those particular models were chosen, and which model currently owns which label in the merged output.

The goal was not to replace every individual detector. The goal was to create one smoother CVAT annotation pass that uses the strongest parts of several detectors behind the scenes and returns one merged shapes payload to the annotator.

The combined connector

The current user-facing CVAT connector is:

Combine (YOLO26s Drawing Detect v1, CAD Drawing iy9tc v14, NEXFORM Drawing Elements v5)

Under the hood, this connector runs three backends on the same image:

  1. YOLO26s Drawing Detect v1 (CPU)
  2. CAD Drawing iy9tc v14
  3. NEXFORM Drawing Elements v5

It then normalizes all outputs into the shared CVAT label vocabulary and merges them into one final annotation payload.

How the combined pass was chosen

The selection came from the local comparison set in:

D:\02_Code\49_yolotraining_firstdataset\comparison_set

The practical result of that comparison was that no single model dominated every useful label. Instead, the models were complementary.

The strongest pattern was:

  • YOLO detect remained the most useful broad layout and symbol detector
  • CAD iy9tc v14 became useful once drawing -> drawing_view was mapped correctly
  • NEXFORM v5 was the strongest source for notes_block, title_block, and many dense dimension_cluster detections

That made an ensemble more useful than repeated manual detector passes.

Which model is used for which label

The current v1 ownership policy is:

| CVAT label | Primary model | Why |
| --- | --- | --- |
| drawing_view | YOLO26s Drawing Detect v1 (CPU) | Best broad view/layout behavior and strongest symbol/layout support overall |
| dimension_cluster | CAD Drawing iy9tc v14 | Useful dimension-region contribution after the CAD mapping fix |
| notes_block | NEXFORM Drawing Elements v5 | Best notes-region coverage on the comparison set |
| title_block | NEXFORM Drawing Elements v5 | Best useful title-block behavior without over-mapping title-block subfields |
| parts_list_bom | YOLO26s Drawing Detect v1 (CPU) | Lives in the broad layout/symbol family already covered by YOLO |
| thread_callout | YOLO26s Drawing Detect v1 (CPU) | Currently best served by the local YOLO path |
| surface_finish_symbol | YOLO26s Drawing Detect v1 (CPU) | Currently best served by the local YOLO path |
| feature_control_frame | YOLO26s Drawing Detect v1 (CPU) | Currently best served by the local YOLO path |

The current fallback policy is:

| CVAT label | Fallback model | Condition |
| --- | --- | --- |
| drawing_view | CAD Drawing iy9tc v14 | Use drawing only when there is no overlapping kept YOLO detection |
| dimension_cluster | NEXFORM Drawing Elements v5 | Use NEXFORM dimension only when CAD has no overlapping kept detection |
| notes_block | CAD Drawing iy9tc v14 | Use CAD Notes only when NEXFORM has no overlapping kept detection |
| title_block | YOLO26s Drawing Detect v1 (CPU) | Only if YOLO already emits a compatible title_block detection |
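
The two policies above can be pictured as a pair of lookup tables. The following is only a sketch: the names `PRIMARY_OWNER`, `FALLBACK_OWNER`, and `role_of` are hypothetical and are not taken from ensemble_handler.py, which may organize this differently.

```python
# Hypothetical constants; the labels and model names come from the tables above.
YOLO = "YOLO26s Drawing Detect v1 (CPU)"
CAD = "CAD Drawing iy9tc v14"
NEXFORM = "NEXFORM Drawing Elements v5"

# Primary owner per CVAT label.
PRIMARY_OWNER = {
    "drawing_view": YOLO,
    "dimension_cluster": CAD,
    "notes_block": NEXFORM,
    "title_block": NEXFORM,
    "parts_list_bom": YOLO,
    "thread_callout": YOLO,
    "surface_finish_symbol": YOLO,
    "feature_control_frame": YOLO,
}

# Fallback owner per CVAT label; labels without an entry have no fallback.
FALLBACK_OWNER = {
    "drawing_view": CAD,
    "dimension_cluster": NEXFORM,
    "notes_block": CAD,
    "title_block": YOLO,
}


def role_of(label: str, source_model: str):
    """Return 'primary', 'fallback', or None for a (label, model) pair."""
    if PRIMARY_OWNER.get(label) == source_model:
        return "primary"
    if FALLBACK_OWNER.get(label) == source_model:
        return "fallback"
    return None
```

Keeping ownership in plain data like this would make it straightforward to revisit the table when the component detectors change.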

How the merge works

The merged pass does not simply concatenate detector outputs.

It first normalizes every prediction into a common internal contract:

  • label
  • confidence
  • type
  • points
  • source_model
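
As an illustration, the contract could be expressed as a small dataclass. This is a sketch under assumptions: the class name `Prediction`, the helper `from_xyxy`, and the choice of rectangle shapes with points ordered as [x1, y1, x2, y2] are all hypothetical and not confirmed by the handler source.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Prediction:
    label: str            # shared CVAT label, e.g. "notes_block"
    confidence: float     # detector score
    type: str             # shape type; "rectangle" is assumed here
    points: List[float]   # assumed order: [x1, y1, x2, y2]
    source_model: str     # which backend produced the prediction


def from_xyxy(label, confidence, box, source_model) -> Prediction:
    """Build a rectangle Prediction from a corner box, fixing corner order."""
    x1, y1, x2, y2 = box
    points = [float(min(x1, x2)), float(min(y1, y2)),
              float(max(x1, x2)), float(max(y1, y2))]
    return Prediction(label, float(confidence), "rectangle", points, source_model)


# Example: a box given with swapped corners is stored top-left first.
example = from_xyxy("notes_block", 0.9, (30, 40, 10, 20),
                    "NEXFORM Drawing Elements v5")
example_dict = asdict(example)  # plain dict with the five contract fields
```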

Then it applies these rules:

  1. all raw classes are translated into the shared CVAT labels first
  2. predictions are grouped only within the same target label
  3. primary-owner predictions are kept first
  4. fallback predictions are added only if they do not overlap a kept primary prediction for that same label at IoU >= 0.4
  5. final same-label NMS runs at IoU 0.4
  6. different labels do not suppress one another
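
Rules 2 through 6 can be sketched in a few functions. This is not the code from ensemble_handler.py. It assumes predictions are dicts with the contract fields above, that shapes are axis-aligned rectangles with points as [x1, y1, x2, y2], that suppression triggers at IoU >= the threshold, and that predictions from a model that is neither primary nor fallback for a label are dropped. The names `iou`, `nms`, and `merge` are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(preds, thr):
    """Greedy NMS: keep highest-confidence boxes, drop overlaps at IoU >= thr."""
    kept = []
    for p in sorted(preds, key=lambda p: p["confidence"], reverse=True):
        if all(iou(p["points"], k["points"]) < thr for k in kept):
            kept.append(p)
    return kept


def merge(preds, primary, fallback, thr=0.4):
    """Merge already-translated predictions label by label."""
    by_label = {}
    for p in preds:  # rule 2: group within the same target label only
        by_label.setdefault(p["label"], []).append(p)

    merged = []
    for label, group in by_label.items():
        # Rule 3: primary-owner predictions are kept first.
        primaries = [p for p in group if p["source_model"] == primary.get(label)]
        kept = list(primaries)
        # Rule 4: add a fallback only if it overlaps no kept primary.
        for p in group:
            if p["source_model"] == fallback.get(label) and all(
                iou(p["points"], k["points"]) < thr for k in primaries
            ):
                kept.append(p)
        # Rule 5: final same-label NMS. Rule 6 holds because labels never mix.
        merged.extend(nms(kept, thr))
    return merged
```

With a threshold of 0.4, a fallback box that covers most of a primary box is discarded, while a fallback box elsewhere on the sheet survives and fills the coverage gap.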

This keeps the pass closer to "maximum useful coverage" than any one detector alone, while staying cleaner than manually stacking several full detector runs.

Important mapping decisions

The combined pass depends on one important CAD mapping improvement:

drawing -> drawing_view

Without that change, CAD Drawing iy9tc v14 would return useful raw detections that CVAT would silently drop.

For NEXFORM v5, the useful v1 mappings are intentionally conservative:

  • dimension -> dimension_cluster
  • notes and drawing notes -> notes_block
  • Title block -> title_block
  • view-like classes -> drawing_view

Field-level title-block classes like company, material, description, and tolerances are currently left unmapped because they create fragmented, noisy title-block output instead of high-value region annotations.
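
The mapping decisions above amount to per-model translation tables in which an unmapped raw class is simply dropped. The sketch below uses hypothetical names (`NEXFORM_V5_MAP`, `CAD_IY9TC_V14_MAP`, `translate`), and the raw class spellings are copied from this page rather than verified against the models. The view-like NEXFORM classes are left out because their raw names are not listed here.

```python
# Raw class -> shared CVAT label. Spellings follow this page and may differ
# from what the models actually emit.
NEXFORM_V5_MAP = {
    "dimension": "dimension_cluster",
    "notes": "notes_block",
    "drawing notes": "notes_block",
    "Title block": "title_block",
    # view-like classes -> "drawing_view" (raw names not listed on this page)
}

CAD_IY9TC_V14_MAP = {
    "drawing": "drawing_view",  # the mapping fix described above
}


def translate(raw_class, mapping):
    """Return the CVAT label for a raw class, or None if it is unmapped."""
    return mapping.get(raw_class)


def translate_all(raw_classes, mapping):
    """Translate a list of raw classes, dropping the unmapped ones."""
    labels = (translate(c, mapping) for c in raw_classes)
    return [label for label in labels if label is not None]


# Title-block subfields such as "company" stay unmapped and are dropped.
example_labels = translate_all(["Title block", "company", "dimension"],
                               NEXFORM_V5_MAP)
```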

What improved

The ensemble improved the practical single-pass workflow in two ways.

First, it reduced the need to run several detectors manually just to get decent coverage. Second, it preserved different model strengths in one click instead of forcing one model to do everything.

The comparison report used for this pass showed:

  • notes_block coverage winner in isolation: NEXFORM Drawing Elements v5
  • drawing_view coverage winner in isolation: CAD Drawing iy9tc v14
  • stronger combined dimension_cluster, title_block, and some symbol coverage once the merged pass was applied

Current limitations

The ensemble is deliberately conservative, not exhaustive.

It still leaves some potentially useful raw classes unmapped, especially from the NEXFORM family, because some of those classes represent title-block subfields rather than the whole regions the current CVAT schema actually wants.

Also, the combined pass is only as good as the current model set. If the component detectors change, the ownership table should be revisited rather than treated as permanent truth.

Open questions

  1. Should dimension_cluster remain CAD-owned, or should NEXFORM v5 become the primary owner now that its raw dimension density is often higher?
  2. Should a later ensemble version include one of the newer workspace-trained Roboflow models once their mapping and runtime behavior are stabilized?
  3. Should title-block subfield classes eventually be represented as a different annotation family instead of being forced into the current region schema?

Sources

  • D:\02_Code\50_CVAT_RoboFlow\cvat_serverless\drawing_ensemble_v1\ensemble_handler.py
  • D:\02_Code\50_CVAT_RoboFlow\cvat_serverless\drawing_ensemble_v1\function.yaml
  • D:\02_Code\50_CVAT_RoboFlow\cvat_serverless\roboflow_technical_drawing_public_v1\function.yaml
  • D:\02_Code\50_CVAT_RoboFlow\outputs\reports\detector_comparison_20260413_193916.md
  • D:\02_Code\49_yolotraining_firstdataset\comparison_set