RapidDraft Agent and Local AI Server Architecture

Last updated: 2026-06-02

Purpose

This page explains how the local AI server supports RapidDraft Agent for the current Theegarten demo path. It is written for a technical/product audience that needs to understand the architecture well enough to explain it.

The important framing is:

  • RapidDraft Agent is a tool-orchestrating product feature inside RapidDraft.
  • RapidDraft Knowledge is one backend tool family the Agent can call for indexed documents, search, answers, citations, and inventory.
  • The Fedora server runs Knowledge, local models, LiteLLM, embeddings, OCR/RAG, and local vision.
  • Hosted RapidDraft backends can reach Fedora through knowledge.rapiddraft.ai and localai.rapiddraft.ai; browser clients must still go through the RapidDraft backend.
  • RapidDraft product code changes live in the Mac RapidDraft repo, not on the Fedora server.
  • The local RAG service is infrastructure, not a separate chat product.

One Sentence Explanation

RapidDraft owns the engineering workflow and artifacts; the Local AI Server provides private model and Knowledge capabilities that RapidDraft Agent calls as backend tools.

Product Architecture Synthesis

For the broader research-backed local AI platform architecture, see Engineering Local AI Platform Architecture.

The short version is that RapidDraft should not be framed as a one-off local LLM box. RapidDraft owns the engineering control plane: workflow, evidence, prompts, approvals, reports, and audit events. The local AI server is one reference implementation of the inference and Knowledge planes. Customer deployments may later run the same RapidDraft product boundary on a DGX Spark-style appliance, customer GPU servers, NVIDIA NIM, vLLM, or an approved enterprise AI platform.

Current Topology

flowchart TB
  subgraph Mac["RapidDraft product boundary"]
    RDUI["RapidDraft web UI<br/>Agent rail, chat cards, canvas tabs"]
    RDAPI["RapidDraft backend<br/>server/agent, server/bom, reports"]

    subgraph AgentLayer["Agent layer"]
      AG["Agent orchestrator<br/>runs, events, artifacts, approvals"]
      TR["Tool registry<br/>Knowledge, screenshots, DFM, drawings, BOM, reports"]
    end

    subgraph ProductTools["RapidDraft-owned tools"]
      BOM["DraftBomService<br/>STEP/model BOM extraction"]
      VQ["BOM visual enrichment queue<br/>background advisory notes"]
      REP["Engineering report service<br/>release report and PDF export"]
    end
  end

  subgraph Tunnel["Backend-to-Fedora connection"]
    SSH["Local SSH tunnel for Mac dev<br/>or Cloudflare Tunnel for Railway"]
  end

  subgraph Fedora["Fedora Local AI Server"]
    RAG["Knowledge/RAG API<br/>auth, search, answer, inventory"]
    PG["Postgres + pgvector<br/>documents, chunks, summaries"]
    LLM["LiteLLM + llama.cpp<br/>text, vision, embedding aliases"]
    ING["Ingestion/OCR pipeline<br/>inbox, processing, archive"]
  end

  RDUI --> RDAPI
  RDAPI --> AG
  AG --> TR
  TR --> BOM
  TR --> REP
  BOM --> VQ
  TR --> KC["KnowledgeClient"]
  KC --> SSH
  VQ --> SSH
  SSH --> RAG
  RAG --> PG
  RAG --> LLM
  ING --> PG

Runtime Boundary

| Layer | Runs where | Owns | Does not own |
|---|---|---|---|
| RapidDraft UI | Mac/dev app, later hosted product | Agent rail, chat cards, canvas tabs, artifact rendering | local model serving |
| RapidDraft backend | Mac/dev app, later product backend | Agent runs, typed tools, BOM extraction, reports, approvals, artifacts | Fedora RAG database |
| Fedora Knowledge service | local-server-adeel | indexed docs, retrieval, cited answers, inventory, local vision route | RapidDraft product records or Agent UI |
| LiteLLM/llama.cpp | local-server-adeel | model aliases, local inference, embeddings | product workflow state |

Source Of Truth

The Agent does not make missing RapidDraft backend capabilities appear by itself. It can only orchestrate capabilities that exist as product tools or backend services.
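A minimal sketch of that constraint, assuming a simple name-to-handler registry. `ToolRegistry`, `UnknownToolError`, and the stub handler are hypothetical names for illustration; only the tool name `rapiddraft.knowledge.answer` comes from this page. The point is that a request for an unregistered capability fails loudly instead of being improvised.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


class UnknownToolError(LookupError):
    """Raised when the Agent asks for a tool that no backend has registered."""


@dataclass
class ToolRegistry:
    # Hypothetical registry: maps a tool name to the backend callable behind it.
    _tools: Dict[str, Callable[..., Any]] = field(default_factory=dict)

    def register(self, name: str, handler: Callable[..., Any]) -> None:
        self._tools[name] = handler

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            # The orchestrator cannot invent a capability; it can only report the gap.
            raise UnknownToolError(f"no backend tool registered for {name!r}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()
# Stub handler standing in for the real Knowledge call (assumption, not product code).
registry.register(
    "rapiddraft.knowledge.answer",
    lambda question: {"answer": f"stub answer for: {question}", "citations": []},
)

result = registry.call("rapiddraft.knowledge.answer", question="Which torque spec applies?")
```

Calling a name that was never registered raises `UnknownToolError`, which mirrors why revision diff or PLM release requests cannot succeed until RapidDraft has a backend capability to call.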

flowchart LR
  subgraph RapidDraft["RapidDraft owns product truth"]
    M["CAD model metadata and STEP files"]
    B["BOM artifacts and future BOM store"]
    D["DFM findings and drawing checks"]
    A["Agent runs, messages, approvals"]
    R["Engineering reports and exports"]
  end

  subgraph LocalAI["Local AI Server owns knowledge/model infrastructure"]
    DOC["Indexed documents"]
    CH["Chunks, embeddings, summaries"]
    SRCH["Search and cited answers"]
    INF["Local inference aliases"]
    VIS["Vision descriptions"]
  end

  subgraph Shared["Shared transient evidence"]
    CITE["Citations returned to RapidDraft"]
    NOTE["Advisory vision notes"]
    HEALTH["Tunnel and service health"]
  end

  RapidDraft --> Shared
  LocalAI --> Shared
  Shared --> RapidDraft

| Area | Source of truth |
|---|---|
| BOM identity and quantity | STEP assembly records first, RapidDraft model/component metadata fallback |
| BOM visual notes | Local vision, advisory only |
| Knowledge citations | Fedora Knowledge service |
| Report artifact | RapidDraft Agent/report service |
| User workflow state | RapidDraft Agent run store |
| Indexed document inventory | Fedora Knowledge service |

Knowledge Question Flow

sequenceDiagram
  actor User
  participant UI as RapidDraft UI
  participant API as RapidDraft backend
  participant AG as Agent orchestrator
  participant KC as KnowledgeClient
  participant RAG as Fedora Knowledge API
  participant DB as Postgres/pgvector
  participant LLM as Local model route

  User->>UI: Ask an engineering/document question
  UI->>API: POST /api/agent/runs
  API->>AG: Create run and context snapshot
  AG->>KC: Call rapiddraft.knowledge.answer
  KC->>RAG: POST /answer through local tunnel or knowledge.rapiddraft.ai
  RAG->>DB: Retrieve relevant chunks
  RAG->>LLM: Compose answer from retrieved evidence
  LLM-->>RAG: Draft answer
  RAG-->>KC: Answer, citations, warnings
  KC-->>AG: Tool result
  AG-->>API: Agent message and citation artifact
  API-->>UI: Compact card in chat
  UI-->>User: Answer with source citations
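
The `KC->>RAG` step in the diagram above could look roughly like the following sketch. It uses only the Python standard library and the `RAPIDDRAFT_KNOWLEDGE_*` variable names listed on this page. The function names, the `question` payload field, and the 30-second default timeout are assumptions; the real `KnowledgeClient` and the `/answer` request schema may differ.

```python
import json
import os
import urllib.request
from typing import Mapping


def build_answer_request(question: str, env: Mapping[str, str] = os.environ) -> urllib.request.Request:
    """Build (but do not send) a POST /answer request with bearer auth."""
    # Default matches the Mac tunnel target; hosted backends would set the public URL.
    base_url = env.get("RAPIDDRAFT_KNOWLEDGE_BASE_URL", "http://127.0.0.1:4100").rstrip("/")
    api_key = env.get("RAPIDDRAFT_KNOWLEDGE_API_KEY", "")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    # The payload field name is an assumption, not the documented schema.
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(f"{base_url}/answer", data=body, headers=headers, method="POST")


def ask_knowledge(question: str, env: Mapping[str, str] = os.environ) -> dict:
    """Send the request and return the decoded JSON (answer, citations, warnings)."""
    timeout = float(env.get("RAPIDDRAFT_KNOWLEDGE_TIMEOUT_SECONDS", "30"))
    with urllib.request.urlopen(build_answer_request(question, env), timeout=timeout) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the auth and URL logic checkable without a live tunnel.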

BOM Generation Flow

The BOM demo path is deliberately deterministic first and vision-assisted second.

sequenceDiagram
  actor User
  participant UI as RapidDraft UI
  participant AG as Agent orchestrator
  participant BOM as DraftBomService
  participant STEP as STEP extractor
  participant VQ as Visual enrichment queue
  participant VIS as Local vision route
  participant REP as Report service

  User->>UI: Ask Agent to generate BOM
  UI->>AG: Agent run
  AG->>BOM: Extract draft BOM for current model
  alt STEP assembly data available
    BOM->>STEP: Read PRODUCT and assembly occurrence records
    STEP-->>BOM: Grouped part rows and quantities
  else STEP data unavailable
    BOM->>BOM: Use model/component metadata fallback
  end
  BOM-->>AG: Immediate draft BOM artifact
  AG-->>UI: BOM table opens in canvas
  AG->>VQ: Queue background visual enrichment
  loop Each selected BOM row/component
    VQ->>VIS: Capture/render component view and ask local vision
    VIS-->>VQ: Advisory visual note
  end
  VQ-->>UI: Progress and enriched row notes
  User->>UI: Ask for release report
  UI->>AG: Report run
  AG->>REP: Compose report from BOM, DFM, drawings, Knowledge
  REP-->>UI: Report artifact and approval-gated PDF export
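
The `alt` branch in the diagram above can be sketched as a small source-selection function. This is an illustration under stated assumptions, not the actual `DraftBomService`: the function name, the row shape, and the `visualEnrichmentStatus` field are hypothetical, while `bomExtractionSource`, `step_assembly`, and `model_components` are the values named on this page.

```python
from typing import Dict, List, Optional


def extract_draft_bom(step_rows: Optional[List[Dict]], component_rows: List[Dict]) -> Dict:
    """Prefer STEP assembly rows; fall back to model/component metadata."""
    if step_rows:
        source, rows = "step_assembly", step_rows
    else:
        source, rows = "model_components", component_rows
    return {
        "bomExtractionSource": source,
        # Rows start with enrichment pending; the draft is returned without waiting for vision.
        "rows": [dict(row, visualEnrichmentStatus="pending") for row in rows],
    }


fallback_rows = [{"partNumber": "BRACKET-01", "quantity": 1}]
draft = extract_draft_bom(step_rows=None, component_rows=fallback_rows)
```

Here `draft["bomExtractionSource"]` is `model_components` because no STEP rows were supplied; passing a non-empty `step_rows` list switches the source to `step_assembly`.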

Theegarten Demo Flow

flowchart LR
  A["Upload/open STEP assembly"] --> B["Open RapidDraft Agent"]
  B --> C["Ask: generate BOM"]
  C --> D["Immediate STEP-derived BOM<br/>13 rows from 16 occurrences"]
  D --> E["Background visual enrichment<br/>advisory local vision notes"]
  E --> F["Ask for engineering release report"]
  F --> G["Report combines BOM, DFM, drawings, Knowledge"]
  G --> H["Human approval"]
  H --> I["Export PDF artifact"]

Demo Explanation For Theegarten

RapidDraft now has an Agent inside the product. It is not a separate chatbot. It is a workflow layer that can read the current product context, call typed backend tools, and create real engineering artifacts.

For the BOM demo, the important message is:

  1. The Agent creates the initial BOM from structured engineering evidence, not from model guessing.
  2. If STEP assembly records are available, they are used first for part identity and quantity.
  3. Vision runs afterward in the background as corroboration. It can add notes, but it does not override deterministic BOM truth.
  4. The resulting BOM and related evidence can feed a structured release report and approval-gated PDF export.

For the Knowledge demo, the important message is:

  1. The Agent calls the Fedora Knowledge service as a backend tool.
  2. The Knowledge service searches indexed documents and returns answers with citations.
  3. The RapidDraft UI presents the result as a product artifact or chat card, not as a separate RAG application.

What The Agent Can Do Now

| Capability | Current status | Notes |
|---|---|---|
| Answer simple local questions | Implemented | Local intent routing works for basic prompts. |
| Knowledge search/answer/inventory | Implemented | Routed through Fedora Knowledge API. |
| Capture current canvas/model screenshot | Implemented | Requires the client surface to be ready. |
| Describe screenshots with local vision | Implemented | Uses local vision route; privacy and availability still matter. |
| Draft BOM from model metadata | Implemented | Fast fallback when STEP assembly extraction is unavailable. |
| Draft BOM from STEP assembly records | Implemented | Current demo: 13 BOM rows from 16 STEP occurrences. |
| Background BOM visual enrichment | Implemented | Advisory notes; not source of quantity truth. |
| Engineering report generation | Implemented | Composes from available model, BOM, DFM, drawing, and Knowledge evidence. |
| Report PDF export | Implemented | Approval-gated. |
| Thread restore after reload | Implemented | Agent thread state persists. |
| Human intervention checkpoints | Implemented | Workflows can pause and resume with user-provided information. |

What Is Not Done Yet

| Gap | Why it matters |
|---|---|
| Persistent BOM editor/store | Current BOM artifacts are draft artifacts, not a full editable BOM lifecycle. |
| Model revision/diff backend | The Agent cannot compare model revisions until RapidDraft has revision concepts to call. |
| PLM release actions | The Agent cannot push release state to PLM without backend release capability. |
| Supplier quality packages | EMPB, VDA2, AS9102, PPAP, SPC, and ISO 13485 flows require more backend/report primitives. |
| Large-assembly visual enrichment scale | Per-component vision can be slow, so it should remain background work with caching/sampling. |
| Fusion removal | Fusion cleanup is a separate product/code cleanup item. |

Vision Prompt Contract

RapidDraft Agent should use the canonical CAD screenshot prompt documented in Vision Model Quality Evaluation.

The prompt was benchmarked against the GE Jet Bracket RapidDraft screenshot crop. The important behavioral requirement is that local vision produces advisory visual notes with CAD vocabulary, visible UI metadata, uncertainty, and screenshot-quality flags. It must not be treated as the source of truth for BOM identity, quantity, dimensions, tolerances, DFM findings, or release decisions.
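
One way to enforce the advisory-only rule in code is to attach vision output as a note and record any disagreement as a flag instead of applying it. The sketch below assumes hypothetical field names (`partNumber`, `quantity`, `visualNote`, `visualConflicts`) and a hypothetical `apply_vision_note` helper; it is not the logic in `server/agent/vision.py`.

```python
from typing import Dict

# Fields that only deterministic extraction may set (assumed names).
PROTECTED_FIELDS = ("partNumber", "quantity")


def apply_vision_note(row: Dict, vision_result: Dict) -> Dict:
    """Return a copy of the BOM row with an advisory note attached."""
    enriched = dict(row)
    enriched["visualNote"] = vision_result.get("note", "")
    enriched["visualNoteAdvisory"] = True
    # Vision claims about protected fields are flagged for review, never written back.
    enriched["visualConflicts"] = [
        name
        for name in PROTECTED_FIELDS
        if name in vision_result and vision_result[name] != row.get(name)
    ]
    return enriched


row = {"partNumber": "MS21209F1-20", "quantity": 4}
enriched = apply_vision_note(row, {"note": "Looks like a helical insert", "quantity": 3})
```

In this example the row keeps `quantity` 4 and gains `visualConflicts == ["quantity"]`, so a reviewer sees the disagreement without the deterministic value changing.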

Current implementation points in RapidDraft:

server/agent/vision.py
server/agent/orchestrator.py

Local AI Server Capabilities Captured

The local server repo inspected for this update was:

/Users/adeelyj/code/local ai server setup/local-ai-stack-repo

Current local server branch:

main

The repo had uncommitted local changes at the time of this wiki update. Important Knowledge/RAG capabilities captured here:

  • RAG API bearer auth through LOCALAI_RAG_API_KEY.
  • /health reports whether auth is enabled.
  • /inventory exposes document inventory and ingest state for RapidDraft Knowledge clients.
  • /answer provides cited answers with warnings, scoped filters, and optional RapidDraft context.
  • /search supports hybrid, semantic, keyword, phrase, regex, metadata, and neighbor-style search.
  • /documents/{document_id} exposes document detail.
  • /documents/{document_id}/summary adds cached document summaries through rag_summaries.
  • Chat and answer routes cap source chunks and source characters to avoid context-window failures.
  • Search results include source URI/path, checksum, source kind, and scope metadata.
  • Search filters support source_kind, source_kinds, JSON scope, project_id, workspace_id, and customer_id.
  • Ingestion scans nested inbox folders recursively, ignores hidden/macOS archive noise, and preserves relative folder paths when moving files to processing, archive, or failed.
  • RAG schema includes document source/scoping columns and summary indexes.
  • The local RAG chat UI has a local browser-only API key field for protected API calls.
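
The source-capping behavior in the list above can be illustrated with a short sketch. It assumes chunks are plain strings already ordered by relevance and that both limits are configurable; the function name and the truncate-the-last-chunk policy are assumptions, not the RAG service's actual implementation.

```python
from typing import List


def cap_sources(chunks: List[str], max_chunks: int, max_chars: int) -> List[str]:
    """Keep at most max_chunks chunks and at most max_chars characters in total."""
    kept: List[str] = []
    used = 0
    for text in chunks[:max_chunks]:
        remaining = max_chars - used
        if remaining <= 0:
            break
        # Truncate the chunk that crosses the character budget instead of dropping it.
        piece = text[:remaining]
        kept.append(piece)
        used += len(piece)
    return kept


capped = cap_sources(["aaaa", "bbbb", "cccc"], max_chunks=2, max_chars=6)
# capped == ["aaaa", "bb"]
```

Capping before prompt assembly keeps the request inside the model's context window regardless of how many chunks retrieval returns.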

BOM Work Completed

BOM is currently the most important demo slice.

Implemented backend behavior:

  • DraftBomService creates immediate draft BOM payloads from structured RapidDraft model/component metadata when STEP assembly extraction is not available.
  • The first draft does not wait for vision. It is fast and usable immediately.
  • Rows include quantity, part number, material/process evidence, Part Facts status, missing-data notes, evidence status, source node names, and visual-enrichment status.
  • server/bom/step_bom.py extracts assembly usage records from STEP files before falling back to component metadata.
  • STEP extraction reads PRODUCT, PRODUCT_DEFINITION_FORMATION, PRODUCT_DEFINITION, and NEXT_ASSEMBLY_USAGE_OCCURRENCE.
  • Repeated STEP occurrences are grouped by canonicalized part number, including simple punctuation variants such as MS21209-F1-20 and MS21209F1-20.
  • The BOM payload marks whether it came from step_assembly or model_components.
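
The grouping step described above can be sketched as follows. The canonicalization rule here (uppercase, then drop every non-alphanumeric character) is an assumption chosen so that `MS21209-F1-20` and `MS21209F1-20` collapse to the same key; the rule in `server/bom/step_bom.py` may be narrower. The function names and row fields are hypothetical.

```python
import re
from typing import Dict, List


def canonical_part_number(raw: str) -> str:
    """Uppercase and strip punctuation/whitespace so simple variants share one key."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


def group_occurrences(part_numbers: List[str]) -> List[Dict]:
    """Group one part number per STEP occurrence into BOM rows with quantities."""
    rows: Dict[str, Dict] = {}
    for raw in part_numbers:
        key = canonical_part_number(raw)
        # The first spelling seen becomes the displayed part number.
        row = rows.setdefault(key, {"partNumber": raw, "quantity": 0, "variants": []})
        row["quantity"] += 1
        if raw not in row["variants"]:
            row["variants"].append(raw)
    return list(rows.values())


occurrences = ["MS21209-F1-20", "MS21209F1-20", "BRACKET-01"]
bom_rows = group_occurrences(occurrences)
```

For these three occurrences the result has two rows, with the insert row carrying quantity 2 and both spellings in `variants`. Stripping all punctuation is aggressive and could merge genuinely different part numbers, so keeping the original variants visible as evidence is a deliberate choice in this sketch.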

Implemented frontend/report behavior:

  • Agent chat summaries, artifact cards, BOM canvas tables, and engineering reports expose the BOM extraction source.
  • The BOM canvas summary shows source, source count, visual enrichment status, and evidence status.
  • BOM evidence cells show model-evidence notes such as STEP occurrence ids.
  • The Theegarten-style report includes a structured release-check block with BOM consistency, evidence register, sign-off, and change history.

Demo fixture validation:

  • Local demo model id: 2ed3dfe4ecec4929a56fb38c4fd4e62c
  • Current STEP-derived BOM result: 13 rows from 16 STEP assembly occurrences.
  • Summary source field: bomExtractionSource = step_assembly.

Engineering Report Work Completed

Report generation now composes from RapidDraft-owned artifacts instead of inventing a report from chat alone.

Current report inputs can include:

  • active model metadata
  • draft BOM snapshot
  • visible DFM sidebar context
  • DraftLint drawing issue sets
  • Knowledge citations
  • source artifact trace

Current report outputs include:

  • Markdown summary
  • structured Theegarten-style releaseReport
  • report source cards that can reopen backing artifacts
  • caveats for missing DFM context, missing Knowledge citations, unavailable vision enrichment, or unavailable BOM data
  • approval-gated PDF export artifact
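
The approval gate on export can be sketched as a check that runs before any export artifact is produced. `ReportArtifact`, `ApprovalRequired`, and `export_pdf` are hypothetical names, and the function returns an export descriptor rather than rendering a real PDF; the actual report service is not shown on this page.

```python
from dataclasses import dataclass
from typing import Optional


class ApprovalRequired(PermissionError):
    """Raised when an export is requested before a human has approved the report."""


@dataclass
class ReportArtifact:
    report_id: str
    markdown: str
    approved_by: Optional[str] = None  # set only by an explicit human approval step


def export_pdf(report: ReportArtifact) -> dict:
    """Return an export descriptor, refusing unapproved reports."""
    if not report.approved_by:
        raise ApprovalRequired(f"report {report.report_id} has not been approved")
    # A real service would render the PDF here; this sketch only describes the artifact.
    return {"reportId": report.report_id, "approvedBy": report.approved_by, "format": "pdf"}


report = ReportArtifact(report_id="demo-report", markdown="# Release check")
report.approved_by = "reviewer@example.com"
export = export_pdf(report)
```

Putting the check inside the export function, instead of in the UI, means the Agent cannot bypass approval by calling the tool directly.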

Current Connection Settings

The Mac dev path uses a tunnel to the Fedora Knowledge service:

Fedora host: [email protected]
Mac tunnel target: http://127.0.0.1:4100
Railway backend target: https://knowledge.rapiddraft.ai
SSH key: /Users/adeelyj/code/auth/local-server-adeel/local-server-adeel-codex_ed25519

The RapidDraft backend reads the following environment variables. The names are not secret, but the values of the *_API_KEY variables are:

RAPIDDRAFT_KNOWLEDGE_BASE_URL
RAPIDDRAFT_KNOWLEDGE_API_KEY
RAPIDDRAFT_KNOWLEDGE_TIMEOUT_SECONDS
RAPIDDRAFT_AGENT_TEXT_BASE_URL
RAPIDDRAFT_AGENT_TEXT_API_KEY
RAPIDDRAFT_AGENT_TEXT_MODEL
RAPIDDRAFT_AGENT_VISION_BASE_URL
RAPIDDRAFT_AGENT_VISION_API_KEY
RAPIDDRAFT_AGENT_VISION_MODEL
VISION_LOCAL_BASE_URL
VISION_LOCAL_API_KEY
VISION_LOCAL_MODEL
RAPIDDRAFT_AGENT_VISION_PROVIDER

The helper script scripts/dev/backend-start-knowledge-macos.sh starts the Fedora tunnel, fetches non-secret model aliases and the local RAG key from /etc/localai/localai.env on the Fedora machine, and starts the RapidDraft backend with local dev auth.

Do not paste the actual key values into this wiki.
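
When debugging connection problems, it can help to inspect which settings are present without printing secrets. The sketch below assumes hypothetical helper names and a 30-second default timeout; only the three `RAPIDDRAFT_KNOWLEDGE_*` variable names come from the list above.

```python
import os
from typing import Mapping


def load_knowledge_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the Knowledge connection settings from the environment."""
    return {
        "base_url": env.get("RAPIDDRAFT_KNOWLEDGE_BASE_URL", ""),
        "api_key": env.get("RAPIDDRAFT_KNOWLEDGE_API_KEY", ""),
        "timeout_seconds": float(env.get("RAPIDDRAFT_KNOWLEDGE_TIMEOUT_SECONDS", "30")),
    }


def describe_settings(settings: dict) -> dict:
    """Return a copy that is safe to log: the key is reported as set or missing only."""
    safe = dict(settings)
    safe["api_key"] = "<set>" if settings["api_key"] else "<missing>"
    return safe


settings = load_knowledge_settings({"RAPIDDRAFT_KNOWLEDGE_BASE_URL": "http://127.0.0.1:4100"})
summary = describe_settings(settings)
```

Here `summary["api_key"]` is `<missing>`, which points at the environment rather than at Agent code as the first thing to check.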

Current RapidDraft Branch State

Primary RapidDraft repo:

/Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch

Current branch after the STEP BOM work:

theegarten-rapiddraft-agent-step-bom-occ

Latest commit:

26da899 Route Agent vision through local LiteLLM

Dirty or untracked files known to be unrelated to this work, and intentionally left alone, at the time of this wiki update:

web/package-lock.json
package-lock.json
.abacusai/
agent_screenshots_report_20260530/
web/src/components/AiChatPanel.tsx

Commit Timeline

Important commits from the Agent/BOM/report work:

| Date | Commit | Summary |
|---|---|---|
| 2026-05-30 | 7b20155 | Enrich draft BOM evidence |
| 2026-05-30 | e377179 | Improve BOM artifact readability |
| 2026-05-30 | 3b3add3 | Add draft engineering report artifact |
| 2026-05-30 | 5d07bcb | Chain report artifacts to agent context |
| 2026-05-30 | 5d3feb7 | Handle unavailable BOM vision enrichment |
| 2026-05-30 | 41b7c03 | Show report source provenance in Agent UI |
| 2026-05-30 | 399fb7d | Open report source artifacts from Agent reports |
| 2026-05-30 | 70ab46b | Materialize DFM report source snapshots |
| 2026-05-30 | 2f9d5a4 | Link drawing checks into Agent reports |
| 2026-05-30 | 7787a00 | Link Knowledge citations into Agent reports |
| 2026-05-30 | 1b75d83 | Add approval-gated Agent report PDF export |
| 2026-05-30 | f756f0c | Restore Agent thread state on reload |
| 2026-05-30 | 639bdd8 | Show Agent background job progress |
| 2026-05-30 | 9a58e32 | Add Agent intervention checkpoints |
| 2026-05-30 | af5f339 | Add Agent intervention handoff contract |
| 2026-05-30 | aa320cf | Resume Agent report workflow after interventions |
| 2026-05-30 | be09c99 | Polish Agent BOM visual enrichment UI |
| 2026-05-31 | fda5cda | Polish Agent BOM release report |
| 2026-05-31 | 611f0f6 | Add STEP BOM extraction provenance |
| 2026-05-31 | 26da899 | Route Agent vision through local LiteLLM |

Validation Already Run

Recent validation from the RapidDraft repo:

python -m pytest server/tests/test_agent_foundation.py server/tests/test_bom.py server/tests/test_component_naming.py server/tests/test_knowledge_endpoints_wiring.py -q

Result:

56 passed

Frontend build:

npm --prefix web run build

Result:

passed

Direct BOM smoke on the STEP demo model:

13 BOM rows
16 STEP assembly occurrences
bomExtractionSource = step_assembly

Operational Guidance

  • Keep RapidDraft Agent architecture product-first: typed tool calls, artifacts, approvals, and canvas tabs.
  • Keep Knowledge as a backend tool family, not the main product shell.
  • Keep BOM generation deterministic first; vision enriches and corroborates later.
  • Do not expose raw llama.cpp ports to RapidDraft users.
  • Do not push, merge, or rebase Agent branches without checking remote branch state first.
  • If Knowledge is unavailable in the UI, check the Mac tunnel and RAPIDDRAFT_KNOWLEDGE_* environment variables before debugging Agent code.

Demo Readiness Checklist

  • The Fedora machine is reachable over Tailscale.
  • The Mac tunnel to http://127.0.0.1:4100 is active.
  • /health on the Knowledge service responds and reports that auth is enabled.
  • RapidDraft backend was started with the Knowledge helper script or equivalent environment.
  • The demo STEP model opens in the RapidDraft canvas.
  • Agent can generate the STEP-derived BOM with the expected step_assembly source.
  • Background visual enrichment failure does not block the initial BOM or report.
  • Report generation and approval-gated PDF export are checked before the demo.

Open Questions

  • Should the BOM visual-enrichment queue be hidden from normal users until the backend process is more reliable, or should it remain visible as a background status card?
  • When should a persistent BOM editor/store be introduced, and what backend object should own BOM revisions?
  • Which supplier-quality report families should be implemented first after the Theegarten release check demo: EMPB, VDA2, AS9102, PPAP, SPC, or ISO 13485?
  • What exact scope should the Fusion removal cleanup cover: the UI rail only, or also API routes, stored reports, tests, and analysis-run records?

Sources

  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/server/agent/
  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/server/bom/
  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/server/knowledge_client.py
  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/scripts/dev/backend-start-knowledge-macos.sh
  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/web/src/components/agent/
  • /Users/adeelyj/code/rapiddraft/45_co2/rapiddraft_utumpitch/web/src/components/canvas/agentArtifactRenderers.tsx
  • /Users/adeelyj/code/local ai server setup/local-ai-stack-repo/rag/app/
  • /Users/adeelyj/code/local ai server setup/local-ai-stack-repo/rag/schema.sql
  • /Users/adeelyj/Library/CloudStorage/OneDrive-Personal/100_Knowledge/203_TextCAD/01_Product_Project_Management/00_Project_Management_n_skills/01_tracks/rapiddraft-studio/plans/260529_rapiddraft-ai-chat-bar/rapiddraft-agent-foundation-plan.md
  • RapidDraft commits listed in the commit timeline above.