Skip to content

A. Security & Auth

Parent: Code Review Index Priority: must-fix before any broader pilot exposure Surface affected: backend (server/)

These are findings that a serious security review would flag on day one. None of them require a new subsystem; they are tightenings of what is already there.


A1 — CORS is wide open

  • Status: Open
  • Impact: H | Complexity: L | Time: 0.5 D
  • Files: server/main.py:189-194

Finding. CORSMiddleware(allow_origins=["*"], allow_methods=["*"], allow_headers=["*"]) is set unconditionally. The app allows any origin to call any route with any header. This contradicts the "Cloudflare Access protected pilot" story in the wiki and turns every pilot deployment into a cross-origin free-for-all.

Action for Codex. 1. Introduce an env var RAPIDDRAFT_ALLOWED_ORIGINS (comma-separated, required in production). Parse in access_auth.py or a new server/settings.py. 2. Replace allow_origins=["*"] with the parsed list. 3. For allow_methods and allow_headers, restrict to what the app actually uses (GET, POST, PATCH, DELETE, OPTIONS and the specific header set in use, including cf-access-jwt-assertion). 4. Provide a development default of ["http://localhost:5173"] when ENV=dev is set; otherwise fail fast on startup if the env var is missing.

Acceptance criteria. - curl -H "Origin: https://evil.example" https://app.rapiddraft.ai/api/... does not return a permissive Access-Control-Allow-Origin header. - Unit test that asserts the middleware is configured from the env var.

Depends on: none.


A2 — Cloudflare Access is opt-in per route; mutating routes are not gated

  • Status: Open
  • Impact: H | Complexity: M | Time: 3–5 D
  • Files: server/access_auth.py, server/main.py (all route decorators)

Finding. access_auth.py implements JWT validation cleanly (Cloudflare Access JWKS, issuer + audience checks), but it is invoked inside each handler via resolve_access_identity(request). Most mutating endpoints — POST /api/models (upload), POST /api/draftlint/sessions (upload), POST /api/models/{id}/dfm/review-v2, POST /api/models/{id}/vision/reports, POST /api/models/{id}/dfm/candidate-sessions, POST /api/models/{id}/dfm/review-export/pdf — never call it. If an attacker reaches the backend bypassing Cloudflare (direct origin, misconfigured DNS, stale deploy), uploads and expensive compute are wide open.

Action for Codex. 1. Create a FastAPI dependency require_access_identity(request) -> AccessIdentity in server/access_auth.py. 2. Create two APIRouters: public_router (for /healthz, /api/config, static assets) and authed_router with dependencies=[Depends(require_access_identity)]. 3. Move every existing route that mutates state, accepts uploads, or consumes vendor spend (OpenAI vision) into authed_router. 4. Keep a narrow list of genuinely public endpoints; document them in docs/contracts/public_endpoints.md. 5. When ACCESS_ENFORCED=true but no JWT is present, return 403 (not 401) with an opaque error.

Acceptance criteria. - Test: a request to POST /api/models without a valid cf-access-jwt-assertion header, with ACCESS_ENFORCED=true, returns 403. - Test: the same request with a valid JWT returns 200 and an AccessIdentity is logged. - All POST/PATCH/DELETE routes appear under authed_router (enforce with a test that enumerates routes).

Depends on: none (but unlocks C1/C3).


A3 — API keys accepted in the request body and persisted to disk

  • Status: Open
  • Impact: H | Complexity: L | Time: 1 D
  • Files: server/main.py:375-380 (VisionProviderBody), server/vision_providers.py (~125-150, ~354), server/vision_analysis.py (~490-501)

Finding. VisionProviderBody defines api_key_override, base_url_override, model_override. The vision report request flow persists the full provider config to {model_id}/vision_reports/{report_id}/request.json — keys or "api_key_override_provided" metadata included. Users leak their own keys into server-side artifacts; operators risk storing customer keys in the model-store volume.

Action for Codex. 1. Remove api_key_override and base_url_override from VisionProviderBody. Keep route and (optionally) model_override. 2. Resolve the API key and base URL server-side only from env vars (OPENAI_API_KEY, OPENAI_BASE_URL, etc.), per route. 3. If multi-tenant keys are ever needed, store them in a secret manager and resolve per-identity — not per request. 4. In the request-artifact writer, explicitly allowlist which fields are persisted; never write unknown provider fields verbatim.

Acceptance criteria. - Test: a vision report request that sends provider.api_key_override in the body is rejected by pydantic with a 422, or the field is silently dropped (choose one and document). - Test: the persisted request.json contains provider.route, never api_key_*. - Test: when OPENAI_API_KEY is unset, vision routes return a 503 with an opaque error, not a stack trace.

Depends on: none.


A4 — SSRF via base_url_override

  • Status: Open
  • Impact: H | Complexity: L | Time: 1 D
  • Files: server/vision_providers.py:146-148, server/vision_providers.py:354-356

Finding. The only validation on base_url_override is if not base_url.startswith("http"). An internal URL like http://localhost:6379, http://169.254.169.254/latest/meta-data/, or http://internal-admin is accepted and fired. Combined with A3 (user-supplied base URL) this is a classic SSRF surface.

Action for Codex. 1. Remove user-supplied base_url_override (part of A3). 2. Define a provider allowlist (SUPPORTED_VISION_PROVIDERS) keyed by route, with the canonical base URL for each. 3. On any provider path that resolves a URL from config, validate: urlparse(url).scheme in ("https",) and hostname in allowlist; reject localhost, 127.0.0.0/8, 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 169.254.0.0/16. 4. Add a test_vision_providers_ssrf.py that covers each rejected class.

Acceptance criteria. - Test: setting VISION_BASE_URL=http://169.254.169.254/ causes startup validation to fail or the route to refuse to run. - Test: route=openai always resolves to the OpenAI host — user cannot divert it.

Depends on: A3 (remove the body-supplied URL first).


A5 — Uploads have no size limit, no MIME check, no extension allowlist

  • Status: Open
  • Impact: H | Complexity: M | Time: 2 D
  • Files: server/main.py (upload route ~2587), server/main.py (draftlint upload ~1477-1495)

Finding. UploadFile is read without a byte-cap, the filename (user-controlled) is preserved as original_name, and no MIME/extension validation is performed. A single oversized STEP file can fill the volume; a weaponised filename can collide with path handling downstream; MIME spoofing is trivial.

Action for Codex. 1. Add a streamed size check: read in CHUNK=4MB increments, abort and delete partial file if total exceeds MAX_UPLOAD_MB (env var, default 100). 2. Allowlist extensions: .step, .stp, .stl, .glb, .gltf, .json (as applicable per endpoint). Reject others with 415. 3. Validate the leading magic bytes per type where feasible (STEP starts with ISO-10303-21;). 4. Sanitize original_name: strip path separators, null bytes, and anything outside [A-Za-z0-9._-]; truncate to 200 chars. Always store under a UUID-derived path; the sanitized name is only for display. 5. Return 413 for oversize, 415 for wrong type, 400 for empty.

Acceptance criteria. - Tests cover: oversize, empty, valid STEP, .exe upload, filename with ..\\, filename with null byte. - Integration test: uploading a 200 MB blob when MAX_UPLOAD_MB=100 fails with 413 and no partial file remains on disk.

Depends on: A2 (gate the upload route first).


A6 — Error responses leak internal exception types

  • Status: Open
  • Impact: M | Complexity: L | Time: 1 D
  • Files: server/main.py many sites, e.g. main.py:1502 (detail=f"Unexpected DraftLint session error: {exc.__class__.__name__}: {exc}"), main.py:1835, main.py:2279, main.py:1871

Finding. Exception handlers include class names and messages in the HTTP response body. This leaks internal module structure and often the full str(exc). It also couples the client to implementation details.

Action for Codex. 1. Define a small exception hierarchy in server/errors.py (or re-use the existing per-module errors) and map each to an HTTP status via @app.exception_handler. 2. The handler logs the full exception server-side (logger.exception) and returns a sanitized body: {"error": "<short-code>", "request_id": "..."}. 3. Remove every inline HTTPException(status_code=500, detail=f"Unexpected ...") in main.py. Route unhandled exceptions through the global handler. 4. Generate a request_id in middleware and include it in every log line and every response body.

Acceptance criteria. - Test: forcing a CADProcessingError returns a response body that does not contain a Python class name. - Test: the request_id from the response appears verbatim in the log line for that request.

Depends on: C5 (centralized error mapping) — these can be the same PR.


A7 — No rate limiting

  • Status: Open
  • Impact: M | Complexity: L | Time: 1 D
  • Files: server/main.py (middleware stack around line 188)

Finding. Vision and DFM review endpoints consume paid vendor calls or heavy CPU. A single misbehaving client (or a malicious one past Cloudflare) can run up cost or DoS the service. No rate-limiter is present.

Action for Codex. 1. Add slowapi (or asgi-ratelimit). Key by AccessIdentity.email when present, fall back to client IP. 2. Per-route limits: - Uploads: 10/min/user. - Vision reports: 20/hour/user, 300/day/user. - DFM review: 60/hour/user. - All others: 120/min/user default. 3. Expose current limits in response headers (X-RateLimit-*). 4. On 429, include Retry-After.

Acceptance criteria. - Test: 11 upload requests in 60 seconds — the 11th returns 429. - Test: the limiter key is AccessIdentity.email when present.

Depends on: A2.