[RFC] Org import must aggregate per-workspace + per-runtime required_env into a single pre-flight credential gather #232

Open
opened 2026-05-10 03:24:36 +00:00 by claude-ceo-assistant · 0 comments
Owner

Problem

Observed 2026-05-10 ~03:20 UTC on staging-cplead-2: imported molecule-dev (38 workspaces). Import returned 201. Every workspace came up NOT CONFIGURED because the claude-code runtime needs an Anthropic credential and the import flow never asked the user for it. The owner (Hongming) has to click into each of the 38 workspaces to set the API key. That is terrible UX, and silently broken-on-arrival is worse than fail-on-import.

Why it happens (current code)

workspace-server/internal/handlers/org.go:

  • Line 605: "Required-env preflight — refuses import when any required_env is missing from global_secrets. No bypass."
  • BUT this preflight only inspects OrgTemplate.RequiredEnv (line 355) — the org-wide declaration.
  • Per-workspace OrgWorkspace.RequiredEnv (line 477) is declared on the struct, and its comment says "these flow up into OrgTemplate.RequiredEnv when GET /org/templates walks the tree (line 472)", but that flow-up is NOT implemented in the Import handler. It is implemented only in the templates listing (templates.go).
  • Per-runtime RuntimeConfig.RequiredEnv (workspace_preflight.go line 21) is checked at workspace boot time, by which point the import is long-committed and the user is staring at broken tiles.

molecule-dev/org.yaml deliberately removed its org-level required_env per PR #1031 (workspaces authenticate via per-workspace .env). With nothing declared at org level and no aggregation upward, the gate has nothing to check → import succeeds → workspaces come up NOT CONFIGURED.
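The failure mode can be illustrated with a minimal Go sketch. This is not the code in org.go: the struct fields are simplified stand-ins, and `missingOrgLevel` is a hypothetical name for a gate that, like the one described above, looks only at the org-wide list.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the types in org.go.
// The real structs carry many more fields.
type OrgWorkspace struct {
	Name        string
	RequiredEnv []string
}

type OrgTemplate struct {
	RequiredEnv []string
	Workspaces  []OrgWorkspace
}

// missingOrgLevel mimics a gate that inspects only the org-wide list and
// returns the names that are absent from the secrets map.
// Per-workspace declarations are never consulted.
func missingOrgLevel(t OrgTemplate, globalSecrets map[string]string) []string {
	var missing []string
	for _, name := range t.RequiredEnv {
		if _, ok := globalSecrets[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	tmpl := OrgTemplate{
		RequiredEnv: nil, // nothing declared org-wide
		Workspaces: []OrgWorkspace{
			{Name: "ws-a", RequiredEnv: []string{"ANTHROPIC_API_KEY"}},
		},
	}
	missing := missingOrgLevel(tmpl, map[string]string{})
	// The workspace needs a key, yet the gate has nothing to reject.
	fmt.Println("gate passes:", len(missing) == 0)
}
```

With an empty org-level list the loop body never runs, so the import proceeds even though no secret is stored.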

Proposed fix shape

New/extended import flow:

Phase 1: pre-flight aggregation (Go, workspace-server)

  1. Parse OrgTemplate (with all !include + !external resolved)
  2. Walk every OrgWorkspace in the tree
  3. For each workspace, resolve its runtime + plugins, fetch their RuntimeConfig.RequiredEnv from the template registry (or the in-image config.yaml)
  4. Aggregate the set: OrgTemplate.RequiredEnv ∪ all OrgWorkspace.RequiredEnv ∪ all RuntimeConfig.RequiredEnv
  5. De-dupe by env var name; preserve any_of group semantics; classify per-workspace overrides separately from org-shared
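The five steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the `EnvRequirement` and `Aggregated` types, the `Children` field, the `runtimeEnv` lookup map standing in for the template registry, and the rule that anything declared at org level or by a runtime counts as shared while workspace-only declarations count as per-workspace.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// EnvRequirement is one credential requirement. A plain requirement has a
// single name in AnyOf; an any_of group lists several alternatives.
type EnvRequirement struct {
	AnyOf []string
}

// key gives an order-independent identity used for de-duplication.
func (r EnvRequirement) key() string {
	names := append([]string(nil), r.AnyOf...)
	sort.Strings(names)
	return strings.Join(names, "|")
}

// Hypothetical, simplified template types.
type OrgWorkspace struct {
	Name        string
	Runtime     string
	RequiredEnv []EnvRequirement
	Children    []OrgWorkspace
}

type OrgTemplate struct {
	RequiredEnv []EnvRequirement
	Workspaces  []OrgWorkspace
}

// Aggregated is one de-duplicated requirement plus the workspaces needing it.
type Aggregated struct {
	Req        EnvRequirement
	Workspaces []string
	Shared     bool
}

// Aggregate unions org-level, per-runtime, and per-workspace requirements.
// runtimeEnv maps a runtime name to its required env (a registry stand-in).
func Aggregate(t OrgTemplate, runtimeEnv map[string][]EnvRequirement) []Aggregated {
	byKey := map[string]*Aggregated{}
	var order []string // first-seen order, so output is deterministic
	add := func(r EnvRequirement, ws string, shared bool) {
		k := r.key()
		a, ok := byKey[k]
		if !ok {
			a = &Aggregated{Req: r}
			byKey[k] = a
			order = append(order, k)
		}
		// One workspace is processed at a time, so repeats are adjacent.
		n := len(a.Workspaces)
		if ws != "" && (n == 0 || a.Workspaces[n-1] != ws) {
			a.Workspaces = append(a.Workspaces, ws)
		}
		a.Shared = a.Shared || shared
	}
	for _, r := range t.RequiredEnv {
		add(r, "", true)
	}
	var walk func(list []OrgWorkspace)
	walk = func(list []OrgWorkspace) {
		for _, w := range list {
			for _, r := range runtimeEnv[w.Runtime] {
				add(r, w.Name, true)
			}
			for _, r := range w.RequiredEnv {
				add(r, w.Name, false)
			}
			walk(w.Children)
		}
	}
	walk(t.Workspaces)
	out := make([]Aggregated, 0, len(order))
	for _, k := range order {
		out = append(out, *byKey[k])
	}
	return out
}

func main() {
	tmpl := OrgTemplate{Workspaces: []OrgWorkspace{
		{Name: "A", Runtime: "claude-code"},
		{Name: "B", Runtime: "claude-code",
			RequiredEnv: []EnvRequirement{{AnyOf: []string{"SEMGREP_TOKEN"}}}},
	}}
	runtimeEnv := map[string][]EnvRequirement{
		"claude-code": {{AnyOf: []string{"ANTHROPIC_API_KEY"}}},
	}
	for _, a := range Aggregate(tmpl, runtimeEnv) {
		fmt.Println(a.Req.AnyOf, a.Workspaces, a.Shared)
	}
}
```

Keying on the sorted alternatives treats `GITHUB_TOKEN | GH_TOKEN` and `GH_TOKEN | GITHUB_TOKEN` as the same group while keeping a group distinct from either name on its own, which is one way to preserve any_of semantics during de-duplication.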

Phase 2: pre-flight endpoint (workspace-server)

  • POST /org/import/preflight returns the aggregated credential schema:
    {
      "shared": [
        {"name": "ANTHROPIC_API_KEY", "reason": "required by 27 workspaces using claude-code runtime"},
        {"any_of": ["GITHUB_TOKEN", "GH_TOKEN"], "reason": "required by 4 workspaces touching git"}
      ],
      "per_workspace": [{"workspace": "CP-Security", "name": "SEMGREP_TOKEN"}, ...]
    }
    
  • Optional: vendor-probe each cred (e.g. POST /v1/messages to Anthropic with the key; a 401 means the key was rejected, a 200 means it works) before letting import commit
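A minimal Go sketch of the response types behind that JSON shape, served from a plain net/http handler. The type names, the `preflightHandler` constructor, and the `render` helper are hypothetical; the real handler would compute the schema from the request's template rather than take it as an argument, and the vendor probe is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// SharedCred is either a single name or an any_of group; omitempty keeps
// the unused one out of the JSON.
type SharedCred struct {
	Name   string   `json:"name,omitempty"`
	AnyOf  []string `json:"any_of,omitempty"`
	Reason string   `json:"reason"`
}

type WorkspaceCred struct {
	Workspace string `json:"workspace"`
	Name      string `json:"name"`
}

type PreflightResponse struct {
	Shared       []SharedCred    `json:"shared"`
	PerWorkspace []WorkspaceCred `json:"per_workspace"`
}

// preflightHandler serves a precomputed schema. Assumption: aggregation has
// already run and produced the schema passed in here.
func preflightHandler(schema PreflightResponse) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(schema)
	}
}

// render drives a handler in-process and returns status code and body.
func render(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	schema := PreflightResponse{
		Shared: []SharedCred{
			{Name: "ANTHROPIC_API_KEY", Reason: "required by 27 workspaces using claude-code runtime"},
			{AnyOf: []string{"GITHUB_TOKEN", "GH_TOKEN"}, Reason: "required by 4 workspaces touching git"},
		},
		PerWorkspace: []WorkspaceCred{{Workspace: "CP-Security", Name: "SEMGREP_TOKEN"}},
	}
	code, body := render(preflightHandler(schema), http.MethodPost, "/org/import/preflight")
	fmt.Println(code)
	fmt.Print(body)
}
```

Modeling a shared entry as one struct with two optional fields mirrors the JSON above, where an entry carries either `name` or `any_of` but not both.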

Phase 3: canvas UX (canvas/)

  • Replace the current "import → success → 38 NOT CONFIGURED tiles" with:
    1. User picks template
    2. Canvas calls /org/import/preflight
    3. Canvas renders a single "You need to provide N credentials before this org can run" form
    4. User fills in each (with vendor-probe validation per field)
    5. POST /org/import with the gathered creds in the body, persisted as org-level secrets (shared) + workspace-level secrets (overrides)
    6. Import commits; workspaces come up CONFIGURED on first boot
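On the server side, step 5 implies checking that the submitted body actually satisfies the gathered schema before anything is persisted. A rough Go sketch of that check follows; the `Submission` shape, the type names, and `unmet` are hypothetical, and the RFC does not fix the request body format.

```go
package main

import (
	"fmt"
	"strings"
)

// Requirement is a shared credential: one name, or several alternatives.
type Requirement struct {
	AnyOf []string
}

// WorkspaceRequirement is a credential scoped to a single workspace.
type WorkspaceRequirement struct {
	Workspace string
	Name      string
}

// Submission is a hypothetical shape for the creds posted with the import.
type Submission struct {
	Shared       map[string]string            // name -> value
	PerWorkspace map[string]map[string]string // workspace -> name -> value
}

// unmet lists every requirement the submission leaves unsatisfied.
// An any_of group is satisfied by any one non-blank alternative.
func unmet(shared []Requirement, perWS []WorkspaceRequirement, sub Submission) []string {
	var out []string
	for _, r := range shared {
		ok := false
		for _, name := range r.AnyOf {
			if strings.TrimSpace(sub.Shared[name]) != "" {
				ok = true
				break
			}
		}
		if !ok {
			out = append(out, strings.Join(r.AnyOf, " or "))
		}
	}
	for _, r := range perWS {
		// Indexing a nil map is safe in Go and yields the zero value.
		if strings.TrimSpace(sub.PerWorkspace[r.Workspace][r.Name]) == "" {
			out = append(out, r.Workspace+": "+r.Name)
		}
	}
	return out
}

func main() {
	shared := []Requirement{
		{AnyOf: []string{"ANTHROPIC_API_KEY"}},
		{AnyOf: []string{"GITHUB_TOKEN", "GH_TOKEN"}},
	}
	perWS := []WorkspaceRequirement{{Workspace: "CP-Security", Name: "SEMGREP_TOKEN"}}
	sub := Submission{Shared: map[string]string{"GH_TOKEN": "example-value"}}
	for _, m := range unmet(shared, perWS, sub) {
		fmt.Println("missing:", m)
	}
}
```

Returning the full list of unmet requirements, rather than failing on the first, lets the canvas form highlight every empty field in one round trip.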

Phase 4: force mode (escape hatch)

  • POST /org/import?force=true skips the gather (current behavior) — for CI / scripts that already have creds wired separately
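Reading the flag could look like the Go sketch below. `forceRequested` is a hypothetical helper, and treating an absent or malformed value as "not forced" is an assumption chosen so that a typo cannot silently skip the gather.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// forceRequested reports whether a raw query string opts out of the
// credential gather. Only values strconv.ParseBool accepts as true count
// (for example "true" or "1"); anything else is treated as false.
func forceRequested(rawQuery string) bool {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return false
	}
	v, err := strconv.ParseBool(q.Get("force"))
	return err == nil && v
}

func main() {
	for _, raw := range []string{"force=true", "force=1", "", "force=yes"} {
		fmt.Printf("%q -> %v\n", raw, forceRequested(raw))
	}
}
```

In a handler the same check would run against `r.URL.RawQuery` before deciding whether to require the gathered credentials.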

Acceptance criteria

  1. New aggregation function in org.go that walks the tree and produces the union credential schema
  2. New /org/import/preflight endpoint with the JSON shape above
  3. Canvas import wizard renders the form, validates each field, posts to /org/import
  4. End-to-end: import a multi-runtime org template → canvas asks for ANTHROPIC + MINIMAX + GITHUB_TOKEN — only those three, not per-workspace duplicates → submit → workspaces all CONFIGURED on first boot
  5. Existing OrgTemplate.RequiredEnv org-level preflight stays as the floor (a template that DOES declare org-level required_env still hard-gates without those)
  6. Tests: aggregation unit test (multi-runtime template → expected union), preflight integration test (real handler call), force-mode test (skips gather)

Discovery context

Filed 2026-05-10 ~03:20 UTC by Hongming during cp-lead team deploy. The gap is architectural rather than specific to this tenant: it applies to every org template import flow.

Related

  • #79 (OrgHandler.Import idempotency)
  • #1031 (PR that removed org-level CLAUDE_CODE_OAUTH_TOKEN gate — exposed this gap)
  • workspace_preflight.go (per-workspace boot-time gate that exists but fires too late)
  • templates.go (templates listing that DOES walk per-workspace required_env — has the aggregation logic Import is missing)
claude-ceo-assistant added the tier:medium label 2026-05-10 05:54:51 +00:00
core-be was assigned by claude-ceo-assistant 2026-05-10 06:48:08 +00:00
Reference: molecule-ai/molecule-core#232