Compare commits

..

28 Commits

Author SHA1 Message Date
claude-ceo-assistant b73d3bfff2 fix(ci): mark CodeQL continue-on-error (advisory only) — closes #156 2026-05-07 17:26:52 +00:00
hongming 51ea86e3ec feat: mock runtime + mock-bigorg 200-workspace org (#34)
Demo Mock #3 — see PR for details. Admin-merged, CI skipped per Hongming directive.
2026-05-07 15:41:06 +00:00
Hongming Wang d64641904f feat(workspace-server): mock runtime + mock-bigorg org template
Adds a 'mock' runtime: virtual workspaces with no container, no EC2,
no LLM. Every A2A reply is synthesised from a small canned-variant
pool ('On it!', 'Got it, on it now.', etc.) deterministically seeded
by (workspace_id, request_id).

Built for funding-demo "200-workspace mock org" — renders an
enterprise-scale org chart on the canvas (CEO/VPs/Managers/ICs)
without burning real LLM credits or provisioning 200 EC2 instances.

Surfaces:
  - workspace-server/internal/handlers/mock_runtime.go: A2A proxy
    short-circuit, canned-reply pool, deterministic variant pick.
  - workspace-server/internal/handlers/a2a_proxy.go: gate the
    short-circuit before resolveAgentURL (mock has no URL).
  - workspace-server/internal/handlers/org_import.go: skip Docker
    provisioning for mock workspaces, set status='online' directly,
    drop the per-sibling 2s pacing for mock children (collapses
    a 200-workspace import from ~7min → ~1s).
  - workspace-server/internal/handlers/runtime_registry.go: register
    'mock' in the runtime allowlist (manifest + fallback set).
  - workspace-server/internal/registry/healthsweep.go +
    orphan_sweeper.go: skip mock workspaces in container-health and
    stale-token sweeps (no container by design).
  - workspace-server/internal/handlers/workspace_restart.go: mirror
    the 'external' Restart no-op for mock.
  - manifest.json: register the new
    Molecule-AI/molecule-ai-org-template-mock-bigorg repo.

Tests: 5 new in mock_runtime_test.go covering happy-path, non-mock
regression guard, determinism, IsMockRuntime trim/case, JSON-RPC
id echo. All existing handler + registry tests still pass.

Local-verified: imported the 200-workspace template against a fresh
postgres+redis, confirmed all 200 land in 'online' and stay there
through the 30s health-sweep window, exercised A2A on CEO + VPs +
Managers + ICs and saw the variant pool rotate.

Org template lives at
Molecule-AI/molecule-ai-org-template-mock-bigorg (created today)
and is imported via the existing /org/import flow on the canvas
Template Palette.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 08:40:37 -07:00
claude-ceo-assistant 70104d1cef Merge pull request #33 from molecule-ai/feat/demo-mock-1-purchase-success-modal
feat(canvas): demo Mock #1 — purchase-success modal

Per Hongming directive: skip CI for 2h, admin-merge for funding demo.
2026-05-07 15:32:55 +00:00
Hongming Wang a37a4a6e40 feat(canvas): demo Mock #1 — purchase-success modal on URL flag
Funding-demo Mock #1: when the canvas loads with `?purchase_success=1`,
show a centred success modal in the warm-paper theme. Auto-dismisses
after 5s; Close button + Esc + backdrop click also dismiss; URL params
are stripped on first paint so a refresh after dismiss does not
re-trigger.

Mounted in `app/layout.tsx` (not `app/page.tsx`) so the modal persists
across the canvas page-state transitions (loading → hydrated → error)
without unmounting and losing its open-state.

No real billing logic — the marketplace "Purchase" button on the
landing page redirects here with the flag; this modal is the only
thing the user sees of the "transaction".

Local-verified end-to-end via playwright (5/5 tests pass): redirect
URL shape, modal visibility, URL cleanup, close button, refresh-after-
dismiss behaviour, 5s auto-dismiss.

Pairs with the Purchase button added to landingpage Marketplace
section.
2026-05-07 08:32:35 -07:00
claude-ceo-assistant 85b09659e6 Merge pull request 'fix(ci): add scripts/** to publish-workspace-server-image path filter' (#32) from fix/publish-path-filter-add-scripts into main 2026-05-07 15:19:12 +00:00
devops-engineer 6de3c1ccd2 fix(ci): add scripts/** to publish-workspace-server-image path filter
scripts/clone-manifest.sh runs inside the platform Dockerfile build,
so a change to that script needs to retrigger publish. Without it,
the prior fix (clone via Gitea + lowercase org) didn't trigger this
workflow because scripts/ wasn't in the path filter.

Also serves as the file change to satisfy the path filter for THIS
push, retriggering publish-workspace-server-image now.
2026-05-07 08:18:53 -07:00
claude-ceo-assistant d4256b9d83 Merge pull request 'fix(scripts): clone-manifest.sh — use Gitea + lowercase org slug (Class G)' (#31) from fix/clone-manifest-gitea into main 2026-05-07 15:18:09 +00:00
devops-engineer 8313b2a7a7 fix(scripts): clone-manifest.sh — use Gitea + lowercase org slug
Post-2026-05-06 GitHub-org suspension: scripts/clone-manifest.sh
was still pointing at https://github.com/${repo}.git, so the
Docker build for workspace-server'\''s platform image fails at:

  fatal: could not read Username for 'https://github.com':
         No such device or address

with no credentials available in the build container.

Fix: clone from https://git.moleculesai.app/${repo}.git instead.
manifest.json'\''s repo paths still read 'Molecule-AI/...' (the
historic GitHub slug, mixed-case); Gitea lowercases the org
component to 'molecule-ai/...'. Lowercase the org segment on
the fly with awk so we don'\''t need to rewrite every manifest
entry.

Local verify: bash -n passes, lowercase transform produces correct
Gitea paths, anonymous git clone of one of the manifest plugins
over HTTPS to git.moleculesai.app succeeds.

Class G in the prod-ship CI sweep — same shape as the github.com
ref Harness Replays hits, this is the second instance found.
2026-05-07 08:17:58 -07:00
claude-ceo-assistant 566c095571 Merge pull request 'chore(ci): trigger publish-workspace-server-image (path-filter satisfaction)' (#30) from chore/touch-publish-workflow-to-trigger into main 2026-05-07 15:12:22 +00:00
devops-engineer 694a036a7f chore(ci): trailing newline to retrigger publish-workspace-server-image (path-filter requires workflow file change) 2026-05-07 08:12:10 -07:00
claude-ceo-assistant 8c1dbc6ba5 Merge pull request 'chore(ci): retrigger publish-workspace-server-image post AWS secrets registration' (#29) from chore/retrigger-publish-post-aws-secrets into main 2026-05-07 15:08:03 +00:00
devops-engineer 72d0d4b44e chore(ci): retrigger publish-workspace-server-image post AWS secrets registration 2026-05-07 08:07:46 -07:00
claude-ceo-assistant 52e61d4704 fix(ci): cherry-pick PR#23 — drop github-app-auth plugin checkout (#28) 2026-05-07 14:52:47 +00:00
devops-engineer 10e510f50c chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)
Two coupled cleanups for the post-2026-05-06 stack:

============================================
The plugin injected GITHUB_TOKEN/GH_TOKEN via the App's
installation-access flow (~hourly rotation). Per-agent Gitea
identities replaced this approach after the 2026-05-06 suspension —
workspaces now provision with a per-persona Gitea PAT from .env
instead of an App-rotated token. The plugin code itself lived on
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth which is
also unreachable post-suspension; checking it out at CI build time
was already failing.

Removed:
- workspace-server/cmd/server/main.go: githubappauth import + the
  `if os.Getenv("GITHUB_APP_ID") != ""` block that called
  BuildRegistry. gh-identity remains as the active mutator.
- workspace-server/Dockerfile + Dockerfile.tenant: COPY of the
  sibling repo + the `replace github.com/Molecule-AI/molecule-ai-
  plugin-github-app-auth => /plugin` directive injection.
- workspace-server/go.mod + go.sum: github-app-auth dep entry
  (cleaned up by `go mod tidy`).
- 3 workflows: actions/checkout steps for the sibling plugin repo:
    - .github/workflows/codeql.yml (Go matrix path)
    - .github/workflows/harness-replays.yml
    - .github/workflows/publish-workspace-server-image.yml

Verified `go build ./cmd/server` + `go vet ./...` pass post-removal.

=======================================================
Same workflow used to push to ghcr.io/molecule-ai/platform +
platform-tenant. ghcr.io/molecule-ai is gone post-suspension. The
operator's ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
molecule-ai/) already hosts platform-tenant + workspace-template-*
+ runner-base images and is the post-suspension SSOT for container
images. This PR aligns publish-workspace-server-image with that
stack.

- env.IMAGE_NAME + env.TENANT_IMAGE_NAME repointed to ECR URL.
- docker/login-action swapped for aws-actions/configure-aws-
  credentials@v4 + aws-actions/amazon-ecr-login@v2 chain (the
  standard ECR auth pattern; uses AWS_ACCESS_KEY_ID/SECRET secrets
  bound to the molecule-cp IAM user).

The :staging-<sha> + :staging-latest tag policy is unchanged —
staging-CP's TENANT_IMAGE pin still points at :staging-latest, just
with the new registry prefix.

Refs molecule-core#157, #161; parallel to org-wide CI-green sweep.
2026-05-07 07:48:51 -07:00
claude-ceo-assistant 6fac24e3de Merge pull request 'fix(workspace-server): SSOT-route container check + 422 on external runtimes (closes #10)' (#12) from fix/issue10-runtime-aware-plugin-install into main 2026-05-07 11:27:52 +00:00
claude-ceo-assistant f51722411b Merge branch 'main' into fix/issue10-runtime-aware-plugin-install 2026-05-07 11:26:14 +00:00
claude-ceo-assistant f0015bff81 Merge pull request 'fix(workspace-server): default-bind to 127.0.0.1 in dev-mode fail-open (closes #7)' (#8) from fix/s8-bind-loopback-dev into main 2026-05-07 11:25:48 +00:00
claude-ceo-assistant b72d1d3f26 Merge branch 'main' into fix/issue10-runtime-aware-plugin-install 2026-05-07 11:25:24 +00:00
claude-ceo-assistant a674a6547e Merge branch 'main' into fix/s8-bind-loopback-dev 2026-05-07 11:25:20 +00:00
claude-ceo-assistant f2f5338183 Merge pull request 'fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs' (#17) from fix/lowercase-org-slug into main 2026-05-07 10:38:12 +00:00
security-auditor e01077be38 fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs
Gitea is case-sensitive on owner slugs; canonical is lowercase
`molecule-ai/...`. Mixed-case `Molecule-AI/...` refs fail-at-0s
when the runner tries to resolve the cross-repo workflow / checkout.

Same fix as molecule-controlplane#12. Mechanical case-correction;
no behavior change beyond making CI resolve again.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:00:10 -07:00
security-auditor c1de2287fd fix(workspace-server): SSOT-route container check + 422 on external runtimes
Two coupled fixes for molecule-core#10 (plugin install 503 vs
status=online split-state):

1. SSOT for "is this workspace's container running" — `findRunningContainer`
   in plugins.go used to carry its own copy of `cli.ContainerInspect`, which
   collapsed transient daemon errors into the same `""` return as a
   genuinely-stopped container. Healthsweep's `Provisioner.IsRunning`
   handled the same input correctly (defensive). Promote the inspect logic
   to `provisioner.RunningContainerName`, route both consumers through it.
   Transient errors get a distinct log line on the plugins side so triage
   doesn't confuse a flaky daemon with a stopped container.

2. Runtime-aware Install/Uninstall — `runtime='external'` workspaces have
   no local container; push-install via docker exec is meaningless. They
   pull plugins via the download endpoint instead (Phase 30.3). Without a
   guard they fell through to `findRunningContainer` and 503'd with a
   misleading "container not running." Add an early 422 with a hint
   pointing at the download endpoint.

The two fixes are independent: (1) preserves correctness when the SSOT
helper is later modified; (2) eliminates the persistent split-state on
the 5 external persona-agent workspaces in this DB (and on tenant
deployments hitting the same shape).

* `internal/provisioner/provisioner.go` — new `RunningContainerName(ctx,
  cli, id) (string, error)` with three documented outcomes (running /
  stopped / transient). `Provisioner.IsRunning` now wraps it; behavior
  preserved.
* `internal/handlers/plugins.go` — `findRunningContainer` shimmed onto
  `RunningContainerName`; new `isExternalRuntime(id)` predicate.
* `internal/handlers/plugins_install.go` — Install + Uninstall reject
  external runtimes with 422 + hint, before the source-fetch step.
* `internal/handlers/plugins_install_external_test.go` — 5 cases:
  external→422, uninstall-external→422, container-backed-falls-through,
  no-runtime-lookup-fails-open, lookup-error-fails-open.
* `internal/handlers/plugins_findrunning_ssot_test.go` — two AST gates
  pin the SSOT routing so future PRs can't silently re-introduce the
  parallel impl. Mutation-tested: reverting either consumer to a direct
  `ContainerInspect` makes the gate fail.

Refs: molecule-core#10

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 22:58:20 -07:00
security-auditor f3187ea0c1 fix(workspace-server): default-bind to 127.0.0.1 in dev-mode fail-open
In dev mode (`MOLECULE_ENV=dev|development`, `ADMIN_TOKEN` unset) the
AdminAuth chain fails open by design so canvas at :3000 can call
workspace-server at :8080 without a bearer token. Combined with the
existing wildcard bind on `:8080`, that exposed unauthenticated
`POST /workspaces` to any same-LAN peer (S-8 in the audit RFC v1).

Couple the bind narrowness to the same signal that drives the auth
fail-open: when `middleware.IsDevModeFailOpen()` returns true, default
the listener to `127.0.0.1`. Production (`ADMIN_TOKEN` set) keeps
binding to all interfaces — its auth chain is doing the work. Operators
who need LAN exposure set `BIND_ADDR=<host>` explicitly.

* `cmd/server/main.go` — `resolveBindHost()` precedence: BIND_ADDR
  explicit > IsDevModeFailOpen() loopback > "" (all interfaces).
  Startup log line now includes the resolved bind + dev-mode-fail-open
  state for post-deploy auditing.
* `cmd/server/bind_test.go` — 8 t.Setenv table cases covering
  precedence, explicit overrides, dev/prod env words. Mutation-tested:
  removing the `IsDevModeFailOpen()` branch makes the dev-mode cases
  fail with "" vs "127.0.0.1".

Refs: molecule-core#7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 22:29:24 -07:00
claude-ceo-assistant f92ba492de Merge pull request 'test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B)' (#3) from fix/2872-sqlmock-regex-tightening into staging 2026-05-07 00:19:40 +00:00
Hongming Wang 00cfe51df7 test(org_import): tighten sqlmock regex on lookupExistingChild (#2872 PR-B)
The five `mock.ExpectQuery(\`SELECT id FROM workspaces\`)` sites used a
loose substring regex that silent-passed three regression shapes #2872
called out:

  1. `WHERE parent_id = $2` (drops `IS NOT DISTINCT FROM` — breaks
     NULL-parent root matching)
  2. `WHERE name = $1` only (drops parent_id check entirely — hijacks
     siblings of the same name across different parents)
  3. Drops `AND status != 'removed'` (blocks re-import after Collapse)

Extracts a `lookupChildSQLRE` const that anchors all four load-bearing
tokens (the SELECT/FROM, the name predicate, the IS NOT DISTINCT FROM
predicate, and the status filter). All five ExpectQuery sites now use
the same const so a future schema/predicate change fails one place.

Mutation-tested per memory feedback_assert_exact_not_substring.md:
- Replacing `IS NOT DISTINCT FROM` with `=` fails
  TestLookupExistingChild_NilParent_MatchesRoot.
- Dropping `AND status != 'removed'` fails
  TestLookupExistingChild_Found_ReturnsIDAndTrue.

Note: #2872 PR-A (AST gate strengthening) is already addressed inline —
findWorkspacesInsertSQL + TestCreateWorkspaceTree_InsertUsesOnConflictDoNothing
pin the ON CONFLICT DO NOTHING shape, which is a strictly stronger
gate than the original lookup-before-insert ordering check.
2026-05-06 16:43:42 -07:00
claude-ceo-assistant 55ef3176ed feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6)
Allows MOLECULE_IMAGE_REGISTRY env override on the tenant workspace-server. Used to flip from ghcr.io/molecule-ai → private ECR mirror after the GitHub org suspension on 2026-05-06. Default unchanged for OSS users.

Closes #6.
2026-05-06 22:51:53 +00:00
claude-ceo-assistant 4b074f631b feat(provisioner): env-driven RegistryPrefix() for workspace template images (#6)
Add MOLECULE_IMAGE_REGISTRY env var to override the registry prefix used
by all workspace-template image references. Defaults to ghcr.io/molecule-ai
(unchanged for OSS users); set to an ECR URI in production tenants when
mirroring to AWS.

Why this matters: GitHub suspended the Molecule-AI org on 2026-05-06 with
no warning. Production tenants kept running because they had images cached
locally, but any tenant restart (AWS health event, redeploy, OS reboot)
would have failed at `docker pull ghcr.io/molecule-ai/...` because GHCR
returned 401. This change introduces the seam needed to point new pulls at
a registry we control (AWS ECR) by flipping a single env var on Railway.

Design (RFC: molecule-ai/internal#6):

- New `RegistryPrefix()` function in `provisioner/registry.go` reads
  MOLECULE_IMAGE_REGISTRY, falls back to "ghcr.io/molecule-ai".
- New `RuntimeImage(runtime)` returns the canonical ref using the prefix.
- `RuntimeImages` map computed at init via `computeRuntimeImages()` so
  existing callers that range over it still work.
- `DefaultImage` likewise computed via `RuntimeImage(defaultRuntime)`.
- `handlers.TemplateImageRef()` switched from hardcoded format string to
  `provisioner.RegistryPrefix()`.
- `runtime_image_pin.go::resolveRuntimeImage()` automatically inherits
  the prefix change because it reads from `provisioner.RuntimeImages[]`
  and only re-formats the tag suffix to a digest pin.

Alternatives rejected (see RFC):

- Multi-registry fallback chain (try ECR, fall back to GHCR): GHCR is
  locked from outbound for our org, so the fallback never works for us.
  Adds code complexity for no benefit.
- Hardcoded ECR-only switch: couples production code to a specific
  deployment environment. OSS users self-hosting Molecule would need
  the upstream GHCR.
- Self-hosted Harbor / registry-on-Hetzner: adds a component to operate.
  Not justified at 3-tenant scale; AWS ECR is mature and IAM-integrated.

Auth — deliberately NOT changed in this commit:

- For GHCR, the existing `ghcrAuthHeader()` reads GHCR_USER/GHCR_TOKEN.
- For ECR, EC2 user-data installs `amazon-ecr-credential-helper` and adds
  a `credHelpers` entry in `~/.docker/config.json` so the daemon resolves
  ECR credentials via the EC2 instance role on every pull. The Go code
  needs no auth change. This keeps the diff minimal.

Backwards compatibility:

- Additive: env unset → identical behavior to today (GHCR).
- Existing tests reference literal `ghcr.io/molecule-ai/...` strings;
  they continue to pass under the default prefix.
- `RuntimeImages` map preserved for callers that iterate it.
- No interface, schema, API, or migration version bump needed.

Security review:

- No untrusted input: MOLECULE_IMAGE_REGISTRY is set at deploy time
  (Railway env, EC2 user-data), not by users.
- No expanded data collection or logging changes.
- No new permissions: ECR pull permission is a future user-data + IAM
  role change, separate from this code change.
- Worst-case: an attacker who already compromises Railway can swap the
  registry prefix to a malicious URI — same blast radius as compromising
  Railway today, no expansion.

Tests:

- 9 new unit tests in `registry_test.go` covering: default fallback,
  env override, empty env, all 9 known runtimes, unknown runtime,
  override-applies-to-all, computeRuntimeImages map population, env
  reflection, alphabetical ordering pin.
- All existing provisioner + handlers tests continue to pass.
- Mutation-tested mentally: deleting `if v := os.Getenv(...)` makes
  TestRegistryPrefix_RespectsEnv fail. Deleting `for _, r := range
  knownRuntimes` makes TestRuntimeImage_AllKnownRuntimes fail. The test
  suite would catch a regression of the original failure mode.

Rollout plan: this PR is safe to merge with no env change. Production
cutover happens by setting MOLECULE_IMAGE_REGISTRY on Railway after
the AWS ECR mirror is populated (separate ops change, tracked in
issue #6 phases 3b–3f).

Tracking:
- RFC: molecule-ai/internal#6
- Tasks: #97 (ECR setup), #98 (CP fallback)
- Tech debt: runbooks/hetzner-rollout-tech-debt-2026-05-06.md item 7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 14:23:01 -07:00
43 changed files with 1718 additions and 255 deletions
+2 -2
View File
@@ -1,7 +1,7 @@
name: Block internal-flavored paths
# Hard CI gate. Internal content (positioning, competitive briefs, sales
# playbooks, PMM/press drip, draft campaigns) lives in Molecule-AI/internal —
# playbooks, PMM/press drip, draft campaigns) lives in molecule-ai/internal —
# this public monorepo must never re-acquire those paths. CEO directive
# 2026-04-23 after a fleet-wide audit found 79 internal files leaked here.
#
@@ -135,7 +135,7 @@ jobs:
echo "::error::Forbidden internal-flavored paths detected:"
printf "$OFFENDING"
echo ""
echo "These paths belong in Molecule-AI/internal, not this public repo."
echo "These paths belong in molecule-ai/internal, not this public repo."
echo "See docs/internal-content-policy.md for canonical locations."
echo ""
echo "If your file is genuinely public-facing (e.g. a blog post"
+1 -1
View File
@@ -108,7 +108,7 @@ jobs:
echo
echo "One or more canary secrets are unset (\`CANARY_TENANT_URLS\`, \`CANARY_ADMIN_TOKENS\`, \`CANARY_CP_SHARED_SECRET\`)."
echo "Phase 2 canary fleet has not been stood up yet —"
echo "see [canary-tenants.md](https://github.com/Molecule-AI/molecule-controlplane/blob/main/docs/canary-tenants.md)."
echo "see [canary-tenants.md](https://github.com/molecule-ai/molecule-controlplane/blob/main/docs/canary-tenants.md)."
echo
echo "**Skipped — promote-to-latest will NOT auto-fire.** Dispatch \`promote-latest.yml\` manually when ready."
} >> "$GITHUB_STEP_SUMMARY"
+5 -5
View File
@@ -87,7 +87,7 @@ jobs:
run: go mod download
- if: needs.changes.outputs.platform == 'true'
run: go build ./cmd/server
# CLI (molecli) moved to standalone repo: github.com/Molecule-AI/molecule-cli
# CLI (molecli) moved to standalone repo: github.com/molecule-ai/molecule-cli
- if: needs.changes.outputs.platform == 'true'
run: go vet ./... || true
- if: needs.changes.outputs.platform == 'true'
@@ -165,7 +165,7 @@ jobs:
# Strip the package-import prefix so we can match .coverage-allowlist.txt
# entries written as paths relative to workspace-server/.
# Handle both module paths: platform/workspace-server/... and platform/...
rel=$(echo "$file" | sed 's|^github.com/Molecule-AI/molecule-monorepo/platform/workspace-server/||; s|^github.com/Molecule-AI/molecule-monorepo/platform/||')
rel=$(echo "$file" | sed 's|^github.com/molecule-ai/molecule-monorepo/platform/workspace-server/||; s|^github.com/molecule-ai/molecule-monorepo/platform/||')
if echo "$ALLOWLIST" | grep -qxF "$rel"; then
echo "::warning file=workspace-server/$rel::Critical file at ${pct}% coverage (allowlisted, #1823) — fix before expiry."
@@ -243,8 +243,8 @@ jobs:
if-no-files-found: warn
# MCP Server + SDK removed from CI — now in standalone repos:
# - github.com/Molecule-AI/molecule-mcp-server (npm CI)
# - github.com/Molecule-AI/molecule-sdk-python (PyPI CI)
# - github.com/molecule-ai/molecule-mcp-server (npm CI)
# - github.com/molecule-ai/molecule-sdk-python (PyPI CI)
# e2e-api job moved to .github/workflows/e2e-api.yml (issue #458).
# It now has workflow-level concurrency (cancel-in-progress: false) so
@@ -434,5 +434,5 @@ jobs:
fi
# SDK + plugin validation moved to standalone repo:
# github.com/Molecule-AI/molecule-sdk-python
# github.com/molecule-ai/molecule-sdk-python
+6 -12
View File
@@ -43,6 +43,9 @@ permissions:
jobs:
analyze:
name: Analyze (${{ matrix.language }})
# CodeQL set to advisory (non-blocking) on Gitea Actions — Hongming dec'''n 2026-05-07 (#156).
# Findings still emit as SARIF artifacts; failing CodeQL run does not block PR merge.
continue-on-error: true
runs-on: ubuntu-latest
timeout-minutes: 45
@@ -55,17 +58,8 @@ jobs:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# Same reasoning as publish-workspace-server-image.yml — the Go
# module's replace directive needs the plugin source so
# CodeQL's "go build" phase can resolve.
if: matrix.language == 'go'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# plugin was dropped + the Dockerfile no longer needs it.
# jq is pre-installed on ubuntu-latest — no setup step needed.
- name: Initialize CodeQL
@@ -121,7 +115,7 @@ jobs:
# 14-day retention — longer than default 3, short enough not
# to bloat quota.
if: always()
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
uses: actions/upload-artifact@v3 # pinned to v3 for Gitea act_runner v0.6 compatibility (internal#46)
with:
name: codeql-sarif-${{ matrix.language }}
path: sarif-results/${{ matrix.language }}/
+2 -10
View File
@@ -95,16 +95,8 @@ jobs:
- if: needs.detect-changes.outputs.run == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# Dockerfile.tenant copies molecule-ai-plugin-github-app-auth/
# at the build-context root (see workspace-server/Dockerfile.tenant
# line 19). PLUGIN_REPO_PAT pattern matches publish-workspace-server-image.yml.
if: needs.detect-changes.outputs.run == 'true'
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# the plugin was dropped + Dockerfile.tenant no longer COPYs it.
- name: Install Python deps for replays
# peer-discovery-404 (and future replays) eval Python against the
+1 -1
View File
@@ -19,4 +19,4 @@ permissions:
jobs:
disable-auto-merge-on-push:
uses: Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
uses: molecule-ai/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
+3 -3
View File
@@ -25,7 +25,7 @@ name: publish-runtime
# 3. Publishes to PyPI via the PyPA Trusted Publisher action (OIDC).
# No static API token is stored — PyPI verifies the workflow's
# OIDC claim against the trusted-publisher config registered for
# molecule-ai-workspace-runtime (Molecule-AI/molecule-core,
# molecule-ai-workspace-runtime (molecule-ai/molecule-core,
# publish-runtime.yml, environment pypi-publish).
#
# After publish: the 8 template repos pick up the new version on their
@@ -166,7 +166,7 @@ jobs:
- name: Publish to PyPI (Trusted Publisher / OIDC)
# PyPI side is configured: project molecule-ai-workspace-runtime →
# publisher Molecule-AI/molecule-core, workflow publish-runtime.yml,
# publisher molecule-ai/molecule-core, workflow publish-runtime.yml,
# environment pypi-publish. The action mints a short-lived OIDC
# token and exchanges it for a PyPI upload credential — no static
# API token in this repo's secrets.
@@ -342,7 +342,7 @@ jobs:
TEMPLATES="claude-code hermes openclaw codex langgraph crewai autogen deepagents gemini-cli"
FAILED=""
for tpl in $TEMPLATES; do
REPO="Molecule-AI/molecule-ai-workspace-template-$tpl"
REPO="molecule-ai/molecule-ai-workspace-template-$tpl"
STATUS=$(curl -sS -o /tmp/dispatch.out -w "%{http_code}" \
-X POST "https://api.github.com/repos/$REPO/dispatches" \
-H "Authorization: Bearer $DISPATCH_TOKEN" \
@@ -37,6 +37,7 @@ on:
- 'workspace-server/**'
- 'canvas/**'
- 'manifest.json'
- 'scripts/**'
- '.github/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
@@ -60,8 +61,8 @@ permissions:
packages: write
env:
IMAGE_NAME: ghcr.io/molecule-ai/platform
TENANT_IMAGE_NAME: ghcr.io/molecule-ai/platform-tenant
IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform
TENANT_IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant
jobs:
build-and-push:
@@ -70,31 +71,28 @@ jobs:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Checkout sibling plugin repo
# workspace-server/Dockerfile expects
# ./molecule-ai-plugin-github-app-auth at build-context root because
# the Go module has a `replace` directive pointing at /plugin inside
# the image. Pre-repo-split the plugin lived in the monorepo; the
# 2026-04-18 restructure moved it out but didn't add this clone step
# — which is why publish was failing after that restructure.
#
# Uses a fine-grained PAT (PLUGIN_REPO_PAT) because the plugin repo
# is private and the default GITHUB_TOKEN is scoped to THIS repo.
# The PAT needs Contents:Read on Molecule-AI/molecule-ai-plugin-
# github-app-auth. Falls back to the default token for the (rare)
# case where an operator made the plugin repo public.
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
repository: Molecule-AI/molecule-ai-plugin-github-app-auth
path: molecule-ai-plugin-github-app-auth
token: ${{ secrets.PLUGIN_REPO_PAT || secrets.GITHUB_TOKEN }}
# github-app-auth sibling-checkout removed 2026-05-07 (#157):
# plugin was dropped + workspace-server/Dockerfile no longer
# COPYs it.
- name: Log in to GHCR
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
- name: Configure AWS credentials for ECR
# GHCR was the pre-suspension target; the molecule-ai org on
# GitHub got swept 2026-05-06 and ghcr.io/molecule-ai/* is no
# longer reachable. Post-suspension target is the operator's
# ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/*), which already hosts platform-tenant +
# workspace-template-* + runner-base images. AWS creds come
# from the AWS_ACCESS_KEY_ID/SECRET secrets bound to the
# molecule-cp IAM user. Closes #161.
uses: aws-actions/configure-aws-credentials@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-2
- name: Log in to ECR
id: ecr-login
uses: aws-actions/amazon-ecr-login@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
@@ -184,3 +182,4 @@ jobs:
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.revision=${{ github.sha }}
org.opencontainers.image.description=Molecule AI tenant platform + canvas — pending canary verify
@@ -9,7 +9,7 @@ name: redeploy-tenants-on-main
#
# This workflow closes the gap by calling the control-plane admin
# endpoint that performs a canary-first, batched, health-gated rolling
# redeploy across every live tenant. Implemented in Molecule-AI/
# redeploy across every live tenant. Implemented in molecule-ai/
# molecule-controlplane as POST /cp/admin/tenants/redeploy-fleet
# (feat/tenant-auto-redeploy, landing alongside this workflow).
#
@@ -146,7 +146,7 @@ jobs:
- name: Call CP redeploy-fleet
# CP_ADMIN_API_TOKEN must be set as a repo/org secret on
# Molecule-AI/molecule-core, matching the staging/prod CP's
# molecule-ai/molecule-core, matching the staging/prod CP's
# CP_ADMIN_API_TOKEN env. Stored in Railway, mirrored to this
# repo's secrets for CI.
env:
@@ -97,7 +97,7 @@ jobs:
- name: Call staging-CP redeploy-fleet
# CP_STAGING_ADMIN_API_TOKEN must be set as a repo/org secret
# on Molecule-AI/molecule-core, matching staging-CP's
# on molecule-ai/molecule-core, matching staging-CP's
# CP_ADMIN_API_TOKEN env var (visible in Railway controlplane
# / staging environment). Stored separately from the prod
# CP_ADMIN_API_TOKEN so a leak of one doesn't auth the other.
@@ -96,7 +96,7 @@ jobs:
--body "$(cat <<'BODY'
[retarget-bot] This PR was opened against `main` and has been retargeted to `staging` automatically.
**Why:** per [SHARED_RULES rule 8](https://github.com/Molecule-AI/molecule-ai-org-template-molecule-dev/blob/main/SHARED_RULES.md), all feature work targets `staging` first; the CEO promotes `staging → main` separately.
**Why:** per [SHARED_RULES rule 8](https://github.com/molecule-ai/molecule-ai-org-template-molecule-dev/blob/main/SHARED_RULES.md), all feature work targets `staging` first; the CEO promotes `staging → main` separately.
**What changed:** just the base branch — no code change. CI will re-run against `staging`. If you get merge conflicts, rebase on `staging`.
+1 -1
View File
@@ -12,7 +12,7 @@ name: Secret scan
#
# jobs:
# secret-scan:
# uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
# uses: molecule-ai/molecule-core/.github/workflows/secret-scan.yml@staging
#
# Pin to @staging not @main — staging is the active default branch,
# main lags via the staging-promotion workflow. Updates ride along
+23 -44
View File
@@ -1,7 +1,7 @@
<div align="center">
<p>
<img src="./docs/assets/branding/molecule-icon.svg" alt="Molecule AI" width="160" />
<img src="./docs/assets/branding/molecule-icon.png" alt="Molecule AI Icon Logo" width="160" />
</p>
<p>
@@ -39,8 +39,8 @@
<a href="./docs/agent-runtime/workspace-runtime.md"><strong>Workspace Runtime</strong></a>
</p>
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://github.com/Molecule-AI/molecule-core)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/Molecule-AI/molecule-core)
[![Deploy on Railway](https://railway.app/button.svg)](https://railway.app/new/template?template=https://github.com/Molecule-AI/molecule-monorepo)
[![Deploy to Render](https://render.com/images/deploy-to-render-button.svg)](https://render.com/deploy?repo=https://github.com/Molecule-AI/molecule-monorepo)
</div>
@@ -53,8 +53,8 @@ Molecule AI is the most powerful way to govern an AI agent organization in produ
It combines the parts that are usually scattered across demos, internal glue code, and framework-specific tooling into one product:
- one org-native control plane for teams, roles, hierarchy, and lifecycle
- one runtime layer that lets **eight** agent runtimes — LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, **Hermes**, **Gemini CLI**, and OpenClaw run side by side behind one workspace contract
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries (Memory v2 backed by pgvector for semantic recall)
- one runtime layer that lets LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw run side by side
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries
- one operational surface for observing, pausing, restarting, inspecting, and improving live workspaces
Most teams can build a workflow, a strong single agent, a coding agent, or a custom multi-agent graph.
@@ -75,7 +75,7 @@ You do not wire collaboration paths by hand. Hierarchy defines the default commu
### 3. Runtime choice stops being a dead-end decision
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, Hermes, Gemini CLI, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
### 4. Memory is treated like infrastructure
@@ -117,8 +117,6 @@ Molecule AI is not trying to replace the frameworks below. It is the system that
| **Claude Code** | Shipping on `main` | Real coding workflows, CLI-native continuity | Secure workspace abstraction, A2A delegation, org boundaries, shared control plane |
| **CrewAI** | Shipping on `main` | Role-based crews | Persistent workspace identity, policy consistency, shared canvas and registry |
| **AutoGen** | Shipping on `main` | Assistant/tool orchestration | Standardized deployment, hierarchy-aware collaboration, shared ops plane |
| **Hermes 4** | Shipping on `main` | Hybrid reasoning, native tools, json_schema (NousResearch/hermes-agent) | Option B upstream hook, A2A bridge to OpenAI-compat API, multi-provider provider derivation |
| **Gemini CLI** | Shipping on `main` | Google Gemini CLI continuity | Workspace lifecycle, A2A, hierarchy-aware collaboration, shared ops plane |
| **OpenClaw** | Shipping on `main` | CLI-native runtime with its own session model | Workspace lifecycle, templates, activity logs, topology-aware collaboration |
| **NemoClaw** | WIP on `feat/nemoclaw-t4-docker` | NVIDIA-oriented runtime path | Planned to join the same abstraction once merged; not yet part of `main` |
@@ -184,10 +182,9 @@ The result is not just “an agent that learns.” It is **an organization that
## What Ships In `main`
### Canvas (v4)
### Canvas
- Next.js 15 + React Flow + Zustand
- **warm-paper theme system** — light / dark / follow-system, SSR cookie + nonce'd boot script + ThemeProvider; terminal + code surfaces stay dark unconditionally
- drag-to-nest team building
- empty-state deployment + onboarding wizard
- template palette
@@ -196,9 +193,8 @@ The result is not just “an agent that learns.” It is **an organization that
### Platform
- Go 1.25 / Gin control plane (80+ HTTP endpoints + Gorilla WebSocket fanout)
- workspace CRUD and provisioning (pluggable Provisioner — Docker locally, EC2 + SSM in production)
- **A2A response path is a typed discriminated union (RFC #2967)** — frozen dataclasses + total parser; 100% unit + adversarial fuzz coverage
- Go/Gin control plane
- workspace CRUD and provisioning
- registry and heartbeats
- browser-safe A2A proxy
- team expansion/collapse
@@ -208,10 +204,10 @@ The result is not just “an agent that learns.” It is **an organization that
### Runtime
- unified `workspace/` image; thin AMI in production (us-east-2)
- adapter-driven execution across **8 runtimes** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw)
- unified `workspace/` image
- adapter-driven execution
- Agent Card registration
- awareness-backed memory integration; **Memory v2 backed by pgvector** for semantic recall
- awareness-backed memory integration
- plugin-mounted shared rules/skills
- hot-reloadable local skills
- coordinator-only delegation path
@@ -225,13 +221,6 @@ The result is not just “an agent that learns.” It is **an organization that
- runtime tiers
- direct workspace inspection through terminal and files
### SaaS (via [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane))
- multi-tenant on AWS EC2 + Neon (per-tenant Postgres branch) + Cloudflare Tunnels (per-tenant, no public ports)
- WorkOS AuthKit + Stripe Checkout + Customer Portal
- AWS KMS envelope encryption (DB / Redis connection strings); AWS Secrets Manager for tenant bootstrap
- `tenant_resources` audit table + 30-min boot-event-aware reconciler — every CF / AWS lifecycle event recorded, claim vs live state diffed
## Built For Teams That Need More Than A Demo
Molecule AI is especially strong when you need to run:
@@ -244,30 +233,24 @@ Molecule AI is especially strong when you need to run:
## Architecture
```text
Canvas (Next.js 15, warm-paper :3000) <--HTTP / WS--> Platform (Go 1.25 :8080) <---> Postgres + Redis
| |
| +--> Provisioner: Docker (local) / EC2 + SSM (prod)
| +--> bundles · templates · secrets · KMS
Canvas (Next.js :3000) <--HTTP / WS--> Platform (Go :8080) <---> Postgres + Redis
| |
| +--> Docker provisioner / bundles / templates / secrets
|
+------------------------- shows ------------------------> workspaces, teams, tasks, traces, events
+-------------------- shows --------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python ≥3.11, image with adapters)
- 8 adapters: LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / Hermes / Gemini CLI / OpenClaw
- Agent Card + A2A server (typed-SSOT response path, RFC #2967)
- heartbeat + activity + awareness-backed memory (Memory v2 — pgvector semantic recall)
Workspace Runtime (Python image with adapters)
- LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / OpenClaw
- Agent Card + A2A server
- heartbeat + activity + awareness-backed memory
- skills + plugins + hot reload
SaaS Control Plane (molecule-controlplane, private)
- per-tenant EC2 + Neon (Postgres branch) + Cloudflare Tunnel
- WorkOS · Stripe · KMS · AWS Secrets Manager
- tenant_resources audit + 30-min reconciler
```
## Quick Start
```bash
git clone https://github.com/Molecule-AI/molecule-core.git
cd molecule-core
git clone https://github.com/Molecule-AI/molecule-monorepo.git
cd molecule-monorepo
cp .env.example .env
# Defaults boot the stack locally out of the box. See .env.example for
@@ -320,11 +303,7 @@ Then open `http://localhost:3000`:
## Current Scope
The current `main` branch ships the core platform, Canvas v4 (warm-paper themed), Memory v2 (pgvector semantic recall), the typed-SSOT A2A response path (RFC #2967), **eight production adapters** (Claude Code, Hermes, Gemini CLI, LangGraph, DeepAgents, CrewAI, AutoGen, OpenClaw), skill lifecycle, and operational surfaces.
The companion private repo [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane) provides the SaaS surface — multi-tenant orchestration on EC2 + Neon + Cloudflare Tunnels, KMS envelope encryption, WorkOS auth, Stripe billing, and a `tenant_resources` audit table with a 30-min reconciler.
Adjacent runtime work such as **NemoClaw** remains branch-level until merged, and this README keeps that distinction explicit on purpose.
The current `main` branch already includes the core platform, canvas, memory model, six production adapters, skill lifecycle, and operational surfaces. Adjacent runtime work such as **NemoClaw** remains branch-level until merged, and this README keeps that distinction explicit on purpose.
## License
+7
View File
@@ -3,6 +3,7 @@ import { cookies, headers } from "next/headers";
import "./globals.css";
import { AuthGate } from "@/components/AuthGate";
import { CookieConsent } from "@/components/CookieConsent";
import { PurchaseSuccessModal } from "@/components/PurchaseSuccessModal";
import { ThemeProvider } from "@/lib/theme-provider";
import {
THEME_COOKIE,
@@ -86,6 +87,12 @@ export default async function RootLayout({
vercel preview URL, apex) pass through unchanged. */}
<AuthGate>{children}</AuthGate>
<CookieConsent />
{/* Demo Mock #1: post-purchase success toast. Mounted at the
layout level so it persists across page state transitions
(loading → hydrated → error) without being unmounted and
losing its open-state. Reads ?purchase_success=1 from the
URL on first paint, then strips the param. */}
<PurchaseSuccessModal />
</ThemeProvider>
</body>
</html>
@@ -0,0 +1,175 @@
"use client";
/**
* PurchaseSuccessModal — demo-only post-purchase confirmation.
*
* Mounted on the canvas root (`app/page.tsx`). On first paint it inspects
* `?purchase_success=1[&item=<name>]` on the current URL. If present, it
* renders a centred modal styled after `ConfirmDialog`, schedules a 5s
* auto-dismiss, and rewrites the URL via `history.replaceState` to drop
* the params so a refresh after dismiss does NOT re-show the modal.
*
* Mock for the funding demo — there is no real billing surface behind
* this. The marketplace "Purchase" button on the landing page redirects
* here with the params; this modal is the only thing the user sees of
* the "transaction".
*
* Styling matches the warm-paper @theme tokens (surface-sunken / line /
* ink / good) so it tracks light + dark without per-mode overrides.
*/
import { useEffect, useRef, useState } from "react";
import { createPortal } from "react-dom";
const AUTO_DISMISS_MS = 5000;
function readPurchaseParams(): { open: boolean; item: string | null } {
if (typeof window === "undefined") return { open: false, item: null };
const sp = new URLSearchParams(window.location.search);
const flag = sp.get("purchase_success");
if (flag !== "1" && flag !== "true") return { open: false, item: null };
return { open: true, item: sp.get("item") };
}
function stripPurchaseParams() {
if (typeof window === "undefined") return;
const url = new URL(window.location.href);
url.searchParams.delete("purchase_success");
url.searchParams.delete("item");
// replaceState (not pushState) so back-button doesn't return to the
// pre-strip URL and re-trigger the modal.
window.history.replaceState({}, "", url.toString());
}
export function PurchaseSuccessModal() {
const [open, setOpen] = useState(false);
const [item, setItem] = useState<string | null>(null);
const [mounted, setMounted] = useState(false);
const dialogRef = useRef<HTMLDivElement>(null);
// Read the URL params once on mount. We don't subscribe to navigation —
// this modal is a one-shot for the demo redirect, not a persistent
// listener.
useEffect(() => {
setMounted(true);
const { open: shouldOpen, item: itemName } = readPurchaseParams();
if (shouldOpen) {
setOpen(true);
setItem(itemName);
// Clean the URL immediately so a refresh after the modal is closed
// (or even while it's still open) does NOT re-trigger it.
stripPurchaseParams();
}
}, []);
// Auto-dismiss timer + Escape handler.
useEffect(() => {
if (!open) return;
const t = window.setTimeout(() => setOpen(false), AUTO_DISMISS_MS);
const onKey = (e: KeyboardEvent) => {
if (e.key === "Escape") setOpen(false);
};
window.addEventListener("keydown", onKey);
// Focus the close button so keyboard users land on it after redirect.
const raf = requestAnimationFrame(() => {
dialogRef.current?.querySelector<HTMLButtonElement>("button")?.focus();
});
return () => {
window.clearTimeout(t);
window.removeEventListener("keydown", onKey);
cancelAnimationFrame(raf);
};
}, [open]);
if (!open || !mounted) return null;
const itemLabel = item ? decodeURIComponent(item) : "Your new agent";
return createPortal(
<div
className="fixed inset-0 z-[9999] flex items-center justify-center"
data-testid="purchase-success-modal"
>
{/* Backdrop — click closes, matches ConfirmDialog backdrop. */}
<div
className="absolute inset-0 bg-black/60 backdrop-blur-sm"
onClick={() => setOpen(false)}
aria-hidden="true"
/>
<div
ref={dialogRef}
role="dialog"
aria-modal="true"
aria-labelledby="purchase-success-title"
className="relative bg-surface-sunken border border-line rounded-xl shadow-2xl shadow-black/50 max-w-[420px] w-full mx-4 overflow-hidden"
>
<div className="px-6 pt-6 pb-4">
<div className="flex items-start gap-4">
{/* Success glyph — uses --color-good so it tracks the theme.
Inline SVG over an emoji so it stays readable + on-brand
in both light and dark. */}
<div
className="flex h-10 w-10 flex-shrink-0 items-center justify-center rounded-full"
style={{
background:
"color-mix(in srgb, var(--color-good) 15%, transparent)",
color: "var(--color-good)",
}}
>
<svg
width="22"
height="22"
viewBox="0 0 24 24"
fill="none"
aria-hidden="true"
>
<circle
cx="12"
cy="12"
r="10"
stroke="currentColor"
strokeWidth="1.5"
/>
<path
d="M7.5 12.5L10.5 15.5L16.5 9.5"
stroke="currentColor"
strokeWidth="1.8"
strokeLinecap="round"
strokeLinejoin="round"
/>
</svg>
</div>
<div className="flex-1">
<h3
id="purchase-success-title"
className="text-base font-semibold text-ink"
>
Purchase successful
</h3>
<p className="mt-1.5 text-[13px] leading-relaxed text-ink-mid">
<span className="font-medium text-ink">{itemLabel}</span> has
been added to your workspace. Provisioning starts in the
background you can keep working while it spins up.
</p>
</div>
</div>
</div>
<div className="flex items-center justify-between gap-3 px-6 py-3 border-t border-line bg-surface/50">
<span className="font-mono text-[10.5px] uppercase tracking-[0.12em] text-ink-soft">
auto-dismiss · {AUTO_DISMISS_MS / 1000}s
</span>
<button
type="button"
onClick={() => setOpen(false)}
className="px-3.5 py-1.5 text-[13px] rounded-lg bg-accent hover:bg-accent-strong text-white transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-surface-sunken focus-visible:ring-accent/60"
>
Close
</button>
</div>
</div>
</div>,
document.body,
);
}
-28
View File
@@ -1,28 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64">
<style>
.bg { fill: #0a1120; }
.accent { fill: #7fe8d6; }
.accent-stroke { stroke: #7fe8d6; }
@media (prefers-color-scheme: light) {
.bg { fill: #f5f7fa; }
.accent { fill: #1a8a72; }
.accent-stroke { stroke: #1a8a72; }
}
</style>
<rect class="bg" width="64" height="64" rx="14"/>
<g class="accent-stroke" stroke-width="2.4" stroke-linecap="round" fill="none">
<line x1="32" y1="32" x2="12" y2="14"/>
<line x1="32" y1="32" x2="52" y2="18"/>
<line x1="32" y1="32" x2="10" y2="40"/>
<line x1="32" y1="32" x2="54" y2="44"/>
<line x1="32" y1="32" x2="32" y2="56"/>
</g>
<g class="accent">
<circle cx="32" cy="32" r="6.5"/>
<circle cx="12" cy="14" r="3.5"/>
<circle cx="52" cy="18" r="3.5"/>
<circle cx="10" cy="40" r="3.5"/>
<circle cx="54" cy="44" r="3.5"/>
<circle cx="32" cy="56" r="3.5"/>
</g>
</svg>

Before

Width:  |  Height:  |  Size: 957 B

-17
View File
@@ -1,17 +0,0 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 64 64" role="img" aria-label="Molecule AI">
<g stroke="#7fe8d6" stroke-width="2.6" stroke-linecap="round" fill="none">
<line x1="32" y1="32" x2="12" y2="14"/>
<line x1="32" y1="32" x2="52" y2="18"/>
<line x1="32" y1="32" x2="10" y2="40"/>
<line x1="32" y1="32" x2="54" y2="44"/>
<line x1="32" y1="32" x2="32" y2="56"/>
</g>
<g fill="#7fe8d6">
<circle cx="32" cy="32" r="7"/>
<circle cx="12" cy="14" r="3.6"/>
<circle cx="52" cy="18" r="3.6"/>
<circle cx="10" cy="40" r="3.6"/>
<circle cx="54" cy="44" r="3.6"/>
<circle cx="32" cy="56" r="3.6"/>
</g>
</svg>

Before

Width:  |  Height:  |  Size: 662 B

+2 -1
View File
@@ -41,6 +41,7 @@
{"name": "medo-smoke", "repo": "Molecule-AI/molecule-ai-org-template-medo-smoke", "ref": "main"},
{"name": "molecule-worker-gemini", "repo": "Molecule-AI/molecule-ai-org-template-molecule-worker-gemini", "ref": "main"},
{"name": "reno-stars", "repo": "Molecule-AI/molecule-ai-org-template-reno-stars", "ref": "main"},
{"name": "ux-ab-lab", "repo": "Molecule-AI/molecule-ai-org-template-ux-ab-lab", "ref": "main"}
{"name": "ux-ab-lab", "repo": "Molecule-AI/molecule-ai-org-template-ux-ab-lab", "ref": "main"},
{"name": "mock-bigorg", "repo": "Molecule-AI/molecule-ai-org-template-mock-bigorg", "ref": "main"}
]
}
+10 -3
View File
@@ -45,11 +45,18 @@ clone_category() {
continue
fi
echo " cloning $repo -> $target_dir/$name (ref=$ref)"
# Post-2026-05-06 GitHub-org-suspension: clone from Gitea instead.
# manifest.json paths still read "Molecule-AI/..." (the historic
# github.com slug); Gitea lowercases the org part to "molecule-ai/".
# Lowercase the org segment on the fly so we don't need to rewrite
# every manifest entry.
repo_gitea="$(echo "$repo" | awk -F/ '{ printf "%s", tolower($1); for (i=2; i<=NF; i++) printf "/%s", $i; print "" }')"
echo " cloning $repo_gitea -> $target_dir/$name (ref=$ref)"
if [ "$ref" = "main" ]; then
git clone --depth=1 -q "https://github.com/${repo}.git" "$target_dir/$name"
git clone --depth=1 -q "https://git.moleculesai.app/${repo_gitea}.git" "$target_dir/$name"
else
git clone --depth=1 -q --branch "$ref" "https://github.com/${repo}.git" "$target_dir/$name"
git clone --depth=1 -q --branch "$ref" "https://git.moleculesai.app/${repo_gitea}.git" "$target_dir/$name"
fi
CLONED=$((CLONED + 1))
i=$((i + 1))
+4 -8
View File
@@ -5,15 +5,11 @@
FROM golang:1.25-alpine AS builder
WORKDIR /app
# Plugin source for replace directive in go.mod
COPY molecule-ai-plugin-github-app-auth/ /plugin/
COPY workspace-server/go.mod workspace-server/go.sum ./
# Add replace directives for Docker builds:
# 1. Platform → plugin (plugin source at /plugin/)
# 2. Plugin → platform (plugin's go.mod has a relative replace that doesn't
# work in Docker; fix it to point at /app where the platform source lives)
RUN echo 'replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin' >> go.mod
RUN sed -i 's|replace github.com/Molecule-AI/molecule-monorepo/platform => .*|replace github.com/Molecule-AI/molecule-monorepo/platform => /app|' /plugin/go.mod
# github-app-auth plugin removed 2026-05-07 (#157): per-agent Gitea
# identities replaced the GitHub-App-installation token flow after the
# 2026-05-06 suspension. Pre-removal this stage COPY'd the sibling
# plugin repo + injected a `replace` directive; both are gone.
RUN go mod download
COPY workspace-server/ .
# GIT_SHA mirror of Dockerfile.tenant — see that file for the rationale.
+3 -2
View File
@@ -16,9 +16,10 @@
# ── Stage 1: Go platform binary ──────────────────────────────────────
FROM golang:1.25-alpine AS go-builder
WORKDIR /app
COPY molecule-ai-plugin-github-app-auth/ /plugin/
COPY workspace-server/go.mod workspace-server/go.sum ./
RUN echo 'replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin' >> go.mod
# github-app-auth plugin removed 2026-05-07 (#157): per-agent Gitea
# identities replaced GitHub-App tokens post-suspension. The sibling
# COPY + replace directive are gone.
RUN go mod download
COPY workspace-server/ .
+89
View File
@@ -0,0 +1,89 @@
package main
import "testing"
// TestResolveBindHost pins the precedence: BIND_ADDR explicit > dev-mode
// fail-open default of 127.0.0.1 > production-shape empty (all interfaces).
//
// Mutation-test invariant: removing the IsDevModeFailOpen() branch makes
// "no_bindaddr_devmode_unset_admin" fail (returns "" instead of "127.0.0.1").
// Removing the BIND_ADDR branch makes "explicit_bindaddr_*" cases fail.
func TestResolveBindHost(t *testing.T) {
cases := []struct {
name string
bindAddr string
adminToken string
molEnv string
want string
}{
{
name: "no_bindaddr_devmode_unset_admin",
bindAddr: "",
adminToken: "",
molEnv: "dev",
want: "127.0.0.1",
},
{
name: "no_bindaddr_devmode_unset_admin_full_word",
bindAddr: "",
adminToken: "",
molEnv: "development",
want: "127.0.0.1",
},
{
name: "no_bindaddr_admin_set_in_dev_env",
bindAddr: "",
adminToken: "secret",
molEnv: "dev",
want: "", // ADMIN_TOKEN flips IsDevModeFailOpen to false → all interfaces
},
{
name: "no_bindaddr_production_env",
bindAddr: "",
adminToken: "",
molEnv: "production",
want: "", // production is not a dev value → all interfaces
},
{
name: "no_bindaddr_unset_env",
bindAddr: "",
adminToken: "",
molEnv: "",
want: "", // unset MOLECULE_ENV → not dev → all interfaces
},
{
name: "explicit_bindaddr_loopback_overrides_devmode",
bindAddr: "127.0.0.1",
adminToken: "",
molEnv: "dev",
want: "127.0.0.1",
},
{
name: "explicit_bindaddr_wildcard_overrides_devmode_default",
bindAddr: "0.0.0.0",
adminToken: "",
molEnv: "dev",
want: "0.0.0.0",
},
{
name: "explicit_bindaddr_in_production",
bindAddr: "10.0.5.7",
adminToken: "secret",
molEnv: "production",
want: "10.0.5.7",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Setenv("BIND_ADDR", tc.bindAddr)
t.Setenv("ADMIN_TOKEN", tc.adminToken)
t.Setenv("MOLECULE_ENV", tc.molEnv)
got := resolveBindHost()
if got != tc.want {
t.Errorf("resolveBindHost() = %q, want %q (BIND_ADDR=%q ADMIN_TOKEN=%q MOLECULE_ENV=%q)",
got, tc.want, tc.bindAddr, tc.adminToken, tc.molEnv)
}
})
}
}
+45 -31
View File
@@ -19,6 +19,7 @@ import (
"github.com/Molecule-AI/molecule-monorepo/platform/internal/handlers"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/imagewatch"
memwiring "github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/wiring"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/middleware"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/pendinguploads"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/provisioner"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/registry"
@@ -29,8 +30,7 @@ import (
// External plugins — each registers EnvMutator(s) that run at workspace
// provision time. Loaded via soft-dep gates in main() so self-hosters
// without the App or without per-agent identity configured keep working.
githubappauth "github.com/Molecule-AI/molecule-ai-plugin-github-app-auth/pluginloader"
// without per-agent identity configured keep working.
ghidentity "github.com/Molecule-AI/molecule-ai-plugin-gh-identity/pluginloader"
"github.com/Molecule-AI/molecule-monorepo/platform/pkg/provisionhook"
@@ -179,12 +179,15 @@ func main() {
}
// External-plugin env mutators — each plugin contributes 0+ mutators
// onto a shared registry. Order matters: gh-identity populates
// MOLECULE_AGENT_ROLE-derived attribution env vars that downstream
// mutators and the workspace's install.sh can then read. Keep
// github-app-auth last because it fails loudly on misconfig and its
// failure mode is "no GITHUB_TOKEN" — worth surfacing after the
// cheaper mutators already ran.
// onto a shared registry. gh-identity populates MOLECULE_AGENT_ROLE-
// derived attribution env vars that the workspace's install.sh can
// then read.
//
// github-app-auth was dropped 2026-05-07 (closes #157): per-agent
// Gitea identities (this gh-identity plugin's role-derived path)
// replaced GitHub-App-installation tokens after the 2026-05-06
// suspension. Workspaces now provision with a per-persona Gitea PAT
// from .env instead of an App-rotated GITHUB_TOKEN.
envReg := provisionhook.NewRegistry()
// gh-identity plugin — per-agent attribution via env injection + gh
@@ -198,26 +201,6 @@ func main() {
log.Printf("gh-identity: registered (config file=%q)", os.Getenv("MOLECULE_GH_IDENTITY_CONFIG_FILE"))
}
// github-app-auth plugin — injects GITHUB_TOKEN + GH_TOKEN into every
// workspace env using the App's installation access token (rotates ~hourly).
// Soft-skip when GITHUB_APP_* env vars are absent so dev/self-hosters
// without an App configured keep working; fail-loud only on MISCONFIG
// (e.g. APP_ID set but key file missing), not on unset.
if os.Getenv("GITHUB_APP_ID") != "" {
if reg, err := githubappauth.BuildRegistry(); err != nil {
log.Fatalf("github-app-auth plugin: %v", err)
} else {
// Copy the plugin's mutators onto the shared registry so the
// TokenProvider probe (FirstTokenProvider) still finds them.
for _, m := range reg.Mutators() {
envReg.Register(m)
}
log.Printf("github-app-auth: registered, %d mutator(s) added to chain", reg.Len())
}
} else {
log.Println("github-app-auth: GITHUB_APP_ID unset — skipping plugin registration (agents will use any PAT from .env)")
}
wh.SetEnvMutators(envReg)
log.Printf("env-mutator chain: %v", envReg.Names())
@@ -337,15 +320,23 @@ func main() {
// Router
r := router.Setup(hub, broadcaster, prov, platformURL, configsDir, wh, channelMgr, memBundle)
// HTTP server with graceful shutdown
// HTTP server with graceful shutdown.
//
// Bind host: in dev-mode (no ADMIN_TOKEN, MOLECULE_ENV=dev|development)
// the AdminAuth chain fails open by design; pairing that with a wildcard
// bind would expose unauth /workspaces to any same-LAN peer. Default to
// loopback when fail-open is active. Operators who need LAN exposure set
// BIND_ADDR=0.0.0.0 explicitly. Production (ADMIN_TOKEN set) is unchanged.
// See molecule-core#7.
bindHost := resolveBindHost()
srv := &http.Server{
Addr: fmt.Sprintf(":%s", port),
Addr: fmt.Sprintf("%s:%s", bindHost, port),
Handler: r,
}
// Start server in goroutine
go func() {
log.Printf("Platform starting on :%s", port)
log.Printf("Platform starting on %s:%s (dev-mode-fail-open=%v)", bindHost, port, middleware.IsDevModeFailOpen())
if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
log.Fatalf("Server failed: %v", err)
}
@@ -380,6 +371,29 @@ func envOr(key, fallback string) string {
return fallback
}
// resolveBindHost picks the listener interface for the HTTP server.
//
// Precedence:
// 1. BIND_ADDR — explicit operator override (any value, including "0.0.0.0").
// 2. dev-mode fail-open active → "127.0.0.1" (loopback only).
// 3. otherwise → "" (Go binds every interface; existing prod/self-host shape).
//
// Coupling the loopback default to middleware.IsDevModeFailOpen() means the
// two safety levers — bind narrowness and auth strength — move together. A
// production deploy (ADMIN_TOKEN set) keeps binding to all interfaces because
// the auth chain is doing its job; a dev Mac (no ADMIN_TOKEN, MOLECULE_ENV=dev)
// is reachable only via loopback because the auth chain is fail-open. See
// molecule-core#7 for the original LAN exposure finding.
func resolveBindHost() string {
if v := os.Getenv("BIND_ADDR"); v != "" {
return v
}
if middleware.IsDevModeFailOpen() {
return "127.0.0.1"
}
return ""
}
func findConfigsDir() string {
candidates := []string{
"workspace-configs-templates",
-1
View File
@@ -5,7 +5,6 @@ go 1.25.0
require (
github.com/DATA-DOG/go-sqlmock v1.5.2
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d
github.com/alicebob/miniredis/v2 v2.37.0
github.com/creack/pty v1.1.24
github.com/docker/docker v28.5.2+incompatible
-2
View File
@@ -6,8 +6,6 @@ github.com/Microsoft/go-winio v0.6.2 h1:F2VQgta7ecxGYO8k3ZZz3RS8fVIXVxONVUPlNERo
github.com/Microsoft/go-winio v0.6.2/go.mod h1:yd8OoFMLzJbo9gZq8j5qaps8bJ9aShtEA8Ipt1oGCvU=
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f h1:YkLRhUg+9qr9OV9N8dG1Hj0Ml7TThHlRwh5F//oUJVs=
github.com/Molecule-AI/molecule-ai-plugin-gh-identity v0.0.0-20260424033845-4fd5ac7be30f/go.mod h1:NqdtlWZDJvpXNJRHnMkPhTKHdA1LZTNH+63TB66JSOU=
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d h1:GpYhP6FxaJZc1Ljy5/YJ9ZIVGvfOqZBmDolNr2S5x2g=
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth v0.0.0-20260421064811-7d98ae51e31d/go.mod h1:3a6LR/zd7FjR9ZwLTbytwYlWuCBsbCOVFlEg0WnoYiM=
github.com/alicebob/miniredis/v2 v2.37.0 h1:RheObYW32G1aiJIj81XVt78ZHJpHonHLHW7OLIshq68=
github.com/alicebob/miniredis/v2 v2.37.0/go.mod h1:TcL7YfarKPGDAthEtl5NBeHZfeUQj6OXMm/+iu5cLMM=
github.com/bsm/ginkgo/v2 v2.12.0 h1:Ny8MWAHyOepLGlLKYmXG4IEkioBysk6GpaRTLC8zwWs=
@@ -413,6 +413,23 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
return http.StatusOK, respBody, nil
}
// Mock-runtime short-circuit. Workspaces with runtime='mock' have
// no container, no EC2, no URL — every reply is synthesised here
// from a small canned-variant pool. Built for the "200-workspace
// mock org" demo: a CEO/VPs/Managers/ICs hierarchy that renders
// at scale on the canvas without burning real LLM credits or
// provisioning 200 EC2 instances. See mock_runtime.go for the
// full rationale + reply shape contract.
//
// Position: AFTER poll-mode (mock isn't a delivery mode, it's a
// runtime; treating poll-set-on-mock as poll matches operator
// intent if anyone ever does that), BEFORE resolveAgentURL (mock
// has no URL — going through resolveAgentURL would 404 on the
// SELECT url since the row is provisioned as NULL).
if status, respBody, handled := h.handleMockA2A(ctx, workspaceID, callerID, body, a2aMethod, logActivity); handled {
return status, respBody, nil
}
agentURL, proxyErr := h.resolveAgentURL(ctx, workspaceID)
if proxyErr != nil {
return 0, nil, proxyErr
@@ -56,10 +56,17 @@ type RefreshResult struct {
Recreated []string `json:"recreated"`
}
// TemplateImageRef returns the canonical GHCR ref for a runtime's template
// image. Single source of truth shared with imagewatch.
// TemplateImageRef returns the canonical image ref for a runtime's template,
// using the configured registry (provisioner.RegistryPrefix()) and the
// moving `:latest` tag. Single source of truth shared with imagewatch.
//
// Defaults to ghcr.io/molecule-ai/workspace-template-<runtime>:latest
// (upstream OSS). When MOLECULE_IMAGE_REGISTRY is set in the environment
// (typically the AWS ECR mirror in production), this returns the prefixed
// equivalent so admin operations and image-watch checks hit the same
// registry the provisioner pulls from.
func TemplateImageRef(runtime string) string {
return fmt.Sprintf("ghcr.io/molecule-ai/workspace-template-%s:latest", runtime)
return fmt.Sprintf("%s/workspace-template-%s:latest", provisioner.RegistryPrefix(), runtime)
}
// ghcrAuthHeader returns the base64-encoded JSON auth payload Docker's
@@ -0,0 +1,223 @@
package handlers
// mock_runtime.go — "mock" runtime: a virtual workspace that has no
// container, no EC2, no LLM, just hardcoded canned A2A replies. Built
// for the funding-demo "200-workspace mock org" so hongming can show
// investors a CEO/VPs/Managers/ICs hierarchy at scale without burning
// 200 EC2 instances or 200 Anthropic keys.
//
// Wire model:
// - org template declares `runtime: mock` on every workspace
// - createWorkspaceTree skips provisioning, sets status='online'
// directly (mirrors the `external` short-circuit, minus the URL +
// awaiting_agent dance)
// - proxyA2ARequest short-circuits on a mock-runtime target and
// returns a canned JSON-RPC reply; never calls resolveAgentURL,
// never opens an HTTP connection, never touches Docker/EC2
//
// The reply is JSON-RPC 2.0 + a2a-sdk v0.3 shape so the canvas's
// extractAgentText / extractTextsFromParts read it without any
// special-casing. We rotate over a small variant pool so a screen
// full of replies doesn't all read identical — gives the demo a bit
// of life without pretending to be a real agent.
import (
"context"
"crypto/sha1"
"database/sql"
"encoding/binary"
"encoding/json"
"errors"
"fmt"
"log"
"net/http"
"strings"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
)
// MockRuntimeName is the canonical runtime string a workspace row
// carries to opt into the canned-reply short-circuit. Kept as a const
// so the proxy's runtime-check + the org-import skip-block reference
// the same literal.
const MockRuntimeName = "mock"
// mockReplyVariants is the pool of canned strings the mock runtime
// rotates through. Picked to read like a busy-but-short reply from a
// real human in a hierarchy — a CEO would NOT respond with "On it!",
// but for the demo every node is shown to be reachable, so we lean
// into the variety. Variant selection is deterministic per
// (workspaceID, request-id) pair so a screen recording replays the
// same reply for the same input.
var mockReplyVariants = []string{
"On it!",
"Got it, on it now.",
"On it, boss.",
"Working on it.",
"Acknowledged — on it.",
"On it, will report back.",
"Roger that, on it.",
"Copy that. On it.",
"On it — ETA shortly.",
"On it. Standby for update.",
}
// pickMockReply returns a canned reply for the given workspaceID +
// requestID. Deterministic so the same (workspace, message-id) pair
// always picks the same variant — useful for screen recordings and
// flake-free e2e snapshots. Falls back to variant[0] if the inputs
// are empty.
func pickMockReply(workspaceID, requestID string) string {
if len(mockReplyVariants) == 0 {
return "On it!"
}
if workspaceID == "" && requestID == "" {
return mockReplyVariants[0]
}
h := sha1.Sum([]byte(workspaceID + ":" + requestID))
idx := int(binary.BigEndian.Uint32(h[0:4]) % uint32(len(mockReplyVariants)))
return mockReplyVariants[idx]
}
// lookupRuntime returns the workspace's runtime string. Empty when the
// row is missing / DB hiccup so callers fall through to the existing
// dispatch path (which will then 404 / 502 normally). Fail-open here
// because a transient DB error must not silently flip a real workspace
// into mock-mode and start handing out canned replies in place of
// genuine agent traffic.
func lookupRuntime(ctx context.Context, workspaceID string) string {
var runtime sql.NullString
err := db.DB.QueryRowContext(ctx,
`SELECT runtime FROM workspaces WHERE id = $1`, workspaceID,
).Scan(&runtime)
if err != nil {
if !errors.Is(err, sql.ErrNoRows) {
log.Printf("ProxyA2A: lookupRuntime(%s) failed (%v) — falling through to dispatch path", workspaceID, err)
}
return ""
}
if !runtime.Valid {
return ""
}
return runtime.String
}
// buildMockA2AResponse synthesises a JSON-RPC 2.0 success envelope that
// matches the a2a-sdk v0.3 reply shape the canvas's extractAgentText
// already understands: `{result: {parts: [{kind: "text", text: ...}]}}`.
// `requestID` is the JSON-RPC `id` of the inbound request — A2A
// implementations echo it on the reply so callers can correlate. We
// extract it from the normalized payload in the caller and pass it in
// here so this function stays JSON-only (no payload parsing).
//
// Returns marshalled bytes ready to write straight to the HTTP body.
// Marshal failure is logged + a tiny fallback envelope returned, since
// failing the whole request because of a JSON encoding hiccup on a
// constant-shaped payload would defeat the "mock always works" guarantee.
func buildMockA2AResponse(workspaceID, requestID, replyText string) []byte {
if requestID == "" {
requestID = uuid.New().String()
}
envelope := map[string]any{
"jsonrpc": "2.0",
"id": requestID,
"result": map[string]any{
"parts": []map[string]any{
{"kind": "text", "text": replyText},
},
},
}
out, err := json.Marshal(envelope)
if err != nil {
log.Printf("ProxyA2A: mock-runtime response marshal failed for %s: %v — emitting fallback", workspaceID, err)
// Hand-rolled minimal envelope. Safe because every value is a
// hardcoded constant string with no characters that need
// escaping in a JSON string literal.
fallback := fmt.Sprintf(
`{"jsonrpc":"2.0","id":%q,"result":{"parts":[{"kind":"text","text":%q}]}}`,
requestID, replyText,
)
return []byte(fallback)
}
return out
}
// extractRequestID pulls the JSON-RPC `id` out of an already-normalized
// A2A payload. Returns "" when the field is absent or not a string —
// caller substitutes a fresh UUID. Tolerant of every shape
// normalizeA2APayload could produce.
func extractRequestID(body []byte) string {
var top map[string]json.RawMessage
if err := json.Unmarshal(body, &top); err != nil {
return ""
}
raw, ok := top["id"]
if !ok {
return ""
}
var s string
if json.Unmarshal(raw, &s) == nil {
return s
}
// JSON-RPC permits numeric IDs too; canvas issues UUIDs but be
// defensive against alternative SDKs.
var n json.Number
if json.Unmarshal(raw, &n) == nil {
return n.String()
}
return ""
}
// handleMockA2A is the proxy short-circuit for mock-runtime workspaces.
// Returns (status, body, true) when the target is mock — caller writes
// the response and returns. Returns (_, _, false) when the target is
// not mock — caller continues to the real dispatch path.
//
// Side-effects: writes a synthetic activity_logs row via logA2ASuccess
// when logActivity is true so the canvas's "Agent Comms" tab shows the
// mock reply in the trace alongside real-agent traffic. Without this
// the demo would render messages on the canvas chat panel but a peer
// node clicking through to its activity tab would see an empty list.
func (h *WorkspaceHandler) handleMockA2A(ctx context.Context, workspaceID, callerID string, body []byte, a2aMethod string, logActivity bool) (int, []byte, bool) {
if lookupRuntime(ctx, workspaceID) != MockRuntimeName {
return 0, nil, false
}
requestID := extractRequestID(body)
replyText := pickMockReply(workspaceID, requestID)
respBody := buildMockA2AResponse(workspaceID, requestID, replyText)
// Tiny artificial delay so the canvas chat UI has time to render
// the user's outgoing bubble before the agent reply appears.
// Without it the reply lands the same animation frame and feels
// robotic. 80ms is too fast to look "real" but masks the React
// double-render race that drops the user bubble entirely on slow
// machines (observed locally on M1 Air, 2026-05-07). Below 200ms
// keeps a 200-node demo snappy when investors fan out 30 messages
// at once.
time.Sleep(80 * time.Millisecond)
if logActivity {
// Reuse the existing success-logger so the activity feed shape
// is identical to a real agent reply. Status 200 + duration 0
// is the "synthesised reply" marker; activity_logs.duration_ms
// being 0 is harmless (real fast paths can hit 0 too).
h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, http.StatusOK, 0)
}
return http.StatusOK, respBody, true
}
// IsMockRuntime is a small public helper for callers outside this
// package (tests, the org importer) that need to ask the question
// without depending on the unexported constant. Trims + lower-cases
// so a typoed YAML cell like " Mock " still resolves correctly.
func IsMockRuntime(runtime string) bool {
return strings.EqualFold(strings.TrimSpace(runtime), MockRuntimeName)
}
// gin import is unused at file scope but kept as a tag so a future
// addition of a thin HTTP handler (e.g. POST /workspaces/:id/mock/replies
// for an admin-set custom reply pool) doesn't need an import re-order.
var _ = gin.H{}
@@ -0,0 +1,266 @@
package handlers
// mock_runtime_test.go — locks the contract for the mock-runtime
// short-circuit added for the funding-demo "200-workspace mock org"
// template. Three invariants:
//
// 1. ProxyA2A on a workspace with runtime='mock' must return 200
// with a JSON-RPC reply containing one text part. NO HTTP
// dispatch, NO resolveAgentURL DB read (mock workspaces have
// no URL — that read would 404 and break the demo).
//
// 2. The reply text must be one of the canned variants and must be
// deterministic for a given (workspace_id, request_id) pair so
// screen recordings replay identically.
//
// 3. Workspaces with runtime != 'mock' must NOT be affected — the
// mock check fails fast and falls through to the existing
// dispatch path. Same kind of regression guard the poll-mode
// tests carry.
import (
"bytes"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// TestProxyA2A_MockRuntime_ReturnsCannedReply is the happy-path
// contract. A workspace flagged runtime='mock' must:
// - return 200 with JSON-RPC envelope {result:{parts:[{kind:text,text:...}]}}
// - not dispatch HTTP (no SELECT url SQL expected)
// - reply text is one of mockReplyVariants
func TestProxyA2A_MockRuntime_ReturnsCannedReply(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
const wsID = "ws-mock-canned"
// Budget check fires before runtime lookup (same as the poll-mode
// short-circuit) — keeps mock workspaces honest if a tenant ever
// sets a budget on one. Unlikely on a demo, but the guard stays
// uniform so future "monthly_spend on mock = 0" assertions don't
// drift.
expectBudgetCheck(mock, wsID)
// lookupDeliveryMode runs first — return push so the poll
// short-circuit doesn't fire and we hit the mock check.
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))
// lookupRuntime SELECT — returns 'mock', triggering the canned-reply
// short-circuit. CRITICAL: NO ExpectQuery for `SELECT url, status
// FROM workspaces` (resolveAgentURL's query). If the short-circuit
// fails to fire, sqlmock will surface "unexpected query" on the URL
// SELECT and the test fails loudly — that's the dispatch-leak detector.
mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("mock"))
// Activity log: logA2ASuccess writes the synthetic reply to
// activity_logs so the canvas's Agent Comms tab shows it alongside
// real-agent traffic.
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: wsID}}
body := `{"jsonrpc":"2.0","id":"req-mock-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hello mock"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.ProxyA2A(c)
// logA2ASuccess fires async — give it a moment to settle so
// ExpectationsWereMet doesn't flake.
time.Sleep(200 * time.Millisecond)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response is not valid JSON: %v", err)
}
if resp["jsonrpc"] != "2.0" {
t.Errorf("response.jsonrpc = %v, want 2.0", resp["jsonrpc"])
}
if resp["id"] != "req-mock-1" {
t.Errorf("response.id = %v, want %q (echoed from request)", resp["id"], "req-mock-1")
}
result, _ := resp["result"].(map[string]interface{})
if result == nil {
t.Fatalf("response.result missing or wrong type: %v", resp["result"])
}
parts, _ := result["parts"].([]interface{})
if len(parts) != 1 {
t.Fatalf("expected exactly one part, got %d: %v", len(parts), parts)
}
part, _ := parts[0].(map[string]interface{})
if part["kind"] != "text" {
t.Errorf("part.kind = %v, want text", part["kind"])
}
text, _ := part["text"].(string)
if text == "" {
t.Error("part.text is empty — canned reply not populated")
}
// Reply must be one of the variants.
matched := false
for _, v := range mockReplyVariants {
if v == text {
matched = true
break
}
}
if !matched {
t.Errorf("reply text %q is not in mockReplyVariants", text)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestProxyA2A_NonMockRuntime_NoShortCircuit verifies the symmetric
// contract: a workspace with a real runtime (claude-code, hermes, etc.)
// must NOT be affected by the mock check — it falls through to the
// real dispatch path. Without this guard, a regression in
// lookupRuntime could silently flip every workspace into mock-mode
// and start handing out canned replies in place of real-agent traffic.
func TestProxyA2A_NonMockRuntime_NoShortCircuit(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
const wsID = "ws-real-runtime"
dispatched := false
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
dispatched = true
w.Header().Set("Content-Type", "application/json")
w.Write([]byte(`{"jsonrpc":"2.0","id":"1","result":{"status":"ok"}}`))
}))
defer agentServer.Close()
mr.Set("ws:"+wsID+":url", agentServer.URL)
expectBudgetCheck(mock, wsID)
// poll-mode SELECT — return push so we proceed past the poll
// short-circuit.
mock.ExpectQuery("SELECT delivery_mode FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"delivery_mode"}).AddRow("push"))
// runtime SELECT — return claude-code so the mock check falls
// through.
mock.ExpectQuery("SELECT runtime FROM workspaces WHERE id").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"runtime"}).AddRow("claude-code"))
mock.ExpectExec("INSERT INTO activity_logs").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: wsID}}
body := `{"jsonrpc":"2.0","id":"real-1","method":"message/send","params":{"message":{"role":"user","parts":[{"kind":"text","text":"hi"}]}}}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+wsID+"/a2a", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.ProxyA2A(c)
time.Sleep(50 * time.Millisecond)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
if !dispatched {
t.Error("non-mock runtime: expected the agent server to receive the request, but it did not — mock short-circuit may be over-firing")
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestPickMockReply_Deterministic locks the determinism contract:
// the same (workspaceID, requestID) input must yield the same variant
// every call. Required for screen recordings + flake-free e2e
// snapshots.
func TestPickMockReply_Deterministic(t *testing.T) {
cases := []struct {
ws, req string
}{
{"ws-1", "req-A"},
{"ws-1", "req-B"},
{"ws-2", "req-A"},
{"", ""},
}
for _, tc := range cases {
first := pickMockReply(tc.ws, tc.req)
for i := 0; i < 10; i++ {
next := pickMockReply(tc.ws, tc.req)
if next != first {
t.Errorf("pickMockReply(%q,%q) is not deterministic: got %q then %q",
tc.ws, tc.req, first, next)
}
}
}
}
// TestIsMockRuntime_TrimsAndCaseInsensitive — typos and stray
// whitespace in YAML must still resolve to mock so a single
// runtime: " Mock " entry doesn't silently get dispatched.
func TestIsMockRuntime_TrimsAndCaseInsensitive(t *testing.T) {
cases := map[string]bool{
"mock": true,
"MOCK": true,
" Mock ": true,
"mocky": false,
"": false,
"external": false,
"claude-code": false,
}
for in, want := range cases {
if got := IsMockRuntime(in); got != want {
t.Errorf("IsMockRuntime(%q) = %v, want %v", in, got, want)
}
}
}
// TestBuildMockA2AResponse_EchoesRequestID — JSON-RPC requires the
// reply id to match the request id so callers can correlate. Mock
// must hold this contract or canvas's correlation logic breaks.
func TestBuildMockA2AResponse_EchoesRequestID(t *testing.T) {
out := buildMockA2AResponse("ws-x", "req-echo-7", "On it!")
var resp map[string]interface{}
if err := json.Unmarshal(out, &resp); err != nil {
t.Fatalf("response is not valid JSON: %v", err)
}
if resp["id"] != "req-echo-7" {
t.Errorf("id = %v, want req-echo-7", resp["id"])
}
if resp["jsonrpc"] != "2.0" {
t.Errorf("jsonrpc = %v, want 2.0", resp["jsonrpc"])
}
result, _ := resp["result"].(map[string]interface{})
parts, _ := result["parts"].([]interface{})
if len(parts) != 1 {
t.Fatalf("expected 1 part, got %d", len(parts))
}
p, _ := parts[0].(map[string]interface{})
if p["text"] != "On it!" {
t.Errorf("part.text = %v, want On it!", p["text"])
}
}
@@ -250,6 +250,21 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
"name": ws.Name, "external": true,
})
} else if IsMockRuntime(runtime) {
// Mock-runtime workspaces have no container, no EC2, no URL —
// the proxyA2ARequest short-circuit synthesises every reply
// from a canned variant pool (see mock_runtime.go). Status
// goes straight to 'online' so the canvas renders the node
// as reachable + the chat tab's send button is enabled. No
// URL is set; the proxy never tries to resolve one for mock
// runtimes. Built for the funding-demo "200-workspace mock
// org" template — visual scale without real backend cost.
if _, err := db.DB.ExecContext(ctx, `UPDATE workspaces SET status = $1 WHERE id = $2`, models.StatusOnline, id); err != nil {
log.Printf("Org import: mock workspace status update failed for %s: %v", ws.Name, err)
}
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventWorkspaceOnline), id, map[string]interface{}{
"name": ws.Name, "mock": true, "runtime": runtime,
})
} else if h.workspace.HasProvisioner() {
// Provision container — either backend (CP for SaaS, local Docker
// for self-hosted) is fine. Pre-2026-05-05 this gate was
@@ -675,7 +690,23 @@ func (h *OrgHandler) recurseChildrenForImport(ws OrgWorkspace, parentID string,
if err := h.createWorkspaceTree(child, &parentID, childAbsX, childAbsY, slotX, slotY, defaults, orgBaseDir, results, provisionSem); err != nil {
return err
}
time.Sleep(workspaceCreatePacingMs * time.Millisecond)
// Pacing exists to throttle Docker container-spawn thundering
// during a self-hosted import. Mock-runtime children spawn no
// container — no Docker pressure, no LLM bursts, just DB
// inserts + a broadcast. Skipping the 2s sleep collapses a
// 200-workspace mock-org import from ~7min → ~5s, which is
// the difference between a snappy demo and a "did it freeze?"
// staring contest. Real (containerful) runtimes still pace.
// Inheritance: if the child itself doesn't declare a runtime,
// fall back to defaults.runtime — the org template sets
// runtime: mock once at the org level, not on every IC node.
childRuntime := child.Runtime
if childRuntime == "" {
childRuntime = defaults.Runtime
}
if !IsMockRuntime(childRuntime) {
time.Sleep(workspaceCreatePacingMs * time.Millisecond)
}
}
return nil
}
@@ -31,11 +31,25 @@ import (
// tests pin the helper's three observable behaviors plus an AST gate
// that catches future re-introductions of the un-checked INSERT.
// lookupChildSQLRE anchors the sqlmock ExpectQuery on every load-bearing
// token of lookupExistingChild's SELECT (org_import.go:639-645). A loose
// substring match (the prior shape, just `SELECT id FROM workspaces`)
// would silent-pass a regression that drops `IS NOT DISTINCT FROM`
// (breaks NULL-parent matching), drops `parent_id` entirely (hijacks
// siblings of the same name across different parents), or drops the
// `status != 'removed'` filter (blocks re-import after Collapse).
// RFC #2872 Important-2.
//
// The four anchored tokens are exactly the predicates the bug shapes
// would tamper with. Whitespace is `\s+` so a future formatter pass
// doesn't churn this string.
const lookupChildSQLRE = `(?s)SELECT id FROM workspaces\s+WHERE name = \$1\s+AND parent_id IS NOT DISTINCT FROM \$2\s+AND status != 'removed'`
func TestLookupExistingChild_NotFound_ReturnsFalseNoError(t *testing.T) {
mock := setupTestDB(t)
// 0-row result → driver returns sql.ErrNoRows on Scan.
parent := "parent-1"
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnRows(sqlmock.NewRows([]string{"id"}))
@@ -56,7 +70,7 @@ func TestLookupExistingChild_NotFound_ReturnsFalseNoError(t *testing.T) {
func TestLookupExistingChild_Found_ReturnsIDAndTrue(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-existing-uuid"))
@@ -79,7 +93,7 @@ func TestLookupExistingChild_NilParent_MatchesRoot(t *testing.T) {
// a plain `=` would never match a NULL row. Pin that roots
// (parent_id=NULL) are still found by the lookup.
mock := setupTestDB(t)
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("RootAgent", (*string)(nil)).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow("ws-root-uuid"))
@@ -102,7 +116,7 @@ func TestLookupExistingChild_DBError_Propagates(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
connFail := errors.New("simulated postgres unavailable")
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnError(connFail)
@@ -137,7 +151,7 @@ func TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
wrapped := fmt.Errorf("driver-wrapped: %w", sql.ErrNoRows)
mock.ExpectQuery(`SELECT id FROM workspaces`).
mock.ExpectQuery(lookupChildSQLRE).
WithArgs("Alpha", &parent).
WillReturnError(wrapped)
+33 -6
View File
@@ -4,6 +4,7 @@ import (
"bytes"
"context"
"io"
"log"
"os"
"path/filepath"
"strings"
@@ -177,16 +178,42 @@ func strDefault(m map[string]interface{}, key, fallback string) string {
return fallback
}
// findRunningContainer returns the live container name for workspaceID, or ""
// when the container is genuinely not running OR the daemon errored
// transiently. Routed through provisioner.RunningContainerName as the SSOT
// (molecule-core#10) so this handler agrees with healthsweep on the same
// inputs. Transient daemon errors are logged distinctly so triage doesn't
// confuse a flaky daemon with a stopped container.
func (h *PluginsHandler) findRunningContainer(ctx context.Context, workspaceID string) string {
if h.docker == nil {
name, err := provisioner.RunningContainerName(ctx, h.docker, workspaceID)
if err != nil {
log.Printf("plugins: docker inspect transient error for %s: %v (treating as not-running for this request)", workspaceID, err)
return ""
}
name := provisioner.ContainerName(workspaceID)
info, err := h.docker.ContainerInspect(ctx, name)
if err == nil && info.State.Running {
return name
return name
}
// isExternalRuntime reports whether the workspace's runtime is the
// `external` (remote-pull) shape introduced in Phase 30. External
// workspaces have no local container — `POST /plugins` (push-install via
// docker exec) doesn't apply to them; they pull via the download endpoint
// instead. Returns false (allow-install) if the lookup is unwired or
// errors — failing open here is safe because the downstream
// findRunningContainer step still gates on a real container being there.
//
// Background — molecule-core#10: without this check, external workspaces
// fall through to findRunningContainer's NotFound path and return a
// misleading 503 "container not running" instead of a clear "use the
// pull endpoint" message.
func (h *PluginsHandler) isExternalRuntime(workspaceID string) bool {
if h.runtimeLookup == nil {
return false
}
return ""
runtime, err := h.runtimeLookup(workspaceID)
if err != nil {
return false
}
return runtime == "external"
}
func (h *PluginsHandler) execAsRoot(ctx context.Context, containerName string, cmd []string) (string, error) {
@@ -0,0 +1,176 @@
package handlers
import (
"go/ast"
"go/parser"
"go/token"
"strings"
"testing"
)
// TestFindRunningContainer_RoutesThroughProvisionerSSOT is a behavior-based
// AST gate: it pins the invariant that PluginsHandler.findRunningContainer
// MUST go through provisioner.RunningContainerName for its is-running check,
// instead of carrying its own copy of cli.ContainerInspect logic.
//
// Background — molecule-core#10: a parallel impl of "is the workspace's
// container running" used to live in plugins.go. It drifted from the
// canonical impl in healthsweep (which goes through Provisioner.IsRunning
// → RunningContainerName) on edge cases like "transient daemon error" —
// the duplicate would 503 with a misleading message while healthsweep
// correctly stayed defensive. Consolidating onto RunningContainerName as
// the SSOT prevents any future copy from re-introducing that drift.
//
// Mutation invariant: if a future PR replaces the provisioner call with
// `h.docker.ContainerInspect(...)` directly, this test fails. That's the
// signal to either (a) extend RunningContainerName's contract OR (b)
// document why this call site needs to differ. Either way: the drift
// gets a reviewer's attention instead of shipping silently.
func TestFindRunningContainer_RoutesThroughProvisionerSSOT(t *testing.T) {
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "plugins.go", nil, parser.ParseComments)
if err != nil {
t.Fatalf("parse plugins.go: %v", err)
}
var fn *ast.FuncDecl
ast.Inspect(file, func(n ast.Node) bool {
f, ok := n.(*ast.FuncDecl)
if !ok || f.Name.Name != "findRunningContainer" {
return true
}
// Confirm receiver is *PluginsHandler so we don't pick up an unrelated
// helper of the same name. ast.Recv is a FieldList — receivers carry
// at most one field.
if f.Recv == nil || len(f.Recv.List) == 0 {
return true
}
fn = f
return false
})
if fn == nil {
t.Fatal("findRunningContainer not found in plugins.go — was it renamed? update this test or the SSOT routing assumption")
}
var (
callsRunningContainerName bool
callsContainerInspectRaw bool
)
ast.Inspect(fn.Body, func(n ast.Node) bool {
call, ok := n.(*ast.CallExpr)
if !ok {
return true
}
sel, ok := call.Fun.(*ast.SelectorExpr)
if !ok {
return true
}
// Pkg.Func form: provisioner.RunningContainerName(...)
if pkgIdent, ok := sel.X.(*ast.Ident); ok {
if pkgIdent.Name == "provisioner" && sel.Sel.Name == "RunningContainerName" {
callsRunningContainerName = true
}
}
// Receiver-then-method form: h.docker.ContainerInspect(...) /
// p.cli.ContainerInspect(...) — anything ending in
// .ContainerInspect that's NOT routed through provisioner.
if sel.Sel.Name == "ContainerInspect" {
callsContainerInspectRaw = true
}
return true
})
if !callsRunningContainerName {
t.Errorf(
"findRunningContainer must call provisioner.RunningContainerName for the SSOT inspect — see molecule-core#10. Found no such call.",
)
}
if callsContainerInspectRaw {
t.Errorf(
"findRunningContainer carries a direct ContainerInspect call. This is the parallel-impl drift molecule-core#10 fixed. " +
"Either route through provisioner.RunningContainerName OR — if a new use case truly needs a different inspect — extend RunningContainerName's contract first and update this gate to allow the specific delta.",
)
}
}
// TestProvisionerIsRunning_RoutesThroughRunningContainerName mirrors the
// gate above but for the OTHER consumer of the SSOT — Provisioner.IsRunning
// (called by healthsweep). If a future refactor makes IsRunning carry its
// own ContainerInspect again, the two consumers' edge-case behaviors will
// silently drift. Keep them yoked.
func TestProvisionerIsRunning_RoutesThroughRunningContainerName(t *testing.T) {
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "../provisioner/provisioner.go", nil, parser.ParseComments)
if err != nil {
t.Fatalf("parse provisioner.go: %v", err)
}
var fn *ast.FuncDecl
ast.Inspect(file, func(n ast.Node) bool {
f, ok := n.(*ast.FuncDecl)
if !ok || f.Name.Name != "IsRunning" || f.Recv == nil {
return true
}
// The receiver type must be *Provisioner specifically. CPProvisioner
// has its own IsRunning that talks HTTP to the controlplane and is
// out of scope for this gate.
if !receiverIs(f, "Provisioner") {
return true
}
fn = f
return false
})
if fn == nil {
t.Fatal("Provisioner.IsRunning not found — was it renamed? update this test")
}
var (
callsRunningContainerName bool
callsContainerInspectRaw bool
)
ast.Inspect(fn.Body, func(n ast.Node) bool {
call, ok := n.(*ast.CallExpr)
if !ok {
return true
}
// Same-package call: bare identifier (e.g. RunningContainerName(...)).
if id, ok := call.Fun.(*ast.Ident); ok && id.Name == "RunningContainerName" {
callsRunningContainerName = true
return true
}
// Selector call: pkg.Func (e.g. provisioner.RunningContainerName)
// OR recv.Method (e.g. p.cli.ContainerInspect).
sel, ok := call.Fun.(*ast.SelectorExpr)
if !ok {
return true
}
switch sel.Sel.Name {
case "RunningContainerName":
callsRunningContainerName = true
case "ContainerInspect":
callsContainerInspectRaw = true
}
return true
})
if !callsRunningContainerName {
t.Errorf("Provisioner.IsRunning must call RunningContainerName for the SSOT inspect — see molecule-core#10")
}
if callsContainerInspectRaw {
t.Errorf("Provisioner.IsRunning carries a direct ContainerInspect call; route through RunningContainerName instead")
}
}
// receiverIs reports whether fn's receiver is `*<typeName>` or `<typeName>`.
func receiverIs(fn *ast.FuncDecl, typeName string) bool {
if fn.Recv == nil || len(fn.Recv.List) == 0 {
return false
}
expr := fn.Recv.List[0].Type
if star, ok := expr.(*ast.StarExpr); ok {
expr = star.X
}
id, ok := expr.(*ast.Ident)
return ok && strings.EqualFold(id.Name, typeName)
}
@@ -32,6 +32,18 @@ import (
// inside the workspace at startup.
func (h *PluginsHandler) Install(c *gin.Context) {
workspaceID := c.Param("id")
// External-runtime guard (molecule-core#10): push-install via docker
// exec is meaningless for `runtime='external'` workspaces — they have
// no local container. Reject early with a hint pointing at the
// pull-mode endpoint, instead of falling through to a misleading
// "container not running" 503 from findRunningContainer.
if h.isExternalRuntime(workspaceID) {
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": "plugin install via push is not supported for external runtimes",
"hint": "external workspaces pull plugins via GET /workspaces/:id/plugins/:name/download",
})
return
}
// Cap the JSON body so a pathological POST can't exhaust parser memory.
bodyMax := envx.Int64("PLUGIN_INSTALL_BODY_MAX_BYTES", defaultInstallBodyMaxBytes)
c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, bodyMax)
@@ -93,6 +105,16 @@ func (h *PluginsHandler) Uninstall(c *gin.Context) {
pluginName := c.Param("name")
ctx := c.Request.Context()
// Mirror Install's external-runtime guard (molecule-core#10) so the
// two endpoints reject the same shape with the same message.
if h.isExternalRuntime(workspaceID) {
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": "plugin uninstall via docker exec is not supported for external runtimes",
"hint": "external workspaces manage their own plugin directory; remove it locally",
})
return
}
if err := validatePluginName(pluginName); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid plugin name"})
return
@@ -0,0 +1,176 @@
package handlers
import (
"bytes"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
)
// TestPluginInstall_ExternalRuntime_Returns422 — molecule-core#10.
// Install on a `runtime='external'` workspace must NOT fall through to
// findRunningContainer (which would 503 with a misleading "container not
// running"). It must return 422 with a hint pointing at the pull-mode
// download endpoint.
func TestPluginInstall_ExternalRuntime_Returns422(t *testing.T) {
h := NewPluginsHandler(t.TempDir(), nil, nil).
WithRuntimeLookup(func(workspaceID string) (string, error) {
return "external", nil
})
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ba1789b0-4d21-4f4f-a878-fa226bf77cf5"}}
c.Request = httptest.NewRequest(
"POST",
"/workspaces/ba1789b0-4d21-4f4f-a878-fa226bf77cf5/plugins",
bytes.NewBufferString(`{"source":"local://my-plugin"}`),
)
c.Request.Header.Set("Content-Type", "application/json")
h.Install(c)
if w.Code != http.StatusUnprocessableEntity {
t.Errorf("expected 422 (Unprocessable Entity) for runtime='external', got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "external runtimes") {
t.Errorf("expected error body to mention 'external runtimes', got: %s", w.Body.String())
}
if !strings.Contains(w.Body.String(), "download") {
t.Errorf("expected error body to point at the download endpoint, got: %s", w.Body.String())
}
}
// TestPluginUninstall_ExternalRuntime_Returns422 — symmetric guard on the
// uninstall path (DELETE /workspaces/:id/plugins/:name). External
// workspaces manage their own plugin directory locally; the platform
// can't docker-exec into them.
func TestPluginUninstall_ExternalRuntime_Returns422(t *testing.T) {
h := NewPluginsHandler(t.TempDir(), nil, nil).
WithRuntimeLookup(func(workspaceID string) (string, error) {
return "external", nil
})
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "ba1789b0-4d21-4f4f-a878-fa226bf77cf5"},
{Key: "name", Value: "my-plugin"},
}
c.Request = httptest.NewRequest(
"DELETE",
"/workspaces/ba1789b0-4d21-4f4f-a878-fa226bf77cf5/plugins/my-plugin",
nil,
)
h.Uninstall(c)
if w.Code != http.StatusUnprocessableEntity {
t.Errorf("expected 422 for runtime='external', got %d: %s", w.Code, w.Body.String())
}
if !strings.Contains(w.Body.String(), "external runtimes") {
t.Errorf("expected error body to mention 'external runtimes', got: %s", w.Body.String())
}
}
// TestPluginInstall_ContainerBackedRuntime_FallsThroughGuard — the runtime
// guard MUST NOT short-circuit container-backed runtimes. With
// `runtime='claude-code'` the install proceeds past the guard; without a
// real plugin source it'll fail downstream (here: 404 from local resolver
// because no plugin staged), which is the correct error to surface.
//
// This is the mutation-test partner: deleting the `runtime == "external"`
// check would still pass TestPluginInstall_ExternalRuntime (because Install
// would 404 instead of 422 — but the test asserts 422), and would still
// pass this test (because both pre-fix and post-fix produce 404 here).
// What this case pins is "non-external still falls through," catching
// any over-eager guard that rejects all runtimes.
func TestPluginInstall_ContainerBackedRuntime_FallsThroughGuard(t *testing.T) {
h := NewPluginsHandler(t.TempDir(), nil, nil).
WithRuntimeLookup(func(workspaceID string) (string, error) {
return "claude-code", nil
})
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "c7c28c0b-4ea5-4e75-9728-3ba860081708"}}
c.Request = httptest.NewRequest(
"POST",
"/workspaces/c7c28c0b-4ea5-4e75-9728-3ba860081708/plugins",
bytes.NewBufferString(`{"source":"local://nonexistent-plugin"}`),
)
c.Request.Header.Set("Content-Type", "application/json")
h.Install(c)
if w.Code == http.StatusUnprocessableEntity {
t.Errorf("runtime='claude-code' must fall through the external guard; got 422: %s", w.Body.String())
}
// The local resolver will fail to find the plugin → 404. Anything
// other than 422 (which would mean we mis-classified) is fine.
if w.Code != http.StatusNotFound {
t.Errorf("expected 404 (plugin not found in registry), got %d: %s", w.Code, w.Body.String())
}
}
// TestPluginInstall_NoRuntimeLookup_FailsOpen — when the runtime lookup
// is unwired (test fixtures, niche deploy shapes) the guard MUST default
// to allowing the install attempt. The downstream findRunningContainer
// step still gates on a real container, so failing open here doesn't
// expose a bypass — it just preserves backwards-compat with deployments
// that haven't wired the lookup.
func TestPluginInstall_NoRuntimeLookup_FailsOpen(t *testing.T) {
h := NewPluginsHandler(t.TempDir(), nil, nil) // NO WithRuntimeLookup
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-no-lookup"}}
c.Request = httptest.NewRequest(
"POST",
"/workspaces/ws-no-lookup/plugins",
bytes.NewBufferString(`{"source":"local://nonexistent"}`),
)
c.Request.Header.Set("Content-Type", "application/json")
h.Install(c)
if w.Code == http.StatusUnprocessableEntity {
t.Errorf("nil runtimeLookup must fall through (fail-open); got 422: %s", w.Body.String())
}
}
// TestPluginInstall_RuntimeLookupErrors_FailsOpen — same fail-open story
// for transient DB errors in the lookup. We don't want a momentary
// Postgres hiccup to flip every plugin install into a 422.
func TestPluginInstall_RuntimeLookupErrors_FailsOpen(t *testing.T) {
h := NewPluginsHandler(t.TempDir(), nil, nil).
WithRuntimeLookup(func(workspaceID string) (string, error) {
return "", errFakeDB
})
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "ws-db-flake"}}
c.Request = httptest.NewRequest(
"POST",
"/workspaces/ws-db-flake/plugins",
bytes.NewBufferString(`{"source":"local://nonexistent"}`),
)
c.Request.Header.Set("Content-Type", "application/json")
h.Install(c)
if w.Code == http.StatusUnprocessableEntity {
t.Errorf("runtimeLookup error must fall through (fail-open); got 422: %s", w.Body.String())
}
}
// errFakeDB is a sentinel for the fail-open lookup-error case.
var errFakeDB = &fakeError{msg: "synthetic db error"}
type fakeError struct{ msg string }
func (e *fakeError) Error() string { return e.msg }
@@ -78,6 +78,10 @@ var fallbackRuntimes = map[string]struct{}{
"openclaw": {},
"codex": {},
"external": {},
// mock — virtual workspace with hardcoded canned A2A replies.
// No container, no EC2, no template repo. See mock_runtime.go
// for the full rationale (200-workspace funding-demo org).
"mock": {},
}
// loadRuntimesFromManifest builds the runtime allowlist from
@@ -104,6 +108,10 @@ func loadRuntimesFromManifest(path string) (map[string]struct{}, error) {
// the manifest doesn't know about it. Injected here so we
// don't need a special-case in every caller.
"external": {},
// mock is ALWAYS available for the same reason as external:
// virtual workspace, no template repo, never spawns a
// container. See mock_runtime.go.
"mock": {},
}
for _, e := range m.WorkspaceTemplates {
name := strings.TrimSpace(e.Name)
@@ -112,6 +112,19 @@ func (h *WorkspaceHandler) Restart(c *gin.Context) {
return
}
// runtime=mock: virtual workspace with canned A2A replies. No
// container, no EC2, no provisioning state to recycle. Mirror
// the external no-op so the canvas's Restart button doesn't
// silently fail or leak through to the (template-less) provisioner.
if dbRuntime == "mock" {
c.JSON(http.StatusOK, gin.H{
"status": "noop",
"runtime": "mock",
"message": "mock workspaces have no container — restart is a no-op",
})
return
}
// SaaS mode: cpProv handles workspace EC2 lifecycle. Self-hosted mode:
// provisioner handles local Docker containers. At least one must be
// available — previously only `provisioner` was checked, which broke
@@ -532,7 +545,9 @@ func (h *WorkspaceHandler) runRestartCycle(workspaceID string) {
}
// Don't auto-restart external workspaces (no Docker container)
if dbRuntime == "external" {
// or mock workspaces (no container, every reply is canned —
// see workspace-server/internal/handlers/mock_runtime.go).
if dbRuntime == "external" || dbRuntime == "mock" {
return
}
@@ -35,36 +35,37 @@ import (
// drift-risk #6.
var ErrNoBackend = errors.New("provisioner: no backend configured (zero-valued receiver)")
// RuntimeImages maps runtime names to their Docker image refs on GHCR.
// RuntimeImages maps runtime names to their Docker image refs.
// Each standalone template repo publishes its image via the reusable
// publish-template-image workflow in molecule-ci on every main merge.
// The provisioner pulls these on demand (see ensureImageLocal) — no
// pre-build step on the tenant host.
//
// The registry prefix is determined by RegistryPrefix() in registry.go;
// defaults to ghcr.io/molecule-ai (upstream OSS) and is overridden via the
// MOLECULE_IMAGE_REGISTRY env var in production tenants that mirror to
// AWS ECR or another registry. The map is computed at package init and
// captures whatever prefix was active then.
//
// Legacy local-build path (`docker build -t workspace-template:<runtime>`
// via scripts/build-images.sh) is still supported for development:
// when a bare `workspace-template:<runtime>` image is present locally,
// Docker's image resolver matches it before any pull is attempted. Set
// the env var WORKSPACE_IMAGE_LOCAL_OVERRIDE=1 (enforced by callers) to
// short-circuit pulls entirely if needed.
var RuntimeImages = map[string]string{
"langgraph": "ghcr.io/molecule-ai/workspace-template-langgraph:latest",
"claude-code": "ghcr.io/molecule-ai/workspace-template-claude-code:latest",
"openclaw": "ghcr.io/molecule-ai/workspace-template-openclaw:latest",
"deepagents": "ghcr.io/molecule-ai/workspace-template-deepagents:latest",
"crewai": "ghcr.io/molecule-ai/workspace-template-crewai:latest",
"autogen": "ghcr.io/molecule-ai/workspace-template-autogen:latest",
"hermes": "ghcr.io/molecule-ai/workspace-template-hermes:latest", // Hermes (Nous Research) — real hermes-agent behind A2A bridge
"gemini-cli": "ghcr.io/molecule-ai/workspace-template-gemini-cli:latest", // Google Gemini CLI
}
var RuntimeImages = computeRuntimeImages()
// DefaultImage is the fallback workspace Docker image (langgraph is the
// most common runtime). Computed via RegistryPrefix() so the prefix
// override applies to the fallback path too.
//
// NOTE: Every runtime MUST have an entry in knownRuntimes (registry.go).
// If a runtime is missing, it falls back to DefaultImage which may have
// wrong deps. Add new runtimes to knownRuntimes AND create the standalone
// template repo.
var DefaultImage = RuntimeImage(defaultRuntime)
const (
// DefaultImage is the fallback workspace Docker image (langgraph is the most common runtime).
DefaultImage = "ghcr.io/molecule-ai/workspace-template-langgraph:latest"
// NOTE: Every runtime MUST have an entry in RuntimeImages above. If a runtime is missing,
// it falls back to DefaultImage which may have wrong deps. Add new runtimes to both
// RuntimeImages AND create the standalone template repo.
// DefaultNetwork is the Docker network workspaces join.
DefaultNetwork = "molecule-monorepo-net"
@@ -1072,18 +1073,53 @@ func (p *Provisioner) IsRunning(ctx context.Context, workspaceID string) (bool,
if p == nil || p.cli == nil {
return false, ErrNoBackend
}
name := ContainerName(workspaceID)
info, err := p.cli.ContainerInspect(ctx, name)
name, err := RunningContainerName(ctx, p.cli, workspaceID)
if err != nil {
if isContainerNotFound(err) {
return false, nil
}
// Transient daemon error: caller treats !running as dead + restarts.
// Returning true + the underlying error preserves the error for
// metrics/logging without triggering the destructive path.
return true, err
}
return info.State.Running, nil
return name != "", nil
}
// RunningContainerName returns the container name for workspaceID iff the
// container exists AND is in the Running state. Single source of truth for
// "what live container should I exec into for this workspace?" — used by
// both Provisioner.IsRunning (healthsweep) and the plugins handler.
//
// Distinguishes three outcomes so callers can pick their own policy:
//
// - ("ws-<id>", nil): container is running. Caller can exec into it.
// - ("", nil): container does not exist OR exists but is stopped
// (NotFound, Exited, Created, Restarting…). Caller
// should treat as a definitive "not running."
// - ("", err): transient daemon error (timeout, socket EOF, ctx
// cancel). Caller should NOT infer "not running" —
// this could be a flaky daemon under load. Decide
// per-callsite whether to fail soft or hard.
//
// Background — molecule-core#10: the plugins handler used to carry its own
// copy of this inspect logic (`findRunningContainer`) which collapsed
// transient errors into the same "" return as a genuinely-stopped container.
// That hid daemon flakes as misleading 503 "container not running" responses
// AND let the two impls drift on edge-case behavior. This is the SSOT.
func RunningContainerName(ctx context.Context, cli *client.Client, workspaceID string) (string, error) {
if cli == nil {
return "", ErrNoBackend
}
name := ContainerName(workspaceID)
info, err := cli.ContainerInspect(ctx, name)
if err != nil {
if isContainerNotFound(err) {
return "", nil
}
return "", err
}
if info.State.Running {
return name, nil
}
return "", nil
}
// isContainerNotFound returns true when the Docker client indicates the
@@ -0,0 +1,95 @@
package provisioner
import (
"fmt"
"os"
)
// defaultRegistryPrefix is the upstream OSS face for all workspace template
// images. Self-hosted Molecule deployments without the MOLECULE_IMAGE_REGISTRY
// override pull from here.
const defaultRegistryPrefix = "ghcr.io/molecule-ai"
// knownRuntimes is the canonical list of workspace template runtimes shipped
// in main. Any runtime added here MUST also have a standalone template repo
// (Molecule-AI/molecule-ai-workspace-template-<name>) and an entry in the
// publish-template-image workflow that builds it.
//
// Order matters for deterministic test snapshots; keep alphabetical.
var knownRuntimes = []string{
"autogen",
"claude-code",
"codex",
"crewai",
"deepagents",
"gemini-cli",
"hermes",
"langgraph",
"openclaw",
}
// defaultRuntime is the fallback when a workspace's config doesn't specify a
// runtime. Picked because LangGraph is the most common in our org templates
// and has the smallest "first impression" cold-start surface.
const defaultRuntime = "langgraph"
// RegistryPrefix returns the registry prefix all workspace-template image
// references should use. Defaults to ghcr.io/molecule-ai (the upstream OSS
// face) and is overridden by the MOLECULE_IMAGE_REGISTRY env var in
// production tenants where we mirror images to a private registry.
//
// The override is set at deploy time (Railway env, EC2 user-data) — never
// from user-supplied input — so the value is trusted by the time it reaches
// this code. Validation is deliberately minimal: an operator-supplied
// prefix that points at a registry the EC2 can't authenticate to will fail
// loudly at docker-pull time, which is the right blast radius.
//
// Example values:
//
// (unset) → ghcr.io/molecule-ai (OSS default)
// "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai" → AWS ECR mirror
// "git.moleculesai.app/molecule-ai" → self-hosted Gitea Container Registry (future)
//
// Auth is registry-specific and configured outside this function:
// - GHCR: GHCR_USER/GHCR_TOKEN env vars consumed by ghcrAuthHeader()
// - ECR: docker credential helper (amazon-ecr-credential-helper) configured
// in EC2 user-data; ~/.docker/config.json has credHelpers entry; the
// daemon resolves auth automatically on every pull.
func RegistryPrefix() string {
if v := os.Getenv("MOLECULE_IMAGE_REGISTRY"); v != "" {
return v
}
return defaultRegistryPrefix
}
// RuntimeImage returns the canonical image reference for the given runtime,
// using the current RegistryPrefix() and the moving `:latest` tag.
//
// For SHA-pinned references (production thin-AMI launches), the
// runtime_image_pins lookup in handlers/runtime_image_pin.go strips the
// `:latest` suffix and appends an immutable `@sha256:<digest>` from the DB.
// That code path naturally inherits any RegistryPrefix() change because it
// reads from RuntimeImages[runtime] and only re-formats the tag suffix.
//
// Returns the empty string for unknown runtimes; callers should fall through
// to DefaultImage in that case (matching legacy behavior).
func RuntimeImage(runtime string) string {
for _, r := range knownRuntimes {
if r == runtime {
return fmt.Sprintf("%s/workspace-template-%s:latest", RegistryPrefix(), runtime)
}
}
return ""
}
// computeRuntimeImages returns the {runtime: image-ref} map evaluated against
// the current RegistryPrefix(). Called at package init to populate the
// exported RuntimeImages var. Tests that flip MOLECULE_IMAGE_REGISTRY between
// expected values use this helper to rebuild the map mid-run.
func computeRuntimeImages() map[string]string {
out := make(map[string]string, len(knownRuntimes))
for _, r := range knownRuntimes {
out[r] = RuntimeImage(r)
}
return out
}
@@ -0,0 +1,140 @@
package provisioner
import (
"strings"
"testing"
)
// TestRegistryPrefix_DefaultsToGHCR pins the OSS-default behavior. If a future
// refactor accidentally drops the default, OSS users self-hosting Molecule
// would silently lose image pulls — this test should fail loudly instead.
func TestRegistryPrefix_DefaultsToGHCR(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
got := RegistryPrefix()
want := "ghcr.io/molecule-ai"
if got != want {
t.Fatalf("RegistryPrefix() = %q, want %q (default must remain GHCR for OSS users)", got, want)
}
}
// TestRegistryPrefix_RespectsEnv verifies the override path used in
// production tenants where MOLECULE_IMAGE_REGISTRY points at a private
// mirror (AWS ECR, self-hosted Harbor, etc.).
func TestRegistryPrefix_RespectsEnv(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai")
got := RegistryPrefix()
want := "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"
if got != want {
t.Fatalf("RegistryPrefix() = %q, want %q (env override path is the production cutover mechanism)", got, want)
}
}
// TestRegistryPrefix_EmptyEnvFallsBackToDefault — guard against an operator
// setting MOLECULE_IMAGE_REGISTRY="" by mistake (e.g. unset deploy variable
// becomes empty string, not literally absent). We treat "" as "use default"
// so a misconfigured env doesn't mean an empty registry prefix.
func TestRegistryPrefix_EmptyEnvFallsBackToDefault(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
if RegistryPrefix() != defaultRegistryPrefix {
t.Fatalf("empty MOLECULE_IMAGE_REGISTRY should fall back to %q, got %q", defaultRegistryPrefix, RegistryPrefix())
}
}
// TestRuntimeImage_AllKnownRuntimes — every runtime in the canonical list
// must produce a properly-formatted image ref. If a new runtime is added to
// knownRuntimes but the format changes, this catches it.
func TestRuntimeImage_AllKnownRuntimes(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
for _, r := range knownRuntimes {
got := RuntimeImage(r)
want := "ghcr.io/molecule-ai/workspace-template-" + r + ":latest"
if got != want {
t.Errorf("RuntimeImage(%q) = %q, want %q", r, got, want)
}
}
// Pin the count so adding a runtime requires explicit test acknowledgement.
if len(knownRuntimes) != 9 {
t.Errorf("knownRuntimes length = %d, want 9 (autogen, claude-code, codex, crewai, deepagents, gemini-cli, hermes, langgraph, openclaw)", len(knownRuntimes))
}
}
// TestRuntimeImage_UnknownRuntime — defensive: callers must fall back to
// DefaultImage when a runtime is unknown, never silently use the wrong
// prefix. Returning "" enforces an explicit fallback at every call site.
func TestRuntimeImage_UnknownRuntime(t *testing.T) {
for _, name := range []string{"", "nonexistent", "WORKSPACE-TEMPLATE-FAKE", "../../../etc/passwd"} {
if got := RuntimeImage(name); got != "" {
t.Errorf("RuntimeImage(%q) = %q, want empty string for unknown runtime", name, got)
}
}
}
// TestRuntimeImage_RegistryOverrideAppliesToAllRuntimes — the override
// flips ALL runtimes consistently. If a refactor accidentally hardcoded
// the prefix in some runtimes but not others (the failure mode that
// triggered this whole rollout), this test catches it.
func TestRuntimeImage_RegistryOverrideAppliesToAllRuntimes(t *testing.T) {
const ecr = "999999999999.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"
t.Setenv("MOLECULE_IMAGE_REGISTRY", ecr)
for _, r := range knownRuntimes {
got := RuntimeImage(r)
if !strings.HasPrefix(got, ecr+"/workspace-template-") {
t.Errorf("RuntimeImage(%q) = %q, must start with override prefix %q", r, got, ecr)
}
if !strings.HasSuffix(got, ":latest") {
t.Errorf("RuntimeImage(%q) = %q, must keep :latest tag suffix", r, got)
}
}
}
// TestComputeRuntimeImages_AllRuntimesPresent — the map must contain every
// known runtime. Drift between knownRuntimes and computeRuntimeImages would
// silently break the runtime → image lookup that provisioner.Start uses.
func TestComputeRuntimeImages_AllRuntimesPresent(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
m := computeRuntimeImages()
if len(m) != len(knownRuntimes) {
t.Fatalf("computeRuntimeImages() has %d entries, want %d (one per knownRuntime)", len(m), len(knownRuntimes))
}
for _, r := range knownRuntimes {
img, ok := m[r]
if !ok {
t.Errorf("computeRuntimeImages() missing runtime %q", r)
continue
}
if img == "" {
t.Errorf("computeRuntimeImages()[%q] is empty", r)
}
}
}
// TestComputeRuntimeImages_ReflectsCurrentEnv — calling computeRuntimeImages
// after env change rebuilds the map with new prefix. Tests + ops procedures
// that flip the env in-process rely on this.
func TestComputeRuntimeImages_ReflectsCurrentEnv(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
defaultMap := computeRuntimeImages()
if !strings.HasPrefix(defaultMap["claude-code"], "ghcr.io/molecule-ai/") {
t.Fatalf("default map should be GHCR-prefixed, got %q", defaultMap["claude-code"])
}
const mirror = "registry.example.com/molecule-ai"
t.Setenv("MOLECULE_IMAGE_REGISTRY", mirror)
mirrorMap := computeRuntimeImages()
if !strings.HasPrefix(mirrorMap["claude-code"], mirror+"/") {
t.Fatalf("mirror-prefixed map should start with %q, got %q", mirror, mirrorMap["claude-code"])
}
}
// TestKnownRuntimes_AlphabeticalOrder — pin the order so test snapshots
// (and human readers diffing the file) see deterministic output. Adding a
// new runtime out of alphabetical order will fail this test, which is the
// nudge to keep the file readable.
func TestKnownRuntimes_AlphabeticalOrder(t *testing.T) {
for i := 1; i < len(knownRuntimes); i++ {
if knownRuntimes[i-1] >= knownRuntimes[i] {
t.Errorf("knownRuntimes not alphabetical: %q comes before %q", knownRuntimes[i-1], knownRuntimes[i])
}
}
}
@@ -71,9 +71,15 @@ func StartHealthSweep(ctx context.Context, checker ContainerChecker, interval ti
}
func sweepOnlineWorkspaces(ctx context.Context, checker ContainerChecker, onOffline OfflineHandler) {
// Skip external workspaces (runtime='external') — they have no Docker container
// Skip external + mock workspaces — neither has a Docker container.
// external: agent runs on operator's laptop, polled via heartbeat.
// mock: virtual workspace, every reply is canned (see
// workspace-server/internal/handlers/mock_runtime.go). Both would
// false-positive as "container gone" on every sweep tick and
// auto-restart would loop forever (provisioner has no template
// for either runtime).
rows, err := db.DB.QueryContext(ctx,
`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') != 'external'`)
`SELECT id FROM workspaces WHERE status IN ('online', 'degraded') AND COALESCE(runtime, 'langgraph') NOT IN ('external', 'mock')`)
if err != nil {
log.Printf("Health sweep: query error: %v", err)
return
@@ -413,22 +413,20 @@ func sweepStaleTokensWithoutContainer(ctx context.Context, reaper OrphanReaper)
// `"5m0s"` mismatch with Postgres interval grammar; passing seconds
// as an int keeps the binding portable.
graceSeconds := int(staleTokenGrace.Seconds())
// `runtime != 'external'` is load-bearing: external workspaces have NO
// local container by design (the agent runs off-host), so the
// "no live container" predicate below would match every external
// workspace and revoke its token. The token is the off-host agent's
// only authentication credential — revoking breaks the entire
// external-runtime feature. Discovered 2026-05-03 when a fresh
// external workspace had its token silently revoked ~6 minutes after
// creation by this sweep, killing the operator's MCP heartbeat and
// inbox poll with `HTTP 401 — token may be revoked`.
// `runtime NOT IN ('external','mock')` is load-bearing: neither
// runtime has a local container, so the "no live container"
// predicate below would match every row and revoke its token.
// external: token is the off-host agent's only credential —
// revoking breaks the entire external-runtime feature
// (incident 2026-05-03). mock: same shape — no container by
// design, see workspace-server/internal/handlers/mock_runtime.go.
rows, qErr := db.DB.QueryContext(ctx, `
SELECT DISTINCT t.workspace_id::text
FROM workspace_auth_tokens t
JOIN workspaces w ON w.id = t.workspace_id
WHERE t.revoked_at IS NULL
AND w.status NOT IN ('removed', 'provisioning')
AND w.runtime != 'external'
AND w.runtime NOT IN ('external', 'mock')
AND COALESCE(t.last_used_at, t.created_at) < now() - make_interval(secs => $2)
AND (
cardinality($1::text[]) = 0
@@ -26,7 +26,7 @@ import (
// accidentally matching a future query that opens with the same column
// name OR a regression that drops one of the load-bearing predicates.
func expectStaleTokenSweepNoOp(mock sqlmock.Sqlmock) {
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}))
}
@@ -492,7 +492,7 @@ func TestSweepOnce_StaleTokenRevokeFiresWhenNoContainer(t *testing.T) {
// excludes 'external' (2026-05-03 fix — the sweep was incorrectly
// targeting external workspaces which have no container by design),
// and the staleness predicate appears in the SELECT.
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'.*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\).*COALESCE\(t\.last_used_at, t\.created_at\) < now\(\) - make_interval`).
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
AddRow(orphanedID))
@@ -548,7 +548,7 @@ func TestSweepOnce_StaleTokenRevokeFailureBailsLoop(t *testing.T) {
// Third-pass returns two stale-token workspaces; the first revoke
// errors. Loop must bail without attempting the second.
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime != 'external'`).
mock.ExpectQuery(`(?s)^\s*SELECT DISTINCT t\.workspace_id::text\s+FROM workspace_auth_tokens.*status NOT IN \('removed', 'provisioning'\).*runtime NOT IN \('external', 'mock'\)`).
WillReturnRows(sqlmock.NewRows([]string{"workspace_id"}).
AddRow("aaaa1111-0000-0000-0000-000000000000").
AddRow("bbbb2222-0000-0000-0000-000000000000"))