Compare commits

53 Commits

Author SHA1 Message Date
molecule-ai[bot] a1e40fe0d9 Merge pull request #2612 from Molecule-AI/staging
staging → main: auto-promote a8708ca
2026-05-03 11:18:27 -07:00
Hongming Wang a8708caf73 Merge pull request #2611 from Molecule-AI/fix/2488-trust-boundary-meta-fields
feat(security): trust-boundary gate non-peer_id meta fields in _build_channel_notification (#2488)
2026-05-03 18:02:10 +00:00
Hongming Wang 02ae2fd6fb feat(security): trust-boundary gate non-peer_id meta fields in _build_channel_notification (#2488)
Defense-in-depth follow-up to #2481 (peer_id trust-boundary gate).
Same XML-attribute injection vector applies to the four other meta
fields rendered as agent-context attrs in the <channel> tag:

  <channel kind="..." method="..." activity_id="..." ts="..." source="molecule">

Each field is now passed through a closed-set / shape-validate gate:

- kind     → frozenset {canvas_user, peer_agent} via _safe_meta_field
- method   → frozenset {message/send, tasks/send, tasks/get, notify, ""}
- activity_id → UUID-shape regex via _safe_activity_id
- ts       → ISO-8601 RFC3339 regex via _safe_ts

Any value outside the allowed shape is replaced with empty string.
Today the values come from a platform-DB column so they're trusted,
but "trust the source" was the same assumption that got peer_id into
trouble (#2481). Closed-enum allowlists make this row-content-blind.

5 new tests mirroring test_envelope_enrichment_strips_path_traversal_peer_id:
- test_envelope_strips_unknown_kind         — kind injection stripped
- test_envelope_strips_unknown_method       — method injection stripped
- test_envelope_strips_malformed_activity_id — non-UUID stripped
- test_envelope_strips_malformed_ts         — non-ISO8601 stripped
- test_envelope_keeps_valid_meta_fields_unchanged — happy-path negative case

Mutation-tested: temporarily making _safe_meta_field permissive kills
both kind/method strip tests with the injection payload reflecting
into the meta dict, confirming the gate is what blocks them.

Two existing tests updated to use UUID-shaped activity_ids ("act-7",
"act-bridge-test" → real UUIDs) since the gate strips synthetic ids.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 10:58:52 -07:00
Hongming Wang f21d79c4ad Merge pull request #2610 from Molecule-AI/auto-sync/main-120bb1f0
chore: sync main → staging (auto, ff to 120bb1f0)
2026-05-03 17:53:10 +00:00
molecule-ai[bot] 120bb1f0a2 Merge pull request #2609 from Molecule-AI/staging
staging → main: auto-promote 257079c
2026-05-03 10:48:41 -07:00
Hongming Wang cfd5ec8d82 Merge pull request #2608 from Molecule-AI/ui/canvas-workspace-card-contrast
fix(canvas): WorkspaceNode + tier-config contrast in light theme
2026-05-03 17:32:22 +00:00
Hongming Wang a4a32cded5 fix(canvas): WorkspaceNode + tier-config contrast in light theme
Cards on the canvas had multiple low-contrast surfaces in light mode:

WorkspaceNode.tsx (parent + TeamMemberChip) — same fixes both copies:
- "N sub" badge: hardcoded text-violet-300 + bg-violet-900/40 → semantic
  text-accent + bg-accent/15 + border-accent/40 (themes correctly).
- "REMOTE" pill: hardcoded violet/40 alpha → solid bg-violet-600 text-white
  (works on either surface with WCAG AA contrast).
- Runtime pill: drop /60 + /30 alpha modifiers, use solid surface-card +
  border-line tokens.
- Skill chips (online): text-good/80 + bg-emerald-950/30 (washed-out on
  warm-paper) → text-good + bg-good/15 + border-good/40 semantic.
- Skill chips (offline): text-ink-mid + bg-surface-card without alpha.
- Restart-to-apply banner: bg-sky-950/30 + text-sky-300/80 → bg-accent/10 +
  text-accent (sky-950 was nearly invisible on cream).
- Provisioning status text: text-sky-400 (poor on cream) → text-accent.
- "+N more" badges: text-ink-soft (3.5:1) → text-ink-mid (7:1).
- Active-tasks dot: bg-amber-400 + text-warm/80 → semantic bg-warm + text-warm.
- Degraded error preview: bg-amber-950/20 + text-warm/60 → bg-warm/10 +
  text-warm + border-warm/40.
- Eject icon hover: hover:text-sky-400 → hover:text-accent.
- Role text: text-ink-soft → text-ink-mid.

design-tokens.ts:
- TIER_CONFIG was dark-only: T2 (text-sky-400 + bg-sky-950/50), T3
  (text-violet-400 + bg-violet-950/50), T4 (text-warm + bg-amber-950/50).
  Migrated to solid bg + white text patterns: T2=accent, T3=violet-600,
  T4=warm. T1 stays neutral (surface-card + ink-mid). All four pass WCAG
  AA on either theme.

No globals.css changes; uses existing semantic tokens.
2026-05-03 10:28:49 -07:00
Hongming Wang 257079c7a2 Merge pull request #2605 from Molecule-AI/fix/2485-chat-history-followups
fix(chat-history): correct docstring inversion + pin empty-history JSON shape (#2485)
2026-05-03 17:24:42 +00:00
Hongming Wang 0567502316 Merge pull request #2607 from Molecule-AI/auto-sync/main-7cba0477
chore: sync main → staging (auto, ff to 7cba0477)
2026-05-03 17:23:35 +00:00
molecule-ai[bot] 7cba0477cc Merge pull request #2606 from Molecule-AI/staging
staging → main: auto-promote 4e72f1d
2026-05-03 10:18:56 -07:00
Hongming Wang ff3dcd37f6 fix(chat-history): correct docstring inversion + pin empty-history JSON shape (#2485)
Two follow-ups from the multi-axis review of #2474:

1. **Docstring inversion** in tool_chat_history. The doc said
   '(source_id=peer)' meant 'this workspace is the sender' — actually
   it means the *peer* is the sender (source_id is where the activity
   came FROM). Reframed to 'where the peer is either the sender or
   the recipient' to match the underlying SQL semantics.

2. **Empty-history test**. TestChatHistory had 10 tests but no
   200+[] happy-path pin. Added test_empty_history_returns_empty_json_list
   asserting result == '[]' on exact-equality (per assert-exact
   memory — substring '[]' would match envelope shapes too).
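
A small illustration of why the exact-equality assertion is the stronger pin. The envelope shape used here is hypothetical; it only stands in for "some other JSON that happens to contain an empty list".

```python
import json

# What the empty-history path is expected to return.
empty_result = json.dumps([])

# Hypothetical envelope shape, used only to show the false positive.
envelope_result = json.dumps({"messages": []})

# A substring check cannot tell the two apart.
substring_matches_empty = "[]" in empty_result
substring_matches_envelope = "[]" in envelope_result

# An exact-equality check can.
exact_matches_empty = empty_result == "[]"
exact_matches_envelope = envelope_result == "[]"
```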

Both changes are pure docs+tests — no behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 10:09:15 -07:00
Hongming Wang 4e72f1d1db Merge pull request #2604 from Molecule-AI/ui/canvas-chat-contrast
fix(canvas): chat bubble + sub-tab contrast in light theme
2026-05-03 17:00:54 +00:00
Hongming Wang e22f7969f8 Merge pull request #2603 from Molecule-AI/auto-sync/main-46c8c1de
chore: sync main → staging (auto, ff to 46c8c1de)
2026-05-03 17:00:37 +00:00
Hongming Wang 3d145da99d fix(canvas): chat bubble + sub-tab contrast in light theme
Chat bubble fixes (canvas/src/components/tabs/ChatTab.tsx):
- User bubble: bg-accent-strong/30 + text-blue-100 → bg-accent + text-white
  (translucent dark-blue overlay on warm-paper surface read as pale lavender
  with near-invisible light-blue text — a real WCAG AA failure on the
  highest-traffic surface in canvas).
- System/error bubble: bg-red-900/30 + text-red-200 → bg-bad/10 + text-bad,
  using semantic tokens so dark-mode adapts automatically.
- Agent bubble: drop /80 + /30 opacity modifiers; solid bg-surface-card +
  text-ink + border-line gives consistent contrast in both themes.
- prose-invert was unconditional, so markdown text on agent/system bubbles
  rendered as light text on a light surface in light mode. Make it apply
  only on the user bubble (the only dark surface in this component).
- Timestamp: text-ink-soft is too pale on warm-paper; use text-ink-mid for
  agent/system, white/70 for user (visible on the now-solid blue bg).

Sub-tab bar fixes (canvas/src/components/SidePanel.tsx):
- Right-edge fade was hardcoded `from-zinc-950` — that paints a dark vertical
  strip on the right edge of the tab bar in light mode. Switch to
  `from-surface` so the gradient blends into whichever theme is active.
- Inactive tab text: text-ink-soft (~3.5:1 on warm-paper) → text-ink-mid
  (~7:1). Active tab background: drop the /40 opacity so the selection is
  unambiguous on either surface.

No semantic-token additions; all changes use existing warm-paper tokens
that already work in both themes.
2026-05-03 09:58:18 -07:00
molecule-ai[bot] 46c8c1de23 Merge pull request #2602 from Molecule-AI/staging
staging → main: auto-promote 6d38b96
2026-05-03 16:49:40 +00:00
Hongming Wang 6d38b96043 Merge pull request #2601 from Molecule-AI/fix/2483-negative-cache-branch-tests
test(envelope-enrichment): pin negative-cache for non-JSON 200 + non-dict JSON 200 (#2483)
2026-05-03 16:37:30 +00:00
Hongming Wang 270a95aa67 test(envelope-enrichment): pin negative-cache for non-JSON 200 + non-dict JSON 200 (#2483)
The two missing branch tests called out by the multi-axis review of #2471.

a2a_client.enrich_peer_metadata handles two failure shapes (lines 105-112)
that the existing 12 envelope-enrichment tests don't exercise:

  1. HTTP 200, response.json() raises (non-JSON body)
  2. HTTP 200, valid JSON, but body is list/string/number not dict

Both paths land at the negative-cache write, but no test verified the
discriminator. Pin both with the same call_count == 1 assertion shape
the 5xx + network-exception tests already use.

Verified: temporarily removing the negative-cache write in either
branch makes the corresponding test fail with call_count == 2 — the
assertion correctly discriminates the contract from a fall-through.
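
The contract being pinned can be sketched as follows. This is not the code in a2a_client; the function name, the `fetch` callable, and the cache layout are assumptions chosen to make the two branches and the `call_count == 1` style of assertion concrete.

```python
import json

_NEGATIVE = object()  # sentinel marking "lookup failed, do not retry"
_cache = {}


def enrich_peer_metadata_sketch(peer_id, fetch):
    """Look up peer metadata, negative-caching every failure shape.

    `fetch(peer_id)` is a hypothetical transport returning (status, body_text).
    """
    if peer_id in _cache:
        cached = _cache[peer_id]
        return None if cached is _NEGATIVE else cached

    status, body = fetch(peer_id)
    if status != 200:
        _cache[peer_id] = _NEGATIVE
        return None
    try:
        data = json.loads(body)
    except ValueError:  # branch 1: HTTP 200 with a non-JSON body
        _cache[peer_id] = _NEGATIVE
        return None
    if not isinstance(data, dict):  # branch 2: valid JSON, but not an object
        _cache[peer_id] = _NEGATIVE
        return None

    _cache[peer_id] = data
    return data


class CountingFetch:
    """Test double that records how often the transport is hit."""

    def __init__(self, status, body):
        self.status, self.body, self.call_count = status, body, 0

    def __call__(self, peer_id):
        self.call_count += 1
        return self.status, self.body
```

Calling the function twice for the same peer and asserting that the transport was hit once is what distinguishes a negative-cache write from a plain fall-through, which would hit it twice.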

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 09:35:21 -07:00
Hongming Wang 6431bdc631 Merge pull request #2600 from Molecule-AI/auto-sync/main-72b6be82
chore: sync main → staging (auto, ff to 72b6be82)
2026-05-03 16:23:16 +00:00
molecule-ai[bot] 72b6be82b0 Merge pull request #2599 from Molecule-AI/staging
staging → main: auto-promote b425995
2026-05-03 09:18:48 -07:00
Hongming Wang b42599585e Merge pull request #2598 from Molecule-AI/fix/auto-promote-skip-empty-tree
fix(auto-promote): skip empty-tree promotes to break perpetual cycle
2026-05-03 15:59:05 +00:00
Hongming Wang 06bfed2e35 Merge pull request #2597 from Molecule-AI/auto-sync/main-d1eab79d
chore: sync main → staging (auto, ff to d1eab79d)
2026-05-03 15:57:47 +00:00
Hongming Wang 80b38900de fix(auto-promote): skip empty-tree promotes to break perpetual cycle
The auto-promote ↔ auto-sync chain has been generating empty PRs
indefinitely since the staging merge_queue ruleset uses MERGE
strategy:

1. Auto-promote merges PR via queue → main = merge commit M2 not in staging
2. Auto-sync opens sync-back PR. Workflow's local `git merge --ff-only`
   succeeds (PR title even says "ff to ..."), but the queue lands the
   PR via MERGE → staging = merge commit S2 not in main
3. Auto-promote sees staging ahead by 1 → opens new promote PR. Tree
   diff vs main = 0 (S2's tree == main's tree). But the gate logic
   only checks "all required workflows green", not "actual code to
   ship" → opens an empty promote PR
4. ... repeat indefinitely

Each round costs ~30-40 min wallclock, ~2 manual approvals (the queue
requires 1 review and the bot can't self-approve without admin
bypass), and one full CodeQL Go run (~15 min).

Observed today (2026-05-03) across PRs #2592, #2594, #2595, #2596, #2597: 5 PRs, ~3 hours, all empty content.

Fix: before opening the promote PR, check that staging's tree
actually differs from main's tree. If they're identical (the
empty-merge-commit cycle), skip cleanly and let the cycle terminate.

Implementation:
- New step `Skip if staging tree == main tree` runs before the
  existing gate check.
- `git diff --quiet origin/main $HEAD_SHA` exits 0 iff trees match.
- On match: emits a step summary explaining the skip + sets
  `skip=true`; subsequent gate-check + promote steps are gated on
  `skip != 'true'` so they short-circuit.
- Fail-open: if `git fetch` errors, fall through to gate check
  (preserve existing behavior). Only skip when diff is DEFINITIVELY
  empty.
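
The fail-open rule in the list above reduces to a small decision function. The workflow itself does this in shell; the Python below is only a sketch of the same logic, and both function names are hypothetical.

```python
import subprocess


def decide_skip(fetch_ok: bool, diff_exit_code: int) -> bool:
    """Skip the promote only when the trees are definitively identical.

    `git diff --quiet` exits 0 when there are no differences and 1 when
    there are; any other code is treated as an error. A failed fetch or a
    non-zero exit code means "do not skip" (fail-open).
    """
    return fetch_ok and diff_exit_code == 0


def tree_diff_exit_code(base: str, head: str) -> int:
    """Run `git diff --quiet base head --` and return its exit code."""
    return subprocess.run(
        ["git", "diff", "--quiet", base, head, "--"],
        stderr=subprocess.DEVNULL,
    ).returncode
```

In a checkout, `decide_skip(True, tree_diff_exit_code("origin/main", head_sha))` would mirror the step's `skip=true` / `skip=false` output, where `head_sha` is the staging tip.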

Long-term, the cleaner fix is to switch the merge_queue ruleset's
merge_method away from MERGE so FF-able PRs land cleanly without a
new commit — but that's a broader change affecting every staging
PR's commit shape. This guard is the surgical one-step break.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 08:56:44 -07:00
molecule-ai[bot] d1eab79d28 Merge pull request #2596 from Molecule-AI/staging
staging → main: auto-promote 824a2a7
2026-05-03 15:50:12 +00:00
Hongming Wang 824a2a7657 Merge pull request #2595 from Molecule-AI/auto-sync/main-876d6ec8
chore: sync main → staging (auto, ff to 876d6ec8)
2026-05-03 15:38:22 +00:00
molecule-ai[bot] 876d6ec8c9 Merge pull request #2594 from Molecule-AI/staging
staging → main: auto-promote 63e3d38
2026-05-03 08:33:51 -07:00
Hongming Wang 63e3d385d6 Merge pull request #2592 from Molecule-AI/auto-sync/main-2e78812f
chore: sync main → staging (auto, ff to 2e78812f)
2026-05-03 15:15:01 +00:00
molecule-ai[bot] 2e78812ff9 Merge pull request #2591 from Molecule-AI/staging
staging → main: auto-promote 19cc833
2026-05-03 15:04:00 +00:00
Hongming Wang 9664d66e4b Merge branch 'main' into staging
2026-05-03 07:48:31 -07:00
Hongming Wang 19cc83313a Merge pull request #2589 from Molecule-AI/fix/retarget-skip-staging-head
fix(retarget): skip PRs whose head is staging (auto-promote PRs)
2026-05-03 14:36:44 +00:00
molecule-ai[bot] 097d513b65 Merge pull request #2588 from Molecule-AI/staging
staging → main: auto-promote c45aa8d
2026-05-03 07:35:05 -07:00
Hongming Wang 2b3f44c3c8 fix(retarget): skip PRs whose head is staging (auto-promote PRs)
The retarget-main-to-staging workflow tries to PATCH base=staging on
every bot-authored PR opened against main. Auto-promote staging→main
PRs have head=staging, base=main — retargeting them sets head AND
base to staging, which GitHub rejects with HTTP 422 "no new commits
between base 'staging' and head 'staging'".

This started surfacing on PR #2588 (2026-05-03 14:30) once #2586
switched the auto-promote workflow to an App token. Before #2586
the auto-promote PR was authored by github-actions[bot], which the
retarget filter happened to skip; now it's molecule-ai[bot], which
passes the bot filter and triggers the broken retarget attempt.

Add a head-ref != 'staging' guard so auto-promote PRs short-circuit
before the PATCH. The existing 422 "duplicate base" detector is
left alone — it covers a different operational case.
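
A sketch of the guard, assuming a simplified PR record. The field names (`author_is_bot`, `base_ref`, `head_ref`) are hypothetical and the real workflow's bot filter is not shown in this message.

```python
def should_retarget(pr: dict) -> bool:
    """Decide whether a PR opened against main should be moved to staging.

    Auto-promote PRs have head=staging and base=main. Retargeting them
    would make head and base identical, which GitHub rejects with HTTP 422,
    so they are excluded before any PATCH is attempted.
    """
    if not pr["author_is_bot"]:
        return False
    if pr["base_ref"] != "main":
        return False
    if pr["head_ref"] == "staging":  # the new guard
        return False
    return True
```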
2026-05-03 07:34:24 -07:00
Hongming Wang c45aa8d7ee Merge pull request #2587 from Molecule-AI/auto-sync/main-b4e45374
chore: sync main → staging (auto, ff to b4e45374)
2026-05-03 14:19:28 +00:00
Hongming Wang b4e45374bf Merge pull request #2586 from Molecule-AI/fix/auto-promote-app-token
fix(auto-promote): use App token for auto-merge to fire downstream cascade (#2357)
2026-05-03 07:15:31 -07:00
Hongming Wang f2d69f0088 Merge pull request #2585 from Molecule-AI/fix/canvas-loading-state-aria
fix(canvas): add role=status + aria-live to remaining loading states
2026-05-03 14:14:33 +00:00
Hongming Wang bc11ed8a2b fix(auto-promote): use App token for auto-merge to fire downstream cascade (#2357)
GITHUB_TOKEN-initiated merges suppress the downstream `push` event on
main per GitHub's documented limitation:
  https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow

Result before this fix: every staging→main promote landed silently —
publish-workspace-server-image, canary-verify, and redeploy-tenants-on-main
all stayed dark. The polling tail was the SOLE cascade trigger; if it
ever hit its 30-minute timeout, the chain deadlocked invisibly.

Symptom (from the issue body, 2026-04-30):

| Time     | Event                              | Triggered? |
|----------|------------------------------------|------------|
| 05:48:04 | Promote PR #2352 merged (c140ad28) | Not fired  |
| 06:07:29 | Promote PR #2356 merged (5973c9bd) | Not fired  |

Fix: mint the molecule-ai App token BEFORE the promote-PR step and
hand it to the auto-merge call. App-token-initiated merges DO trigger
downstream workflow_run cascades.

The polling tail stays as defense-in-depth (with comments updated):
once we've observed >=10 successful natural cascades it can be dropped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 07:13:26 -07:00
Hongming Wang e2328abedc fix(canvas): add role=status + aria-live to remaining loading states
Three loading-state divs were missing the role/aria pattern that
TemplatePalette.tsx and EmptyState.tsx already follow. Screen readers
get no signal that the page is waiting:

- canvas/src/app/page.tsx — full-screen "Loading canvas..." while
  the websocket hydrates. First paint of the entire app.
- canvas/src/components/settings/TokensTab.tsx — "Loading tokens..."
- canvas/src/components/settings/OrgTokensTab.tsx — "Loading keys..."

Add role="status" + aria-live="polite" to the wrapping div so
assistive tech announces the wait and the eventual transition.
Visual rendering unchanged.
2026-05-03 07:11:48 -07:00
github-actions[bot] bdad75ae3e Merge pull request #2582 from Molecule-AI/staging
staging → main: auto-promote 90ba2cd
2026-05-03 07:06:58 -07:00
Hongming Wang 90ba2cd4df Merge pull request #2580 from Molecule-AI/auto-sync/main-b002247f
chore: sync main → staging (auto, ff to b002247f)
2026-05-03 13:54:03 +00:00
Hongming Wang b002247f12 Merge pull request #2576 from Molecule-AI/staging
staging → main: auto-promote effbcd7
2026-05-03 06:44:31 -07:00
Hongming Wang 03bcce3eb3 Merge pull request #2574 from Molecule-AI/auto-sync/main-55d85147
chore: sync main → staging (auto, ff to 55d85147)
2026-05-03 13:18:34 +00:00
Hongming Wang c74e71d604 Merge branch 'staging' into auto-sync/main-55d85147
2026-05-03 06:07:20 -07:00
Hongming Wang d7f88674d8 Merge pull request #2577 from Molecule-AI/fix/canvas-tier-legend-t3-t4-contract
fix(canvas): align tier text contracts with 4-tier reality (T1/T2/T3/T4)
2026-05-03 12:58:52 +00:00
Hongming Wang 7abb94dab8 fix(canvas): align tier text contracts with 4-tier reality (T1/T2/T3/T4)
The tier system in CreateWorkspaceDialog and design-tokens has been
T1 Sandboxed / T2 Standard / T3 Privileged / T4 Full Access, but two
chrome surfaces still showed the older 3-tier mapping with T3 as
"Full Access":

- Legend (bottom-left chrome on every canvas page) listed only T1/T2/T3
  and called T3 "Full Access". On a SaaS tenant the actual workspace
  badges render T4 (in amber/warm) — there was no T4 entry in the
  legend at all, so the user sees an undocumented orange badge.

- ConfigTab tier dropdown (per-workspace settings → Sandboxing) had no
  T4 option at all and called T3 "Full Access". So an existing T4
  workspace would show "T3 — Full Access" as the selected option,
  silently downgrading the displayed tier on the settings panel.

- tenant.ts isSaaSTenant() doc comment claimed SaaS workspaces are
  "inherently T3 Full Access" — wrong on both the number and the lock
  rationale (SaaS hides T1/T2/T3, not just T1/T2).

Fix:
- Legend now imports TIER_CONFIG and renders all four tiers
  (Sandboxed/Standard/Privileged/Full Access) using the same color
  swatches as the badges on workspace cards. Eliminates the previous
  drift where Legend's hardcoded sky/violet/warm chips didn't match
  the gray/sky/violet/amber actually rendered on nodes.
- ConfigTab adds the missing T4 — Full Access option and renames T3
  to Privileged.
- tenant.ts comment updated to match the picker's actual hide list.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:56:18 -07:00
Hongming Wang effbcd737b Merge pull request #2575 from Molecule-AI/fix/cascade-include-all-active-templates
fix(publish-runtime): re-add 5 templates wrongly removed from cascade — fixes #2566
2026-05-03 12:45:48 +00:00
Hongming Wang 6eb79adfd5 manifest: re-add 5 workspace templates pruned by #2536
The cascade-list-vs-manifest drift gate (PR #2556's behavior-based
test) caught my previous-commit cascade additions as 'extra-in-cascade'.
Manifest is the source of truth — restoring there.

All 5 templates have successful publish-image runs in the past 24h
(verified before the cascade fix), and continuous-synth-e2e defaults
to langgraph as its primary canary. None deprecated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:43:07 -07:00
Hongming Wang 8f48a38550 fix(publish-runtime): re-add 5 templates wrongly removed from cascade (#2566)
The PR #2536 cascade prune ('deprecated, no shipping images') was
empirically wrong. Re-confirmed 2026-05-03:

- continuous-synth-e2e.yml defaults to langgraph as its primary canary
- All 5 'deprecated' templates have successful publish-image runs in
  the past 24h: langgraph, crewai, autogen, deepagents, gemini-cli

Symptom this fixes — issue #2566 (priority-high, failing 36+h):

  Synthetic E2E (staging): langgraph adapter A2A failure
  'Received Message object in task mode' — failing for >36h

Today at 11:06 commit e1628c4 fixed the underlying a2a-sdk strict-mode
issue in workspace/a2a_executor.py. publish-runtime fired at 11:13 and
cascaded — but only to claude-code, hermes, openclaw, codex. langgraph
was excluded by the prune, so its image stayed on the broken runtime
and the synth E2E (which defaults to langgraph) kept failing despite
the fix being live in PyPI.

After this lands + the next runtime publish fires, langgraph image
re-bakes with the fix and synth-E2E goes green.

Test plan:

- [x] yaml-validate the workflow
- [ ] After merge, watch publish-runtime cascade to all 9 templates
- [ ] Confirm langgraph publish-image fires + succeeds
- [ ] Confirm next continuous-synth-e2e run goes green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:41:53 -07:00
github-actions[bot] 55d85147f7 Merge pull request #2573 from Molecule-AI/staging
staging → main: auto-promote dc6425f
2026-05-03 05:34:23 -07:00
github-actions[bot] f7e8f98cf7 Merge pull request #2570 from Molecule-AI/staging
staging → main: auto-promote 173e22e
2026-05-03 12:22:52 +00:00
Hongming Wang dc6425fe39 Merge pull request #2571 from Molecule-AI/fix/synth-e2e-model-slug-by-runtime
fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form
2026-05-03 12:22:19 +00:00
Hongming Wang cbc69f5e7e fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form
The original script hardcoded `MODEL_SLUG="openai/gpt-4o"` (slash) and
claimed "non-hermes runtimes ignore the prefix" — wrong for langgraph,
which delegates model resolution to langchain's `init_chat_model`. That
function requires `<provider>:<model>` (colon) and treats slash-form as
OpenRouter routing, falling through without auth even when
OPENAI_API_KEY is set.

Surfaced 2026-05-03 after the a2a-sdk v1 contract bugs (PR
#2558+#2563+#2567) cleared the masking layers — synth-E2E firing
2026-05-03T12:14 returned a properly-shaped task with state=failed +
"Could not resolve authentication method" inside the agent body.

continuous-synth-e2e.yml defaults E2E_RUNTIME=langgraph for the cron,
so every firing hit this. Hermes still gets the slash-form it
needs; claude-code uses the entry-id pattern.

Adds E2E_MODEL_SLUG override for operator-dispatched runs that want
to pin a specific slug.
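
The branching can be sketched as below. The real script is shell; this Python version only illustrates the selection order. The colon-form value `openai:gpt-4o` is inferred from the `<provider>:<model>` rule and the original `openai/gpt-4o`, and the claude-code entry-id slug is deliberately not modeled because this message does not give it.

```python
import os

# Assumed per-runtime defaults, derived from the commit message.
_DEFAULT_SLUGS = {
    "langgraph": "openai:gpt-4o",  # colon form, as init_chat_model expects
    "hermes": "openai/gpt-4o",     # slash form, unchanged from the original
}


def resolve_model_slug(runtime, env=None):
    """Pick the model slug for a synth-E2E run.

    An explicit E2E_MODEL_SLUG wins; otherwise the slug depends on runtime.
    """
    env = os.environ if env is None else env
    override = env.get("E2E_MODEL_SLUG")
    if override:
        return override
    try:
        return _DEFAULT_SLUGS[runtime]
    except KeyError:
        # e.g. claude-code, whose entry-id slug is not shown in the source.
        raise ValueError(f"no default slug known for runtime {runtime!r}")
```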

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:17:55 -07:00
Hongming Wang c71f641b12 Merge pull request #2569 from Molecule-AI/fix/redeploy-canary-default
ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming'
2026-05-03 12:08:26 +00:00
Hongming Wang 173e22e091 Merge pull request #2568 from Molecule-AI/auto-sync/main-c0838d63
chore: sync main → staging (auto, ff to c0838d63)
2026-05-03 12:07:29 +00:00
Hongming Wang 60a516bc8d ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming'
The workflow_dispatch input default and the workflow_run env fallback
both pointed at 'hongmingwang', which doesn't match any current prod
tenant (slugs are: hongming, chloe-dong, reno-stars). CP silently
skipped the missing canary and put every tenant in batch-1 in parallel,
defeating the canary-first soak gate that exists to catch image-boot
regressions before they hit the whole fleet.

Concrete example from today's c0838d6 redeploy at 11:53Z (run 25278434388):
the dispatched body was `{"target_tag":"staging-c0838d6","canary_slug":"hongmingwang",...}`
and the CP response showed all 3 tenants in `"phase":"batch-1"` — no
soak, no canary. The deploy happened to be safe, but a broken image
would have hit hongming + chloe-dong + reno-stars simultaneously.

Fixed in three places: the runtime ordering comment, the
workflow_dispatch default, and the env fallback used by the
workflow_run trigger. Comment documents the rationale so the next
slug rename doesn't silently regress this again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 05:06:01 -07:00
20 changed files with 434 additions and 124 deletions
+94 -44
@@ -111,7 +111,60 @@ jobs:
all_green: ${{ steps.gates.outputs.all_green }}
head_sha: ${{ steps.gates.outputs.head_sha }}
steps:
# Skip empty-tree promotes (the perpetual auto-promote↔auto-sync cycle
# observed 2026-05-03). Sequence: auto-promote merges via the staging
# merge-queue's MERGE strategy, creating a merge commit on main that
# staging doesn't have. auto-sync then merges main back into staging
# via another merge commit (the queue's MERGE strategy applies on
# the staging side too, even when the workflow's local FF would
# have sufficed). Now staging has a new merge-commit SHA whose
# tree == main's tree — but auto-promote sees "staging ahead of
# main by 1" and opens YET another empty promote PR. Each round
# costs ~30-40 min wallclock, ~2 manual approvals, and burns a
# full CodeQL Go run (~15 min). Without this guard the cycle
# repeats indefinitely.
#
# Long-term fix is to switch the merge_queue ruleset's
# `merge_method` away from MERGE so FF-able PRs land cleanly,
# but that's a broader change affecting every staging PR's
# commit shape. This guard is the one-line surgical fix that
# breaks the cycle without touching merge-queue config.
#
# Fail-open: if `git diff` errors for any reason, fall through
# to the gate check (preserve existing behavior). Only skip
# when the diff is DEFINITIVELY empty.
- name: Checkout for tree-diff check
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
ref: staging
- name: Skip if staging tree == main tree (perpetual-cycle break)
id: tree-diff
env:
HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
run: |
set -eu
git fetch origin main --depth=50 || { echo "::warning::git fetch main failed — proceeding (fail-open)"; exit 0; }
# Compare staging tip's tree against main's tree. `git diff
# --quiet` exits 0 if no differences, 1 if there are.
if git diff --quiet origin/main "$HEAD_SHA" -- 2>/dev/null; then
{
echo "## ⏭ Skipped — no code to promote"
echo
echo "staging tip (\`${HEAD_SHA:0:8}\`) and \`main\` have identical trees."
echo "This is the auto-promote↔auto-sync merge-commit cycle: staging has a"
echo "new SHA (a sync-back merge commit) but the underlying file tree is"
echo "already on main, so there's no real code to ship."
echo
echo "Skipping to avoid opening an empty promote PR. Cycle terminates here."
} >> "$GITHUB_STEP_SUMMARY"
echo "::notice::auto-promote: staging tree == main tree — no code to promote, skipping"
echo "skip=true" >> "$GITHUB_OUTPUT"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fi
- name: Check all required gates on this SHA
if: steps.tree-diff.outputs.skip != 'true'
id: gates
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -209,10 +262,25 @@ jobs:
exit 0
fi
# Mint the App token BEFORE the promote-PR step so the auto-merge
# call can use it. GITHUB_TOKEN-initiated merges suppress the
# downstream `push` event on main, breaking the
# publish-workspace-server-image → canary-verify → redeploy-tenants
# chain (issue #2357). Using the App token here means the
# merge-queue-landed merge IS able to fire the cascade naturally;
# the polling tail below stays as defense-in-depth.
- name: Mint App token for promote-PR + downstream dispatch
if: ${{ vars.AUTO_PROMOTE_ENABLED == 'true' || github.event.inputs.force == 'true' }}
id: app-token
uses: actions/create-github-app-token@1b10c78c7865c340bc4f6099eb2f838309f1e8c3 # v3.1.1
with:
app-id: ${{ secrets.MOLECULE_AI_APP_ID }}
private-key: ${{ secrets.MOLECULE_AI_APP_PRIVATE_KEY }}
- name: Open (or reuse) staging → main promote PR + enable auto-merge
if: ${{ vars.AUTO_PROMOTE_ENABLED == 'true' || github.event.inputs.force == 'true' }}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_TOKEN: ${{ steps.app-token.outputs.token }}
REPO: ${{ github.repository }}
TARGET_SHA: ${{ needs.check-all-gates-green.outputs.head_sha }}
run: |
@@ -267,52 +335,34 @@ jobs:
echo "promote_pr_num=${PR_NUM}" >> "$GITHUB_OUTPUT"
id: promote_pr
# Mint a short-lived GitHub App installation token for the dispatch
# step below. We CANNOT use `secrets.GITHUB_TOKEN` to dispatch the
# downstream publish chain — workflow runs created by GITHUB_TOKEN
# do not fire `workflow_run` triggers on completion (the
# documented "no recursion" rule —
# https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
#
# Symptom this caused (root-caused on 2026-04-30): publish-image
# ran successfully twice (21313dc 14:41Z, 59dec57 15:21Z) but
# canary-verify and redeploy-tenants-on-main never chained,
# because the publish run's `triggering_actor` was
# `github-actions[bot]` (i.e. GITHUB_TOKEN). A manual dispatch
# earlier in the day with the operator's PAT (d850ec7 06:52Z) did
# chain — same workflow file, only the actor differed.
#
# An App token's triggering_actor is the App user (e.g.
# `molecule-ai[bot]`), which IS allowed to fire downstream
# workflow_run cascades.
- name: Mint App token for downstream dispatch
if: steps.promote_pr.outputs.promote_pr_num != ''
id: app-token
uses: actions/create-github-app-token@1b10c78c7865c340bc4f6099eb2f838309f1e8c3 # v3.1.1
with:
app-id: ${{ secrets.MOLECULE_AI_APP_ID }}
private-key: ${{ secrets.MOLECULE_AI_APP_PRIVATE_KEY }}
# The App token minted above (before the promote-PR step) is
# also used by the polling tail below. Defense-in-depth: with
# the merge-queue-landed merge now using the App token, the
# main-branch push event SHOULD fire the publish/canary/redeploy
# cascade naturally — but if for any reason it doesn't (e.g. an
# unrelated event-suppression edge case), the explicit dispatches
# below still wake the chain.
- name: Wait for promote merge, then dispatch publish + redeploy (#2357)
# GITHUB_TOKEN-initiated merges suppress downstream `push` events
# (https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
# Result: when the merge queue lands the promote PR, the resulting
# main-branch push DOES NOT fire publish-workspace-server-image,
# so canary-verify and redeploy-tenants-on-main never run and
# tenants stay on stale code (issue #2357).
# Defense-in-depth dispatch. With the auto-merge call above
# now using the App token (this commit), the merge-queue-landed
# merge SHOULD fire publish-workspace-server-image naturally
# via on:push:[main] — App-token-initiated pushes DO trigger
# workflow_run cascades, unlike GITHUB_TOKEN-initiated ones
# (the documented "no recursion" rule —
# https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
#
# Workaround: poll for the merge to land, then explicitly
# `gh workflow run` publish-workspace-server-image. The dispatch
# MUST authenticate as the molecule-ai App (App token minted
# above) — not GITHUB_TOKEN — so that the resulting publish
# run's completion event can fire the workflow_run cascade
# into canary-verify + redeploy-tenants-on-main. See the prior
# step's comment for the GITHUB_TOKEN no-recursion details.
# This explicit dispatch stays as belt-and-suspenders for any
# edge case where the natural cascade misfires. If the natural
# cascade has already started the publish workflow by the time
# we get here, this second dispatch is a harmless no-op
# (publish-workspace-server-image has its own concurrency
# group that dedupes).
#
# Long-term fix: switch the auto-merge call above to use the
# same App token, so the merge's push event fires
# publish-workspace-server-image naturally and this polling tail
# becomes unnecessary. Tracked in #2357.
# See PR for #2357: pre-fix the merge action was via
# GITHUB_TOKEN, suppressing the cascade and forcing this tail
# to be the SOLE chain trigger. With the auto-merge token swap
# the tail becomes redundant in the happy path; keep until
# we've observed >=10 successful natural cascades, then drop.
if: steps.promote_pr.outputs.promote_pr_num != ''
env:
GH_TOKEN: ${{ steps.app-token.outputs.token }}
+13 -7
@@ -327,13 +327,19 @@ jobs:
echo "::error::publish job did not expose a version output — cascade cannot fan out"
exit 1
fi
# Source of truth: manifest.json workspace_templates (PR #2536 pruned
# to 4 actively-supported runtimes: claude-code, hermes, openclaw, codex).
# Removed langgraph/crewai/autogen/deepagents/gemini-cli (deprecated, no
# shipping images); added codex (had been missing since #2512).
# Long-term: derive this list from manifest.json so the cascade can't
# drift again — tracked in RFC #388 as a Phase-1 invariant.
TEMPLATES="claude-code hermes openclaw codex"
# All 9 active workspace template repos. The PR #2536 pruning
# ("deprecated, no shipping images") was empirically wrong:
# continuous-synth-e2e.yml defaults to langgraph as its primary
# canary (line 44), and every excluded template had successful
# publish-image runs as of 2026-05-03 — none were dormant.
# Symptom of the prune: today's a2a-sdk strict-mode fix
# (#2566 / commit e1628c4) cascaded to 4 templates but never
# reached langgraph, so the synth-E2E correctly canary'd a fix
# that had landed but not deployed. Re-added the 5 templates.
# Long-term: derive this list from manifest.json so cascade
# scope can't drift from E2E scope — tracked in RFC #388 as a
# Phase-1 invariant.
TEMPLATES="claude-code hermes openclaw codex langgraph crewai autogen deepagents gemini-cli"
FAILED=""
for tpl in $TEMPLATES; do
REPO="Molecule-AI/molecule-ai-workspace-template-$tpl"
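The long-term note in the comment above (derive the cascade list from manifest.json) could look roughly like the following. This is a minimal sketch, not code from the repository: it assumes the manifest has a top-level `workspace_templates` array shaped like the entries touched in this change, and that every template repo uses the `Molecule-AI/molecule-ai-workspace-template-` prefix seen in the cascade loop. The function name `cascade_templates` is hypothetical.

```python
import json

# Assumed naming convention, taken from the REPO= line in the cascade loop.
REPO_PREFIX = "Molecule-AI/molecule-ai-workspace-template-"


def cascade_templates(manifest_text: str) -> list[str]:
    """Return template suffixes (e.g. "langgraph") found in a manifest document.

    The suffix is derived from the "repo" field rather than "name", because an
    entry's name (e.g. "claude-code-default") can differ from its repo suffix.
    """
    manifest = json.loads(manifest_text)
    suffixes = []
    for entry in manifest.get("workspace_templates", []):
        repo = entry.get("repo", "")
        if repo.startswith(REPO_PREFIX):
            suffixes.append(repo[len(REPO_PREFIX):])
    return suffixes
```

Joining the result with spaces would give a value in the same shape as the hard-coded `TEMPLATES` string, so the loop itself would not need to change.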
@@ -17,7 +17,7 @@ name: redeploy-tenants-on-main
# 1. publish-workspace-server-image completes → new :latest in GHCR.
# 2. This workflow fires via workflow_run, waits 30s for GHCR's
# CDN to propagate the new tag to the region the tenants pull from.
# 3. Calls redeploy-fleet with canary_slug=hongmingwang and a 60s
# 3. Calls redeploy-fleet with canary_slug=hongming and a 60s
# soak. Canary proves the image boots; batches follow.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
@@ -56,7 +56,12 @@ on:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately).'
required: false
type: string
default: 'hongmingwang'
# Must be an actual prod tenant slug (current: hongming,
# chloe-dong, reno-stars). The previous default 'hongmingwang'
# didn't match any tenant — CP soft-skipped the missing canary
# and the fleet rolled out without the soak gate, defeating the
# whole point of canary-first.
default: 'hongming'
soak_seconds:
description: 'Seconds to wait after canary before fanning out.'
required: false
@@ -148,7 +153,7 @@ jobs:
CP_URL: ${{ vars.CP_URL || 'https://api.moleculesai.app' }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongmingwang' }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongming' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}
+15 -4
@@ -26,11 +26,22 @@ jobs:
runs-on: ubuntu-latest
# Only fire for bot-authored PRs. Human CEO PRs (staging→main promotion)
# are intentional and pass through.
#
# Head-ref guard: never retarget a PR whose head IS `staging` — those
# are the auto-promote staging→main PRs (opened by molecule-ai[bot]
# since #2586 switched to an App token, which now passes the bot
# filter below). Retargeting head=staging onto base=staging fails
# with HTTP 422 "no new commits between base 'staging' and head
# 'staging'", which used to surface as a noisy red workflow run on
# every auto-promote (caught 2026-05-03 on PR #2588).
if: >-
github.event.pull_request.user.type == 'Bot'
|| endsWith(github.event.pull_request.user.login, '[bot]')
|| github.event.pull_request.user.login == 'app/molecule-ai'
|| github.event.pull_request.user.login == 'molecule-ai[bot]'
github.event.pull_request.head.ref != 'staging'
&& (
github.event.pull_request.user.type == 'Bot'
|| endsWith(github.event.pull_request.user.login, '[bot]')
|| github.event.pull_request.user.login == 'app/molecule-ai'
|| github.event.pull_request.user.login == 'molecule-ai[bot]'
)
steps:
- name: Retarget PR base to staging
id: retarget
+1 -1
@@ -54,7 +54,7 @@ export default function Home() {
if (hydrating) {
return (
<div className="fixed inset-0 flex items-center justify-center bg-surface">
<div className="flex flex-col items-center gap-3">
<div role="status" aria-live="polite" className="flex flex-col items-center gap-3">
<Spinner size="lg" />
<span className="text-xs text-ink-soft">Loading canvas...</span>
</div>
+16 -4
@@ -1,11 +1,23 @@
"use client";
import { useEffect, useState } from "react";
import { STATUS_CONFIG } from "@/lib/design-tokens";
import { STATUS_CONFIG, TIER_CONFIG } from "@/lib/design-tokens";
import { useCanvasStore } from "@/store/canvas";
const LEGEND_STATUSES = ["online", "provisioning", "degraded", "failed", "paused", "offline"] as const;
// Tier descriptions kept in sync with CreateWorkspaceDialog.tsx (the
// source of truth for what each tier means semantically). Colors come
// from TIER_CONFIG so the legend swatch matches the badge actually
// rendered on every WorkspaceNode — drift here misled users into
// thinking the legend documented a different tier than the one shown.
const LEGEND_TIERS: ReadonlyArray<{ tier: number; label: string }> = [
{ tier: 1, label: "Sandboxed" },
{ tier: 2, label: "Standard" },
{ tier: 3, label: "Privileged" },
{ tier: 4, label: "Full Access" },
];
// Persist the user's choice across sessions. Default is "open" so
// first-time users still see the symbol key; once dismissed we
// respect that until they explicitly reopen via the floating pill.
@@ -102,9 +114,9 @@ export function Legend() {
<div className="mb-2">
<div className="text-[11px] text-ink-soft font-medium mb-1">Tier</div>
<div className="flex flex-wrap gap-x-3 gap-y-1">
<TierItem tier={1} label="Sandboxed" color="text-sky-300 bg-sky-950/40 border-sky-700/30" />
<TierItem tier={2} label="Standard" color="text-violet-300 bg-violet-950/40 border-violet-700/30" />
<TierItem tier={3} label="Full Access" color="text-warm bg-amber-950/40 border-amber-700/30" />
{LEGEND_TIERS.map(({ tier, label }) => (
<TierItem key={tier} tier={tier} label={label} color={TIER_CONFIG[tier].border} />
))}
</div>
</div>
+3 -3
@@ -202,7 +202,7 @@ export function SidePanel() {
{/* Tabs — relative wrapper lets the fade gradient position against the scroll container */}
<div className="relative border-b border-line/40">
{/* Right-edge fade: signals more tabs are hidden off-screen when the bar overflows */}
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-zinc-950 to-transparent z-10" aria-hidden="true" />
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-surface to-transparent z-10" aria-hidden="true" />
<div
role="tablist"
aria-label="Workspace panel tabs"
@@ -232,8 +232,8 @@ export function SidePanel() {
onClick={() => setPanelTab(tab.id)}
className={`shrink-0 px-3 py-2.5 text-[10px] font-medium tracking-wide transition-all rounded-t-lg mx-0.5 focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 ${
panelTab === tab.id
? "text-ink bg-surface-card/40 border-b-2 border-accent"
: "text-ink-soft hover:text-ink hover:bg-surface-card/40"
? "text-ink bg-surface-card border-b-2 border-accent"
: "text-ink-mid hover:text-ink hover:bg-surface-card/60"
}`}
>
<span className="mr-1 opacity-50" aria-hidden="true">{tab.icon}</span>
+23 -23
@@ -36,7 +36,7 @@ function EjectIcon(props: React.SVGProps<SVGSVGElement>) {
export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>) {
const statusCfg = STATUS_CONFIG[data.status] || STATUS_CONFIG.offline;
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-soft bg-surface-card" };
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-mid bg-surface-card border border-line" };
// Org-deploy context — four derived flags off one store subscription.
// Drives the shimmer while provisioning, the dimmed/non-draggable
// treatment on locked descendants, and the Cancel pill on the root.
@@ -179,7 +179,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
</div>
<div className="flex items-center gap-1.5 shrink-0">
{hasChildren && (
<span className="text-[10px] font-mono text-violet-300 bg-violet-900/40 border border-violet-700/30 px-1.5 py-0.5 rounded-md">
<span className="text-[10px] font-mono text-accent bg-accent/15 border border-accent/40 px-1.5 py-0.5 rounded-md">
{descendantCount} sub
</span>
)}
@@ -207,13 +207,13 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
<div className="mb-1 flex items-center gap-1">
{runtime === "external" ? (
<span
className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-violet-200 bg-violet-900/50 border border-violet-500/40"
className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-white bg-violet-600 border border-violet-700"
title="Phase 30 remote agent — runs outside this platform's Docker network. Lifecycle managed via heartbeat-based polling, not Docker exec."
>
REMOTE
</span>
) : (
<span className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-ink-mid bg-surface-card/60 border border-line/30">
<span className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-ink-mid bg-surface-card border border-line">
{runtime}
</span>
)}
@@ -237,15 +237,15 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
key={skill}
className={`text-[10px] px-1.5 py-0.5 rounded-md border ${
isOnline
? "text-good/80 bg-emerald-950/30 border-emerald-800/30"
: "text-ink-mid bg-surface-card/60 border-line/40"
? "text-good bg-good/15 border-good/40"
: "text-ink-mid bg-surface-card border-line"
}`}
>
{skill}
</span>
))}
{skills.length > 4 && (
<span className="text-[10px] text-ink-soft self-center">
<span className="text-[10px] text-ink-mid self-center">
+{skills.length - 4}
</span>
)}
@@ -274,10 +274,10 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
e.stopPropagation();
useCanvasStore.getState().restartWorkspace(id).catch(() => showToast("Restart failed", "error"));
}}
className="flex items-center gap-1.5 mt-1 w-full bg-sky-950/30 px-2 py-1 rounded-md border border-sky-800/30 hover:bg-sky-900/40 transition-colors text-left focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none"
className="flex items-center gap-1.5 mt-1 w-full bg-accent/10 px-2 py-1 rounded-md border border-accent/40 hover:bg-accent/20 transition-colors text-left focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none"
>
<span className="text-[10px]"></span>
<span className="text-[10px] text-sky-300/80">Restart to apply changes</span>
<span className="text-[10px] text-accent"></span>
<span className="text-[10px] text-accent">Restart to apply changes</span>
</button>
)}
@@ -287,8 +287,8 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
<div className={`text-[10px] uppercase tracking-widest font-medium ${
data.status === "failed" ? "text-bad" :
data.status === "degraded" ? "text-warm" :
data.status === "provisioning" ? "text-sky-400" :
"text-ink-soft"
data.status === "provisioning" ? "text-accent" :
"text-ink-mid"
}`}>
{statusCfg.label}
</div>
@@ -296,8 +296,8 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
{data.activeTasks > 0 && (
<div className="flex items-center gap-1">
<div className="w-1 h-1 rounded-full bg-amber-400 motion-safe:animate-pulse" />
<span className="text-[10px] text-warm/80 tabular-nums">
<div className="w-1 h-1 rounded-full bg-warm motion-safe:animate-pulse" />
<span className="text-[10px] text-warm tabular-nums">
{data.activeTasks} task{data.activeTasks > 1 ? "s" : ""}
</span>
</div>
@@ -307,7 +307,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
{/* Degraded error preview */}
{data.status === "degraded" && data.lastSampleError && (
<div
className="text-[10px] text-warm/60 truncate mt-1 bg-amber-950/20 px-1.5 py-0.5 rounded border border-amber-800/20"
className="text-[10px] text-warm truncate mt-1 bg-warm/10 px-1.5 py-0.5 rounded border border-warm/40"
title={data.lastSampleError}
>
{data.lastSampleError}
@@ -357,7 +357,7 @@ function TeamMemberChip({
}) {
const { data } = node;
const statusCfg = STATUS_CONFIG[data.status] || STATUS_CONFIG.offline;
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-soft bg-surface-card" };
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-mid bg-surface-card border border-line" };
const isOnline = data.status === "online";
const skills = getSkillNames(data.agentCard);
@@ -408,7 +408,7 @@ function TeamMemberChip({
</div>
<div className="flex items-center gap-1 shrink-0">
{hasSubChildren && (
<span className="text-[7px] font-mono text-violet-300 bg-violet-900/40 border border-violet-700/30 px-1 py-0.5 rounded">
<span className="text-[7px] font-mono text-accent bg-accent/15 border border-accent/40 px-1 py-0.5 rounded">
{descendantCount}
</span>
)}
@@ -423,7 +423,7 @@ function TeamMemberChip({
e.stopPropagation();
onExtract(node.id);
}}
className="opacity-0 group-hover/child:opacity-100 text-ink-soft hover:text-sky-400 transition-all focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none rounded"
className="opacity-0 group-hover/child:opacity-100 text-ink-mid hover:text-accent transition-all focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none rounded"
>
<EjectIcon aria-hidden="true" />
</button>
@@ -432,7 +432,7 @@ function TeamMemberChip({
{/* Role */}
{data.role && (
<div className="text-[10px] text-ink-soft mb-1 leading-tight truncate">{data.role}</div>
<div className="text-[10px] text-ink-mid mb-1 leading-tight truncate">{data.role}</div>
)}
{/* Skills */}
@@ -443,8 +443,8 @@ function TeamMemberChip({
key={skill}
className={`text-[10px] px-1 py-0.5 rounded border ${
isOnline
? "text-good/70 bg-emerald-950/20 border-emerald-800/20"
: "text-ink-soft bg-surface-card/40 border-line/30"
? "text-good bg-good/15 border-good/40"
: "text-ink-mid bg-surface-card border-line"
}`}
>
{skill}
@@ -462,8 +462,8 @@ function TeamMemberChip({
<span className={`text-[10px] uppercase tracking-widest font-medium ${
data.status === "failed" ? "text-bad" :
data.status === "degraded" ? "text-warm" :
data.status === "provisioning" ? "text-sky-400" :
"text-ink-soft"
data.status === "provisioning" ? "text-accent" :
"text-ink-mid"
}`}>
{statusCfg.label}
</span>
@@ -182,7 +182,7 @@ export function OrgTokensTab() {
{/* Token list */}
{loading ? (
<div className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<Spinner /> Loading keys...
</div>
) : tokens.length === 0 ? (
+1 -1
@@ -129,7 +129,7 @@ export function TokensTab({ workspaceId }: TokensTabProps) {
{/* Token list */}
{loading ? (
<div className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<Spinner /> Loading tokens...
</div>
) : tokens.length === 0 ? (
+5 -5
@@ -773,14 +773,14 @@ function MyChatPanel({ workspaceId, data }: Props) {
<div
className={`max-w-[85%] rounded-lg px-3 py-2 text-xs ${
msg.role === "user"
? "bg-accent-strong/30 text-blue-100 border border-accent/20"
? "bg-accent text-white border border-accent-strong"
: msg.role === "system"
? "bg-red-900/30 text-red-200 border border-red-800/30"
: "bg-surface-card/80 text-ink border border-line/30"
? "bg-bad/10 text-bad border border-bad/40"
: "bg-surface-card text-ink border border-line"
}`}
>
{msg.content && (
<div className="prose prose-sm prose-invert max-w-none [&>p]:mb-1 [&>p:last-child]:mb-0">
<div className={`prose prose-sm max-w-none [&>p]:mb-1 [&>p:last-child]:mb-0 ${msg.role === "user" ? "prose-invert" : ""}`}>
<ReactMarkdown remarkPlugins={[remarkGfm]}>{msg.content}</ReactMarkdown>
</div>
)}
@@ -796,7 +796,7 @@ function MyChatPanel({ workspaceId, data }: Props) {
))}
</div>
)}
<div className="text-[9px] text-ink-soft mt-1">
<div className={`text-[9px] mt-1 ${msg.role === "user" ? "text-white/70" : "text-ink-mid"}`}>
{new Date(msg.timestamp).toLocaleTimeString()}
</div>
</div>
+2 -1
@@ -655,7 +655,8 @@ export function ConfigTab({ workspaceId }: Props) {
>
<option value={1}>T1 Sandboxed</option>
<option value={2}>T2 Standard</option>
<option value={3}>T3 Full Access</option>
<option value={3}>T3 Privileged</option>
<option value={4}>T4 Full Access</option>
</select>
</div>
</div>
+4 -4
@@ -12,10 +12,10 @@ export function statusDotClass(status: string): string {
}
export const TIER_CONFIG: Record<number, { label: string; color: string; border: string }> = {
1: { label: "T1", color: "text-ink-soft bg-surface-card/80", border: "text-ink-mid border-line/60" },
2: { label: "T2", color: "text-sky-400 bg-sky-950/50", border: "text-sky-400 border-sky-500/30" },
3: { label: "T3", color: "text-violet-400 bg-violet-950/50", border: "text-violet-400 border-violet-500/30" },
4: { label: "T4", color: "text-warm bg-amber-950/50", border: "text-warm border-amber-500/30" },
1: { label: "T1", color: "text-ink-mid bg-surface-card border border-line", border: "text-ink-mid border-line" },
2: { label: "T2", color: "text-white bg-accent border border-accent-strong", border: "text-accent border-accent" },
3: { label: "T3", color: "text-white bg-violet-600 border border-violet-700", border: "text-violet-600 border-violet-500" },
4: { label: "T4", color: "text-white bg-warm border border-warm", border: "text-warm border-warm" },
};
export const COMM_TYPE_LABELS: Record<string, string> = {
+2 -2
@@ -59,8 +59,8 @@ export function getTenantSlug(): string {
* isSaaSTenant reports whether the canvas is running as the UI for a
* SaaS tenant (served at <slug>.moleculesai.app). Use for client-side
* UX branches that should behave differently on SaaS vs self-hosted —
* e.g. the workspace tier picker hides T1/T2 sandbox tiers because every
* SaaS workspace gets its own EC2 VM (inherently T3 Full Access).
* e.g. the workspace tier picker hides T1/T2/T3 sandbox tiers because
* every SaaS workspace gets its own EC2 VM (inherently T4 Full Access).
*
* SSR-safe: returns false on the server to avoid hydration drift; call
* sites should tolerate a flip from false→true on first client render.
+6 -1
@@ -28,7 +28,12 @@
{"name": "claude-code-default", "repo": "Molecule-AI/molecule-ai-workspace-template-claude-code", "ref": "main"},
{"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"},
{"name": "langgraph", "repo": "Molecule-AI/molecule-ai-workspace-template-langgraph", "ref": "main"},
{"name": "crewai", "repo": "Molecule-AI/molecule-ai-workspace-template-crewai", "ref": "main"},
{"name": "autogen", "repo": "Molecule-AI/molecule-ai-workspace-template-autogen", "ref": "main"},
{"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
{"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"}
],
"org_templates": [
{"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},
+36 -9
@@ -352,15 +352,42 @@ print(json.dumps({
")
fi
# Model slug MUST be provider-prefixed for hermes: the template's
# derive-provider.sh parses the slug prefix (`openai/…`, `anthropic/…`,
# `minimax/…`) to set HERMES_INFERENCE_PROVIDER at install time. A bare
# "gpt-4o" has no prefix → provider falls back to hermes auto-detect →
# picks Anthropic default → tries Anthropic API with the OpenAI key →
# 401 on A2A. Same trap that trapped prod users in PR #1714. We pin
# "openai/gpt-4o" here because the E2E's secret is always the OpenAI
# key; non-hermes runtimes ignore the prefix.
MODEL_SLUG="openai/gpt-4o"
# Model slug format depends on the runtime — different model resolvers
# parse it differently:
#
# hermes → "openai/gpt-4o" (slash-form: derive-provider.sh splits
# on the prefix to set
# HERMES_INFERENCE_PROVIDER. Bare
# "gpt-4o" falls through to Anthropic
# default + 401, see PR #1714.)
#
# langgraph → "openai:gpt-4o" (colon-form: langchain init_chat_model
# requires "<provider>:<model>".
# Slash-form was misinterpreted as
# OpenRouter routing → fell through
# without auth. Surfaced 2026-05-03
# after the a2a-sdk v1 contract-bug
# PRs (#2558, #2563, #2567) cleared
# the masking layers.)
#
# claude-code → "sonnet" (entry-id form: claude-code template's
# config.yaml uses bare model names,
# auth comes via CLAUDE_CODE_OAUTH_TOKEN
# or ANTHROPIC_API_KEY rather than the
# slug.)
#
# When E2E_MODEL_SLUG is set, it overrides this dispatch — useful when an
# operator dispatches the workflow to test a specific slug.
if [ -n "${E2E_MODEL_SLUG:-}" ]; then
MODEL_SLUG="$E2E_MODEL_SLUG"
else
case "$RUNTIME" in
hermes) MODEL_SLUG="openai/gpt-4o" ;;
langgraph) MODEL_SLUG="openai:gpt-4o" ;;
claude-code) MODEL_SLUG="sonnet" ;;
*) MODEL_SLUG="openai/gpt-4o" ;; # safest fallback (matches hermes)
esac
fi
log "5/11 Provisioning parent workspace (runtime=$RUNTIME)..."
PARENT_RESP=$(tenant_call POST /workspaces \
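For readers who want to reason about the slug dispatch outside the shell script, the same selection logic can be restated in Python. This is an illustrative sketch only: the real implementation is the `case` statement above, and the names `resolve_model_slug`, `_SLUG_BY_RUNTIME`, and `_FALLBACK_SLUG` are hypothetical.

```python
from typing import Mapping

# Slug format per runtime, copied from the case statement in the script.
_SLUG_BY_RUNTIME = {
    "hermes": "openai/gpt-4o",     # slash-form
    "langgraph": "openai:gpt-4o",  # colon-form
    "claude-code": "sonnet",       # bare entry id
}
# Unknown runtimes get the same fallback the script uses (matches hermes).
_FALLBACK_SLUG = "openai/gpt-4o"


def resolve_model_slug(runtime: str, env: Mapping[str, str]) -> str:
    """Pick the model slug for a runtime, honouring an operator override."""
    # A non-empty E2E_MODEL_SLUG wins, mirroring the `[ -n ... ]` check.
    override = env.get("E2E_MODEL_SLUG", "")
    if override:
        return override
    return _SLUG_BY_RUNTIME.get(runtime, _FALLBACK_SLUG)
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the sketch easy to exercise with different overrides.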
+29 -4
@@ -162,6 +162,31 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
_CHANNEL_NOTIFICATION_METHOD = "notifications/claude/channel"
# ============= Trust-boundary gates for channel-notification meta ==============
_VALID_KINDS = frozenset({"canvas_user", "peer_agent"})
_VALID_METHODS = frozenset({"message/send", "tasks/send", "tasks/get", "notify", ""})
import re as _re
_ACTIVITY_ID_RE = _re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
_ISO8601_RE = _re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})$")
def _safe_meta_field(value, allowlist) -> str:
return value if value in allowlist else ""
def _safe_activity_id(value) -> str:
if not isinstance(value, str):
return ""
return value if _ACTIVITY_ID_RE.match(value) else ""
def _safe_ts(value) -> str:
if not isinstance(value, str):
return ""
return value if _ISO8601_RE.match(value) else ""
# Default seconds the agent should block on `wait_for_message` per
# turn. 2s is the cost/latency knee — long enough that a peer A2A
# landing 0-2s before the agent starts its turn is caught, short
@@ -402,11 +427,11 @@ def _build_channel_notification(msg: dict) -> dict:
"""
meta = {
"source": "molecule",
"kind": msg.get("kind", ""),
"kind": _safe_meta_field(msg.get("kind", ""), _VALID_KINDS),
"peer_id": msg.get("peer_id", ""),
"method": msg.get("method", ""),
"activity_id": msg.get("activity_id", ""),
"ts": msg.get("created_at", ""),
"method": _safe_meta_field(msg.get("method", ""), _VALID_METHODS),
"activity_id": _safe_activity_id(msg.get("activity_id", "")),
"ts": _safe_ts(msg.get("created_at", "")),
}
peer_id = msg.get("peer_id") or ""
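The gates above can be tried in isolation with the standalone sketch below. It is not the author's code and differs in three ways, all of them assumptions on my part:

- It uses `fullmatch` rather than `match` with a `$` anchor, because in Python `$` also matches just before a trailing newline, so a UUID followed by `"\n"` would otherwise pass.
- It type-checks the value before the set-membership test, because an unhashable value such as a list raises `TypeError` in `value in frozenset`.
- It compiles the patterns with `re.ASCII` so that `\d` means `0-9` only.

The names `strict_enum` and `strict_shape` are hypothetical.

```python
import re

VALID_KINDS = frozenset({"canvas_user", "peer_agent"})
VALID_METHODS = frozenset({"message/send", "tasks/send", "tasks/get", "notify", ""})

# No ^/$ anchors needed: fullmatch() requires the whole string to match.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.ASCII
)
RFC3339_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})", re.ASCII
)


def strict_enum(value, allowlist) -> str:
    """Return value only if it is a string in the closed allowlist, else ""."""
    if not isinstance(value, str):
        return ""  # also avoids TypeError on unhashable inputs
    return value if value in allowlist else ""


def strict_shape(value, pattern) -> str:
    """Return value only if the entire string matches pattern, else ""."""
    if not isinstance(value, str):
        return ""
    return value if pattern.fullmatch(value) else ""
```

Whether the extra strictness matters depends on where the rows come from. With values sourced from a platform-DB column it may never trigger, but it keeps the gates independent of that assumption.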
+4 -3
@@ -559,9 +559,10 @@ async def tool_chat_history(peer_id: str, limit: int = 20, before_ts: str = "")
Hits ``/workspaces/<self>/activity?peer_id=<peer>&limit=<N>``
against the workspace-server, which returns activity rows where
this workspace is either the sender (``source_id=peer``) or the
recipient (``target_id=peer``) of an A2A turn — both sides of the
conversation in chronological order.
the peer is either the sender (``source_id=peer`` — they sent us
the message) or the recipient (``target_id=peer`` — we sent to
them) of an A2A turn — both sides of the conversation in
chronological order.
Args:
peer_id: The other workspace's UUID. Same value the agent
+150 -4
@@ -196,7 +196,11 @@ def test_build_channel_notification_meta_carries_routing_fields():
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"activity_id": "act-7",
# Production-shape UUID — required by the trust-boundary gate
# in _safe_activity_id (#2488). Synthetic ids like "act-7" used
# to pass through but get stripped now; updating to a real-shape
# UUID matches what activity_logs.id actually emits.
"activity_id": "aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee",
"text": "ping",
"peer_id": "11111111-2222-3333-4444-555555555555",
"kind": "peer_agent",
@@ -209,7 +213,7 @@ def test_build_channel_notification_meta_carries_routing_fields():
assert meta["kind"] == "peer_agent"
assert meta["peer_id"] == "11111111-2222-3333-4444-555555555555"
assert meta["method"] == "message/send"
assert meta["activity_id"] == "act-7"
assert meta["activity_id"] == "aaaaaaaa-bbbb-4ccc-8ddd-eeeeeeeeeeee"
assert meta["ts"] == "2026-05-01T01:23:45Z"
@@ -462,6 +466,68 @@ def test_envelope_enrichment_negative_caches_network_exception(_reset_peer_metad
assert cached[1] is None
def test_envelope_enrichment_negative_caches_non_json_200(_reset_peer_metadata_cache):
"""HTTP 200 but the body isn't JSON (registry returns HTML, an empty
string, or a partial response): ``response.json()`` raises. The
enrichment block must absorb the exception, write the negative-cache
entry, and never re-fetch this peer until TTL elapses.
Without this contract a registry that mistakenly returns a non-JSON
200 (proxy injecting an HTML error page; partial response from a
flapping pod) would re-fire the 2s-bounded GET on every push for
that peer — same DoS-on-self pattern the 5xx negative-cache test
pins. #2483.
"""
import json as _json
import a2a_client
from a2a_mcp_server import _build_channel_notification
# 200 OK shape but .json() raises. side_effect overrides the
# _make_httpx_response default of `return_value` so the helper can
# stay shape-stable for callers that DO want a JSON body.
resp = _make_httpx_response(200, {})
resp.json.side_effect = _json.JSONDecodeError("not json", "<html>", 0)
p, client = _patch_httpx_client(resp)
with p:
_build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "first"})
_build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "second"})
assert client.get.call_count == 1, (
f"non-JSON 200 must be negative-cached, got {client.get.call_count} GETs"
)
cached = a2a_client._peer_metadata[_PEER_UUID]
assert cached[1] is None, "negative cache stores None as the record"
def test_envelope_enrichment_negative_caches_non_dict_json_200(_reset_peer_metadata_cache):
"""HTTP 200, valid JSON, but the body is a list / string / number /
null instead of the expected dict. ``isinstance(record, dict)``
skips enrichment but the call must still write to the negative
cache so a second push doesn't re-fetch.
Pins behaviour for a registry that mistakenly returns
``[{"id": ...}, ...]`` (collection shape) or just ``null`` (no-record
sentinel) — both should land at the same negative-cache outcome as a
5xx or a non-JSON 200. #2483.
"""
import a2a_client
from a2a_mcp_server import _build_channel_notification
p, client = _patch_httpx_client(
_make_httpx_response(200, ["not", "a", "dict"]),
)
with p:
_build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "first"})
_build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "second"})
assert client.get.call_count == 1, (
f"non-dict JSON 200 must be negative-cached, got {client.get.call_count} GETs"
)
cached = a2a_client._peer_metadata[_PEER_UUID]
assert cached[1] is None, "negative cache stores None as the record"
def test_envelope_enrichment_re_fetches_after_ttl(_reset_peer_metadata_cache):
"""Cached entry past TTL: registry is hit again. Pin the TTL
behaviour so a future caller bumping ``_PEER_METADATA_TTL_SECONDS``
@@ -560,6 +626,85 @@ def test_envelope_enrichment_strips_path_traversal_peer_id(_reset_peer_metadata_
)
def test_envelope_strips_unknown_kind(_reset_peer_metadata_cache):
"""Trust-boundary: ``kind`` is rendered as an XML attr in the
agent's <channel> tag. Any value outside the closed set
{canvas_user, peer_agent} is replaced with empty so an attacker
landing ``kind=canvas_user' onclick='alert(1)`` into the inbox row
can't reflect raw into the agent's context. #2488.
"""
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"kind": "canvas_user' onclick='alert(1)",
"text": "x",
})
assert payload["params"]["meta"]["kind"] == ""
def test_envelope_strips_unknown_method(_reset_peer_metadata_cache):
"""Trust-boundary: ``method`` is rendered as an XML attr. Closed
allowlist {message/send, tasks/send, tasks/get, notify, ""}; an
upstream row with attacker-controlled method gets stripped. #2488.
"""
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"method": "tasks/send\"><script>alert(1)</script>",
"text": "x",
})
assert payload["params"]["meta"]["method"] == ""
def test_envelope_strips_malformed_activity_id(_reset_peer_metadata_cache):
"""Trust-boundary: ``activity_id`` must match UUID shape. A row
with non-UUID activity_id (path-traversal chars, embedded XML
quotes, stray newlines) gets stripped. #2488.
"""
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"activity_id": "../../../etc/passwd",
"text": "x",
})
assert payload["params"]["meta"]["activity_id"] == ""
def test_envelope_strips_malformed_ts(_reset_peer_metadata_cache):
"""Trust-boundary: ``ts`` must match ISO-8601 RFC3339. A row
with attacker-controlled created_at (e.g. ``2026-05-01' onload='x``
or unparseable garbage) gets stripped to empty. #2488.
"""
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"created_at": "2026-05-01' onload='alert(1)",
"text": "x",
})
assert payload["params"]["meta"]["ts"] == ""
def test_envelope_keeps_valid_meta_fields_unchanged(_reset_peer_metadata_cache):
"""Negative case: properly-shaped values pass through unchanged.
Pin so a future tightening of the gates can't silently strip
legitimate row contents. #2488.
"""
from a2a_mcp_server import _build_channel_notification
payload = _build_channel_notification({
"kind": "canvas_user",
"method": "message/send",
"activity_id": "12345678-1234-1234-1234-123456789abc",
"created_at": "2026-05-01T12:34:56.789Z",
"text": "x",
})
meta = payload["params"]["meta"]
assert meta["kind"] == "canvas_user"
assert meta["method"] == "message/send"
assert meta["activity_id"] == "12345678-1234-1234-1234-123456789abc"
assert meta["ts"] == "2026-05-01T12:34:56.789Z"
# ============== initialize handshake — capability declaration ==============
# Without `experimental.claude/channel`, Claude Code's MCP client drops
# our notifications/claude/channel emissions instead of routing them as
@@ -909,7 +1054,8 @@ async def test_inbox_bridge_emits_channel_notification_to_writer():
cb = _setup_inbox_bridge(writer, loop)
msg = {
"activity_id": "act-bridge-test",
# Production-shape UUID per the trust-boundary gate (#2488)
"activity_id": "bbbbbbbb-cccc-4ddd-8eee-ffffffffffff",
"text": "hello from peer",
"peer_id": "11111111-2222-3333-4444-555555555555",
"kind": "peer_agent",
@@ -947,7 +1093,7 @@ async def test_inbox_bridge_emits_channel_notification_to_writer():
assert meta["source"] == "molecule"
assert meta["kind"] == "peer_agent"
assert meta["peer_id"] == "11111111-2222-3333-4444-555555555555"
assert meta["activity_id"] == "act-bridge-test"
assert meta["activity_id"] == "bbbbbbbb-cccc-4ddd-8eee-ffffffffffff"
assert meta["ts"] == "2026-05-01T22:00:00Z"
finally:
writer.close()
+21
@@ -1050,6 +1050,27 @@ class TestChatHistory:
assert mc.get.call_args.kwargs["params"]["before_ts"] == "2026-05-01T00:00:00Z"
async def test_empty_history_returns_empty_json_list(self):
"""Pin the happy-path-with-no-rows shape: server returns 200
with an empty list, the wheel returns the JSON literal ``"[]"``.
Without this pin the surrounding tests all pre-populate rows;
none verify what an agent sees when there's literally no chat
history with this peer yet (a fresh A2A peering, or a peer
whose history was rotated out). #2485.
"""
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
# Exact-equality on the JSON literal (per assert-exact memory) —
# substring "[]" would also match `{"items": []}` or any number
# of envelope shapes, only `result == "[]"` discriminates the
# bare-list contract callers depend on.
assert result == "[]"
async def test_reverses_desc_response_to_chronological(self):
"""Server returns DESC (newest first); the wheel reverses to
chronological so the agent reads the chat top-down — same