Merge pull request #2576 from Molecule-AI/staging

staging → main: auto-promote effbcd7
Merge pull request #2574 from Molecule-AI/auto-sync/main-55d85147
7 changed files with 83 additions and 27 deletions
@@ -327,13 +327,19 @@ jobs:
            echo "::error::publish job did not expose a version output — cascade cannot fan out"
            exit 1
          fi
-          # Source of truth: manifest.json workspace_templates (PR #2536 pruned
-          # to 4 actively-supported runtimes: claude-code, hermes, openclaw, codex).
-          # Removed langgraph/crewai/autogen/deepagents/gemini-cli (deprecated, no
-          # shipping images); added codex (had been missing since #2512).
-          # Long-term: derive this list from manifest.json so the cascade can't
-          # drift again — tracked in RFC #388 as a Phase-1 invariant.
-          TEMPLATES="claude-code hermes openclaw codex"
+          # All 9 active workspace template repos. The PR #2536 pruning
+          # ("deprecated, no shipping images") was empirically wrong:
+          # continuous-synth-e2e.yml defaults to langgraph as its primary
+          # canary (line 44), and every excluded template had successful
+          # publish-image runs as of 2026-05-03 — none were dormant.
+          # Symptom of the prune: today's a2a-sdk strict-mode fix
+          # (#2566 / commit e1628c4) cascaded to 4 templates but never
+          # reached langgraph, so the synth-E2E correctly canary'd a fix
+          # that had landed but not deployed. Re-added the 5 templates.
+          # Long-term: derive this list from manifest.json so cascade
+          # scope can't drift from E2E scope — tracked in RFC #388 as a
+          # Phase-1 invariant.
+          TEMPLATES="claude-code hermes openclaw codex langgraph crewai autogen deepagents gemini-cli"
          FAILED=""
          for tpl in $TEMPLATES; do
            REPO="Molecule-AI/molecule-ai-workspace-template-$tpl"
@@ -17,7 +17,7 @@ name: redeploy-tenants-on-main
 #   1. publish-workspace-server-image completes → new :latest in GHCR.
 #   2. This workflow fires via workflow_run, waits 30s for GHCR's
 #      CDN to propagate the new tag to the region the tenants pull from.
-#   3. Calls redeploy-fleet with canary_slug=hongmingwang and a 60s
+#   3. Calls redeploy-fleet with canary_slug=hongming and a 60s
 #      soak. Canary proves the image boots; batches follow.
 #   4. Any failure aborts the rollout and leaves older tenants on the
 #      prior image — safer default than half-and-half state.
@@ -56,7 +56,12 @@ on:
        description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately).'
        required: false
        type: string
-        default: 'hongmingwang'
+        # Must be an actual prod tenant slug (current: hongming,
+        # chloe-dong, reno-stars). The previous default 'hongmingwang'
+        # didn't match any tenant — CP soft-skipped the missing canary
+        # and the fleet rolled out without the soak gate, defeating the
+        # whole point of canary-first.
+        default: 'hongming'
      soak_seconds:
        description: 'Seconds to wait after canary before fanning out.'
        required: false
@@ -148,7 +153,7 @@ jobs:
          CP_URL: ${{ vars.CP_URL || 'https://api.moleculesai.app' }}
          CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
          TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
-          CANARY_SLUG: ${{ inputs.canary_slug || 'hongmingwang' }}
+          CANARY_SLUG: ${{ inputs.canary_slug || 'hongming' }}
          SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
          BATCH_SIZE: ${{ inputs.batch_size || '3' }}
          DRY_RUN: ${{ inputs.dry_run || false }}
@@ -1,11 +1,23 @@
 "use client";

 import { useEffect, useState } from "react";
-import { STATUS_CONFIG } from "@/lib/design-tokens";
+import { STATUS_CONFIG, TIER_CONFIG } from "@/lib/design-tokens";
 import { useCanvasStore } from "@/store/canvas";

 const LEGEND_STATUSES = ["online", "provisioning", "degraded", "failed", "paused", "offline"] as const;

+// Tier descriptions kept in sync with CreateWorkspaceDialog.tsx (the
+// source of truth for what each tier means semantically). Colors come
+// from TIER_CONFIG so the legend swatch matches the badge actually
+// rendered on every WorkspaceNode — drift here misled users into
+// thinking the legend documented a different tier than the one shown.
+const LEGEND_TIERS: ReadonlyArray<{ tier: number; label: string }> = [
+  { tier: 1, label: "Sandboxed" },
+  { tier: 2, label: "Standard" },
+  { tier: 3, label: "Privileged" },
+  { tier: 4, label: "Full Access" },
+];
+
 // Persist the user's choice across sessions. Default is "open" so
 // first-time users still see the symbol key; once dismissed we
 // respect that until they explicitly reopen via the floating pill.
@@ -102,9 +114,9 @@ export function Legend() {
      <div className="mb-2">
        <div className="text-[11px] text-ink-soft font-medium mb-1">Tier</div>
        <div className="flex flex-wrap gap-x-3 gap-y-1">
-          <TierItem tier={1} label="Sandboxed" color="text-sky-300 bg-sky-950/40 border-sky-700/30" />
-          <TierItem tier={2} label="Standard" color="text-violet-300 bg-violet-950/40 border-violet-700/30" />
-          <TierItem tier={3} label="Full Access" color="text-warm bg-amber-950/40 border-amber-700/30" />
+          {LEGEND_TIERS.map(({ tier, label }) => (
+            <TierItem key={tier} tier={tier} label={label} color={TIER_CONFIG[tier].border} />
+          ))}
        </div>
      </div>

@@ -655,7 +655,8 @@ export function ConfigTab({ workspaceId }: Props) {
                >
                  <option value={1}>T1 — Sandboxed</option>
                  <option value={2}>T2 — Standard</option>
-                  <option value={3}>T3 — Full Access</option>
+                  <option value={3}>T3 — Privileged</option>
+                  <option value={4}>T4 — Full Access</option>
                </select>
              </div>
            </div>
@@ -59,8 +59,8 @@ export function getTenantSlug(): string {
 * isSaaSTenant reports whether the canvas is running as the UI for a
 * SaaS tenant (served at <slug>.moleculesai.app). Use for client-side
 * UX branches that should behave differently on SaaS vs self-hosted —
- * e.g. the workspace tier picker hides T1/T2 sandbox tiers because every
- * SaaS workspace gets its own EC2 VM (inherently T3 Full Access).
+ * e.g. the workspace tier picker hides T1/T2/T3 sandbox tiers because
+ * every SaaS workspace gets its own EC2 VM (inherently T4 Full Access).
 *
 * SSR-safe: returns false on the server to avoid hydration drift; call
 * sites should tolerate a flip from false→true on first client render.
@@ -28,7 +28,12 @@
    {"name": "claude-code-default", "repo": "Molecule-AI/molecule-ai-workspace-template-claude-code", "ref": "main"},
    {"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
    {"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
-    {"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
+    {"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"},
+    {"name": "langgraph", "repo": "Molecule-AI/molecule-ai-workspace-template-langgraph", "ref": "main"},
+    {"name": "crewai", "repo": "Molecule-AI/molecule-ai-workspace-template-crewai", "ref": "main"},
+    {"name": "autogen", "repo": "Molecule-AI/molecule-ai-workspace-template-autogen", "ref": "main"},
+    {"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
+    {"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"}
  ],
  "org_templates": [
    {"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},
@@ -352,15 +352,42 @@ print(json.dumps({
 ")
 fi

-# Model slug MUST be provider-prefixed for hermes — the template's
-# derive-provider.sh parses the slug prefix (`openai/…`, `anthropic/…`,
-# `minimax/…`) to set HERMES_INFERENCE_PROVIDER at install time. A bare
-# "gpt-4o" has no prefix → provider falls back to hermes auto-detect →
-# picks Anthropic default → tries Anthropic API with the OpenAI key →
-# 401 on A2A. Same trap that trapped prod users in PR #1714. We pin
-# "openai/gpt-4o" here because the E2E's secret is always the OpenAI
-# key; non-hermes runtimes ignore the prefix.
-MODEL_SLUG="openai/gpt-4o"
+# Model slug format depends on the runtime — different model resolvers
+# parse it differently:
+#
+#   hermes      → "openai/gpt-4o"  (slash-form: derive-provider.sh splits
+#                                    on the prefix to set
+#                                    HERMES_INFERENCE_PROVIDER. Bare
+#                                    "gpt-4o" falls through to Anthropic
+#                                    default + 401, see PR #1714.)
+#
+#   langgraph   → "openai:gpt-4o"  (colon-form: langchain init_chat_model
+#                                    requires "<provider>:<model>".
+#                                    Slash-form was misinterpreted as
+#                                    OpenRouter routing → fell through
+#                                    without auth, surfaced 2026-05-03
+#                                    after the a2a-sdk v1 contract bugs
+#                                    PR #2558+#2563+#2567 cleared the
+#                                    masking layers.)
+#
+#   claude-code → "sonnet"         (entry-id form: claude-code template's
+#                                    config.yaml uses bare model names,
+#                                    auth comes via CLAUDE_CODE_OAUTH_TOKEN
+#                                    or ANTHROPIC_API_KEY rather than the
+#                                    slug.)
+#
+# When E2E_MODEL_SLUG is set, it overrides this dispatch — useful when an
+# operator dispatches the workflow to test a specific slug.
+if [ -n "${E2E_MODEL_SLUG:-}" ]; then
+  MODEL_SLUG="$E2E_MODEL_SLUG"
+else
+  case "$RUNTIME" in
+    hermes)      MODEL_SLUG="openai/gpt-4o" ;;
+    langgraph)   MODEL_SLUG="openai:gpt-4o" ;;
+    claude-code) MODEL_SLUG="sonnet" ;;
+    *)           MODEL_SLUG="openai/gpt-4o" ;;  # safest fallback (matches hermes)
+  esac
+fi

 log "5/11 Provisioning parent workspace (runtime=$RUNTIME)..."
 PARENT_RESP=$(tenant_call POST /workspaces \
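
Note on the MODEL_SLUG hunk above: the per-runtime dispatch can be read as a small pure function, which makes the override-then-fallback order easier to check in isolation. The sketch below is illustrative only. The function name `resolve_model_slug` and the two-argument calling convention are assumptions and do not appear in the PR; the slug values are the ones in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name and signature are assumptions, not part of the PR).
# Usage: resolve_model_slug RUNTIME [OVERRIDE]
# Prints the model slug the E2E script would pick for RUNTIME.
resolve_model_slug() {
  local runtime="$1"
  local override="${2:-}"

  # An explicit override (E2E_MODEL_SLUG in the diff) wins over the dispatch.
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
    return 0
  fi

  case "$runtime" in
    hermes)      printf '%s\n' "openai/gpt-4o" ;;  # slash-form
    langgraph)   printf '%s\n' "openai:gpt-4o" ;;  # colon-form
    claude-code) printf '%s\n' "sonnet" ;;         # bare entry id
    *)           printf '%s\n' "openai/gpt-4o" ;;  # fallback, same as hermes
  esac
}

# Show what each runtime resolves to; "crewai" exercises the fallback arm.
for rt in hermes langgraph claude-code crewai; do
  printf '%-12s -> %s\n' "$rt" "$(resolve_model_slug "$rt")"
done
```

In the script itself the override would be passed as `"${E2E_MODEL_SLUG:-}"`. Runtimes without their own arm (crewai, autogen, deepagents, gemini-cli) all take the slash-form fallback, so whether that form suits their resolvers is not established by this diff.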
Author	SHA1	Message	Date
Hongming Wang	b002247f12	Merge pull request #2576 from Molecule-AI/staging staging → main: auto-promote `effbcd7`	2026-05-03 06:44:31 -07:00
Hongming Wang	03bcce3eb3	Merge pull request #2574 from Molecule-AI/auto-sync/main-55d85147 chore: sync main → staging (auto, ff to `55d85147`)	2026-05-03 13:18:34 +00:00
Hongming Wang	c74e71d604	Merge branch 'staging' into auto-sync/main-55d85147	2026-05-03 06:07:20 -07:00
Hongming Wang	d7f88674d8	Merge pull request #2577 from Molecule-AI/fix/canvas-tier-legend-t3-t4-contract fix(canvas): align tier text contracts with 4-tier reality (T1/T2/T3/T4)	2026-05-03 12:58:52 +00:00
Hongming Wang	7abb94dab8	fix(canvas): align tier text contracts with 4-tier reality (T1/T2/T3/T4) The tier system in CreateWorkspaceDialog and design-tokens has been T1 Sandboxed / T2 Standard / T3 Privileged / T4 Full Access, but two chrome surfaces still showed the older 3-tier mapping with T3 as "Full Access": - Legend (bottom-left chrome on every canvas page) listed only T1/T2/T3 and called T3 "Full Access". On a SaaS tenant the actual workspace badges render T4 (in amber/warm) — there was no T4 entry in the legend at all, so the user sees an undocumented orange badge. - ConfigTab tier dropdown (per-workspace settings → Sandboxing) had no T4 option at all and called T3 "Full Access". So an existing T4 workspace would show "T3 — Full Access" as the selected option, silently downgrading the displayed tier on the settings panel. - tenant.ts isSaaSTenant() doc comment claimed SaaS workspaces are "inherently T3 Full Access" — wrong on both the number and the lock rationale (SaaS hides T1/T2/T3, not just T1/T2). Fix: - Legend now imports TIER_CONFIG and renders all four tiers (Sandboxed/Standard/Privileged/Full Access) using the same color swatches as the badges on workspace cards. Eliminates the previous drift where Legend's hardcoded sky/violet/warm chips didn't match the gray/sky/violet/amber actually rendered on nodes. - ConfigTab adds the missing T4 — Full Access option and renames T3 to Privileged. - tenant.ts comment updated to match the picker's actual hide list. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:56:18 -07:00
Hongming Wang	effbcd737b	Merge pull request #2575 from Molecule-AI/fix/cascade-include-all-active-templates fix(publish-runtime): re-add 5 templates wrongly removed from cascade — fixes #2566	2026-05-03 12:45:48 +00:00
Hongming Wang	6eb79adfd5	manifest: re-add 5 workspace templates pruned by #2536 The cascade-list-vs-manifest drift gate (PR #2556's behavior-based test) caught my previous-commit cascade additions as 'extra-in-cascade'. Manifest is the source of truth — restoring there. All 5 templates have successful publish-image runs in the past 24h (verified before the cascade fix), and continuous-synth-e2e defaults to langgraph as its primary canary. None deprecated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:43:07 -07:00
Hongming Wang	8f48a38550	fix(publish-runtime): re-add 5 templates wrongly removed from cascade (#2566) The PR #2536 cascade prune ('deprecated, no shipping images') was empirically wrong. Re-confirmed 2026-05-03: - continuous-synth-e2e.yml defaults to langgraph as its primary canary - All 5 'deprecated' templates have successful publish-image runs in the past 24h: langgraph, crewai, autogen, deepagents, gemini-cli Symptom this fixes — issue #2566 (priority-high, failing 36+h): Synthetic E2E (staging): langgraph adapter A2A failure 'Received Message object in task mode' — failing for >36h Today at 11:06 commit `e1628c4` fixed the underlying a2a-sdk strict-mode issue in workspace/a2a_executor.py. publish-runtime fired at 11:13 and cascaded — but only to claude-code, hermes, openclaw, codex. langgraph was excluded by the prune, so its image stayed on the broken runtime and the synth E2E (which defaults to langgraph) kept failing despite the fix being live in PyPI. After this lands + the next runtime publish fires, langgraph image re-bakes with the fix and synth-E2E goes green. Test plan: - [x] yaml-validate the workflow - [ ] After merge, watch publish-runtime cascade to all 9 templates - [ ] Confirm langgraph publish-image fires + succeeds - [ ] Confirm next continuous-synth-e2e run goes green Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:41:53 -07:00
github-actions[bot]	55d85147f7	Merge pull request #2573 from Molecule-AI/staging staging → main: auto-promote `dc6425f`	2026-05-03 05:34:23 -07:00
github-actions[bot]	f7e8f98cf7	Merge pull request #2570 from Molecule-AI/staging staging → main: auto-promote `173e22e`	2026-05-03 12:22:52 +00:00
Hongming Wang	dc6425fe39	Merge pull request #2571 from Molecule-AI/fix/synth-e2e-model-slug-by-runtime fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form	2026-05-03 12:22:19 +00:00
Hongming Wang	cbc69f5e7e	fix(synth-e2e): branch MODEL_SLUG by runtime so langgraph gets colon-form The original script hardcoded `MODEL_SLUG="openai/gpt-4o"` (slash) and claimed "non-hermes runtimes ignore the prefix" — wrong for langgraph, which delegates model resolution to langchain's `init_chat_model`. That function requires `<provider>:<model>` (colon) and treats slash-form as OpenRouter routing, falling through without auth even when OPENAI_API_KEY is set. Surfaced 2026-05-03 after the a2a-sdk v1 contract bugs (PR #2558+#2563+#2567) cleared the masking layers — synth-E2E firing 2026-05-03T12:14 returned a properly-shaped task with state=failed + "Could not resolve authentication method" inside the agent body. continuous-synth-e2e.yml defaults E2E_RUNTIME=langgraph for the cron, so every firing hit this. Hermes still gets the slash-form it needs; claude-code uses the entry-id pattern. Adds E2E_MODEL_SLUG override for operator-dispatched runs that want to pin a specific slug. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:17:55 -07:00
Hongming Wang	c71f641b12	Merge pull request #2569 from Molecule-AI/fix/redeploy-canary-default ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming'	2026-05-03 12:08:26 +00:00
Hongming Wang	173e22e091	Merge pull request #2568 from Molecule-AI/auto-sync/main-c0838d63 chore: sync main → staging (auto, ff to `c0838d63`)	2026-05-03 12:07:29 +00:00
Hongming Wang	60a516bc8d	ci(redeploy): fix stale canary_slug default 'hongmingwang' → 'hongming' The workflow_dispatch input default and the workflow_run env fallback both pointed at 'hongmingwang', which doesn't match any current prod tenant (slugs are: hongming, chloe-dong, reno-stars). CP silently skipped the missing canary and put every tenant in batch-1 in parallel, defeating the canary-first soak gate that exists to catch image-boot regressions before they hit the whole fleet. Concrete example from today's `c0838d6` redeploy at 11:53Z (run 25278434388): the dispatched body was `{"target_tag":"staging-c0838d6","canary_slug":"hongmingwang",...}` and the CP response showed all 3 tenants in `"phase":"batch-1"` — no soak, no canary. The deploy happened to be safe, but a broken image would have hit hongming + chloe-dong + reno-stars simultaneously. Fixed in three places: the runtime ordering comment, the workflow_dispatch default, and the env fallback used by the workflow_run trigger. Comment documents the rationale so the next slug rename doesn't silently regress this again. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 05:06:01 -07:00
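
Note on the canary fix (60a516bc8d): the commit message describes the failure as a silent one, where a canary slug that matches no tenant was soft-skipped and every tenant landed in batch-1. One way to make that failure loud is a pre-flight check that rejects an unknown slug before the rollout is dispatched. The following is a minimal sketch under stated assumptions, not something the PR adds. The function name `canary_slug_is_known` is hypothetical, and the tenant list is hardcoded from the commit message, whereas a real guard would need to fetch the live list from the control plane.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight guard (name and calling convention are assumptions).
# Usage: canary_slug_is_known SLUG TENANT_SLUG...
# Returns 0 if SLUG is empty or matches one of the tenant slugs, 1 otherwise.
canary_slug_is_known() {
  local canary="$1"
  shift

  # The workflow input documents empty as "skip canary, fan out immediately",
  # so an empty slug is a deliberate choice and not a missing tenant.
  if [ -z "$canary" ]; then
    return 0
  fi

  local slug
  for slug in "$@"; do
    if [ "$slug" = "$canary" ]; then
      return 0
    fi
  done
  return 1
}

# Tenant slugs quoted in the commit message; hardcoded here for illustration.
TENANTS="hongming chloe-dong reno-stars"

for candidate in hongming hongmingwang; do
  # Word splitting of $TENANTS is intentional: one argument per slug.
  if canary_slug_is_known "$candidate" $TENANTS; then
    echo "canary '$candidate': known tenant"
  else
    echo "canary '$candidate': not a known tenant, refusing to fan out"
  fi
done
```

A guard of this shape would turn a stale default into a failed workflow step instead of a rollout without a soak, which is the regression the commit's inline comment is meant to prevent.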