forked from molecule-ai/molecule-core
Compare commits
62 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 120bb1f0a2 | | |
| | cfd5ec8d82 | | |
| | a4a32cded5 | | |
| | 257079c7a2 | | |
| | 0567502316 | | |
| | 7cba0477cc | | |
| | ff3dcd37f6 | | |
| | 4e72f1d1db | | |
| | e22f7969f8 | | |
| | 3d145da99d | | |
| | 46c8c1de23 | | |
| | 6d38b96043 | | |
| | 270a95aa67 | | |
| | 6431bdc631 | | |
| | 72b6be82b0 | | |
| | b42599585e | | |
| | 06bfed2e35 | | |
| | 80b38900de | | |
| | d1eab79d28 | | |
| | 824a2a7657 | | |
| | 876d6ec8c9 | | |
| | 63e3d385d6 | | |
| | 2e78812ff9 | | |
| | 9664d66e4b | | |
| | 19cc83313a | | |
| | 097d513b65 | | |
| | 2b3f44c3c8 | | |
| | c45aa8d7ee | | |
| | b4e45374bf | | |
| | f2d69f0088 | | |
| | bc11ed8a2b | | |
| | e2328abedc | | |
| | bdad75ae3e | | |
| | 90ba2cd4df | | |
| | b002247f12 | | |
| | 03bcce3eb3 | | |
| | c74e71d604 | | |
| | d7f88674d8 | | |
| | 7abb94dab8 | | |
| | effbcd737b | | |
| | 6eb79adfd5 | | |
| | 8f48a38550 | | |
| | 55d85147f7 | | |
| | f7e8f98cf7 | | |
| | dc6425fe39 | | |
| | cbc69f5e7e | | |
| | c71f641b12 | | |
| | 173e22e091 | | |
| | 60a516bc8d | | |
| | c0838d637e | | |
| | 493ab2566e | | |
| | 5e46ea70d6 | | |
| | 5cf3dc4369 | | |
| | 596e797dca | | |
| | 3ce638d6e6 | | |
| | df7edfcd3f | | |
| | 3ecb25eb4f | | |
| | e1628c4d56 | | |
| | 78721f7a42 | | |
| | 09010212a0 | | |
| | bb63e60114 | | |
| | 06240ab67b | | |
@@ -111,7 +111,60 @@ jobs:
all_green: ${{ steps.gates.outputs.all_green }}
head_sha: ${{ steps.gates.outputs.head_sha }}
steps:
# Skip empty-tree promotes (the perpetual auto-promote↔auto-sync cycle
# observed 2026-05-03). Sequence: auto-promote merges via the staging
# merge-queue's MERGE strategy, creating a merge commit on main that
# staging doesn't have. auto-sync then merges main back into staging
# via another merge commit (the queue's MERGE strategy applies on
# the staging side too, even when the workflow's local FF would
# have sufficed). Now staging has a new merge-commit SHA whose
# tree == main's tree — but auto-promote sees "staging ahead of
# main by 1" and opens YET another empty promote PR. Each round
# costs ~30-40 min wallclock, ~2 manual approvals, and burns a
# full CodeQL Go run (~15 min). Without this guard the cycle
# repeats indefinitely.
#
# Long-term fix is to switch the merge_queue ruleset's
# `merge_method` away from MERGE so FF-able PRs land cleanly,
# but that's a broader change affecting every staging PR's
# commit shape. This guard is the one-line surgical fix that
# breaks the cycle without touching merge-queue config.
#
# Fail-open: if `git diff` errors for any reason, fall through
# to the gate check (preserve existing behavior). Only skip
# when the diff is DEFINITIVELY empty.
- name: Checkout for tree-diff check
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 0
ref: staging
- name: Skip if staging tree == main tree (perpetual-cycle break)
id: tree-diff
env:
HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
run: |
set -eu
git fetch origin main --depth=50 || { echo "::warning::git fetch main failed — proceeding (fail-open)"; exit 0; }
# Compare staging tip's tree against main's tree. `git diff
# --quiet` exits 0 if no differences, 1 if there are.
if git diff --quiet origin/main "$HEAD_SHA" -- 2>/dev/null; then
{
echo "## ⏭ Skipped — no code to promote"
echo
echo "staging tip (\`${HEAD_SHA:0:8}\`) and \`main\` have identical trees."
echo "This is the auto-promote↔auto-sync merge-commit cycle: staging has a"
echo "new SHA (a sync-back merge commit) but the underlying file tree is"
echo "already on main, so there's no real code to ship."
echo
echo "Skipping to avoid opening an empty promote PR. Cycle terminates here."
} >> "$GITHUB_STEP_SUMMARY"
echo "::notice::auto-promote: staging tree == main tree — no code to promote, skipping"
echo "skip=true" >> "$GITHUB_OUTPUT"
else
echo "skip=false" >> "$GITHUB_OUTPUT"
fi
- name: Check all required gates on this SHA
if: steps.tree-diff.outputs.skip != 'true'
id: gates
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
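The fail-open rule in the tree-diff step above can be sketched in Python. This is an illustrative sketch only, not code from the workflow: the function names `skip_from_exit_code` and `should_skip_promote` are hypothetical. It relies on the documented behaviour that `git diff --quiet` exits 0 when there are no differences and 1 when there are, and it treats every other outcome as "do not skip", which matches the `if ...; then ... else ... fi` shape of the shell step.

```python
import subprocess


def skip_from_exit_code(returncode: int) -> bool:
    """Map a `git diff --quiet` exit code to the guard's skip decision.

    0 means the two trees are identical, so the promote can be skipped.
    1 means they differ. Any other code is an error, which fails open
    (do not skip, let the gate check run as before).
    """
    return returncode == 0


def should_skip_promote(base_ref: str, head_sha: str, repo_dir: str = ".") -> bool:
    """Run `git diff --quiet <base> <head> --` and apply the fail-open rule."""
    try:
        proc = subprocess.run(
            ["git", "diff", "--quiet", base_ref, head_sha, "--"],
            cwd=repo_dir,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # git is missing or the directory does not exist: fail open.
        return False
    return skip_from_exit_code(proc.returncode)
```

Only a definitive exit code of 0 produces a skip; an unknown ref, a missing binary, or a shallow-history error all fall through to the existing gate check.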
@@ -209,10 +262,25 @@ jobs:
exit 0
fi

# Mint the App token BEFORE the promote-PR step so the auto-merge
# call can use it. GITHUB_TOKEN-initiated merges suppress the
# downstream `push` event on main, breaking the
# publish-workspace-server-image → canary-verify → redeploy-tenants
# chain (issue #2357). Using the App token here means the
# merge-queue-landed merge IS able to fire the cascade naturally;
# the polling tail below stays as defense-in-depth.
- name: Mint App token for promote-PR + downstream dispatch
if: ${{ vars.AUTO_PROMOTE_ENABLED == 'true' || github.event.inputs.force == 'true' }}
id: app-token
uses: actions/create-github-app-token@1b10c78c7865c340bc4f6099eb2f838309f1e8c3 # v3.1.1
with:
app-id: ${{ secrets.MOLECULE_AI_APP_ID }}
private-key: ${{ secrets.MOLECULE_AI_APP_PRIVATE_KEY }}

- name: Open (or reuse) staging → main promote PR + enable auto-merge
if: ${{ vars.AUTO_PROMOTE_ENABLED == 'true' || github.event.inputs.force == 'true' }}
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GH_TOKEN: ${{ steps.app-token.outputs.token }}
REPO: ${{ github.repository }}
TARGET_SHA: ${{ needs.check-all-gates-green.outputs.head_sha }}
run: |
@@ -267,52 +335,34 @@ jobs:
echo "promote_pr_num=${PR_NUM}" >> "$GITHUB_OUTPUT"
id: promote_pr

# Mint a short-lived GitHub App installation token for the dispatch
# step below. We CANNOT use `secrets.GITHUB_TOKEN` to dispatch the
# downstream publish chain — workflow runs created by GITHUB_TOKEN
# do not fire `workflow_run` triggers on completion (the
# documented "no recursion" rule —
# https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
#
# Symptom this caused (root-caused on 2026-04-30): publish-image
# ran successfully twice (21313dc 14:41Z, 59dec57 15:21Z) but
# canary-verify and redeploy-tenants-on-main never chained,
# because the publish run's `triggering_actor` was
# `github-actions[bot]` (i.e. GITHUB_TOKEN). A manual dispatch
# earlier in the day with the operator's PAT (d850ec7 06:52Z) did
# chain — same workflow file, only the actor differed.
#
# An App token's triggering_actor is the App user (e.g.
# `molecule-ai[bot]`), which IS allowed to fire downstream
# workflow_run cascades.
- name: Mint App token for downstream dispatch
if: steps.promote_pr.outputs.promote_pr_num != ''
id: app-token
uses: actions/create-github-app-token@1b10c78c7865c340bc4f6099eb2f838309f1e8c3 # v3.1.1
with:
app-id: ${{ secrets.MOLECULE_AI_APP_ID }}
private-key: ${{ secrets.MOLECULE_AI_APP_PRIVATE_KEY }}

# The App token minted above (before the promote-PR step) is
# also used by the polling tail below. Defense-in-depth: with
# the merge-queue-landed merge now using the App token, the
# main-branch push event SHOULD fire the publish/canary/redeploy
# cascade naturally — but if for any reason it doesn't (e.g. an
# unrelated event-suppression edge case), the explicit dispatches
# below still wake the chain.
- name: Wait for promote merge, then dispatch publish + redeploy (#2357)
# GITHUB_TOKEN-initiated merges suppress downstream `push` events
# (https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
# Result: when the merge queue lands the promote PR, the resulting
# main-branch push DOES NOT fire publish-workspace-server-image,
# so canary-verify and redeploy-tenants-on-main never run and
# tenants stay on stale code (issue #2357).
# Defense-in-depth dispatch. With the auto-merge call above
# now using the App token (this commit), the merge-queue-landed
# merge SHOULD fire publish-workspace-server-image naturally
# via on:push:[main] — App-token-initiated pushes DO trigger
# workflow_run cascades, unlike GITHUB_TOKEN-initiated ones
# (the documented "no recursion" rule —
# https://docs.github.com/en/actions/using-workflows/triggering-a-workflow#triggering-a-workflow-from-a-workflow).
#
# Workaround: poll for the merge to land, then explicitly
# `gh workflow run` publish-workspace-server-image. The dispatch
# MUST authenticate as the molecule-ai App (App token minted
# above) — not GITHUB_TOKEN — so that the resulting publish
# run's completion event can fire the workflow_run cascade
# into canary-verify + redeploy-tenants-on-main. See the prior
# step's comment for the GITHUB_TOKEN no-recursion details.
# This explicit dispatch stays as belt-and-suspenders for any
# edge case where the natural cascade misfires. If it never
# observably fires after this token swap (i.e. the publish
# workflow has already started by the time we get here), the
# second dispatch is a harmless no-op (publish-workspace-server-image
# has its own concurrency group that dedupes).
#
# Long-term fix: switch the auto-merge call above to use the
# same App token, so the merge's push event fires
# publish-workspace-server-image naturally and this polling tail
# becomes unnecessary. Tracked in #2357.
# See PR for #2357: pre-fix the merge action was via
# GITHUB_TOKEN, suppressing the cascade and forcing this tail
# to be the SOLE chain trigger. With the auto-merge token swap
# the tail becomes redundant in the happy path; keep until
# we've observed >=10 successful natural cascades, then drop.
if: steps.promote_pr.outputs.promote_pr_num != ''
env:
GH_TOKEN: ${{ steps.app-token.outputs.token }}

@@ -0,0 +1,39 @@
name: cascade-list-drift-gate

# Structural gate: TEMPLATES list in publish-runtime.yml must match
# manifest.json's workspace_templates exactly. Closes the recurrence
# path of PR #2556 (the data fix) and is the first concrete deliverable
# of RFC #388 PR-3.
#
# Why a gate, not just discipline: PR #2536 pruned the manifest, but the
# cascade list wasn't updated for ~weeks before someone (PR #2556)
# noticed during an unrelated audit. During that window, codex never
# rebuilt on a runtime publish. A structural gate catches the drift
# the same day either file changes.
#
# Triggers narrowly to keep CI quiet: only on PRs that actually change
# one of the two files. The path-filtered split + always-emit-result
# pattern (memory: "Required check names need a job that always runs")
# is unnecessary here because the workflow IS the check name and PR
# branch protection should require it directly. Future-proof: if this
# becomes a required check, add a no-op aggregator with always() so the
# name still emits when paths don't match.

on:
pull_request:
branches: [staging, main]
paths:
- manifest.json
- .github/workflows/publish-runtime.yml
- scripts/check-cascade-list-vs-manifest.sh

permissions:
contents: read

jobs:
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Check cascade list matches manifest
run: bash scripts/check-cascade-list-vs-manifest.sh
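The gate's script, `scripts/check-cascade-list-vs-manifest.sh`, is not part of this diff. As a rough illustration of the comparison it is described as performing, here is a minimal Python sketch. It rests on two assumptions that the diff does not confirm: that the cascade list is a single `TEMPLATES="..."` shell assignment inside the workflow file, and that `workspace_templates` in `manifest.json` is a flat list of template name strings. All function names are hypothetical.

```python
import json
import re
from typing import List, Set, Tuple


def cascade_templates(workflow_text: str) -> Set[str]:
    """Extract template names from the first TEMPLATES="..." assignment."""
    match = re.search(r'TEMPLATES="([^"]*)"', workflow_text)
    if match is None:
        raise ValueError("no TEMPLATES assignment found in workflow text")
    return set(match.group(1).split())


def manifest_templates(manifest_text: str) -> Set[str]:
    """Read workspace_templates, ASSUMED here to be a list of name strings."""
    return set(json.loads(manifest_text)["workspace_templates"])


def drift(workflow_text: str, manifest_text: str) -> Tuple[List[str], List[str]]:
    """Return (missing_from_cascade, extra_in_cascade), both sorted."""
    cascade = cascade_templates(workflow_text)
    manifest = manifest_templates(manifest_text)
    return sorted(manifest - cascade), sorted(cascade - manifest)
```

A gate built this way would fail the job whenever either returned list is non-empty, so a one-sided edit to the manifest or to the cascade list is caught in the same pull request.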
@@ -88,6 +88,15 @@ jobs:
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org == 'true' && '1' || '' }}
MOLECULE_CP_URL: ${{ vars.STAGING_CP_URL || 'https://staging-api.moleculesai.app' }}
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
# Provisioned tenant's default model (langgraph: openai:gpt-4.1-mini)
# needs OPENAI_API_KEY at first call. Sibling workflows
# e2e-staging-saas.yml + canary-staging.yml use the same secret;
# without this wire-up the tenant boots, accepts a2a messages,
# then returns "Could not resolve authentication method" — masked
# earlier by the a2a-sdk task-mode contract bugs PR #2558+#2563
# fixed. tests/e2e/test_staging_full_saas.sh:325 reads this and
# persists it as a workspace_secret on tenant create.
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_KEY }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2

@@ -327,13 +327,19 @@ jobs:
echo "::error::publish job did not expose a version output — cascade cannot fan out"
exit 1
fi
# Source of truth: manifest.json workspace_templates (PR #2536 pruned
# to 4 actively-supported runtimes: claude-code, hermes, openclaw, codex).
# Removed langgraph/crewai/autogen/deepagents/gemini-cli (deprecated, no
# shipping images); added codex (had been missing since #2512).
# Long-term: derive this list from manifest.json so the cascade can't
# drift again — tracked in RFC #388 as a Phase-1 invariant.
TEMPLATES="claude-code hermes openclaw codex"
# All 9 active workspace template repos. The PR #2536 pruning
# ("deprecated, no shipping images") was empirically wrong:
# continuous-synth-e2e.yml defaults to langgraph as its primary
# canary (line 44), and every excluded template had successful
# publish-image runs as of 2026-05-03 — none were dormant.
# Symptom of the prune: today's a2a-sdk strict-mode fix
# (#2566 / commit e1628c4) cascaded to 4 templates but never
# reached langgraph, so the synth-E2E correctly canary'd a fix
# that had landed but not deployed. Re-added the 5 templates.
# Long-term: derive this list from manifest.json so cascade
# scope can't drift from E2E scope — tracked in RFC #388 as a
# Phase-1 invariant.
TEMPLATES="claude-code hermes openclaw codex langgraph crewai autogen deepagents gemini-cli"
FAILED=""
for tpl in $TEMPLATES; do
REPO="Molecule-AI/molecule-ai-workspace-template-$tpl"

@@ -17,7 +17,7 @@ name: redeploy-tenants-on-main
# 1. publish-workspace-server-image completes → new :latest in GHCR.
# 2. This workflow fires via workflow_run, waits 30s for GHCR's
# CDN to propagate the new tag to the region the tenants pull from.
# 3. Calls redeploy-fleet with canary_slug=hongmingwang and a 60s
# 3. Calls redeploy-fleet with canary_slug=hongming and a 60s
# soak. Canary proves the image boots; batches follow.
# 4. Any failure aborts the rollout and leaves older tenants on the
# prior image — safer default than half-and-half state.
@@ -56,7 +56,12 @@ on:
description: 'Tenant slug to deploy first + soak (empty = skip canary, fan out immediately).'
required: false
type: string
default: 'hongmingwang'
# Must be an actual prod tenant slug (current: hongming,
# chloe-dong, reno-stars). The previous default 'hongmingwang'
# didn't match any tenant — CP soft-skipped the missing canary
# and the fleet rolled out without the soak gate, defeating the
# whole point of canary-first.
default: 'hongming'
soak_seconds:
description: 'Seconds to wait after canary before fanning out.'
required: false
@@ -148,7 +153,7 @@ jobs:
CP_URL: ${{ vars.CP_URL || 'https://api.moleculesai.app' }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongmingwang' }}
CANARY_SLUG: ${{ inputs.canary_slug || 'hongming' }}
SOAK_SECONDS: ${{ inputs.soak_seconds || '60' }}
BATCH_SIZE: ${{ inputs.batch_size || '3' }}
DRY_RUN: ${{ inputs.dry_run || false }}

@@ -176,35 +176,41 @@ jobs:
#
# CP returns HTTP 500 + ok=false whenever ANY tenant in the
# fleet failed SSM or healthz. In practice the recurring source
# of these is ephemeral e2e-* tenants (saas/canvas/ext) being
# torn down by their parent E2E run mid-redeploy: the EC2 dies →
# SSM exit=2 or healthz timeout → CP marks the fleet failed →
# this workflow goes red even though every operator-facing
# tenant rolled fine.
# of these is ephemeral test tenants being torn down by their
# parent E2E run mid-redeploy: the EC2 dies → SSM exit=2 or
# healthz timeout → CP marks the fleet failed → this workflow
# goes red even though every operator-facing tenant rolled fine.
#
# Filter: if HTTP=500/ok=false AND every failed slug matches
# ^e2e-, treat as soft-warn and let the verify step downstream
# handle the unreachable-vs-stale distinction (it already knows
# the difference per #2402). Any non-e2e-* failure or a non-500
# HTTP response remains a hard failure.
# Ephemeral slug prefixes (kept in sync with sweep-stale-e2e-orgs.yml
# — see that file for the source-of-truth list and rationale):
# - e2e-* — canvas/saas/ext E2E suites
# - rt-e2e-* — runtime-test harness fixtures (RFC #2251)
# Long-lived prefixes that are NOT ephemeral and MUST hard-fail:
# demo-prep, dryrun-*, dryrun2-*, plus all human tenant slugs.
#
# Filter: if HTTP=500/ok=false AND every failed slug matches an
# ephemeral prefix, treat as soft-warn and let the verify step
# downstream handle unreachable-vs-stale (#2402). Any non-ephemeral
# failure or a non-500 HTTP response remains a hard failure.
OK=$(jq -r '.ok // "false"' "$HTTP_RESPONSE")
FAILED_SLUGS=$(jq -r '
.results[]?
| select((.healthz_ok != true) or (.ssm_status != "Success"))
| .slug' "$HTTP_RESPONSE" 2>/dev/null || true)
NON_E2E_FAILED=$(printf '%s\n' "$FAILED_SLUGS" | grep -v '^$' | grep -v '^e2e-' || true)
EPHEMERAL_PREFIX_RE='^(e2e-|rt-e2e-)'
NON_EPHEMERAL_FAILED=$(printf '%s\n' "$FAILED_SLUGS" | grep -v '^$' | grep -Ev "$EPHEMERAL_PREFIX_RE" || true)

if [ "$HTTP_CODE" = "200" ] && [ "$OK" = "true" ]; then
: # happy path — fall through to verification
elif [ "$HTTP_CODE" = "500" ] && [ -z "$NON_E2E_FAILED" ] && [ -n "$FAILED_SLUGS" ]; then
COUNT=$(printf '%s\n' "$FAILED_SLUGS" | grep -c '^e2e-' || true)
echo "::warning::redeploy-fleet returned HTTP 500 but every failed tenant ($COUNT) is e2e-* ephemeral — treating as teardown race, soft-warning."
elif [ "$HTTP_CODE" = "500" ] && [ -z "$NON_EPHEMERAL_FAILED" ] && [ -n "$FAILED_SLUGS" ]; then
COUNT=$(printf '%s\n' "$FAILED_SLUGS" | grep -Ec "$EPHEMERAL_PREFIX_RE" || true)
echo "::warning::redeploy-fleet returned HTTP 500 but every failed tenant ($COUNT) is ephemeral (e2e-*/rt-e2e-*) — treating as teardown race, soft-warning."
printf '%s\n' "$FAILED_SLUGS" | sed 's/^/::warning:: failed: /'
elif [ "$HTTP_CODE" != "200" ]; then
echo "::error::redeploy-fleet returned HTTP $HTTP_CODE"
if [ -n "$NON_E2E_FAILED" ]; then
echo "::error::non-e2e tenant(s) failed:"
printf '%s\n' "$NON_E2E_FAILED" | sed 's/^/::error:: /'
if [ -n "$NON_EPHEMERAL_FAILED" ]; then
echo "::error::non-ephemeral tenant(s) failed:"
printf '%s\n' "$NON_EPHEMERAL_FAILED" | sed 's/^/::error:: /'
fi
exit 1
else

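The branch order of the new soft-warn filter is easier to check when written as a pure function. The following Python sketch mirrors the jq selection and the `if`/`elif` chain shown in the hunk above. The function names and the returned labels are hypothetical, and the final `else` branch is cut off in the hunk, so the sketch returns a placeholder label there instead of guessing what it does.

```python
import re
from typing import Dict, List

# Same prefixes as EPHEMERAL_PREFIX_RE in the workflow step.
EPHEMERAL_PREFIX_RE = re.compile(r"^(e2e-|rt-e2e-)")


def failed_slugs(results: List[Dict]) -> List[str]:
    """Slugs whose healthz or SSM status is not a success (mirrors the jq filter)."""
    slugs = [
        r.get("slug", "")
        for r in results
        if r.get("healthz_ok") is not True or r.get("ssm_status") != "Success"
    ]
    return [s for s in slugs if s]  # drop empty entries, like `grep -v '^$'`


def classify(http_code: int, ok: bool, results: List[Dict]) -> str:
    """Return 'ok', 'soft-warn', 'hard-fail', or 'not-shown'."""
    failed = failed_slugs(results)
    non_ephemeral = [s for s in failed if not EPHEMERAL_PREFIX_RE.match(s)]
    if http_code == 200 and ok:
        return "ok"
    if http_code == 500 and failed and not non_ephemeral:
        return "soft-warn"  # every failure is an ephemeral test tenant
    if http_code != 200:
        return "hard-fail"
    return "not-shown"  # HTTP 200 with ok=false: that branch is elided in the hunk
```

Note that an HTTP 500 with no failed slugs at all still hard-fails: the soft-warn branch requires at least one failed slug, and all of them must be ephemeral.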
@@ -26,11 +26,22 @@ jobs:
runs-on: ubuntu-latest
# Only fire for bot-authored PRs. Human CEO PRs (staging→main promotion)
# are intentional and pass through.
#
# Head-ref guard: never retarget a PR whose head IS `staging` — those
# are the auto-promote staging→main PRs (opened by molecule-ai[bot]
# since #2586 switched to an App token, which now passes the bot
# filter below). Retargeting head=staging onto base=staging fails
# with HTTP 422 "no new commits between base 'staging' and head
# 'staging'", which used to surface as a noisy red workflow run on
# every auto-promote (caught 2026-05-03 on PR #2588).
if: >-
github.event.pull_request.user.type == 'Bot'
|| endsWith(github.event.pull_request.user.login, '[bot]')
|| github.event.pull_request.user.login == 'app/molecule-ai'
|| github.event.pull_request.user.login == 'molecule-ai[bot]'
github.event.pull_request.head.ref != 'staging'
&& (
github.event.pull_request.user.type == 'Bot'
|| endsWith(github.event.pull_request.user.login, '[bot]')
|| github.event.pull_request.user.login == 'app/molecule-ai'
|| github.event.pull_request.user.login == 'molecule-ai[bot]'
)
steps:
- name: Retarget PR base to staging
id: retarget

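The grouping in the new `if:` expression matters: the head-ref guard is ANDed with the whole bot filter and is not just one more OR term. A small Python sketch of the same predicate makes that explicit. The function name is hypothetical, and the sketch compares strings case-sensitively, whereas GitHub Actions expressions ignore case when comparing strings.

```python
def should_retarget(head_ref: str, user_type: str, user_login: str) -> bool:
    """Mirror of the job-level `if:` condition after the change."""
    is_bot = (
        user_type == "Bot"
        or user_login.endswith("[bot]")
        or user_login == "app/molecule-ai"
        or user_login == "molecule-ai[bot]"
    )
    # Never retarget the auto-promote PR itself (head == staging),
    # even though its author now passes the bot filter.
    return head_ref != "staging" and is_bot
```

With this shape, a promote PR opened by `molecule-ai[bot]` from `staging` is skipped, while any other bot-authored PR is still retargeted.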
@@ -87,20 +87,28 @@ jobs:
> orgs.json

# Filter:
# 1. slug starts with 'e2e-' (covers e2e-, e2e-canary-,
# e2e-canvas-* — all variants the test scripts mint)
# 1. slug starts with one of the ephemeral test prefixes:
# - 'e2e-' — covers e2e-canary-, e2e-canvas-*, etc.
# - 'rt-e2e-' — runtime-test harness fixtures (RFC #2251);
# missing this prefix left two such tenants
# orphaned 8h on staging (2026-05-03), then
# hard-failed redeploy-tenants-on-staging
# and broke the staging→main auto-promote
# chain. Kept in sync with the EPHEMERAL_PREFIX_RE
# regex in redeploy-tenants-on-staging.yml.
# 2. created_at is older than MAX_AGE_MINUTES ago
# Output one slug per line to a file the next step reads.
python3 > stale_slugs.txt <<'PY'
import json, os
from datetime import datetime, timezone, timedelta
EPHEMERAL_PREFIXES = ("e2e-", "rt-e2e-")
with open("orgs.json") as f:
data = json.load(f)
max_age = int(os.environ["MAX_AGE_MINUTES"])
cutoff = datetime.now(timezone.utc) - timedelta(minutes=max_age)
for o in data.get("orgs", []):
slug = o.get("slug", "")
if not slug.startswith("e2e-"):
if not slug.startswith(EPHEMERAL_PREFIXES):
continue
created = o.get("created_at")
if not created:

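The hunk above ends before the age comparison, so the rest of the embedded script is not visible here. As a self-contained illustration of the two-part filter described in its comment (ephemeral prefix, then age), here is a minimal Python sketch. It makes assumptions the diff does not confirm: that `created_at` is an ISO-8601 timestamp, that a record with no `created_at` is skipped, and that a timestamp without a zone is UTC. The function name `stale_slugs` and the injectable `now` parameter are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

EPHEMERAL_PREFIXES = ("e2e-", "rt-e2e-")


def stale_slugs(
    orgs: List[Dict],
    max_age_minutes: int,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return slugs that have an ephemeral prefix and are older than the cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=max_age_minutes)
    stale = []
    for org in orgs:
        slug = org.get("slug", "")
        # str.startswith accepts a tuple, so one call covers every prefix.
        if not slug.startswith(EPHEMERAL_PREFIXES):
            continue
        created = org.get("created_at")
        if not created:
            continue  # ASSUMPTION: records without a timestamp are left alone
        # ASSUMPTION: ISO-8601 input; map a trailing "Z" to an explicit offset.
        created_at = datetime.fromisoformat(created.replace("Z", "+00:00"))
        if created_at.tzinfo is None:
            created_at = created_at.replace(tzinfo=timezone.utc)
        if created_at < cutoff:
            stale.append(slug)
    return stale
```

Passing `now` explicitly keeps the function deterministic in tests; the workflow script itself reads the current time directly.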
@@ -54,7 +54,7 @@ export default function Home() {
if (hydrating) {
return (
<div className="fixed inset-0 flex items-center justify-center bg-surface">
<div className="flex flex-col items-center gap-3">
<div role="status" aria-live="polite" className="flex flex-col items-center gap-3">
<Spinner size="lg" />
<span className="text-xs text-ink-soft">Loading canvas...</span>
</div>

@@ -13,6 +13,7 @@ import {
import "@xyflow/react/dist/style.css";

import { useCanvasStore } from "@/store/canvas";
import { useTheme } from "@/lib/theme-provider";
import { A2ATopologyOverlay } from "./A2ATopologyOverlay";
import { WorkspaceNode } from "./WorkspaceNode";
import { SidePanel } from "./SidePanel";
@@ -69,6 +70,14 @@ export function Canvas() {
}

function CanvasInner() {
// ReactFlow's `colorMode` prop drives the styling of every viewport
// primitive it renders directly (background dots, edge defaults,
// selection rings, controls, minimap mask). Pre-fix this was hard-pinned
// to "dark" — so on light theme the chrome (toolbar, side panel) flipped
// to warm-paper but the canvas backplate + edges stayed black, leaving a
// half-themed page. Pull resolvedTheme so the canvas matches the user's
// selected mode (and the system preference when they pick "system").
const { resolvedTheme } = useTheme();
const rawNodes = useCanvasStore((s) => s.nodes);
const edges = useCanvasStore((s) => s.edges);
const a2aEdges = useCanvasStore((s) => s.a2aEdges);
@@ -250,7 +259,7 @@ function CanvasInner() {
</a>
<main id="canvas-main" className="w-screen h-screen bg-surface">
<ReactFlow
colorMode="dark"
colorMode={resolvedTheme}
nodes={nodes}
edges={allEdges}
onNodesChange={onNodesChange}
@@ -273,7 +282,9 @@ function CanvasInner() {
variant={BackgroundVariant.Dots}
gap={24}
size={1}
color="#27272a"
// Match the line token so dots fade with the surface.
// Hard-coded zinc-800 was invisible on warm-paper.
color={resolvedTheme === "dark" ? "#27272a" : "#d4d0c4"}
/>
<Controls
className="!bg-surface-sunken/90 !border-line/50 !rounded-lg !shadow-xl !shadow-black/20 [&>button]:!bg-surface-card [&>button]:!border-line/50 [&>button]:!text-ink-mid [&>button:hover]:!bg-surface-card [&>button:hover]:!text-ink"
@@ -281,7 +292,9 @@ function CanvasInner() {
/>
<MiniMap
className="!bg-surface-sunken/90 !border-line/50 !rounded-lg !shadow-xl !shadow-black/20"
maskColor="rgba(0, 0, 0, 0.7)"
// Mask dims off-viewport areas; tint matches the surface so
// the dimming doesn't show as a black bar in light mode.
maskColor={resolvedTheme === "dark" ? "rgba(0, 0, 0, 0.7)" : "rgba(232, 226, 211, 0.7)"}
nodeColor={(node) => {
// Parents show as a filled region — hierarchy visible at
// a glance in the minimap without needing to zoom.

@@ -1,11 +1,23 @@
"use client";

import { useEffect, useState } from "react";
import { STATUS_CONFIG } from "@/lib/design-tokens";
import { STATUS_CONFIG, TIER_CONFIG } from "@/lib/design-tokens";
import { useCanvasStore } from "@/store/canvas";

const LEGEND_STATUSES = ["online", "provisioning", "degraded", "failed", "paused", "offline"] as const;

// Tier descriptions kept in sync with CreateWorkspaceDialog.tsx (the
// source of truth for what each tier means semantically). Colors come
// from TIER_CONFIG so the legend swatch matches the badge actually
// rendered on every WorkspaceNode — drift here misled users into
// thinking the legend documented a different tier than the one shown.
const LEGEND_TIERS: ReadonlyArray<{ tier: number; label: string }> = [
{ tier: 1, label: "Sandboxed" },
{ tier: 2, label: "Standard" },
{ tier: 3, label: "Privileged" },
{ tier: 4, label: "Full Access" },
];

// Persist the user's choice across sessions. Default is "open" so
// first-time users still see the symbol key; once dismissed we
// respect that until they explicitly reopen via the floating pill.
@@ -102,9 +114,9 @@ export function Legend() {
<div className="mb-2">
<div className="text-[11px] text-ink-soft font-medium mb-1">Tier</div>
<div className="flex flex-wrap gap-x-3 gap-y-1">
<TierItem tier={1} label="Sandboxed" color="text-sky-300 bg-sky-950/40 border-sky-700/30" />
<TierItem tier={2} label="Standard" color="text-violet-300 bg-violet-950/40 border-violet-700/30" />
<TierItem tier={3} label="Full Access" color="text-warm bg-amber-950/40 border-amber-700/30" />
{LEGEND_TIERS.map(({ tier, label }) => (
<TierItem key={tier} tier={tier} label={label} color={TIER_CONFIG[tier].border} />
))}
</div>
</div>

@@ -202,7 +202,7 @@ export function SidePanel() {
{/* Tabs — relative wrapper lets the fade gradient position against the scroll container */}
<div className="relative border-b border-line/40">
{/* Right-edge fade: signals more tabs are hidden off-screen when the bar overflows */}
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-zinc-950 to-transparent z-10" aria-hidden="true" />
<div className="pointer-events-none absolute inset-y-0 right-0 w-8 bg-gradient-to-l from-surface to-transparent z-10" aria-hidden="true" />
<div
role="tablist"
aria-label="Workspace panel tabs"
@@ -232,8 +232,8 @@ export function SidePanel() {
onClick={() => setPanelTab(tab.id)}
className={`shrink-0 px-3 py-2.5 text-[10px] font-medium tracking-wide transition-all rounded-t-lg mx-0.5 focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/70 ${
panelTab === tab.id
? "text-ink bg-surface-card/40 border-b-2 border-accent"
: "text-ink-soft hover:text-ink hover:bg-surface-card/40"
? "text-ink bg-surface-card border-b-2 border-accent"
: "text-ink-mid hover:text-ink hover:bg-surface-card/60"
}`}
>
<span className="mr-1 opacity-50" aria-hidden="true">{tab.icon}</span>

@@ -36,7 +36,7 @@ function EjectIcon(props: React.SVGProps<SVGSVGElement>) {

export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>) {
const statusCfg = STATUS_CONFIG[data.status] || STATUS_CONFIG.offline;
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-soft bg-surface-card" };
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-mid bg-surface-card border border-line" };
// Org-deploy context — four derived flags off one store subscription.
// Drives the shimmer while provisioning, the dimmed/non-draggable
// treatment on locked descendants, and the Cancel pill on the root.
@@ -179,7 +179,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
</div>
<div className="flex items-center gap-1.5 shrink-0">
{hasChildren && (
<span className="text-[10px] font-mono text-violet-300 bg-violet-900/40 border border-violet-700/30 px-1.5 py-0.5 rounded-md">
<span className="text-[10px] font-mono text-accent bg-accent/15 border border-accent/40 px-1.5 py-0.5 rounded-md">
{descendantCount} sub
</span>
)}
@@ -207,13 +207,13 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
<div className="mb-1 flex items-center gap-1">
{runtime === "external" ? (
<span
className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-violet-200 bg-violet-900/50 border border-violet-500/40"
className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-white bg-violet-600 border border-violet-700"
title="Phase 30 remote agent — runs outside this platform's Docker network. Lifecycle managed via heartbeat-based polling, not Docker exec."
>
★ REMOTE
</span>
) : (
<span className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-ink-mid bg-surface-card/60 border border-line/30">
<span className="text-[7px] font-mono px-1.5 py-0.5 rounded-md text-ink-mid bg-surface-card border border-line">
{runtime}
</span>
)}
@@ -237,15 +237,15 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
key={skill}
className={`text-[10px] px-1.5 py-0.5 rounded-md border ${
isOnline
? "text-good/80 bg-emerald-950/30 border-emerald-800/30"
: "text-ink-mid bg-surface-card/60 border-line/40"
? "text-good bg-good/15 border-good/40"
: "text-ink-mid bg-surface-card border-line"
}`}
>
{skill}
</span>
))}
{skills.length > 4 && (
<span className="text-[10px] text-ink-soft self-center">
<span className="text-[10px] text-ink-mid self-center">
+{skills.length - 4}
</span>
)}
@@ -274,10 +274,10 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
|
||||
e.stopPropagation();
|
||||
useCanvasStore.getState().restartWorkspace(id).catch(() => showToast("Restart failed", "error"));
|
||||
}}
|
||||
className="flex items-center gap-1.5 mt-1 w-full bg-sky-950/30 px-2 py-1 rounded-md border border-sky-800/30 hover:bg-sky-900/40 transition-colors text-left focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none"
|
||||
className="flex items-center gap-1.5 mt-1 w-full bg-accent/10 px-2 py-1 rounded-md border border-accent/40 hover:bg-accent/20 transition-colors text-left focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none"
|
||||
>
|
||||
<span className="text-[10px]">↻</span>
|
||||
<span className="text-[10px] text-sky-300/80">Restart to apply changes</span>
|
||||
<span className="text-[10px] text-accent">↻</span>
|
||||
<span className="text-[10px] text-accent">Restart to apply changes</span>
|
||||
</button>
|
||||
)}
|
||||
|
||||
@@ -287,8 +287,8 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
|
||||
<div className={`text-[10px] uppercase tracking-widest font-medium ${
|
||||
data.status === "failed" ? "text-bad" :
|
||||
data.status === "degraded" ? "text-warm" :
|
||||
data.status === "provisioning" ? "text-sky-400" :
|
||||
"text-ink-soft"
|
||||
data.status === "provisioning" ? "text-accent" :
|
||||
"text-ink-mid"
|
||||
}`}>
|
||||
{statusCfg.label}
|
||||
</div>
|
||||
@@ -296,8 +296,8 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
|
||||
|
||||
{data.activeTasks > 0 && (
|
||||
<div className="flex items-center gap-1">
|
||||
<div className="w-1 h-1 rounded-full bg-amber-400 motion-safe:animate-pulse" />
|
||||
<span className="text-[10px] text-warm/80 tabular-nums">
|
||||
<div className="w-1 h-1 rounded-full bg-warm motion-safe:animate-pulse" />
|
||||
<span className="text-[10px] text-warm tabular-nums">
|
||||
{data.activeTasks} task{data.activeTasks > 1 ? "s" : ""}
|
||||
</span>
|
||||
</div>
|
||||
@@ -307,7 +307,7 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
|
||||
{/* Degraded error preview */}
|
||||
{data.status === "degraded" && data.lastSampleError && (
|
||||
<div
|
||||
className="text-[10px] text-warm/60 truncate mt-1 bg-amber-950/20 px-1.5 py-0.5 rounded border border-amber-800/20"
|
||||
className="text-[10px] text-warm truncate mt-1 bg-warm/10 px-1.5 py-0.5 rounded border border-warm/40"
|
||||
title={data.lastSampleError}
|
||||
>
|
||||
{data.lastSampleError}
|
||||
@@ -357,7 +357,7 @@ function TeamMemberChip({
|
||||
}) {
|
||||
const { data } = node;
|
||||
const statusCfg = STATUS_CONFIG[data.status] || STATUS_CONFIG.offline;
|
||||
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-soft bg-surface-card" };
|
||||
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-mid bg-surface-card border border-line" };
|
||||
const isOnline = data.status === "online";
|
||||
const skills = getSkillNames(data.agentCard);
|
||||
|
||||
@@ -408,7 +408,7 @@ function TeamMemberChip({
|
||||
</div>
|
||||
<div className="flex items-center gap-1 shrink-0">
|
||||
{hasSubChildren && (
|
||||
<span className="text-[7px] font-mono text-violet-300 bg-violet-900/40 border border-violet-700/30 px-1 py-0.5 rounded">
|
||||
<span className="text-[7px] font-mono text-accent bg-accent/15 border border-accent/40 px-1 py-0.5 rounded">
|
||||
{descendantCount}
|
||||
</span>
|
||||
)}
|
||||
@@ -423,7 +423,7 @@ function TeamMemberChip({
|
||||
e.stopPropagation();
|
||||
onExtract(node.id);
|
||||
}}
|
||||
className="opacity-0 group-hover/child:opacity-100 text-ink-soft hover:text-sky-400 transition-all focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none rounded"
|
||||
className="opacity-0 group-hover/child:opacity-100 text-ink-mid hover:text-accent transition-all focus-visible:ring-2 focus-visible:ring-accent/70 focus-visible:outline-none rounded"
|
||||
>
|
||||
<EjectIcon aria-hidden="true" />
|
||||
</button>
|
||||
@@ -432,7 +432,7 @@ function TeamMemberChip({
|
||||
|
||||
{/* Role */}
|
||||
{data.role && (
|
||||
<div className="text-[10px] text-ink-soft mb-1 leading-tight truncate">{data.role}</div>
|
||||
<div className="text-[10px] text-ink-mid mb-1 leading-tight truncate">{data.role}</div>
|
||||
)}
|
||||
|
||||
{/* Skills */}
|
||||
@@ -443,8 +443,8 @@ function TeamMemberChip({
|
||||
key={skill}
|
||||
className={`text-[10px] px-1 py-0.5 rounded border ${
|
||||
isOnline
|
||||
? "text-good/70 bg-emerald-950/20 border-emerald-800/20"
|
||||
: "text-ink-soft bg-surface-card/40 border-line/30"
|
||||
? "text-good bg-good/15 border-good/40"
|
||||
: "text-ink-mid bg-surface-card border-line"
|
||||
}`}
|
||||
>
|
||||
{skill}
|
||||
@@ -462,8 +462,8 @@ function TeamMemberChip({
|
||||
<span className={`text-[10px] uppercase tracking-widest font-medium ${
|
||||
data.status === "failed" ? "text-bad" :
|
||||
data.status === "degraded" ? "text-warm" :
|
||||
data.status === "provisioning" ? "text-sky-400" :
|
||||
"text-ink-soft"
|
||||
data.status === "provisioning" ? "text-accent" :
|
||||
"text-ink-mid"
|
||||
}`}>
|
||||
{statusCfg.label}
|
||||
</span>
|
||||
|
||||
@@ -182,7 +182,7 @@ export function OrgTokensTab() {

{/* Token list */}
{loading ? (
<div className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<Spinner /> Loading keys...
</div>
) : tokens.length === 0 ? (
@@ -129,7 +129,7 @@ export function TokensTab({ workspaceId }: TokensTabProps) {

{/* Token list */}
{loading ? (
<div className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<div role="status" aria-live="polite" className="flex items-center justify-center gap-2 py-6 text-ink-soft text-xs">
<Spinner /> Loading tokens...
</div>
) : tokens.length === 0 ? (
@@ -773,14 +773,14 @@ function MyChatPanel({ workspaceId, data }: Props) {
<div
className={`max-w-[85%] rounded-lg px-3 py-2 text-xs ${
msg.role === "user"
? "bg-accent-strong/30 text-blue-100 border border-accent/20"
? "bg-accent text-white border border-accent-strong"
: msg.role === "system"
? "bg-red-900/30 text-red-200 border border-red-800/30"
: "bg-surface-card/80 text-ink border border-line/30"
? "bg-bad/10 text-bad border border-bad/40"
: "bg-surface-card text-ink border border-line"
}`}
>
{msg.content && (
<div className="prose prose-sm prose-invert max-w-none [&>p]:mb-1 [&>p:last-child]:mb-0">
<div className={`prose prose-sm max-w-none [&>p]:mb-1 [&>p:last-child]:mb-0 ${msg.role === "user" ? "prose-invert" : ""}`}>
<ReactMarkdown remarkPlugins={[remarkGfm]}>{msg.content}</ReactMarkdown>
</div>
)}
@@ -796,7 +796,7 @@ function MyChatPanel({ workspaceId, data }: Props) {
))}
</div>
)}
<div className="text-[9px] text-ink-soft mt-1">
<div className={`text-[9px] mt-1 ${msg.role === "user" ? "text-white/70" : "text-ink-mid"}`}>
{new Date(msg.timestamp).toLocaleTimeString()}
</div>
</div>
@@ -655,7 +655,8 @@ export function ConfigTab({ workspaceId }: Props) {
>
<option value={1}>T1 — Sandboxed</option>
<option value={2}>T2 — Standard</option>
<option value={3}>T3 — Full Access</option>
<option value={3}>T3 — Privileged</option>
<option value={4}>T4 — Full Access</option>
</select>
</div>
</div>
@@ -12,10 +12,10 @@ export function statusDotClass(status: string): string {
}

export const TIER_CONFIG: Record<number, { label: string; color: string; border: string }> = {
1: { label: "T1", color: "text-ink-soft bg-surface-card/80", border: "text-ink-mid border-line/60" },
2: { label: "T2", color: "text-sky-400 bg-sky-950/50", border: "text-sky-400 border-sky-500/30" },
3: { label: "T3", color: "text-violet-400 bg-violet-950/50", border: "text-violet-400 border-violet-500/30" },
4: { label: "T4", color: "text-warm bg-amber-950/50", border: "text-warm border-amber-500/30" },
1: { label: "T1", color: "text-ink-mid bg-surface-card border border-line", border: "text-ink-mid border-line" },
2: { label: "T2", color: "text-white bg-accent border border-accent-strong", border: "text-accent border-accent" },
3: { label: "T3", color: "text-white bg-violet-600 border border-violet-700", border: "text-violet-600 border-violet-500" },
4: { label: "T4", color: "text-white bg-warm border border-warm", border: "text-warm border-warm" },
};

export const COMM_TYPE_LABELS: Record<string, string> = {
@@ -59,8 +59,8 @@ export function getTenantSlug(): string {
* isSaaSTenant reports whether the canvas is running as the UI for a
* SaaS tenant (served at <slug>.moleculesai.app). Use for client-side
* UX branches that should behave differently on SaaS vs self-hosted —
* e.g. the workspace tier picker hides T1/T2 sandbox tiers because every
* SaaS workspace gets its own EC2 VM (inherently T3 Full Access).
* e.g. the workspace tier picker hides T1/T2/T3 sandbox tiers because
* every SaaS workspace gets its own EC2 VM (inherently T4 Full Access).
*
* SSR-safe: returns false on the server to avoid hydration drift; call
* sites should tolerate a flip from false→true on first client render.
+6
-1
@@ -28,7 +28,12 @@
{"name": "claude-code-default", "repo": "Molecule-AI/molecule-ai-workspace-template-claude-code", "ref": "main"},
{"name": "hermes", "repo": "Molecule-AI/molecule-ai-workspace-template-hermes", "ref": "main"},
{"name": "openclaw", "repo": "Molecule-AI/molecule-ai-workspace-template-openclaw", "ref": "main"},
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"}
{"name": "codex", "repo": "Molecule-AI/molecule-ai-workspace-template-codex", "ref": "main"},
{"name": "langgraph", "repo": "Molecule-AI/molecule-ai-workspace-template-langgraph", "ref": "main"},
{"name": "crewai", "repo": "Molecule-AI/molecule-ai-workspace-template-crewai", "ref": "main"},
{"name": "autogen", "repo": "Molecule-AI/molecule-ai-workspace-template-autogen", "ref": "main"},
{"name": "deepagents", "repo": "Molecule-AI/molecule-ai-workspace-template-deepagents", "ref": "main"},
{"name": "gemini-cli", "repo": "Molecule-AI/molecule-ai-workspace-template-gemini-cli", "ref": "main"}
],
"org_templates": [
{"name": "molecule-dev", "repo": "Molecule-AI/molecule-ai-org-template-molecule-dev", "ref": "main"},
Executable
+95
@@ -0,0 +1,95 @@
#!/usr/bin/env bash
# check-cascade-list-vs-manifest.sh — structural drift gate for the
# publish-runtime cascade list vs manifest.json workspace_templates.
#
# WHY: PR #2536 pruned the manifest to 4 supported runtimes; PR #2556
# realigned the cascade list to match. The underlying drift hazard
# (cascade-list ≠ manifest) was unguarded — the data fix didn't prevent
# recurrence. This script is the structural gate that does.
#
# Behavior-based per project pattern: derives the expected set from
# manifest.json and the actual set from the workflow YAML, fails on
# any divergence in either direction.
#
#   missing-from-cascade → templates in manifest that publish-runtime.yml
#                          won't auto-rebuild on a new wheel publish
#                          (the codex-stuck-on-stale-runtime bug class)
#   extra-in-cascade     → cascade dispatches to deprecated templates
#                          (the wasted-API-calls + dead-CI-noise class)
#
# Suffix mapping: manifest names map to GHCR repos via
#   {name without -default suffix} → molecule-ai-workspace-template-<suffix>
# That's the same map publish-runtime.yml's TEMPLATES variable iterates.
#
# Exit:
#   0 cascade matches manifest exactly
#   1 drift detected (script prints the diff)
#   2 bad usage / missing inputs

set -eu

MANIFEST="${1:-manifest.json}"
WORKFLOW="${2:-.github/workflows/publish-runtime.yml}"

if [ ! -f "$MANIFEST" ]; then
  echo "::error::manifest not found: $MANIFEST" >&2
  exit 2
fi
if [ ! -f "$WORKFLOW" ]; then
  echo "::error::workflow not found: $WORKFLOW" >&2
  exit 2
fi

# Expected cascade entries: manifest workspace_templates → suffix-only
# (strip -default tail, e.g. claude-code-default → claude-code, since
# publish-runtime.yml's TEMPLATES uses suffixes that match the
# molecule-ai-workspace-template-<suffix> repo naming).
EXPECTED=$(jq -r '.workspace_templates[].name' "$MANIFEST" \
  | sed 's/-default$//' \
  | sort -u)

# Actual cascade entries: extract from the TEMPLATES="…" line. We look
# for the line, pull the contents between the quotes, and split into
# one-per-line. Single source of truth in the workflow itself, no
# parallel registry needed.
#
# Why not \s in the regex: BSD sed (macOS) doesn't recognize \s as
# whitespace — treats it as literal `s`. POSIX [[:space:]] works on
# both BSD and GNU sed. Same hazard nuked the original draft of this
# script: \s* matched empty-prefix-of-literal-s, then the leading
# whitespace stayed in the captured group.
ACTUAL=$(grep -E '[[:space:]]*TEMPLATES="' "$WORKFLOW" \
  | head -1 \
  | sed -E 's/^[[:space:]]*TEMPLATES="([^"]*)".*$/\1/' \
  | tr ' ' '\n' \
  | grep -v '^$' \
  | sort -u)

if [ -z "$ACTUAL" ]; then
  echo "::error::could not extract TEMPLATES=\"…\" from $WORKFLOW — has the variable name or quoting changed?" >&2
  exit 2
fi

MISSING=$(comm -23 <(printf '%s\n' "$EXPECTED") <(printf '%s\n' "$ACTUAL"))
EXTRA=$(comm -13 <(printf '%s\n' "$EXPECTED") <(printf '%s\n' "$ACTUAL"))

if [ -z "$MISSING" ] && [ -z "$EXTRA" ]; then
  echo "✓ cascade list matches manifest workspace_templates ($(echo "$EXPECTED" | wc -l | tr -d ' ') entries)"
  exit 0
fi

echo "::error::cascade list drift detected between $MANIFEST and $WORKFLOW" >&2
echo "" >&2
if [ -n "$MISSING" ]; then
  echo " Templates in manifest but MISSING from cascade (won't auto-rebuild on wheel publish):" >&2
  echo "$MISSING" | sed 's/^/ - /' >&2
  echo "" >&2
fi
if [ -n "$EXTRA" ]; then
  echo " Templates in cascade but NOT in manifest (deprecated, wasting dispatch calls):" >&2
  echo "$EXTRA" | sed 's/^/ - /' >&2
  echo "" >&2
fi
echo " Fix: edit the TEMPLATES=\"…\" line in $WORKFLOW so the set matches" >&2
echo " manifest.json's workspace_templates (suffix-stripped). See PR #2556 for context." >&2
exit 1
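The two-way comparison the script performs with `comm` can also be read as plain set differences. The following is a minimal Python sketch of that logic, offered only as an illustration and not as part of this change. The function name `cascade_drift` and the sample inputs are hypothetical, and the sketch assumes the workflow keeps a single `TEMPLATES="..."` line, as the script above does.

```python
import re


def cascade_drift(manifest: dict, workflow_text: str) -> tuple[list[str], list[str]]:
    """Return (missing_from_cascade, extra_in_cascade), each sorted.

    Hypothetical helper: mirrors the shell gate's set logic only.
    """
    # Expected set: manifest names with a trailing "-default" stripped.
    expected = {
        re.sub(r"-default$", "", entry["name"])
        for entry in manifest.get("workspace_templates", [])
    }
    # Actual set: contents of the first TEMPLATES="..." line, split on whitespace.
    match = re.search(r'^[ \t]*TEMPLATES="([^"]*)"', workflow_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError('could not find a TEMPLATES="..." line')
    actual = set(match.group(1).split())
    return sorted(expected - actual), sorted(actual - expected)


# Illustrative inputs only; not the repository's real manifest or workflow.
manifest = {
    "workspace_templates": [
        {"name": "claude-code-default"},
        {"name": "hermes"},
        {"name": "codex"},
    ]
}
workflow = '      run: |\n        TEMPLATES="claude-code hermes langgraph"\n'

missing, extra = cascade_drift(manifest, workflow)
print("missing from cascade:", missing)
print("extra in cascade:", extra)
```

With these sample inputs, `codex` is reported as missing from the cascade and `langgraph` as extra, which corresponds to the two drift classes named in the script header.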
@@ -352,15 +352,42 @@ print(json.dumps({
")
fi

# Model slug MUST be provider-prefixed for hermes — the template's
# derive-provider.sh parses the slug prefix (`openai/…`, `anthropic/…`,
# `minimax/…`) to set HERMES_INFERENCE_PROVIDER at install time. A bare
# "gpt-4o" has no prefix → provider falls back to hermes auto-detect →
# picks Anthropic default → tries Anthropic API with the OpenAI key →
# 401 on A2A. Same trap that trapped prod users in PR #1714. We pin
# "openai/gpt-4o" here because the E2E's secret is always the OpenAI
# key; non-hermes runtimes ignore the prefix.
MODEL_SLUG="openai/gpt-4o"
# Model slug format depends on the runtime — different model resolvers
# parse it differently:
#
#   hermes      → "openai/gpt-4o" (slash-form: derive-provider.sh splits
#                                  on the prefix to set
#                                  HERMES_INFERENCE_PROVIDER. Bare
#                                  "gpt-4o" falls through to Anthropic
#                                  default + 401, see PR #1714.)
#
#   langgraph   → "openai:gpt-4o" (colon-form: langchain init_chat_model
#                                  requires "<provider>:<model>".
#                                  Slash-form was misinterpreted as
#                                  OpenRouter routing → fell through
#                                  without auth, surfaced 2026-05-03
#                                  after the a2a-sdk v1 contract bugs
#                                  PR #2558+#2563+#2567 cleared the
#                                  masking layers.)
#
#   claude-code → "sonnet"        (entry-id form: claude-code template's
#                                  config.yaml uses bare model names,
#                                  auth comes via CLAUDE_CODE_OAUTH_TOKEN
#                                  or ANTHROPIC_API_KEY rather than the
#                                  slug.)
#
# When E2E_MODEL_SLUG is set, it overrides this dispatch — useful when an
# operator dispatches the workflow to test a specific slug.
if [ -n "${E2E_MODEL_SLUG:-}" ]; then
  MODEL_SLUG="$E2E_MODEL_SLUG"
else
  case "$RUNTIME" in
    hermes) MODEL_SLUG="openai/gpt-4o" ;;
    langgraph) MODEL_SLUG="openai:gpt-4o" ;;
    claude-code) MODEL_SLUG="sonnet" ;;
    *) MODEL_SLUG="openai/gpt-4o" ;; # safest fallback (matches hermes)
  esac
fi

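The override-then-dispatch rule described in the comment block above can be restated as a small lookup. This is a hedged Python sketch of the same decision, not code from the repository: `pick_model_slug` is a hypothetical name, and the slug values are copied from the `case` statement above.

```python
def pick_model_slug(runtime: str, override: str = "") -> str:
    """Choose a model slug for a runtime (hypothetical helper).

    An explicit, non-empty override wins, mirroring E2E_MODEL_SLUG.
    Otherwise the slug format depends on the runtime's model resolver.
    """
    if override:
        return override
    defaults = {
        "hermes": "openai/gpt-4o",     # slash-form, provider prefix before "/"
        "langgraph": "openai:gpt-4o",  # colon-form, "<provider>:<model>"
        "claude-code": "sonnet",       # bare entry id, auth comes from env
    }
    # Unknown runtimes fall back to the slash-form, as the shell case does.
    return defaults.get(runtime, "openai/gpt-4o")


for name in ("hermes", "langgraph", "claude-code", "some-other-runtime"):
    print(name, "->", pick_model_slug(name))
```

Keeping the mapping in one table makes it easier to see that only the separator and prefix differ between runtimes, which is the point the comment block makes.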
log "5/11 Provisioning parent workspace (runtime=$RUNTIME)..."
PARENT_RESP=$(tenant_call POST /workspaces \
@@ -509,7 +509,15 @@ class LangGraphA2AExecutor(AgentExecutor):
        # accept the assignment. See #1787 + commit dcbcf19
        # for the original test-mock motivation.
        logger.debug("metadata attach skipped (non-Message return from new_text_message)")
    await event_queue.enqueue_event(msg)
    # A2A v1 (a2a-sdk ≥ 1.0): once Task is enqueued (above, PR #2558),
    # the executor is in task mode and raw Message enqueues are
    # rejected with InvalidAgentResponseError("Received Message
    # object in task mode. Use TaskStatusUpdateEvent or
    # TaskArtifactUpdateEvent instead."). updater.complete()
    # wraps the Message in a terminal TaskStatusUpdateEvent
    # (state=COMPLETED, final=True) which both streaming and
    # non-streaming clients accept.
    await updater.complete(message=msg)
    _result = final_text

except Exception as e:
@@ -520,10 +528,13 @@ class LangGraphA2AExecutor(AgentExecutor):
        task_span.set_status(StatusCode.ERROR, str(e))
    except Exception:
        pass
    # Emit a Message so both streaming and non-streaming clients
    # receive an error response rather than hanging.
    await event_queue.enqueue_event(
        new_text_message(
    # A2A v1: in task mode, terminal errors must publish a
    # FAILED TaskStatusUpdateEvent (carrying the error Message)
    # rather than a raw Message enqueue. updater.failed() does
    # exactly this — both streaming and non-streaming clients
    # receive the error and stop polling.
    await updater.failed(
        message=new_text_message(
            f"Agent error: {e}", task_id=task_id, context_id=context_id
        )
    )
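The task-mode rule these two hunks work around can be shown with a toy model. The sketch below is self-contained and deliberately does not import a2a-sdk: `Message`, `StatusUpdate`, `ToyQueue`, `ToyUpdater` and `run_turn` are all hypothetical stand-ins, and the rejection rule is modelled on the error text quoted in the comment above rather than on the SDK's actual implementation.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Message:
    text: str


@dataclass
class StatusUpdate:
    state: str        # "completed" or "failed" in this toy model
    message: Message
    final: bool = True


class ToyQueue:
    """Stand-in queue that rejects raw Messages once task mode is on."""

    def __init__(self) -> None:
        self.task_mode = False
        self.events: list[object] = []

    async def enqueue_event(self, event: object) -> None:
        if self.task_mode and isinstance(event, Message):
            raise RuntimeError("Received Message object in task mode")
        self.events.append(event)


class ToyUpdater:
    """Stand-in updater that wraps the Message in a terminal status event."""

    def __init__(self, queue: ToyQueue) -> None:
        self.queue = queue

    async def complete(self, message: Message) -> None:
        await self.queue.enqueue_event(StatusUpdate("completed", message))

    async def failed(self, message: Message) -> None:
        await self.queue.enqueue_event(StatusUpdate("failed", message))


async def run_turn(agent, queue: ToyQueue) -> None:
    queue.task_mode = True  # stands in for "a Task was enqueued first"
    updater = ToyUpdater(queue)
    try:
        text = await agent()
    except Exception as exc:
        await updater.failed(Message(f"Agent error: {exc}"))
    else:
        await updater.complete(Message(text))


async def ok_agent() -> str:
    return "Hello"


async def crashing_agent() -> str:
    raise RuntimeError("model crashed")


ok_queue, err_queue = ToyQueue(), ToyQueue()
asyncio.run(run_turn(ok_agent, ok_queue))
asyncio.run(run_turn(crashing_agent, err_queue))
print(ok_queue.events[-1].state, err_queue.events[-1].state)
```

The design point is that both the success and the failure path end with exactly one terminal status event, so a client polling the task has a clear signal to stop in either case.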
@@ -559,9 +559,10 @@ async def tool_chat_history(peer_id: str, limit: int = 20, before_ts: str = "")

    Hits ``/workspaces/<self>/activity?peer_id=<peer>&limit=<N>``
    against the workspace-server, which returns activity rows where
    this workspace is either the sender (``source_id=peer``) or the
    recipient (``target_id=peer``) of an A2A turn — both sides of the
    conversation in chronological order.
    the peer is either the sender (``source_id=peer`` — they sent us
    the message) or the recipient (``target_id=peer`` — we sent to
    them) of an A2A turn — both sides of the conversation in
    chronological order.

    Args:
        peer_id: The other workspace's UUID. Same value the agent
+27
-5
@@ -180,16 +180,38 @@ def run_preflight(config: WorkspaceConfig, config_path: str) -> PreflightReport:
            required_env = list(entry.get("required_env") or [])
            break

    # Smoke mode skips the auth-env block: the boot smoke (CI publish-image,
    # issue #2275) exercises executor.execute() against stub deps, never
    # hits the real provider, and CI cannot enumerate every adapter's auth
    # env without forming a maintenance treadmill. Hermes 2026-05-03 outage:
    # template smoke crashed for two cycles because molecule-ci injected
    # CLAUDE_CODE_OAUTH_TOKEN/ANTHROPIC_API_KEY/etc. but not HERMES_API_KEY.
    # Bypass here means new templates can ship without the workflow
    # learning their env names.
    smoke_mode = os.environ.get("MOLECULE_SMOKE_MODE", "").strip().lower() in (
        "1", "true", "yes", "on",
    )
    for env_var in required_env:
        if not os.environ.get(env_var):
            report.failures.append(
        if os.environ.get(env_var):
            continue
        if smoke_mode:
            report.warnings.append(
                PreflightIssue(
                    severity="fail",
                    severity="warn",
                    title="Required env",
                    detail=f"Missing required environment variable: {env_var}",
                    fix=f"Set {env_var} via the secrets API (global or workspace-level).",
                    detail=f"Missing {env_var} (skipped — MOLECULE_SMOKE_MODE)",
                    fix="",
                )
            )
            continue
        report.failures.append(
            PreflightIssue(
                severity="fail",
                title="Required env",
                detail=f"Missing required environment variable: {env_var}",
                fix=f"Set {env_var} via the secrets API (global or workspace-level).",
            )
        )

    # Backward compat: if legacy auth_token_file is set, warn but don't block
    # if the token is available via required_env or auth_token_env.
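The demotion rule in this hunk (missing env var is a failure normally, a warning under `MOLECULE_SMOKE_MODE`) can be isolated from the report plumbing. The following Python sketch is an assumption-laden restatement, not the project's code: `check_required_env` is a hypothetical helper that returns plain name lists instead of `PreflightIssue` objects, and it takes the environment as a parameter so it can be exercised without touching `os.environ`.

```python
import os
from typing import Mapping, Optional

# Same truthy spellings the hunk accepts for MOLECULE_SMOKE_MODE.
TRUTHY = ("1", "true", "yes", "on")


def check_required_env(
    required_env: list[str],
    environ: Optional[Mapping[str, str]] = None,
) -> tuple[list[str], list[str]]:
    """Return (failures, warnings) as lists of missing variable names.

    Hypothetical helper: missing variables fail normally, but are only
    warned about when MOLECULE_SMOKE_MODE is set to a truthy value.
    """
    env = os.environ if environ is None else environ
    smoke_mode = env.get("MOLECULE_SMOKE_MODE", "").strip().lower() in TRUTHY
    failures: list[str] = []
    warnings: list[str] = []
    for name in required_env:
        if env.get(name):
            continue  # present and non-empty: nothing to report
        (warnings if smoke_mode else failures).append(name)
    return failures, warnings


print(check_required_env(["HERMES_API_KEY"], {}))
print(check_required_env(["HERMES_API_KEY"], {"MOLECULE_SMOKE_MODE": "1"}))
```

Passing the environment in explicitly is a choice made for this sketch; the hunk itself reads `os.environ` directly.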
@@ -35,27 +35,41 @@ def _make_a2a_mocks():

    events_mod.EventQueue = EventQueue

    # a2a.server.tasks needs a TaskUpdater stub whose async methods are no-ops.
    # In tests, TaskUpdater calls go to this stub rather than the real SDK so
    # event_queue.enqueue_event is only called via explicit executor code paths.
    # a2a.server.tasks needs a TaskUpdater stub whose async methods are no-ops
    # for status transitions but ROUTE the terminal message back through
    # event_queue.enqueue_event so legacy assertions on enqueue_event keep
    # working. The wrapper preserves identity (the same Message object the
    # executor passed in) so tests inspecting str(event_arg) still see the
    # response text. complete()/failed() also record their last call on the
    # event_queue itself (`_complete_calls`, `_failed_calls`) so the v1
    # contract regression test (#262 follow-on to #2558) can pin the proper
    # path was taken — raw enqueue from executor would NOT touch these.
    tasks_mod = ModuleType("a2a.server.tasks")

    class TaskUpdater:
        """Stub TaskUpdater — no-op async methods for unit tests."""
        """Stub TaskUpdater — terminal helpers route through event_queue."""

        def __init__(self, event_queue, task_id, context_id, *args, **kwargs):
            self.event_queue = event_queue
            self.task_id = task_id
            self.context_id = context_id
            if not hasattr(event_queue, "_complete_calls"):
                event_queue._complete_calls = []
            if not hasattr(event_queue, "_failed_calls"):
                event_queue._failed_calls = []

        async def start_work(self, message=None):
            pass

        async def complete(self, message=None):
            pass
            self.event_queue._complete_calls.append(message)
            if message is not None:
                await self.event_queue.enqueue_event(message)

        async def failed(self, message=None):
            pass
            self.event_queue._failed_calls.append(message)
            if message is not None:
                await self.event_queue.enqueue_event(message)

        async def add_artifact(
            self, parts, artifact_id=None, name=None, metadata=None,
@@ -1123,3 +1123,81 @@ async def test_no_task_enqueue_on_continuation():
        assert not isinstance(event, Task), (
            f"continuation must not re-enqueue Task, but got Task at {call}"
        )


# ---------------------------------------------------------------------------
# A2A v1 task-mode terminal-event contract (PR #2558 follow-up, task #262)
# ---------------------------------------------------------------------------
# After PR #2558 enqueues a Task at the start of new requests, the executor
# is in v1 "task mode". The SDK then rejects any subsequent raw Message
# enqueue with InvalidAgentResponseError("Received Message object in task
# mode. Use TaskStatusUpdateEvent or TaskArtifactUpdateEvent instead.") —
# see a2a/server/agent_execution/active_task.py validation site. Synth-E2E
# 2026-05-03T11:00:34Z surfaced this. The fix routes the terminal Message
# through TaskUpdater.complete()/failed() which wrap it in a
# TaskStatusUpdateEvent. Both tests below pin that path so the regression
# can't recur (raw enqueue at the terminal step would NOT touch
# event_queue._complete_calls / _failed_calls).

@pytest.mark.asyncio
async def test_terminal_success_routes_via_updater_complete():
    """A successful run must terminate via updater.complete(message=...) —
    raw event_queue.enqueue_event(Message) crashes the v1 SDK in task mode."""
    agent = MagicMock()
    agent.astream_events = MagicMock(return_value=_stream(_text_chunk("Hello")))
    executor = LangGraphA2AExecutor(agent)

    part = MagicMock()
    part.text = "Hi"

    context = _make_context([part], "ctx-term-ok", task_id="task-term-ok")
    context.current_task = None # forces task-mode (Task gets enqueued)
    eq = _make_event_queue()
    # Pre-init real lists so the AsyncMock event_queue doesn't auto-spec
    # _complete_calls/_failed_calls into child MagicMocks. The conftest
    # TaskUpdater stub appends to these lists when complete/failed fire.
    eq._complete_calls = []
    eq._failed_calls = []

    await executor.execute(context, eq)

    assert eq._complete_calls, (
        "terminal Message must route via updater.complete() in task mode — "
        "raw event_queue.enqueue_event(Message) is rejected by a2a-sdk v1"
    )
    final_msg = eq._complete_calls[-1]
    assert "Hello" in str(final_msg)


@pytest.mark.asyncio
async def test_terminal_error_routes_via_updater_failed():
    """An agent crash must terminate via updater.failed(message=...) — raw
    enqueue in task mode hits the same v1 contract violation."""
    async def _error_stream(*args, **kwargs):
        raise RuntimeError("model crashed")
        yield # pragma: no cover — makes this an async generator

    agent = MagicMock()
    agent.astream_events = MagicMock(return_value=_error_stream())
    executor = LangGraphA2AExecutor(agent)

    part = MagicMock()
    part.text = "Break things"

    context = _make_context([part], "ctx-term-err", task_id="task-term-err")
    context.current_task = None # forces task-mode
    eq = _make_event_queue()
    eq._complete_calls = []
    eq._failed_calls = []

    await executor.execute(context, eq)

    assert eq._failed_calls, (
        "terminal error Message must route via updater.failed() in task mode"
    )
    err_msg = eq._failed_calls[-1]
    assert "model crashed" in str(err_msg)
    # And complete() must NOT have been called on the failure path.
    assert not eq._complete_calls, (
        "complete() should not fire when execute() raises"
    )
@@ -462,6 +462,68 @@ def test_envelope_enrichment_negative_caches_network_exception(_reset_peer_metad
    assert cached[1] is None


def test_envelope_enrichment_negative_caches_non_json_200(_reset_peer_metadata_cache):
    """HTTP 200 but the body isn't JSON (registry returns HTML, an empty
    string, or a partial response): ``response.json()`` raises. The
    enrichment block must absorb the exception, write the negative-cache
    entry, and never re-fetch this peer until TTL elapses.

    Without this contract a registry that mistakenly returns a non-JSON
    200 (proxy injecting an HTML error page; partial response from a
    flapping pod) would re-fire the 2s-bounded GET on every push for
    that peer — same DoS-on-self pattern the 5xx negative-cache test
    pins. #2483.
    """
    import json as _json

    import a2a_client
    from a2a_mcp_server import _build_channel_notification

    # 200 OK shape but .json() raises. side_effect overrides the
    # _make_httpx_response default of `return_value` so the helper can
    # stay shape-stable for callers that DO want a JSON body.
    resp = _make_httpx_response(200, {})
    resp.json.side_effect = _json.JSONDecodeError("not json", "<html>", 0)
    p, client = _patch_httpx_client(resp)
    with p:
        _build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "first"})
        _build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "second"})

    assert client.get.call_count == 1, (
        f"non-JSON 200 must be negative-cached, got {client.get.call_count} GETs"
    )
    cached = a2a_client._peer_metadata[_PEER_UUID]
    assert cached[1] is None, "negative cache stores None as the record"


def test_envelope_enrichment_negative_caches_non_dict_json_200(_reset_peer_metadata_cache):
    """HTTP 200, valid JSON, but the body is a list / string / number /
    null instead of the expected dict. ``isinstance(record, dict)``
    skips enrichment but the call must still write to the negative
    cache so a second push doesn't re-fetch.

    Pins behaviour for a registry that mistakenly returns
    ``[{"id": ...}, ...]`` (collection shape) or just ``null`` (no-record
    sentinel) — both should land at the same negative-cache outcome as a
    5xx or a non-JSON 200. #2483.
    """
    import a2a_client
    from a2a_mcp_server import _build_channel_notification

    p, client = _patch_httpx_client(
        _make_httpx_response(200, ["not", "a", "dict"]),
    )
    with p:
        _build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "first"})
        _build_channel_notification({"peer_id": _PEER_UUID, "kind": "peer_agent", "text": "second"})

    assert client.get.call_count == 1, (
        f"non-dict JSON 200 must be negative-cached, got {client.get.call_count} GETs"
    )
    cached = a2a_client._peer_metadata[_PEER_UUID]
    assert cached[1] is None, "negative cache stores None as the record"


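The contract pinned by the two tests above is a negative cache: any unusable lookup result is stored as `None` and not retried until the TTL elapses. The sketch below illustrates that pattern in isolation. It is not the `a2a_client` implementation: the class name `NegativeCachingLookup`, the injected `fetch` and `clock` callables, and the 300-second default TTL are all assumptions. Only the `(timestamp, record)` entry shape is inferred from the `cached[1] is None` assertions.

```python
import time
from typing import Callable, Optional


class NegativeCachingLookup:
    """Cache lookups per peer, including failures, for ttl_seconds.

    Hypothetical sketch: entries are (fetched_at, record_or_None).
    """

    def __init__(
        self,
        fetch: Callable[[str], object],
        ttl_seconds: float = 300.0,  # assumed value, not the project's TTL
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, Optional[dict]]] = {}

    def get(self, peer_id: str) -> Optional[dict]:
        now = self._clock()
        entry = self._entries.get(peer_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit, positive or negative
        try:
            record = self._fetch(peer_id)
        except Exception:
            record = None  # network error, non-JSON body, and so on
        if not isinstance(record, dict):
            record = None  # valid JSON of the wrong shape is also a miss
        self._entries[peer_id] = (now, record)
        return record


calls: list[str] = []


def flaky_fetch(peer_id: str) -> object:
    calls.append(peer_id)
    raise ValueError("body was not JSON")


fake_now = [0.0]
lookup = NegativeCachingLookup(flaky_fetch, ttl_seconds=10.0, clock=lambda: fake_now[0])
lookup.get("peer-1")
lookup.get("peer-1")
print("fetches within TTL:", len(calls))
```

Injecting the clock keeps the TTL behaviour testable without sleeping, which is one common way to pin expiry logic.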
def test_envelope_enrichment_re_fetches_after_ttl(_reset_peer_metadata_cache):
    """Cached entry past TTL: registry is hit again. Pin the TTL
    behaviour so a future caller bumping ``_PEER_METADATA_TTL_SECONDS``
@@ -1050,6 +1050,27 @@ class TestChatHistory:

        assert mc.get.call_args.kwargs["params"]["before_ts"] == "2026-05-01T00:00:00Z"

    async def test_empty_history_returns_empty_json_list(self):
        """Pin the happy-path-with-no-rows shape: server returns 200
        with an empty list, the wheel returns the JSON literal ``"[]"``.

        Without this pin the surrounding tests all pre-populate rows;
        none verify what an agent sees when there's literally no chat
        history with this peer yet (a fresh A2A peering, or a peer
        whose history was rotated out). #2485.
        """
        import a2a_tools

        mc = _make_http_mock(get_resp=_resp(200, []))
        with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
            result = await a2a_tools.tool_chat_history(peer_id=_PEER)

        # Exact-equality on the JSON literal (per assert-exact memory) —
        # substring "[]" would also match `{"items": []}` or any number
        # of envelope shapes, only `result == "[]"` discriminates the
        # bare-list contract callers depend on.
        assert result == "[]"

    async def test_reverses_desc_response_to_chronological(self):
        """Server returns DESC (newest first); the wheel reverses to
        chronological so the agent reads the chat top-down — same
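
The reasoning in the exact-equality comment above can be checked with the standard `json` module alone. The snippet below is only an illustration; `bare_list` and `envelope` are made-up stand-ins for the two response shapes and are not taken from `a2a_tools`.

```python
import json

# Two serialized shapes: the bare list and an envelope that wraps a list.
bare_list = json.dumps([])
envelope = json.dumps({"items": []})

# A substring check cannot tell the two shapes apart.
substring_matches = ["[]" in bare_list, "[]" in envelope]

# Exact equality accepts only the bare list.
exact_matches = [bare_list == "[]", envelope == "[]"]
```

Both substring checks succeed, while only the bare list passes the equality check, so the equality assertion is the one that pins the bare-list contract.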
@@ -286,6 +286,55 @@ def test_required_env_empty_list_passes(tmp_path):
    assert report.ok is True


def test_required_env_skipped_in_smoke_mode(tmp_path, monkeypatch):
    """MOLECULE_SMOKE_MODE=1 demotes Required-env failures to warnings.

    Boot smoke (issue #2275) exercises executor.execute() against stub
    deps and never hits the real provider, so missing auth env is not
    a real blocker. Without this bypass, every adapter that introduces
    a new auth env var (HERMES_API_KEY, OPENROUTER_API_KEY, etc.)
    would silently break the publish-image gate until molecule-ci's
    fake-env list catches up — the 2026-05-03 hermes outage. The
    warning still surfaces in the report so unset env doesn't go
    completely silent.
    """
    monkeypatch.delenv("HERMES_API_KEY", raising=False)
    monkeypatch.setenv("MOLECULE_SMOKE_MODE", "1")

    config = make_config(
        runtime_config=RuntimeConfig(required_env=["HERMES_API_KEY"]),
    )

    report = run_preflight(config, str(tmp_path))

    assert report.ok is True
    assert any(
        issue.title == "Required env" and "HERMES_API_KEY" in issue.detail
        for issue in report.warnings
    ), "smoke-mode bypass should still warn so unset env stays visible"
    assert not any(
        issue.title == "Required env" for issue in report.failures
    )


def test_required_env_smoke_mode_off_still_fails(tmp_path, monkeypatch):
    """Sanity: smoke bypass is OFF when MOLECULE_SMOKE_MODE is unset."""
    monkeypatch.delenv("HERMES_API_KEY", raising=False)
    monkeypatch.delenv("MOLECULE_SMOKE_MODE", raising=False)

    config = make_config(
        runtime_config=RuntimeConfig(required_env=["HERMES_API_KEY"]),
    )

    report = run_preflight(config, str(tmp_path))

    assert report.ok is False
    assert any(
        issue.title == "Required env" and "HERMES_API_KEY" in issue.detail
        for issue in report.failures
    )


# ---------- Per-model required_env (models[] override) ----------
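
The smoke-mode bypass exercised by `test_required_env_skipped_in_smoke_mode` and `test_required_env_smoke_mode_off_still_fails` could look something like the sketch below. It is an assumption-laden illustration, not the real `run_preflight`: `Issue`, `Report`, and `check_required_env` are hypothetical names, and only the `MOLECULE_SMOKE_MODE=1` switch, the "Required env" title, and the warnings/failures split come from the tests.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Issue:
    title: str
    detail: str


@dataclass
class Report:
    failures: list = field(default_factory=list)
    warnings: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # Only failures block; warnings stay visible without failing the run.
        return not self.failures


def check_required_env(required, environ=None) -> Report:
    """Hypothetical check: missing env vars fail, or warn in smoke mode."""
    env = os.environ if environ is None else environ
    report = Report()
    missing = [name for name in required if not env.get(name)]
    if missing:
        issue = Issue("Required env", "missing: " + ", ".join(missing))
        smoke = env.get("MOLECULE_SMOKE_MODE") == "1"
        (report.warnings if smoke else report.failures).append(issue)
    return report
```

Routing the same issue to `warnings` instead of dropping it keeps the unset variable visible in the report, which is the property the first test's `report.warnings` assertion pins.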