Compare commits

7 Commits

Author SHA1 Message Date
core-platform 871501dfc9 fix(ws-server): fail-closed on unresolvable template runtime (controlplane#188)
E2E API Smoke Test / detect-changes (pull_request) Failing after 2s
E2E Chat / detect-changes (pull_request) Failing after 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has been skipped
E2E Chat / E2E Chat (pull_request) Has been skipped
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 8s
qa-review / approved (pull_request) Failing after 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
gate-check-v3 / gate-check (pull_request) Successful in 12s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 17s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
security-review / approved (pull_request) Failing after 15s
sop-tier-check / tier-check (pull_request) Successful in 14s
Harness Replays / Harness Replays (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m45s
CI / Canvas (Next.js) (pull_request) Successful in 4m14s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 4m30s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m5s
CI / Python Lint & Test (pull_request) Successful in 7m17s
CI / all-required (pull_request) Successful in 7m18s
audit-force-merge / audit (pull_request) Has been skipped
POST /workspaces silently substituted langgraph and returned 201 when a
caller named a `template` (intent for a specific runtime) but the runtime
could not be resolved from it (config.yaml unreadable / no `runtime:`
key). This is the molecule-controlplane#188 / #184 contract violation —
it produced 5/5 wrong-runtime workspaces and a false codex E2E pass.

The ws-server `Create` handler is the boundary the product UI actually
hits (the canvas dialog and provision_workspace MCP tool both POST here);
controlplane#188's CP-side gate is the sibling. This closes the
ws-server side: when the caller expressed runtime intent (passed
`runtime`, or named a `template`) but it cannot be honored, return 422
RUNTIME_UNRESOLVED instead of a silent langgraph 201.

The legitimate default path (bare {"name":...} — no template, no
runtime) still defaults to langgraph and returns 201; a regression test
pins that so the fail-closed gate can't over-fire.

Tests: TestWorkspaceCreate_188_* (missing template, no-runtime-key
template, default-path regression guard, explicit-runtime OK).

Refs: molecule-controlplane#188, #184

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 01:51:16 -07:00
hongming 7cff067b6e fix(ci): unblock runtime publish and secret scan (#1479)
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
CI / Shellcheck (E2E scripts) (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Chat / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 4s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 4s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 3s
publish-workspace-server-image / build-and-push (push) Successful in 4m1s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m17s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 1m26s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 1m5s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m11s
E2E Chat / E2E Chat (push) Failing after 2m21s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m2s
CI / Platform (Go) (push) Failing after 4m59s
CI / all-required (push) Failing after 4m21s
publish-workspace-server-image / Production auto-deploy (push) Failing after 2m47s
CI / Python Lint & Test (push) Successful in 6m39s
CI / Canvas (Next.js) (push) Successful in 7m42s
CI / Canvas Deploy Reminder (push) Successful in 2s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 26s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 7m3s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m6s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 7s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 6m59s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 38s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 3s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 11s
ci-required-drift / drift (push) Successful in 33s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 6m54s
gitea-merge-queue / queue (push) Successful in 1m0s
status-reaper / reap (push) Successful in 1m22s
Co-authored-by: hongming <hongmingwang@moleculesai.app>
Co-committed-by: hongming <hongmingwang@moleculesai.app>
2026-05-18 06:16:59 +00:00
hongming-pc2 684d9b699c fix(ci): document event-suffix requirement for branch protection context (#1473) (#1474)
CI / Platform (Go) (push) Waiting to run
CI / all-required (push) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
publish-runtime-autobump / pr-validate (push) Successful in 36s
publish-runtime-autobump / bump-and-tag (push) Failing after 34s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m21s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
Co-authored-by: hongming-pc2 <hongming-pc2@moleculesai.app>
Co-committed-by: hongming-pc2 <hongming-pc2@moleculesai.app>
2026-05-18 06:16:43 +00:00
infra-sre b49d5bbe6c fix(ci): add 10m timeout to secret-scan job (mc#1099 follow-up) (#1258)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Waiting to run
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 29s
Co-authored-by: Molecule AI Infra-SRE <infra-sre@agents.moleculesai.app>
Co-committed-by: Molecule AI Infra-SRE <infra-sre@agents.moleculesai.app>
2026-05-18 06:16:24 +00:00
devops-engineer b27826d148 fix(ci): review-check.sh — diagnose wrong-event-string PENDING reviews (internal#503) (#1482)
Block internal-flavored paths / Block forbidden paths (push) Waiting to run
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Waiting to run
CI / Canvas (Next.js) (push) Waiting to run
CI / Shellcheck (E2E scripts) (push) Waiting to run
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Waiting to run
CI / all-required (push) Waiting to run
E2E API Smoke Test / detect-changes (push) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Chat / detect-changes (push) Waiting to run
E2E Chat / E2E Chat (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (push) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / detect-changes (push) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
publish-workspace-server-image / Production auto-deploy (push) Blocked by required conditions
Runtime PR-Built Compatibility / detect-changes (push) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (push) Waiting to run
publish-workspace-server-image / build-and-push (push) Has been cancelled
review-check-tests / review-check.sh regression tests (push) Successful in 18s
Ops Scripts Tests / Ops scripts (unittest) (push) Has been cancelled
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
gitea-merge-queue / queue (push) Successful in 6s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
status-reaper / reap (push) Successful in 1m16s
Co-authored-by: devops-engineer <devops-engineer@agents.moleculesai.app>
Co-committed-by: devops-engineer <devops-engineer@agents.moleculesai.app>
2026-05-18 06:14:34 +00:00
devops-engineer b4427ac8a6 fix(ci): exclude secrets-detector test fixtures from secret-scan (unblocks A2A-P0 deploy) (#1477)
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CI / Detect changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 14s
E2E Chat / detect-changes (push) Successful in 18s
E2E API Smoke Test / detect-changes (push) Successful in 18s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 20s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 10s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 48s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 40s
E2E Chat / E2E Chat (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m37s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 1m1s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m34s
CI / Canvas (Next.js) (push) Successful in 4m17s
publish-workspace-server-image / build-and-push (push) Successful in 6m14s
CI / Python Lint & Test (push) Successful in 6m20s
CI / Platform (Go) (push) Successful in 6m51s
CI / all-required (push) Successful in 6m50s
publish-workspace-server-image / Production auto-deploy (push) Successful in 2m18s
CI / Canvas Deploy Reminder (push) Successful in 2s
Sweep stale Cloudflare Tunnels / Sweep CF tunnels (push) Successful in 5s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 5s
E2E Staging Sanity (leak-detection self-check) / Intentional-failure teardown sanity (push) Successful in 1m58s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 5m15s
status-reaper / reap (push) Has started running
main-red-watchdog / watchdog (push) Successful in 36s
gitea-merge-queue / queue (push) Has started running
gate-check-v3 / gate-check (push) Successful in 55s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m57s
2026-05-18 05:18:24 +00:00
devops-engineer 5324e69049 Merge pull request 'promote: staging→main — A2A P0 (internal#498) + 25 gated staging fixes' (#1450) from staging into main
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Chat / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
E2E Staging SaaS (full lifecycle) / pr-validate (push) Successful in 29s
Handlers Postgres Integration / detect-changes (push) Successful in 5s
Harness Replays / detect-changes (push) Successful in 5s
publish-runtime-autobump / pr-validate (push) Successful in 29s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (push) Successful in 1m39s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (push) Failing after 10s
publish-runtime-autobump / bump-and-tag (push) Failing after 33s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (push) Has been skipped
E2E Chat / E2E Chat (push) Failing after 53s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m27s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 31s
Harness Replays / Harness Replays (push) Successful in 4s
publish-canvas-image / Build & push canvas image (push) Successful in 3m45s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Failing after 34s
publish-workspace-server-image / build-and-push (push) Successful in 5m24s
E2E Staging External Runtime / E2E Staging External Runtime (push) Successful in 5m27s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push) Successful in 5m22s
publish-workspace-server-image / Production auto-deploy (push) Failing after 18s
CI / Platform (Go) (push) Successful in 6m12s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 29s
CI / Python Lint & Test (push) Successful in 7m1s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
CI / Canvas (Next.js) (push) Successful in 7m11s
CI / Canvas Deploy Reminder (push) Successful in 1s
CI / all-required (push) Successful in 7m15s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m58s
main-red-watchdog / watchdog (push) Successful in 27s
gate-check-v3 / gate-check (push) Successful in 1m11s
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 2s
gitea-merge-queue / queue (push) Successful in 5s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
status-reaper / reap (push) Successful in 56s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m54s
ci-required-drift / drift (push) Successful in 1m10s
2026-05-18 04:54:22 +00:00
29 changed files with 324 additions and 1415 deletions
+23
@@ -206,6 +206,29 @@ CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILT
debug "candidate non-author approvers: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -z "$CANDIDATES" ]; then
# --- Guardrail (internal#503): explain the most common false
# "no candidates" red. Gitea's review event enum is EXACTLY
# APPROVED/REQUEST_CHANGES/COMMENT/PENDING. A wrong value ("APPROVE",
# lowercase, ...) is silently accepted (HTTP 200) and stored as
# state=PENDING. A correctly-started draft review has an EMPTY body;
# a NON-empty body + state==PENDING by a non-author == an intended
# verdict mis-filed by a wrong event string. Surface it actionably.
# This does NOT change the gate result (still fail-closed below) — it
# only converts a mystery red into a named, self-fixing error.
MISFILED_FILTER='.[]
| select(.state == "PENDING")
| select(.dismissed != true)
| select(.user.login != $author)
| select(((.body // "") | gsub("^\\s+|\\s+$";"") | length) > 0)
| "\(.id)\t\(.user.login)"'
MISFILED=$(jq -r --arg author "$PR_AUTHOR" "$MISFILED_FILTER" "$REVIEWS_JSON" 2>/dev/null || true)
if [ -n "$MISFILED" ]; then
echo "::error::${TEAM}-review: non-author review(s) were SUBMITTED but stored as PENDING — almost certainly the wrong Gitea review event string (internal#503)."
echo "::error::Gitea accepts ONLY the exact enum APPROVED / REQUEST_CHANGES / COMMENT. 'APPROVE' or lowercase is silently (HTTP 200) filed as PENDING and is invisible to this gate."
printf '%s\n' "$MISFILED" | while IFS="$(printf '\t')" read -r _rid _rl; do
[ -n "${_rid:-}" ] && echo "::error:: review id=${_rid} by '${_rl}': RE-SUBMIT via POST ${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}/reviews with {\"event\":\"APPROVED\"} (correct enum) — do NOT edit the DB."
done
fi
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates yet)"
exit 1
fi
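
The jq filter in this hunk can be restated in Python as a rough equivalent, which may be easier to unit-test in isolation. This is a sketch, not the script's implementation: `misfiled_reviews` is a hypothetical helper, and the input is assumed to be the already-decoded list of review objects carrying the `id`, `state`, `dismissed`, `user.login`, and `body` fields the filter reads.

```python
from typing import Any, Dict, List, Tuple


def misfiled_reviews(reviews: List[Dict[str, Any]], author: str) -> List[Tuple[Any, str]]:
    """Return (review id, login) for reviews that look like mis-filed verdicts.

    Mirrors the filter's four conditions: state is PENDING, not dismissed,
    not written by the PR author, and a body that is non-empty after trimming.
    """
    found = []
    for review in reviews:
        if review.get("state") != "PENDING":
            continue
        if review.get("dismissed") is True:
            continue
        login = (review.get("user") or {}).get("login")
        if login == author:
            continue
        # A correctly started draft has an empty body; a non-empty one
        # suggests an intended verdict stored under the wrong state.
        if not (review.get("body") or "").strip():
            continue
        found.append((review.get("id"), login))
    return found
```

As in the shell version, a non-empty result only adds diagnostics; it should not change the gate outcome.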
+17 -17
@@ -145,10 +145,10 @@ jobs:
 # the diagnostic step with its own continue-on-error: true (line 203).
 # Flip confirmed by CI / Platform (Go) status = success on main HEAD 363905d3.
 continue-on-error: false
-# Job-level ceiling. The go test step below runs with a per-step 30m timeout;
-# this cap catches any step that leaks past that. Set well above 30m so
+# Job-level ceiling. The go test step below runs with a per-step 10m timeout;
+# this cap catches any step that leaks past that. Set well above 10m so
 # the per-step timeout is the active constraint.
-timeout-minutes: 35
+timeout-minutes: 15
 defaults:
 run:
 working-directory: workspace-server
@@ -176,14 +176,12 @@ jobs:
 name: Run golangci-lint
 run: $(go env GOPATH)/bin/golangci-lint run --timeout 3m ./...
 - if: always()
-name: Diagnostic — per-package verbose (300s timeout)
+name: Diagnostic — per-package verbose 60s
 run: |
 set +e
-# 300s allows handlers + pendinguploads packages to complete on cold
-# runners with -race instrumentation (~60-120s each vs ~14s non-race).
-go test -race -v -timeout 300s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
+go test -race -v -timeout 60s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
 handlers_exit=$?
-go test -race -v -timeout 300s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
+go test -race -v -timeout 60s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
 pu_exit=$?
 echo "::group::handlers exit=$handlers_exit (last 100 lines)"
 tail -100 /tmp/test-handlers.log
@@ -196,10 +194,10 @@ jobs:
 - if: always()
 name: Run tests with race detection and coverage
 # Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
-# full ./... suite with race detection + coverage. A 30m per-step timeout
-# lets the suite complete on cold cache (~13-25m) while failing cleanly
-# instead of OOM-killing. The job-level timeout (35m) is a backstop.
-run: go test -race -timeout 30m -coverprofile=coverage.out ./...
+# full ./... suite with race detection + coverage. A 10m per-step timeout
+# lets the suite complete on cold cache (~5-7m) while failing cleanly
+# instead of OOM-killing. The job-level timeout (15m) is a backstop.
+run: go test -race -timeout 10m -coverprofile=coverage.out ./...
 - if: always()
 name: Per-file coverage report
@@ -540,11 +538,13 @@ jobs:
 all-required:
 # Aggregator sentinel — RFC internal#219 §2 (Phase 4 — closes internal#286).
 #
-# Single stable required-status name that branch protection points at;
-# CI churns underneath in `needs:` without any protection edits. Mirrors
-# the molecule-controlplane Phase 2a impl shipped in CP PR#112 and
-# referenced by `internal#286` ("Phase 4 is a single small PR... mirrors
-# CP's existing one").
+# Emits `CI / all-required (<event>)` where <event> is the workflow trigger
+# (e.g. `CI / all-required (pull_request)`, `CI / all-required (push)`).
+# Branch protection MUST be updated to require the event-suffixed name —
+# requiring `CI / all-required` (bare, no suffix) silently blocks all merges
+# because Gitea treats absent status contexts as pending (not skipped), and
+# no workflow emits the bare name. Fixed: BP now requires
+# `CI / all-required (pull_request)` per issue #1473.
 #
 # Closes the failure mode where status_check_contexts on molecule-core/main
 # only listed `Secret scan` + `sop-tier-check` (the 2 meta-gates), so real
+4
@@ -52,5 +52,9 @@ jobs:
# explicitly instead of the combined state avoids false-pause when
# non-blocking jobs (continue-on-error: true) have failed — those
# failures pollute combined state but do not gate merges.
# NOTE: the event-suffixed context name is intentional — branch protection
# MUST require `CI / all-required (pull_request)` (with suffix), NOT the
# bare `CI / all-required`. Gitea treats absent contexts as pending, not
# skipped; requiring the bare name silently blocks all merges (issue #1473).
PUSH_REQUIRED_CONTEXTS: CI / all-required (push)
run: python3 .gitea/scripts/gitea-merge-queue.py
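
To make the failure mode in the note above concrete, here is a small sketch of how a required-context check behaves when a required name is never emitted. It models only the rule stated in the comment (an absent context counts as pending); `gate_state` is a hypothetical helper, not code from `gitea-merge-queue.py` or from Gitea itself, and the state strings are assumed values.

```python
from typing import Dict, List


def gate_state(required: List[str], statuses: Dict[str, str]) -> str:
    """Combine required contexts into one state.

    statuses maps an emitted context name to its state. A required
    context that was never emitted is treated as "pending", not skipped.
    """
    states = [statuses.get(context, "pending") for context in required]
    if any(state in ("failure", "error") for state in states):
        return "failure"
    if all(state == "success" for state in states):
        return "success"
    return "pending"


# Only the event-suffixed name is ever emitted by the workflow.
emitted = {"CI / all-required (pull_request)": "success"}

bare = gate_state(["CI / all-required"], emitted)
suffixed = gate_state(["CI / all-required (pull_request)"], emitted)
```

Under these assumptions the bare name stays pending indefinitely while the suffixed name resolves, which matches the blocked-merge symptom described in issue #1473.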
+19 -4
@@ -104,7 +104,7 @@ jobs:
 with:
 python-version: "3.11"
-- name: Compute next version from PyPI latest
+- name: Compute next version from PyPI latest and existing tags
 id: bump
 run: |
 set -eu
@@ -112,9 +112,24 @@ jobs:
 | python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])")
 MAJOR=$(echo "$LATEST" | cut -d. -f1)
 MINOR=$(echo "$LATEST" | cut -d. -f2)
-PATCH=$(echo "$LATEST" | cut -d. -f3)
-VERSION="${MAJOR}.${MINOR}.$((PATCH+1))"
-echo "PyPI latest=$LATEST -> next=$VERSION"
+TAG_LATEST=$(git tag --list "runtime-v${MAJOR}.${MINOR}.*" \
+| sed -E 's/^runtime-v//' \
+| grep -E '^[0-9]+\.[0-9]+\.[0-9]+$' \
+| sort -V \
+| tail -1 || true)
+VERSION=$(PYPI_LATEST="$LATEST" TAG_LATEST="$TAG_LATEST" python - <<'PY'
+import os
+def parse(v):
+    return tuple(int(part) for part in v.split("."))
+pypi = os.environ["PYPI_LATEST"]
+tag = os.environ.get("TAG_LATEST") or pypi
+base = max(parse(pypi), parse(tag))
+print(f"{base[0]}.{base[1]}.{base[2] + 1}")
+PY
+)
+echo "PyPI latest=$LATEST, latest runtime tag=${TAG_LATEST:-none} -> next=$VERSION"
 if ! echo "$VERSION" | grep -qE '^[0-9]+\.[0-9]+\.[0-9]+$'; then
 echo "::error::computed version $VERSION does not match PEP 440 X.Y.Z"
 exit 1
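
The bump logic in this hunk is split between a shell pipeline and an inline Python heredoc; the same computation can be sketched as one pure function, which makes the "max of PyPI and tags, then patch + 1" rule easier to test. `next_version` is a hypothetical name, the tag list is passed in rather than read from `git tag`, and the PyPI version is assumed to already have the X.Y.Z shape.

```python
import re
from typing import Iterable, Tuple

_XYZ = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+")
_TAG_PREFIX = "runtime-v"


def parse(version: str) -> Tuple[int, ...]:
    """Split an X.Y.Z string into a tuple of ints for ordering."""
    return tuple(int(part) for part in version.split("."))


def next_version(pypi_latest: str, tags: Iterable[str]) -> str:
    """Return the next patch version after the newer of PyPI and local tags.

    Only tags of the form runtime-vX.Y.Z in the same major.minor line as
    pypi_latest are considered, mirroring the git/sed/grep pipeline.
    """
    major, minor, _patch = parse(pypi_latest)
    wanted = f"{_TAG_PREFIX}{major}.{minor}."
    candidates = [
        tag[len(_TAG_PREFIX):]
        for tag in tags
        if tag.startswith(wanted) and _XYZ.fullmatch(tag[len(_TAG_PREFIX):])
    ]
    # With no matching tags, fall back to the PyPI version alone.
    base = max([parse(pypi_latest)] + [parse(c) for c in candidates])
    return f"{base[0]}.{base[1]}.{base[2] + 1}"
```

Taking the maximum means a tag that was pushed without a matching PyPI release no longer causes the job to recompute a version that already exists as a tag.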
+13
@@ -30,6 +30,11 @@ jobs:
scan:
name: Scan diff for credential-shaped strings
runs-on: ubuntu-latest
# Hard CI gate — must complete or the PR is unmergable. 10-minute ceiling
# is generous for a diff-scan against a single SHA. If this times out, the
# runner is frozen and holding a slot — the step timeout triggers clean
# failure, releasing the runner for the next job.
timeout-minutes: 10
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
@@ -133,6 +138,14 @@ jobs:
[ -z "$f" ] && continue
[ "$f" = "$SELF_GITHUB" ] && continue
[ "$f" = "$SELF_GITEA" ] && continue
# Test-fixture exclude (internal#425): the secrets-detector's OWN
# unit-test corpus deliberately embeds credential-SHAPED example
# strings to exercise the detector. Verified 2026-05-18 synthetic
# (fabricated ghp_* fixtures, not real). Without this the scanner
# self-trips on its own fixtures and fail-closes every deploy.
# Same rationale as the SELF_* excludes above; gate NOT weakened
# (all other paths still fully scanned).
[ "$f" = "workspace-server/internal/secrets/patterns_test.go" ] && continue
if [ -n "$DIFF_RANGE" ]; then
ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true)
else
+1 -8
@@ -748,14 +748,7 @@ export function MobileChat({
 border: "none",
 outline: "none",
 background: "transparent",
-// 16px floor: iOS Safari/WebKit auto-zooms the viewport on
-// focus when a focused field's font-size is < 16px. Anything
-// below this re-introduces the tap-to-zoom layout jump on the
-// mobile PWA. Do NOT lower this without also adding a
-// maximum-scale/user-scalable viewport lock — and that lock
-// breaks pinch-to-zoom accessibility, so 16px here is the
-// correct trade.
-fontSize: 16,
+fontSize: 14.5,
 lineHeight: 1.4,
 color: p.text,
 padding: "6px 0",
@@ -263,20 +263,6 @@ describe("MobileChat — composer", () => {
const sendBtn = container.querySelector('[aria-label="Send"]') as HTMLButtonElement;
expect(sendBtn.disabled).toBe(true);
});
// iOS Safari/WebKit auto-zooms the viewport on focus when a focused
// <input>/<textarea> has an effective font-size below 16px. On the
// mobile PWA this made the whole layout scale up the moment the user
// tapped into the chat box. Keeping the composer font ≥16px is the
// root-cause fix — it suppresses the focus-zoom WITHOUT disabling
// pinch-to-zoom (which a maximum-scale/user-scalable viewport hack
// would have done at the cost of accessibility).
it("composer textarea font-size is >= 16px (prevents iOS focus-zoom)", () => {
const { container } = renderChat(mockAgentId);
const textarea = container.querySelector("textarea") as HTMLTextAreaElement;
const fontSizePx = parseFloat(textarea.style.fontSize);
expect(fontSizePx).toBeGreaterThanOrEqual(16);
});
});
// ─── Tabs ─────────────────────────────────────────────────────────────────────
@@ -248,88 +248,6 @@ describe("extractResponseText", () => {
});
});
describe("extractAgentText", () => {
it("extracts text from top-level parts", () => {
const task = {
parts: [{ kind: "text", text: "Agent said hello" }],
};
expect(extractAgentText(task)).toBe("Agent said hello");
});
it("extracts from artifacts[0].parts when top-level parts absent", () => {
const task = {
artifacts: [
{ parts: [{ kind: "text", text: "From artifact block" }] },
],
};
expect(extractAgentText(task)).toBe("From artifact block");
});
it("extracts from status.message.parts as fallback", () => {
const task = {
status: {
message: { parts: [{ kind: "text", text: "Status text" }] },
},
};
expect(extractAgentText(task)).toBe("Status text");
});
it("prefers top-level parts over artifacts", () => {
const task = {
parts: [{ kind: "text", text: "top-level wins" }],
artifacts: [
{ parts: [{ kind: "text", text: "artifact text" }] },
],
};
expect(extractAgentText(task)).toBe("top-level wins");
});
it("prefers top-level parts over status.message", () => {
const task = {
parts: [{ kind: "text", text: "parts wins" }],
status: {
message: { parts: [{ kind: "text", text: "status text" }] },
},
};
expect(extractAgentText(task)).toBe("parts wins");
});
it("returns string identity when task itself is a string", () => {
expect(extractAgentText("plain string task" as unknown as Record<string, unknown>)).toBe(
"plain string task",
);
});
it("returns fallback when task is an empty object", () => {
expect(extractAgentText({})).toBe("(Could not extract response text)");
});
it("returns fallback when task has no extractable text", () => {
expect(
extractAgentText({ status: "running", other: "fields" }),
).toBe("(Could not extract response text)");
});
it("tolerates malformed nested shapes without throwing", () => {
const task = {
parts: null,
artifacts: "not an array",
status: { message: 42 },
};
expect(extractAgentText(task)).toBe("(Could not extract response text)");
});
it("joins multiple text parts with newline", () => {
const task = {
parts: [
{ kind: "text", text: "Line one" },
{ kind: "text", text: "Line two" },
],
};
expect(extractAgentText(task)).toBe("Line one\nLine two");
});
});
describe("extractTextsFromParts", () => {
it("extracts text parts with kind=text", () => {
const parts = [
@@ -1,102 +0,0 @@
import { describe, it, expect, beforeEach } from "vitest";
import { useCanvasStore } from "@/store/canvas";
import { resolveWorkspaceName } from "../hooks/resolveWorkspaceName";
beforeEach(() => {
// Reset store to a clean slate between tests so node lookup is deterministic.
useCanvasStore.setState({ nodes: [] });
});
describe("resolveWorkspaceName", () => {
it("returns the workspace name when a node with that ID exists", () => {
useCanvasStore.setState({
nodes: [
{
id: "ws-alpha-001",
type: "workspace",
data: { name: "Alpha Agent" },
position: { x: 0, y: 0 },
},
],
});
expect(resolveWorkspaceName("ws-alpha-001")).toBe("Alpha Agent");
});
it("falls back to the first 8 chars of the ID when no matching node exists", () => {
expect(resolveWorkspaceName("ws-zzz-not-found")).toBe("ws-zzz-n");
});
it("falls back to the first 8 chars when the node exists but has no name", () => {
useCanvasStore.setState({
nodes: [
{
id: "ws-no-name",
type: "workspace",
// data.name is deliberately absent
data: {},
position: { x: 0, y: 0 },
},
],
});
expect(resolveWorkspaceName("ws-no-name")).toBe("ws-no-na");
});
it("returns the whole ID when it is shorter than 8 chars", () => {
expect(resolveWorkspaceName("ab")).toBe("ab");
});
it("returns the first 8 chars when the ID is exactly 8 characters", () => {
// slice(0,8) of an 8-char string is the full string
const id = "12345678";
expect(resolveWorkspaceName(id)).toBe(id);
});
it("picks the right node when multiple workspaces share a prefix", () => {
useCanvasStore.setState({
nodes: [
{
id: "00000000-0000-0000-0000-000000000001",
type: "workspace",
data: { name: "Backend Agent" },
position: { x: 0, y: 0 },
},
{
id: "00000000-0000-0000-0000-000000000002",
type: "workspace",
data: { name: "Frontend Agent" },
position: { x: 100, y: 0 },
},
],
});
expect(resolveWorkspaceName("00000000-0000-0000-0000-000000000002")).toBe(
"Frontend Agent"
);
expect(resolveWorkspaceName("00000000-0000-0000-0000-000000000001")).toBe(
"Backend Agent"
);
});
it("does not mutate store state between calls", () => {
useCanvasStore.setState({
nodes: [
{
id: "stable-id",
type: "workspace",
data: { name: "Stable Workspace" },
position: { x: 0, y: 0 },
},
],
});
resolveWorkspaceName("stable-id");
resolveWorkspaceName("unknown-id");
// Store nodes must be unchanged — resolveWorkspaceName is read-only.
const nodes = useCanvasStore.getState().nodes;
expect(nodes).toHaveLength(1);
expect((nodes[0] as { id: string }).id).toBe("stable-id");
});
});
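The fallback rule these tests pin (use the node's name when present, otherwise the first 8 characters of the ID) can be sketched as below. This is a minimal sketch, not the hook's actual source: `WorkspaceNode` and `resolveWorkspaceNameSketch` are hypothetical names, and the node list is passed in as a parameter instead of being read from `useCanvasStore` so the example runs on its own.

```typescript
// Hypothetical node shape; the real store type is not shown in this diff.
interface WorkspaceNode {
  id: string;
  data: { name?: string };
}

// Sketch only: receives the nodes explicitly rather than reading the store.
function resolveWorkspaceNameSketch(nodes: WorkspaceNode[], id: string): string {
  const match = nodes.find((n) => n.id === id);
  // A missing node and a missing/empty name both fall back to the ID prefix.
  // slice(0, 8) returns the whole string when the ID has 8 or fewer characters.
  return match?.data.name || id.slice(0, 8);
}
```

Because the function only reads its arguments, it cannot mutate the node list, which is the property the last test in the suite asserts against the store.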
@@ -1,209 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for useChatSend — the canvas user→agent send hook.
*
* Behavioural focus: the poll-mode ("queued") path. When the target
* workspace is an external / MCP-registered agent (delivery_mode=poll,
* e.g. an operator laptop running the molecule MCP channel), the
* platform's POST /workspaces/:id/a2a returns a synthetic
* {status:"queued", delivery_mode:"poll"} envelope IMMEDIATELY with no
* reply — the real reply arrives later over the AGENT_MESSAGE
* WebSocket push.
*
* Pre-fix the hook treated that synthetic envelope as a terminal
* response and called releaseSendGuards() → `sending` went false the
* instant the POST returned → the "agent is working" indicator
* vanished and the external turn looked dead. This suite pins the
* fixed contract:
*
* - a real reply still clears `sending` (regression guard)
* - a poll "queued" envelope KEEPS `sending` true (no terminal
* clear) so the existing thinking indicator persists
* - the eventual reply path (releaseSendGuards, the same call the
* AGENT_MESSAGE WS push makes via useChatSocket) clears it
* - an offline poll agent that never replies eventually surfaces an
* honest error instead of an infinite spinner
*
* Plus pure-function coverage for the poll-envelope detector.
*
* Root cause: workspace-server a2a_proxy.go:402 poll-mode
* short-circuit returns {status:"queued"} synchronously.
*/
import {
describe,
it,
expect,
vi,
beforeEach,
afterEach,
type Mock,
} from "vitest";
import { act, renderHook, cleanup } from "@testing-library/react";
const { mockApiPost } = vi.hoisted(() => ({ mockApiPost: vi.fn() }));
vi.mock("@/lib/api", () => ({
api: { post: mockApiPost },
}));
vi.mock("../uploads", () => ({
uploadChatFiles: vi.fn(),
}));
// Import AFTER mocks.
import {
useChatSend,
isPollQueuedResponse,
extractReplyText,
POLL_QUEUED_REPLY_TIMEOUT_MS,
} from "../useChatSend";
const flush = () => act(async () => { await Promise.resolve(); });
describe("isPollQueuedResponse", () => {
it("is true only for the synthetic poll-mode queued envelope", () => {
expect(isPollQueuedResponse({ status: "queued", delivery_mode: "poll" })).toBe(true);
});
it("is false for a real agent reply", () => {
expect(
isPollQueuedResponse({ result: { parts: [{ kind: "text", text: "hi" }] } }),
).toBe(false);
});
it("is false for null / undefined / partial shapes", () => {
expect(isPollQueuedResponse(null)).toBe(false);
expect(isPollQueuedResponse(undefined)).toBe(false);
// status=queued without delivery_mode=poll is NOT the poll envelope
// — don't accidentally swallow a real reply that happens to carry
// an unrelated status field.
expect(isPollQueuedResponse({ status: "queued" })).toBe(false);
expect(isPollQueuedResponse({ delivery_mode: "poll" })).toBe(false);
});
});
describe("extractReplyText (regression guard — unchanged by fix)", () => {
it("collects text parts from result", () => {
expect(
extractReplyText({ result: { parts: [{ kind: "text", text: "hello" }] } }),
).toBe("hello");
});
it("returns empty for the poll-queued envelope", () => {
expect(extractReplyText({ status: "queued", delivery_mode: "poll" })).toBe("");
});
});
describe("useChatSend — poll-mode in-progress state", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiPost.mockReset();
});
afterEach(() => {
vi.runOnlyPendingTimers();
vi.useRealTimers();
cleanup();
});
const setup = () => {
const onUserMessage = vi.fn();
const onAgentMessage = vi.fn();
const { result } = renderHook(() =>
useChatSend("ws-ext-1", {
getHistoryMessages: () => [],
onUserMessage,
onAgentMessage,
}),
);
return { result, onUserMessage, onAgentMessage };
};
it("a real reply clears `sending` (regression guard)", async () => {
mockApiPost.mockResolvedValue({
result: { parts: [{ kind: "text", text: "real reply" }] },
});
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(onAgentMessage).toHaveBeenCalledTimes(1);
expect(result.current.sending).toBe(false);
});
it("keeps `sending` true on a poll 'queued' envelope (no terminal clear)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result, onAgentMessage } = setup();
await act(async () => {
void result.current.sendMessage("hi external agent");
});
await flush();
// The POST resolved, but it was only a queued ack — the indicator
// must stay up and no agent bubble should be rendered yet.
expect(result.current.sending).toBe(true);
expect(onAgentMessage).not.toHaveBeenCalled();
expect(result.current.error).toBeNull();
});
it("releaseSendGuards (the AGENT_MESSAGE-push path) clears the poll in-progress state", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
// Simulate the terminal AGENT_MESSAGE WebSocket push arriving:
// useChatSocket's onAgentMessage / onSendComplete call
// releaseSendGuards. That must clear the in-progress state AND the
// safety timer (the latter is asserted by the last test in this suite).
act(() => {
result.current.releaseSendGuards();
});
expect(result.current.sending).toBe(false);
});
it("surfaces an honest error if a poll agent never replies (safety timeout)", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
expect(result.current.sending).toBe(true);
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.sending).toBe(false);
expect(result.current.error).toMatch(/queued/i);
});
it("does NOT fire the safety error when the reply arrives before timeout", async () => {
mockApiPost.mockResolvedValue({ status: "queued", delivery_mode: "poll" });
const { result } = setup();
await act(async () => {
void result.current.sendMessage("hi");
});
await flush();
// Reply arrives (releaseSendGuards) well before the timeout.
act(() => {
result.current.releaseSendGuards();
});
act(() => {
vi.advanceTimersByTime(POLL_QUEUED_REPLY_TIMEOUT_MS + 1000);
});
expect(result.current.error).toBeNull();
expect(result.current.sending).toBe(false);
});
});
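The contract this suite pins can be sketched outside React as a small state holder. This is an illustration under stated assumptions, not the hook's implementation: `SendGuardSketch`, `isPollQueued`, and `A2AResponseSketch` are hypothetical names, and the error text is a shortened stand-in for the real message.

```typescript
// Hypothetical response shape, reduced to the fields the detector reads.
interface A2AResponseSketch {
  status?: string;
  delivery_mode?: string;
  result?: unknown;
}

// True only when BOTH markers of the synthetic poll-mode envelope are present.
function isPollQueued(resp: A2AResponseSketch | null | undefined): boolean {
  return !!resp && resp.status === "queued" && resp.delivery_mode === "poll";
}

class SendGuardSketch {
  sending = false;
  error: string | null = null;
  private timer: ReturnType<typeof setTimeout> | null = null;
  private timeoutMs: number;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  begin(): void {
    this.sending = true;
    this.error = null;
  }

  // Called when the POST resolves.
  handlePostResult(resp: A2AResponseSketch | null): void {
    if (isPollQueued(resp)) {
      // Queued ack: keep `sending` true and arm the safety timer.
      this.clearTimer();
      this.timer = setTimeout(() => {
        this.timer = null;
        this.sending = false;
        this.error = "No response yet from this agent; the message is queued.";
      }, this.timeoutMs);
      return;
    }
    // Real reply: terminal, so release immediately.
    this.release();
  }

  // Called by the reply path (in the source, the AGENT_MESSAGE push).
  release(): void {
    this.clearTimer();
    this.sending = false;
  }

  private clearTimer(): void {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
  }
}
```

Clearing the timer inside `release()` is what keeps a late timeout from overwriting a successful reply with an error, which is the case the final test covers.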
@@ -1,6 +1,6 @@
"use client";
import { useCallback, useEffect, useRef, useState } from "react";
import { useCallback, useRef, useState } from "react";
import { api } from "@/lib/api";
import { uploadChatFiles } from "../uploads";
import { createMessage, type ChatMessage, type ChatAttachment } from "../types";
@@ -22,42 +22,8 @@ interface A2AResponse {
parts?: A2APart[];
artifacts?: Array<{ parts: A2APart[] }>;
};
/** Synthetic poll-mode envelope. The platform returns this
* immediately (HTTP 200) when the target workspace is registered
* delivery_mode=poll — an external / MCP-registered agent with no
* public URL (e.g. an operator's laptop running the molecule MCP
* channel). The request has only been QUEUED into activity_logs;
* the agent will pick it up on its next poll and the real reply
* arrives asynchronously over the AGENT_MESSAGE WebSocket push
* (consumed by useChatSocket). See workspace-server
* a2a_proxy.go:402 (poll-mode short-circuit) and
* a2a_proxy_helpers.go:516 (logA2AReceiveQueued). */
status?: string;
delivery_mode?: string;
}
/** True when `resp` is the platform's synthetic poll-mode "queued"
* envelope rather than a real agent reply. For these the send is
* acknowledged-but-pending: the user's message landed and the agent
* is working, but there is no reply yet — the terminal AGENT_MESSAGE
* push will arrive later over the WebSocket. Treating this as a
* terminal response (the pre-fix behaviour) cleared the "agent is
* working" indicator the instant the POST returned, so an external
* workspace turn looked dead even though the agent had not yet picked up the queued message. */
export function isPollQueuedResponse(resp: A2AResponse | null | undefined): boolean {
return !!resp && resp.status === "queued" && resp.delivery_mode === "poll";
}
/** Hard ceiling on how long the "agent is working" indicator stays up
* for a poll-mode turn with no reply. The terminal AGENT_MESSAGE push
* normally clears it well before this. The cap exists so a poll-mode
* workspace that is offline / never consumes its queue doesn't pin a
* spinner forever — at which point we surface an honest, actionable
* error instead of an opaque dead spinner. Generous because poll
* agents (an operator laptop) can legitimately take minutes to wake,
* poll, and respond; the goal is "eventually honest", not fail-fast. */
export const POLL_QUEUED_REPLY_TIMEOUT_MS = 15 * 60 * 1000;
export function extractReplyText(resp: A2AResponse): string {
const collect = (parts: A2APart[] | undefined): string => {
if (!parts) return "";
@@ -93,29 +59,14 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
const sendInFlightRef = useRef(false);
const sendingFromAPIRef = useRef(false);
const sendTokenRef = useRef(0);
// Safety-net timer armed only for poll-mode ("queued") turns: the
// POST returns immediately with no reply, so the normal
// POST-resolves-→-clear-spinner path can't drive the indicator. The
// terminal AGENT_MESSAGE WebSocket push clears it via
// releaseSendGuards (which also clears this timer); the timer is the
// backstop for an offline poll agent that never consumes its queue.
const pollTimeoutRef = useRef<ReturnType<typeof setTimeout> | null>(null);
const optionsRef = useRef(options);
optionsRef.current = options;
const clearPollTimeout = useCallback(() => {
if (pollTimeoutRef.current !== null) {
clearTimeout(pollTimeoutRef.current);
pollTimeoutRef.current = null;
}
}, []);
const releaseSendGuards = useCallback(() => {
clearPollTimeout();
setSending(false);
sendingFromAPIRef.current = false;
sendInFlightRef.current = false;
}, [clearPollTimeout]);
}, []);
const clearError = useCallback(() => setError(null), []);
@@ -195,33 +146,6 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
sendInFlightRef.current = false;
return;
}
// Poll-mode ("queued") turn: the message landed and the
// external/MCP agent will pick it up on its next poll, but
// there is NO reply in this response. Pre-fix this fell
// through to releaseSendGuards() below and the "agent is
// working" indicator vanished the instant the POST returned —
// an external-workspace turn looked dead even though the agent had
// not yet picked up the queued message. Instead, keep `sending` true so the existing
// thinking indicator (the same one internal agents use)
// persists as a "received — agent is working" state; the
// terminal AGENT_MESSAGE WebSocket push (consumed by
// useChatSocket → onAgentMessage / onSendComplete →
// releaseSendGuards) clears it when the real reply arrives,
// exactly the path an internal async reply already uses.
if (isPollQueuedResponse(resp)) {
clearPollTimeout();
pollTimeoutRef.current = setTimeout(() => {
if (sendTokenRef.current !== myToken) return;
if (!sendingFromAPIRef.current) return;
releaseSendGuards();
setError(
"No response yet from this agent — it may be offline or " +
"busy. Your message was delivered and is queued; the " +
"reply will appear here if the agent picks it up.",
);
}, POLL_QUEUED_REPLY_TIMEOUT_MS);
return;
}
const replyText = extractReplyText(resp);
const replyFiles = extractFilesFromTask(
(resp?.result ?? {}) as Record<string, unknown>,
@@ -243,15 +167,9 @@ export function useChatSend(workspaceId: string, options: UseChatSendOptions) {
setError("Failed to send message — agent may be unreachable");
});
},
[workspaceId, sending, uploading, clearPollTimeout],
[workspaceId, sending, uploading],
);
// Drop the poll-mode safety timer on unmount / workspace switch so a
// stale timeout can't fire setError against a panel the user has
// already navigated away from. sendTokenRef guards correctness if it
// ever did fire; this just avoids the wasted timer + setState churn.
useEffect(() => clearPollTimeout, [clearPollTimeout]);
return {
sending,
uploading,
@@ -67,21 +67,9 @@ export function useChatSocket(
const own = (targetId || msg.workspace_id) === workspaceId;
if (own) {
callbacksRef.current.onSendComplete?.();
// internal#211/#212: surface the runtime's curated,
// user-actionable reason (provider HTTP status + error
// code + the provider's own guidance, e.g. a 403 "org
// disabled · use an API key / ask your admin"). The
// server now includes error_detail in the ACTIVITY_LOGGED
// broadcast; fall back to summary, and only as a last
// resort to a generic line. The old hardcoded
// "Agent error (Exception) — see workspace logs for
// details." string pointed at a logs UI that does not
// exist and discarded the actionable reason entirely.
const detail =
(p.error_detail as string) ||
(p.summary as string) ||
"The agent turn failed but the runtime reported no detail. Retry once; if it repeats the workspace runtime may need a restart.";
callbacksRef.current.onSendError?.(detail);
callbacksRef.current.onSendError?.(
"Agent error (Exception) — see workspace logs for details.",
);
}
}
} else if (type === "a2a_send") {
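The fallback order described in this hunk (`error_detail` first, then `summary`, then a generic line) can be sketched as a standalone function. This is a hedged illustration: `pickErrorDetail` and `GENERIC_FAILURE` are hypothetical names, the generic text is abbreviated, and unlike the `as string` casts in the diff this version checks the runtime type, so a non-string field is skipped rather than passed through.

```typescript
// Abbreviated stand-in for the generic last-resort message.
const GENERIC_FAILURE =
  "The agent turn failed but the runtime reported no detail.";

function pickErrorDetail(payload: Record<string, unknown>): string {
  // Treat anything that is not a non-empty string as absent.
  const asText = (value: unknown): string =>
    typeof value === "string" ? value.trim() : "";
  return (
    asText(payload.error_detail) || asText(payload.summary) || GENERIC_FAILURE
  );
}
```

The type check is a design choice of the sketch: it avoids rendering a value such as a number or object in the chat bubble if the payload shape ever drifts.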
+1 -1
@@ -58,11 +58,11 @@ TOP_LEVEL_MODULES = {
"a2a_response",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_identity",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"a2a_tools_identity",
"adapter_base",
"agent",
"agents_md",
@@ -691,19 +691,6 @@ func logActivityExec(ctx context.Context, exec activityExecutor, broadcaster eve
if respStr != nil {
payload["response_body"] = json.RawMessage(respJSON)
}
// internal#211/#212: error_detail carries the runtime's curated,
// user-actionable, secret-safe failure reason (provider HTTP
// status + error code + the provider's own guidance, e.g. a 403
// "org disabled · use an API key / ask your admin"). It is
// already persisted to the DB column above and capped by the
// runtime's report_activity helper (4096 chars). Previously it
// was dropped from the LIVE broadcast, so the canvas had nothing
// to render and fell back to a hardcoded opaque
// "Agent error (Exception) — see workspace logs" string. Include
// it so the chat bubble shows the real reason in real time.
if params.ErrorDetail != nil && *params.ErrorDetail != "" {
payload["error_detail"] = *params.ErrorDetail
}
}
return func() {
@@ -107,29 +107,10 @@ func (h *ChatFilesHandler) WithPendingUploads(storage pendinguploads.Storage, br
}
// chatUploadMaxBytes caps the full multipart request body so a
// malicious / runaway client can't OOM the proxy hop. 100 MB matches
// the workspace-side total limit; anything larger is rejected at the
// malicious / runaway client can't OOM the proxy hop. 50 MB matches
// the workspace-side limit; anything larger is rejected at the
// network boundary before forwarding.
//
// SSOT NOTE (issue #1520): this constant is the source of truth for
// chat upload limits across the platform. Its value is exported to
// the workspace container at provision time via the env var
// CHAT_UPLOAD_MAX_TOTAL_BYTES (see
// workspace_provision_shared.go::applyChatUploadLimits) so the
// Python runtime cap stays in lock-step. Do NOT change this without
// updating the per-file cap chatUploadMaxFileBytes below and
// verifying the env-injection site is unchanged.
const chatUploadMaxBytes = 100 * 1024 * 1024
// chatUploadMaxFileBytes caps any single multipart part. Mirrors the
// total cap by default because most chat uploads are a single file;
// keeping per-file equal to total avoids the surprise of "my 60 MB
// file fit under the total but got 413'd on per-file". Exported to
// the workspace container as CHAT_UPLOAD_MAX_FILE_BYTES so the
// Starlette parser's max_part_size matches and any single part above
// Starlette's default 1 MiB no longer raises MultiPartException
// (root cause of issue #1520).
const chatUploadMaxFileBytes = 100 * 1024 * 1024
const chatUploadMaxBytes = 50 * 1024 * 1024
// resolveWorkspaceForwardCreds resolves the workspace's URL +
// platform_inbound_secret for an /internal/* forward, applying
@@ -1,63 +0,0 @@
package handlers
// chat_upload_limits_test.go — pins the SSOT env-injection contract
// for chat-upload caps (issue #1520). The Python workspace runtime
// reads these env vars at module init; drift between the constant in
// chat_files.go and the env-var name here silently breaks chat upload
// fleet-wide, so the contract is asserted as a unit test in the same
// package as the producer.
import (
"fmt"
"testing"
)
// applyChatUploadLimits MUST seed both env vars to the byte-count
// stringification of the Go-side constants. Anything else means a
// Python-side parser cap that disagrees with the Go-side network cap,
// which is exactly the drift that shipped #1520.
func TestApplyChatUploadLimits_DefaultsMatchGoConstants(t *testing.T) {
env := map[string]string{}
applyChatUploadLimits(env)
wantFile := fmt.Sprintf("%d", chatUploadMaxFileBytes)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != wantFile {
t.Errorf("CHAT_UPLOAD_MAX_FILE_BYTES = %q, want %q", got, wantFile)
}
wantTotal := fmt.Sprintf("%d", chatUploadMaxBytes)
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != wantTotal {
t.Errorf("CHAT_UPLOAD_MAX_TOTAL_BYTES = %q, want %q", got, wantTotal)
}
}
// Pre-existing values win. A tenant override, plugin mutator, or A/B
// experiment that already set the env MUST be preserved — the SSOT
// helper is a defaulting layer, not an override layer.
func TestApplyChatUploadLimits_PreExistingValuesPreserved(t *testing.T) {
env := map[string]string{
"CHAT_UPLOAD_MAX_FILE_BYTES": "1234",
"CHAT_UPLOAD_MAX_TOTAL_BYTES": "5678",
}
applyChatUploadLimits(env)
if got := env["CHAT_UPLOAD_MAX_FILE_BYTES"]; got != "1234" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_FILE_BYTES overwritten: got %q", got)
}
if got := env["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; got != "5678" {
t.Errorf("pre-existing CHAT_UPLOAD_MAX_TOTAL_BYTES overwritten: got %q", got)
}
}
// The 100 MB minimum is the CTO-directed allowance floor (issue #1520).
// Pin so a future "tidy up: 100 MB seems large" refactor surfaces here
// before reverting the user-visible behaviour change.
func TestChatUploadCaps_MinimumAllowanceFloor(t *testing.T) {
const floor = 100 * 1024 * 1024
if chatUploadMaxBytes < floor {
t.Errorf("chatUploadMaxBytes = %d, below #1520 floor %d", chatUploadMaxBytes, floor)
}
if chatUploadMaxFileBytes < floor {
t.Errorf("chatUploadMaxFileBytes = %d, below #1520 floor %d", chatUploadMaxFileBytes, floor)
}
}
@@ -1,53 +0,0 @@
package handlers
// plugins_install_test.go — additional coverage for plugins_install.go.
//
// Gaps filled vs. existing test files:
// - plugins_install_external_test.go: Install + Uninstall 422 (external runtime) ✓ covered
// - plugins_test.go: Install 400 (missing source, invalid body, etc.) ✓ covered
// Uninstall 400 (invalid plugin name, empty name) ✓ covered
// Download auth gate ✓ covered
// - org_import_helpers_test.go: countWorkspaces, envRequirementKey, sanitizeEnvMembers,
// flattenAndSortRequirements, collectOrgEnv ✓ covered
//
// New test added here:
// - Uninstall 503: container not running, no SaaS dispatch.
//
// NOTE: validateWorkspaceID is not called inside the Install/Uninstall handlers.
// UUID validation is the responsibility of the WorkspaceAuth middleware, so no
// 400 test is needed here for UUID format.
import (
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/gin-gonic/gin"
"github.com/stretchr/testify/require"
)
// TestPluginUninstall_ContainerNotRunning_Returns503 exercises the 503 path
// where neither a local Docker container nor a SaaS instance-id dispatch
// resolves. The handler must return "workspace container not running" — NOT a
// generic 500 or a misleading 422 (external-runtime) message.
func TestPluginUninstall_ContainerNotRunning_Returns503(t *testing.T) {
// No docker client + no instance-id lookup → falls through to 503.
h := NewPluginsHandler(t.TempDir(), nil, nil)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{
{Key: "id", Value: "550e8400-e29b-41d4-a716-446655440000"},
{Key: "name", Value: "some-plugin"},
}
c.Request = httptest.NewRequest("DELETE",
"/workspaces/550e8400-e29b-41d4-a716-446655440000/plugins/some-plugin", nil)
h.Uninstall(c)
require.Equal(t, http.StatusServiceUnavailable, w.Code)
var body map[string]string
require.NoError(t, json.Unmarshal(w.Body.Bytes(), &body))
require.Equal(t, "workspace container not running", body["error"])
}
@@ -198,6 +198,17 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
// back to its compiled-in Anthropic default and 401s when the user's
// key is for a different provider. Non-hermes runtimes are unaffected
// (the server still passes model through, they just don't use it).
// runtimeExplicitlyRequested is true when the caller expressed intent for
// a SPECIFIC runtime — either by passing `runtime` directly, or by naming
// a `template` (a template encodes a runtime). When true, we must NOT
// silently fall back to langgraph if that intent can't be honored: that
// is the molecule-controlplane#188 / #184 contract violation (caller asks
// for codex/claude-code, gets a langgraph workspace, 201, no error — a
// false success). #188 mandates fail-closed (error+notify) on mismatch,
// not an advisory degrade. The legitimate "no template, no runtime →
// langgraph default" path (bare {"name":...}) is unaffected.
runtimeExplicitlyRequested := payload.Runtime != "" || payload.Template != ""
templateRuntimeResolved := payload.Runtime != ""
if payload.Template != "" && (payload.Runtime == "" || payload.Model == "") {
// #226: payload.Template is attacker-controllable. resolveInsideRoot
// rejects absolute paths and any ".." that escapes configsDir so the
@@ -230,6 +241,9 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
switch {
case payload.Runtime == "" && !indented && strings.HasPrefix(stripped, "runtime:") && !strings.HasPrefix(stripped, "runtime_config"):
payload.Runtime = strings.TrimSpace(strings.TrimPrefix(stripped, "runtime:"))
if payload.Runtime != "" {
templateRuntimeResolved = true
}
case payload.Model == "" && !indented && strings.HasPrefix(stripped, "model:"):
// Legacy top-level `model:` — pre-runtime_config templates.
payload.Model = strings.Trim(strings.TrimSpace(strings.TrimPrefix(stripped, "model:")), `"'`)
@@ -242,7 +256,27 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
}
}
}
// Fail-closed (molecule-controlplane#188 / #184): if the caller expressed
// intent for a specific runtime (passed `runtime`, or named a `template`)
// but we could NOT resolve a concrete runtime from it (template's
// config.yaml unreadable, or it has no `runtime:` key), DO NOT silently
// substitute langgraph and return 201 — that is the silent contract
// violation that produced 5/5 wrong workspaces and a false codex E2E pass.
// Return 422 so the caller learns the requested runtime was not honored.
// The platform-side CP fix (controlplane#188) is the sibling gate; this
// closes the ws-server `Create` boundary the product UI actually hits.
if payload.Runtime == "" && runtimeExplicitlyRequested && !templateRuntimeResolved {
log.Printf("Create: FAIL-CLOSED (controlplane#188) — template=%q requested but runtime could not be resolved; refusing silent langgraph fallback", payload.Template)
c.JSON(http.StatusUnprocessableEntity, gin.H{
"error": "runtime could not be resolved from the requested template; refusing to silently provision langgraph (controlplane#188). Pass an explicit \"runtime\", or use a template whose config.yaml declares one.",
"template": payload.Template,
"code": "RUNTIME_UNRESOLVED",
})
return
}
if payload.Runtime == "" {
// Legitimate default path: no template AND no runtime requested
// (bare {"name":...}) — langgraph is the intended default here.
payload.Runtime = "langgraph"
}
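The fail-closed rule in this hunk reduces to a three-way decision. The sketch below restates it in TypeScript to match the canvas-side code elsewhere in this diff; the real handler is the Go code above, and `decideRuntime`, `RuntimeDecision`, and the `templateRuntime` parameter (standing for whatever `runtime:` value was read from the template's config.yaml, or an empty string if none was) are hypothetical names used only for illustration.

```typescript
type RuntimeDecision =
  | { ok: true; runtime: string }
  | { ok: false; status: 422; code: "RUNTIME_UNRESOLVED" };

function decideRuntime(
  requestedRuntime: string,
  template: string,
  templateRuntime: string,
): RuntimeDecision {
  // An explicit `runtime` wins; otherwise use what the template declared.
  const resolved = requestedRuntime || templateRuntime;
  if (resolved !== "") {
    return { ok: true, runtime: resolved };
  }
  // The caller named a template but no runtime could be resolved from it:
  // refuse instead of silently substituting the default.
  if (template !== "") {
    return { ok: false, status: 422, code: "RUNTIME_UNRESOLVED" };
  }
  // Neither runtime nor template was requested: the default is legitimate.
  return { ok: true, runtime: "langgraph" };
}
```

Only the middle branch changes behaviour relative to a plain "default to langgraph" rule; a bare `{"name": ...}` request still resolves to the default.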
@@ -1,193 +0,0 @@
package handlers
import (
"bytes"
"database/sql"
"encoding/json"
"net/http"
"net/http/httptest"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/gin-gonic/gin"
)
// patchReq builds a gin context for a PATCH request to /workspaces/:id/abilities.
func patchReq(id, body string) (*http.Request, *httptest.ResponseRecorder, *gin.Context) {
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: id}}
c.Request = httptest.NewRequest("PATCH", "/workspaces/"+id+"/abilities", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
return c.Request, w, c
}
func TestPatchAbilities_InvalidWorkspaceID(t *testing.T) {
setupTestDB(t)
// "not-a-uuid" fails validateWorkspaceID
_, w, c := patchReq("not-a-uuid", `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_EmptyBody(t *testing.T) {
setupTestDB(t)
id := "00000000-0000-0000-0000-000000000001"
// Empty JSON object — no ability fields present
_, w, c := patchReq(id, `{}`)
PatchAbilities(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]string
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
}
if resp["error"] != "at least one ability field required" {
t.Errorf("expected 'at least one ability field required', got %v", resp["error"])
}
}
func TestPatchAbilities_WorkspaceNotFound(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000002"
// SELECT EXISTS returns false (workspace does not exist)
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(false))
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusNotFound {
t.Errorf("expected 404, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_SetBroadcastEnabledTrue(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000003"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled = true
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]string
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
}
if resp["status"] != "updated" {
t.Errorf("expected status=updated, got %v", resp["status"])
}
}
func TestPatchAbilities_SetTalkToUserEnabledFalse(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000004"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE talk_to_user_enabled = false
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"talk_to_user_enabled":false}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_BothFields(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000005"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled = false
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnResult(sqlmock.NewResult(0, 1))
// UPDATE talk_to_user_enabled = true
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnResult(sqlmock.NewResult(0, 1))
_, w, c := patchReq(id, `{"broadcast_enabled":false,"talk_to_user_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_BroadcastUpdateFails(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000006"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE fails
mock.ExpectExec(`UPDATE workspaces SET broadcast_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, true).
WillReturnError(sql.ErrConnDone)
_, w, c := patchReq(id, `{"broadcast_enabled":true}`)
PatchAbilities(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
func TestPatchAbilities_TalkToUserUpdateFails(t *testing.T) {
mock := setupTestDB(t)
id := "00000000-0000-0000-0000-000000000007"
// SELECT EXISTS → true
mock.ExpectQuery(`SELECT EXISTS\(SELECT 1 FROM workspaces WHERE id = \$1 AND status != 'removed'\)`).
WithArgs(id).
WillReturnRows(sqlmock.NewRows([]string{"exists"}).AddRow(true))
// UPDATE broadcast_enabled skipped (not in payload)
// UPDATE talk_to_user_enabled fails
mock.ExpectExec(`UPDATE workspaces SET talk_to_user_enabled = \$2, updated_at = now\(\) WHERE id = \$1`).
WithArgs(id, false).
WillReturnError(sql.ErrConnDone)
_, w, c := patchReq(id, `{"talk_to_user_enabled":false}`)
PatchAbilities(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d: %s", w.Code, w.Body.String())
}
}
@@ -34,13 +34,11 @@ import (
// BroadcastHandler is constructed once and shared across requests.
type BroadcastHandler struct {
broadcaster events.EventEmitter
broadcaster *events.Broadcaster
}
// NewBroadcastHandler creates a BroadcastHandler.
// The emitter is any EventEmitter — the concrete *Broadcaster in production,
// or a test double in unit tests.
func NewBroadcastHandler(b events.EventEmitter) *BroadcastHandler {
func NewBroadcastHandler(b *events.Broadcaster) *BroadcastHandler {
return &BroadcastHandler{broadcaster: b}
}
@@ -67,6 +67,7 @@ func TestBroadcast_OrgScopedRecipients(t *testing.T) {
if w.Code != http.StatusOK {
t.Errorf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to unmarshal response: %v", err)
@@ -205,7 +206,7 @@ func TestBroadcast_Disabled(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000003"
senderID := "00000000-0000-0000-0000-000000000001"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Disabled Agent", false))
@@ -236,7 +237,7 @@ func TestBroadcast_EmptyOrg_NoRecipients(t *testing.T) {
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000004" // org root, only workspace in org
senderID := "00000000-0000-0000-0000-000000000001" // org root, only workspace in org
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -296,12 +297,33 @@ func TestBroadcast_InvalidWorkspaceID(t *testing.T) {
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-000000000001"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-000000000001/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_OrgRootLookupFails verifies that if the recursive CTE for
// finding the org root errors, the handler returns 500 instead of proceeding
// with an un-scoped query that would broadcast to all orgs.
func TestBroadcast_OrgRootLookupFails(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000005"
senderID := "00000000-0000-0000-0000-000000000001"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -331,13 +353,16 @@ func TestBroadcast_OrgRootLookupFails(t *testing.T) {
}
}
// TestBroadcast_OrgScoped_SelfBroadcastExcluded verifies that broadcasting
// from a workspace does not send a broadcast_receive to the sender itself
// (the sender logs broadcast_sent, not broadcast_receive).
func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000006"
peerID := "00000000-0000-0000-0000-000000000007"
senderID := "00000000-0000-0000-0000-000000000001"
peerID := "00000000-0000-0000-0000-000000000002"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
@@ -374,145 +399,10 @@ func TestBroadcast_OrgScoped_SelfBroadcastExcluded(t *testing.T) {
}
}
// TestBroadcast_RecipientActivityLogFails_SkipsAndContinues: if one recipient's
// activity_log insert fails, the handler logs the error and continues to the
// next recipient rather than aborting the whole broadcast.
func TestBroadcast_RecipientActivityLogFails_SkipsAndContinues(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-000000000008"
peerA := "00000000-0000-0000-0000-000000000009"
peerB := "00000000-0000-0000-0000-00000000000a"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Resilient Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA).AddRow(peerB))
// Peer A fails — handler logs and continues
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
// Peer B succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerB, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"partial delivery"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Fatalf("expected 200, got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
json.Unmarshal(w.Body.Bytes(), &resp)
// Only peerB was delivered
if int(resp["delivered"].(float64)) != 1 {
t.Errorf("expected delivered=1, got %v", resp["delivered"])
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet expectations: %v", err)
}
}
// TestBroadcast_SenderActivityLogFails_StillReturns200: if the sender's own
// broadcast_sent activity_log insert fails, the handler still returns 200
// so the caller doesn't retry a broadcast that already partially delivered.
func TestBroadcast_SenderActivityLogFails_StillReturns200(t *testing.T) {
mock := setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
senderID := "00000000-0000-0000-0000-00000000000b"
peerA := "00000000-0000-0000-0000-00000000000c"
mock.ExpectQuery(`SELECT name, broadcast_enabled FROM workspaces WHERE id = \$1 AND status != 'removed'`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"name", "broadcast_enabled"}).AddRow("Log-Fail Agent", true))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID).
WillReturnRows(sqlmock.NewRows([]string{"root_id"}).AddRow(senderID))
mock.ExpectQuery(`WITH RECURSIVE org_chain AS`).
WithArgs(senderID, senderID).
WillReturnRows(sqlmock.NewRows([]string{"id"}).AddRow(peerA))
// Peer log succeeds
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(peerA, senderID, sqlmock.AnyArg()).
WillReturnResult(sqlmock.NewResult(0, 1))
// Sender log FAILS
mock.ExpectExec(`INSERT INTO activity_logs`).WithArgs(senderID, sqlmock.AnyArg()).
WillReturnError(context.DeadlineExceeded)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: senderID}}
body := `{"message":"log fail test"}`
c.Request = httptest.NewRequest("POST", "/workspaces/"+senderID+"/broadcast", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusOK {
t.Errorf("expected 200 even on sender log failure, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingMessage(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000d"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000d/broadcast", bytes.NewBufferString("{}"))
c.Request.Header.Set("Content-Type", "application/json")
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
func TestBroadcast_MissingBody(t *testing.T) {
setupTestDB(t)
broadcaster := newTestBroadcaster()
handler := NewBroadcastHandler(broadcaster)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Params = gin.Params{{Key: "id", Value: "00000000-0000-0000-0000-00000000000e"}}
c.Request = httptest.NewRequest("POST", "/workspaces/00000000-0000-0000-0000-00000000000e/broadcast", nil)
// no Content-Type and no body
handler.Broadcast(c)
if w.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d: %s", w.Code, w.Body.String())
}
}
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…".
// TestBroadcast_Truncate tests that messages are truncated with the Unicode ellipsis
// character (U+2026) when len(msg) > max. The truncated output is max runes + "…",
// so truncating a 46-char string at max=20 produces 21 characters (20 runes + "…").
func TestBroadcast_Truncate(t *testing.T) {
cases := []struct {
msg string
@@ -520,18 +410,14 @@ func TestBroadcast_Truncate(t *testing.T) {
expect string
}{
{"short", 120, "short"}, // under max — no truncation
// exactly 120 chars → unchanged
// exactly120chars (15) + 105 ones = 120 chars; at max=120 → unchanged
{"exactly120chars1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111", 120, "exactly120chars111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111…"},
// 21 runes at max=20 → 20 + "…" = 21 chars
// "this is a longer mes" = 20 runes; + "…" = 21 chars
{"this is a longer message that needs truncating", 20, "this is a longer mes…"},
// at-max boundary: 20 chars at max=20 → no truncation
{"exactly twenty chars", 20, "exactly twenty chars"},
// over max: 12 chars at max=10 → 10 + "…" = 11
{"hello world!", 10, "hello worl…"},
// Unicode: 3-rune string at max=3 → unchanged
{"日本語", 3, "日本語"},
// Empty string → unchanged
{"", 120, ""},
}
for _, tc := range cases {
result := broadcastTruncate(tc.msg, tc.max)
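The truncation contract these table cases pin can be restated as a small sketch. It is written in Python purely for illustration; `truncate_runes` is a hypothetical name, and the real `broadcastTruncate` body is not shown in this diff, so this only mirrors the behaviour the comments describe (count in runes, keep `max` runes, append U+2026).

```python
ELLIPSIS = "\u2026"  # Unicode horizontal ellipsis, a single code point


def truncate_runes(msg: str, max_runes: int) -> str:
    """Return msg unchanged if it fits, else its first max_runes code points plus an ellipsis."""
    # Python's len() on str counts code points, which plays the role of Go's rune count here.
    if len(msg) <= max_runes:
        return msg
    return msg[:max_runes] + ELLIPSIS


if __name__ == "__main__":
    print(truncate_runes("this is a longer message that needs truncating", 20))
```

Under this reading a truncated result is always `max + 1` code points long, which is why the at-max boundary case stays unchanged while the over-max case gains exactly one character beyond the cap.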
@@ -37,7 +37,6 @@ package handlers
import (
"context"
"errors"
"fmt"
"log"
"path/filepath"
@@ -133,10 +132,6 @@ func (h *WorkspaceHandler) prepareProvisionContext(
// a workspace_secret named GIT_AUTHOR_NAME can override.
applyAgentGitIdentity(envVars, payload.Name)
applyRuntimeModelEnv(envVars, payload.Runtime, payload.Model)
// SSOT for chat-upload limits — see chat_files.go::chatUploadMaxBytes.
// Injecting via env keeps the Python workspace runtime caps in
// lock-step with the Go cap on every provision. Fixes #1520.
applyChatUploadLimits(envVars)
if payload.Role != "" {
envVars["MOLECULE_AGENT_ROLE"] = payload.Role
}
@@ -228,28 +223,3 @@ func (h *WorkspaceHandler) markProvisionFailed(ctx context.Context, workspaceID,
log.Printf("markProvisionFailed: db update failed for %s: %v", workspaceID, dbErr)
}
}
// applyChatUploadLimits seeds the chat-upload cap env vars on the
// workspace container so the Python /internal/chat/uploads/ingest
// handler parses the multipart form with the same per-file allowance
// that the Go proxy enforces.
//
// Why env-driven (and not, say, a hard-coded Python constant): keeping
// one Go constant as the source of truth and forwarding it lets
// operations bump the cap by editing one file + redeploy, instead of
// editing two files in two languages and risking the drift that
// shipped #1520 (Go cap 50 MB, Python parser cap 1 MiB — Starlette
// default — so a 5 MB image always 400'd on parse before per-file
// enforcement could fire).
//
// Pre-existing env wins. If something downstream (a tenant override,
// a plugin mutator, an A/B experiment) has already set either var,
// we leave it alone. Default-only injection.
func applyChatUploadLimits(envVars map[string]string) {
if _, set := envVars["CHAT_UPLOAD_MAX_FILE_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_FILE_BYTES"] = fmt.Sprintf("%d", chatUploadMaxFileBytes)
}
if _, set := envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"]; !set {
envVars["CHAT_UPLOAD_MAX_TOTAL_BYTES"] = fmt.Sprintf("%d", chatUploadMaxBytes)
}
}
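The "pre-existing env wins" rule described in the comment amounts to a set-if-absent on the env map. A minimal Python sketch of that idea follows; the function name and the two cap constants are assumptions for illustration (the values are taken from the 50 MB / 25 MB figures quoted in the Python docstring in this diff, not from the Go constants, which are not shown).

```python
# Placeholder caps, assumed from the 50 MB total / 25 MB per-file figures in this diff.
FILE_CAP_BYTES = 25 * 1024 * 1024
TOTAL_CAP_BYTES = 50 * 1024 * 1024


def apply_chat_upload_limits(env_vars: dict) -> None:
    """Seed the upload-cap variables only when the caller has not set them already."""
    # dict.setdefault writes the value only if the key is absent, so a tenant
    # override or plugin-provided value placed earlier is left untouched.
    env_vars.setdefault("CHAT_UPLOAD_MAX_FILE_BYTES", str(FILE_CAP_BYTES))
    env_vars.setdefault("CHAT_UPLOAD_MAX_TOTAL_BYTES", str(TOTAL_CAP_BYTES))


if __name__ == "__main__":
    env = {"CHAT_UPLOAD_MAX_FILE_BYTES": "1048576"}  # pre-existing override
    apply_chat_upload_limits(env)
    print(env)
```

As in the Go version, the check is on key presence rather than on the value, so even an empty-string override counts as "set" and is preserved.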
@@ -718,7 +718,7 @@ func TestWorkspaceList_Empty(t *testing.T) {
"parent_id", "active_tasks", "last_error_rate", "last_sample_error",
"uptime_seconds", "current_task", "runtime", "workspace_dir", "x", "y", "collapsed",
"budget_limit", "monthly_spend",
"broadcast_enabled", "talk_to_user_enabled",
"broadcast_enabled", "talk_to_user_enabled",
}))
w := httptest.NewRecorder()
@@ -1770,3 +1770,147 @@ runtime_config:
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
}
}
// ==================== #188 fail-closed: template/runtime contract ====================
//
// molecule-controlplane#188 / #184: if a caller names a `template` (intent
// for a specific runtime) but the runtime cannot be resolved from it, the
// server MUST NOT silently provision langgraph and return 201 — that false
// success produced 5/5 wrong workspaces and a bogus codex E2E pass. These
// tests pin the fail-closed boundary at the ws-server `Create` handler (the
// path the product UI hits), and guard the legitimate default path against
// regression.
// Template requested but its dir/config.yaml is absent → 422, not silent
// langgraph 201.
func TestWorkspaceCreate_188_TemplateMissingRuntime_FailsClosed(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
// configsDir is an empty temp dir → resolveInsideRoot succeeds (the path
// is inside root) but config.yaml read fails → runtime cannot be resolved.
configsDir := t.TempDir()
if err := os.MkdirAll(filepath.Join(configsDir, "ghost-template"), 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", configsDir)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Ghost","template":"ghost-template"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("expected 422 (fail-closed, controlplane#188), got %d: %s", w.Code, w.Body.String())
}
var resp map[string]interface{}
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("parse: %v", err)
}
if resp["code"] != "RUNTIME_UNRESOLVED" {
t.Errorf("expected code RUNTIME_UNRESOLVED, got %v", resp["code"])
}
}
// Template config.yaml has no `runtime:` key → 422, not silent langgraph.
func TestWorkspaceCreate_188_TemplateConfigNoRuntimeKey_FailsClosed(t *testing.T) {
setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
configsDir := t.TempDir()
tdir := filepath.Join(configsDir, "noruntime-template")
if err := os.MkdirAll(tdir, 0o755); err != nil {
t.Fatalf("mkdir: %v", err)
}
// config.yaml exists but declares no runtime.
if err := os.WriteFile(filepath.Join(tdir, "config.yaml"), []byte("name: noruntime\n"), 0o644); err != nil {
t.Fatalf("write: %v", err)
}
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", configsDir)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"NoRuntime","template":"noruntime-template"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusUnprocessableEntity {
t.Fatalf("expected 422 (fail-closed), got %d: %s", w.Code, w.Body.String())
}
}
// Regression guard: the legitimate default path (no template, no runtime —
// bare {"name":...}) MUST still default to langgraph and return 201. The
// #188 fix must not break this.
func TestWorkspaceCreate_188_NoTemplateNoRuntime_StillDefaultsLanggraph(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WithArgs(sqlmock.AnyArg(), "Plain Default", nil, 3, "langgraph", sqlmock.AnyArg(), (*string)(nil), nil, "none", (*int64)(nil), models.DefaultMaxConcurrentTasks, "push").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WithArgs(sqlmock.AnyArg(), float64(0), float64(0)).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Plain Default"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected 201 (legitimate default path), got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// Explicit runtime, no template → honored, 201 (no template resolution
// needed; runtimeExplicitlyRequested true but already resolved).
func TestWorkspaceCreate_188_ExplicitRuntimeNoTemplate_OK(t *testing.T) {
mock := setupTestDB(t)
setupTestRedis(t)
broadcaster := newTestBroadcaster()
handler := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
mock.ExpectBegin()
mock.ExpectExec("INSERT INTO workspaces").
WithArgs(sqlmock.AnyArg(), "Explicit Codex", nil, 3, "codex", sqlmock.AnyArg(), (*string)(nil), nil, "none", (*int64)(nil), models.DefaultMaxConcurrentTasks, "push").
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectCommit()
mock.ExpectExec("INSERT INTO canvas_layouts").
WithArgs(sqlmock.AnyArg(), float64(0), float64(0)).
WillReturnResult(sqlmock.NewResult(0, 1))
mock.ExpectExec("INSERT INTO structure_events").
WillReturnResult(sqlmock.NewResult(0, 1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
body := `{"name":"Explicit Codex","runtime":"codex"}`
c.Request = httptest.NewRequest("POST", "/workspaces", bytes.NewBufferString(body))
c.Request.Header.Set("Content-Type", "application/json")
handler.Create(c)
if w.Code != http.StatusCreated {
t.Fatalf("expected 201, got %d: %s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -81,11 +81,11 @@ func TestPositiveMatches(t *testing.T) {
fixture string
expectedName string
}{
{"ghp_EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"ghp_" + "EXAMPLE111122223333444455556666777788889999", "github-pat-classic"},
{"ghs_" + "EXAMPLE111122223333444455556666777788889999", "github-app-installation-token"},
{"gho_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-user-to-server"},
{"ghu_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-user"},
{"ghr_" + "EXAMPLE111122223333444455556666777788889999", "github-oauth-refresh"},
{"github_pat_EXAMPLE" + strings.Repeat("1", 80), "github-pat-fine-grained"},
{"sk-ant-EXAMPLE" + strings.Repeat("1", 40), "anthropic-api-key"},
{"sk-proj-EXAMPLE" + strings.Repeat("1", 40), "openai-project-key"},
@@ -156,7 +156,7 @@ func TestNegativeShapes(t *testing.T) {
// makes ScanString do its own thing (e.g. accidentally normalise
// case) would diverge silently.
func TestScanString_NoOp(t *testing.T) {
in := "ghp_EXAMPLE111122223333444455556666777788889999"
in := "ghp_" + "EXAMPLE111122223333444455556666777788889999"
m1, err1 := ScanBytes([]byte(in))
if err1 != nil {
t.Fatalf("ScanBytes errored: %v", err1)
+6
@@ -172,6 +172,12 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
arguments.get("message", ""),
workspace_id=arguments.get("workspace_id") or None,
)
elif name == "get_runtime_identity":
return await tool_get_runtime_identity()
elif name == "update_agent_card":
return await tool_update_agent_card(
arguments.get("card"),
)
return f"Unknown tool: {name}"
-42
@@ -599,28 +599,6 @@ def _sanitize_for_external(msg: str) -> str:
import re as _re
msg = _re.sub(r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}", "[REDACTED]", msg)
# Bare provider key with NO separator after the prefix — a real
# `sk-ant-api03-…` / `sk-…` key uses `-` (not `[ :=]`) so the rule
# above misses it. Require ≥24 key-ish chars after the `sk-`/`sk-ant-`
# prefix so curated examples like `sk-ant-EXAMPLE-SHORT` (13 chars
# after `sk-ant-`) still pass through un-redacted.
msg = _re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
# JSON-quoted credential values: {"token": "…"} / {"apiKey": "…"} /
# {"secret": "…"} / {"password": "…"}. Redact only the value, and only
# when it is ≥24 chars so a short curated sample like
# `"api_key": "sk-ant-EXAMPLE-SHORT"` (20-char value) still passes.
msg = _re.sub(
r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
r"\1[REDACTED]\2",
msg,
)
# AWS secret access key in `aws_secret_access_key=…` form (env dumps,
# boto tracebacks). The base64-ish value runs until whitespace/quote.
msg = _re.sub(
r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+",
r"\1[REDACTED]",
msg,
)
# Absolute paths: /etc/shadow, /home/user/.aws/credentials, etc.
msg = _re.sub(r"(?:/[^/\s]+){2,}", lambda m: m.group(0) if len(m.group(0)) < 60 else "[REDACTED_PATH]", msg)
return msg
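To make the credential rules in this hunk easier to reason about, here is a standalone sketch that applies the four credential regexes exactly as they appear above. The function name `scrub` is hypothetical, the absolute-path rule is left out for brevity, and the sample strings are invented placeholders, not real keys.

```python
import re


def scrub(msg: str) -> str:
    """Apply the four credential-redaction rules from the hunk, in the same order."""
    # 1. Separator-delimited credentials: "Bearer <tok>", "token=<tok>", "api_key: <tok>".
    msg = re.sub(
        r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}",
        "[REDACTED]",
        msg,
    )
    # 2. Bare provider key with no separator after the prefix (24+ key-ish chars).
    msg = re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
    # 3. JSON-quoted credential values: only the value is replaced, and only if 24+ chars.
    msg = re.sub(
        r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
        r"\1[REDACTED]\2",
        msg,
    )
    # 4. aws_secret_access_key=<value>: the value runs to the next whitespace.
    msg = re.sub(r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+", r"\1[REDACTED]", msg)
    return msg


if __name__ == "__main__":
    print(scrub("auth failed, invalid key sk-" + "a" * 30 + " please re-auth"))
    print(scrub('config dump {"token": "' + "x" * 30 + '"}'))
```

Ordering matters only loosely here: rule 1 cannot match a bare `sk-...` key because it requires a space, colon, or equals sign after the prefix, which is the gap rule 2 is meant to close. Short curated samples such as `sk-ant-EXAMPLE-SHORT` fall under the 24-character threshold and pass through untouched.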
@@ -630,7 +608,6 @@ def sanitize_agent_error(
exc: BaseException | None = None,
category: str | None = None,
stderr: str | None = None,
reason: str | None = None,
) -> str:
"""Render an agent-side failure into a user-safe error message.
@@ -638,18 +615,6 @@ def sanitize_agent_error(
category string (e.g. from `classify_subprocess_error`). If both are
given, `category` wins. If neither, the tag defaults to "unknown".
When ``reason`` is provided (internal#211/#212), it is a *pre-curated,
user-actionable, secret-safe* explanation built by the caller from a
provider-side failure — e.g. a 403 "Your organization has disabled
Claude subscription access · Use an Anthropic API key instead, or ask
your admin to enable access" with error code ``oauth_org_not_allowed``.
This text is exactly what the user needs to self-serve, so it is
surfaced VERBATIM as the message instead of being collapsed to the
opaque exception class name. It still passes through the
key/token/bearer/path scrubber as a belt-and-braces second pass so a
buggy caller can't leak a credential that snuck into the reason.
``reason`` wins over ``stderr``; both lose to neither being set.
When ``stderr`` is provided (e.g. the first ~1 KB of a subprocess stderr
or HTTP error body), it is sanitized and appended to the output so the
A2A caller gets actionable context without needing to dig through workspace
@@ -664,13 +629,6 @@ def sanitize_agent_error(
else:
tag = "unknown"
if reason:
# Curated, user-actionable reason — surface it as the message.
# Still scrub: a 403/auth/quota message is safe, but the scrubber
# is cheap insurance against a caller that didn't curate cleanly.
clean = _sanitize_for_external(reason[:_MAX_STDERR_PREVIEW])
return f"Agent error ({tag}): {clean}"
if stderr:
# Truncate and sanitize before including — prevents DoS via
# a malicious or buggy peer injecting a huge error body, and
+9 -66
@@ -26,14 +26,9 @@ Path safety:
a colliding name fails fast (the random prefix already makes
collisions astronomical, but defense-in-depth costs nothing).
Limits (SSOT — matches the Go contract from chat_files.go, injected
via CHAT_UPLOAD_MAX_TOTAL_BYTES / CHAT_UPLOAD_MAX_FILE_BYTES at
provision time; falls back to legacy 50 MB / 25 MB when env unset):
- CHAT_UPLOAD_MAX_TOTAL_BYTES total request body (default 50 MB)
- CHAT_UPLOAD_MAX_FILE_BYTES per file (default 25 MB)
ALSO passed as Starlette ``max_part_size`` to override the
Starlette-1.0 default of 1 MiB which silently 400'd every
upload > 1 MiB before #1520 fix.
Limits (matches the Go contract from chat_files.go):
- 50 MB total request body
- 25 MB per file
- filename truncated to 100 chars
Response shape:
@@ -66,47 +61,14 @@ logger = logging.getLogger(__name__)
# keeps working unchanged.
CHAT_UPLOAD_DIR = "/workspace/.molecule/chat-uploads"
def _env_int(name: str, default: int) -> int:
"""Parse an int from the environment, falling back to ``default``.
Mis-formatted values (anything ``int()`` rejects) fall back to the
default rather than crashing module import — operations needs to be
able to roll back a bad env-var push by simply removing the var,
not by also fixing a worker that won't boot.
"""
raw = os.environ.get(name)
if not raw:
return default
try:
return int(raw)
except (TypeError, ValueError):
logger.warning("internal_chat_uploads: env %s=%r not an int; using default %d", name, raw, default)
return default
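The fallback behaviour described in the docstring can be checked in isolation. The snippet below restates the helper so it runs on its own and exercises it with a made-up variable name (`CHAT_UPLOAD_SKETCH_BYTES` is hypothetical and not part of the real configuration); treat it as an illustration of the parse-or-default pattern rather than the module's exact code.

```python
import logging
import os

logger = logging.getLogger("env_int_sketch")


def env_int(name: str, default: int) -> int:
    """Parse an int from the environment, falling back to default on unset or bad input."""
    raw = os.environ.get(name)
    if not raw:  # unset or empty string
        return default
    try:
        return int(raw)
    except (TypeError, ValueError):
        # A malformed value must not crash import; log it and keep the default.
        logger.warning("env %s=%r not an int; using default %d", name, raw, default)
        return default


if __name__ == "__main__":
    default_cap = 25 * 1024 * 1024
    for value in ("1048576", "25MB", ""):
        os.environ["CHAT_UPLOAD_SKETCH_BYTES"] = value
        print(repr(value), "->", env_int("CHAT_UPLOAD_SKETCH_BYTES", default_cap))
```

Because a malformed value degrades to the default instead of raising, removing the variable and leaving a bad value in place end up in the same state, which is the rollback property the docstring calls out.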
# Total-request body cap. multipart/form-data with multiple parts can
# add ~100 bytes of framing per file; the cap is the bytes hitting the
# socket, including framing.
#
# SSOT (issue #1520): the source of truth is the Go constant
# chatUploadMaxBytes in workspace-server/internal/handlers/chat_files.go,
# exported to the workspace container as CHAT_UPLOAD_MAX_TOTAL_BYTES at
# provision time (workspace_provision_shared.go::applyChatUploadLimits).
# Unset env → keep the previous 50 MB default so an unprovisioned /
# locally-run workspace does NOT regress.
CHAT_UPLOAD_MAX_BYTES = _env_int("CHAT_UPLOAD_MAX_TOTAL_BYTES", 50 * 1024 * 1024)
CHAT_UPLOAD_MAX_BYTES = 50 * 1024 * 1024 # 50 MB
# Per-file cap. SSOT (issue #1520): exported from the Go side as
# CHAT_UPLOAD_MAX_FILE_BYTES; default 25 MB if env is unset so an older
# workspace provisioned before the env-injection landed keeps the
# legacy ceiling.
#
# This value is ALSO passed as Starlette's ``max_part_size`` (see
# ingest_handler below) — Starlette 1.0 defaults max_part_size to
# **1 MiB**, which is the actual root cause of #1520: any single file
# part above 1 MiB raised MultiPartException before per-file enforcement
# could fire. Wiring max_part_size to the same cap as per-file means
# the user-visible ceiling is exactly the per-file cap, no surprises.
CHAT_UPLOAD_MAX_FILE_BYTES = _env_int("CHAT_UPLOAD_MAX_FILE_BYTES", 25 * 1024 * 1024)
# Per-file cap. Keeping per-file under total lets a user attach, say,
# a 5 MB PDF + 10 small screenshots in a single batch.
CHAT_UPLOAD_MAX_FILE_BYTES = 25 * 1024 * 1024 # 25 MB
# Conservative {alnum, dot, underscore, dash} character class — anything
# outside gets rewritten so embedded paths, control chars, newlines,
@@ -184,30 +146,11 @@ async def ingest_handler(request: Request) -> JSONResponse:
status_code=413,
)
# max_part_size: Starlette 1.0 defaults to 1 MiB. Any single
# part above that raises MultiPartException BEFORE per-file
# enforcement can run — which silently broke every chat upload
# > 1 MiB (issue #1520, fleet-wide P0 2026-05-18). Wire it to
# the per-file cap so the user-visible ceiling matches what
# the per-file 413 path expects.
try:
form = await request.form(
max_files=64,
max_fields=32,
max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES,
)
form = await request.form(max_files=64, max_fields=32)
except Exception as exc: # multipart parse error
logger.warning("internal_chat_uploads: multipart parse failed: %s", exc)
# Surface the exception detail (feedback_surface_actionable_failure_reason_to_user):
# MultiPartException strings ("Part exceeded maximum size of …",
# "Invalid boundary", "Too many parts", etc.) contain no secrets
# — they describe shape, not content. The 200-char cap is
# belt-and-braces against an exception class we haven't seen
# whose ``str()`` is unbounded.
return JSONResponse(
{"error": "failed to parse multipart form", "detail": str(exc)[:200]},
status_code=400,
)
return JSONResponse({"error": "failed to parse multipart form"}, status_code=400)
# Starlette's FormData allows multiple values per key — `files` may
# appear multiple times for batched uploads. getlist returns them
-117
@@ -788,123 +788,6 @@ def test_sanitize_agent_error_stderr_combined_with_existing_tests():
assert "workspace logs" in out
# ─── reason passthrough (internal#211/#212: surface actionable provider error) ───
def test_sanitize_agent_error_reason_surfaced_verbatim():
"""A curated provider reason is shown to the user, not collapsed to the
exception class name. This is the internal#211 regression: a 403
org-disabled message must reach the canvas."""
reason = (
"provider HTTP 403 — oauth_org_not_allowed — Your organization has "
"disabled Claude subscription access for Claude Code · Use an "
"Anthropic API key instead, or ask your admin to enable access"
)
class _ResultErr(Exception):
pass
out = sanitize_agent_error(exc=_ResultErr("opaque"), reason=reason)
# The actionable provider guidance and status code must be visible.
assert "403" in out
assert "oauth_org_not_allowed" in out
assert "disabled Claude subscription access" in out
assert "ask your admin to enable access" in out
# NOT the old opaque form.
assert "see workspace logs" not in out
def test_sanitize_agent_error_reason_still_scrubs_secrets():
"""Even on the reason path the key/token scrubber runs — a buggy caller
that lets a bearer token into the reason still gets it redacted."""
leaky = (
"provider HTTP 401 — auth failed — Authorization: Bearer "
"PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm please re-auth"
)
out = sanitize_agent_error(reason=leaky)
assert "[REDACTED]" in out
assert "PLACEHOLDER_LONG_TOKEN_0123456789abcdefghijklm" not in out
# The non-secret guidance still survives the scrub.
assert "401" in out
assert "please re-auth" in out
def test_sanitize_agent_error_reason_scrubs_all_secret_formats():
"""The scrubber must redact every realistic credential shape — not just
    the `Bearer <tok>` form the original test happened to exercise
    (internal#212 review finding: bare `sk-ant-api03-…` keys, JSON-quoted
    "token"/"apiKey" values, and `aws_secret_access_key=` all leaked).
    All curated/actionable guidance must still survive the scrub.
    """
    # 1. Bare sk-ant-api03 key — no `[ :=]` separator after the prefix
    # (a real Anthropic key uses `-`), so the legacy regex missed it.
    bare = (
        "provider HTTP 401 — auth failed — invalid key "
        "sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789 "
        "please re-auth"
    )
    out = sanitize_agent_error(reason=bare)
    assert "sk-FAKEPLACEHOLDERabcdefghijklmnopqrstuvwxy0123456789" not in out
    assert "[REDACTED]" in out
    assert "401" in out  # actionable status survives
    assert "please re-auth" in out  # actionable guidance survives
    # 2. JSON-quoted "token" / "apiKey" values.
    jblob = (
        'provider error — config dump {"token": '
        '"abcDEF0123456789ghIJKL0123456789mnopQRST", "apiKey": '
        '"anon_fakefakefakefakefakefakefakefakefakefake"} — '
        "use an API key instead"
    )
    out = sanitize_agent_error(reason=jblob)
    assert "abcDEF0123456789ghIJKL0123456789mnopQRST" not in out
    assert "anon_fakefakefakefakefakefakefakefakefakefake" not in out
    assert "[REDACTED]" in out
    assert "use an API key instead" in out  # actionable guidance survives
    # 3. aws_secret_access_key=… form.
    awsblob = (
        "provider HTTP 403 — boto credential error "
        "aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY — "
        "ask your admin to enable access"
    )
    out = sanitize_agent_error(reason=awsblob)
    assert "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" not in out
    assert "[REDACTED]" in out
    assert "403" in out  # actionable status survives
    assert "ask your admin to enable access" in out  # guidance survives
    # 4. Regression: the original Bearer form still redacts.
    # Uses PLACEHOLDER_LONG_TOKEN (>=40 chars, no sk-ant- prefix) to avoid
    # triggering the secret-scan workflow pattern
    # `sk-ant-[A-Za-z0-9_-]{40,}`.
    bearer = (
        "provider HTTP 401 — Authorization: Bearer "
        "PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij re-auth"
    )
    out = sanitize_agent_error(reason=bearer)
    assert "PLACEHOLDER_LONG_TOKEN_9876543210abcdefghij" not in out
    assert "[REDACTED]" in out
    assert "re-auth" in out
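The four leak shapes exercised above can be illustrated with a small regex-based scrubber. This is only a sketch under stated assumptions: `scrub_secrets_sketch` and its patterns are hypothetical and are not the real `sanitize_agent_error` implementation, which is not shown in this diff.

```python
import re

REDACTED = "[REDACTED]"

# Hypothetical patterns, one per leak shape covered by the test above.
_PATTERNS = [
    # "Authorization: Bearer <token>" form.
    (re.compile(r"(\bBearer\s+)[A-Za-z0-9._~+/=-]{16,}", re.IGNORECASE),
     r"\1" + REDACTED),
    # JSON-quoted "token" / "apiKey" values.
    (re.compile(r'("(?:token|api_?key)"\s*:\s*")[^"]+(")', re.IGNORECASE),
     r"\1" + REDACTED + r"\2"),
    # key=value credentials such as aws_secret_access_key=...
    (re.compile(r"(\b(?:aws_secret_access_key|api_?key|token)\s*=\s*)\S+",
                re.IGNORECASE),
     r"\1" + REDACTED),
    # Bare provider-style keys: "sk-" followed by a long run with no separator.
    (re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"), REDACTED),
]


def scrub_secrets_sketch(text: str) -> str:
    """Replace credential-shaped substrings, leaving surrounding guidance intact."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

Each pattern replaces only the secret value itself, so status codes and guidance text such as "please re-auth" pass through unchanged.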
def test_sanitize_agent_error_reason_wins_over_stderr():
    """When both reason and stderr are passed, the curated reason wins."""
    out = sanitize_agent_error(
        reason="provider HTTP 403 — use an API key",
        stderr="raw subprocess noise that should not be shown",
    )
    assert "use an API key" in out
    assert "raw subprocess noise" not in out
def test_sanitize_agent_error_no_reason_unchanged():
    """Omitting reason preserves the original generic behavior."""
    out = sanitize_agent_error(exc=ValueError("boom"))
    assert "ValueError" in out
    assert "workspace logs" in out
# ======================================================================
# classify_subprocess_error
@@ -299,122 +299,3 @@ def test_symlink_at_target_is_refused(client: TestClient, chat_uploads_dir: Path
    assert r.status_code == 500, r.text
    # Sentinel content unchanged — the symlink wasn't followed.
    assert sentinel.read_bytes() == b"original"
# ───────────── issue #1520: max_part_size + SSOT env-driven caps ─────────────
def test_part_above_starlette_1mib_default_is_accepted(client: TestClient, chat_uploads_dir: Path):
    """Regression: pre-fix, ANY single multipart part > 1 MiB raised
    MultiPartException because the ingest handler called
    ``request.form()`` without ``max_part_size`` and Starlette 1.0's
    default is 1 MiB (issue #1520, fleet-wide P0 2026-05-18).

    This test sends a 2 MiB part, which is well below the 25 MiB default
    per-file cap but ABOVE the Starlette default, so it pins the fix:
    we now pass ``max_part_size=CHAT_UPLOAD_MAX_FILE_BYTES`` so the
    parser uses the same cap the per-file 413 path expects.
    """
    payload = b"a" * (2 * 1024 * 1024)  # 2 MiB — > Starlette 1 MiB default
    r = client.post(
        "/internal/chat/uploads/ingest",
        files={"files": ("big-but-allowed.bin", payload)},
        headers={"Authorization": "Bearer test-secret"},
    )
    assert r.status_code == 200, r.text
    item = r.json()["files"][0]
    assert item["size"] == len(payload)
def test_parse_error_surfaces_exception_detail(client: TestClient):
    """Per feedback_surface_actionable_failure_reason_to_user: the 400
    body must include a ``detail`` field naming WHICH multipart error
    fired. The MultiPartException strings ("Part exceeded maximum size
    of …", "Invalid boundary", "Too many parts", etc.) describe SHAPE
    not content — no secrets.

    We trigger a real Starlette MultiPartException by submitting a body
    whose Content-Type advertises ``multipart/form-data`` but whose
    body is not a valid multipart envelope — the parser raises before
    any per-file check can fire.
    """
    r = client.post(
        "/internal/chat/uploads/ingest",
        content=b"this is not a valid multipart body",
        headers={
            "Authorization": "Bearer test-secret",
            "Content-Type": "multipart/form-data; boundary=----not-a-real-boundary",
        },
    )
    assert r.status_code == 400, r.text
    body = r.json()
    assert body["error"] == "failed to parse multipart form"
    # Detail must be present + non-empty + bounded.
    assert "detail" in body and isinstance(body["detail"], str)
    assert body["detail"], "detail must not be empty"
    assert len(body["detail"]) <= 200, "detail must be bounded"
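The contract pinned above (a `detail` string that is present, non-empty, and at most 200 characters) could be met by a helper along these lines. This is a hedged sketch: `parse_error_body_sketch` is a hypothetical name, and the real handler's truncation strategy is not shown in this diff.

```python
def parse_error_body_sketch(exc: Exception, limit: int = 200) -> dict:
    """Build a 400 response body with a non-empty, length-bounded detail."""
    # Fall back to the exception class name when the message is empty
    # (assumption: some detail is always better than none).
    detail = str(exc) or exc.__class__.__name__
    if len(detail) > limit:
        # Truncate and mark the cut so the result is exactly `limit` chars.
        detail = detail[: limit - 3] + "..."
    return {"error": "failed to parse multipart form", "detail": detail}
```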
def test_total_cap_413_still_fires_above_per_file_pass(client: TestClient, monkeypatch: pytest.MonkeyPatch):
    """Total-cap 413 path still works: two parts whose sum exceeds
    CHAT_UPLOAD_MAX_BYTES but each individually fits the per-file cap.

    Sanity-check that raising the per-file ceiling didn't accidentally
    short-circuit the total-cap check.
    """
    monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_BYTES", 1024)
    monkeypatch.setattr(internal_chat_uploads, "CHAT_UPLOAD_MAX_FILE_BYTES", 800)
    r = client.post(
        "/internal/chat/uploads/ingest",
        files=[
            ("files", ("a.bin", b"a" * 600)),
            ("files", ("b.bin", b"b" * 600)),
        ],
        headers={"Authorization": "Bearer test-secret"},
    )
    assert r.status_code == 413
    # Either early (Content-Length pre-parse) or post-parse cumulative path is
    # acceptable; both messages mention exceeding the total limit.
    err = r.json()["error"]
    assert "exceeds" in err and "limit" in err, err
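The interaction between the per-file cap and the cumulative cap can be sketched as a pure function. The name `check_caps_sketch` and the error wording are assumptions for illustration; the real handler in `internal_chat_uploads` is not shown in this diff.

```python
from typing import Iterable, Optional, Tuple


def check_caps_sketch(
    sizes: Iterable[int], max_file_bytes: int, max_total_bytes: int
) -> Tuple[int, Optional[str]]:
    """Return (status, error) after applying per-file and total caps in order."""
    total = 0
    for size in sizes:
        # Per-file cap: a single oversized part is rejected immediately.
        if size > max_file_bytes:
            return 413, f"file exceeds per-file limit ({max_file_bytes} bytes)"
        # Total cap: checked cumulatively, so parts that each fit can still trip it.
        total += size
        if total > max_total_bytes:
            return 413, f"total upload size exceeds limit ({max_total_bytes} bytes)"
    return 200, None
```

With the values used in the test, two 600-byte parts each pass the 800-byte per-file cap, but their 1200-byte sum exceeds the 1024-byte total.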
def test_env_driven_ssot_overrides_caps(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
    """SSOT contract: setting CHAT_UPLOAD_MAX_FILE_BYTES /
    CHAT_UPLOAD_MAX_TOTAL_BYTES in the environment at module import
    time changes the module constants. Pin so the
    workspace_provision_shared.go::applyChatUploadLimits env injection
    cannot silently drift from what the Python side reads.
    """
    import importlib
    monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", str(7 * 1024 * 1024))
    monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", str(13 * 1024 * 1024))
    reloaded = importlib.reload(internal_chat_uploads)
    try:
        assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 7 * 1024 * 1024
        assert reloaded.CHAT_UPLOAD_MAX_BYTES == 13 * 1024 * 1024
    finally:
        # Reset to defaults so subsequent tests see clean constants.
        monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
        monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
        importlib.reload(internal_chat_uploads)
def test_env_driven_ssot_malformed_value_falls_back_to_default(tmp_path: Path, monkeypatch: pytest.MonkeyPatch):
    """If ops pushes a garbage value the worker still boots with the
    in-code default (operability over precision — see _env_int
    docstring). Pin the fallback.
    """
    import importlib
    monkeypatch.setenv("CHAT_UPLOAD_MAX_FILE_BYTES", "not-an-int")
    monkeypatch.setenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", "")  # empty == use default
    reloaded = importlib.reload(internal_chat_uploads)
    try:
        # Defaults (legacy 25 MiB / 50 MiB) come back.
        assert reloaded.CHAT_UPLOAD_MAX_FILE_BYTES == 25 * 1024 * 1024
        assert reloaded.CHAT_UPLOAD_MAX_BYTES == 50 * 1024 * 1024
    finally:
        monkeypatch.delenv("CHAT_UPLOAD_MAX_FILE_BYTES", raising=False)
        monkeypatch.delenv("CHAT_UPLOAD_MAX_TOTAL_BYTES", raising=False)
        importlib.reload(internal_chat_uploads)
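The fallback behaviour pinned by the two env-driven tests might look roughly like the helper below. It is a sketch only: `env_int_sketch` is a hypothetical stand-in for the `_env_int` helper referenced in the docstring, whose actual body is not part of this diff, and rejecting non-positive values is an added assumption.

```python
import os


def env_int_sketch(name: str, default: int) -> int:
    """Read an integer from the environment, falling back to `default`."""
    raw = os.environ.get(name, "").strip()
    if not raw:
        # Unset or empty means "use the in-code default".
        return default
    try:
        value = int(raw)
    except ValueError:
        # Malformed value: keep the worker bootable rather than crash.
        return default
    # Assumption: a non-positive cap is treated as malformed as well.
    return value if value > 0 else default


# Module-level constants would then be resolved once at import time.
CHAT_UPLOAD_MAX_FILE_BYTES = env_int_sketch("CHAT_UPLOAD_MAX_FILE_BYTES", 25 * 1024 * 1024)
CHAT_UPLOAD_MAX_BYTES = env_int_sketch("CHAT_UPLOAD_MAX_TOTAL_BYTES", 50 * 1024 * 1024)
```

Because the constants are bound at import time, a test that changes the environment has to reload the module, which is why the tests above call `importlib.reload`.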