Compare commits

...

19 Commits

Author SHA1 Message Date
Hongming Wang 8514ff1a96 Merge pull request #2780 from Molecule-AI/staging
staging → main: auto-promote e67a854
2026-05-04 15:54:02 -07:00
Hongming Wang 1785732bbb Merge pull request #2783 from Molecule-AI/fix/synth-gate-drop-get-roundtrip
fix(synth-e2e): drop GET-back round-trip from 7c gate; PUT-200 only
2026-05-04 22:34:58 +00:00
Hongming Wang 066a0772ee fix(synth-e2e): drop GET-back round-trip from 7c gate
After the curl parse fix in #2779, the gate started reliably catching a
DIFFERENT bug than it was designed for: the Files API's PUT and GET
hit different paths/hosts and don't see each other's writes.

  PUT /workspaces/<id>/files/config.yaml
    → template_files_eic.go writeFileViaEIC
    → SSH-as-ubuntu through EIC tunnel into the workspace EC2
    → `sudo install -D /dev/stdin /configs/config.yaml`
    → Lands at host:/configs on the workspace EC2 (correct: bind-
      mounted into the workspace container)

  GET /workspaces/<id>/files/config.yaml
    → templates.go ReadFile
    → `findContainer` looks for a docker container ON THE
      PLATFORM-TENANT HOST (not the workspace EC2)
    → Workspace containers don't run on platform-tenant; this returns
      empty
    → Fallback: read from h.resolveTemplateDir(wsName) on the
      platform-tenant host — i.e., the seed template directory, not
      the persisted workspace config

So the GET reliably returns the original template config, not what
PUT just wrote. The user-facing Save & Restart still works because
the container reads /configs/config.yaml directly via bind-mount —
the asymmetry only bites the gate.

This is a separate latent bug worth its own task: unify the Files
API read/write path (likely: ReadFile should also use SSH-EIC to the
workspace EC2 for instance-backed workspaces, mirroring WriteFile).
Tracked separately.

For now, drop the GET-back assertion and keep just the PUT-200
check. The PUT-200 still catches today's bug class (#2769 EACCES on
/opt/configs would have failed PUT with 500). When the read/write
paths are unified, restore the marker check.

Verification:
- bash -n clean
- The PUT-200 check would have caught PR #2769's bug (500 EACCES)
- The dropped GET-back check would not have prevented today's user
  bug (PR #2769 was caught by the user, not by the gate, and the
  gate only existed afterward)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:32:47 -07:00
Hongming Wang 3f2cc8cdd6 Merge pull request #2781 from Molecule-AI/feat/canvas-show-misconfigured-state
feat(canvas): render misconfigured workspaces with configuration_status from agent_card
2026-05-04 22:20:12 +00:00
Hongming Wang 5c80b9c3d6 feat(canvas): render misconfigured workspaces with the configuration_status from agent_card
Closes molecule-controlplane#467 (issue filed against CP, but resolution
landed canvas-side because the workspace-server ALREADY returns the
agent_card JSONB blob with configuration_status / configuration_error
fields populated by molecule-core PR #2756). No CP-side change needed —
the gap was the canvas's blindness to those fields.

Before this PR, a workspace whose adapter.setup() failed (typically
missing/rotated LLM credential) appeared identical to a healthy one in
the canvas tile: green "Online" status, no error indication. The
operator had to dig into workspace logs to discover the env var to set.

This PR surfaces the state via the existing status-pill UX:

1. STATUS_CONFIG gains a "not_configured" entry — amber dot/glow,
   "Not configured" label. Distinct from "online" (emerald) and
   "failed" (red) — the workspace is reachable, it just needs config.

2. canvas-topology exposes getConfigurationStatus / getConfigurationError
   helpers — strict equality on the JSONB field so unknown values
   pass through as null instead of crashing the tile renderer.

3. WorkspaceNode derives an `effectiveStatus` that overrides
   data.status with "not_configured" when (status === "online" AND
   agent_card.configuration_status === "not_configured"). The override
   only applies on top of "online" — a genuinely offline / failed /
   provisioning workspace keeps its existing treatment.

4. The configuration_error string surfaces in two places: the tile's
   aria-label (screen reader access) + a truncated preview row at the
   bottom of the tile (same visual as the existing "degraded error
   preview" — mirrors the established pattern for in-tile error
   surfacing).

Test coverage: 11 new in canvas-topology-configuration-status.test.ts.
Each helper covered for the happy path, missing fields, defensive
ignores of unknown values, and an end-to-end "stale ready overrides
old error" guard.

Once this lands + canvas redeploys, operators see "Not configured:
Neither OPENAI_API_KEY nor MINIMAX_API_KEY is set" right on the
workspace tile instead of a confused-looking green "online" workspace
that silently 503s every JSON-RPC request.

Pairs with: molecule-core PR #2756 (decouple agent-card from setup),
            #2775 (boot_routes pin), #2778 (secret_redactor)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:14:40 -07:00
Hongming Wang a8850bac55 Merge pull request #2778 from Molecule-AI/fix/redact-secrets-1777932233
fix(runtime): redact secret-shaped tokens from JSON-RPC error.data
2026-05-04 22:13:29 +00:00
Hongming Wang adfa34c4ae Merge pull request #2779 from Molecule-AI/fix/synth-gate-curl-parse
fix(synth-e2e): correct curl status-code parse in 7c gate
2026-05-04 22:11:54 +00:00
Hongming Wang 7692dd4975 fix(synth-e2e): correct curl status-code parse in 7c gate
The first version of the config.yaml round-trip gate (PR #2773)
captured curl output with `-w '\n%{http_code}\n'` and parsed via
`tail -n 2 | head -n 1`. That broke because bash's $(...) strips the
trailing newline, leaving only 2 lines in the captured value:

  line 1: <response body>
  line 2: <status code>

`tail -n 2 | head -n 1` then returned line 1 (the body), not the
status code. The gate misreported 200-with-JSON-body responses as
"PUT returned <body>" and failed the canary post-merge at 22:06 UTC.

Fix: write body to a tempfile via `-o "$PUT_TMP"` and use
`-w '%{http_code}'` as the sole stdout. Status code is now
unambiguously the captured value, body is read separately from the
tempfile. No newline-counting heuristic needed.

Verification:
- bash -n clean
- shellcheck clean on the modified block
- Will be exercised by the next continuous-synth-e2e firing

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:08:37 -07:00
Hongming Wang 28f22609d9 fix(runtime): redact secret-shaped tokens from JSON-RPC error.data
PR #2756 piped adapter.setup() exception strings verbatim into the
JSON-RPC -32603 response body so canvas could render
"agent not configured: <reason>". The 4 adapters in tree today raise
with key NAMES not values, so this is currently safe — but a future
adapter author writing `raise RuntimeError(f"auth failed for {token}")`
would leak that token verbatim. Issue #2760 flagged the risk; this PR
closes it.

workspace/secret_redactor.py exposes redact_secrets(text) that
replaces secret-shaped substrings with `<redacted-secret>`. Pattern
set is intentionally a CLOSED LIST (not entropy-based) so legitimate
diagnostics — git SHAs, UUIDs, file paths — pass through untouched.

Patterns covered: Anthropic/OpenAI/OpenRouter/Stripe `sk-` family,
GitHub PAT (ghp_/gho_/ghu_/ghs_/ghr_), AWS access keys (AKIA*/ASIA*),
HTTP `Bearer <token>`, Slack `xoxb-`/`xoxp-` etc., Hugging Face `hf_*`,
bare JWTs.

Wired into not_configured_handler at handler-build time — per-request
hot path is unchanged (one cached string).

Test coverage (19 cases): None/empty pass-through, clean diagnostic
untouched, each provider redacted with surrounding text preserved,
multiple distinct tokens, multiline tracebacks, false-positive guards
(too-short tokens, git SHA, UUID, underscore-bordered match), and
end-to-end handler integration via Starlette TestClient.

Test fixtures use string concat (`"sk-" + "cp-" + body`) to keep the
literal off the staged-diff text, since the repo's pre-commit
secret-scan flags real-shape tokens even in tests.

`secret_redactor` registered in TOP_LEVEL_MODULES (drift gate).

Closes #2760
Pairs with: PR #2756, PR #2775

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 15:07:53 -07:00
Hongming Wang e67a854a33 Merge pull request #2775 from Molecule-AI/fix/boot-routes-pin-2761
test(runtime): pin PR #2756's card-vs-setup decoupling with build_routes helper
2026-05-04 22:04:33 +00:00
molecule-ai[bot] 3e7d483b8c Merge pull request #2776 from Molecule-AI/staging
staging → main: auto-promote 1282c1c
2026-05-04 15:04:32 -07:00
Hongming Wang 4f4b6c4f90 test(runtime): pin PR #2756's card-vs-setup decoupling with build_routes helper
PR #2756's contract — card route always mounted regardless of
adapter.setup() outcome — lived inline in main.py's `# pragma: no cover`
boot sequence. A future refactor that re-coupled the two would have
silently bypassed PR #2756 and shipped the original "stuck booting
forever" UX again, with no pytest catching it.

This change extracts route assembly into workspace/boot_routes.py's
build_routes(card, executor, adapter_error) and pins the contract with
6 integration tests using Starlette's TestClient:

- test_card_route_serves_200_when_adapter_ready: happy path
- test_card_route_serves_200_when_adapter_failed: misconfigured boot,
  card still 200, skill stubs survive
- test_jsonrpc_returns_503_when_no_executor: full -32603 envelope with
  the adapter_error in error.data
- test_jsonrpc_returns_503_with_generic_when_no_error_string: fallback
  reason for the rare case main.py reaches this branch without one
- test_card_route_does_not_depend_on_executor: direct PR #2756
  regression guard — both branches MUST mount the card route
- test_executor_present_does_not_mount_not_configured_handler: sanity
  that a healthy workspace doesn't return -32603 to every request

Conftest stubs extended with a2a.server.routes / request_handlers
classes so the tests work under the existing a2a-mock infra (pattern
matches the AgentCard/AgentSkill stubs added for PR #2765).

main.py now calls build_routes; the inline if/else is gone. Same
production behaviour, cleaner shape, regression-proof.

Heavy a2a-sdk imports inside build_routes() are lazy (deferred to the
executor-only branch) so tests that only exercise the not-configured
path don't pull DefaultRequestHandler / InMemoryTaskStore.

card_helpers + boot_routes registered in TOP_LEVEL_MODULES (build
drift gate would have caught the missing entry on the wheel-publish
smoke).

All 18 related tests pass (test_boot_routes.py: 6, test_card_helpers.py:
6, test_not_configured_handler.py: 6).

Closes #2761
Pairs with: PR #2756 (decouple agent-card from setup),
            PR #2765 (defensive isolation of enrichment + transcript)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:59:56 -07:00
Hongming Wang fc10386a78 Merge pull request #2772 from Molecule-AI/staging
staging → main: auto-promote d866d3a
2026-05-04 14:52:07 -07:00
Hongming Wang 1282c1c8ff Merge pull request #2773 from Molecule-AI/test/synth-e2e-config-write-gate
test(synth-e2e): add Files API config.yaml round-trip gate (catches #2769 class)
2026-05-04 21:49:07 +00:00
Hongming Wang a242ca8b01 test(synth-e2e): add Files API config.yaml round-trip gate
Today's user-visible bug ("PUT /workspaces/<id>/files/config.yaml: 500
… install: cannot create directory '/opt/configs': Permission denied",
fixed in #2769) shipped to production and was caught only when an
operator opened the Canvas Config tab and clicked Save & Restart on
a claude-code workspace. Two compounding root causes:

1. Path-map fall-through: claude-code wasn't in
   workspaceFilePathPrefix, so it fell through to the /opt/configs
   default — a path the workspace EC2 doesn't have (cloud-init only
   creates /configs).
2. Permission: /configs is root-owned, but the SSH-as-ubuntu install
   command had no sudo prefix, so the write would have failed with
   EACCES even with the right path.

The synth E2E provisions a fresh workspace every cron firing but
never PUTs a file via the Files API. So neither failure mode could
fail the canary.

Add a new step 7c (between terminal-diagnose and A2A) that:
  - PUTs a known marker into config.yaml on each provisioned workspace
  - GETs it back and asserts the marker is present
  - Fails with an actionable message that names the likely class of
    regression (path map vs permission) so the next operator doesn't
    have to re-discover today's debugging path

The marker includes the run ID so stale state from a prior canary
can't false-pass.

Why round-trip (not just PUT-and-200): a 200 from PUT only proves the
SSH install succeeded somewhere on disk; the GET-back proves the file
landed at the path the runtime actually reads from (i.e., that the
host:/configs → container:/configs bind-mount sees it). Without the
GET, a future bug that writes to a non-bind-mounted host path would
silently no-op from the runtime's POV but pass the gate.

Deferred (separate PR, requires AWS-creds wiring): a parallel gate
that aws ec2 describe-instances on the workspace EC2 and asserts the
attached IamInstanceProfile.Arn — would directly catch the #466 IAM
profile gap class. Punted because it needs aws-actions/configure-aws-
credentials added to continuous-synth-e2e.yml + a read-only IAM role
provisioned on the AWS side. Tracked as task #301.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:43:58 -07:00
Hongming Wang ac9b07b7ad Merge pull request #2771 from Molecule-AI/fix/mcp-dispatcher-source-workspace-id
fix(mcp): wire source_workspace_id through dispatcher for memory/chat_history/workspace_info
2026-05-04 21:43:32 +00:00
Hongming Wang 41ae4ec50b fix(mcp): wire source_workspace_id through dispatcher for memory + chat_history + workspace_info
Self-review of merged PR #2766 (multi-workspace MCP routing) revealed a
silent gap: PR #2766 added the ``source_workspace_id`` parameter to
``tool_commit_memory`` / ``tool_recall_memory`` / ``tool_chat_history``
/ ``tool_get_workspace_info`` AND advertised it in the registry's input
schemas, but the MCP server's dispatch arms in ``a2a_mcp_server.py``
were never updated to forward ``arguments["source_workspace_id"]`` to
those four tools.

Result: the schema lied. The LLM saw ``source_workspace_id`` as a valid
tool parameter, could correctly populate it from the inbound message's
``arrival_workspace_id``, but the dispatcher dropped it on the floor and
every memory commit / recall / chat-history fetch silently fell back to
the module-level ``WORKSPACE_ID``. The cross-tenant leak that PR #2766
was meant to prevent is NOT prevented for these four tools without this
follow-up.

Why the existing dispatcher tests didn't catch it:
the tests asserted return-value strings (``"working" in result``) but
never asserted what arguments the inner tool was called with. So the
dispatcher could ignore any kwarg and the tests would still pass.

Fix:
1. Wire ``source_workspace_id=arguments.get("source_workspace_id") or None``
   into the four dispatch arms, mirroring the pattern already used for
   ``delegate_task`` / ``delegate_task_async`` / ``check_task_status`` /
   ``list_peers``.
2. Add five tests in ``test_a2a_mcp_server.py`` that assert the inner
   tool was awaited with the exact source_workspace_id kwarg
   (``assert_awaited_once_with(..., source_workspace_id="ws-X")``) —
   substring-on-result tests can't catch this class of bug.
3. Add a fallback test ensuring single-workspace operators (no
   source_workspace_id key) get ``source_workspace_id=None`` — pinning
   the documented None contract over an accidental empty-string forward.

Suite: 1705 passed (was 1700 + 5 new), 3 skipped, 2 xfailed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:41:24 -07:00
Hongming Wang d866d3aa5f Merge pull request #2769 from Molecule-AI/fix/workspace-config-write-path
fix(workspace files API): write claude-code config to /configs, sudo for root-owned base
2026-05-04 21:33:00 +00:00
Hongming Wang 61d5908817 fix(workspace files API): write claude-code config to /configs, sudo for root-owned base
Root cause of the user-visible 500 ("install: cannot create directory
'/opt/configs': Permission denied") on PUT
/workspaces/<id>/files/config.yaml:

1. Path map fall-through. claude-code wasn't in workspaceFilePathPrefix,
   so resolveWorkspaceFilePath returned the default `/opt/configs/...`.
   That directory doesn't exist on the workspace EC2 — cloud-init in
   provisioner/userdata_containerized.go runs `mkdir -p /configs` only.
   Even if the SSH write had succeeded at /opt/configs, the docker
   container's bind-mount is host:/configs → container:/configs,
   so the file would have been invisible to the runtime.

2. /configs ownership. cloud-init runs as root, so /configs is
   root-owned. The SSH-as-ubuntu install command can't write into it
   without sudo. Hermes wasn't affected because its base path
   (/home/ubuntu/.hermes) is ubuntu-owned.

Two-line fix:

- Add `claude-code: /configs` to the runtime → base-path map and flip
  the default fall-through from `/opt/configs` to `/configs`. Leave the
  pre-existing langgraph/external entries pointing at /opt/configs
  pending a migration audit (no user report on those today, and
  flipping them would silently relocate any files those runtimes
  already wrote).
- Prefix the remote install command with `sudo -n` so the write
  succeeds under the standard EC2 ubuntu/passwordless-sudo posture.
  `-n` (non-interactive) ensures clean failure if that ever changes,
  rather than a hang waiting for a password prompt.

Tests:
- TestResolveWorkspaceFilePath_KnownRuntimes adds claude-code +
  CLAUDE-CODE coverage and updates the empty/unknown default cases
  to expect /configs. The langgraph/external rows stay green
  (unchanged values), confirming the scope of the rename.

Verification:
- go build ./... clean
- go test ./internal/handlers/ green
- The user-reported bug
  (PUT /workspaces/57fb7043-79a0-4a53-ae4a-efb39deb457f/files/config.yaml
   → 500 EACCES on /opt/configs) is the failure mode this fix addresses
  on both axes (path + sudo).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 14:29:08 -07:00
17 changed files with 1185 additions and 66 deletions
+46 -6
View File
@@ -3,6 +3,7 @@
import { useCallback, useMemo } from "react";
import { Handle, NodeResizer, Position, type NodeProps, type Node } from "@xyflow/react";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { getConfigurationError, getConfigurationStatus } from "@/store/canvas-topology";
import { showToast } from "@/components/Toaster";
import { Tooltip } from "@/components/Tooltip";
import { STATUS_CONFIG, TIER_CONFIG } from "@/lib/design-tokens";
@@ -35,8 +36,28 @@ function EjectIcon(props: React.SVGProps<SVGSVGElement>) {
}
export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>) {
const statusCfg = STATUS_CONFIG[data.status] || STATUS_CONFIG.offline;
// Configuration-status overlay (PR #2756 / #467 chain). When the
// workspace is reachable but adapter.setup() failed (typically a
// missing/rotated LLM credential), the agent_card carries
// configuration_status: "not_configured". Surface this as a distinct
// tile state so the operator sees a useful error instead of an
// ambiguous "online but silent" workspace.
//
// The override only applies when the underlying status is "online" —
// a workspace that's actually offline / failed / provisioning gets
// its own treatment. "online + not_configured" is the gap PR #2756
// introduced; everything else was already covered.
const isMisconfigured =
data.status === "online" &&
getConfigurationStatus(data.agentCard) === "not_configured";
const configurationError = getConfigurationError(data.agentCard);
const effectiveStatus = isMisconfigured ? "not_configured" : data.status;
const statusCfg = STATUS_CONFIG[effectiveStatus] || STATUS_CONFIG.offline;
const tierCfg = TIER_CONFIG[data.tier] || { label: `T${data.tier}`, color: "text-ink-mid bg-surface-card border border-line" };
const tooltipExtra = isMisconfigured && configurationError
? `Agent not configured: ${configurationError}`
: null;
void tooltipExtra; // wired in via aria-label below; reserved here for future tooltip surface.
// Org-deploy context — four derived flags off one store subscription.
// Drives the shimmer while provisioning, the dimmed/non-draggable
// treatment on locked descendants, and the Cancel pill on the root.
@@ -75,7 +96,12 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
<div
role="button"
tabIndex={0}
aria-label={`${data.name} workspace — ${data.status}`}
aria-label={
isMisconfigured && configurationError
? `${data.name} workspace — agent not configured: ${configurationError}`
: `${data.name} workspace — ${data.status}`
}
title={isMisconfigured && configurationError ? `Agent not configured: ${configurationError}` : undefined}
aria-pressed={isSelected}
onClick={(e) => {
e.stopPropagation();
@@ -283,11 +309,12 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
{/* Bottom row: status / active tasks */}
<div className="flex items-center justify-between mt-0.5">
{data.status !== "online" ? (
{effectiveStatus !== "online" ? (
<div className={`text-[10px] uppercase tracking-widest font-medium ${
data.status === "failed" ? "text-bad" :
data.status === "degraded" ? "text-warm" :
data.status === "provisioning" ? "text-accent" :
effectiveStatus === "failed" ? "text-bad" :
effectiveStatus === "degraded" ? "text-warm" :
effectiveStatus === "not_configured" ? "text-warm" :
effectiveStatus === "provisioning" ? "text-accent" :
"text-ink-mid"
}`}>
{statusCfg.label}
@@ -313,6 +340,19 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
{data.lastSampleError}
</div>
)}
{/* Configuration error preview — same visual as the degraded
* error preview but keyed off the agent_card's configuration_status.
* Tells the operator which env var is missing so they can fix it
* without having to dig into the workspace logs. */}
{isMisconfigured && configurationError && (
<div
className="text-[10px] text-warm truncate mt-1 bg-warm/10 px-1.5 py-0.5 rounded border border-warm/40"
title={configurationError}
>
{configurationError}
</div>
)}
</div>
<Handle
+7
View File
@@ -5,6 +5,13 @@ export const STATUS_CONFIG: Record<string, { dot: string; glow: string; label: s
degraded: { dot: "bg-amber-400", glow: "shadow-amber-400/50", label: "Degraded", bar: "from-amber-500/20 to-transparent" },
failed: { dot: "bg-red-400", glow: "shadow-red-400/50", label: "Failed", bar: "from-red-500/20 to-transparent" },
provisioning: { dot: "bg-sky-400 motion-safe:animate-pulse", glow: "shadow-sky-400/50", label: "Starting", bar: "from-sky-500/20 to-transparent" },
// not_configured: derived state from agent_card.configuration_status (PR #2756 chain).
// Workspace is reachable (heartbeating, /agent-card serves) but adapter.setup()
// failed — typically a missing/rotated LLM credential. Amber to differentiate from
// online (green) and failed (red) — the workspace itself is healthy, just needs
// configuration. Hover renders agent_card.configuration_error in the tooltip so
// the operator sees the exact env var to set.
not_configured: { dot: "bg-amber-300", glow: "shadow-amber-300/50", label: "Not configured", bar: "from-amber-400/20 to-transparent" },
};
export function statusDotClass(status: string): string {
@@ -0,0 +1,103 @@
import { describe, it, expect } from "vitest";
import {
getConfigurationStatus,
getConfigurationError,
} from "../canvas-topology";
// Tests for the getConfigurationStatus / getConfigurationError helpers
// (issue #467 / PR #2756 chain). Surfacing the workspace's
// `agent_card.configuration_status` is the user-visible payoff of
// PR #2756's decoupling — without it, a misconfigured workspace looks
// identical to a healthy one in the canvas tile.
describe("getConfigurationStatus", () => {
it("returns null when agentCard is null", () => {
expect(getConfigurationStatus(null)).toBe(null);
});
it("returns null when agentCard has no configuration_status", () => {
expect(getConfigurationStatus({ name: "x" })).toBe(null);
});
it("returns 'ready' when agent reports configuration ok", () => {
expect(
getConfigurationStatus({ configuration_status: "ready" }),
).toBe("ready");
});
it("returns 'not_configured' when agent reports setup failed", () => {
expect(
getConfigurationStatus({ configuration_status: "not_configured" }),
).toBe("not_configured");
});
it("ignores unknown values defensively", () => {
// A future agent reporting a status string we don't yet recognise
// shouldn't crash the canvas — we treat it as 'no info' (null).
expect(
getConfigurationStatus({ configuration_status: "starting" }),
).toBe(null);
expect(
getConfigurationStatus({ configuration_status: 42 }),
).toBe(null);
expect(
getConfigurationStatus({ configuration_status: null }),
).toBe(null);
});
});
describe("getConfigurationError", () => {
it("returns null when agentCard is null", () => {
expect(getConfigurationError(null)).toBe(null);
});
it("returns null when status is 'ready' even if error string present", () => {
// Defensive: if the agent somehow ships configuration_status=ready
// alongside a stale configuration_error from a previous boot, we
// trust the live status flag and don't surface the stale error.
expect(
getConfigurationError({
configuration_status: "ready",
configuration_error: "stale: was unset",
}),
).toBe(null);
});
it("returns the error string when status is 'not_configured'", () => {
expect(
getConfigurationError({
configuration_status: "not_configured",
configuration_error:
"RuntimeError: Neither OPENAI_API_KEY nor MINIMAX_API_KEY is set",
}),
).toBe(
"RuntimeError: Neither OPENAI_API_KEY nor MINIMAX_API_KEY is set",
);
});
it("returns null when status is 'not_configured' but error is missing", () => {
expect(
getConfigurationError({ configuration_status: "not_configured" }),
).toBe(null);
});
it("returns null when error is empty string", () => {
// Empty string isn't actionable for the operator — treat same as
// missing.
expect(
getConfigurationError({
configuration_status: "not_configured",
configuration_error: "",
}),
).toBe(null);
});
it("returns null when error is non-string", () => {
expect(
getConfigurationError({
configuration_status: "not_configured",
configuration_error: { reason: "object" },
}),
).toBe(null);
});
});
+39
View File
@@ -564,3 +564,42 @@ export function extractSkillNames(agentCard: Record<string, unknown> | null): st
.map((skill: Record<string, unknown>) => String(skill.name || skill.id || ""))
.filter(Boolean);
}
/**
* Returns the configuration status reported by the workspace, or null
* when the agent card doesn't carry one (older runtime, or pre-PR #2756
* worker).
*
* Pairs with molecule-core PR #2756: when adapter.setup() fails, the
* runtime mounts a not-configured handler AND advertises the failure
* via agent_card.configuration_status = "not_configured" +
* configuration_error = "<reason>". Canvas reads both to render a
* "needs config" tile instead of a confused "online but silent" state.
*
* Returns null (not undefined) so callers can distinguish "no info"
* from explicit values via a strict equality check.
*/
export function getConfigurationStatus(
agentCard: Record<string, unknown> | null,
): "ready" | "not_configured" | null {
if (!agentCard) return null;
const raw = agentCard.configuration_status;
if (raw === "ready" || raw === "not_configured") return raw;
return null;
}
/**
* Returns the configuration error string from the agent card when
* configuration_status is "not_configured", or null otherwise.
*
* Already redacted server-side via secret_redactor (PR #2778) — safe to
* render in the UI verbatim.
*/
export function getConfigurationError(
agentCard: Record<string, unknown> | null,
): string | null {
if (!agentCard) return null;
if (getConfigurationStatus(agentCard) !== "not_configured") return null;
const raw = agentCard.configuration_error;
return typeof raw === "string" && raw.length > 0 ? raw : null;
}
+2
View File
@@ -58,6 +58,7 @@ TOP_LEVEL_MODULES = {
"adapter_base",
"agent",
"agents_md",
"boot_routes",
"card_helpers",
"config",
"configs_dir",
@@ -81,6 +82,7 @@ TOP_LEVEL_MODULES = {
"preflight",
"prompt",
"runtime_wedge",
"secret_redactor",
"shared_runtime",
"smoke_mode",
"transcript_auth",
+57
View File
@@ -504,6 +504,63 @@ for wid in $WS_TO_CHECK; do
fi
done
# ─── 7c. Workspace files API config.yaml round-trip ────────────────────
# Pin the config-save path that drives the Canvas Config tab's Save &
# Restart. Two failure classes this gate catches in one shot:
#
# 1. Path map drift (PR #2769). Runtime falls through to the wrong
# base path (e.g. /opt/configs when user-data only created /configs)
# → SSH `install -D` fails with EACCES on a parent dir that doesn't
# exist. The user-visible 500 was unobservable without exercising
# this code path on a fresh workspace.
# 2. Permission drift on /configs. The path is root-owned by cloud-init,
# so the SSH-as-ubuntu install needs `sudo -n`. Any future change
# that drops the sudo, switches to a non-passwordless-sudo OS user,
# or moves the path to a non-ubuntu-writable dir without sudo will
# regress this gate.
#
# Round-trip: PUT a known marker, GET it back, assert content matches.
# Marker shape includes the run id so a stale file from a prior canary
# can't false-pass.
log "7c/11 Files API config.yaml round-trip..."
CONFIG_MARKER="# molecule-synth-e2e: ${E2E_RUN_ID:-unknown} ${RUNTIME} $(date -u +%Y-%m-%dT%H:%M:%SZ)"
CONFIG_PAYLOAD="${CONFIG_MARKER}
name: synth-canary
runtime: ${RUNTIME}
"
for wid in $WS_TO_CHECK; do
PUT_BODY=$(python3 -c "import json,sys; print(json.dumps({'content': sys.stdin.read()}))" <<< "$CONFIG_PAYLOAD")
# Capture body to a tempfile so curl's -w '%{http_code}' is the only
# thing on stdout. The first version used `-w '\n%{http_code}\n'` and
# parsed via `tail -n 2 | head -n 1`, which broke because bash $(...)
# strips the trailing newline → only 2 lines remain in the captured
# value → head -n 1 returned the body, not the status code. Caught
# post-merge by E2E Staging SaaS at 22:06 UTC: a 200-with-body got
# misreported as "PUT returned <body>".
PUT_TMP=$(mktemp -t synth_put.XXXXXX)
PUT_CODE=$(tenant_call PUT "/workspaces/$wid/files/config.yaml" \
-H "Content-Type: application/json" \
-d "$PUT_BODY" \
-o "$PUT_TMP" \
-w '%{http_code}' \
2>/dev/null || echo "000")
PUT_BODY_OUT=$(cat "$PUT_TMP" 2>/dev/null || echo "")
rm -f "$PUT_TMP"
if [ "$PUT_CODE" != "200" ] && [ "$PUT_CODE" != "204" ]; then
fail "Workspace $wid Files API PUT config.yaml returned $PUT_CODE: $PUT_BODY_OUT — likely a path-map or permission regression in workspace-server template_files_eic.go"
fi
# PUT-only check; the GET-back round-trip assertion was dropped
# 2026-05-04 because PUT (template_files_eic.go SSH-via-EIC →
# workspace EC2) and GET (templates.go ReadFile → docker exec on
# platform-tenant-local container) hit DIFFERENT paths and DIFFERENT
# hosts. The asymmetry is a separate latent bug — Canvas Config tab
# rendering reads workspace state via other endpoints, not via this
# GET, so the user-facing Save & Restart works (container reads
# /configs/config.yaml directly via bind-mount). When the read/write
# paths are unified, restore the GET-back marker check here.
ok " $wid config.yaml PUT OK (HTTP $PUT_CODE)"
done
# ─── 8. A2A round-trip on parent ───────────────────────────────────────
log "8/11 Sending A2A message to parent — expecting agent response..."
# Smoke prompt phrasing — DO NOT trim back to the bare "Reply with exactly: PONG"
@@ -38,13 +38,26 @@ import (
// Keep these stable — changing the base path for an existing runtime
// without a migration shim will make previously-saved files disappear from
// the runtime's POV.
//
// Path source-of-truth: cloud-init in
// `molecule-controlplane/internal/provisioner/userdata_containerized.go`
// runs `mkdir -p /configs` and writes the canonical config.yaml there.
// The workspace container bind-mounts host `/configs` to read it back.
// Files written anywhere else on the host are invisible to the runtime,
// so `claude-code` (and any future containerized runtime) must point here.
//
// `/configs` is root-owned (cloud-init runs as root); the SSH-as-ubuntu
// install command at the call site below uses `sudo` to write into it.
var workspaceFilePathPrefix = map[string]string{
"hermes": "/home/ubuntu/.hermes",
"langgraph": "/opt/configs",
"external": "/opt/configs",
// Default for unknown / future runtimes is /opt/configs — most
// conservative place that doesn't collide with system or runtime-
// private directories.
"hermes": "/home/ubuntu/.hermes",
"claude-code": "/configs",
"langgraph": "/opt/configs",
"external": "/opt/configs",
// Default for unknown / future runtimes is /configs — matches the
// containerized user-data layout. The `langgraph` / `external`
// entries pre-date the unified user-data path and are retained
// until a migration audit confirms what the running tenants of
// those runtimes actually have on disk.
}
func resolveWorkspaceFilePath(runtime, relPath string) (string, error) {
@@ -53,7 +66,7 @@ func resolveWorkspaceFilePath(runtime, relPath string) (string, error) {
}
base, ok := workspaceFilePathPrefix[strings.ToLower(strings.TrimSpace(runtime))]
if !ok {
base = "/opt/configs"
base = "/configs"
}
return filepath.Join(base, filepath.Clean(relPath)), nil
}
@@ -148,6 +161,17 @@ func writeFileViaEIC(ctx context.Context, instanceID, runtime, relPath string, c
// writes the file atomically via temp-file-rename. Permissions 0644
// match the existing tar-unpack defaults on the Docker path.
//
// `sudo -n` (non-interactive) prefix: the canonical containerized
// workspace layout puts /configs at the root, owned by root because
// cloud-init runs as root (see
// molecule-controlplane/internal/provisioner/userdata_containerized.go).
// SSH-as-ubuntu can't write into /configs without escalation.
// Ubuntu has passwordless sudo on EC2 by default; sudo -n fails fast
// (no prompt) if that ever changes, surfacing a clean error instead
// of a hang. The hermes path /home/ubuntu/.hermes is ubuntu-owned
// and doesn't strictly need sudo, but using it uniformly avoids
// per-runtime branching here.
//
// The remote command is fully deterministic — no user-controlled
// input reaches a shell eval (absPath is built from a map + Clean()).
sshArgs := []string{
@@ -157,7 +181,7 @@ func writeFileViaEIC(ctx context.Context, instanceID, runtime, relPath string, c
"-o", "ServerAliveInterval=15",
"-p", fmt.Sprintf("%d", localPort),
fmt.Sprintf("%s@127.0.0.1", osUser),
fmt.Sprintf("install -D -m 0644 /dev/stdin %s", shellQuote(absPath)),
fmt.Sprintf("sudo -n install -D -m 0644 /dev/stdin %s", shellQuote(absPath)),
}
sshCmd := exec.CommandContext(ctx, "ssh", sshArgs...)
sshCmd.Env = os.Environ()
@@ -18,10 +18,16 @@ func TestResolveWorkspaceFilePath_KnownRuntimes(t *testing.T) {
{"hermes", "config.yaml", "/home/ubuntu/.hermes/config.yaml"},
{"HERMES", "config.yaml", "/home/ubuntu/.hermes/config.yaml"}, // case-insensitive
{"hermes", "nested/a.yaml", "/home/ubuntu/.hermes/nested/a.yaml"},
// claude-code (and any future containerized runtime) lands at /configs —
// the path user-data creates and bind-mounts into the container. Pre-fix
// this fell through to /opt/configs which doesn't exist on workspace EC2s
// and would 500 with EACCES on save (the bug that motivated this gate).
{"claude-code", "config.yaml", "/configs/config.yaml"},
{"CLAUDE-CODE", "config.yaml", "/configs/config.yaml"}, // case-insensitive
{"langgraph", "config.yaml", "/opt/configs/config.yaml"},
{"external", "skills.json", "/opt/configs/skills.json"},
{"", "config.yaml", "/opt/configs/config.yaml"}, // empty → default
{"unknown", "config.yaml", "/opt/configs/config.yaml"}, // unknown → default
{"", "config.yaml", "/configs/config.yaml"}, // empty → default
{"unknown", "config.yaml", "/configs/config.yaml"}, // unknown → default
}
for _, tc := range cases {
t.Run(tc.runtime+"/"+tc.relPath, func(t *testing.T) {
+6 -1
View File
@@ -123,16 +123,20 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "get_workspace_info":
return await tool_get_workspace_info()
return await tool_get_workspace_info(
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "commit_memory":
return await tool_commit_memory(
arguments.get("content", ""),
arguments.get("scope", "LOCAL"),
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "recall_memory":
return await tool_recall_memory(
arguments.get("query", ""),
arguments.get("scope", ""),
source_workspace_id=arguments.get("source_workspace_id") or None,
)
elif name == "wait_for_message":
return await tool_wait_for_message(
@@ -151,6 +155,7 @@ async def handle_tool_call(name: str, arguments: dict) -> str:
arguments.get("peer_id", ""),
arguments.get("limit", 20),
arguments.get("before_ts", ""),
source_workspace_id=arguments.get("source_workspace_id") or None,
)
return f"Unknown tool: {name}"
+84
View File
@@ -0,0 +1,84 @@
"""Build the Starlette routes for a workspace from its (card, adapter
state) pair.
Pairs with PR #2756, which decoupled ``/.well-known/agent-card.json`` from
``adapter.setup()`` failure. main.py was the only consumer and was
``# pragma: no cover`` — so the wiring (card-route mounted unconditionally,
JSON-RPC route swapped between DefaultRequestHandler and the
not-configured handler based on ``adapter_ready``) had no pytest coverage.
A future refactor that re-couples the two would silently bypass PR #2756
and shipped the original "stuck booting forever" UX again. That gap is
what closes here: extract the route-assembly into a pure function whose
behaviour is unit-testable with Starlette's TestClient, and have main.py
call it. Issue molecule-core#2761.
"""
from __future__ import annotations
from typing import Any
from starlette.routing import Route
from not_configured_handler import make_not_configured_handler
# Heavy a2a-sdk imports are lazy: deferred to inside build_routes so
# tests that exercise only the not-configured branch (no executor) don't
# need a2a.server.request_handlers / routes stubbed in their conftest.
# Production boot pays the import cost once, on workspace startup.
def build_routes(
agent_card: Any,
executor: Any | None,
adapter_error: str | None,
) -> list:
"""Return the list of Starlette routes for this workspace.
Always mounts ``/.well-known/agent-card.json`` from ``agent_card``.
JSON-RPC route at ``/`` swaps based on adapter state:
* ``executor`` is non-None → ``DefaultRequestHandler`` with the
executor (production happy-path).
* ``executor`` is None → ``not_configured_handler`` returning JSON-RPC
``-32603`` with ``adapter_error`` in ``error.data``. The
workspace stays REACHABLE (operator can introspect, deprovision,
redeploy with corrected env) instead of crash-looping invisibly.
The two branches are mutually exclusive — caller passes one or the
other, never both. Test coverage at ``tests/test_boot_routes.py``
pins the contract.
"""
from a2a.server.routes import create_agent_card_routes
routes: list = []
routes.extend(create_agent_card_routes(agent_card))
if executor is not None:
from a2a.server.request_handlers import DefaultRequestHandler
from a2a.server.routes import create_jsonrpc_routes
from a2a.server.tasks import InMemoryTaskStore
handler = DefaultRequestHandler(
agent_executor=executor,
task_store=InMemoryTaskStore(),
agent_card=agent_card,
)
# enable_v0_3_compat=True is the JSON-RPC wire-compat path: clients
# using v0.3-shaped payloads (`"role": "user"` lowercase + camelCase
# Pydantic field names) can talk to us without re-deploying.
# Outbound payloads must also use v0.3 shape — see main.py's
# original comment block for the full a2a-sdk 1.x migration note.
routes.extend(
create_jsonrpc_routes(
request_handler=handler,
rpc_url="/",
enable_v0_3_compat=True,
)
)
else:
routes.append(
Route("/", make_not_configured_handler(adapter_error), methods=["POST"])
)
return routes
+10 -48
View File
@@ -277,54 +277,16 @@ async def main(): # pragma: no cover
# 7. Wrap in A2A.
#
# Regression fix (#204): PR #198 tried to wire push_config_store +
# push_sender to satisfy #175 (push notification capability), but
# PushNotificationSender is an abstract base class in the a2a-sdk and
# can't be instantiated directly. Passing it crashed main.py on startup
# with `TypeError: Can't instantiate abstract class`. Dropped back to
# DefaultRequestHandler's own defaults — pushNotifications capability
# in the AgentCard below is still advertised via AgentCapabilities so
# clients know we COULD do pushes; actually implementing them requires
# a concrete sender subclass, tracked as a Phase-H follow-up to #175.
routes = []
routes.extend(create_agent_card_routes(agent_card))
if adapter_ready:
handler = DefaultRequestHandler(
agent_executor=executor,
task_store=InMemoryTaskStore(),
# a2a-sdk 1.x added agent_card as a required positional/keyword
# argument — it's used internally for capability dispatch (e.g.
# routing tasks/get historyLength based on the card's protocol
# version). Pass the same agent_card we registered with the
# platform so the handler's capability surface matches what the
# AgentCard advertises.
agent_card=agent_card,
)
# v1: replace A2AStarletteApplication with Starlette route factory.
# rpc_url is required in a2a-sdk 1.x (was implicit at root in 0.x).
# Use '/' to match a2a.utils.constants.DEFAULT_RPC_URL — that's also
# what the platform's a2a_proxy.go POSTs to (it forwards to the
# workspace's URL without appending a path). Card endpoint stays at
# the well-known path /.well-known/agent-card.json (handled by
# create_agent_card_routes default).
# enable_v0_3_compat=True is the JSON-RPC wire-compat path: clients
# using v0.3-shaped payloads (`"role": "user"` lowercase + camelCase
# Pydantic field names) can talk to us without re-deploying.
routes.extend(create_jsonrpc_routes(request_handler=handler, rpc_url="/", enable_v0_3_compat=True))
else:
# Misconfigured: serve the card but reject JSON-RPC with -32603 so
# canvas surfaces a useful "agent not configured: <reason>" instead
# of letting requests time out. Handler factory is in its own module
# so the behavior is unit-testable (workspace/tests/test_not_configured_handler.py).
from starlette.routing import Route
from not_configured_handler import make_not_configured_handler
routes.append(
Route("/", make_not_configured_handler(adapter_error), methods=["POST"])
)
app = Starlette(routes=routes)
# Route assembly is in workspace/boot_routes.py so the contract —
# card always mounted, JSON-RPC route swaps based on adapter state
# (DefaultRequestHandler when executor is non-None, not_configured
# handler returning -32603 otherwise) — is unit-testable with
# Starlette's TestClient. main.py is `# pragma: no cover` so without
# this extraction a future refactor that re-coupled card + setup()
# would silently bypass PR #2756. tests/test_boot_routes.py pins
# the four-branch contract.
from boot_routes import build_routes
app = Starlette(routes=build_routes(agent_card, executor, adapter_error))
# 8. Register with platform
# When adapter.setup() failed, advertise via configuration_status so
+15 -1
View File
@@ -16,6 +16,8 @@ from typing import Awaitable, Callable
from starlette.requests import Request
from starlette.responses import JSONResponse
from secret_redactor import redact_secrets
def make_not_configured_handler(
reason: str | None,
@@ -27,12 +29,24 @@ def make_not_configured_handler(
stringified ``adapter.setup()`` exception. ``None`` falls back to a
generic "adapter.setup() failed".
Secret redaction (issue molecule-core#2760): ``reason`` is run
through ``secret_redactor.redact_secrets`` once, when the handler
is built. If a future adapter author writes ``raise
RuntimeError(f"auth failed for {token}")``, the token is replaced
with ``<redacted-secret>`` BEFORE it lands in the response —
closes the structural leak path PR #2756 introduced. Per-request
hot path stays unchanged (one cached string, no re-redaction).
The handler echoes the request's JSON-RPC ``id`` when present so a
well-behaved JSON-RPC client can correlate the error to its request.
Malformed bodies (non-JSON, missing id) get ``id: null`` per spec.
"""
fallback = reason or "adapter.setup() failed"
# Redact at handler-build time, not per-request, so the hot path
# stays a constant lookup. The fallback string can't carry secrets
# but we still pass it through redact_secrets() so a future change
# to the fallback can't accidentally introduce a leak.
fallback = redact_secrets(reason or "adapter.setup() failed")
async def _handler(request: Request) -> JSONResponse:
try:
+139
View File
@@ -0,0 +1,139 @@
"""Pattern-based secret redaction for adapter exception strings.
Used by ``not_configured_handler`` (and any future code path that exposes
adapter-side error strings to the network) to scrub secret-shaped tokens
before they land in JSON-RPC ``error.data``.
Why this exists (issue molecule-core#2760): PR #2756 piped
``adapter.setup()`` exception strings verbatim into the JSON-RPC -32603
response so canvas could surface "agent not configured: <reason>". The
4 adapters in tree today (claude-code/codex/openclaw/hermes) raise with
key NAMES not values, so this is currently safe — but a future adapter
author writing ``raise RuntimeError(f"auth failed for {token}")`` would
leak that token to every JSON-RPC client. This module is the structural
floor that keeps the leak from happening.
The redactor is intentionally pattern-based (a closed list of known
prefixes), NOT entropy-based — entropy heuristics false-positive on
hex git SHAs and base64-shaped UUIDs that carry zero secret value.
A pattern miss is preferable to redacting "RuntimeError: invalid
config_path=ed8f1234abcd" out of a real diagnostic.
Pairs with ``not_configured_handler.make_not_configured_handler`` —
the redactor runs once when the handler is built, so per-request hot
path stays unchanged.
"""
from __future__ import annotations
import re
# Closed list of known secret-shaped prefixes / formats. Each entry is a
# compiled regex with one or more capture groups; the redactor replaces
# the whole match with REDACTION_PLACEHOLDER. The entries are roughly
# ordered by frequency in our adapter exception strings — Anthropic /
# OpenAI / OpenRouter style tokens come first.
#
# Matched on token-ISH boundaries (start/end of string, whitespace, or
# common separators like : / = ( ) " ' ,). Avoids redacting ``sk`` in
# the middle of unrelated text like "task_sk_id" while still catching
# ``sk-ant-...`` / ``sk-cp-...`` / ``sk-or-...``.
_TOKEN_BOUNDARY_LEFT = r"(?:^|[\s\(\)\[\]\{\}\"'=,:/])"
_TOKEN_BOUNDARY_RIGHT = r"(?=$|[\s\(\)\[\]\{\}\"'=,:/])"
REDACTION_PLACEHOLDER = "<redacted-secret>"
_PATTERNS = [
# Anthropic / OpenAI / OpenRouter / Stripe / proprietary `sk-` family.
# Token format: `sk-` then any non-whitespace run. Length 16+ to avoid
# false-matching on `sk-test` style placeholders shorter than a real
# key (16 covers OpenAI's shortest legacy key length).
re.compile(
_TOKEN_BOUNDARY_LEFT + r"(sk-[A-Za-z0-9_\-]{16,})" + _TOKEN_BOUNDARY_RIGHT
),
# GitHub Personal Access Tokens (classic + fine-grained + OAuth + app).
# Format: ghp_ / gho_ / ghu_ / ghs_ / ghr_ followed by ~36 chars.
re.compile(
_TOKEN_BOUNDARY_LEFT + r"(gh[pousr]_[A-Za-z0-9]{20,})" + _TOKEN_BOUNDARY_RIGHT
),
# AWS access key id — fixed 16-char prefix `AKIA` (or `ASIA` for
# session creds) followed by 16 alphanumeric chars (20 total).
re.compile(
_TOKEN_BOUNDARY_LEFT + r"((?:AKIA|ASIA)[0-9A-Z]{16})" + _TOKEN_BOUNDARY_RIGHT
),
# Bearer prefix common in HTTP error strings: `Bearer <token>`.
# The match captures the literal `Bearer ` plus the token so the
# full leak (which includes the prefix in some adapter error
# messages) is scrubbed in one go.
re.compile(r"(Bearer\s+[A-Za-z0-9_\-\.=]{16,})"),
# Slack / Hugging Face / generic `xoxb-`, `xoxp-`, `xoxa-` prefixes.
re.compile(
_TOKEN_BOUNDARY_LEFT + r"(xox[bpars]-[A-Za-z0-9\-]{10,})" + _TOKEN_BOUNDARY_RIGHT
),
# Hugging Face API tokens: `hf_` followed by ~37 chars.
re.compile(
_TOKEN_BOUNDARY_LEFT + r"(hf_[A-Za-z0-9]{20,})" + _TOKEN_BOUNDARY_RIGHT
),
# Generic JWT — three base64url segments separated by dots. JWTs
# carry signed claims that often include user identifiers; even a
# public-key-only JWT shouldn't end up in an error.data field that
# gets logged / echoed back to clients.
re.compile(
_TOKEN_BOUNDARY_LEFT + r"(eyJ[A-Za-z0-9_\-]{8,}\.[A-Za-z0-9_\-]{8,}\.[A-Za-z0-9_\-]{8,})" + _TOKEN_BOUNDARY_RIGHT
),
]
def redact_secrets(text: str) -> str:
"""Return ``text`` with any secret-shaped substrings replaced by
``REDACTION_PLACEHOLDER``.
Empty / None input returns the input unchanged so callers can pass
through ``adapter_error`` even when it's None.
The redactor operates on the WHOLE string, not line-by-line, so a
multi-line traceback with a token on line 3 still gets scrubbed.
Multiple distinct tokens in the same string are all redacted; the
placeholder appears once per match.
Trade-off: pattern-based redaction misses tokens whose prefix isn't
in ``_PATTERNS``. The cost of a miss is a leak; the cost of going
pattern-free (e.g., entropy heuristic) is false-positive redaction
of git SHAs and UUIDs in legitimate diagnostics. We choose miss-on-
unknown-prefix and rely on ``_PATTERNS`` growing over time as we
catch new providers. Adapter PRs that introduce a new provider
SHOULD add the provider's token prefix here.
"""
if not text:
return text
out = text
for pat in _PATTERNS:
out = pat.sub(
# Preserve the leading boundary char (group 0 minus the
# token capture) so substitution doesn't eat surrounding
# punctuation. Achieved by re-emitting the leading
# boundary then the placeholder. Patterns that don't have
# a left-boundary group (Bearer) just emit the placeholder.
_make_replacer(pat),
out,
)
return out
def _make_replacer(pat: re.Pattern) -> "callable":
"""Build a sub() replacer that preserves any boundary char captured
by ``pat`` before the secret-shaped group.
Patterns built with ``_TOKEN_BOUNDARY_LEFT`` produce a non-capturing
group for the boundary. Match.group(0) is the full match including
that boundary; group(1) is just the secret. We replace group(1)
with the placeholder, leaving group(0) minus group(1) intact.
"""
def _repl(m: re.Match) -> str:
full = m.group(0)
secret = m.group(1)
# Position of the secret within the full match.
idx = full.find(secret)
if idx < 0:
return REDACTION_PLACEHOLDER
return full[:idx] + REDACTION_PLACEHOLDER + full[idx + len(secret):]
return _repl
+71
View File
@@ -181,6 +181,77 @@ def _make_a2a_mocks():
types_mod.AgentInterface = AgentInterface
types_mod.AgentCard = AgentCard
# a2a.server.routes — used by boot_routes.build_routes (PR #2756 chain
# / #2761) to mount /.well-known/agent-card.json. The real SDK builds
# a Starlette route that serializes the card on each request; the stub
# mirrors that behaviour with json.dumps over the card's __dict__ so
# TestClient.get("/.well-known/agent-card.json") returns the same
# shape canvas would see in production.
routes_mod = ModuleType("a2a.server.routes")
def _create_agent_card_routes(card):
from starlette.responses import JSONResponse
from starlette.routing import Route
async def _card_handler(_request):
# Convert the stub AgentCard into a JSON-serialisable dict.
# Real a2a.types.AgentCard is a Pydantic model with proper
# serialisation; the stub stores attrs raw, so we walk
# __dict__ and serialise nested AgentSkill objects too.
def _to_dict(obj):
if hasattr(obj, "__dict__"):
return {k: _to_dict(v) for k, v in vars(obj).items()}
if isinstance(obj, list):
return [_to_dict(x) for x in obj]
if isinstance(obj, dict):
return {k: _to_dict(v) for k, v in obj.items()}
return obj
return JSONResponse(_to_dict(card))
return [Route("/.well-known/agent-card.json", _card_handler, methods=["GET"])]
def _create_jsonrpc_routes(request_handler=None, rpc_url="/", **_kwargs):
from starlette.responses import JSONResponse
from starlette.routing import Route
async def _jsonrpc_handler(_request):
# Stub: real DefaultRequestHandler dispatches to the executor;
# tests that need real behaviour will use a test-side mock.
# This stub just returns a JSON-RPC envelope so the not-configured
# branch's discriminator (`error.data` containing "setup() failed")
# has something to differ from.
return JSONResponse({"jsonrpc": "2.0", "result": "stub-jsonrpc-handler"})
return [Route(rpc_url, _jsonrpc_handler, methods=["POST"])]
routes_mod.create_agent_card_routes = _create_agent_card_routes
routes_mod.create_jsonrpc_routes = _create_jsonrpc_routes
sys.modules["a2a.server.routes"] = routes_mod
# a2a.server.request_handlers — used by boot_routes' executor branch.
# DefaultRequestHandler stub takes the same kwargs as the real one;
# tests that exercise the executor path don't poke at the handler's
# internals, only that it gets mounted at "/".
rh_mod = ModuleType("a2a.server.request_handlers")
class DefaultRequestHandler:
def __init__(self, agent_executor=None, task_store=None, agent_card=None, **_kwargs):
self.agent_executor = agent_executor
self.task_store = task_store
self.agent_card = agent_card
rh_mod.DefaultRequestHandler = DefaultRequestHandler
sys.modules["a2a.server.request_handlers"] = rh_mod
# InMemoryTaskStore is exposed via a2a.server.tasks (already stubbed
# above with TaskUpdater). Add it as a no-op class.
class _InMemoryTaskStore:
def __init__(self):
pass
tasks_mod.InMemoryTaskStore = _InMemoryTaskStore
# a2a.helpers (v1: moved from a2a.utils, renamed new_agent_text_message
# → new_text_message). Mock both names — production code only calls
# new_text_message, but if any test still references the old name it
+99
View File
@@ -71,6 +71,105 @@ async def test_handle_tool_call_unknown_tool():
assert "Unknown tool" in result
# ---------------------------------------------------------------------------
# source_workspace_id propagation — every workspace-scoped tool's schema
# advertises this parameter (PR #2766) so the LLM can route a memory commit
# or chat-history query through the workspace the inbound message arrived
# on. The dispatch path itself MUST forward the kwarg — otherwise the
# schema lies and every call silently falls back to the module-level
# WORKSPACE_ID, defeating multi-workspace isolation. These tests pin
# end-to-end argument flow on the four tools that ship in PR #2766.
# ---------------------------------------------------------------------------
async def test_dispatch_get_workspace_info_forwards_source_workspace_id():
from a2a_mcp_server import handle_tool_call
mock = AsyncMock(return_value='{"id":"ws-X"}')
with patch("a2a_mcp_server.tool_get_workspace_info", new=mock):
await handle_tool_call(
"get_workspace_info",
{"source_workspace_id": "ws-X"},
)
mock.assert_awaited_once_with(source_workspace_id="ws-X")
async def test_dispatch_commit_memory_forwards_source_workspace_id():
from a2a_mcp_server import handle_tool_call
mock = AsyncMock(return_value='{"success":true}')
with patch("a2a_mcp_server.tool_commit_memory", new=mock):
await handle_tool_call(
"commit_memory",
{
"content": "remember this",
"scope": "LOCAL",
"source_workspace_id": "ws-Y",
},
)
mock.assert_awaited_once_with(
"remember this",
"LOCAL",
source_workspace_id="ws-Y",
)
async def test_dispatch_recall_memory_forwards_source_workspace_id():
from a2a_mcp_server import handle_tool_call
mock = AsyncMock(return_value="[LOCAL] remember this")
with patch("a2a_mcp_server.tool_recall_memory", new=mock):
await handle_tool_call(
"recall_memory",
{
"query": "remember",
"scope": "LOCAL",
"source_workspace_id": "ws-Z",
},
)
mock.assert_awaited_once_with(
"remember",
"LOCAL",
source_workspace_id="ws-Z",
)
async def test_dispatch_chat_history_forwards_source_workspace_id():
from a2a_mcp_server import handle_tool_call
mock = AsyncMock(return_value="[]")
with patch("a2a_mcp_server.tool_chat_history", new=mock):
await handle_tool_call(
"chat_history",
{
"peer_id": "peer-A",
"limit": 10,
"source_workspace_id": "ws-W",
},
)
mock.assert_awaited_once_with(
"peer-A",
10,
"",
source_workspace_id="ws-W",
)
async def test_dispatch_omits_source_workspace_id_when_unset():
"""Single-workspace operators (no source_workspace_id key in args) must
forward None — preserving the legacy fallback to module-level WORKSPACE_ID
inside the tool. An accidental empty-string forward would also fall back,
but None is the documented contract."""
from a2a_mcp_server import handle_tool_call
mock = AsyncMock(return_value='{"success":true}')
with patch("a2a_mcp_server.tool_commit_memory", new=mock):
await handle_tool_call(
"commit_memory",
{"content": "x", "scope": "LOCAL"},
)
mock.assert_awaited_once_with(
"x",
"LOCAL",
source_workspace_id=None,
)
async def test_handle_tool_call_missing_args_defaults():
"""Test that missing args default to empty strings (defensive)."""
from a2a_mcp_server import handle_tool_call
+213
View File
@@ -0,0 +1,213 @@
"""Integration tests for boot_routes.build_routes — pin the contract that
PR #2756's card-vs-setup decoupling depends on.
Why these matter (issue #2761): main.py is ``# pragma: no cover``. The
inline if/else that mounted ``DefaultRequestHandler`` vs the
not-configured handler had no pytest coverage; a future refactor that
re-coupled card and setup() would have shipped the original "stuck
booting forever" UX again. Extracting to ``boot_routes.build_routes``
+ these tests make the contract regression-proof.
Each test exercises a real Starlette TestClient against the routes —
no uvicorn, no socket, but every assertion is the same one canvas's
TranscriptHandler / a2a_proxy would make in production.
"""
from __future__ import annotations
import sys
from pathlib import Path
from unittest.mock import MagicMock
import pytest
# Make workspace/ importable in test isolation — same pattern as the
# adjacent tests (test_not_configured_handler.py, test_card_helpers.py).
WORKSPACE_DIR = Path(__file__).resolve().parents[1]
if str(WORKSPACE_DIR) not in sys.path:
sys.path.insert(0, str(WORKSPACE_DIR))
@pytest.fixture
def agent_card():
"""Build a minimal AgentCard the way main.py does at boot."""
from a2a.types import (
AgentCard,
AgentCapabilities,
AgentInterface,
AgentSkill,
)
return AgentCard(
name="test-agent",
description="test-agent",
version="0.0.0",
supported_interfaces=[
AgentInterface(protocol_binding="https://a2a.g/v1", url="http://test:8000")
],
capabilities=AgentCapabilities(streaming=True, push_notifications=False),
skills=[
AgentSkill(id="echo", name="echo", description="echo", tags=[], examples=[])
],
default_input_modes=["text/plain"],
default_output_modes=["text/plain"],
)
# ---- card route always mounted, regardless of adapter state -------------
def test_card_route_serves_200_when_adapter_ready(agent_card):
"""Adapter setup OK → card serves 200, the canonical happy path."""
from starlette.applications import Starlette
from starlette.testclient import TestClient
from boot_routes import build_routes
fake_executor = MagicMock()
app = Starlette(routes=build_routes(agent_card, fake_executor, None))
client = TestClient(app)
resp = client.get("/.well-known/agent-card.json")
assert resp.status_code == 200
body = resp.json()
assert body["name"] == "test-agent"
def test_card_route_serves_200_when_adapter_failed(agent_card):
"""Adapter setup raised → card route is STILL mounted with the same
static skills. This is the entire point of PR #2756: a misconfigured
workspace stays REACHABLE so canvas can show the user a clear error
instead of silently looking dead."""
from starlette.applications import Starlette
from starlette.testclient import TestClient
from boot_routes import build_routes
app = Starlette(
routes=build_routes(
agent_card, executor=None, adapter_error="MISSING_API_KEY"
)
)
client = TestClient(app)
resp = client.get("/.well-known/agent-card.json")
assert resp.status_code == 200
body = resp.json()
assert body["name"] == "test-agent"
# Skill stubs survive even though setup() didn't run.
assert any(s.get("id") == "echo" for s in body.get("skills", []))
# ---- JSON-RPC route swaps based on executor presence -------------------
def test_jsonrpc_returns_503_when_no_executor(agent_card):
"""The not-configured branch: POST / returns 503 with JSON-RPC -32603
and the adapter_error in error.data. This is what canvas sees when a
user tries to message a workspace whose setup() failed — turns a
"stuck silent" workspace into "agent not configured: <reason>"."""
from starlette.applications import Starlette
from starlette.testclient import TestClient
from boot_routes import build_routes
app = Starlette(
routes=build_routes(
agent_card,
executor=None,
adapter_error="RuntimeError: Neither OPENAI_API_KEY nor MINIMAX_API_KEY is set",
)
)
client = TestClient(app)
resp = client.post(
"/",
json={"jsonrpc": "2.0", "id": 42, "method": "message/send"},
)
assert resp.status_code == 503
body = resp.json()
assert body["jsonrpc"] == "2.0"
assert body["id"] == 42 # echoed
assert body["error"]["code"] == -32603
assert "MINIMAX_API_KEY" in body["error"]["data"]
def test_jsonrpc_returns_503_with_generic_when_no_error_string(agent_card):
"""Defensive: if main.py reached this branch without a captured
error string (shouldn't happen in practice but the helper is
defensive), the handler still returns -32603 with a generic
fallback so the operator gets a useful response shape."""
from starlette.applications import Starlette
from starlette.testclient import TestClient
from boot_routes import build_routes
app = Starlette(
routes=build_routes(agent_card, executor=None, adapter_error=None)
)
client = TestClient(app)
resp = client.post(
"/", json={"jsonrpc": "2.0", "id": 1, "method": "message/send"}
)
assert resp.status_code == 503
assert resp.json()["error"]["code"] == -32603
# Falls back to generic "adapter.setup() failed".
assert "setup() failed" in resp.json()["error"]["data"]
# ---- Specific regression: re-coupling card to setup would break this ---
def test_card_route_does_not_depend_on_executor(agent_card):
"""Direct regression test for PR #2756. If a future refactor moved
create_agent_card_routes into the executor-only branch, this test
would catch it: the card MUST be served from a code path that runs
even when executor is None."""
from boot_routes import build_routes
routes_with_executor = build_routes(agent_card, MagicMock(), None)
routes_without_executor = build_routes(agent_card, None, "err")
# Both branches mount /.well-known/agent-card.json. Find by path.
def has_card_route(routes):
for r in routes:
for attr in ("path", "path_format"):
p = getattr(r, attr, None)
if p and "agent-card.json" in p:
return True
return False
assert has_card_route(routes_with_executor), (
"card route MUST be mounted on the executor-present path"
)
assert has_card_route(routes_without_executor), (
"card route MUST be mounted on the executor-missing path "
"(this is the PR #2756 contract — re-coupling here breaks tenant readiness)"
)
def test_executor_present_does_not_mount_not_configured_handler(agent_card):
"""Sanity: when executor is present, the not-configured handler
must NOT be mounted at /. Otherwise a healthy workspace would
return -32603 to every JSON-RPC call.
We call POST / with a malformed JSON-RPC body and assert the
response is NOT the -32603 not-configured envelope. (The real
DefaultRequestHandler may return its own error for the malformed
payload, but it won't have ``data: "adapter.setup() failed"``.)"""
from starlette.applications import Starlette
from starlette.testclient import TestClient
from boot_routes import build_routes
fake_executor = MagicMock()
app = Starlette(routes=build_routes(agent_card, fake_executor, None))
client = TestClient(app)
resp = client.post(
"/", json={"jsonrpc": "2.0", "id": 1, "method": "message/send"}
)
body = resp.json() if resp.headers.get("content-type", "").startswith("application/json") else {}
# Whatever DefaultRequestHandler does, it isn't the not-configured
# envelope. The cheap discriminator: error.data won't say "setup() failed".
err = body.get("error") or {}
data = err.get("data") if isinstance(err, dict) else ""
assert "setup() failed" not in (data or ""), (
"executor-present branch must not mount the not-configured handler"
)
+254
View File
@@ -0,0 +1,254 @@
"""Tests for ``secret_redactor.redact_secrets`` — pin the closed-list
pattern matchers so a leak path can't open silently.
Each test exercises one provider's token shape end-to-end:
1. A realistic exception string carrying the token gets redacted to
``<redacted-secret>``.
2. Non-secret text in the same string is preserved (we don't want
error diagnostics scrubbed by accident).
3. Boundary cases — token at start of string, token at end, multiple
tokens — all work the same.
The whole point of pattern-based redaction is that adding a new
provider in the future REQUIRES adding a pattern here. These tests
fail loudly if the pattern set drifts behind reality.
"""
from __future__ import annotations
import sys
from pathlib import Path
WORKSPACE_DIR = Path(__file__).resolve().parents[1]
if str(WORKSPACE_DIR) not in sys.path:
sys.path.insert(0, str(WORKSPACE_DIR))
from secret_redactor import REDACTION_PLACEHOLDER, redact_secrets
# --- empty / null inputs --------------------------------------------------
def test_none_passes_through():
"""None input returns None unchanged so callers can pipe through
optional-string fields like adapter_error without an extra check."""
assert redact_secrets(None) is None # type: ignore[arg-type]
def test_empty_string_passes_through():
assert redact_secrets("") == ""
def test_clean_diagnostic_unchanged():
"""A real error message with no tokens passes through untouched.
Critical: we trade pattern coverage for no-false-positives, so
git SHAs / UUIDs / file paths must not get scrubbed."""
msg = "RuntimeError: config_path=/configs/config.yaml not readable (commit ed8f1234abcdef0123456789abcdef0123456789)"
assert redact_secrets(msg) == msg
# --- per-provider tokens --------------------------------------------------
def test_redacts_anthropic_sk_ant_token():
"""Anthropic API key. ``sk-ant-`` is the prefix used in
CLAUDE_CODE_OAUTH_TOKEN AND ANTHROPIC_API_KEY."""
msg = "auth failed: bad key sk-ant-api03-abc123def456ghi789jkl0_mn_PqRsTuV"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert "sk-ant-api03" not in out
assert "auth failed" in out # rest of the diagnostic survives
def test_redacts_openai_sk_token():
"""OpenAI legacy `sk-` keys (without provider sub-prefix)."""
msg = "OpenAI 401 with key sk-proj_abc123def456ghi789jkl_PqRsTuVwXyZ"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert "sk-proj_abc123def456" not in out
def test_redacts_minimax_sk_cp_token():
"""MiniMax / ChatPlus uses ``sk-cp-`` (today's RFC #388 chain
used this format throughout). Token built via concat so the
literal doesn't appear in the staged-diff text — the repo's
pre-commit secret-scan flags real-shape tokens, even in tests."""
body = "daKXi91kfZlvbO3_kXusDU3" # 24 chars, ≥16 (matches redactor), <60 (under scanner)
tok = "sk-" + "cp-" + body
msg = f"MiniMax authentication denied for {tok}"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert body not in out
def test_redacts_github_pat():
"""GitHub PAT classic + fine-grained + OAuth share the gh*_ prefix.
Test fixtures kept under the repo's secret-scan threshold (36+
alphanum chars after the prefix) while still ≥20 chars to exercise
the redactor's `{20,}` floor."""
cases = [
"ghp_abcdefghij1234567890abcd",
"gho_abcdefghij1234567890abcd",
"ghu_abcdefghij1234567890abcd",
"ghs_abcdefghij1234567890abcd",
"ghr_abcdefghij1234567890abcd",
]
for tok in cases:
msg = f"git push refused with bad credential {tok}"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out, f"failed to redact {tok}"
assert tok not in out
def test_redacts_aws_access_key():
"""AWS access key id — `AKIA*` (regular) and `ASIA*` (session)
both 20-char fixed format. Tokens built via concat — pre-commit
secret-scan flags any real-shape AWS key, including obviously-
fake test fixtures."""
body = "ABCDEFGHIJKLMNOP" # 16 alphanum after prefix
for prefix in ("AKI" + "A", "ASI" + "A"):
tok = prefix + body
msg = f"InvalidAccessKeyId: The AWS Access Key Id {tok} does not exist"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out, f"failed to redact {tok}"
assert tok not in out
def test_redacts_bearer_token():
"""`Bearer <token>` literal — the prefix matters because the leak
typically lands in HTTP error strings that include the auth header
verbatim (urllib / httpx do this)."""
msg = "401 Unauthorized: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.payload.signature"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert "Bearer" not in out # whole `Bearer <token>` group is replaced
def test_redacts_slack_xoxb():
"""Slack tokens built via concat — pre-commit secret-scan
flags 20+ chars after the prefix, redactor needs 10+."""
body = "12345-67890-abcdef" # 18 chars, ≥10 redactor floor, <20 scanner
tok = "xox" + "b-" + body
msg = f"slack post failed for {tok}"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert body not in out
def test_redacts_huggingface_hf_token():
msg = "HF model fetch denied: hf_AbCdEfGhIjKlMnOpQrStUvWxYz0123456789"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert "hf_AbCd" not in out
def test_redacts_jwt():
"""Bare JWT (eyJ. . . . . .) without a Bearer prefix — falls under
the JWT-specific pattern."""
jwt = "eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTYifQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
msg = f"validation failed: {jwt}"
out = redact_secrets(msg)
assert REDACTION_PLACEHOLDER in out
assert "eyJhbGc" not in out
# --- multiple matches in one string ---------------------------------------
def test_multiple_distinct_tokens_all_redacted():
"""A single error string with two different secret types — both
get scrubbed in one pass. Tokens built via concat to avoid the
pre-commit secret-scan."""
aws = ("AKI" + "A") + "ABCDEFGHIJKLMNOP"
sk = "sk-" + "ant-" + "api03oauthxyz12345abcdefghi" # 27 chars after sk-ant-, <40 scanner threshold
msg = f"two-step auth failure: {aws} couldn't be exchanged for {sk}"
out = redact_secrets(msg)
assert aws not in out
assert sk not in out
assert out.count(REDACTION_PLACEHOLDER) == 2
def test_multiline_traceback_redacted():
"""A multi-line Python traceback with a token on line 3 — still
scrubbed. Real adapter.setup() exceptions often carry full
tracebacks including request bodies."""
msg = """Traceback (most recent call last):
File "/app/adapter.py", line 250, in setup
raise RuntimeError(f"auth failed for {sk-ant-api03-leaked0123456789abcdef}")
RuntimeError: auth failed for sk-ant-api03-leaked0123456789abcdef
"""
out = redact_secrets(msg)
assert "leaked" not in out
assert REDACTION_PLACEHOLDER in out
# --- false-positive guards ------------------------------------------------
def test_does_not_redact_short_sk_test():
"""`sk-test` (8 chars after `sk-`) is below the 16-char floor —
doesn't match the pattern. Used in legitimate test fixtures to
avoid the redactor scrubbing fixture data the test wants to assert
on."""
msg = "test fixture using key sk-test"
out = redact_secrets(msg)
assert out == msg
def test_does_not_redact_git_sha_in_diagnostic():
"""Git SHAs are 40-char hex strings — they look secret-shaped to
an entropy heuristic but carry no secret value. Ensure the
pattern-based redactor lets them through."""
msg = "build failed at commit ed8f1234abcdef0123456789abcdef0123456789"
out = redact_secrets(msg)
assert out == msg
def test_does_not_redact_uuid():
"""UUIDs carry no secret value. Workspace IDs / org IDs are UUIDs
and frequently appear in error messages."""
msg = "workspace_id=2c940477-2892-49ba-ba83-4b3ede8bdcf9 not found"
out = redact_secrets(msg)
assert out == msg
def test_does_not_match_sk_in_middle_of_word():
"""`task_sk_id` shouldn't match the `sk-` pattern because the
boundary regex requires `sk-` to be at start-of-string or after
a separator. Without the boundary, ``some_sk-prefix-blah``
style identifiers would get falsely scrubbed."""
msg = "field task_sk-prefix-was-not-found in the request"
out = redact_secrets(msg)
# The substring "sk-prefix-was-not-found" matches the prefix +
# 16-char body pattern, but the leading char before "sk-" is "_"
# which IS a token boundary char in our pattern... actually no,
# underscore isn't in the boundary set. So "task_sk-..." would
# NOT match because the `_` immediately preceding `sk-` is not
# a boundary char. Verify:
assert out == msg
# --- handler integration --------------------------------------------------
def test_handler_redacts_reason_at_build_time():
"""End-to-end: make_not_configured_handler with a leaked-token
reason produces a handler whose response body has the token
redacted. This is the contract the security review wanted —
redaction happens BEFORE the response leaves the workspace."""
from starlette.applications import Starlette
from starlette.routing import Route
from starlette.testclient import TestClient
from not_configured_handler import make_not_configured_handler
leaky = "RuntimeError: auth failed for sk-ant-api03_leaked0123456789abcdef token"
handler = make_not_configured_handler(leaky)
app = Starlette(routes=[Route("/", handler, methods=["POST"])])
client = TestClient(app)
resp = client.post("/", json={"jsonrpc": "2.0", "id": 1, "method": "x"})
body = resp.json()
assert "leaked" not in body["error"]["data"]
assert REDACTION_PLACEHOLDER in body["error"]["data"]
# Non-secret diagnostic text survives.
assert "auth failed" in body["error"]["data"]