Compare commits

...

23 Commits

Author SHA1 Message Date
molecule-ai[bot] 226e57a942 Merge pull request #2931 from Molecule-AI/staging
staging → main: auto-promote a1d2027
2026-05-05 21:03:11 +00:00
Hongming Wang e9eb3868d5 Merge pull request #2930 from Molecule-AI/docs/python-version-callout-mcp-snippet
docs: callout Python>=3.11 on Universal MCP snippet + runtime doc
2026-05-05 20:48:31 +00:00
Hongming Wang cb70d3d437 docs: callout Python>=3.11 requirement on Universal MCP install snippet
User-reported friction: pip install molecule-ai-workspace-runtime on a
3.10 interpreter fails with "Could not find a version that satisfies the
requirement (from versions: none)" — pip's requires_python filter
silently drops the only available artifact before attempting install,
so the error doesn't mention Python at all. Operators see
"package missing", file a bug, and chase a phantom CDN/visibility
issue.

Two changes mirror the requirement at the two operator-touch surfaces:

1. workspace-server/internal/handlers/external_connection.go:
   the externalUniversalMcpTemplate snippet (rendered into the
   canvas Connect-External-Agent modal) now leads with a brief
   "Requires Python >= 3.11" block + diagnostic + upgrade paths.

2. docs/workspace-runtime-package.md: same callout at the top of
   the doc, before the Overview, so anyone landing here from search
   gets the answer immediately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:44:25 -07:00
Hongming Wang a1d202723d Merge pull request #2929 from Molecule-AI/auto-sync/main-ef67dc51
chore: sync main → staging (auto, ff to ef67dc51)
2026-05-05 13:42:55 -07:00
Hongming Wang fc30b5c9de Merge pull request #2905 from Molecule-AI/fix/poll-path-message-enrichment
fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
2026-05-05 20:36:41 +00:00
molecule-ai[bot] ef67dc513e Merge pull request #2928 from Molecule-AI/staging
staging → main: auto-promote 2bf6a70
2026-05-05 20:33:52 +00:00
Hongming Wang 23d3f057d3 Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c
refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c)
2026-05-05 20:31:45 +00:00
Hongming Wang 8ca027ddf3 fix(tests): drop unused json + pytest imports
Bot lint flagged the two imports as unused (correct — neither is
referenced after the file shrank during review). Resolves the two
unresolved review threads silently blocking merge per the staging
"all conversations resolved" gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:26:49 -07:00
Hongming Wang 46a4ef83bb fix(tests): patch a2a_tools_memory.httpx, not a2a_tools.httpx
Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into
a2a_tools_memory.py, which has its own top-level `import httpx`.
test_mcp_memory.py + the secret-redact memory tests still patched
`a2a_tools.httpx.AsyncClient`, which after the move is the WRONG
module's reference — the real call inside the moved tool resolves to
`a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI
catches this as 7 failures: JSONDecodeError on empty bodies and
"All connection attempts failed" on the recall side.

Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The
existing tests in `test_a2a_tools_impl.py` were already updated by
the iter-4c PR; only these two files were missed.

Verified: pytest workspace/tests/test_mcp_memory.py +
test_secret_redact.py — 43/43 pass after the fix (both files were
red on the iter-4c branch CI).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:25:06 -07:00
Hongming Wang a6afc18de5 Merge pull request #2927 from Molecule-AI/fix/org-import-polish-2872
fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc (#2872)
2026-05-05 20:25:01 +00:00
Hongming Wang 423d58d42c fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc
Three small hardening passes from #2872's optional/important findings,
batched into one polish PR:

1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows.
   The bare equality breaks if any future caller wraps the error via
   fmt.Errorf("…: %w", err) — the no-rows happy path would fall
   through to the "real DB error" branch and abort the import.
   errors.Is unwraps. New test
   TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the
   fix; verified the test fails on the old `==` shape (build break
   on unused-import + assertion failure once import dropped).

2. Bounded 5s timeout on lookupExistingChild instead of
   context.Background().
   The createWorkspaceTree call site runs in goroutines spawned from
   the /org/import handler, so plumbing the request context here
   would cascade-cancel into provisionWorkspaceAuto and abort
   in-flight EC2 provisioning if the client disconnected mid-import
   — that's the wrong tradeoff. A short bounded timeout protects the
   per-row SELECT against a wedged DB without taking the
   drop-everything-on-disconnect behaviour. The lookup is a single
   ~10ms query; 5s leaves 500x headroom for transient slow paths.

3. Godoc clarifications on the skip-path block.
   - /org/import is ADDITIVE-ONLY, never destructive. Children
     present in the existing tree but absent from the new template
     are preserved (no DELETE on diff).
   - Skip-path does NOT propagate updates to existing nodes — a
     re-import that adds an initial_memory or schedule to an
     existing workspace is silently dropped. Document the limitation
     so future operators know to delete-and-re-import or reach for
     a future /org/sync route.

Verification:
  - go build ./... → clean
  - go test ./internal/handlers/... → all passing (TestLookup* +
    TestCreateWorkspaceTree* + TestClass1* + TestGate*)
  - 4 lookup tests + 1 new wrap-safety test → 5/5 PASS
  - Full handlers suite → green

Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc
clarifications for additive-only + skip-path-update-limitation)

Out of scope (deferred):
  - PR-D partial unique index migration + ON CONFLICT — sequenced
    after Phase 4 cleanup verified clean per #2872 plan
  - PR-E full createWorkspaceTree integration test for partial-match
    — needs heavier sqlmock scaffolding for downstream
    workspaces_audit/canvas_layouts/secrets/channels INSERTs;
    follow-up

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:20:54 -07:00
Hongming Wang 9386f1d399 Merge pull request #2926 from Molecule-AI/fix/agent-comms-display-parity
fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
2026-05-05 20:15:29 +00:00
Hongming Wang a766e5ce48 Merge pull request #2925 from Molecule-AI/e2e/poll-mode-chat-upload-tests
test(e2e): poll-mode chat upload E2E in standard suite
2026-05-05 20:13:27 +00:00
Hongming Wang 5ad2669f88 fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
User-visible problem: agent-comms panel opens mid-conversation on long
histories (the same chat-opens-in-middle bug PR #2903 fixed for
my-chat) and silently renders empty state when the history fetch fails
(no retry button, no diagnostic).

Three changes mirror the my-chat patterns from ChatTab:

1. Initial-mount instant scroll.
   Adds hasInitialScrollRef + switches the scroll hook from useEffect
   to useLayoutEffect. First arrival of messages → scrollIntoView
   `instant`; subsequent appends → `smooth` as before. useLayoutEffect
   runs before paint so the user never sees the panel jump for one
   frame on every append.

2. Error UI with Retry button.
   Adds `loadError` state. The history-load .catch now sets the
   error message; a new branch in the render renders a red alert
   with the failure text and a Retry button that re-invokes
   `loadInitial`. Same shape as ChatTab MyChatPanel's `loadError`
   handling — both surfaces should fail loud, not silent.

3. Extracted `loadInitial` callback.
   The history-load body becomes a useCallback so the retry button
   has a stable reference to call. Mirrors ChatTab's loadInitial.

Tests (4 new in AgentCommsPanel.render.test.tsx):
- Loading state renders the loading copy.
- Error state with Retry button renders on rejection; clicking
  Retry fires a second api.get.
- Empty state renders when load succeeds with zero rows.
- scrollIntoView is called with behavior=instant on first message
  arrival (pins the chat-opens-in-middle prevention).

Verification:
- pnpm test → 1284/1284 pass (1280 prior + 4 new)
- tsc --noEmit → clean
- 92 → 93 test files, no existing test broken

Closes the parity gap raised in chat. The two surfaces now share:
loading copy / error UI / empty-state placeholder / scroll behaviour /
useLayoutEffect timing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:09:36 -07:00
Hongming Wang 0ca4e431c1 test(e2e): add poll-mode chat upload E2E and wire into e2e-api.yml
Covers the user-visible flow that Phase 1-5b shipped (RFC #2891):
register a poll-mode workspace, POST a multi-file /chat/uploads, verify
the activity feed shows one chat_upload_receive row per file, fetch the
bytes via /pending-uploads/:fid/content, ack each row, and confirm a
post-ack fetch returns 404. Also pins cross-workspace bleed protection
(workspace B's bearer on A's URL → 401, B's URL with A's file_id →
404) and the file_id-UUID-parse 400 path.

23 assertions, all green against a local platform (Postgres+Redis+
platform-server stack matches the e2e-api.yml CI recipe verbatim).

Why a new script instead of extending test_poll_mode_e2e.sh: that
script tests A2A short-circuit + since_id cursor semantics; this one
tests the chat-upload path. They share zero handler code on the
platform side and would dilute each other's failure messages if
combined.

Why not the bearerless-401 strict-mode assertion: the platform's
wsauth fail-opens for bearerless requests when MOLECULE_ENV=development
(see middleware/devmode.go). The CI workflow doesn't set that var, but
some local-dev .env files do — the assertion would flap by environment
without testing the poll-mode upload contract. The middleware's own
unit tests cover strict-mode 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:08:55 -07:00
Hongming Wang 2bf6a7005f Merge pull request #2923 from Molecule-AI/feat/class1-ast-gate-2867
test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
2026-05-05 20:04:59 +00:00
Hongming Wang 16ead69641 Merge pull request #2913 from Molecule-AI/fix-saas-logout-ui-missing
fix(canvas): wire SaaS Sign-out button — /cp/auth/signout was unreachable from UI
2026-05-05 20:03:53 +00:00
Hongming Wang 60afcd43c9 test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
Adds class1_ast_gate_test.go — a per-package AST walk that fails the
build if any handler function INSERTs INTO workspaces inside a range
loop body without one of three escape hatches:

  1. A call to a registered preflight helper (lookupExistingChild today;
     extend preflightCallNames as new helpers are introduced).
  2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT,
     like registry.go).
  3. An explicit `// class1-gate: idempotent-by-design` comment in the
     function body (deliberately awkward — forces a code-review beat).

Why this is broader than the existing
TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in
org_import_idempotency_test.go: that one is hard-coded to one function
in one file. This one walks every non-test .go file in the handlers
package and applies a structural rule independent of file/function
names. A future handler written from scratch in a new file would not
have been covered before — now it is.

Detection mechanism (per AST):
  - Collect spans (Lbrace..Rbrace) of every RangeStmt body in each
    function. Position-based instead of stack-based — ast.Inspect's
    nil-callback ordering doesn't give per-node pop semantics, so a
    naive push/pop stack silently miscounts. Position spans are
    deterministic.
  - Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(`
    (tightened from bytes.Index "INSERT INTO workspaces" so
    workspaces_audit literals don't false-positive — same regex used
    by the existing createWorkspaceTree gate).
  - For each match: record insertLine, hasONCONFLICT, and the
    innermost enclosing RangeStmt line (or 0 if not inside any range).
  - Fail the function if INSERT is inside a range AND no preflight
    AND no ON CONFLICT AND no allowlist annotation.

Self-tests (per `feedback_assert_exact_not_substring.md` —
verify gate fails on the bug shape before merging):
  - TestClass1_GateFiresOnSyntheticBuggySource: synthetic source
    where INSERT is inside `for _, child := range children` body
    must trigger the gate's three guards (enclosingRangeLine!=0,
    hasONCONFLICT=false, no preflight call).
  - TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT
    must NOT trigger the gate (idempotent UPSERT case).
  - TestClass1_GateAllowsAllowlistAnnotation: function with
    `// class1-gate: idempotent-by-design` must be skipped.
  - TestClass1_NoUnpreflightedInsertInsideRange: production sweep
    over every handler .go file. Currently passes because
    org_import.go preflights, registry.go ON-CONFLICTs, and
    workspace.go's Create has no INSERT inside a range body.

Verification:
  - go test ./internal/handlers/... -run TestClass1_ -count=1
    → 4/4 PASS
  - go test ./internal/handlers/... -count=1 → suite green
    (no pre-existing test broken by the new file)

Refs molecule-core#2867 (PR-A Class 1 generic AST gate)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:01:34 -07:00
Hongming Wang 9a53529047 ci: retrigger after stuck Canvas tabs E2E (was running 17min vs typical <1min on staging) 2026-05-05 12:38:09 -07:00
Hongming Wang 575f893f4e fix(canvas): consume CP logout_url to break the SSO re-auth loop
Follow-up to molecule-controlplane#485. The first half of #2913 wired
a Sign-out button + signOut() helper that POSTed /cp/auth/signout, but
clicking still left the user signed in: WorkOS's browser cookie
preserved the SSO session, /cp/auth/login auto-re-authed via SSO, and
the user landed back on /orgs.

CP PR #485 returns the AuthKit hosted logout URL in the signout
response. This change has signOut() navigate the browser there
instead of /cp/auth/login. AuthKit clears its cookie + redirects to
return_to (configured server-side from APP_URL) → next /cp/auth/login
hits a fresh AuthKit, no SSO session, login form actually shows.

Defensive parsing: malformed JSON, missing logout_url, or wrong-type
logout_url all fall through to the legacy /cp/auth/login fallback,
which works locally (DisabledProvider, dev) where there's no SSO to
escape.

Forward-compat: when CP doesn't have #485 deployed yet, signOut()
sees logout_url="" or missing → fallback fires. Order of merge
between this and #485 doesn't matter, but the bug isn't actually
fixed end-to-end until both ship.

Tests added (3 new, 15 total auth.test.ts):
- Hosted logout: navigates to logout_url when response includes one.
- DisabledProvider path: falls back to /cp/auth/login when "".
- Defensive: malformed JSON body → fallback (no crash).
- Defensive: non-string logout_url → fallback (no open redirect).

Verified:
- npx vitest run src/lib/__tests__/auth.test.ts — 15/15 pass
- tsc --noEmit clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:21:49 -07:00
Hongming Wang 4cac4e7710 fix(canvas): wire SaaS Sign-out button — POST /cp/auth/signout was unreachable from the UI
Reported externally on 2026-05-05: "SaaS app logout does not work."

Root cause: the control plane has had POST /cp/auth/signout (clears the
WorkOS session cookie + revokes at the provider) since auth shipped,
but no canvas code ever called it. grep across canvas/ for
`logout|signOut|signout|sign-out` returned zero results — no helper,
no button, no menu entry. Users had no path to log out short of
clearing cookies in DevTools.

This is a UI gap, not a backend bug. Adding the missing pieces:

1. `signOut()` helper in `canvas/src/lib/auth.ts`:
   - POST /cp/auth/signout with credentials:include (cross-origin
     cookie required for tenant subdomain → app subdomain)
   - Best-effort: a 5xx, 401-stale-cookie, or network failure still
     redirects the browser to /cp/auth/login. Leaving the user on an
     authed-looking page after they clicked Sign out is the worst
     possible UX — that's the precise "logout doesn't work" symptom
     the report described.
   - Lands on /cp/auth/login (not the current URL) so the user
     doesn't loop back into the org they just left via AuthGate's
     return_to.

2. `AccountBar` component on /orgs page Shell — renders the signed-in
   email + Sign-out button at the top. Click → signOut() →
   `Signing out…` → bounces to login. Disabled-while-pending so a
   double-click can't fire two requests.

3. Tests in `auth.test.ts` (4 new, total 12 pass):
   - POSTs to the right endpoint with credentials:include
   - Redirects to /cp/auth/login after success
   - Redirects EVEN ON network failure (the critical UX invariant)
   - Redirects on 401 (stale cookie path)

The auth-origin resolution (`getAuthOrigin`) is reused so a tenant
subdomain (acme.moleculesai.app) correctly POSTs to
app.moleculesai.app/cp/auth/signout — same chain that fetchSession
+ redirectToLogin already use.

Test plan:
- [x] `npx vitest run src/lib/__tests__/auth.test.ts` — 12/12 green
- [x] `tsc --noEmit` — clean
- [ ] Manual: navigate to /orgs, click Sign out, observe redirect +
      that the next /orgs visit bounces to login (cookie cleared)
- [ ] CI green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:20:18 -07:00
Hongming Wang 3d0a7c381b fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
Reported: agents receiving messages via inbox_peek / wait_for_message
get a plain envelope — text + peer_id + kind only. The push-path
(a2a_mcp_server._build_channel_notification) already enriches the
meta dict with peer_name, peer_role, and agent_card_url from the
registry cache, but the poll-path returns InboxMessage.to_dict()
unchanged. So a Claude Code host with channel-push gets the friendly
identity, but every other MCP client (and Claude Code with push
disabled — the universal default) sees plain text.

This silently breaks the contract documented in
a2a_mcp_server.py:303-345:

> In both paths the same fields apply: kind, peer_id, peer_name,
> peer_role, agent_card_url, activity_id

Fix: a2a_tools._enrich_inbound_for_agent() — same shape as the
push-path's enrichment, called from tool_inbox_peek and
tool_wait_for_message. Cache-first non-blocking (5-min TTL via
enrich_peer_metadata_nonblocking, same helper push uses), so a cache
miss returns immediately with bare envelope and warms the cache for
the next poll. agent_card_url is constructable from peer_id alone
and surfaces even on cache miss, so the receiving agent always has
a single endpoint to hit for capabilities.

Degradation paths:
- canvas_user (peer_id="") → pass through unchanged, no enrichment
- a2a_client unavailable (test harness without registry) → bare
  envelope, agent still gets text + peer_id + kind + activity_id

Tests:
- canvas_user passes through unchanged
- peer_agent cache hit → name + role + agent_card_url all present
- peer_agent cache miss → agent_card_url still constructed
- a2a_client unavailable → bare envelope, no crash

All 4 pass against fixed code. Without the fix, the cache-hit and
cache-miss tests would fail (peer_name/peer_role/agent_card_url keys
absent from to_dict's output).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:08:14 -07:00
Hongming Wang 210a26d31a refactor(workspace): extract memory tools from a2a_tools.py to a2a_tools_memory.py (RFC #2873 iter 4c)
Third slice of the a2a_tools.py split (stacked on iter 4b). Owns the
two persistent-memory MCP tools:

  * tool_commit_memory — write to /workspaces/:id/memories with RBAC
    + GLOBAL-scope tier-zero enforcement
  * tool_recall_memory — search /workspaces/:id/memories with RBAC

a2a_tools.py shrinks from 609 → 508 LOC (−101). Both handlers depend
ONLY on a2a_tools_rbac (iter 4a), a2a_client, and the platform's
/memories endpoint — no entanglement with delegation or messaging.

Side-effects of the layered architecture: a2a_tools_memory's import
contract is "depends on a2a_tools_rbac, never on a2a_tools" — the
kitchen-sink module is for back-compat re-exports only. A test pins
this so a future refactor that re-introduces `from a2a_tools import …`
fails in CI.

Tests:
  * 49 patch sites in TestToolCommitMemory + TestToolRecallMemory
    retargeted from `a2a_tools.{_check_memory_*, _is_root_workspace,
    httpx.AsyncClient}` to `a2a_tools_memory.…` because the call sites
    moved.
  * test_a2a_tools_memory.py adds 4 new tests (alias drift gate +
    import-contract + a2a_tools-side re-export).

117 tests total (77 impl + 28 rbac + 8 delegation + 4 memory), all green.

Refs RFC #2873.
2026-05-05 09:50:39 -07:00
20 changed files with 1759 additions and 177 deletions
+3
View File
@@ -172,6 +172,9 @@ jobs:
- name: Run poll-mode + since_id cursor E2E (#2339)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_e2e.sh
- name: Run poll-mode chat upload E2E (RFC #2891)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_chat_upload_e2e.sh
- name: Dump platform log on failure
if: failure() && needs.detect-changes.outputs.api == 'true'
run: cat workspace-server/platform.log || true
+47 -3
View File
@@ -18,7 +18,7 @@
// quick bounce between signup and either Checkout or the tenant UI.
import { useEffect, useState } from "react";
import { fetchSession, redirectToLogin, type Session } from "@/lib/auth";
import { fetchSession, redirectToLogin, signOut, type Session } from "@/lib/auth";
import { PLATFORM_URL } from "@/lib/api";
import { formatCredits, pillTone, bannerKind } from "@/lib/credits";
import { TermsGate } from "@/components/TermsGate";
@@ -129,7 +129,7 @@ export default function OrgsPage() {
return <EmptyState banner={justCheckedOut ? <CheckoutBanner /> : null} />;
}
return (
<Shell>
<Shell session={session}>
{justCheckedOut && <CheckoutBanner />}
<ul className="space-y-3">
{orgs.map((o) => (
@@ -160,11 +160,21 @@ function CheckoutBanner() {
);
}
function Shell({ children }: { children: React.ReactNode }) {
function Shell({
children,
session,
}: {
children: React.ReactNode;
// Optional: when present, the header renders the signed-in email +
// a Sign-out button. The empty-state Shell call doesn't have a
// session in scope, so accept null and skip the header chrome there.
session?: Session | null;
}) {
return (
<main className="min-h-screen bg-surface text-ink">
<TermsGate>
<div className="mx-auto max-w-2xl px-6 pt-20 pb-12">
{session ? <AccountBar session={session} /> : null}
<h1 className="text-3xl font-bold text-ink">Your organizations</h1>
<p className="mt-2 text-ink-mid">
Each org is an isolated Molecule workspace.
@@ -177,6 +187,40 @@ function Shell({ children }: { children: React.ReactNode }) {
);
}
// AccountBar renders the signed-in email + a Sign-out button at the
// top of the page. Without this the user has no way to log out — the
// /cp/auth/signout endpoint exists on the control plane but no UI ever
// called it. Reported externally on 2026-05-05; this is the fix.
//
// Click → calls signOut() which POSTs /cp/auth/signout (clears the
// WorkOS session cookie + revokes at the provider) then bounces to
// /cp/auth/login. The signOut helper is best-effort — even on a 5xx
// or network failure the redirect fires so the user never gets stuck
// on an authed-looking page after they clicked Sign out.
function AccountBar({ session }: { session: Session }) {
const [signingOut, setSigningOut] = useState(false);
return (
<div className="mb-6 flex items-center justify-between text-sm text-ink-mid">
<span title="Signed-in user">{session.email}</span>
<button
type="button"
disabled={signingOut}
onClick={async () => {
setSigningOut(true);
await signOut();
// Redirect happens inside signOut; this line is for tests +
// edge cases (jsdom, blocked navigation) where it doesn't.
setSigningOut(false);
}}
className="rounded border border-line bg-surface-card px-3 py-1 text-xs text-ink hover:bg-surface-card disabled:opacity-50"
aria-label="Sign out"
>
{signingOut ? "Signing out…" : "Sign out"}
</button>
</div>
);
}
// DataResidencyNotice surfaces where workspace data lives so EU-based
// signups can make an informed choice (GDPR Art. 13 disclosure
// requirement). Plain text, no icon — the goal is clarity, not
@@ -1,6 +1,6 @@
"use client";
import { useState, useEffect, useMemo, useRef } from "react";
import { useState, useEffect, useLayoutEffect, useMemo, useRef, useCallback } from "react";
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { api } from "@/lib/api";
@@ -184,13 +184,23 @@ function unwrapErrorText(raw: string | null): string {
export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
const [messages, setMessages] = useState<CommMessage[]>([]);
const [loading, setLoading] = useState(true);
const [loadError, setLoadError] = useState<string | null>(null);
// Dedup by timestamp+type+peer to handle API load + WebSocket race
const seenKeys = useRef(new Set<string>());
const bottomRef = useRef<HTMLDivElement>(null);
// Mirrors the my-chat scroll behaviour from ChatTab (PR #2903) —
// smooth-scroll on a long history gets interrupted by concurrent
// renders and lands the panel mid-conversation. Switch the first
// arrival to instant; subsequent appends animate.
const hasInitialScrollRef = useRef(false);
// Load history
useEffect(() => {
// Load history. Extracted so the error-state retry button can
// re-invoke without remount. ChatTab uses the same shape
// (loadInitial → loadError state → retry button).
const loadInitial = useCallback(() => {
setLoading(true);
setLoadError(null);
seenKeys.current.clear();
api.get<ActivityEntry[]>(`/workspaces/${workspaceId}/activity?source=agent&limit=50`)
.then((entries) => {
const filtered = (entries ?? [])
@@ -234,10 +244,15 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
// the .then body) — the panel just sat on the empty state
// with zero signal.
console.warn("AgentCommsPanel: load activity failed", err);
setLoadError(err instanceof Error ? err.message : String(err));
setLoading(false);
});
}, [workspaceId]);
useEffect(() => {
loadInitial();
}, [loadInitial]);
// Live updates routed through the global ReconnectingSocket. The
// previous pattern of `new WebSocket(WS_URL)` per panel had no
// onclose / no reconnect, so any drop (idle timeout, browser
@@ -358,7 +373,18 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
} catch { /* ignore */ }
});
useEffect(() => {
// useLayoutEffect (not useEffect) so the scroll runs BEFORE paint —
// otherwise the user sees the panel jump for one frame on every
// append. Mirrors ChatTab's MyChatPanel scroll block.
useLayoutEffect(() => {
if (!hasInitialScrollRef.current && messages.length > 0) {
// Instant on first arrival — smooth-scroll on a long history
// gets interrupted by concurrent renders and lands the panel
// mid-conversation (the chat-opens-in-middle bug class).
hasInitialScrollRef.current = true;
bottomRef.current?.scrollIntoView({ behavior: "instant" as ScrollBehavior });
return;
}
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
@@ -366,6 +392,27 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
return <div className="text-xs text-ink-soft text-center py-8">Loading agent communications...</div>;
}
if (loadError !== null && messages.length === 0) {
// Mirrors ChatTab my-chat error UI — surfaces the load failure
// with a retry button instead of silently rendering empty state.
return (
<div
role="alert"
className="mx-2 mt-2 rounded-lg border border-red-800/50 bg-red-950/30 px-3 py-2.5"
>
<p className="text-[11px] text-bad mb-1.5">
Failed to load agent communications: {loadError}
</p>
<button
onClick={loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors"
>
Retry
</button>
</div>
);
}
if (messages.length === 0) {
return (
<div className="text-xs text-ink-soft text-center py-8">
@@ -0,0 +1,115 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor } from "@testing-library/react";
// API mock — tests can override per case via apiGetMock.mockImplementationOnce.
const apiGetMock = vi.fn<(url: string) => Promise<unknown>>();
vi.mock("@/lib/api", () => ({
api: {
get: (url: string) => apiGetMock(url),
},
}));
// useSocketEvent — no-op for these render tests; live updates aren't
// what we're verifying here.
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: () => {},
}));
// Canvas store — peer name resolution.
vi.mock("@/store/canvas", () => ({
useCanvasStore: {
getState: () => ({
nodes: [
{ id: "ws-self", data: { name: "Self" } },
{ id: "ws-peer", data: { name: "Peer Agent" } },
],
}),
},
}));
// Toaster shim — AgentCommsPanel imports showToast.
vi.mock("../../Toaster", () => ({
showToast: vi.fn(),
}));
import { AgentCommsPanel } from "../AgentCommsPanel";
// jsdom doesn't implement scrollIntoView. Tests that observe the call
// install a spy here; tests that don't care still need a no-op stub
// so the component doesn't throw.
const scrollSpy = vi.fn<(opts?: ScrollIntoViewOptions | boolean) => void>();
beforeEach(() => {
apiGetMock.mockReset();
scrollSpy.mockReset();
Element.prototype.scrollIntoView = scrollSpy as unknown as Element["scrollIntoView"];
});
afterEach(() => {
vi.clearAllMocks();
});
describe("AgentCommsPanel — initial-state parity with ChatTab my-chat", () => {
it("shows loading text while history fetch is in flight", () => {
apiGetMock.mockReturnValueOnce(new Promise(() => { /* never resolves */ }));
render(<AgentCommsPanel workspaceId="ws-self" />);
expect(screen.getByText("Loading agent communications...")).toBeDefined();
});
it("renders error UI with a Retry button when the history fetch rejects", async () => {
apiGetMock.mockRejectedValueOnce(new Error("network down"));
render(<AgentCommsPanel workspaceId="ws-self" />);
// Wait for the error state to render — loading→error transition is async.
const alert = await waitFor(() => screen.getByRole("alert"));
expect(alert.textContent).toMatch(/Failed to load agent communications/);
expect(alert.textContent).toMatch(/network down/);
// Retry button must be present and trigger a refetch.
const retry = screen.getByRole("button", { name: "Retry" });
apiGetMock.mockResolvedValueOnce([]); // success on retry
fireEvent.click(retry);
// Two calls total: initial load + retry. Pin via mock call count.
await waitFor(() => expect(apiGetMock.mock.calls.length).toBe(2));
});
it("falls back to empty-state copy when load succeeds with zero rows", async () => {
apiGetMock.mockResolvedValueOnce([]);
render(<AgentCommsPanel workspaceId="ws-self" />);
await waitFor(() =>
expect(screen.getByText("No agent-to-agent communications yet.")).toBeDefined(),
);
});
it("scrollIntoView is called with behavior=instant on the first message arrival", async () => {
apiGetMock.mockResolvedValueOnce([
{
id: "act-1",
activity_type: "a2a_send",
source_id: "ws-self",
target_id: "ws-peer",
method: "message/send",
summary: "Delegating",
request_body: { message: { parts: [{ text: "hi" }] } },
response_body: null,
status: "ok",
created_at: "2026-04-25T18:00:00Z",
},
]);
render(<AgentCommsPanel workspaceId="ws-self" />);
// useLayoutEffect is what makes the first call instant — wait for
// the panel to render at least one message.
await waitFor(() => expect(scrollSpy.mock.calls.length).toBeGreaterThan(0));
// The pinned contract: SOME call uses behavior: "instant" — the
// first-arrival case. Subsequent appends use "smooth", but those
// can't fire here (no live update yet).
const sawInstant = scrollSpy.mock.calls.some((args) => {
const opts = args[0];
return typeof opts === "object" && opts !== null && "behavior" in opts && opts.behavior === "instant";
});
expect(sawInstant).toBe(true);
});
});
+155 -1
View File
@@ -2,7 +2,7 @@
* @vitest-environment jsdom
*/
import { describe, it, expect, vi, afterEach } from "vitest";
import { fetchSession, redirectToLogin } from "../auth";
import { fetchSession, redirectToLogin, signOut } from "../auth";
afterEach(() => {
vi.unstubAllGlobals();
@@ -110,3 +110,157 @@ describe("redirectToLogin", () => {
expect((window.location as unknown as { href: string }).href).toBe(signupHref);
});
});
describe("signOut", () => {
// Helper — most tests need the same window.location stub.
function stubLocation(): void {
Object.defineProperty(window, "location", {
writable: true,
value: {
href: "https://acme.moleculesai.app/orgs",
pathname: "/orgs",
hostname: "acme.moleculesai.app",
protocol: "https:",
},
});
}
it("POSTs to /cp/auth/signout with credentials:include", async () => {
stubLocation();
const fetchMock = vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
});
vi.stubGlobal("fetch", fetchMock);
await signOut();
expect(fetchMock).toHaveBeenCalledTimes(1);
expect(fetchMock).toHaveBeenCalledWith(
expect.stringContaining("/cp/auth/signout"),
expect.objectContaining({ method: "POST", credentials: "include" }),
);
});
it("navigates to provider logout_url when the response includes one", async () => {
// The hosted-logout path is what actually breaks the SSO re-auth
// loop reported on PR #2913. Without this, AuthKit's browser
// cookie keeps the user signed in via SSO and any subsequent
// /cp/auth/login silently re-auths.
stubLocation();
const hostedLogout =
"https://api.workos.com/user_management/sessions/logout?session_id=cookie&return_to=https%3A%2F%2Fapp.moleculesai.app%2Forgs";
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: hostedLogout }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe(hostedLogout);
});
it("falls back to /cp/auth/login when logout_url is empty (DisabledProvider / dev)", async () => {
// DisabledProvider returns "" — the local /cp/auth/login redirect
// works in dev/test where there's no SSO session to escape.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
// Tenant subdomain (acme.moleculesai.app) → auth origin is app.moleculesai.app.
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects even when the POST fails so the user isn't stuck on an authed page", async () => {
// Critical UX invariant: clicking 'Sign out' MUST navigate away from
// the authenticated app, even if the network is down or the cookie
// is already invalid. Anything else looks like the button is
// broken — the precise complaint that triggered this fix.
stubLocation();
vi.stubGlobal("fetch", vi.fn().mockRejectedValue(new Error("network down")));
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects on 401 (session already invalid) just like 200", async () => {
// A user with an already-invalid cookie should still see the
// logout flow complete — no error, no stuck-on-app dead end.
// Note: 401 means res.ok=false → we don't read .json() at all,
// so a missing body is fine.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: false,
status: 401,
json: async () => ({}),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when the response body is malformed", async () => {
// Defensive parsing: a body that isn't valid JSON, or doesn't
// have logout_url, or has logout_url as the wrong type — none of
// these should strand the user on the authed page. Fallback path
// takes over.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => {
throw new Error("not json");
},
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when logout_url is the wrong type", async () => {
// Even valid JSON should be type-checked: a non-string logout_url
// (e.g. server-side bug, version drift) must not crash or open-
// redirect the user.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: 42 }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
});
+77
View File
@@ -67,3 +67,80 @@ export function redirectToLogin(screenHint: "sign-up" | "sign-in" = "sign-in"):
const dest = `${authOrigin}${AUTH_BASE}/${path}?return_to=${encodeURIComponent(returnTo)}`;
window.location.href = dest;
}
/**
* signOut posts to /cp/auth/signout to clear the WorkOS session cookie
* + revoke at the provider, then navigates the browser to the
* provider-supplied hosted logout URL (so the provider's BROWSER-side
* SSO cookie is cleared too — without this, AuthKit silently re-auths
* via SSO on the next /cp/auth/login and the user is "still signed
* in" after pressing Sign out).
*
* Two-layer flow:
* 1. POST /cp/auth/signout → CP clears OUR session cookie + revokes
* session_id at the provider API. Response includes
* `logout_url` — the AuthKit hosted URL the BROWSER must navigate
* to so the provider's own browser cookie is cleared.
* 2. window.location.href = <logout_url> → AuthKit clears its
* session, then redirects the browser to the configured
* return_to (defaults to APP_URL/orgs).
*
* Best-effort by design: a 5xx, network failure, missing logout_url
* (DisabledProvider, dev), or stale cookie still results in the
* browser navigating away — leaving the user on a logged-in-looking
* page after they clicked "Sign out" is the worst possible UX. The
* fallback path navigates to /cp/auth/login on the auth origin, which
* works correctly in environments without a hosted logout flow (dev,
* tests, DisabledProvider).
*
* Throws nothing — callers can disable the button optimistically or
* await this and trust it returns. On a redirect-blocked test
* environment (jsdom under vitest) we still exit cleanly so unit tests
* can spy on the fetch call.
*/
export async function signOut(): Promise<void> {
let logoutURL: string | undefined;
// Fire-and-tolerate the POST. credentials:include is mandatory cross-
// origin so the SaaS canvas (acme.moleculesai.app) can hit
// app.moleculesai.app/cp/auth/signout with the session cookie.
try {
const res = await fetch(`${getAuthOrigin()}${AUTH_BASE}/signout`, {
method: "POST",
credentials: "include",
});
if (res.ok) {
// Body shape: {"ok": true, "logout_url": "..."}. logout_url is
// empty for DisabledProvider (dev/local) — we fall back to
// /cp/auth/login below. Defensive parsing: a malformed body
// shouldn't strand the user on the authed page.
const body: unknown = await res.json().catch(() => null);
if (
body &&
typeof body === "object" &&
"logout_url" in body &&
typeof (body as { logout_url: unknown }).logout_url === "string" &&
(body as { logout_url: string }).logout_url
) {
logoutURL = (body as { logout_url: string }).logout_url;
}
}
} catch {
// Ignore — we still redirect below.
}
if (typeof window === "undefined") return;
if (logoutURL) {
// Hosted logout: AuthKit clears its SSO cookie + redirects to
// return_to (configured server-side). This is the path that
// actually breaks the SSO re-auth loop.
window.location.href = logoutURL;
return;
}
// Fallback: no hosted logout (dev, DisabledProvider, network
// failure). Land on the login screen rather than the current URL:
// returning to a tenant URL after signout would just re-redirect
// through /cp/auth/login due to AuthGate. Send the user straight
// there with no return_to so they don't loop back into the org they
// just left.
const authOrigin = getAuthOrigin();
window.location.href = `${authOrigin}${AUTH_BASE}/login`;
}
+9
View File
@@ -1,5 +1,14 @@
# Workspace Runtime PyPI Package
## Requires Python >= 3.11
The wheel pins `requires_python>=3.11`. On Python 3.10 or older, `pip install
molecule-ai-workspace-runtime` fails with `Could not find a version that
satisfies the requirement (from versions: none)` — the pin filters the only
available artifact before pip even attempts install. Upgrade the interpreter
(`brew install python@3.12` / `apt install python3.12` / etc.) or use a
3.11+ venv.
## Overview
The shared workspace runtime infrastructure has **one editable source** and
+1
View File
@@ -56,6 +56,7 @@ TOP_LEVEL_MODULES = {
"a2a_mcp_server",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_memory",
"a2a_tools_rbac",
"adapter_base",
"agent",
+295
View File
@@ -0,0 +1,295 @@
#!/usr/bin/env bash
# E2E for poll-mode chat upload (RFC #2891 phases 1-5b).
#
# Round-trip: register a workspace as poll-mode (no callback URL) → POST a
# multi-file chat upload → verify each file becomes (a) one
# `chat_upload_receive` activity row and (b) one /pending-uploads row → fetch
# the bytes back via the poll endpoint → ack → verify the row 404s on
# subsequent fetch. Also pins cross-workspace bleed protection: workspace B
# cannot read workspace A's pending uploads even with its own valid bearer.
#
# Why this exists separately from test_chat_upload_e2e.sh: that script
# covers the PUSH path (the workspace's own /internal/chat/uploads/ingest).
# This script covers the POLL path: the same canvas-side request lands on
# the platform's pendinguploads.Storage instead, and the workspace fetches
# it later. The two paths share zero handler code on the platform side, so
# both need their own E2E.
#
# Requires: platform running on localhost:8080 with migrations applied.
# bash workspace-server/scripts/dev-start.sh
# bash workspace-server/scripts/run-migrations.sh
#
# Idempotent: each run uses fresh per-script workspace UUIDs so reruns
# don't collide. Best-effort cleanup on EXIT — does NOT call
# e2e_cleanup_all_workspaces (see
# `feedback_never_run_cluster_cleanup_tests_on_live_platform.md`).
set -euo pipefail
source "$(dirname "$0")/_lib.sh"
PASS=0
FAIL=0
TIMEOUT="${A2A_TIMEOUT:-30}"
gen_uuid() {
if command -v uuidgen >/dev/null 2>&1; then
uuidgen | tr '[:upper:]' '[:lower:]'
else
python3 -c 'import uuid; print(uuid.uuid4())'
fi
}
WS_A="$(gen_uuid)"
WS_B="$(gen_uuid)"
# Per-run scratch dir collected under one trap so every assertion-failure
# path drops the temp files it made (see test_chat_attachments_e2e.sh).
TMPDIR_E2E=$(mktemp -d -t poll-chat-upload-e2e-XXXXXX)
cleanup() {
local rc=$?
curl -s -X DELETE "$BASE/workspaces/$WS_A?confirm=true" >/dev/null 2>&1 || true
curl -s -X DELETE "$BASE/workspaces/$WS_B?confirm=true" >/dev/null 2>&1 || true
rm -rf "$TMPDIR_E2E"
exit $rc
}
trap cleanup EXIT INT TERM
check() {
local desc="$1" expected="$2" actual="$3"
if echo "$actual" | grep -qF -- "$expected"; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected to contain: $expected"
echo " got: $(echo "$actual" | head -10)"
FAIL=$((FAIL + 1))
fi
}
check_eq() {
local desc="$1" expected="$2" actual="$3"
if [ "$actual" = "$expected" ]; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected: $expected"
echo " got: $actual"
FAIL=$((FAIL + 1))
fi
}
echo "=== Poll-Mode Chat Upload E2E ==="
echo " base: $BASE"
echo " workspace A: $WS_A"
echo " workspace B: $WS_B"
echo ""
# ---------- Phase 1: register poll-mode workspace ----------
echo "--- Phase 1: Register poll-mode workspace A ---"
REG_A=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_A\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-a\"}
}")
check "register accepts poll mode without URL" '"status":"registered"' "$REG_A"
TOK_A=$(echo "$REG_A" | e2e_extract_token || true)
[ -n "$TOK_A" ] || { echo "FAIL: no auth_token in register response (ws A)"; FAIL=$((FAIL + 1)); exit 1; }
# ---------- Phase 2: multi-file chat upload ----------
echo ""
echo "--- Phase 2: POST /chat/uploads with two files ---"
FILE1="$TMPDIR_E2E/alpha.txt"
FILE2="$TMPDIR_E2E/beta.txt"
EXPECTED1="alpha-secret-$(openssl rand -hex 4)"
EXPECTED2="beta-secret-$(openssl rand -hex 4)"
printf '%s' "$EXPECTED1" > "$FILE1"
printf '%s' "$EXPECTED2" > "$FILE2"
UPLOAD=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$FILE1;filename=alpha.txt;type=text/plain" \
-F "files=@$FILE2;filename=beta.txt;type=text/plain" \
-w "\nHTTP_CODE=%{http_code}\n")
UPLOAD_CODE=$(echo "$UPLOAD" | grep -oE 'HTTP_CODE=[0-9]+' | cut -d= -f2)
UPLOAD_BODY=$(echo "$UPLOAD" | sed '/^HTTP_CODE=/,$d')
check_eq "upload returns 200" "200" "$UPLOAD_CODE"
check "upload response has files array" '"files":' "$UPLOAD_BODY"
# Pull file_ids out of the URI in the response. URI shape is
# `platform-pending:<wsid>/<file_id>` — proves the response came from the
# poll-mode branch, not the push-mode internal-ingest branch.
URI1=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"])')
URI2=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][1]["uri"])')
check "URI 1 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI1"
check "URI 2 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI2"
FID1="${URI1##*/}"
FID2="${URI2##*/}"
[ -n "$FID1" ] && [ -n "$FID2" ] || { echo "FAIL: could not extract file IDs"; FAIL=$((FAIL + 1)); exit 1; }
echo " file_id 1: $FID1"
echo " file_id 2: $FID2"
# ---------- Phase 3: activity rows visible to the workspace ----------
echo ""
echo "--- Phase 3: /activity shows two chat_upload_receive rows ---"
# activity_logs INSERTs run in a goroutine — give them a moment.
sleep 1
ACT=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/activity?type=a2a_receive&limit=20")
check "activity feed has the alpha file" "$FID1" "$ACT"
check "activity feed has the beta file" "$FID2" "$ACT"
check "activity rows tagged chat_upload_receive" '"method":"chat_upload_receive"' "$ACT"
check "activity rows record alpha mimetype" '"mimeType":"text/plain"' "$ACT"
CHAT_UPLOAD_COUNT=$(echo "$ACT" | python3 -c '
import json, sys
rows = json.load(sys.stdin)
n = sum(1 for r in rows if (r.get("method") or "") == "chat_upload_receive")
print(n)
')
check_eq "exactly two chat_upload_receive rows" "2" "$CHAT_UPLOAD_COUNT"
# ---------- Phase 4: GET /pending-uploads/:file_id/content ----------
echo ""
echo "--- Phase 4: Fetch content for each pending upload ---"
GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "alpha bytes round-trip" "$EXPECTED1" "$GOT1"
GOT2=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/content")
check_eq "beta bytes round-trip" "$EXPECTED2" "$GOT2"
# Mimetype + Content-Disposition headers should match what was uploaded.
HEAD1=$(curl -s -D - -o /dev/null --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check "alpha response carries text/plain Content-Type" "Content-Type: text/plain" "$HEAD1"
check "alpha response carries Content-Disposition with filename" 'filename="alpha.txt"' "$HEAD1"
# ---------- Phase 5: idempotent re-fetch (until ack) ----------
echo ""
echo "--- Phase 5: Re-fetch before ack returns the same bytes ---"
RE_GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "re-fetch returns same alpha bytes" "$EXPECTED1" "$RE_GOT1"
# ---------- Phase 6: ack each row ----------
echo ""
echo "--- Phase 6: Ack each pending upload ---"
ACK1=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
check "alpha ack returns acked:true" '"acked":true' "$ACK1"
ACK2=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/ack")
check "beta ack returns acked:true" '"acked":true' "$ACK2"
# Re-ack should still 200 (idempotent — the row's gone but the workspace's
# at-least-once intent was already honored, and the second ack hits the
# raced path which also returns 200).
RE_ACK1=$(curl -s -w '\n%{http_code}' -X POST --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
RE_ACK1_CODE=$(printf '%s' "$RE_ACK1" | tail -n1)
# Acked rows return 404 on Get-before-Ack (the row's still in the table
# but Get filters acked_at IS NULL); workspace would not normally re-ack
# since it already saw the success. Accept both 200 and 404 here so the
# test pins the contract without being brittle on the inner ordering.
case "$RE_ACK1_CODE" in
200|404)
echo "PASS: re-ack returns 200 or 404 ($RE_ACK1_CODE)"
PASS=$((PASS + 1))
;;
*)
echo "FAIL: re-ack returned unexpected $RE_ACK1_CODE"
FAIL=$((FAIL + 1))
;;
esac
# ---------- Phase 7: GET content after ack returns 404 ----------
echo ""
echo "--- Phase 7: Acked file 404s on subsequent fetch ---"
POST_ACK=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
POST_ACK_CODE=$(printf '%s' "$POST_ACK" | tail -n1)
check_eq "acked alpha returns HTTP 404" "404" "$POST_ACK_CODE"
# ---------- Phase 8: cross-workspace bleed protection ----------
echo ""
echo "--- Phase 8: Workspace B cannot read workspace A's pending uploads ---"
# Stage a fresh upload on workspace A so we have an UN-acked row to probe.
PROBE_FILE="$TMPDIR_E2E/probe.txt"
printf '%s' "probe-bytes-$(openssl rand -hex 4)" > "$PROBE_FILE"
PROBE_UP=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$PROBE_FILE;filename=probe.txt;type=text/plain")
PROBE_FID=$(echo "$PROBE_UP" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"].split("/")[-1])')
[ -n "$PROBE_FID" ] || { echo "FAIL: probe upload returned no file_id"; FAIL=$((FAIL + 1)); exit 1; }
# Register a SECOND poll-mode workspace and capture its bearer.
REG_B=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_B\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-b\"}
}")
check "second workspace registers" '"status":"registered"' "$REG_B"
TOK_B=$(echo "$REG_B" | e2e_extract_token || true)
[ -n "$TOK_B" ] || { echo "FAIL: no auth_token (ws B)"; FAIL=$((FAIL + 1)); exit 1; }
# B's bearer hitting B's URL with A's file_id → 404 (handler checks the row's
# workspace_id matches the URL :id, not the bearer's workspace).
CROSS_RESP=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_B/pending-uploads/$PROBE_FID/content")
CROSS_CODE=$(printf '%s' "$CROSS_RESP" | tail -n1)
check_eq "B's URL with A's file_id returns 404" "404" "$CROSS_CODE"
# B's bearer hitting A's URL → 401 (wsAuth pins bearer to :id). This is the
# strictest cross-workspace check: a presented-but-wrong bearer is rejected
# in EVERY platform posture (dev-mode fail-open only triggers when no bearer
# is presented at all — invalid tokens always 401).
WRONG_BEARER=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_A/pending-uploads/$PROBE_FID/content")
WRONG_CODE=$(printf '%s' "$WRONG_BEARER" | tail -n1)
check_eq "B's bearer on A's URL returns 401" "401" "$WRONG_CODE"
# NB: a fully bearerless request to /pending-uploads/:fid/content returns
# 401 ONLY when the platform has MOLECULE_ENV != development (production /
# staging). On local-dev with MOLECULE_ENV=development the wsauth middleware
# fail-opens for bearerless requests so the canvas at :3000 can talk to the
# platform at :8080 without per-call token plumbing — see middleware/
# devmode.go. The strict bearerless-401 contract is covered by the wsauth
# unit + middleware tests; we don't reassert it here because the result
# depends on platform posture, not the poll-mode upload contract.
# ---------- Phase 9: invalid file_id rejected at the URL parser ----------
echo ""
echo "--- Phase 9: Invalid file_id returns 400 ---"
BAD_FID=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/not-a-uuid/content")
BAD_FID_CODE=$(printf '%s' "$BAD_FID" | tail -n1)
check_eq "invalid file_id UUID returns 400" "400" "$BAD_FID_CODE"
# ---------- Results ----------
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
[ "$FAIL" -eq 0 ]
@@ -0,0 +1,468 @@
package handlers
// class1_ast_gate_test.go — generic Class 1 leak gate per #2867 PR-A.
//
// What this gate prevents:
// The tenant-hongming leak class — a handler iterates a YAML-derived
// slice (ws.Children, sub_workspaces, etc.) and calls
// `INSERT INTO workspaces` inside the loop body without first
// checking whether a workspace with the same (parent_id, name) is
// already there. Each call to such a handler doubles the tree.
//
// Why this is broader than TestCreateWorkspaceTree_CallsLookupBeforeInsert:
// The existing gate is hard-coded to org_import.go's createWorkspaceTree.
// That catches the specific function that triggered the original
// incident — but a future handler written from scratch in a different
// file would not be covered. This gate walks every production handler
// .go file and applies a structural rule that does not depend on
// function or file names.
//
// The rule (verbatim from #2867 PR-A):
//
// "No handler in handlers/ may iterate a slice (any RangeStmt) AND
// call INSERT INTO workspaces inside the loop body without a
// preceding SELECT id FROM workspaces WHERE name=$1 AND parent_id IS
// NOT DISTINCT FROM $2 in the same function (== a lookupExistingChild
// call, OR an ON CONFLICT clause baked into the same INSERT, OR an
// explicit allowlist annotation)."
//
// Allowlist mechanism: a function whose body contains the exact comment
// string `// class1-gate: idempotent-by-design` is treated as safe.
// Use this only after writing a unit test that pins WHY the function
// is safe. The annotation is intentionally awkward to type — it should
// be rare.
import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"regexp"
"sort"
"strings"
"testing"
)
// reINSERTWorkspaces matches the exact statement shape we care about.
// Tightened (vs bytes.Index "INSERT INTO workspaces") so the audit
// table `workspaces_audit` literal — or any other lookalike — does not
// false-positive trigger this gate. The same regex is used in the
// existing createWorkspaceTree gate (workspaces_insert_allowlist_test.go)
// — keep them in sync if either changes.
var reINSERTWorkspaces = regexp.MustCompile(`(?m)^\s*INSERT INTO workspaces\s*\(`)
// reONCONFLICT matches ON CONFLICT clauses anywhere in the same SQL
// literal. An UPSERT (INSERT ... ON CONFLICT ... DO UPDATE) is
// idempotent by definition, so the gate exempts it.
var reONCONFLICT = regexp.MustCompile(`(?i)\bON CONFLICT\b`)
// gateAllowlistComment is the magic comment a function author writes
// to opt out of this gate. Forces an explicit decision.
const gateAllowlistComment = "// class1-gate: idempotent-by-design"
// preflightCallNames are function names whose presence in a function
// body counts as "did a SELECT-by-(parent_id, name) preflight". Add
// new names here as new preflight helpers are introduced. Keep the
// list TIGHT — any sloppy addition weakens the gate.
var preflightCallNames = map[string]bool{
"lookupExistingChild": true,
}
// TestClass1_NoUnpreflightedInsertInsideRange walks every production
// .go file in this package, parses the AST, and fails the test if any
// FuncDecl violates the rule above.
//
// Failure message must include: file path, function name, line of
// the offending INSERT, line of the enclosing range, and a hint at
// the three escape hatches (preflight call, ON CONFLICT, allowlist
// comment).
func TestClass1_NoUnpreflightedInsertInsideRange(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
entries, err := os.ReadDir(wd)
if err != nil {
t.Fatalf("readdir %s: %v", wd, err)
}
type violation struct {
file string
fn string
insertLine int
rangeLine int
}
var violations []violation
scanned := 0
for _, e := range entries {
name := e.Name()
if e.IsDir() || !strings.HasSuffix(name, ".go") {
continue
}
if strings.HasSuffix(name, "_test.go") {
continue
}
path := filepath.Join(wd, name)
src, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read %s: %v", path, err)
}
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, name, src, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", path, err)
}
scanned++
// Walk every function declaration and apply the rule.
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Body == nil {
continue
}
// Allowlist: skip if the function body contains the magic
// comment. We check via the source range of the function
// — comments inside the body are in file.Comments and
// must overlap the function's Pos/End range.
if functionHasAllowlistComment(file, fd) {
continue
}
// First pass: locate every INSERT INTO workspaces literal
// in this function. We treat each such literal as a
// candidate violation and try to clear it via the rules.
candidates := findInsertWorkspacesLiterals(fd, src, fset)
if len(candidates) == 0 {
continue
}
// Has the function called a preflight helper? Single
// pass — if any preflight name appears, every INSERT in
// the function is considered preflighted. This is more
// permissive than position-aware (preflight could be
// AFTER the INSERT and still satisfy the gate), but the
// existing org_import.go gate already pins the position
// invariant for createWorkspaceTree, and a function that
// preflights AFTER inserting would fail the position
// gate in a separate test.
hasPreflight := functionCallsAny(fd, preflightCallNames)
for _, c := range candidates {
if c.hasONCONFLICT {
continue
}
if hasPreflight {
continue
}
if c.enclosingRangeLine == 0 {
// INSERT not inside any RangeStmt — single-shot,
// not the bug pattern.
continue
}
violations = append(violations, violation{
file: name,
fn: fd.Name.Name,
insertLine: c.insertLine,
rangeLine: c.enclosingRangeLine,
})
}
}
}
if scanned == 0 {
t.Fatal("scanned 0 .go files — wrong working directory? gate would always pass")
}
if len(violations) > 0 {
// Stable sort so the failure message is deterministic across
// reruns.
sort.Slice(violations, func(i, j int) bool {
if violations[i].file != violations[j].file {
return violations[i].file < violations[j].file
}
return violations[i].insertLine < violations[j].insertLine
})
var b strings.Builder
b.WriteString("Class 1 leak gate (#2867 PR-A) — these handler functions iterate a slice and INSERT INTO workspaces inside the loop body without a (parent_id, name) preflight.\n\n")
b.WriteString("This is the bug shape that triggered the tenant-hongming leak (TeamHandler.Expand re-inserting the entire sub_workspaces tree on every call). To fix any reported violation, choose ONE of:\n")
b.WriteString(" 1. Call h.lookupExistingChild(ctx, name, parentID) before the INSERT and skip the INSERT when it returns existing=true. (preferred)\n")
b.WriteString(" 2. Use INSERT ... ON CONFLICT ... DO ... (idempotent UPSERT, like registry.go).\n")
b.WriteString(" 3. Annotate the function with a `// class1-gate: idempotent-by-design` comment AND a unit test that pins why the function is structurally idempotent. (rare; require code review)\n\n")
b.WriteString("Violations:\n")
for _, v := range violations {
b.WriteString(" - ")
b.WriteString(v.file)
b.WriteString(":")
b.WriteString(itoa(v.insertLine))
b.WriteString(" — function ")
b.WriteString(v.fn)
b.WriteString("() INSERTs inside RangeStmt at line ")
b.WriteString(itoa(v.rangeLine))
b.WriteString("\n")
}
t.Fatal(b.String())
}
}
func itoa(n int) string {
// Avoid strconv import for one call site — keeps the test focused.
if n == 0 {
return "0"
}
neg := n < 0
if neg {
n = -n
}
var buf [20]byte
i := len(buf)
for n > 0 {
i--
buf[i] = byte('0' + n%10)
n /= 10
}
if neg {
i--
buf[i] = '-'
}
return string(buf[i:])
}
// candidateInsert holds the per-INSERT facts needed to decide whether
// the gate fires.
type candidateInsert struct {
insertLine int
hasONCONFLICT bool
enclosingRangeLine int // 0 means not inside any range
}
// findInsertWorkspacesLiterals walks fd's body and returns one
// candidateInsert per INSERT INTO workspaces string literal.
//
// Position-based detection: collect every RangeStmt's body span first,
// then for each INSERT literal check if its position is inside any
// span. ast.Inspect's nil-call ordering does NOT give per-node pop
// semantics, so a stack-based approach against ast.Inspect would
// silently miscount. Position spans are deterministic and easy to
// reason about.
func findInsertWorkspacesLiterals(fd *ast.FuncDecl, src []byte, fset *token.FileSet) []candidateInsert {
var out []candidateInsert
type span struct{ start, end token.Pos }
var ranges []span
ast.Inspect(fd.Body, func(n ast.Node) bool {
rs, ok := n.(*ast.RangeStmt)
if !ok || rs.Body == nil {
return true
}
ranges = append(ranges, span{rs.Body.Lbrace, rs.Body.Rbrace})
return true
})
enclosingRangeLineFor := func(p token.Pos) int {
// Pick the innermost enclosing range — i.e., the one with the
// largest start that still covers p. Innermost is the one
// whose body actually contains the INSERT, which is the line
// most useful in a violation message.
bestStart := token.NoPos
bestLine := 0
for _, s := range ranges {
if p > s.start && p < s.end && s.start > bestStart {
bestStart = s.start
bestLine = fset.Position(s.start).Line
}
}
return bestLine
}
ast.Inspect(fd.Body, func(n ast.Node) bool {
bl, ok := n.(*ast.BasicLit)
if !ok || bl.Kind != token.STRING {
return true
}
// Strip surrounding backticks/quotes — value includes them.
lit := bl.Value
if len(lit) >= 2 {
lit = lit[1 : len(lit)-1]
}
if !reINSERTWorkspaces.MatchString(lit) {
return true
}
out = append(out, candidateInsert{
insertLine: fset.Position(bl.Pos()).Line,
hasONCONFLICT: reONCONFLICT.MatchString(lit),
enclosingRangeLine: enclosingRangeLineFor(bl.Pos()),
})
return true
})
return out
}
// functionCallsAny returns true if any CallExpr in fd's body has a
// function name (either a SelectorExpr Sel.Name or an Ident name)
// matching a key in names.
func functionCallsAny(fd *ast.FuncDecl, names map[string]bool) bool {
found := false
ast.Inspect(fd.Body, func(n ast.Node) bool {
if found {
return false
}
ce, ok := n.(*ast.CallExpr)
if !ok {
return true
}
switch fun := ce.Fun.(type) {
case *ast.Ident:
if names[fun.Name] {
found = true
return false
}
case *ast.SelectorExpr:
if names[fun.Sel.Name] {
found = true
return false
}
}
return true
})
return found
}
// functionHasAllowlistComment returns true if the function body
// (between fd.Body.Lbrace and fd.Body.Rbrace) contains a comment
// equal to gateAllowlistComment.
func functionHasAllowlistComment(file *ast.File, fd *ast.FuncDecl) bool {
if fd.Body == nil {
return false
}
start := fd.Body.Lbrace
end := fd.Body.Rbrace
for _, cg := range file.Comments {
for _, c := range cg.List {
if c.Pos() < start || c.Pos() > end {
continue
}
if strings.TrimSpace(c.Text) == gateAllowlistComment {
return true
}
}
}
return false
}
// TestClass1_GateFiresOnSyntheticBuggySource — proves the gate actually
// catches the bug shape it's named after. Without this, a regression
// to "always pass" would not be noticed until the leak shipped again.
// Per memory feedback_assert_exact_not_substring.md: tighten the test
// + verify it FAILS on old-shape source before merging.
func TestClass1_GateFiresOnSyntheticBuggySource(t *testing.T) {
const buggySrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func buggyExpand(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
// Bug shape: INSERT inside the range body, no preflight.
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "buggy.go", buggySrc, parser.ParseComments)
if err != nil {
t.Fatalf("parse synthetic source: %v", err)
}
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "buggyExpand" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(buggySrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 INSERT literal, got %d", len(candidates))
}
c := candidates[0]
if c.enclosingRangeLine == 0 {
t.Errorf("synthetic INSERT inside `for _, child := range` should be detected as enclosed by range, got enclosingRangeLine=0 — gate would miss the bug shape")
}
if c.hasONCONFLICT {
t.Errorf("synthetic INSERT has no ON CONFLICT, gate falsely treated it as idempotent")
}
if functionCallsAny(fd, preflightCallNames) {
t.Errorf("synthetic function does not call lookupExistingChild — gate falsely treated it as preflighted")
}
// All three guards say the gate WOULD fire. Pass.
return
}
t.Fatal("buggyExpand FuncDecl not found in synthetic source")
}
// TestClass1_GateAllowsONCONFLICT — pins that an INSERT with ON
// CONFLICT inside a range body is NOT flagged. registry.go's
// upsert pattern is the prod example.
func TestClass1_GateAllowsONCONFLICT(t *testing.T) {
const safeSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func upsertLoop(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2) ON CONFLICT (id) DO UPDATE SET name = $2`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, _ := parser.ParseFile(fset, "safe.go", safeSrc, parser.ParseComments)
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "upsertLoop" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(safeSrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 candidate, got %d", len(candidates))
}
if !candidates[0].hasONCONFLICT {
t.Errorf("ON CONFLICT clause should be detected, was missed — gate would falsely flag idempotent UPSERTs")
}
}
}
// TestClass1_GateAllowsAllowlistAnnotation — pins the escape hatch
// works. Annotated functions are skipped at the FuncDecl level.
func TestClass1_GateAllowsAllowlistAnnotation(t *testing.T) {
const annotatedSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func intentionallyUnpreflighted(db fakeDB, ctx context.Context, children []string) {
// class1-gate: idempotent-by-design
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, _ := parser.ParseFile(fset, "annotated.go", annotatedSrc, parser.ParseComments)
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "intentionallyUnpreflighted" {
continue
}
if !functionHasAllowlistComment(file, fd) {
t.Error("allowlist comment should be detected for the intentionallyUnpreflighted function — escape hatch not working")
}
}
}
@@ -198,6 +198,13 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# Pair with the Claude Code or Python SDK tab if your runtime needs
# inbound A2A delivery (canvas messages → agent conversation turns).
# Requires Python >= 3.11. On 3.10 or older pip says
# "Could not find a version that satisfies the requirement
# (from versions: none)" — the wheel's requires_python pin filters
# the only available artifact before pip even attempts install.
# Upgrade the interpreter (brew install python@3.12 / apt install
# python3.12 / etc.) or use a 3.11+ venv.
# 1. Install the workspace runtime wheel:
pip install molecule-ai-workspace-runtime
@@ -7,6 +7,7 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"os"
@@ -79,7 +80,16 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
}
}
ctxLookup := context.Background()
// 5s timeout bounds the lookup independently of any HTTP request
// context. createWorkspaceTree runs in goroutines spawned from the
// /org/import handler, so plumbing the request context here would
// cascade-cancel into provisionWorkspaceAuto and abort in-flight
// EC2 provisioning if the client disconnected mid-import — that's
// the wrong behaviour. A short bounded timeout protects the
// per-row SELECT against a wedged DB without taking the
// drop-everything-on-disconnect tradeoff.
ctxLookup, cancelLookup := context.WithTimeout(context.Background(), 5*time.Second)
defer cancelLookup()
// Idempotency: if a workspace with the same (parent_id, name) already
// exists, skip the INSERT + canvas_layouts + broadcast + provisioning.
// This is what makes /org/import safe to call multiple times — the
@@ -91,6 +101,15 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
// (parent exists, some children missing) backfill the missing children
// instead of either no-op'ing the whole subtree or duplicating the
// existing children.
//
// /org/import is ADDITIVE-ONLY, never destructive. Children present
// in the existing tree but absent from the new template are
// preserved (no DELETE on diff). Skip-path also does NOT propagate
// updates to existing nodes — a re-import that adds an
// initial_memory or schedule to an existing workspace is silently
// dropped (the function bypasses seedInitialMemories, schedule SQL,
// channel config for skipped rows). To force-update an existing
// tree, delete and re-import or use a future /org/sync route.
existingID, existing, lookupErr := h.lookupExistingChild(ctxLookup, ws.Name, parentID)
if lookupErr != nil {
return fmt.Errorf("idempotency check for %s: %w", ws.Name, lookupErr)
@@ -605,6 +624,12 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
//
// On sql.ErrNoRows: returns ("", false, nil) — caller should INSERT.
// On a real DB error: returns ("", false, err) — caller propagates.
//
// errors.Is is wrap-safe — a future caller wrapping the error
// (database/sql can wrap driver errors with %w in some setups) would
// silently break a `err == sql.ErrNoRows` equality check, causing the
// no-rows path to fall through to the "real DB error" branch and
// abort the import. errors.Is unwraps.
func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, parentID *string) (string, bool, error) {
var existingID string
err := db.DB.QueryRowContext(ctx, `
@@ -614,7 +639,7 @@ func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, paren
AND status != 'removed'
LIMIT 1
`, name, parentID).Scan(&existingID)
if err == sql.ErrNoRows {
if errors.Is(err, sql.ErrNoRows) {
return "", false, nil
}
if err != nil {
@@ -2,7 +2,9 @@ package handlers
import (
"context"
"database/sql"
"errors"
"fmt"
"go/ast"
"go/parser"
"go/token"
@@ -123,6 +125,36 @@ func TestLookupExistingChild_DBError_Propagates(t *testing.T) {
}
}
// TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound — pins the
// wrap-safety of the errors.Is(err, sql.ErrNoRows) check. The previous
// `err == sql.ErrNoRows` equality would fall through to the
// "real DB error" branch on a wrapped no-rows error, aborting the
// import for what is in fact the no-rows happy path. driver/sql
// wrapping is currently a non-issue but a future driver change or a
// caller that wraps the result via fmt.Errorf("…: %w", err) would
// silently break the equality check. errors.Is unwraps.
func TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
wrapped := fmt.Errorf("driver-wrapped: %w", sql.ErrNoRows)
mock.ExpectQuery(`SELECT id FROM workspaces`).
WithArgs("Alpha", &parent).
WillReturnError(wrapped)
h := &OrgHandler{}
id, found, err := h.lookupExistingChild(context.Background(), "Alpha", &parent)
if err != nil {
t.Fatalf("expected wrapped no-rows to be treated as not-found (err=nil), got: %v", err)
}
if found {
t.Errorf("expected found=false on wrapped no-rows, got found=true")
}
if id != "" {
t.Errorf("expected empty id on wrapped no-rows, got %q", id)
}
}
// workspacesInsertRE matches a SQL literal that begins (after optional
// leading whitespace) with `INSERT INTO workspaces` followed by `(` —
// requiring the open-paren rules out lookalikes like
+56 -111
View File
@@ -325,115 +325,14 @@ async def tool_get_workspace_info(source_workspace_id: str | None = None) -> str
return json.dumps(info, indent=2)
async def tool_commit_memory(
content: str,
scope: str = "LOCAL",
source_workspace_id: str | None = None,
) -> str:
"""Save important information to persistent memory.
GLOBAL scope is writable only by root workspaces (tier == 0).
RBAC memory.write permission is required for all scope levels.
The source workspace_id is embedded in every record so the platform
can enforce cross-workspace isolation and audit trail.
``source_workspace_id`` selects which registered workspace this
memory belongs to when the agent is registered into multiple
workspaces (PR-1 / multi-workspace mode). When unset, falls back
to the module-level WORKSPACE_ID — single-workspace operators see
no behaviour change.
"""
if not content:
return "Error: content is required"
content = _redact_secrets(content)
scope = scope.upper()
if scope not in ("LOCAL", "TEAM", "GLOBAL"):
scope = "LOCAL"
# RBAC: require memory.write permission (mirrors builtin_tools/memory.py)
if not _check_memory_write_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.write' "
"permission for this operation."
)
# Scope enforcement: only root workspaces (tier 0) can write GLOBAL memory.
# This prevents tenant workspaces from poisoning org-wide memory (GH#1610).
if scope == "GLOBAL" and not _is_root_workspace():
return (
"Error: RBAC — only root workspaces (tier 0) can write to GLOBAL scope. "
"Non-root workspaces may use LOCAL or TEAM scope."
)
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/memories",
json={
"content": content,
"scope": scope,
# Embed source workspace so the platform can namespace-isolate
# and audit cross-workspace writes (GH#1610 fix).
"workspace_id": src,
},
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if resp.status_code in (200, 201):
return json.dumps({"success": True, "id": data.get("id"), "scope": scope})
return f"Error: {data.get('error', resp.text)}"
except Exception as e:
return f"Error saving memory: {e}"
async def tool_recall_memory(
query: str = "",
scope: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Search persistent memory for previously saved information.
RBAC memory.read permission is required (mirrors builtin_tools/memory.py).
The workspace_id is sent as a query parameter so the platform can
cross-validate it against the auth token and defend against any future
path traversal / cross-tenant read bugs in the platform itself.
``source_workspace_id`` selects which registered workspace's memories
to search when the agent is registered into multiple workspaces.
Unset → defaults to the module-level WORKSPACE_ID.
"""
# RBAC: require memory.read permission (mirrors builtin_tools/memory.py)
if not _check_memory_read_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.read' "
"permission for this operation."
)
src = source_workspace_id or WORKSPACE_ID
params: dict[str, str] = {"workspace_id": src}
if query:
params["q"] = query
if scope:
params["scope"] = scope.upper()
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/memories",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if isinstance(data, list):
if not data:
return "No memories found."
lines = []
for m in data:
lines.append(f"[{m.get('scope', '?')}] {m.get('content', '')}")
return "\n".join(lines)
return json.dumps(data)
except Exception as e:
return f"Error recalling memory: {e}"
# Memory tool handlers — extracted to a2a_tools_memory (RFC #2873 iter 4c).
# Re-imported here so call sites + tests that reference
# ``a2a_tools.tool_commit_memory`` / ``tool_recall_memory`` keep
# resolving identically.
from a2a_tools_memory import ( # noqa: E402 (import after the top-of-module imports)
tool_commit_memory,
tool_recall_memory,
)
# ---------------------------------------------------------------------------
@@ -550,6 +449,52 @@ async def tool_chat_history(
return json.dumps(rows)
def _enrich_inbound_for_agent(d: dict) -> dict:
"""Add peer_name / peer_role / agent_card_url to a poll-path message.
The PUSH path (a2a_mcp_server._build_channel_notification) already
enriches the meta dict with these fields, so a Claude Code host
with channel-push sees them. The POLL path goes through
InboxMessage.to_dict, which is intentionally identity-free (the
storage layer doesn't know about the registry cache). Without this
helper, every non-Claude-Code MCP client that uses inbox_peek /
wait_for_message gets a plain message and the receiving agent
can't tell who's writing — breaking the contract documented in
a2a_mcp_server.py:303-345 ("In both paths the same fields apply").
Cache-first non-blocking enrichment (same shape as push): on cache
miss the helper returns the bare message; the next call within the
5-min TTL hits the warm cache. Failure to enrich is non-fatal —
the agent still gets text + peer_id + kind + activity_id, just
without the friendly identity.
"""
peer_id = d.get("peer_id") or ""
if not peer_id:
# canvas_user — no peer to enrich; helper returns the plain
# message unchanged so the canvas reply path still works.
return d
try:
from a2a_client import ( # local import — avoid module-load cycle
_agent_card_url_for,
enrich_peer_metadata_nonblocking,
)
except Exception: # noqa: BLE001
# If a2a_client is unavailable (test harness, partial install),
# degrade gracefully — agent still gets the bare envelope.
return d
record = enrich_peer_metadata_nonblocking(peer_id)
if record is not None:
if name := record.get("name"):
d["peer_name"] = name
if role := record.get("role"):
d["peer_role"] = role
# agent_card_url is constructable from peer_id alone — surface it
# even when registry enrichment misses, so the receiving agent has
# a single endpoint to hit for the peer's full capability list.
d["agent_card_url"] = _agent_card_url_for(peer_id)
return d
async def tool_inbox_peek(limit: int = 10) -> str:
"""Return up to ``limit`` pending inbound messages without removing them."""
import inbox # local import — avoids a circular dep at module load
@@ -558,7 +503,7 @@ async def tool_inbox_peek(limit: int = 10) -> str:
if state is None:
return _INBOX_NOT_ENABLED_MSG
messages = state.peek(limit=limit if isinstance(limit, int) else 10)
return json.dumps([m.to_dict() for m in messages])
return json.dumps([_enrich_inbound_for_agent(m.to_dict()) for m in messages])
async def tool_inbox_pop(activity_id: str) -> str:
@@ -606,4 +551,4 @@ async def tool_wait_for_message(timeout_secs: float = 60.0) -> str:
message = await loop.run_in_executor(None, state.wait, timeout)
if message is None:
return json.dumps({"timeout": True, "timeout_secs": timeout})
return json.dumps(message.to_dict())
return json.dumps(_enrich_inbound_for_agent(message.to_dict()))
+141
View File
@@ -0,0 +1,141 @@
"""Memory tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4c). Owns the two
agent-memory MCP tools:
* ``tool_commit_memory`` — write to the workspace's persistent memory.
* ``tool_recall_memory`` — search the workspace's persistent memory.
Both go through the platform's ``/workspaces/:id/memories`` endpoint;
the platform is the source of truth for namespace isolation + audit
trail. Local responsibility here is RBAC enforcement BEFORE hitting
the network so a denied operation surfaces a clear in-band error
instead of an opaque platform 403.
Imports the RBAC primitives from ``a2a_tools_rbac`` (iter 4a).
"""
from __future__ import annotations
import json
import httpx
from a2a_client import PLATFORM_URL, WORKSPACE_ID
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_read_permission as _check_memory_read_permission,
check_memory_write_permission as _check_memory_write_permission,
is_root_workspace as _is_root_workspace,
)
from builtin_tools.security import _redact_secrets
async def tool_commit_memory(
content: str,
scope: str = "LOCAL",
source_workspace_id: str | None = None,
) -> str:
"""Save important information to persistent memory.
GLOBAL scope is writable only by root workspaces (tier == 0).
RBAC memory.write permission is required for all scope levels.
The source workspace_id is embedded in every record so the platform
can enforce cross-workspace isolation and audit trail.
``source_workspace_id`` selects which registered workspace this
memory belongs to when the agent is registered into multiple
workspaces (PR-1 / multi-workspace mode). When unset, falls back
to the module-level WORKSPACE_ID — single-workspace operators see
no behaviour change.
"""
if not content:
return "Error: content is required"
content = _redact_secrets(content)
scope = scope.upper()
if scope not in ("LOCAL", "TEAM", "GLOBAL"):
scope = "LOCAL"
# RBAC: require memory.write permission (mirrors builtin_tools/memory.py)
if not _check_memory_write_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.write' "
"permission for this operation."
)
# Scope enforcement: only root workspaces (tier 0) can write GLOBAL memory.
# This prevents tenant workspaces from poisoning org-wide memory (GH#1610).
if scope == "GLOBAL" and not _is_root_workspace():
return (
"Error: RBAC — only root workspaces (tier 0) can write to GLOBAL scope. "
"Non-root workspaces may use LOCAL or TEAM scope."
)
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/memories",
json={
"content": content,
"scope": scope,
# Embed source workspace so the platform can namespace-isolate
# and audit cross-workspace writes (GH#1610 fix).
"workspace_id": src,
},
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if resp.status_code in (200, 201):
return json.dumps({"success": True, "id": data.get("id"), "scope": scope})
return f"Error: {data.get('error', resp.text)}"
except Exception as e:
return f"Error saving memory: {e}"
async def tool_recall_memory(
query: str = "",
scope: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Search persistent memory for previously saved information.
RBAC memory.read permission is required (mirrors builtin_tools/memory.py).
The workspace_id is sent as a query parameter so the platform can
cross-validate it against the auth token and defend against any future
path traversal / cross-tenant read bugs in the platform itself.
``source_workspace_id`` selects which registered workspace's memories
to search when the agent is registered into multiple workspaces.
Unset → defaults to the module-level WORKSPACE_ID.
"""
# RBAC: require memory.read permission (mirrors builtin_tools/memory.py)
if not _check_memory_read_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.read' "
"permission for this operation."
)
src = source_workspace_id or WORKSPACE_ID
params: dict[str, str] = {"workspace_id": src}
if query:
params["q"] = query
if scope:
params["scope"] = scope.upper()
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/memories",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if isinstance(data, list):
if not data:
return "No memories found."
lines = []
for m in data:
lines.append(f"[{m.get('scope', '?')}] {m.get('content', '')}")
return "\n".join(lines)
return json.dumps(data)
except Exception as e:
return f"Error recalling memory: {e}"
+49 -49
View File
@@ -702,9 +702,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-1"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Remember this", scope="local")
data = json.loads(result)
@@ -716,9 +716,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-2"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Remember this", scope="INVALID")
data = json.loads(result)
@@ -728,9 +728,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-3"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Team info", scope="TEAM")
data = json.loads(result)
@@ -741,9 +741,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-4"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=True):
result = await a2a_tools.tool_commit_memory("Global info", scope="GLOBAL")
data = json.loads(result)
@@ -753,9 +753,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-5"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
data = json.loads(result)
@@ -766,9 +766,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-6"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
data = json.loads(result)
@@ -779,9 +779,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(400, {"error": "bad request payload"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
assert "Error" in result
@@ -791,9 +791,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_exc=RuntimeError("storage failure"))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
assert "Error saving memory" in result
@@ -808,9 +808,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-poison"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("poisoned GLOBAL memory", scope="GLOBAL")
# Must NOT have called the platform — early rejection
@@ -824,9 +824,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-7"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=False), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=False), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("should be denied", scope="LOCAL")
mc.post.assert_not_called()
@@ -838,9 +838,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-8"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_write_permission", return_value=True), \
patch("a2a_tools._is_root_workspace", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
patch("a2a_tools_memory._is_root_workspace", return_value=False):
await a2a_tools.tool_commit_memory("test content", scope="LOCAL")
call_kwargs = mc.post.call_args.kwargs
@@ -865,8 +865,8 @@ class TestToolRecallMemory:
{"scope": "TEAM", "content": "We use Python 3.11"},
]
mc = _make_http_mock(get_resp=_resp(200, memories))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="capital")
assert "[LOCAL]" in result
@@ -878,8 +878,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="anything")
assert result == "No memories found."
@@ -890,8 +890,8 @@ class TestToolRecallMemory:
payload = {"error": "search unavailable"}
mc = _make_http_mock(get_resp=_resp(200, payload))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory()
parsed = json.loads(result)
@@ -901,8 +901,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_exc=RuntimeError("search service down"))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="test")
assert "Error recalling memory" in result
@@ -913,8 +913,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory(query="paris", scope="local")
call_kwargs = mc.get.call_args.kwargs
@@ -928,8 +928,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory()
call_kwargs = mc.get.call_args.kwargs
@@ -942,8 +942,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory(scope="team")
call_kwargs = mc.get.call_args.kwargs
@@ -960,8 +960,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, [{"scope": "GLOBAL", "content": "secret"}]))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=False):
result = await a2a_tools.tool_recall_memory(query="secret")
mc.get.assert_not_called()
@@ -0,0 +1,150 @@
"""Tests for `_enrich_inbound_for_agent` — the poll-path companion to
the push-path enrichment in `a2a_mcp_server._build_channel_notification`.
The MCP poll path (inbox_peek / wait_for_message) returns
`InboxMessage.to_dict()`, which has `activity_id, text, peer_id, kind,
method, created_at` but NOT the registry-resolved `peer_name`,
`peer_role`, or `agent_card_url`. The receiving agent then sees a
plain message and can't tell who's writing — breaking the universal
contract documented in `a2a_mcp_server.py:303-345` ("In both paths
the same fields apply").
The enrichment helper closes that gap. These tests pin:
- canvas_user (peer_id="") passes through unchanged
- peer_agent with cache hit gets peer_name + peer_role + agent_card_url
- peer_agent with cache miss still gets agent_card_url (constructable
from peer_id alone)
- a2a_client unavailable (test harness without registry) degrades
gracefully — agent still gets the bare envelope
"""
from __future__ import annotations
import os
# a2a_client.py reads WORKSPACE_ID at import time and raises if it's
# unset. Stamp a stub before any test pulls in a2a_tools (which transitively
# imports a2a_client). conftest.py mocks the SDK but not this env var.
os.environ.setdefault("WORKSPACE_ID", "00000000-0000-0000-0000-000000000001")
import sys
import types
from unittest.mock import patch
PEER_UUID = "11111111-2222-3333-4444-555555555555"
def test_canvas_user_passes_through_unchanged():
from a2a_tools import _enrich_inbound_for_agent
base = {
"activity_id": "act-1",
"text": "hello from canvas",
"peer_id": "",
"kind": "canvas_user",
"method": "message/send",
"created_at": "2026-05-05T11:00:00Z",
}
out = _enrich_inbound_for_agent(dict(base))
# Plain pass-through — no enrichment fields added for canvas_user.
assert out == base
assert "peer_name" not in out
assert "peer_role" not in out
assert "agent_card_url" not in out
def test_peer_agent_cache_hit_adds_name_role_and_card_url():
from a2a_tools import _enrich_inbound_for_agent
record = {"name": "ops-agent", "role": "sre"}
card_url = f"https://platform.example/registry/{PEER_UUID}/agent-card"
with patch(
"a2a_client.enrich_peer_metadata_nonblocking",
return_value=record,
), patch(
"a2a_client._agent_card_url_for",
return_value=card_url,
):
out = _enrich_inbound_for_agent({
"activity_id": "act-2",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:01:00Z",
})
assert out["peer_name"] == "ops-agent"
assert out["peer_role"] == "sre"
assert out["agent_card_url"] == card_url
def test_peer_agent_cache_miss_still_gets_agent_card_url():
"""agent_card_url is constructable from peer_id alone — surface it
even when registry enrichment misses, so the receiving agent has a
single endpoint to hit for the peer's full capability list."""
from a2a_tools import _enrich_inbound_for_agent
card_url = f"https://platform.example/registry/{PEER_UUID}/agent-card"
with patch(
"a2a_client.enrich_peer_metadata_nonblocking",
return_value=None, # cache miss
), patch(
"a2a_client._agent_card_url_for",
return_value=card_url,
):
out = _enrich_inbound_for_agent({
"activity_id": "act-3",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:02:00Z",
})
assert "peer_name" not in out
assert "peer_role" not in out
assert out["agent_card_url"] == card_url
def test_peer_agent_a2a_client_unavailable_degrades_gracefully(monkeypatch):
"""If a2a_client can't be imported (test harness, partial install),
return the bare envelope — agent still gets text + peer_id + kind +
activity_id, just without the friendly identity."""
from a2a_tools import _enrich_inbound_for_agent
# Stub a2a_client import to fail.
real_module = sys.modules.pop("a2a_client", None)
fake = types.ModuleType("a2a_client")
# Deliberately omit enrich_peer_metadata_nonblocking and
# _agent_card_url_for so the helper's fallback path fires.
sys.modules["a2a_client"] = fake
try:
out = _enrich_inbound_for_agent({
"activity_id": "act-4",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:03:00Z",
})
finally:
if real_module is not None:
sys.modules["a2a_client"] = real_module
else:
sys.modules.pop("a2a_client", None)
# Bare envelope passes through — receiving agent still has enough
# to act, even if the friendly identity is missing.
assert out["peer_id"] == PEER_UUID
assert out["text"] == "ping"
assert out["kind"] == "peer_agent"
assert "peer_name" not in out
assert "peer_role" not in out
assert "agent_card_url" not in out
+69
View File
@@ -0,0 +1,69 @@
"""Drift gate + smoke tests for ``a2a_tools_memory`` (RFC #2873 iter 4c).
The full behavior matrix (RBAC denies, scope enforcement, platform
HTTP error paths) lives in ``test_a2a_tools_impl.py`` (TestToolCommitMemory
+ TestToolRecallMemory) which patches `a2a_tools_memory.foo` after the
iter 4c retarget.
This file pins:
1. **Drift gate** — every previously-public symbol on ``a2a_tools``
(``tool_commit_memory``, ``tool_recall_memory``) is the EXACT same
callable as ``a2a_tools_memory.foo``. Refactor wrapping silently
loses the existing test coverage; this gate makes that drift fail
fast.
2. **Import contract** — ``a2a_tools_memory`` does NOT pull in
``a2a_tools`` at module-load time. The handlers depend on
``a2a_tools_rbac`` (the layered architecture) and ``a2a_client``,
not on the kitchen-sink module that re-exports them.
"""
from __future__ import annotations
import sys
import pytest
@pytest.fixture(autouse=True)
def _require_workspace_id(monkeypatch):
monkeypatch.setenv("WORKSPACE_ID", "00000000-0000-0000-0000-000000000000")
monkeypatch.setenv("PLATFORM_URL", "http://test.invalid")
yield
# ============== Drift gate ==============
class TestBackCompatAliases:
def test_tool_commit_memory_alias(self):
import a2a_tools
import a2a_tools_memory
assert a2a_tools.tool_commit_memory is a2a_tools_memory.tool_commit_memory
def test_tool_recall_memory_alias(self):
import a2a_tools
import a2a_tools_memory
assert a2a_tools.tool_recall_memory is a2a_tools_memory.tool_recall_memory
# ============== Import contract ==============
class TestImportContract:
def test_memory_module_does_not_load_a2a_tools(self, monkeypatch):
"""`a2a_tools_memory` must depend on `a2a_tools_rbac` (the layered
architecture) and `a2a_client`, NEVER on the kitchen-sink
`a2a_tools`. Top-level `from a2a_tools import …` would defeat
the modularization goal and risk a circular-import."""
# Drop both modules to control import order
for m in ("a2a_tools", "a2a_tools_memory"):
sys.modules.pop(m, None)
# Import memory module. Should succeed without a2a_tools loaded.
import a2a_tools_memory # noqa: F401
assert "a2a_tools_memory" in sys.modules
def test_a2a_tools_re_exports_memory_handlers(self):
"""The opposite direction: a2a_tools must surface every memory
symbol so existing call sites + tests work unchanged."""
import a2a_tools
assert hasattr(a2a_tools, "tool_commit_memory")
assert hasattr(a2a_tools, "tool_recall_memory")
+5 -5
View File
@@ -63,7 +63,7 @@ async def test_commit_memory_success(monkeypatch):
mcp = _load_mcp()
client = FakeClient()
monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
result = await mcp.handle_tool_call("commit_memory", {
"content": "Architecture decision: use Go for backend",
@@ -92,7 +92,7 @@ async def test_commit_memory_default_scope(monkeypatch):
mcp = _load_mcp()
client = FakeClient()
monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
result = await mcp.handle_tool_call("commit_memory", {
"content": "Some note",
@@ -108,7 +108,7 @@ async def test_recall_memory_success(monkeypatch):
mcp = _load_mcp()
client = FakeClient()
monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
result = await mcp.handle_tool_call("recall_memory", {"query": "architecture"})
@@ -127,7 +127,7 @@ async def test_recall_memory_empty(monkeypatch):
async def get(self, url, params=None, headers=None, **kwargs):
return FakeResponse(200, [])
monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: EmptyClient())
monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: EmptyClient())
result = await mcp.handle_tool_call("recall_memory", {})
assert "No memories found" in result
@@ -139,7 +139,7 @@ async def test_recall_memory_with_scope_filter(monkeypatch):
mcp = _load_mcp()
client = FakeClient()
monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
await mcp.handle_tool_call("recall_memory", {"scope": "TEAM"})
+2 -2
View File
@@ -357,7 +357,7 @@ class TestA2AToolCommitMemoryRedactsSecrets:
fake_client.post = _capture
with patch("a2a_tools.httpx.AsyncClient", return_value=fake_client):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=fake_client):
await a2a_tools.tool_commit_memory(content_with_secret)
stored = captured.get("content", "")
@@ -385,7 +385,7 @@ class TestA2AToolCommitMemoryRedactsSecrets:
fake_client.post = _capture
with patch("a2a_tools.httpx.AsyncClient", return_value=fake_client):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=fake_client):
await a2a_tools.tool_commit_memory(f"key={key}")
stored = captured.get("content", "")