Compare commits

72 Commits

Author SHA1 Message Date
molecule-ai[bot] a869bc1536 Merge pull request #2963 from Molecule-AI/staging
staging → main: auto-promote 7ee696e
2026-05-05 17:21:02 -07:00
Hongming Wang d3e115cb06 Merge pull request #2972 from Molecule-AI/fix/a2a-poll-queued-envelope-2967
fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)
2026-05-06 00:05:27 +00:00
Hongming Wang b372c265ab Merge pull request #2968 from Molecule-AI/fix-chat-platform-pending-scheme
fix(canvas/chat): handle platform-pending: scheme for poll-mode upload downloads (PR #2966 followup)
2026-05-06 00:04:49 +00:00
Hongming Wang 146c0e7c60 fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)
workspace-server's a2a_proxy poll-mode short-circuit returns

    {status: "queued", delivery_mode: "poll", method: <a2a_method>}

when the peer has no URL to dispatch to (poll-mode peers, including
every external molecule-mcp standalone runtime). The bare
send_a2a_message parser only knew about JSON-RPC {result, error}
keys, so this envelope fell through to the "unexpected response shape"
error path. Two production symptoms on the reno-stars tenant traced
to it:

1. File transfer logged as failed when it actually succeeded —
   operator-facing logs showed an A2A_ERROR but the receiving
   workspace did get the chunked file via the agent's fallback path.
2. delegate_task retried after the false failure → peer received
   duplicate delegations → conversation got confused, the second
   peer self-diagnosed in a notify ("⚠️ Peer 二次请求 — 我先不执行",
   roughly "second request from peer; not executing yet").

Add a third branch to the parser, BETWEEN the existing JSON-RPC
{result, error} cases and the catch-all "unexpected" fallback. The
queued envelope is delivery-acknowledged-but-pending-consumption —
not an error — so it returns a clean success string the agent can
render as a normal outcome. The success string includes "queued"
and "poll" so an operator scanning logs sees the routing path
without parsing JSON.

Defensive: the new branch only fires when BOTH status="queued" AND
delivery_mode="poll" are present. A partial envelope (one key
missing) still falls through to the catch-all, so a future server
bug that emits a malformed shape gets surfaced instead of silently
swallowed.
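
A minimal sketch of the three-branch ordering described above. The
function name `classify_a2a_response` and the exact wording of the
returned strings are assumptions for illustration; only the
`[A2A_ERROR]` prefix, the envelope keys, and the branch order come
from this message.

```python
from typing import Any, Dict

A2A_ERROR_PREFIX = "[A2A_ERROR]"  # prefix named in the test notes; message wording is assumed


def classify_a2a_response(body: Dict[str, Any]) -> str:
    """Hypothetical helper: map a proxy response body to an agent-facing string."""
    # Branch 1: JSON-RPC success.
    if "result" in body:
        return str(body["result"])
    # Branch 2: JSON-RPC error.
    if "error" in body:
        return f"{A2A_ERROR_PREFIX} {body['error']}"
    # Branch 3: poll-mode acknowledgement. BOTH keys must match, so a
    # partial envelope still reaches the catch-all below.
    if body.get("status") == "queued" and body.get("delivery_mode") == "poll":
        method = body.get("method", "unknown")
        return f"queued for poll delivery (method={method})"
    # Catch-all: surface unknown shapes instead of swallowing them.
    return f"{A2A_ERROR_PREFIX} unexpected response shape: {sorted(body)}"
```

Keeping the new branch ahead of the catch-all but behind the JSON-RPC
cases means a body that carries `result` or `error` is never
reinterpreted as a queued acknowledgement.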

Tests:
- test_poll_queued_envelope_returns_success_string — pins the canonical
  envelope returns a non-error string. Discriminating: verified to FAIL
  on old code (returned [A2A_ERROR] string), PASS on new.
- test_poll_queued_envelope_with_other_method — pins the parser doesn't
  hardcode message/send. Discriminating: also FAILS on old code.
- test_status_queued_without_poll_mode_still_falls_through — pins both
  keys are required (defensive against future server bugs).

12 existing tests in TestSendA2AMessage still pass — no regression.

Scope: hotfix for the bare send_a2a_message path. The full SSOT
typed-A2AResponse refactor (#158-#163, parents under #2967) covers the
broader vocabulary alignment between Go server and Python client. This
PR ends the production symptoms now without preempting that work.
2026-05-05 16:58:48 -07:00
Hongming Wang 5d8b5e96e3 fix(canvas/chat): handle platform-pending: scheme for poll-mode upload downloads
Followup to PR #2966. The user reported the about:blank symptom on
reno-stars and the browser console showed:

  Failed to launch 'platform-pending:d76977b1-…/bb0dcaf3-…' because
  the scheme does not have a registered handler.

So the agent's "download link" was a `platform-pending:<wsid>/<file_id>`
URI — the canonical reference for poll-mode chat uploads (see
workspace-server/internal/handlers/chat_files.go:690 +
workspace/inbox_uploads.py). PR #2966 only handled `workspace:`,
`file:///`, and absolute container paths; the platform-pending
scheme fell through to the raw URI which the browser couldn't
navigate to.

Fix
---

- `resolveAttachmentHref`: added a `platform-pending:` branch that
  resolves to `${PLATFORM_URL}/workspaces/<wsid>/pending-uploads/
  <file_id>/content`. Uses the wsid from the URI, NOT the chat's
  workspace_id — these can differ when a file is forwarded across
  workspaces (cross-workspace delegation, agent forwarding).
- New `isPlatformAttachment(uri)` helper — single source of truth
  for "this URI requires our auth headers, route through
  downloadChatFile". Used by both `downloadChatFile` (chip click)
  and ChatTab's markdown-link override.
- ChatTab.tsx markdown-link override now imports
  `isPlatformAttachment` instead of duplicating the scheme list.
  Pre-fix this list was duplicated and missed `platform-pending:`.
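
The URI-to-URL mapping in the first bullet can be sketched as below.
This is Python for illustration only (the canvas code is TypeScript),
and `resolve_platform_pending` plus the malformed-input handling are
assumptions, not the shipped `resolveAttachmentHref`.

```python
from typing import Optional
from urllib.parse import quote

PENDING_SCHEME = "platform-pending:"


def resolve_platform_pending(uri: str, platform_url: str) -> Optional[str]:
    """Return the download URL for a platform-pending URI, or None if it is not one."""
    if not uri.startswith(PENDING_SCHEME):
        return None
    # The workspace ID comes from the URI itself, not from the chat that
    # rendered the link, because a forwarded file can belong to another workspace.
    wsid, sep, file_id = uri[len(PENDING_SCHEME):].partition("/")
    if not (sep and wsid and file_id):
        return None  # malformed reference (assumed handling)
    base = platform_url.rstrip("/")
    return (
        f"{base}/workspaces/{quote(wsid, safe='')}"
        f"/pending-uploads/{quote(file_id, safe='')}/content"
    )
```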

Tests
-----

The 4 IME tests still pass; tsc clean. The platform-pending resolution
is exercised via the `isPlatformAttachment` SSOT helper (any URI
reaching `downloadChatFile` or the markdown override goes through
it). A dedicated test for the URL shape would need a more elaborate
fixture; manual verification on staging post-deploy is the practical
gate.

Reported on production reno-stars 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:55:43 -07:00
Hongming Wang dc6e1ac2bf Merge pull request #2966 from Molecule-AI/fix-chat-ime-and-download-link
fix(canvas/chat): IME-safe Enter + markdown link target/scheme handling
2026-05-05 23:52:54 +00:00
Hongming Wang c2e12f3fb6 fix(canvas/chat): IME-safe Enter + markdown link target/scheme handling
Two production-reported regressions in the same chat surface, fixed
in one focused PR.

Issue 1 — IME composition + Enter sends half-typed message
----------------------------------------------------------

ChatTab's textarea onKeyDown was:

  if (e.key === "Enter" && !e.shiftKey) {
    e.preventDefault();
    sendMessage();
  }

For users typing Chinese / Japanese / Korean via the system IME, Enter
commits the candidate selection — not a newline, not a send. With
the old check, every IME-commit Enter accidentally sent the
half-typed message ("你好" + half-typed-pinyin + Enter to commit
the next candidate → message goes out before the user finishes).

Fix: send only when `e.nativeEvent.isComposing` is false AND `e.keyCode !== 229`.
The latter covers older Safari / WebKit-based mobile browsers that
delay setting isComposing on the composition-end Enter.
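
The guard reduces to a small predicate over the key event. A sketch
in Python rather than the component's TypeScript; `KeyEvent` and
`should_send` are hypothetical names that model only the fields the
check reads.

```python
from dataclasses import dataclass

IME_KEYCODE = 229  # legacy keyCode reported while an IME is processing input


@dataclass
class KeyEvent:
    """Hypothetical stand-in for the fields of a DOM keydown event."""
    key: str
    shift_key: bool = False
    is_composing: bool = False
    key_code: int = 13


def should_send(e: KeyEvent) -> bool:
    """True only for a plain Enter that is not part of an IME composition."""
    if e.key != "Enter" or e.shift_key:
        return False  # other keys, and Shift+Enter (newline), never send
    if e.is_composing or e.key_code == IME_KEYCODE:
        return False  # Enter is committing an IME candidate
    return True
```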

Issue 2 — markdown links land at about:blank
---------------------------------------------

ReactMarkdown's default `<a>` rendering passes the agent-supplied
href directly to the DOM with no target / scheme handling:

  - http(s) → navigates the canvas tab away (canvas state lost)
  - workspace://path / file:///workspace/... / /workspace/... →
    browser hits unhandled-protocol click → about:blank, no
    download (the reported bug)

Fix: ReactMarkdown `components.a` override:

  - In-container paths (workspace:, file:///{workspace,configs,home,
    plugins}, bare /{workspace,configs,...}) → preventDefault, route
    through downloadChatFile (same auth path the AttachmentChip
    uses). Filename is derived from the path's last segment.
  - External (http/https/mailto/unknown scheme) → target="_blank"
    rel="noopener noreferrer" so canvas state survives.

Tests
-----

ChatTab.imeAndLinks.test.tsx (4 tests):
  - Enter with isComposing=true → does NOT send, input preserved
  - Enter with keyCode=229 (older-Safari IME) → does NOT send
  - Enter with no IME signal → DOES send (happy path intact)
  - Shift+Enter → does NOT send (newline path intact)

The link-component override is exercised through the full ChatTab
render — the IME tests are jsdom-only and don't load chat history
with markdown messages, so the link test would need a more elaborate
fixture. Manual verification on staging post-deploy is the practical
gate; if the link test grows critical the AttachmentViews-style chip
test can extend.

Verified:
- tsc --noEmit clean
- 4/4 IME tests pass

Reported on production 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:47:04 -07:00
Hongming Wang dd5df70e59 Merge pull request #2965 from Molecule-AI/rfc-2945-pr-b-typed-events
feat(events): typed EventType registry — SSOT for WS event names (RFC #2945 PR-B)
2026-05-05 16:35:47 -07:00
Hongming Wang f1dc721eeb Merge pull request #2964 from Molecule-AI/fix/delegation-ledger-utf8-truncate-2962
fix(delegation_ledger): rune-safe preview truncation (#2962)
2026-05-05 23:34:57 +00:00
Hongming Wang 5b78bea10d feat(events): typed EventType registry — single source of truth for WS event names (RFC #2945 PR-B)
Pre-RFC-#2945, every BroadcastOnly / RecordAndBroadcast call site
passed a bare string literal:

  h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)

29 producers (Go, ~30 call sites in handlers/, scheduler/, registry/,
bundle/) and ~30 canvas consumers (TS store + listeners) duplicated
the same string with no shared definition. A producer renaming an
event silently broke every consumer — same drift class that produced
the reno-stars data-loss regression on the persistence side. PR-A
fixed the persistence-side SSOT (AgentMessageWriter); PR-B fixes the
event-name SSOT.

What this PR ships

  internal/events/types.go
    - EventType typed string + 29 named constants covering the full
      taxonomy (chat / lifecycle / agent assignment / delegation /
      task / approval / auth).
    - Grouped semantically; new constants must be added here AND
      mirrored in canvas/src/lib/ws-events.ts (parity gate landing
      in PR-B-2 follow-up).
    - AllEventTypes slice — authoritative list for the snapshot
      test + the cross-language parity gate.

  internal/events/types_test.go (3 tests)
    - TestAllEventTypes_IsSnapshot: pins the canonical list. Adding
      a new constant without updating AllEventTypes (or vice versa)
      fails with a one-line diff.
    - TestEventType_NoEmptyConstants: catches accidentally-empty
      values (typo in types.go: const X EventType = ...).
    - TestEventType_AllUppercaseSnakeCase: pins the wire format that
      canvas TS switch statements assume (no kebab-case, no mixed
      case, no leading/trailing/double underscores).

  agent_message_writer.go (single migration)
    - Demonstrates the constant-usage shape:
        events.EventAgentMessage  →  "AGENT_MESSAGE"
    - Other ~30 call sites stay on bare strings for now (this PR
      stays narrow); the migration happens in PR-B-1 follow-up. Both
      shapes (constant + bare string) co-exist on the wire — the
      typed version is just the recommended path for new code.
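
The wire-format rule that TestEventType_AllUppercaseSnakeCase pins can
be written as one regular expression. A sketch in Python rather than
Go; `is_wire_format` is a hypothetical name, and allowing digits
after the first letter is an assumption the message does not state.

```python
import re

# Uppercase words joined by single underscores: no leading, trailing,
# or doubled underscores, no lowercase, no hyphens.
# Assumption: digits are permitted after the first letter.
WIRE_FORMAT = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")


def is_wire_format(name: str) -> bool:
    """True if name matches the uppercase snake-case wire format."""
    return WIRE_FORMAT.fullmatch(name) is not None
```

Using `fullmatch` rather than `match` matters here: a prefix match
would accept `AGENT_` or `AGENT-message` on the strength of the
leading `AGENT`.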

Why ship this in stages

  1. PR-B (this): types + tests + first migration → MERGEABLE NOW,
     low risk.
  2. PR-B-1 (follow-up): migrate the remaining ~30 call sites to
     constants. Mechanical, low-risk.
  3. PR-B-2 (follow-up): canvas/src/lib/ws-events.ts mirror + cross-
     language parity gate. Touches both repos.

Per memory feedback_oss_design_philosophy.md (every refactor toward
OSS plugin shape) — this surface is now plugin-safe: external
implementations can import the events package and get the same
named taxonomy without copying strings.

Verified
- go vet ./internal/events/ clean
- go build ./... clean
- TestAllEventTypes_IsSnapshot + TestEventType_* all pass
- TestAgentMessageWriter_* (the only call site touched) still green

Refs RFC #2945, PR #2949 (PR-A SSOT), PR #2944 (reno-stars).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:25:38 -07:00
Hongming Wang a5903af459 fix(delegation_ledger): rune-safe preview truncation (#2962)
The previous byte-slice form `s[:previewCap]` could split a multi-byte
codepoint at byte 4096, producing invalid UTF-8. Postgres JSONB rejects
the row → ledger insert silently fails → audit gap on dashboards while
activity_logs continues to record the event.

Walk the string by rune index and stop at the last boundary that fits
inside the cap. ASCII-only strings still hit the cap exactly; CJK/emoji
strings stop slightly under, never over.
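
The boundary walk can be illustrated outside Go. A Python sketch of
the same idea, not the ledger's implementation; `truncate_utf8` is a
hypothetical name, and the cap is measured in UTF-8 bytes as in the
description above.

```python
def truncate_utf8(s: str, max_bytes: int) -> str:
    """Return the longest prefix of s whose UTF-8 encoding fits in max_bytes."""
    used = 0
    for i, ch in enumerate(s):
        width = len(ch.encode("utf-8"))  # 1 to 4 bytes per code point
        if used + width > max_bytes:
            return s[:i]  # stop at the last boundary that still fits
        used += width
    return s
```

With a 4-byte cap, an ASCII string is cut at exactly 4 bytes, while a
string of 3-byte CJK characters keeps one character (3 bytes) rather
than splitting the second.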

Mirrors the truncatePreviewRunes fix shipped for agent_message_writer
in #2959. Followup: deduplicate into a shared helper once both have
landed.

Tests: 2 regression tests using utf8.ValidString — one with an all-3-byte
rune string just over the cap, one with a single multi-byte rune sitting
exactly on the boundary. Verified on the previous byte-slice impl: both
new tests would fail (invalid UTF-8 + truncation past cap by 1 byte).
2026-05-05 16:19:51 -07:00
Hongming Wang 07d09f3696 Merge pull request #2959 from Molecule-AI/rfc-2945-pr-a-followup-utf8-and-db-errors
fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (PR-A followup)
2026-05-05 16:19:29 -07:00
Hongming Wang f7c270bf24 Merge pull request #2955 from Molecule-AI/auto-sync/main-e0df90c2
chore: sync main → staging (auto, ff to e0df90c2)
2026-05-05 16:19:03 -07:00
Hongming Wang 0301f90183 Merge pull request #2961 from Molecule-AI/fix/doctor-register-side-effect
fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT) — self-review of #2954
2026-05-05 23:18:54 +00:00
Hongming Wang feef80423b Merge pull request #2958 from Molecule-AI/fix/external-connect-templates-mcp-command
fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)
2026-05-05 16:18:23 -07:00
Hongming Wang 469b24ff8f Merge pull request #2960 from Molecule-AI/fix/memory-tab-v2-self-review
fix(memory): self-review on PR #2956 — drop speculative field, tighten 503 match
2026-05-05 23:15:50 +00:00
Hongming Wang c4d3c9a451 fix(memory): self-review on PR #2956 — drop speculative field, tighten 503 match
Two issues caught in five-axis self-review of #2956:

## 1. Drop speculative source_workspace_id rendering

The panel rendered a "from peer" badge based on
`propagation.source_workspace_id`, claiming it surfaced cross-
workspace propagation. But the OpenAPI spec at
docs/api-protocol/memory-plugin-v1.yaml documents `propagation` as
"Opaque metadata the plugin stores and returns. Reserved for future
cross-namespace propagation semantics" — and a grep across
workspace-server/internal/memory/ confirms NO writer in the codebase
populates that key. The badge would never render against real data.

Violates "don't design for hypothetical future requirements" from
the project conventions. Drop the field from MemoryV2, the row badge,
the test fixtures, and the JSDoc. When propagation gains a concrete
shape, re-add backed by an actual writer.

## 2. Tighten 503 detection — match the literal contract string

Pre-fix detection: `msg.includes('503') || msg.toLowerCase().includes('plugin is not configured')`
False-positives on any unrelated 503 + on any error whose text
contains "plugin is not configured", whatever its origin.

Post-fix: `msg.includes('MEMORY_PLUGIN_URL')` — the env var name is a
hard-coded literal in workspace-server/internal/handlers/memories_v2.go's
available() error, so this is a pinned cross-layer contract. Drift
between the Go error message and the canvas detection now fails
loud (TestMemoriesV2_PluginUnwired_All503 asserts the env var name
in the response body; the canvas test asserts the same).

Extracted as a named export `isPluginUnavailableError` so the
detection is unit-testable and reusable. Added 4 direct tests:
contract-string match, generic 503 rejected, 401 rejected,
non-Error inputs rejected.
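
The tightened predicate is small enough to show whole. A Python
sketch of the logic only; the shipped helper is the TypeScript export
`isPluginUnavailableError`, and the snake-case name here is an
illustrative stand-in.

```python
CONTRACT_MARKER = "MEMORY_PLUGIN_URL"  # literal the server embeds in its 503 body


def is_plugin_unavailable_error(err: object) -> bool:
    """True only for an exception whose message carries the contract marker."""
    if not isinstance(err, Exception):
        return False  # strings, None, and other non-error inputs never match
    return CONTRACT_MARKER in str(err)
```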

## Test results

- 30 component tests pass (was 26; +4 for isPluginUnavailableError)
- Coverage on MemoryInspectorPanel.tsx: 100% lines, 100% functions
  (branch coverage up to 85.9% from 84.7% — speculative-field
  branches no longer count)
- Full canvas suite: 1277/1277 pass across 91 files
2026-05-05 16:11:13 -07:00
Hongming Wang 2652ea8342 fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT)
Self-review caught after #2954 landed: check_register() POSTed to
/registry/register with agent_card.name="doctor-probe". The endpoint
is an UPSERT, so the doctor probe overwrites the workspace's actual
agent_card metadata until the real agent's next register call. An
operator running `molecule-mcp doctor` against a live workspace
would see their canvas briefly display "doctor-probe" as the agent
name: a silent production disruption.

Switches to POST /registry/heartbeat. heartbeat only updates
last_heartbeat_at (and clears awaiting_agent if needed) — the same
work a normal molecule-mcp boot does every 20s in steady state, so
the doctor's extra heartbeat is indistinguishable from background
traffic.

Function renamed check_register → check_token_auth to match what
it actually does. check_register kept as back-compat alias so any
external test/import still resolves.

Also unified the duplicated token-resolution paths into a single
_resolve_token() returning (value, source_label). Pre-fix:
check_register and _resolve_token_summary read env in parallel
ladders — a future env-var addition would have to touch both.
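
A single resolver that returns the value together with a label for
where it came from might look like the sketch below. The env-var
names `EXAMPLE_TOKEN` and `EXAMPLE_TOKEN_FILE`, the label strings,
and the lookup order are all assumptions; the real names and
precedence live in mcp_doctor.py.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple

TOKEN_ENV = "EXAMPLE_TOKEN"            # hypothetical variable name
TOKEN_FILE_ENV = "EXAMPLE_TOKEN_FILE"  # hypothetical variable name


def resolve_token(env: Mapping[str, str], auth_file: Path) -> Tuple[Optional[str], str]:
    """Return (token, source_label); token is None when nothing is configured."""
    direct = env.get(TOKEN_ENV, "").strip()
    if direct:
        return direct, f"env:{TOKEN_ENV}"
    # Candidate files in assumed precedence order: *_FILE, then .auth_token.
    candidates = []
    file_path = env.get(TOKEN_FILE_ENV, "").strip()
    if file_path:
        candidates.append(Path(file_path))
    candidates.append(auth_file)
    for path in candidates:
        if path.is_file():
            value = path.read_text(encoding="utf-8").strip()
            if value:
                return value, f"file:{path}"
    return None, "missing"
```

Because both the auth check and the summary line would read from this
one function, adding a new token source touches a single ladder.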

New tests:
  - test_check_token_auth_uses_heartbeat_endpoint: mocks urlopen,
    asserts the URL ends in /registry/heartbeat AND does NOT
    contain /registry/register. Pins the load-bearing invariant
    so a future refactor can't silently re-route through register.
  - test_resolve_token_returns_value_and_label_for_env: pins the
    consolidated resolver returns both pieces of info from the
    same source-decision.
  - test_resolve_token_returns_none_when_missing: missing-env
    happy path.

Verification:
  - 13/13 tests pass (10 existing + 3 new)
  - Manual stripped-env run still renders 4 FAIL + 2 WARN with
    actionable hints, exit 1.

Refs molecule-core#2934 item 6 (doctor side-effect fix-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:11:08 -07:00
Hongming Wang 1e01083e55 fix(handlers): UTF-8-safe preview truncation + distinguish DB errors from not-found (RFC #2945 PR-A followup)
Self-review of PR #2949 surfaced two pre-existing defects that the
SSOT consolidation inherited from the original /notify handler. Both
are addressable in a small follow-up; shipping them as a separate PR
keeps the consolidation and the bug-fix individually reviewable.

Critical: byte-slice preview truncation produces invalid UTF-8
-------------------------------------------------------------

Pre-fix:

    if len(preview) > 80 {
        preview = preview[:80] + "…"
    }

`len()` returns BYTES; `preview[:80]` slices on a byte boundary. For
agent-authored chat in CJK / emoji / accented characters, byte 80
lands mid-codepoint → invalid UTF-8 → Postgres JSONB rejects → INSERT
fails → activity_log row never written → message vanishes from chat
history on the next reload. The persistence-failure log fires but
operators have to grep to find it, and the user-visible regression
mode is identical to reno-stars.

Fix: extract `truncatePreviewRunes(s, maxRunes)` that walks the rune
boundary using `for i := range s` (Go's range over string yields rune
start indices). Cap at 80 RUNES not bytes — UI-friendly count, not
storage count.

Important: workspace-lookup error path swallows real DB errors
--------------------------------------------------------------

Pre-fix:

    if err := w.db.QueryRowContext(...).Scan(&wsName); err != nil {
        return ErrWorkspaceNotFound
    }

Conflates `sql.ErrNoRows` (legit not-found → caller 404) with real
DB errors (connection drop, query timeout, pool exhaustion → caller
should 503). During a Postgres outage every notify call surfaced as
"workspace not found" — masking the actual incident in alerting and
making the symptom indistinguishable from "you typed a bad workspace
ID".

Fix: distinguish via `errors.Is(err, sql.ErrNoRows)` and wrap
non-not-found errors with `fmt.Errorf("agent_message: workspace
lookup: %w", err)`. Callers' existing fallback path (return 500 /
return error wrapped) handles the new shape correctly without any
changes — verified by running existing TestNotify_* and
TestMCPHandler_SendMessage_* tests.
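
The same not-found versus real-failure split, shown as a Python
analogue using the standard sqlite3 module. This is not the Go
handler: the exception classes, the `workspaces` table, and its
columns are hypothetical, and `raise ... from err` stands in for
`%w` wrapping.

```python
import sqlite3


class WorkspaceNotFound(LookupError):
    """The query ran and matched no row (caller maps this to 404)."""


class WorkspaceLookupFailed(RuntimeError):
    """The query itself failed (caller maps this to a 5xx)."""


def lookup_workspace_name(conn: sqlite3.Connection, workspace_id: str) -> str:
    """Return the workspace name, keeping 'no row' distinct from 'query failed'."""
    try:
        row = conn.execute(
            "SELECT name FROM workspaces WHERE id = ?", (workspace_id,)
        ).fetchone()
    except sqlite3.Error as err:
        # Preserve the original error as __cause__ so alerting can inspect it.
        raise WorkspaceLookupFailed("agent_message: workspace lookup") from err
    if row is None:
        raise WorkspaceNotFound(workspace_id)
    return row[0]
```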

Tests added (3 new, 11 total writer tests)
------------------------------------------

- TestTruncatePreviewRunes_RuneBoundary: 8-case table — ASCII, CJK,
  exactly-at-max, emoji prefix. Asserts both correct visible output
  AND `utf8.ValidString` on every result so the bug shape (invalid
  UTF-8) can't recur.

- TestAgentMessageWriter_Send_NonASCIIMessagePersists: end-to-end
  with a 200-rune CJK message (exceeds the 80-rune cap, would have
  hit the byte-slice bug). Pins the INSERT summary contains valid
  UTF-8 with exactly 80-rune body + ellipsis.

- TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped: pins the
  DB-outage path returns a wrapped non-ErrWorkspaceNotFound error so
  alerting can distinguish 404 from 503. Verified via mock
  ExpectQuery returning a transient error.

Verified
--------

- `go vet ./internal/handlers/` clean
- `go build ./...` clean
- All 14 writer + caller tests pass (8 original + 3 new + AST gate +
  TestNotify_* + TestMCPHandler_SendMessage_* sibling tests)

Per memory feedback_assert_exact_not_substring.md: every new test
asserts boundary behavior directly (UTF-8 validity, exact rune count,
errors.Is comparison) rather than substring-match in stringified
output.

Refs RFC #2945, PR #2949, PR #2944.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:10:58 -07:00
Hongming Wang eab36e217e fix(external-connect): use molecule-mcp wrapper in Codex/OpenClaw templates (#2957)
The External Connect modal's Codex and OpenClaw tabs were rendering
this MCP server config:

  command = "python3"
  args = ["-m", "molecule_runtime.a2a_mcp_server"]

That spawns the bare MCP dispatcher with no presence wiring. The
``molecule-mcp`` console-script wrapper (mcp_cli.main) is what calls
``POST /registry/register`` at startup and runs the 20s heartbeat
thread alongside the MCP stdio loop. Without the wrapper, the canvas
flips the workspace back to ``awaiting_agent`` (OFFLINE) within
60-90s — even while tools work — because nothing is heartbeating.

Operator-side this looks like: the workspace is registered and tools
work fine when invoked, but the canvas shows "offline" / "Restart"
CTA, peer agents see the workspace as awaiting_agent in list_peers
output, and inbound A2A delivery silently fails the readiness check.
A new external-Codex operator (#2957) hit this and spent debugging
time on what should have been a copy-paste install.

Fix: switch both Codex and OpenClaw templates to
``command = "molecule-mcp"`` / ``args = []``, matching the universal
MCP template that already handles this correctly. Inline comment in
each template explains the wrapper-vs-bare-module tradeoff so a
future template author doesn't regress to the shorter form.

Hermes-channel intentionally still spawns the bare module — the
hermes plugin owns the platform plugin path and runs its own
register_platform/heartbeat code in-process; double-heartbeating
would race. Universal/Codex/OpenClaw all need the wrapper.

Regression gate: TestExternalMcpTemplates_UseMoleculeMcpWrapper
asserts the three templates that must use the wrapper actually do,
and explicitly fails on the old ``-m molecule_runtime.a2a_mcp_server``
shape. Verified the test FAILS on pre-fix source by stashing only
external_connection.go and re-running.

Source: molecule-core#2957 issue 1 (item 4 of the report — the
``(codex returned empty output)`` / opaque-canvas-error / stale-
session items live in codex-channel-molecule and are tracked
separately).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 16:06:02 -07:00
Hongming Wang 7ee696ec9a Merge pull request #2954 from Molecule-AI/feat/molecule-mcp-doctor
feat(mcp): add molecule-mcp doctor onboarding diagnostic (#2934 item 6)
2026-05-05 22:57:28 +00:00
Hongming Wang decec9b9a1 Merge pull request #2956 from Molecule-AI/feat/memory-tab-v2-redesign
feat(memory): redesign Memory tab for v2 plugin
2026-05-05 22:56:55 +00:00
Hongming Wang ada27fdb5d Merge pull request #2949 from Molecule-AI/rfc-2945-pr-a-agent-message-writer
refactor(handlers): AgentMessageWriter SSOT — consolidate Notify + MCP send_message_to_user (RFC #2945 PR-A)
2026-05-05 22:56:28 +00:00
Hongming Wang f0f4d0e761 feat(memory): redesign Memory tab for v2 plugin
Replaces the v1 LOCAL/TEAM/GLOBAL tab trio (mapped to the deprecated
shared_context model) with a v2 plugin-driven UI. Without this,
canvas Memory tab was reading the frozen agent_memories table while
all post-cutover agent writes went to the plugin's memory_records —
the tab silently displayed stale data.

## Backend (workspace-server)

New routes under wsAuth, all behind the existing per-tenant token:

  GET    /workspaces/:id/v2/namespaces      → readable + writable lists
  GET    /workspaces/:id/v2/memories        → plugin search proxy
  DELETE /workspaces/:id/v2/memories/:mid   → plugin forget proxy

memories_v2.go — slim handler:
  - Server-side ACL: every search request is intersected with the
    resolver's readable-namespaces set (canvas-supplied namespace
    that the workspace can't read returns [] not 403, matches v1
    existence-non-inferring shape).
  - Returns 503 with "set MEMORY_PLUGIN_URL" hint when plugin
    isn't wired (canvas surfaces a banner).
  - Maps plugin not_found → 404, other plugin errors → 502.
  - View shaping: NamespaceView.label rendered server-side
    ("Workspace (abc-1234)", "Team (t-99)", "Org (acme)", custom)
    so canvas doesn't parse namespace names. MemoryView surfaces
    pin/expires_at/score/source_workspace_id from Propagation.
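
The server-side ACL rule in the first bullet (intersect, and answer
an unreadable namespace with an empty list rather than 403) can be
sketched as follows. Python for illustration; `namespaces_to_search`
is a hypothetical name, not the Go handler's.

```python
from typing import Iterable, List, Optional


def namespaces_to_search(requested: Optional[str], readable: Iterable[str]) -> List[str]:
    """Intersect the caller's requested namespace with what the workspace may read."""
    allowed = set(readable)
    if requested is None:
        return sorted(allowed)  # "All namespaces": everything readable
    # An unreadable namespace yields an empty list, so the response does
    # not reveal whether that namespace exists.
    return [requested] if requested in allowed else []
```

An empty result here would translate to an empty search response
without contacting the plugin at all.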

memories_v2_test.go — 100% line + 100% function coverage:
  - 503 path on every endpoint when unwired
  - Namespaces success + readable/writable error paths
  - Search: empty intersection, full-path query/kind/limit
    propagation, namespace=/no-namespace branches, propagation
    map missing/wrong-type, intersect error, plugin error
  - Forget: success, plugin not_found→404, other plugin
    errors→502, missing memoryId→400
  - Helpers: namespaceLabel for all 4 kinds + truncation,
    parseLimit edge cases (default/0/negative/over-cap/non-num),
    memoryToView field round-trip, indexOfColon, shortID

## Frontend (canvas)

MemoryInspectorPanel rewritten for v2:
  - Drop LOCAL/TEAM/GLOBAL trio. Namespace dropdown driven by
    GET /v2/namespaces.readable, "All namespaces" default.
  - New per-row badges: kind (F/S/C), source (agent/runtime/user),
    pin (📌), TTL countdown (12h / "expired"), score% on
    semantic search, source-workspace ⇡ws-pee for propagated.
  - Drop Edit button — v2 plugin contract has no PATCH; the
    model is forget + recommit. Forget stays.
  - Plugin-unavailable banner with operator hint when /v2/*
    returns 503.
  - Bug fix surfaced by test: rollback-on-failed-delete order
    of operations (loadEntries() called setError(null) AFTER
    we set the failure message, wiping it). Reload first, then
    set the error.

MemoryEditorDialog deleted — Add was POST /memories which v2
doesn't support from canvas (writes go via MCP). The legacy
Edit-flow tests go with it.

## Test results

Backend: `go test ./internal/handlers/` — all pass
Backend coverage on memories_v2.go: 100% lines, 100% functions
Canvas: `vitest run` — 91 files, 1273 tests pass (26 new)
Canvas coverage on MemoryInspectorPanel.tsx: 100% lines,
  100% functions, 96.7% statements, 84.7% branches
  (uncovered branches are defensive `?? fallback` for
   contract-impossible kind/source values)

## Migration note

The legacy v1 GET/POST/PATCH/DELETE on /workspaces/:id/memories
remains in place for the back-compat MCP shim (mcp_tools_memory_v2's
legacy routing) and admin export/import. PR-9 (#283) drops
agent_memories along with the v1 endpoints once the cutover
verification window closes.
2026-05-05 15:53:28 -07:00
molecule-ai[bot] e0df90c294 Merge pull request #2951 from Molecule-AI/staging
staging → main: auto-promote 1edee11
2026-05-05 15:48:32 -07:00
Hongming Wang f01f374072 feat(mcp): add molecule-mcp doctor onboarding diagnostic
Closes #2934 item 6 — the deferred follow-up from Ryan's onboarding-
friction report. Quote: "this single command would have saved me
30 of the 45 minutes."

When push delivery fails or the install half-works, the operator
today has no signal — they hand-grep the Claude Code binary or
chase the `from versions: none` red herring. Doctor renders six
checks in one screen with concrete next-step suggestions:

  1. Python version    >=3.11? (wheel's pin)
  2. Wheel install     molecule-ai-workspace-runtime importable +
                        version surfaced
  3. PATH for binary   `molecule-mcp` resolves on PATH; if not,
                        prints the resolved user-site bin dir to
                        add (or recommends pipx)
  4. Env vars          PLATFORM_URL + WORKSPACE_ID + token (env or
                        *_FILE or .auth_token)
  5. Platform reach    GET ${PLATFORM_URL}/healthz returns 2xx
  6. Registry register POST /registry/register with the resolved
                        token returns 2xx — end-to-end auth check

Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint
when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR.

Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable
from CI install-checks.
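
The line format and exit-code contract can be sketched in a few
lines. `CheckResult`, `render`, and `exit_code` are hypothetical
names, and the indentation of the `next:` line is an assumption; the
`[OK|WARN|FAIL] <label>: <status>` shape and the 0/1 rule come from
the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CheckResult:
    level: str                  # "OK", "WARN", or "FAIL"
    label: str
    status: str
    hint: Optional[str] = None  # shown as `next:` when the check is not OK


def render(result: CheckResult) -> str:
    """Format one check as `[LEVEL] label: status`, plus a hint line if needed."""
    line = f"[{result.level}] {result.label}: {result.status}"
    if result.level != "OK" and result.hint:
        line += f"\n    next: {result.hint}"
    return line


def exit_code(results: Iterable[CheckResult]) -> int:
    """0 when every check is OK or WARN, 1 as soon as any check is FAIL."""
    return 1 if any(r.level == "FAIL" for r in results) else 0
```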

## Files

`workspace/mcp_doctor.py`  (new) — six check functions + `run()`
                                   entry point. Uses urllib (stdlib)
                                   so doctor works even on a partial
                                   install where `requests` is missing.

`workspace/mcp_cli.py`             Subcommand dispatch:
                                     molecule-mcp doctor   → mcp_doctor.run()
                                     molecule-mcp --help   → usage banner
                                     molecule-mcp          → server (unchanged)

`workspace/tests/test_mcp_doctor.py`  (new) — 10 tests covering each
                                       check's pass/fail/skip path
                                       plus the end-to-end exit-code
                                       contract on a stripped env.

`scripts/build_runtime_package.py`    Adds `mcp_doctor` to
                                       TOP_LEVEL_MODULES so the
                                       wheel ships the new module.

## Out of scope (deferred follow-ups)
- Claude Code-specific checks (parse ~/.claude.json, verify each
  MCP entry is plugin-sourced + dev-channels flag set). That's a
  separate Claude-Code-shaped doctor; lives in the channel plugin.
- Automated remediation. Doctor is diagnostic — tells the operator
  what's wrong + how to fix it, doesn't apply changes.

## Verification
  - python -m pytest tests/test_mcp_doctor.py -v   → 10/10 PASS
  - python -m pytest tests/test_mcp_cli*.py        → 67/67 PASS
    (existing CLI suite still green; subcommand dispatch added
    before env-validation, doesn't disturb the server-boot path)
  - manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL
    + 2 WARN + exit code 1, with each `next:` hint actionable

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:44:36 -07:00
Hongming Wang 1edee1131b Merge pull request #2948 from Molecule-AI/auto-sync/main-f5ea812e
chore: sync main → staging (auto, ff to f5ea812e)
2026-05-05 15:29:46 -07:00
Hongming Wang d99b3f2aec refactor(handlers): consolidate Notify + MCP send_message_to_user through AgentMessageWriter (RFC #2945 PR-A)
Pre-RFC-#2945 the broadcast + activity_log INSERT for "agent → user
chat" was duplicated across two handlers — activity.go's Notify (HTTP
/notify) and mcp_tools.go's toolSendMessageToUser (MCP tools/call).
The duplication is exactly what produced the reno-stars production
data-loss regression (PR #2944): the persistence-half fix landed for
one handler and silently lagged for the other for months, dropping
every long-form external-agent message on reload.

PR #2944 added the missing INSERT to mcp_tools.go and a forward-
looking AST gate. This PR removes the duplication at the source.

What changes
------------

NEW: workspace-server/internal/handlers/agent_message_writer.go
- AgentMessageWriter struct + NewAgentMessageWriter ctor.
- Send(ctx, workspaceID, message, attachments) error: workspace
  lookup → broadcast WS AGENT_MESSAGE → INSERT activity_logs.
- ErrWorkspaceNotFound for the lookup-miss path so callers can
  return 404 / JSON-RPC error cleanly.
- Best-effort persistence: INSERT failure logs only, returns nil so
  the broadcast success isn't undone (matches previous behavior in
  both call sites — pinned by test).
- Takes events.EventEmitter (interface) so tests can substitute a
  capturing fake without nil-panicking inside hub.Broadcast.

UPDATED: activity.go:Notify
- Replaced ~75 lines of inline broadcast+INSERT with a 12-line
  call to AgentMessageWriter.Send.
- Attachment shape conversion (NotifyAttachment → AgentMessageAttachment)
  is local to the HTTP handler; the writer's API doesn't import the
  HTTP-binding-tagged type.

UPDATED: mcp_tools.go:toolSendMessageToUser
- Replaced ~40 lines (the post-#2944 broadcast+INSERT pair) with a
  6-line call to the writer.
- Attachments is nil today because the MCP tool args don't expose
  attachments yet. When the schema adds it, build the slice and
  pass through; the writer half is ready.

Tests
-----

agent_message_writer_test.go (8 tests, comprehensive):
- TestAgentMessageWriter_Send_Success_NoAttachments — happy path,
  pins JSON `{"result":"hi"}`.
- TestAgentMessageWriter_Send_Success_WithAttachments — pins file
  parts shape (kind=file, file.{uri,name,mimeType,size}). Uses a
  jsonMatcher that decodes + asserts via predicate (tolerant of
  map key ordering, exact on shape).
- TestAgentMessageWriter_Send_WorkspaceNotFound — pins
  ErrWorkspaceNotFound + asserts NO broadcast NO INSERT.
- TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil — pins
  best-effort persistence contract.
- TestAgentMessageWriter_Send_PreviewTruncation — pins ≤80-char
  preview + ellipsis (Ryan's onboarding-friction report would have
  bloated activity_logs.summary by 2KB without this).
- TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent — pins WS
  event name + payload shape via capturingEmitter.
- TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty — pins
  the "no key when nil" wire contract.

The existing AST gate from #2944
(TestAgentMessageBroadcastsArePersisted) still holds: any future
function emitting AGENT_MESSAGE without an INSERT fails the test.
With the writer in place that's now redundant — both producers go
through it — but the gate is cheap to keep as defense-in-depth.

Verified: go vet clean; all writer + caller tests pass; existing
TestNotify_* + TestMCPHandler_SendMessage_* + the AST gate all green.

Refs RFC #2945, PR #2944.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:29:42 -07:00
molecule-ai[bot] f5ea812e9d Merge pull request #2947 from Molecule-AI/staging
staging → main: auto-promote c4807a9
2026-05-05 22:22:58 +00:00
Hongming Wang 3b7ed9cf53 Merge pull request #2946 from Molecule-AI/fix/onboarding-followup-2934
mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
2026-05-05 22:19:21 +00:00
Hongming Wang da9061c131 mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
Self-review of #2935 turned up two real defects:

1. Stale README issue references — the build_runtime_package.py
   README template said "(issue #2934 follow-up)" twice, but the
   marketplace-plugin and `doctor` items now have dedicated tracking
   issues. Updated to point at #2936 and #2937 respectively.

2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when
   an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't
   exist / wasn't readable / was blank / contained internal whitespace,
   the resolver silently returned the generic "set one of these three
   vars" error. That's exactly the silent failure mode #2934 flagged
   ("a new user has no chance"). Refactor `_read_token_from_file_env`
   to return `(token, error)`; surface the SPECIFIC failure when the
   operator's intent was clearly the file path. Skip the CONFIGS_DIR
   fallback in that case so the operator's config bug isn't masked
   by a different source happening to work.

Adds 2 renames + 2 new tests in test_mcp_cli_split.py:
  - test_missing_file_returns_specific_error (asserts "does not exist")
  - test_empty_file_returns_specific_error (asserts "is empty")
  - test_multi_line_file_rejected (asserts "internal whitespace")
  - test_token_file_error_skips_configs_dir_fallback (asserts a valid
    CONFIGS_DIR/.auth_token does NOT silently rescue a broken
    TOKEN_FILE)

All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass.
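
A minimal sketch of the `(token, error)` contract described above, not the
shipped resolver. The helper names `read_token_file` and `resolve_token`, the
inline variable name `MOLECULE_WORKSPACE_TOKEN`, and the `configs_dir_token`
parameter are assumptions made for illustration. Only the error substrings
("does not exist", "is empty", "internal whitespace") come from the test
names listed in this message.

```python
import os
from typing import Optional, Tuple

TokenResult = Tuple[Optional[str], Optional[str]]


def read_token_file(path: str) -> TokenResult:
    """Return (token, error); exactly one of the two is None."""
    if not os.path.exists(path):
        return None, f"{path} does not exist"
    try:
        with open(path, encoding="utf-8") as fh:
            raw = fh.read()
    except OSError as exc:
        return None, f"{path} is not readable: {exc}"
    token = raw.strip()
    if not token:
        return None, f"{path} is empty"
    # A multi-line or space-separated file is almost certainly a mistake.
    if any(ch.isspace() for ch in token):
        return None, f"{path} contains internal whitespace"
    return token, None


def resolve_token(env: dict, configs_dir_token: Optional[str]) -> TokenResult:
    """Hypothetical precedence: inline var, then token file, then CONFIGS_DIR."""
    inline = env.get("MOLECULE_WORKSPACE_TOKEN", "").strip()  # assumed name
    if inline:
        return inline, None
    file_path = env.get("MOLECULE_WORKSPACE_TOKEN_FILE", "").strip()
    if file_path:
        # The operator explicitly chose the file path, so a broken file is
        # reported as-is and the CONFIGS_DIR fallback is skipped.
        return read_token_file(file_path)
    if configs_dir_token:
        return configs_dir_token, None
    return None, "no workspace token configured"
```

Returning the file result directly is what keeps a valid CONFIGS_DIR token
from masking a broken TOKEN_FILE in this sketch.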

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 15:07:15 -07:00
Hongming Wang c4807a930d Merge pull request #2940 from Molecule-AI/refactor/a2a-tools-inbox-extract-rfc2873-iter4e
refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
2026-05-05 21:58:32 +00:00
Hongming Wang d22fbb29b8 Merge pull request #2944 from Molecule-AI/fix-mcp-send-message-to-user-persist
fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss)
2026-05-05 21:57:37 +00:00
Hongming Wang 899c53550d test(mcp): comprehensive coverage for send_message_to_user persistence + AST gate (reno-stars followup)
Per user request: audit all similar tools + write comprehensive tests
including E2E for the persistence-of-AGENT_MESSAGE-broadcasts contract.

Audit (all BroadcastOnly call sites in workspace-server/internal/):

  | Site | Event | Persisted? | Notes |
  |---|---|:---:|---|
  | a2a_proxy_helpers.go:275 | A2A_RESPONSE | ✓ | LogActivity above |
  | activity.go:486 (Notify) | AGENT_MESSAGE | ✓ | INSERT line 535 |
  | activity.go:701 (LogActivity) | ACTIVITY_LOGGED | ✓ | self-emits inside DB write |
  | mcp_tools.go:341 (toolSendMessageToUser) | AGENT_MESSAGE | ✓ | NEW (this PR) |
  | registry.go:575 | TASK_UPDATED | N/A | transient progress, not chat |
  | registry.go:596 | WORKSPACE_HEARTBEAT | N/A | infra ping, not chat |

Only one chat-bearing broadcast was missing persistence (the just-
fixed mcp bridge path). No other regressions found.

Tests added (4 new, total 5 send_message_to_user tests):

1. TestAgentMessageBroadcastsArePersisted — AST gate that walks every
   non-test .go in the package, finds funcs that BroadcastOnly with
   "AGENT_MESSAGE", asserts each ALSO contains an
   "INSERT INTO activity_logs". Forward-looking regression block:
   any future chat tool that broadcasts without persisting fails the
   test with a clear file:func diagnostic. Mutation-tested locally:
   removing the INSERT block from toolSendMessageToUser reliably
   produces the expected failure.

2. TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s — pins
   the "best-effort persistence" contract. DB INSERT failures must
   NOT abort the tool response (the WS broadcast already succeeded;
   retrying would double-render in the live chat). Matches /notify.

3. TestMCPHandler_SendMessageToUser_ResponseBodyShape — pins the
   exact `{"result": "<message>"}` JSON shape stored in
   response_body. The canvas hydrater (extractResponseText in
   historyHydration.ts) reads body.result; any drift here silently
   breaks chat history without failing the INSERT. Per memory
   feedback_assert_exact_not_substring.md, asserts the literal JSON
   shape, not a substring.

4. TestMCPHandler_SendMessageToUser_PersistsToActivityLog (existing,
   from previous commit) — pins INSERT shape with regex on
   'a2a_receive' + 'notify' literals.

5. TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet (existing)
   — env-gate aborts before DB.

Test fixture cleanup: newMCPHandler now uses newTestBroadcaster (real
ws.Hub) instead of events.NewBroadcaster(nil) — the latter nil-panics
inside hub.Broadcast on the AGENT_MESSAGE path. Same broadcaster
shape every other handler test uses.

E2E note: the AST gate is the strongest forward-looking guarantee.
A real-DB integration test would add value for CI but is largely
duplicative of the sqlmock contract tests above (sqlmock pins SQL
shape with much faster feedback). Left as a future enhancement when
the handlers Postgres-integration suite extends MCP coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:52:32 -07:00
Hongming Wang cdfc9f743f fix(mcp): persist send_message_to_user pushes to activity_log (reno-stars data loss)
Reported on production tenant reno-stars: an external claude-code agent
(CEO Ryan PC workspace) sent a long-form message via send_message_to_user;
the user saw it live in the chat panel but it vanished after a refresh.
Confirmed via direct production query — the message is NOT in
activity_logs at all (only short test pings around it are persisted).

Root cause: there are TWO server-side handlers for send_message_to_user:

  1. HTTP `/workspaces/:id/notify` (activity.go:Notify) — broadcasts WS
     AND inserts a row into activity_logs. This is the path the
     in-container runtime's tool_send_message_to_user calls.

  2. MCP-bridge `tools/call name=send_message_to_user`
     (mcp_tools.go:toolSendMessageToUser) — broadcasts WS only,
     **never persisted**. This is the path EXTERNAL agents using
     molecule-mcp's send_message_to_user tool route through.

The persistence fix landed for path 1 months ago but was never mirrored
on path 2. External agents — exactly the case in reno-stars/CEO Ryan PC
— have been silently losing every long-form notification on reload.

Fix: mirror the activity.go INSERT shape inside toolSendMessageToUser:

  INSERT INTO activity_logs
    (workspace_id, activity_type, method, summary, response_body, status)
  VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')

Same wire shape as /notify so the canvas's chat-history hydration
(`type=a2a_receive&source=canvas`) treats both writers identically.
Errors are log-only — broadcast already succeeded, persistence failure
shouldn't block the tool response (matches /notify behavior; downside
is the same data-loss-on-DB-error risk, surfaced via log.Printf).

Tests
-----

- `TestMCPHandler_SendMessageToUser_PersistsToActivityLog` — pins both
  the workspace-name lookup AND the INSERT shape. Regex-matches
  `'a2a_receive'` + `'notify'` literals so a future refactor that
  changes activity_type or method breaks the test loud, not silently
  re-introducing the data-loss bug.
- Updated newMCPHandler to use newTestBroadcaster() (real ws.Hub) —
  events.NewBroadcaster(nil) crashes inside hub.Broadcast in the
  send_message_to_user path. Same shape every other handler test uses.

Verified `go test ./internal/handlers/ -run TestMCPHandler_SendMessage`
green; full vet clean.

Refs reno-stars production incident 2026-05-05.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:47:48 -07:00
molecule-ai[bot] 7a2664523c Merge pull request #2942 from Molecule-AI/staging
staging → main: auto-promote 1ad107c
2026-05-05 14:47:21 -07:00
molecule-ai[bot] 632e906640 Merge pull request #2938 from Molecule-AI/staging
staging → main: auto-promote b906e1d
2026-05-05 21:29:56 +00:00
Hongming Wang 475da5b64c refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation,
memory, messaging) the only behavior left in ``a2a_tools.py`` was
``report_activity`` plus three thin inbox-tool wrappers and the
``_enrich_inbound_for_agent`` helper. This iter extracts the inbox
slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks
from 280 LOC to ~165 LOC of imports + report_activity + back-compat
re-export blocks.

Extracted symbols:
  - ``_INBOX_NOT_ENABLED_MSG`` (sentinel)
  - ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper)
  - ``tool_inbox_peek``
  - ``tool_inbox_pop``
  - ``tool_wait_for_message``

Re-exports (`from a2a_tools_inbox import …`) preserve the public
``a2a_tools.tool_inbox_*`` surface so existing tests + call sites
continue to resolve unchanged.

New tests in test_a2a_tools_inbox_split.py:
  1. **Drift gate (5)** — every previously-public symbol on a2a_tools
     is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`),
     catches a future "wrap with logging" refactor that silently loses
     existing test coverage.
  2. **Import contract (1)** — a2a_tools_inbox does NOT eagerly import
     a2a_tools at module load. Pins the layered architecture: the
     extracted slice depends on ``inbox`` + a lazy ``a2a_client``
     import, never on the kitchen-sink that re-exports it.
  3. **_enrich_inbound_for_agent branches (5)** — peer_id-empty
     (canvas_user) returns dict unchanged; missing peer_id key same;
     a2a_client unavailable (test harness, partial install) degrades
     gracefully with a bare envelope; registry hit populates
     peer_name + peer_role + agent_card_url; registry miss still
     surfaces agent_card_url (constructable from peer_id alone).

The full timeout-clamp / validation / JSON-shape behavior matrix for
the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those
tests pass identically against both the alias and the underlying impl.
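
The identity check behind the drift gate can be sketched as follows. This is
an illustration under assumptions, not the code in
test_a2a_tools_inbox_split.py: the two modules are simulated with
`types.ModuleType`, and `is_plain_reexport` and `with_logging` are
hypothetical names.

```python
import types

# Stand-in for the extracted module (simulated, not the real one).
impl = types.ModuleType("a2a_tools_inbox")


def tool_inbox_peek(limit=10):
    return f"peek:{limit}"


impl.tool_inbox_peek = tool_inbox_peek

# Stand-in for the facade that re-exports the symbol unchanged.
facade = types.ModuleType("a2a_tools")
facade.tool_inbox_peek = impl.tool_inbox_peek


def is_plain_reexport(facade_mod, impl_mod, name):
    """True only when both modules expose the very same object."""
    return getattr(facade_mod, name) is getattr(impl_mod, name)


def with_logging(fn):
    # A hypothetical future refactor: behaves the same, but is a new object.
    def wrapper(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapper


wrapped_facade = types.ModuleType("a2a_tools")
wrapped_facade.tool_inbox_peek = with_logging(impl.tool_inbox_peek)
```

A wrapper returns the same results as the original, so behavior-level tests
keep passing; only the `is` comparison notices that the facade no longer
exposes the object the existing tests exercise.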

Wiring updates:
  - ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to
    ``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the
    drift gate doesn't fail the next publish.
  - ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to
    ``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor
    applies — this is now where the inbox-delivery code actually
    lives.
2026-05-05 14:28:58 -07:00
Hongming Wang 1ad107cc15 Merge pull request #2935 from Molecule-AI/fix/onboarding-friction-2934
fix(onboarding): address Claude Code MCP onboarding friction (#2934)
2026-05-05 21:25:57 +00:00
Hongming Wang e4bd1e4293 Merge pull request #2933 from Molecule-AI/auto-sync/main-226e57a9
chore: sync main → staging (auto, ff to 226e57a9)
2026-05-05 14:22:57 -07:00
Hongming Wang 01deeb36cf fix(onboarding): address Claude Code MCP onboarding friction (#2934)
Ryan's bug report (#2934) walked through ~45 min of debugging a stock
external-runtime install. This PR fixes the four items he flagged that
have a small surface, and stubs out the larger ones for follow-up.

Fixed in this PR
================

#1 — Python floor disclosure (README in publish bundle)
  Add an explicit "Requires Python ≥3.11" section that calls out the
  cryptic "Could not find a version that satisfies the requirement"
  failure mode; recommend `pipx install` over `pip install` so the
  binary lands on PATH automatically; show the explicit `pip install
  --user` alternative with the PATH caveat.

#3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py)
  Add a third resolution step between the inline env var and the
  in-container CONFIGS_DIR fallback. Operators can write the bearer to
  a 0600 file (e.g. ~/.config/molecule/token) and point
  MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of
  ~/.zsh_history and out of plaintext in MCP-host configs like
  ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows
  are predictable. README documents the safer option as the
  recommended path. 6 new tests pin every leg (file resolves, inline
  wins, missing/empty file falls through, blank env unset-equivalent,
  help text advertises it).

#4 — Push delivery 3-condition gating (README in publish bundle)
  Document that real-time push on Claude Code requires (a) the server
  to declare experimental.claude/channel (we do), (b) the server to be
  marketplace-plugin-sourced (operators must scaffold their own until
  the official marketplace lands — see #2934 follow-up), and (c) the
  --dangerously-load-development-channels flag on the claude
  invocation. Until any of the three is in place, delivery silently
  falls back to poll mode with no diagnostic. The README now says all
  of this explicitly so a new operator doesn't grep the binary for
  channel_enable to figure it out.

#8 — serverInfo.name mismatch (a2a_mcp_server.py)
  The server reported `serverInfo.name = "a2a-delegation"` while
  operators register it as `molecule` (the name in `claude mcp add
  molecule …`). Harmless on tool routing today but matters for any
  future Claude Code allowlist that gates push by hardcoded server
  name. Renamed to "molecule" with an inline comment explaining the
  invariant.

Deferred (separate issues to track)
===================================

#2 — covered transitively by #1's pipx recommendation; no separate fix.
#5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new
     repo work; the README references it as a documented follow-up).
#6 — `molecule-mcp doctor` subcommand (substantial new CLI surface;
     mentioned in the README's push-vs-poll section as the planned
     diagnostic for silent push fallback).
#7 — `--dangerously-load-development-channels` rename — not in our
     control; that's Claude Code's flag.

Tests
=====
164/164 mcp_cli + a2a_mcp_server tests pass locally
(WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including
6 new TestTokenFileEnv cases. Wheel builds successfully via
scripts/build_runtime_package.py with the new README markers verified
in the output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 14:19:09 -07:00
Hongming Wang b906e1da61 Merge pull request #2892 from Molecule-AI/refactor/a2a-tools-messaging-extract-rfc2873-iter4d
refactor(workspace): extract messaging tools from a2a_tools.py (RFC #2873 iter 4d)
2026-05-05 21:07:44 +00:00
molecule-ai[bot] 226e57a942 Merge pull request #2931 from Molecule-AI/staging
staging → main: auto-promote a1d2027
2026-05-05 21:03:11 +00:00
Hongming Wang abc3affcb6 test(a2a_tools): cover inbox tool wrappers to restore 75% per-file floor
After RFC #2873 iter 4d extracted messaging tools to
``a2a_tools_messaging.py``, the only behavior left in ``a2a_tools.py``
is ``report_activity`` (covered by test_a2a_tools_impl) plus three
thin wrappers around inbox state — ``tool_inbox_peek``,
``tool_inbox_pop``, ``tool_wait_for_message`` — which were never
directly exercised at the module level.

Per-file critical-path coverage dropped to 54.4% on the iter 4d
branch, breaking the 75% MCP/inbox/auth floor in ci.yml.

Adds ``test_a2a_tools_inbox_wrappers.py`` — 14 focused tests on the
three wrappers covering: inbox-disabled fallback (via the
_INBOX_NOT_ENABLED_MSG sentinel), input validation
(empty/non-str activity_id, non-int peek limit), the timeout clamp
contract on wait_for_message (300s ceiling, 0s floor, non-numeric
fallback to 60s), JSON-shape pinning, and the limit/activity_id
forwarding contract.

Result: a2a_tools.py back to 100% covered with the existing impl-tests
suite, gate green.
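
A rough sketch of the timeout clamp contract named above (300s ceiling, 0s
floor, 60s fallback for non-numeric input). The function name
`clamp_wait_timeout` and the constant names are hypothetical. Treating NaN
and unconvertible values as "non-numeric" is an assumption, as is accepting
numeric strings via `float()`.

```python
import math

MAX_WAIT_SECS = 300.0      # ceiling from the contract above
MIN_WAIT_SECS = 0.0        # floor from the contract above
DEFAULT_WAIT_SECS = 60.0   # fallback for non-numeric input


def clamp_wait_timeout(raw):
    """Coerce raw to seconds within [0, 300]; fall back to 60 if not numeric."""
    try:
        value = float(raw)
    except (TypeError, ValueError, OverflowError):
        return DEFAULT_WAIT_SECS
    if math.isnan(value):
        # Assumption: NaN is treated like any other non-numeric input.
        return DEFAULT_WAIT_SECS
    return max(MIN_WAIT_SECS, min(MAX_WAIT_SECS, value))
```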
2026-05-05 13:59:58 -07:00
Hongming Wang 3322524b0f Merge remote-tracking branch 'origin/staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d
# Conflicts:
#	workspace/a2a_tools.py
2026-05-05 13:57:44 -07:00
Hongming Wang de01ff51b0 Merge pull request #2932 from Molecule-AI/refactor/embed-help-fix-docs-hostname
refactor(external-connect): embed help in agent paste + fix wrong docs hostname
2026-05-05 20:57:15 +00:00
Hongming Wang f3782662bd refactor(external-connect): embed help in agent paste, fix wrong docs hostname
Two related fixes to the Connect-External-Agent flow that the user
flagged: the "Need help?" disclosure block in the modal is for the
operator's eyes only — but the agent reading the pasted snippet has
no access to that context. And the docs URL was pointing at a
hostname that doesn't resolve.

User-visible problems:
1. The agent doesn't see the install link, docs link, or the common-
   error/check pairs that the human pasted. When the agent fails to
   register or hits ConnectionRefused, it can't self-diagnose because
   the troubleshooting context lives in a separate UI block.
2. https://docs.molecule.ai → DNS NXDOMAIN. Every "Documentation"
   link in the modal was a dead link.

## Fixes

### Move help INTO the snippet (not a separate human-only UI block)
Each of the 7 server-rendered templates in
`workspace-server/internal/handlers/external_connection.go` now
appends a `# Need help?` section with: install link, correct docs
link, and the top common errors as `# • symptom — check` pairs.

Templates updated: curl / channel (Claude Code) / mcp (Universal MCP) /
python / hermes / codex / openclaw. Agents reading the paste now have
the same diagnostic context the human did.

### Drop the duplicated UI block in the canvas modal
`canvas/src/components/ExternalConnectModal.tsx`:
- Removed the `TAB_HELP` per-tab metadata constant (152 lines).
- Removed the `HelpBlock` component (62 lines).
- Removed the `<HelpBlock help={TAB_HELP[tab]} />` render call.

The snippet is now the single source of truth for tab-level help.

### Fix the wrong docs hostname
The actual docs site is `doc.moleculesai.app` (singular `doc`,
`.app` not `.ai`), confirmed by:
- `package.json` description in `Molecule-AI/docs` repo →
  "Molecule AI documentation site — doc.moleculesai.app"
- HTTP HEAD on the new URL → 200 for both
  `/docs/guides/mcp-server-setup` and
  `/docs/guides/external-agent-registration`
- HTTP HEAD on old `docs.molecule.ai` → 000 (NXDOMAIN)

All template docs URLs now point at `doc.moleculesai.app`.

## Verification
- `go build ./...` clean
- `go test ./internal/handlers/... -count=1` green
- `pnpm test` → 1291/1291 pass (unchanged)
- `tsc --noEmit` clean
- 219 LOC removed (canvas duplicate UI), 69 LOC added (snippet help)
- Net `-150 LOC` while gaining the agent-readable help

## Out of scope (deferred, captured in followups)
- One blog post still has `canonical: "https://docs.molecule.ai/blog/..."`
  in `src/app/blog/2026-04-20-chrome-devtools-mcp/page.mdx` — separate
  blog-content fix.
- Comment in `theme-provider.tsx` references `docs.moleculesai.app`
  (with `s`) — comment-only, not a runtime URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:51:35 -07:00
Hongming Wang e9eb3868d5 Merge pull request #2930 from Molecule-AI/docs/python-version-callout-mcp-snippet
docs: callout Python>=3.11 on Universal MCP snippet + runtime doc
2026-05-05 20:48:31 +00:00
Hongming Wang cb70d3d437 docs: callout Python>=3.11 requirement on Universal MCP install snippet
User-reported friction: pip install molecule-ai-workspace-runtime on a
3.10 interpreter fails with "Could not find a version that satisfies the
requirement (from versions: none)" — pip's requires_python filter
silently drops the only available artifact before attempting install,
so the error doesn't mention Python at all. Operators see
"package missing", file a bug, and chase a phantom CDN/visibility
issue.

Two changes mirror the requirement at the two operator-touch surfaces:

1. workspace-server/internal/handlers/external_connection.go:
   the externalUniversalMcpTemplate snippet (rendered into the
   canvas Connect-External-Agent modal) now leads with a brief
   "Requires Python >= 3.11" block + diagnostic + upgrade paths.

2. docs/workspace-runtime-package.md: same callout at the top of
   the doc, before the Overview, so anyone landing here from search
   gets the answer immediately.
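
For operators who want to confirm the interpreter before running pip, a
pre-install check might look like the sketch below. The helper names are
hypothetical and nothing like this ships in the snippet or the doc; only the
3.11 floor and the package name come from this change.

```python
import sys

MIN_PYTHON = (3, 11)  # floor stated in the callout


def meets_python_floor(version_info=None):
    """True when the (major, minor) pair is at least the floor."""
    info = sys.version_info if version_info is None else version_info
    return tuple(info[:2]) >= MIN_PYTHON


def floor_message(version_info=None):
    """Return None when the floor is met, else a message that names Python."""
    info = sys.version_info if version_info is None else version_info
    if meets_python_floor(info):
        return None
    return (
        f"Python {info[0]}.{info[1]} found; "
        "molecule-ai-workspace-runtime requires Python >= 3.11"
    )
```

Calling `floor_message()` with no argument checks the running interpreter,
which names the cause that pip's "from versions: none" error leaves out.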

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:44:25 -07:00
Hongming Wang a1d202723d Merge pull request #2929 from Molecule-AI/auto-sync/main-ef67dc51
chore: sync main → staging (auto, ff to ef67dc51)
2026-05-05 13:42:55 -07:00
Hongming Wang 0d0840d9d9 Merge branch 'staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d 2026-05-05 13:41:55 -07:00
Hongming Wang fc30b5c9de Merge pull request #2905 from Molecule-AI/fix/poll-path-message-enrichment
fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
2026-05-05 20:36:41 +00:00
molecule-ai[bot] ef67dc513e Merge pull request #2928 from Molecule-AI/staging
staging → main: auto-promote 2bf6a70
2026-05-05 20:33:52 +00:00
Hongming Wang 23d3f057d3 Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c
refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c)
2026-05-05 20:31:45 +00:00
Hongming Wang 8ca027ddf3 fix(tests): drop unused json + pytest imports
Bot lint flagged the two imports as unused (correct — neither is
referenced after the file shrank during review). Resolves the two
unresolved review threads silently blocking merge per the staging
"all conversations resolved" gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:26:49 -07:00
Hongming Wang 46a4ef83bb fix(tests): patch a2a_tools_memory.httpx, not a2a_tools.httpx
Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into
a2a_tools_memory.py, which has its own top-level `import httpx`.
test_mcp_memory.py + the secret-redact memory tests still patched
`a2a_tools.httpx.AsyncClient`, which after the move is the WRONG
module's reference — the real call inside the moved tool resolves to
`a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI
catches this as 7 failures: JSONDecodeError on empty bodies and
"All connection attempts failed" on the recall side.

Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The
existing tests in `test_a2a_tools_impl.py` were already updated by
the iter-4c PR; only these two files were missed.

Verified: pytest workspace/tests/test_mcp_memory.py +
test_secret_redact.py — 43/43 pass after the fix (both files were
red on the iter-4c branch CI).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:25:06 -07:00
Hongming Wang a6afc18de5 Merge pull request #2927 from Molecule-AI/fix/org-import-polish-2872
fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc (#2872)
2026-05-05 20:25:01 +00:00
Hongming Wang 423d58d42c fix(org-import): polish — wrap-safe ErrNoRows, bounded lookup, godoc
Three small hardening passes from #2872's optional/important findings,
batched into one polish PR:

1. errors.Is(err, sql.ErrNoRows) instead of err == sql.ErrNoRows.
   The bare equality breaks if any future caller wraps the error via
   fmt.Errorf("…: %w", err) — the no-rows happy path would fall
   through to the "real DB error" branch and abort the import.
   errors.Is unwraps. New test
   TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound pins the
   fix; verified the test fails on the old `==` shape (build break
   on unused-import + assertion failure once import dropped).

2. Bounded 5s timeout on lookupExistingChild instead of
   context.Background().
   The createWorkspaceTree call site runs in goroutines spawned from
   the /org/import handler, so plumbing the request context here
   would cascade-cancel into provisionWorkspaceAuto and abort
   in-flight EC2 provisioning if the client disconnected mid-import
   — that's the wrong tradeoff. A short bounded timeout protects the
   per-row SELECT against a wedged DB without taking the
   drop-everything-on-disconnect behaviour. The lookup is a single
   ~10ms query; 5s leaves 500x headroom for transient slow paths.

3. Godoc clarifications on the skip-path block.
   - /org/import is ADDITIVE-ONLY, never destructive. Children
     present in the existing tree but absent from the new template
     are preserved (no DELETE on diff).
   - Skip-path does NOT propagate updates to existing nodes — a
     re-import that adds an initial_memory or schedule to an
     existing workspace is silently dropped. Document the limitation
     so future operators know to delete-and-re-import or reach for
     a future /org/sync route.

Verification:
  - go build ./... → clean
  - go test ./internal/handlers/... → all passing (TestLookup* +
    TestCreateWorkspaceTree* + TestClass1* + TestGate*)
  - 4 lookup tests + 1 new wrap-safety test → 5/5 PASS
  - Full handlers suite → green

Refs molecule-core#2872 (Optional findings — wrap-safety + ctx, godoc
clarifications for additive-only + skip-path-update-limitation)

Out of scope (deferred):
  - PR-D partial unique index migration + ON CONFLICT — sequenced
    after Phase 4 cleanup verified clean per #2872 plan
  - PR-E full createWorkspaceTree integration test for partial-match
    — needs heavier sqlmock scaffolding for downstream
    workspaces_audit/canvas_layouts/secrets/channels INSERTs;
    follow-up

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:20:54 -07:00
Hongming Wang 9386f1d399 Merge pull request #2926 from Molecule-AI/fix/agent-comms-display-parity
fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
2026-05-05 20:15:29 +00:00
Hongming Wang a766e5ce48 Merge pull request #2925 from Molecule-AI/e2e/poll-mode-chat-upload-tests
test(e2e): poll-mode chat upload E2E in standard suite
2026-05-05 20:13:27 +00:00
Hongming Wang 5ad2669f88 fix(canvas): AgentCommsPanel display + initial-state parity with my-chat
User-visible problem: agent-comms panel opens mid-conversation on long
histories (the same chat-opens-in-middle bug PR #2903 fixed for
my-chat) and silently renders empty state when the history fetch fails
(no retry button, no diagnostic).

Three changes mirror the my-chat patterns from ChatTab:

1. Initial-mount instant scroll.
   Adds hasInitialScrollRef + switches the scroll hook from useEffect
   to useLayoutEffect. First arrival of messages → scrollIntoView
   `instant`; subsequent appends → `smooth` as before. useLayoutEffect
   runs before paint so the user never sees the panel jump for one
   frame on every append.

2. Error UI with Retry button.
   Adds `loadError` state. The history-load .catch now sets the
   error message; a new branch in the render renders a red alert
   with the failure text and a Retry button that re-invokes
   `loadInitial`. Same shape as ChatTab MyChatPanel's `loadError`
   handling — both surfaces should fail loud, not silent.

3. Extracted `loadInitial` callback.
   The history-load body becomes a useCallback so the retry button
   has a stable reference to call. Mirrors ChatTab's loadInitial.

Tests (4 new in AgentCommsPanel.render.test.tsx):
- Loading state renders the loading copy.
- Error state with Retry button renders on rejection; clicking
  Retry fires a second api.get.
- Empty state renders when load succeeds with zero rows.
- scrollIntoView is called with behavior=instant on first message
  arrival (pins the chat-opens-in-middle prevention).

Verification:
- pnpm test → 1284/1284 pass (1280 prior + 4 new)
- tsc --noEmit → clean
- 92 → 93 test files, no existing test broken

Closes the parity gap raised in chat. The two surfaces now share:
loading copy / error UI / empty-state placeholder / scroll behaviour /
useLayoutEffect timing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:09:36 -07:00
Hongming Wang 0ca4e431c1 test(e2e): add poll-mode chat upload E2E and wire into e2e-api.yml
Covers the user-visible flow that Phase 1-5b shipped (RFC #2891):
register a poll-mode workspace, POST a multi-file /chat/uploads, verify
the activity feed shows one chat_upload_receive row per file, fetch the
bytes via /pending-uploads/:fid/content, ack each row, and confirm a
post-ack fetch returns 404. Also pins cross-workspace bleed protection
(workspace B's bearer on A's URL → 401, B's URL with A's file_id →
404) and the file_id-UUID-parse 400 path.

23 assertions, all green against a local platform (Postgres+Redis+
platform-server stack matches the e2e-api.yml CI recipe verbatim).

Why a new script instead of extending test_poll_mode_e2e.sh: that
script tests A2A short-circuit + since_id cursor semantics; this one
tests the chat-upload path. They share zero handler code on the
platform side and would dilute each other's failure messages if
combined.

Why not the bearerless-401 strict-mode assertion: the platform's
wsauth fails open for bearerless requests when MOLECULE_ENV=development
(see middleware/devmode.go). The CI workflow doesn't set that var, but
some local-dev .env files do, so the assertion would flap by environment
without testing the poll-mode upload contract. The middleware's own
unit tests cover strict-mode 401.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:08:55 -07:00
Hongming Wang 2bf6a7005f Merge pull request #2923 from Molecule-AI/feat/class1-ast-gate-2867
test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
2026-05-05 20:04:59 +00:00
Hongming Wang 16ead69641 Merge pull request #2913 from Molecule-AI/fix-saas-logout-ui-missing
fix(canvas): wire SaaS Sign-out button — /cp/auth/signout was unreachable from UI
2026-05-05 20:03:53 +00:00
Hongming Wang 60afcd43c9 test(handlers): generic Class 1 leak AST gate (#2867 PR-A)
Adds class1_ast_gate_test.go — a per-package AST walk that fails the
build if any handler function INSERTs INTO workspaces inside a range
loop body without one of three escape hatches:

  1. A call to a registered preflight helper (lookupExistingChild today;
     extend preflightCallNames as new helpers are introduced).
  2. An ON CONFLICT clause in the same SQL literal (idempotent UPSERT,
     like registry.go).
  3. An explicit `// class1-gate: idempotent-by-design` comment in the
     function body (deliberately awkward — forces a code-review beat).

Why this is broader than the existing
TestCreateWorkspaceTree_CallsLookupBeforeInsert gate in
org_import_idempotency_test.go: that one is hard-coded to one function
in one file. This one walks every non-test .go file in the handlers
package and applies a structural rule independent of file/function
names. A future handler written from scratch in a new file would not
have been covered before — now it is.

Detection mechanism (per AST):
  - Collect spans (Lbrace..Rbrace) of every RangeStmt body in each
    function. Position-based instead of stack-based — ast.Inspect's
    nil-callback ordering doesn't give per-node pop semantics, so a
    naive push/pop stack silently miscounts. Position spans are
    deterministic.
  - Walk every BasicLit, regex-match `^\s*INSERT INTO workspaces\(`
    (tightened from bytes.Index "INSERT INTO workspaces" so
    workspaces_audit literals don't false-positive — same regex used
    by the existing createWorkspaceTree gate).
  - For each match: record insertLine, hasONCONFLICT, and the
    innermost enclosing RangeStmt line (or 0 if not inside any range).
  - Fail the function if INSERT is inside a range AND no preflight
    AND no ON CONFLICT AND no allowlist annotation.
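
The detection rule above can be sketched outside Go as plain position arithmetic. This is a minimal illustration, not the gate's actual code: the `Span` and `Literal` shapes, the function names, and the case-sensitive `ON CONFLICT` check are assumptions made for the example; only the anchored regex and the fail condition come from the description above.

```typescript
// Hypothetical shapes: the real gate works on Go token positions.
interface Span { start: number; end: number; line: number }
interface Literal { pos: number; text: string }

// Anchored pattern quoted in the commit message. Unlike a substring
// search, it does not match "INSERT INTO workspaces_audit(".
const INSERT_RE = /^\s*INSERT INTO workspaces\(/;

// Returns the line of the innermost range body containing pos, or 0
// when pos is not inside any range body.
function innermostRangeLine(pos: number, ranges: Span[]): number {
  let best: Span | null = null;
  for (const r of ranges) {
    if (pos > r.start && pos < r.end) {
      // Among nested spans, the one that starts latest is innermost.
      if (best === null || r.start > best.start) best = r;
    }
  }
  return best === null ? 0 : best.line;
}

// Fail only when the INSERT is inside a range body AND none of the
// three escape hatches applies.
function gateFails(
  lit: Literal,
  ranges: Span[],
  hasPreflight: boolean,
  hasAllowlist: boolean,
): boolean {
  if (!INSERT_RE.test(lit.text)) return false;
  const inRange = innermostRangeLine(lit.pos, ranges) !== 0;
  // Assumption: a plain case-sensitive match is enough for this sketch.
  const hasOnConflict = /ON CONFLICT/.test(lit.text);
  return inRange && !hasPreflight && !hasOnConflict && !hasAllowlist;
}
```

Because containment is decided from recorded positions, the result does not depend on the order in which a tree walker visits or leaves nodes.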

Self-tests (per `feedback_assert_exact_not_substring.md` —
verify gate fails on the bug shape before merging):
  - TestClass1_GateFiresOnSyntheticBuggySource: synthetic source
    where INSERT is inside `for _, child := range children` body
    must trigger the gate's three guards (enclosingRangeLine!=0,
    hasONCONFLICT=false, no preflight call).
  - TestClass1_GateAllowsONCONFLICT: synthetic INSERT...ON CONFLICT
    must NOT trigger the gate (idempotent UPSERT case).
  - TestClass1_GateAllowsAllowlistAnnotation: function with
    `// class1-gate: idempotent-by-design` must be skipped.
  - TestClass1_NoUnpreflightedInsertInsideRange: production sweep
    over every handler .go file. Currently passes because
    org_import.go preflights, registry.go ON-CONFLICTs, and
    workspace.go's Create has no INSERT inside a range body.

Verification:
  - go test ./internal/handlers/... -run TestClass1_ -count=1
    → 4/4 PASS
  - go test ./internal/handlers/... -count=1 → suite green
    (no pre-existing test broken by the new file)

Refs molecule-core#2867 (PR-A Class 1 generic AST gate)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:01:34 -07:00
Hongming Wang 9a53529047 ci: retrigger after stuck Canvas tabs E2E (was running 17min vs typical <1min on staging)
2026-05-05 12:38:09 -07:00
Hongming Wang 575f893f4e fix(canvas): consume CP logout_url to break the SSO re-auth loop
Follow-up to molecule-controlplane#485. The first half of #2913 wired
a Sign-out button + signOut() helper that POSTed /cp/auth/signout, but
clicking still left the user signed in: WorkOS's browser cookie
preserved the SSO session, /cp/auth/login auto-re-authed via SSO, and
the user landed back on /orgs.

CP PR #485 returns the AuthKit hosted logout URL in the signout
response. This change has signOut() navigate the browser there
instead of /cp/auth/login. AuthKit clears its cookie + redirects to
return_to (configured server-side from APP_URL) → next /cp/auth/login
hits a fresh AuthKit, no SSO session, login form actually shows.

Defensive parsing: malformed JSON, missing logout_url, or wrong-type
logout_url all fall through to the legacy /cp/auth/login fallback,
which works locally (DisabledProvider, dev) where there's no SSO to
escape.

Forward-compat: when CP doesn't have #485 deployed yet, signOut()
sees logout_url="" or missing → fallback fires. Order of merge
between this and #485 doesn't matter, but the bug isn't actually
fixed end-to-end until both ship.
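
The defensive-parsing and forward-compat rules above reduce to one small pure function. The sketch below is an illustration under assumptions, not the shipped helper: `resolveSignOutTarget` and `LOGIN_FALLBACK` are hypothetical names, and the real signOut() may read the response differently. Only the `logout_url` field and the /cp/auth/login fallback come from the description above.

```typescript
// Legacy target used whenever the response carries no usable URL.
const LOGIN_FALLBACK = "/cp/auth/login";

// Picks the post-signout navigation target from a raw response body.
// Hypothetical helper: name and signature are assumptions.
function resolveSignOutTarget(rawBody: string): string {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawBody);
  } catch {
    return LOGIN_FALLBACK; // malformed JSON body
  }
  if (typeof parsed !== "object" || parsed === null) {
    return LOGIN_FALLBACK; // e.g. a bare `null` or a number
  }
  const url = (parsed as Record<string, unknown>)["logout_url"];
  // Missing, empty, or non-string values all fall back, so a
  // wrong-type field can never be used as a navigation target.
  return typeof url === "string" && url !== "" ? url : LOGIN_FALLBACK;
}
```

Keeping the decision in a function that takes a string and returns a string makes each of the fallback cases testable without mocking fetch or browser navigation.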

Tests added (3 new, 15 total auth.test.ts):
- Hosted logout: navigates to logout_url when response includes one.
- DisabledProvider path: falls back to /cp/auth/login when "".
- Defensive: malformed JSON body → fallback (no crash).
- Defensive: non-string logout_url → fallback (no open redirect).

Verified:
- npx vitest run src/lib/__tests__/auth.test.ts — 15/15 pass
- tsc --noEmit clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:21:49 -07:00
Hongming Wang 4cac4e7710 fix(canvas): wire SaaS Sign-out button — POST /cp/auth/signout was unreachable from the UI
Reported externally on 2026-05-05: "SaaS app logout does not work."

Root cause: the control plane has had POST /cp/auth/signout (clears the
WorkOS session cookie + revokes at the provider) since auth shipped,
but no canvas code ever called it. grep across canvas/ for
`logout|signOut|signout|sign-out` returned zero results — no helper,
no button, no menu entry. Users had no path to log out short of
clearing cookies in DevTools.

This is a UI gap, not a backend bug. Adding the missing pieces:

1. `signOut()` helper in `canvas/src/lib/auth.ts`:
   - POST /cp/auth/signout with credentials:include (cross-origin
     cookie required for tenant subdomain → app subdomain)
   - Best-effort: a 5xx, 401-stale-cookie, or network failure still
     redirects the browser to /cp/auth/login. Leaving the user on an
     authed-looking page after they clicked Sign out is the worst
     possible UX — that's the precise "logout doesn't work" symptom
     the report described.
   - Lands on /cp/auth/login (not the current URL) so the user
     doesn't loop back into the org they just left via AuthGate's
     return_to.

2. `AccountBar` component on /orgs page Shell — renders the signed-in
   email + Sign-out button at the top. Click → signOut() →
   `Signing out…` → bounces to login. Disabled-while-pending so a
   double-click can't fire two requests.

3. Tests in `auth.test.ts` (4 new, total 12 pass):
   - POSTs to the right endpoint with credentials:include
   - Redirects to /cp/auth/login after success
   - Redirects EVEN ON network failure (the critical UX invariant)
   - Redirects on 401 (stale cookie path)

The auth-origin resolution (`getAuthOrigin`) is reused so a tenant
subdomain (acme.moleculesai.app) correctly POSTs to
app.moleculesai.app/cp/auth/signout — same chain that fetchSession
+ redirectToLogin already use.

Test plan:
- [x] `npx vitest run src/lib/__tests__/auth.test.ts` — 12/12 green
- [x] `tsc --noEmit` — clean
- [ ] Manual: navigate to /orgs, click Sign out, observe redirect +
      that the next /orgs visit bounces to login (cookie cleared)
- [ ] CI green

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 12:20:18 -07:00
Hongming Wang 3d0a7c381b fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
Reported: agents receiving messages via inbox_peek / wait_for_message
get a plain envelope — text + peer_id + kind only. The push-path
(a2a_mcp_server._build_channel_notification) already enriches the
meta dict with peer_name, peer_role, and agent_card_url from the
registry cache, but the poll-path returns InboxMessage.to_dict()
unchanged. So a Claude Code host with channel-push gets the friendly
identity, but every other MCP client (and Claude Code with push
disabled — the universal default) sees plain text.

This silently breaks the contract documented in
a2a_mcp_server.py:303-345:

> In both paths the same fields apply: kind, peer_id, peer_name,
> peer_role, agent_card_url, activity_id

Fix: a2a_tools._enrich_inbound_for_agent() — same shape as the
push-path's enrichment, called from tool_inbox_peek and
tool_wait_for_message. Cache-first non-blocking (5-min TTL via
enrich_peer_metadata_nonblocking, same helper push uses), so a cache
miss returns immediately with bare envelope and warms the cache for
the next poll. agent_card_url is constructable from peer_id alone
and surfaces even on cache miss, so the receiving agent always has
a single endpoint to hit for capabilities.

Degradation paths:
- canvas_user (peer_id="") → pass through unchanged, no enrichment
- a2a_client unavailable (test harness without registry) → bare
  envelope, agent still gets text + peer_id + kind + activity_id
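
The enrichment and its degradation paths can be sketched as follows. This is a language-neutral illustration written in TypeScript, not the Python helper itself: the type names, the `enrichInbound` signature, and in particular the agent-card URL shape are assumptions, and the cache-warming side effect on a miss is left out.

```typescript
interface InboxMessage {
  kind: string;
  peer_id: string;
  text: string;
  activity_id: string;
}
interface PeerMeta { name: string; role: string }
type Enriched = InboxMessage & {
  peer_name?: string;
  peer_role?: string;
  agent_card_url?: string;
};

// Hypothetical URL shape: the real path is not given in the commit
// message. The point is that it needs only peer_id, not the cache.
function agentCardUrl(platformUrl: string, peerId: string): string {
  return `${platformUrl}/registry/${encodeURIComponent(peerId)}/agent-card`;
}

// cache === null models "a2a_client unavailable".
function enrichInbound(
  msg: InboxMessage,
  cache: Map<string, PeerMeta> | null,
  platformUrl: string,
): Enriched {
  if (msg.peer_id === "") return { ...msg }; // canvas_user: unchanged
  if (cache === null) return { ...msg };     // no registry: bare envelope
  const out: Enriched = {
    ...msg,
    agent_card_url: agentCardUrl(platformUrl, msg.peer_id),
  };
  const meta = cache.get(msg.peer_id);
  if (meta !== undefined) {
    // Cache hit: add the friendly identity fields.
    out.peer_name = meta.name;
    out.peer_role = meta.role;
  }
  return out; // on a miss, only agent_card_url was added
}
```

The lookup never blocks: a miss returns whatever can be built locally, which matches the cache-first behaviour described above.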

Tests:
- canvas_user passes through unchanged
- peer_agent cache hit → name + role + agent_card_url all present
- peer_agent cache miss → agent_card_url still constructed
- a2a_client unavailable → bare envelope, no crash

All 4 pass against fixed code. Without the fix, the cache-hit and
cache-miss tests would fail (peer_name/peer_role/agent_card_url keys
absent from to_dict's output).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:08:14 -07:00
Hongming Wang 8e5d193761 fix(tests): retarget get_peers_with_diagnostic patches to a2a_tools_messaging (RFC #2873 iter 4d)
Inherits the iter 4b test retarget commit through rebase. Adds the
remaining 4 patch sites in test_a2a_multi_workspace.py that target
get_peers_with_diagnostic — that call site moved from a2a_tools to
a2a_tools_messaging in this PR.

Refs RFC #2873 iter 4d.
2026-05-05 09:52:15 -07:00
Hongming Wang 3e0d2e650a refactor(workspace): extract messaging tools from a2a_tools.py to a2a_tools_messaging.py (RFC #2873 iter 4d)
Fourth slice of the a2a_tools.py split (stacked on iter 4c). Owns the
four human-and-peer messaging MCP tools + the chat-upload helper:

  * _upload_chat_files — stage local paths to /chat/uploads
  * tool_send_message_to_user — push canvas-chat via /notify
  * tool_list_peers — discover peers across registered workspaces
  * tool_get_workspace_info — JSON-encode workspace info
  * tool_chat_history — fetch prior conversation rows with a peer

a2a_tools.py shrinks from 508 → 213 LOC (−295). The remaining 213
is just report_activity + back-compat re-exports. Inbox tools
(tool_inbox_peek/pop/wait_for_message) deferred to iter 4e.

Layered architecture: messaging depends on a2a_tools_rbac (iter 4a),
a2a_client, platform_auth — NOT on kitchen-sink a2a_tools. An
import-contract test pins this so future refactors that add
`from a2a_tools import …` fail in CI.
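
An import-contract check of this kind can be as simple as a source scan. The sketch below is an assumption about the shape of such a check, not the actual test in test_a2a_tools_messaging.py: the regex, the `violatesImportContract` name, and the choice to treat only unindented imports as top-level are all illustrative.

```typescript
// Matches only unindented imports of the exact module name a2a_tools.
// `from a2a_tools_rbac import …` is not matched because the module
// name must be followed by whitespace (or a word boundary).
const FORBIDDEN_IMPORT =
  /^(?:from\s+a2a_tools\s+import\b|import\s+a2a_tools\b)/m;

// Returns true when the given Python source has a top-level
// dependency on the kitchen-sink module.
function violatesImportContract(pythonSource: string): boolean {
  return FORBIDDEN_IMPORT.test(pythonSource);
}
```

Anchoring at the start of a line means an indented, function-local import is not flagged, which matches the "no top-level a2a_tools dep" wording of the test list.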

Tests:
  * 28 patch sites in TestToolSendMessageToUser + TestToolListPeers +
    TestToolGetWorkspaceInfo + TestChatHistory retargeted from
    `a2a_tools.{httpx, get_peers_*, get_workspace_info,
    _upload_chat_files, _peer_*, list_registered_workspaces}` to
    `a2a_tools_messaging.…` because the call sites moved.
  * test_a2a_tools_messaging.py adds 7 new tests:
    - 5 alias drift gates
    - 2 import-contract tests (no top-level a2a_tools dep + a2a_tools
      surfaces every messaging symbol)

137 tests total in the a2a_tools suite, all green.

Refs RFC #2873.
2026-05-05 09:50:47 -07:00
Hongming Wang 210a26d31a refactor(workspace): extract memory tools from a2a_tools.py to a2a_tools_memory.py (RFC #2873 iter 4c)
Third slice of the a2a_tools.py split (stacked on iter 4b). Owns the
two persistent-memory MCP tools:

  * tool_commit_memory — write to /workspaces/:id/memories with RBAC
    + GLOBAL-scope tier-zero enforcement
  * tool_recall_memory — search /workspaces/:id/memories with RBAC

a2a_tools.py shrinks from 609 → 508 LOC (−101). Both handlers depend
ONLY on a2a_tools_rbac (iter 4a), a2a_client, and the platform's
/memories endpoint — no entanglement with delegation or messaging.

Side-effects of the layered architecture: a2a_tools_memory's import
contract is "depends on a2a_tools_rbac, never on a2a_tools" — the
kitchen-sink module is for back-compat re-exports only. A test pins
this so a future refactor that re-introduces `from a2a_tools import …`
fails in CI.

Tests:
  * 49 patch sites in TestToolCommitMemory + TestToolRecallMemory
    retargeted from `a2a_tools.{_check_memory_*, _is_root_workspace,
    httpx.AsyncClient}` to `a2a_tools_memory.…` because the call sites
    moved.
  * test_a2a_tools_memory.py adds 4 new tests (alias drift gate +
    import-contract + a2a_tools-side re-export).

117 tests total (77 impl + 28 rbac + 8 delegation + 4 memory), all green.

Refs RFC #2873.
2026-05-05 09:50:39 -07:00
58 changed files with 7459 additions and 2201 deletions
+1
@@ -387,6 +387,7 @@ jobs:
"a2a_mcp_server.py"
"mcp_cli.py"
"a2a_tools.py"
"a2a_tools_inbox.py"
"inbox.py"
"platform_auth.py"
)
+3
@@ -172,6 +172,9 @@ jobs:
- name: Run poll-mode + since_id cursor E2E (#2339)
if: needs.detect-changes.outputs.api == 'true'
run: bash tests/e2e/test_poll_mode_e2e.sh
+- name: Run poll-mode chat upload E2E (RFC #2891)
+if: needs.detect-changes.outputs.api == 'true'
+run: bash tests/e2e/test_poll_mode_chat_upload_e2e.sh
- name: Dump platform log on failure
if: failure() && needs.detect-changes.outputs.api == 'true'
run: cat workspace-server/platform.log || true
+47 -3
@@ -18,7 +18,7 @@
// quick bounce between signup and either Checkout or the tenant UI.
import { useEffect, useState } from "react";
-import { fetchSession, redirectToLogin, type Session } from "@/lib/auth";
+import { fetchSession, redirectToLogin, signOut, type Session } from "@/lib/auth";
import { PLATFORM_URL } from "@/lib/api";
import { formatCredits, pillTone, bannerKind } from "@/lib/credits";
import { TermsGate } from "@/components/TermsGate";
@@ -129,7 +129,7 @@ export default function OrgsPage() {
return <EmptyState banner={justCheckedOut ? <CheckoutBanner /> : null} />;
}
return (
-<Shell>
+<Shell session={session}>
{justCheckedOut && <CheckoutBanner />}
<ul className="space-y-3">
{orgs.map((o) => (
@@ -160,11 +160,21 @@ function CheckoutBanner() {
);
}
-function Shell({ children }: { children: React.ReactNode }) {
+function Shell({
+children,
+session,
+}: {
+children: React.ReactNode;
+// Optional: when present, the header renders the signed-in email +
+// a Sign-out button. The empty-state Shell call doesn't have a
+// session in scope, so accept null and skip the header chrome there.
+session?: Session | null;
+}) {
return (
<main className="min-h-screen bg-surface text-ink">
<TermsGate>
<div className="mx-auto max-w-2xl px-6 pt-20 pb-12">
+{session ? <AccountBar session={session} /> : null}
<h1 className="text-3xl font-bold text-ink">Your organizations</h1>
<p className="mt-2 text-ink-mid">
Each org is an isolated Molecule workspace.
@@ -177,6 +187,40 @@ function Shell({ children }: { children: React.ReactNode }) {
);
}
// AccountBar renders the signed-in email + a Sign-out button at the
// top of the page. Without this the user has no way to log out — the
// /cp/auth/signout endpoint exists on the control plane but no UI ever
// called it. Reported externally on 2026-05-05; this is the fix.
//
// Click → calls signOut() which POSTs /cp/auth/signout (clears the
// WorkOS session cookie + revokes at the provider) then bounces to
// /cp/auth/login. The signOut helper is best-effort — even on a 5xx
// or network failure the redirect fires so the user never gets stuck
// on an authed-looking page after they clicked Sign out.
function AccountBar({ session }: { session: Session }) {
const [signingOut, setSigningOut] = useState(false);
return (
<div className="mb-6 flex items-center justify-between text-sm text-ink-mid">
<span title="Signed-in user">{session.email}</span>
<button
type="button"
disabled={signingOut}
onClick={async () => {
setSigningOut(true);
await signOut();
// Redirect happens inside signOut; this line is for tests +
// edge cases (jsdom, blocked navigation) where it doesn't.
setSigningOut(false);
}}
className="rounded border border-line bg-surface-card px-3 py-1 text-xs text-ink hover:bg-surface-card disabled:opacity-50"
aria-label="Sign out"
>
{signingOut ? "Signing out…" : "Sign out"}
</button>
</div>
);
}
// DataResidencyNotice surfaces where workspace data lives so EU-based
// signups can make an informed choice (GDPR Art. 13 disclosure
// requirement). Plain text, no icon — the goal is clarity, not
@@ -20,160 +20,6 @@ import * as Dialog from "@radix-ui/react-dialog";
type Tab = "python" | "curl" | "claude" | "mcp" | "hermes" | "codex" | "openclaw" | "fields";
// Per-tab help metadata: docs link, where-to-install link, common errors.
// All URLs verified against repo content (docs/guides/* file paths map to
// docs.molecule.ai/docs/guides/*; canonical hostname confirmed by existing
// blog post canonical metadata) or against the snippet text the operator
// just copied. Never linking to a URL that wasn't already in product —
// dead links here defeat the purpose of "more comprehensive instructions."
const TAB_HELP: Record<
Tab,
{
docsUrl?: string;
docsLabel?: string;
downloadUrl?: string;
downloadLabel?: string;
commonIssues?: { symptom: string; check: string }[];
}
> = {
mcp: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
downloadUrl: "https://pypi.org/project/molecule-ai-workspace-runtime/",
downloadLabel: "molecule-ai-workspace-runtime on PyPI",
commonIssues: [
{
symptom: "Tools not appearing in your agent",
check:
"Run `claude mcp list` (or your runtime's equivalent) — the molecule entry should be listed. If missing, re-run the `claude mcp add` line.",
},
{
symptom: "ConnectionRefused / DNS error on first call",
check:
"PLATFORM_URL must include the scheme (https://) and have no trailing slash. Verify with `curl $PLATFORM_URL/healthz`.",
},
],
},
python: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://pypi.org/project/molecule-ai-workspace-runtime/",
downloadLabel: "molecule-ai-workspace-runtime on PyPI",
commonIssues: [
{
symptom: "401 from /heartbeat",
check:
"AUTH_TOKEN expired or wrong workspace_id. Tokens are shown only once at create time — re-create the workspace to get a fresh token.",
},
{
symptom: "AGENT_URL not reachable from platform",
check:
"Public HTTPS URL required for inbound A2A. Use ngrok or Cloudflare Tunnel if your agent is behind NAT.",
},
],
},
claude: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://claude.com/claude-code",
downloadLabel: "Claude Code (claude.com)",
commonIssues: [
{
symptom: "plugin not installed",
check:
"Run `/plugin marketplace add Molecule-AI/molecule-mcp-claude-channel` then `/plugin install molecule@molecule-mcp-claude-channel` inside Claude Code, then `/reload-plugins`.",
},
{
symptom: "not on the approved channels allowlist",
check:
"Custom channels need `--dangerously-load-development-channels` on the launch command. Team/Enterprise orgs need admin to set `channelsEnabled` + `allowedChannelPlugins` in claude.ai admin settings.",
},
{
symptom: "Inbound messages not arriving",
check:
"Check stderr for `molecule channel: connected — watching N workspace(s)`. Verify ~/.claude/channels/molecule/.env has the right PLATFORM_URL + token.",
},
],
},
hermes: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
downloadUrl: "https://github.com/NousResearch/hermes-agent",
downloadLabel: "hermes-agent (NousResearch)",
commonIssues: [
{
symptom: "Gateway start failure",
check:
"Tail ~/.hermes/gateway.log. YAML duplicate-key in config.yaml is the most common cause — `gateway:` block must appear exactly once.",
},
{
symptom: "Plugin not discovered after install",
check:
"Run `pip show hermes-channel-molecule` to confirm install. Some hermes builds need `hermes plugin reload` before the new platform_plugins entry takes effect.",
},
],
},
codex: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
downloadUrl: "https://github.com/openai/codex",
downloadLabel: "openai/codex",
commonIssues: [
{
symptom: "[mcp_servers.molecule] not loaded",
check:
"Codex must be ≥ 0.57. Check with `codex --version`; upgrade via `npm install -g @openai/codex@latest`.",
},
{
symptom: "TOML parse error after re-running setup",
check:
"TOML rejects duplicate `[mcp_servers.molecule]` tables. Open ~/.codex/config.toml and remove the old block before pasting the new one.",
},
{
symptom: "Canvas messages don't wake codex",
check:
"Step 3 (codex-channel-molecule bridge daemon) is required for inbound push. Check `pgrep -f codex-channel-molecule` and `tail ~/.codex-channel-molecule/daemon.log`.",
},
],
},
openclaw: {
docsUrl: "https://docs.molecule.ai/docs/guides/mcp-server-setup",
docsLabel: "MCP server setup guide",
commonIssues: [
{
symptom: "Gateway not starting",
check:
"Tail ~/.openclaw/gateway.log. The loopback bind requires :18789 to be free — check with `lsof -iTCP:18789`.",
},
{
symptom: "openclaw mcp set rejected",
check:
"The heredoc generates JSON; verify it parsed by running `jq < ~/.openclaw/mcp/molecule.json`. Re-run `openclaw mcp set` if the file is malformed.",
},
],
},
curl: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
commonIssues: [
{
symptom: "401 / 403 on register",
check:
"WORKSPACE_AUTH_TOKEN must be the value shown at workspace create. Tokens are shown only once.",
},
],
},
fields: {
docsUrl:
"https://docs.molecule.ai/docs/guides/external-agent-registration",
docsLabel: "External agent registration guide",
},
};
export interface ExternalConnectionInfo {
workspace_id: string;
platform_url: string;
@@ -457,7 +303,6 @@ export function ExternalConnectModal({ info, onClose }: Props) {
<Field label="heartbeat_endpoint" value={info.heartbeat_endpoint} onCopy={() => copy(info.heartbeat_endpoint, "hb")} copied={copiedKey === "hb"} />
</div>
)}
<HelpBlock help={TAB_HELP[tab]} />
</div>
<div className="mt-5 flex justify-end gap-2">
@@ -506,70 +351,6 @@ function SnippetBlock({
);
}
// HelpBlock — collapsible "Need help?" section under each tab's snippet.
// Renders only the keys present in the per-tab help metadata (no empty
// sections). Closed by default so the snippet stays the visual focus;
// operators with a working setup never see this. Uses native <details>
// for keyboard accessibility (Tab + Enter) without extra ARIA wiring.
function HelpBlock({
help,
}: {
help: (typeof TAB_HELP)[Tab] | undefined;
}) {
if (!help) return null;
const { docsUrl, docsLabel, downloadUrl, downloadLabel, commonIssues } = help;
if (!docsUrl && !downloadUrl && !commonIssues?.length) return null;
return (
<details className="mt-3 border border-line rounded-lg bg-surface text-xs">
<summary className="cursor-pointer select-none px-3 py-2 text-ink-mid hover:text-ink">
Need help? install link, docs, common errors
</summary>
<div className="px-3 pb-3 pt-1 space-y-2">
{downloadUrl && (
<div>
<span className="text-ink-soft">Where to install: </span>
<a
href={downloadUrl}
target="_blank"
rel="noopener noreferrer"
className="text-accent underline hover:text-accent-strong"
>
{downloadLabel || downloadUrl}
</a>
</div>
)}
{docsUrl && (
<div>
<span className="text-ink-soft">Documentation: </span>
<a
href={docsUrl}
target="_blank"
rel="noopener noreferrer"
className="text-accent underline hover:text-accent-strong"
>
{docsLabel || docsUrl}
</a>
</div>
)}
{commonIssues && commonIssues.length > 0 && (
<div>
<div className="text-ink-soft mb-1">Common errors:</div>
<ul className="space-y-1.5 pl-3">
{commonIssues.map((issue, i) => (
<li key={i}>
<code className="text-warm font-mono">{issue.symptom}</code>
<span className="text-ink-mid"> {issue.check}</span>
</li>
))}
</ul>
</div>
)}
</div>
</details>
);
}
function Field({
label,
value,
@@ -1,261 +0,0 @@
'use client';
import { useEffect, useRef, useState } from "react";
import { createPortal } from "react-dom";
import { api } from "@/lib/api";
import type { MemoryEntry } from "@/components/MemoryInspectorPanel";
type Scope = "LOCAL" | "TEAM" | "GLOBAL";
const SCOPES: Scope[] = ["LOCAL", "TEAM", "GLOBAL"];
interface AddProps {
open: boolean;
mode: "add";
workspaceId: string;
defaultScope: Scope;
defaultNamespace?: string;
entry?: undefined;
onClose: () => void;
onSaved: () => void;
}
interface EditProps {
open: boolean;
mode: "edit";
workspaceId: string;
entry: MemoryEntry;
defaultScope?: undefined;
defaultNamespace?: undefined;
onClose: () => void;
onSaved: () => void;
}
type Props = AddProps | EditProps;
export function MemoryEditorDialog(props: Props) {
const { open, mode, workspaceId, onClose, onSaved } = props;
const dialogRef = useRef<HTMLDivElement>(null);
const [mounted, setMounted] = useState(false);
const [scope, setScope] = useState<Scope>("LOCAL");
const [namespace, setNamespace] = useState("general");
const [content, setContent] = useState("");
const [saving, setSaving] = useState(false);
const [error, setError] = useState<string | null>(null);
useEffect(() => {
setMounted(true);
}, []);
// Reset form whenever the dialog opens.
useEffect(() => {
if (!open) return;
setError(null);
setSaving(false);
if (mode === "edit" && props.entry) {
setScope(props.entry.scope);
setNamespace(props.entry.namespace || "general");
setContent(props.entry.content);
} else if (mode === "add") {
setScope(props.defaultScope);
setNamespace(props.defaultNamespace || "general");
setContent("");
}
// mode/props are stable per-open; intentional shallow deps.
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [open]);
// Move focus into the dialog when it opens (WCAG SC 2.4.3).
useEffect(() => {
if (!open || !mounted) return;
const raf = requestAnimationFrame(() => {
dialogRef.current?.querySelector<HTMLElement>("textarea, input, select")?.focus();
});
return () => cancelAnimationFrame(raf);
}, [open, mounted]);
// Escape closes; Cmd/Ctrl-Enter saves.
const onCloseRef = useRef(onClose);
onCloseRef.current = onClose;
const handleSaveRef = useRef<() => void>(() => {});
useEffect(() => {
if (!open) return;
const handler = (e: KeyboardEvent) => {
if (e.key === "Escape") {
e.preventDefault();
onCloseRef.current();
} else if (e.key === "Enter" && (e.metaKey || e.ctrlKey)) {
e.preventDefault();
handleSaveRef.current();
}
};
window.addEventListener("keydown", handler);
return () => window.removeEventListener("keydown", handler);
}, [open]);
const handleSave = async () => {
if (saving) return;
const trimmed = content.trim();
if (!trimmed) {
setError("Content cannot be empty");
return;
}
setError(null);
setSaving(true);
try {
if (mode === "add") {
await api.post(`/workspaces/${workspaceId}/memories`, {
content: trimmed,
scope,
namespace: namespace.trim() || "general",
});
} else {
// PATCH only sends fields that changed. Content always changeable;
// namespace only sent if it differs from the original (saves a
// no-op write through redactSecrets + re-embed).
const original = props.entry;
const body: Record<string, string> = {};
if (trimmed !== original.content) body.content = trimmed;
const ns = namespace.trim() || "general";
if (ns !== original.namespace) body.namespace = ns;
if (Object.keys(body).length === 0) {
// No-op edit — close without an HTTP round-trip.
onSaved();
onClose();
return;
}
await api.patch(
`/workspaces/${workspaceId}/memories/${encodeURIComponent(original.id)}`,
body,
);
}
onSaved();
onClose();
} catch (e) {
setError(e instanceof Error ? e.message : "Save failed");
} finally {
setSaving(false);
}
};
handleSaveRef.current = handleSave;
if (!open || !mounted) return null;
const titleId = "memory-editor-title";
const isEdit = mode === "edit";
return createPortal(
<div className="fixed inset-0 z-[9999] flex items-center justify-center">
<div className="absolute inset-0 bg-black/60 backdrop-blur-sm" onClick={onClose} />
<div
ref={dialogRef}
role="dialog"
aria-modal="true"
aria-labelledby={titleId}
className="relative bg-surface-sunken border border-line rounded-xl shadow-2xl shadow-black/50 max-w-[480px] w-full mx-4 overflow-hidden"
>
<div className="px-5 py-4 space-y-3">
<h3 id={titleId} className="text-sm font-semibold text-ink">
{isEdit ? "Edit memory" : "Add memory"}
</h3>
{/* Scope */}
<div className="space-y-1">
<label className="text-[10px] text-ink-soft block" htmlFor="memory-editor-scope">
Scope
</label>
{isEdit ? (
<div
id="memory-editor-scope"
className="text-[12px] font-mono text-ink-mid bg-surface rounded px-2 py-1.5 border border-line/50"
title="Scope is fixed on edit. To move a memory across scopes, delete and re-create it."
>
{scope}
</div>
) : (
<div className="flex items-center gap-1" id="memory-editor-scope" role="radiogroup" aria-label="Scope">
{SCOPES.map((s) => (
<button
key={s}
type="button"
role="radio"
aria-checked={scope === s}
onClick={() => setScope(s)}
className={[
"px-3 py-1 text-[11px] rounded transition-colors",
scope === s
? "bg-accent-strong text-white"
: "bg-surface-card text-ink-mid hover:text-ink",
].join(" ")}
>
{s}
</button>
))}
</div>
)}
</div>
{/* Namespace */}
<div className="space-y-1">
<label htmlFor="memory-editor-namespace" className="text-[10px] text-ink-soft block">
Namespace
</label>
<input
id="memory-editor-namespace"
type="text"
value={namespace}
onChange={(e) => setNamespace(e.target.value)}
placeholder="general"
className="w-full bg-surface border border-line/60 focus:border-accent/60 rounded px-2 py-1.5 text-[12px] text-ink placeholder-zinc-600 focus:outline-none transition-colors"
/>
</div>
{/* Content */}
<div className="space-y-1">
<label htmlFor="memory-editor-content" className="text-[10px] text-ink-soft block">
Content
</label>
<textarea
id="memory-editor-content"
value={content}
onChange={(e) => setContent(e.target.value)}
rows={6}
placeholder="What should the agent remember?"
className="w-full bg-surface border border-line/60 focus:border-accent/60 rounded px-2 py-1.5 text-[12px] font-mono text-ink placeholder-zinc-600 focus:outline-none transition-colors resize-y min-h-[100px] max-h-[300px]"
/>
</div>
{error && (
<div
role="alert"
aria-live="assertive"
className="px-2 py-1.5 bg-red-950/30 border border-red-800/40 rounded text-[11px] text-bad"
>
{error}
</div>
)}
</div>
<div className="flex items-center justify-end gap-2 px-5 py-3 border-t border-line bg-surface/50">
<button
type="button"
onClick={onClose}
disabled={saving}
className="px-3.5 py-1.5 text-[13px] text-ink-mid hover:text-ink bg-surface-card hover:bg-surface-elevated border border-line hover:border-line-soft rounded-lg transition-colors focus:outline-none focus-visible:ring-2 focus-visible:ring-accent/40 disabled:opacity-50 disabled:cursor-not-allowed"
>
Cancel
</button>
<button
type="button"
onClick={handleSave}
disabled={saving}
className="px-3.5 py-1.5 text-[13px] rounded-lg transition-colors bg-accent hover:bg-accent-strong text-white focus:outline-none focus-visible:ring-2 focus-visible:ring-offset-2 focus-visible:ring-offset-surface-sunken focus-visible:ring-accent/60 disabled:opacity-50 disabled:cursor-not-allowed"
>
{saving ? "Saving…" : isEdit ? "Save changes" : "Add memory"}
</button>
</div>
</div>
</div>,
document.body,
);
}
+383 -229
@@ -1,30 +1,81 @@
'use client';
import { useState, useEffect, useCallback } from "react";
import { api } from "@/lib/api";
import { ConfirmDialog } from "@/components/ConfirmDialog";
import { MemoryEditorDialog } from "@/components/MemoryEditorDialog";
/**
* MemoryInspectorPanel — Memory v2 redesign.
*
* Reads the canvas Memory tab from the v2 plugin via the
* workspace-server proxy at /v2/{namespaces,memories}, replacing the
* v1 LOCAL/TEAM/GLOBAL trio that mapped to the deprecated
* shared_context model.
*
* Surface differences from v1:
* - Namespace dropdown driven by GET /v2/namespaces (workspace /
* team / org / custom — labels rendered server-side).
* - Per-row badges for kind (fact|summary|checkpoint), source
* (agent|runtime|user), pin (📌), TTL countdown, and similarity
* score when a search query is active.
* - No Edit affordance — v2's plugin contract has no PATCH; the
* model is forget + recommit. Delete (Forget) stays.
*
* Shipping note: when the plugin isn't wired (MEMORY_PLUGIN_URL
* unset), every endpoint returns 503 with a clear hint. The panel
* surfaces that as a banner so operators know to set the env var,
* rather than rendering a perpetual empty state that looks like
* "no memories yet".
*/
import { useCallback, useEffect, useMemo, useState } from 'react';
import { api } from '@/lib/api';
import { ConfirmDialog } from '@/components/ConfirmDialog';
// ── Types ─────────────────────────────────────────────────────────────────────
/** Memory entry returned by GET /workspaces/:id/memories */
export interface MemoryEntry {
id: string;
workspace_id: string;
content: string;
scope: "LOCAL" | "TEAM" | "GLOBAL";
namespace: string;
created_at: string;
/**
* Semantic similarity score (0–1). Only present when the API is queried
* with ?q=<query> and the pgvector backend has been deployed.
* Absent on plain list fetches — renders gracefully without a badge.
*/
similarity_score?: number;
export type NamespaceKind = 'workspace' | 'team' | 'org' | 'custom';
export interface NamespaceView {
name: string;
kind: NamespaceKind;
label: string;
}
type Scope = "LOCAL" | "TEAM" | "GLOBAL";
const SCOPES: Scope[] = ["LOCAL", "TEAM", "GLOBAL"];
export interface NamespacesResponse {
readable: NamespaceView[];
writable: NamespaceView[];
}
export type MemoryKind = 'fact' | 'summary' | 'checkpoint';
export type MemorySource = 'agent' | 'runtime' | 'user';
export interface MemoryV2 {
id: string;
namespace: string;
content: string;
kind: MemoryKind;
source: MemorySource;
pin: boolean;
expires_at?: string | null;
created_at: string;
/** 0..1 plugin similarity score; only present when ?q= is set. */
score?: number | null;
// Note: an earlier iteration of this type carried a `source_workspace_id`
// field rendered as a "from peer" badge. The propagation contract that
// would have populated it ("Reserved for future cross-namespace
// propagation semantics" in memory-plugin-v1.yaml) is unimplemented —
// nothing in the codebase writes that key. Removed in self-review.
// Re-add when propagation gains a concrete shape.
}
interface MemoriesResponse {
memories: MemoryV2[];
}
// MemoryEntry kept as a back-compat type alias so any other component
// still importing it doesn't break the build. New consumers should
// prefer MemoryV2; the v1 shape (LOCAL/TEAM/GLOBAL scope) is gone.
//
// Because the alias points at MemoryV2, TS still flags accidental
// access to fields that only existed on the legacy shape (e.g. `scope`).
export type MemoryEntry = MemoryV2;
interface Props {
workspaceId: string;
@@ -32,11 +83,26 @@ interface Props {
// ── Helpers ───────────────────────────────────────────────────────────────────
/**
* Sanitise a memory id for use in an HTML id attribute.
*/
function sanitizeId(id: string): string {
return id.replace(/[^a-zA-Z0-9]/g, "-");
return id.replace(/[^a-zA-Z0-9]/g, '-');
}
/**
* Detect a memory-plugin-503 error from the api wrapper's stringified
* Error message. Matches on the literal env-var name rather than the
* status code, because the api shim renders status codes inside a
* larger formatted message and a future status-code reformat would
* silently break the detection.
*
* The substring `MEMORY_PLUGIN_URL` is hard-coded in the handler at
* `workspace-server/internal/handlers/memories_v2.go:available()`,
* so this is a pinned cross-layer contract — drift is caught by both
* the Go test (TestMemoriesV2_PluginUnwired_All503) and the canvas
* test (TestMemoryInspectorPanel — plugin unavailable).
*/
export function isPluginUnavailableError(err: unknown): boolean {
const msg = err instanceof Error ? err.message : '';
return msg.includes('MEMORY_PLUGIN_URL');
}
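A minimal sketch of how the detection above behaves, assuming the api wrapper throws an `Error` whose message embeds the server's hint. The two message strings below are hypothetical; the wrapper's real format is not shown in this diff. The helper is repeated only so the sketch is self-contained.

```typescript
// Copy of the helper from the diff, repeated so this sketch runs on its own.
function isPluginUnavailableError(err: unknown): boolean {
  const msg = err instanceof Error ? err.message : '';
  return msg.includes('MEMORY_PLUGIN_URL');
}

// Hypothetical messages (assumption): only the env-var substring matters.
const unwired = new Error('503: memory plugin not configured, set MEMORY_PLUGIN_URL');
const generic = new Error('500: internal error');

const detection = {
  unwired: isPluginUnavailableError(unwired),
  generic: isPluginUnavailableError(generic),
  // Non-Error values yield an empty message, so they never match.
  rawString: isPluginUnavailableError('MEMORY_PLUGIN_URL'),
};
```

Matching on the substring keeps the check independent of how the status code is formatted, at the cost of never matching a value that is thrown as a plain string.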
function formatRelativeTime(iso: string): string {
@@ -47,6 +113,24 @@ function formatRelativeTime(iso: string): string {
return new Date(iso).toLocaleDateString();
}
/**
* Render a TTL countdown like "12h", "3d", or "expired" (when the
* stored expires_at is in the past). Non-fatal if expires_at is null
* or invalid — falls through to empty string so the badge doesn't
* render.
*/
export function formatTTL(expiresAt: string | null | undefined): string {
if (!expiresAt) return '';
const ts = new Date(expiresAt).getTime();
if (Number.isNaN(ts)) return '';
const diff = ts - Date.now();
if (diff <= 0) return 'expired';
if (diff < 60_000) return `${Math.floor(diff / 1000)}s`;
if (diff < 3_600_000) return `${Math.floor(diff / 60_000)}m`;
if (diff < 86_400_000) return `${Math.floor(diff / 3_600_000)}h`;
return `${Math.floor(diff / 86_400_000)}d`;
}
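The branches of `formatTTL` can be exercised deterministically if the clock is injected. The variant below, `formatTTLAt`, is a hypothetical test-oriented rewrite (not part of this PR) that takes the current time as a parameter instead of calling `Date.now()`; the thresholds are copied from the function above.

```typescript
// Hypothetical variant of formatTTL with an injectable clock (assumption:
// the PR's version reads Date.now() directly; logic is otherwise identical).
function formatTTLAt(expiresAt: string | null | undefined, nowMs: number): string {
  if (!expiresAt) return '';
  const ts = new Date(expiresAt).getTime();
  if (Number.isNaN(ts)) return '';
  const diff = ts - nowMs;
  if (diff <= 0) return 'expired';
  if (diff < 60_000) return `${Math.floor(diff / 1000)}s`;
  if (diff < 3_600_000) return `${Math.floor(diff / 60_000)}m`;
  if (diff < 86_400_000) return `${Math.floor(diff / 3_600_000)}h`;
  return `${Math.floor(diff / 86_400_000)}d`;
}

// Fixed reference instant so every result is reproducible.
const now = Date.parse('2026-05-05T00:00:00.000Z');

const ttlSamples = [
  formatTTLAt('2026-05-05T00:00:45.000Z', now), // 45 seconds ahead
  formatTTLAt('2026-05-05T00:30:00.000Z', now), // 30 minutes ahead
  formatTTLAt('2026-05-05T12:00:00.000Z', now), // 12 hours ahead
  formatTTLAt('2026-05-08T00:00:00.000Z', now), // 3 days ahead
  formatTTLAt('2026-05-04T23:59:59.000Z', now), // already in the past
  formatTTLAt(null, now),                       // no expiry set
  formatTTLAt('not-a-date', now),               // unparseable input
];
```

With the fixed clock, `ttlSamples` covers the s/m/h/d/expired/null/invalid branches that the test plan lists for the helper.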
// ── Skeleton rows ──────────────────────────────────────────────────────────────
function MemorySkeletonRows() {
@@ -71,63 +155,92 @@ function MemorySkeletonRows() {
// ── Component ─────────────────────────────────────────────────────────────────
const ALL_NAMESPACES = '__all__';
export function MemoryInspectorPanel({ workspaceId }: Props) {
const [activeScope, setActiveScope] = useState<Scope>("LOCAL");
const [activeNamespace, setActiveNamespace] = useState("");
const [entries, setEntries] = useState<MemoryEntry[]>([]);
const [namespaces, setNamespaces] = useState<NamespacesResponse | null>(null);
const [activeNamespace, setActiveNamespace] = useState<string>(ALL_NAMESPACES);
const [entries, setEntries] = useState<MemoryV2[]>([]);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<string | null>(null);
// ── Search state (debounced) ────────────────────────────────────────────────
const [searchQuery, setSearchQuery] = useState("");
const [debouncedQuery, setDebouncedQuery] = useState("");
// Plugin-disabled banner (503 from server). Stored separately so the
// panel keeps rendering with an empty, disabled namespace dropdown
// rather than hiding entirely.
const [pluginUnavailable, setPluginUnavailable] = useState(false);
// Search state (debounced)
const [searchQuery, setSearchQuery] = useState('');
const [debouncedQuery, setDebouncedQuery] = useState('');
useEffect(() => {
const timer = setTimeout(
() => setDebouncedQuery(searchQuery.trim()),
300
);
const timer = setTimeout(() => setDebouncedQuery(searchQuery.trim()), 300);
return () => clearTimeout(timer);
}, [searchQuery]);
// ── Delete state ─────────────────────────────────────────────────────────────
// Delete state
const [pendingDeleteId, setPendingDeleteId] = useState<string | null>(null);
// ── Editor state (Add + Edit share one modal) ───────────────────────────────
type EditorState =
| { mode: "add" }
| { mode: "edit"; entry: MemoryEntry }
| null;
const [editorState, setEditorState] = useState<EditorState>(null);
// ── Namespace loading ──────────────────────────────────────────────────────
// ── Data loading ────────────────────────────────────────────────────────────
const loadNamespaces = useCallback(async () => {
try {
const data = await api.get<NamespacesResponse>(
`/workspaces/${workspaceId}/v2/namespaces`,
);
setNamespaces(data);
setPluginUnavailable(false);
} catch (e) {
// Plugin-unavailable (503) indicates MEMORY_PLUGIN_URL isn't set.
// Anything else stays as a generic load failure that the
// entries-load path will also flag.
if (isPluginUnavailableError(e)) {
setPluginUnavailable(true);
}
setNamespaces({ readable: [], writable: [] });
}
}, [workspaceId]);
// ── Entries loading ────────────────────────────────────────────────────────
const loadEntries = useCallback(async () => {
setLoading(true);
setError(null);
try {
const params = new URLSearchParams();
params.set("scope", activeScope);
if (debouncedQuery) params.set("q", debouncedQuery);
if (activeNamespace) params.set("namespace", activeNamespace);
if (activeNamespace !== ALL_NAMESPACES) {
params.set('namespace', activeNamespace);
}
if (debouncedQuery) params.set('q', debouncedQuery);
const url = `/workspaces/${workspaceId}/memories?${params.toString()}`;
const data = await api.get<MemoryEntry[]>(url);
const url = `/workspaces/${workspaceId}/v2/memories?${params.toString()}`;
const data = await api.get<MemoriesResponse>(url);
// When a semantic query is active, sort by similarity_score descending.
// When a semantic query is active and the plugin returns
// scores, sort by score descending so the most-relevant hit
// sits at the top. Empty score → push to bottom.
const sorted = debouncedQuery
? [...data].sort(
(a, b) => (b.similarity_score ?? 0) - (a.similarity_score ?? 0)
? [...data.memories].sort(
(a, b) => (b.score ?? 0) - (a.score ?? 0),
)
: data;
: data.memories;
setEntries(sorted);
} catch (e) {
setError(e instanceof Error ? e.message : "Failed to load memories");
if (isPluginUnavailableError(e)) {
setPluginUnavailable(true);
setError(null); // surfaced via banner, not row error
} else {
setError(e instanceof Error ? e.message : 'Failed to load memories');
}
setEntries([]);
} finally {
setLoading(false);
}
}, [workspaceId, activeScope, debouncedQuery, activeNamespace]);
}, [workspaceId, activeNamespace, debouncedQuery]);
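The request-building and ordering logic inside `loadEntries` can be sketched as two pure functions. `buildMemoriesUrl` and `sortByScore` are hypothetical names introduced for illustration; the PR keeps this logic inline in the callback.

```typescript
// Sentinel value copied from the component; it means "no namespace filter".
const ALL_NAMESPACES = '__all__';

// Reduced shape: only the fields the sort needs.
interface ScoredMemory {
  id: string;
  score?: number | null;
}

// Hypothetical helper: mirrors how loadEntries assembles the v2 URL.
function buildMemoriesUrl(workspaceId: string, namespace: string, query: string): string {
  const params = new URLSearchParams();
  if (namespace !== ALL_NAMESPACES) params.set('namespace', namespace);
  if (query) params.set('q', query);
  return `/workspaces/${workspaceId}/v2/memories?${params.toString()}`;
}

// Hypothetical helper: highest score first; a missing or null score counts as 0.
function sortByScore<T extends ScoredMemory>(memories: T[]): T[] {
  return [...memories].sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}

const unfilteredUrl = buildMemoriesUrl('ws-1', ALL_NAMESPACES, '');
const filteredUrl = buildMemoriesUrl('ws-1', 'team:t-1', 'deploy day');
const ranked = sortByScore([
  { id: 'a', score: 0.4 },
  { id: 'b' },
  { id: 'c', score: 0.9 },
]).map((m) => m.id);
```

Note that with no filter and no query the URL ends in a bare `?`, exactly as the inline code would produce it. Copying the array before sorting avoids mutating the response object.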
useEffect(() => {
loadNamespaces();
}, [loadNamespaces]);
useEffect(() => {
loadEntries();
@@ -144,16 +257,35 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
setEntries((prev) => prev.filter((e) => e.id !== id));
try {
await api.del(`/workspaces/${workspaceId}/memories/${encodeURIComponent(id)}`);
await api.del(`/workspaces/${workspaceId}/v2/memories/${encodeURIComponent(id)}`);
} catch (e) {
setError(e instanceof Error ? e.message : "Delete failed — reloading...");
// Reload first (which clears any stale error), THEN set the
// delete-failure message — otherwise loadEntries' own
// `setError(null)` wipes our error before the user sees it.
// Caught by the rollback test in MemoryInspectorPanel.test.tsx.
const msg = e instanceof Error ? e.message : 'Delete failed — reloading…';
await loadEntries();
setError(msg);
}
}, [pendingDeleteId, workspaceId, loadEntries]);
// ── Namespace dropdown options ─────────────────────────────────────────────
const dropdownOptions = useMemo(() => {
const opts: Array<{ value: string; label: string; kind?: NamespaceKind }> = [
{ value: ALL_NAMESPACES, label: 'All namespaces' },
];
if (namespaces) {
for (const ns of namespaces.readable) {
opts.push({ value: ns.name, label: ns.label, kind: ns.kind });
}
}
return opts;
}, [namespaces]);
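As a rough illustration of what the memoised options look like, the sketch below lifts the same loop into a standalone function. `buildDropdownOptions` and `optionText` are hypothetical names, and the sample namespaces are borrowed from the test fixtures in this PR.

```typescript
type NamespaceKind = 'workspace' | 'team' | 'org' | 'custom';

interface NamespaceView {
  name: string;
  kind: NamespaceKind;
  label: string;
}

interface NamespacesResponse {
  readable: NamespaceView[];
  writable: NamespaceView[];
}

interface DropdownOption {
  value: string;
  label: string;
  kind?: NamespaceKind;
}

const ALL_NAMESPACES = '__all__';

// Hypothetical helper: same logic as the useMemo body, minus React.
function buildDropdownOptions(namespaces: NamespacesResponse | null): DropdownOption[] {
  const opts: DropdownOption[] = [{ value: ALL_NAMESPACES, label: 'All namespaces' }];
  if (namespaces) {
    for (const ns of namespaces.readable) {
      opts.push({ value: ns.name, label: ns.label, kind: ns.kind });
    }
  }
  return opts;
}

// Hypothetical helper: the text an <option> ends up showing.
function optionText(opt: DropdownOption): string {
  return opt.kind ? `${opt.label} (${opt.kind})` : opt.label;
}

const beforeLoad = buildDropdownOptions(null);
const afterLoad = buildDropdownOptions({
  readable: [
    { name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' },
    { name: 'team:t-1', kind: 'team', label: 'Team (t-1)' },
  ],
  writable: [{ name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' }],
});
const optionLabels = afterLoad.map(optionText);
```

Only `readable` feeds the filter; `writable` is ignored here because the panel no longer offers an Add affordance.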
// ── Render ──────────────────────────────────────────────────────────────────
if (loading && entries.length === 0 && !error) {
if (loading && entries.length === 0 && !error && !pluginUnavailable) {
return (
<div className="flex items-center justify-center h-32">
<span className="text-xs text-ink-soft">Loading memories</span>
@@ -163,32 +295,44 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
return (
<div className="flex flex-col h-full">
{/* Scope tabs */}
<div className="px-4 pt-3 pb-2 border-b border-line/40 shrink-0">
<div className="flex items-center gap-1">
{SCOPES.map((scope) => (
<button
type="button"
key={scope}
onClick={() => setActiveScope(scope)}
aria-pressed={activeScope === scope}
className={[
"px-3 py-1 text-[11px] rounded transition-colors",
activeScope === scope
? "bg-accent-strong text-white"
: "bg-surface-card text-ink-mid hover:bg-surface-card hover:text-ink",
].join(" ")}
>
{scope}
</button>
))}
{/* Plugin-unavailable banner */}
{pluginUnavailable && (
<div
role="alert"
aria-live="polite"
className="mx-4 mt-3 px-3 py-2 bg-amber-950/30 border border-amber-800/40 rounded text-xs text-amber-300 shrink-0"
data-testid="plugin-unavailable-banner"
>
Memory plugin not configured. Set <code>MEMORY_PLUGIN_URL</code> on the
workspace-server to enable v2 memory.
</div>
</div>
)}
{/* Search bar + namespace filter */}
{/* Namespace dropdown */}
<div className="px-4 pt-3 pb-2 border-b border-line/40 shrink-0 space-y-2">
<div className="flex items-center gap-2">
<label htmlFor="namespace-dropdown" className="text-[10px] text-ink-soft shrink-0">
Namespace:
</label>
<select
id="namespace-dropdown"
value={activeNamespace}
onChange={(e) => setActiveNamespace(e.target.value)}
aria-label="Filter by namespace"
disabled={pluginUnavailable}
className="flex-1 bg-surface-sunken border border-line/60 focus:border-accent/60 rounded px-2 py-1 text-[11px] text-ink focus:outline-none transition-colors min-w-0 disabled:opacity-50 disabled:cursor-not-allowed"
>
{dropdownOptions.map((opt) => (
<option key={opt.value} value={opt.value}>
{opt.label}
{opt.kind ? ` (${opt.kind})` : ''}
</option>
))}
</select>
</div>
{/* Search bar */}
<div className="relative flex items-center">
{/* Magnifying glass icon */}
<svg
width="12"
height="12"
@@ -206,14 +350,15 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
onChange={(e) => setSearchQuery(e.target.value)}
placeholder="Semantic search…"
aria-label="Search memories"
className="w-full bg-surface-sunken border border-line/60 focus:border-accent/60 rounded-lg pl-8 pr-7 py-1.5 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors"
disabled={pluginUnavailable}
className="w-full bg-surface-sunken border border-line/60 focus:border-accent/60 rounded-lg pl-8 pr-7 py-1.5 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
/>
{searchQuery && (
<button
type="button"
onClick={() => {
setSearchQuery("");
setDebouncedQuery("");
setSearchQuery('');
setDebouncedQuery('');
}}
aria-label="Clear search"
className="absolute right-2 text-ink-soft hover:text-ink transition-colors text-sm leading-none"
@@ -222,51 +367,26 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
</button>
)}
</div>
{/* Namespace filter */}
<div className="flex items-center gap-2">
<label htmlFor="namespace-filter" className="text-[10px] text-ink-soft shrink-0">
Namespace:
</label>
<input
id="namespace-filter"
type="text"
value={activeNamespace}
onChange={(e) => setActiveNamespace(e.target.value)}
placeholder="all namespaces"
aria-label="Filter by namespace"
className="flex-1 bg-surface-sunken border border-line/60 focus:border-accent/60 rounded px-2 py-1 text-[11px] text-ink placeholder-zinc-600 focus:outline-none transition-colors min-w-0"
/>
</div>
</div>
{/* Toolbar */}
<div className="px-4 py-2.5 border-b border-line/40 flex items-center justify-between shrink-0">
<span className="text-[11px] text-ink-soft">
{debouncedQuery
? `${entries.length} result${entries.length !== 1 ? "s" : ""}`
? `${entries.length} result${entries.length !== 1 ? 's' : ''}`
: entries.length === 1
? "1 memory"
: `${entries.length} memories`}
? '1 memory'
: `${entries.length} memories`}
</span>
<div className="flex items-center gap-1.5">
<button
type="button"
onClick={() => setEditorState({ mode: "add" })}
className="px-2 py-1 text-[11px] bg-accent hover:bg-accent-strong text-white rounded transition-colors"
aria-label="Add memory"
>
+ Add
</button>
<button
type="button"
onClick={loadEntries}
className="px-2 py-1 text-[11px] bg-surface-card hover:bg-surface-card text-ink-mid rounded transition-colors"
aria-label="Refresh memories"
>
Refresh
</button>
</div>
<button
type="button"
onClick={loadEntries}
disabled={pluginUnavailable}
className="px-2 py-1 text-[11px] bg-surface-card hover:bg-surface-card text-ink-mid rounded transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
aria-label="Refresh memories"
>
Refresh
</button>
</div>
{/* Error banner */}
@@ -285,47 +405,13 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
{loading ? (
<MemorySkeletonRows />
) : entries.length === 0 ? (
debouncedQuery ? (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true"></span>
<p className="text-sm font-medium text-ink-mid">
No memories match your search
</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
Try a different query or{" "}
<button
type="button"
onClick={() => {
setSearchQuery("");
setDebouncedQuery("");
}}
className="text-accent hover:text-accent underline transition-colors"
>
clear the search
</button>
.
</p>
</div>
) : (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true"></span>
<p className="text-sm font-medium text-ink-mid">No {activeScope} memories</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
{activeScope === "LOCAL"
? "This workspace has not written any local memories yet."
: activeScope === "TEAM"
? "No team memories shared with this workspace yet."
: "No global memories exist yet."}
</p>
</div>
)
<EmptyState query={debouncedQuery} pluginUnavailable={pluginUnavailable} />
) : (
<div className="space-y-1.5">
{entries.map((entry) => (
<MemoryEntryRow
key={entry.id}
entry={entry}
onEdit={() => setEditorState({ mode: "edit", entry })}
onDelete={() => setPendingDeleteId(entry.id)}
/>
))}
@@ -336,36 +422,64 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
{/* Delete confirmation dialog */}
<ConfirmDialog
open={pendingDeleteId !== null}
title="Delete memory"
message={`Delete this ${activeScope} memory? This cannot be undone.`}
confirmLabel="Delete"
title="Forget memory"
message="Forget this memory? This cannot be undone."
confirmLabel="Forget"
confirmVariant="danger"
onConfirm={confirmDelete}
onCancel={() => setPendingDeleteId(null)}
/>
</div>
);
}
{/* Add / Edit dialog */}
{editorState?.mode === "add" && (
<MemoryEditorDialog
open={true}
mode="add"
workspaceId={workspaceId}
defaultScope={activeScope}
defaultNamespace={activeNamespace || "general"}
onClose={() => setEditorState(null)}
onSaved={loadEntries}
/>
)}
{editorState?.mode === "edit" && (
<MemoryEditorDialog
open={true}
mode="edit"
workspaceId={workspaceId}
entry={editorState.entry}
onClose={() => setEditorState(null)}
onSaved={loadEntries}
/>
)}
// ── Empty state ─────────────────────────────────────────────────────────────
function EmptyState({
query,
pluginUnavailable,
}: {
query: string;
pluginUnavailable: boolean;
}) {
if (pluginUnavailable) {
// The banner already explains the problem; the empty rows just
// mirror it so the operator sees both signals.
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">Memory plugin disabled</p>
<p className="text-[11px] text-ink-soft max-w-[220px] leading-relaxed">
See banner above for the operator-side fix.
</p>
</div>
);
}
if (query) {
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">No memories match your search</p>
<p className="text-[11px] text-ink-soft max-w-[200px] leading-relaxed">
Try a different query or clear the search.
</p>
</div>
);
}
return (
<div className="flex flex-col items-center justify-center py-16 gap-3 text-center">
<span className="text-4xl text-ink-soft" aria-hidden="true">
</span>
<p className="text-sm font-medium text-ink-mid">No memories yet</p>
<p className="text-[11px] text-ink-soft max-w-[220px] leading-relaxed">
Agents commit memories via MCP tools (commit_memory, commit_summary). They
appear here once written.
</p>
</div>
);
}
@@ -373,17 +487,32 @@ export function MemoryInspectorPanel({ workspaceId }: Props) {
// ── MemoryEntryRow sub-component ──────────────────────────────────────────────
interface MemoryEntryRowProps {
entry: MemoryEntry;
onEdit: () => void;
entry: MemoryV2;
onDelete: () => void;
}
function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
const KIND_BADGE_CLASS: Record<MemoryKind, string> = {
fact: 'bg-surface-card text-ink-mid',
summary: 'bg-blue-950 text-accent',
checkpoint: 'bg-violet-950 text-violet-400',
};
const SOURCE_BADGE_CLASS: Record<MemorySource, string> = {
agent: 'bg-surface-card text-ink-mid',
runtime: 'bg-amber-950 text-amber-300',
user: 'bg-emerald-950 text-emerald-400',
};
function MemoryEntryRow({ entry, onDelete }: MemoryEntryRowProps) {
const [expanded, setExpanded] = useState(false);
const bodyId = `mem-body-${sanitizeId(entry.id)}`;
const ttl = formatTTL(entry.expires_at);
return (
<div className="rounded-lg border border-line/60 bg-surface-sunken/50 overflow-hidden">
<div
className="rounded-lg border border-line/60 bg-surface-sunken/50 overflow-hidden"
data-testid={`memory-row-${entry.id}`}
>
{/* Header row */}
<button
type="button"
@@ -392,52 +521,89 @@ function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
aria-expanded={expanded}
aria-controls={bodyId}
>
{/* Scope badge */}
{/* Kind badge */}
<span
className={[
"text-[9px] shrink-0 font-mono px-1 py-0.5 rounded",
entry.scope === "LOCAL"
? "bg-surface-card text-ink-mid"
: entry.scope === "TEAM"
? "bg-blue-950 text-accent"
: "bg-violet-950 text-violet-400",
].join(" ")}
title={`Scope: ${entry.scope}`}
'text-[9px] shrink-0 font-mono px-1 py-0.5 rounded',
KIND_BADGE_CLASS[entry.kind] ?? 'bg-surface-card text-ink-mid',
].join(' ')}
title={`Kind: ${entry.kind}`}
data-testid="kind-badge"
>
{entry.scope[0]}
{entry.kind[0].toUpperCase()}
</span>
{/* Source badge */}
<span
className={[
'text-[9px] shrink-0 font-mono px-1 py-0.5 rounded',
SOURCE_BADGE_CLASS[entry.source] ?? 'bg-surface-card text-ink-mid',
].join(' ')}
title={`Source: ${entry.source}`}
data-testid="source-badge"
>
{entry.source}
</span>
{/* Pin indicator */}
{entry.pin && (
<span
className="text-[9px] shrink-0"
title="Pinned"
data-testid="pin-badge"
aria-label="Pinned"
>
📌
</span>
)}
{/* Namespace tag */}
<span className="text-[9px] shrink-0 font-mono text-ink-soft truncate max-w-[80px]" title={entry.namespace}>
<span
className="text-[9px] shrink-0 font-mono text-ink-soft truncate max-w-[100px]"
title={entry.namespace}
>
{entry.namespace}
</span>
{/* Content preview */}
<span className="flex-1 min-w-0 text-[10px] font-mono text-ink-mid truncate text-left">
{entry.content.length > 60 ? entry.content.slice(0, 60) + "…" : entry.content}
{entry.content.length > 60 ? entry.content.slice(0, 60) + '…' : entry.content}
</span>
{/* Similarity badge */}
{entry.similarity_score != null && (
{/* Score badge (semantic search only) */}
{entry.score != null && (
<span
className={[
"text-[9px] shrink-0 font-mono tabular-nums",
entry.similarity_score >= 0.8
? "text-accent"
: "text-ink-mid",
].join(" ")}
title={`Similarity: ${(entry.similarity_score * 100).toFixed(1)}%`}
data-testid="similarity-badge"
'text-[9px] shrink-0 font-mono tabular-nums',
entry.score >= 0.8 ? 'text-accent' : 'text-ink-mid',
].join(' ')}
title={`Similarity: ${(entry.score * 100).toFixed(1)}%`}
data-testid="score-badge"
>
{Math.round(entry.similarity_score * 100)}%
{Math.round(entry.score * 100)}%
</span>
)}
{/* TTL countdown */}
{ttl && (
<span
className={[
'text-[9px] shrink-0 font-mono',
ttl === 'expired' ? 'text-bad' : 'text-amber-400',
].join(' ')}
title={`Expires: ${entry.expires_at}`}
data-testid="ttl-badge"
>
{ttl}
</span>
)}
<span className="text-[9px] text-ink-soft shrink-0">
{formatRelativeTime(entry.created_at)}
</span>
<span className="text-[9px] text-ink-soft shrink-0" aria-hidden="true">
{expanded ? "▼" : "▶"}
{expanded ? '▼' : '▶'}
</span>
</button>
@@ -455,31 +621,19 @@ function MemoryEntryRow({ entry, onEdit, onDelete }: MemoryEntryRowProps) {
<div className="flex items-center justify-between gap-2">
<span className="text-[9px] text-ink-soft">
Created: {new Date(entry.created_at).toLocaleString()}
{entry.expires_at && ` · Expires: ${new Date(entry.expires_at).toLocaleString()}`}
</span>
<div className="flex items-center gap-1.5 shrink-0">
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onEdit();
}}
aria-label="Edit memory"
className="text-[10px] px-2 py-0.5 bg-surface-card hover:bg-surface-elevated border border-line/40 rounded text-ink-mid hover:text-ink transition-colors"
>
Edit
</button>
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onDelete();
}}
aria-label="Delete memory"
className="text-[10px] px-2 py-0.5 bg-red-950/40 hover:bg-red-900/50 border border-red-900/30 rounded text-bad transition-colors"
>
Delete
</button>
</div>
<button
type="button"
onClick={(e) => {
e.stopPropagation();
onDelete();
}}
aria-label="Forget memory"
className="text-[10px] px-2 py-0.5 bg-red-950/40 hover:bg-red-900/50 border border-red-900/30 rounded text-bad transition-colors shrink-0"
>
Forget
</button>
</div>
</div>
)}
@@ -1,202 +0,0 @@
// @vitest-environment jsdom
/**
* MemoryEditorDialog tests — covers Add (POST /memories) and Edit
* (PATCH /memories/:id) flows. Pins:
* - Add posts {content, scope, namespace} with the trimmed defaults
* - Edit only sends fields that changed (no-op edit short-circuits, no PATCH fires)
* - Empty content blocks save
* - Save error surfaces in the dialog and keeps the modal open
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup } from "@testing-library/react";
vi.mock("@/lib/api", () => ({
api: {
get: vi.fn(),
post: vi.fn(),
patch: vi.fn(),
del: vi.fn(),
},
}));
import { api } from "@/lib/api";
import { MemoryEditorDialog } from "../MemoryEditorDialog";
import type { MemoryEntry } from "../MemoryInspectorPanel";
const mockPost = vi.mocked(api.post);
const mockPatch = vi.mocked(api.patch);
const SAMPLE: MemoryEntry = {
id: "mem-x",
workspace_id: "ws-1",
content: "original content",
scope: "TEAM",
namespace: "procedures",
created_at: "2026-04-17T12:00:00.000Z",
};
beforeEach(() => {
vi.clearAllMocks();
mockPost.mockResolvedValue({} as never);
mockPatch.mockResolvedValue({} as never);
});
afterEach(() => {
cleanup();
});
describe("Add mode", () => {
it("POSTs scope+namespace+trimmed-content and calls onSaved+onClose", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="add"
workspaceId="ws-1"
defaultScope="GLOBAL"
defaultNamespace="facts"
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: " new fact " } });
fireEvent.click(screen.getByRole("button", { name: /Add memory$/i }));
await waitFor(() => expect(mockPost).toHaveBeenCalledTimes(1));
expect(mockPost).toHaveBeenCalledWith("/workspaces/ws-1/memories", {
content: "new fact",
scope: "GLOBAL",
namespace: "facts",
});
expect(onSaved).toHaveBeenCalledTimes(1);
expect(onClose).toHaveBeenCalledTimes(1);
});
it("blocks save when content is empty (whitespace-only)", () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="add"
workspaceId="ws-1"
defaultScope="LOCAL"
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: " " } });
fireEvent.click(screen.getByRole("button", { name: /Add memory$/i }));
expect(mockPost).not.toHaveBeenCalled();
expect(screen.getByRole("alert").textContent).toMatch(/empty/i);
expect(onSaved).not.toHaveBeenCalled();
expect(onClose).not.toHaveBeenCalled();
});
});
describe("Edit mode", () => {
it("PATCHes only changed fields", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
const textarea = screen.getByLabelText(/Content/i) as HTMLTextAreaElement;
fireEvent.change(textarea, { target: { value: "rewritten content" } });
// namespace untouched
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(mockPatch).toHaveBeenCalledTimes(1));
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/memories/mem-x",
{ content: "rewritten content" },
);
expect(onSaved).toHaveBeenCalledTimes(1);
expect(onClose).toHaveBeenCalledTimes(1);
});
it("no-op edit short-circuits (no PATCH fires) and still closes", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(onClose).toHaveBeenCalled());
expect(mockPatch).not.toHaveBeenCalled();
expect(onSaved).toHaveBeenCalledTimes(1);
});
it("sends namespace too when both content and namespace changed", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.change(screen.getByLabelText(/Content/i), {
target: { value: "newer content" },
});
fireEvent.change(screen.getByLabelText(/Namespace/i), {
target: { value: "blockers" },
});
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() => expect(mockPatch).toHaveBeenCalledTimes(1));
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/memories/mem-x",
{ content: "newer content", namespace: "blockers" },
);
});
it("surfaces save error and keeps the modal open", async () => {
const onClose = vi.fn();
const onSaved = vi.fn();
mockPatch.mockRejectedValueOnce(new Error("boom"));
render(
<MemoryEditorDialog
open
mode="edit"
workspaceId="ws-1"
entry={SAMPLE}
onClose={onClose}
onSaved={onSaved}
/>,
);
fireEvent.change(screen.getByLabelText(/Content/i), {
target: { value: "rewritten content" },
});
fireEvent.click(screen.getByRole("button", { name: /Save changes/i }));
await waitFor(() =>
expect(screen.getByRole("alert").textContent).toMatch(/boom/),
);
expect(onClose).not.toHaveBeenCalled();
expect(onSaved).not.toHaveBeenCalled();
});
});
@@ -1,16 +1,29 @@
// @vitest-environment jsdom
/**
* MemoryInspectorPanel tests — issue #909
* MemoryInspectorPanel — v2 redesign tests.
*
* Covers: loading, empty state, scope tabs, namespace filter,
* entry list, expand, delete flow, optimistic updates, Refresh, semantic search.
* Coverage targets every behavior the panel surfaces:
* - Initial load wires GET /v2/namespaces + GET /v2/memories
* - Plugin-unavailable banner (503) renders + disables interactions
* - Generic error renders in the error banner
* - Namespace dropdown populates from /v2/namespaces.readable; "All
* namespaces" is the default
* - Selecting a namespace re-fetches with ?namespace=...
* - Search input debounces + scopes the request to ?q=
* - Search results sort by score descending
* - Empty-state copy differs by query / plugin-state / no-data
* - Per-row badges render (kind / source / pin / TTL / score)
* and TTL countdown handles past/future/null
* - Delete (Forget) flow: optimistic removal, confirmation dialog,
* server failure rolls back via reload
* - formatTTL helper covers s/m/h/d/expired/null/invalid branches
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor, cleanup, act } from "@testing-library/react";
import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest';
import { render, screen, fireEvent, waitFor, cleanup } from '@testing-library/react';
// ── Mocks ─────────────────────────────────────────────────────────────────────
vi.mock("@/lib/api", () => ({
vi.mock('@/lib/api', () => ({
api: {
get: vi.fn(),
post: vi.fn(),
@@ -18,7 +31,7 @@ vi.mock("@/lib/api", () => ({
},
}));
vi.mock("@/components/ConfirmDialog", () => ({
vi.mock('@/components/ConfirmDialog', () => ({
ConfirmDialog: ({
open,
title,
@@ -33,435 +46,473 @@ vi.mock("@/components/ConfirmDialog", () => ({
confirmVariant?: string;
onConfirm: () => void;
onCancel: () => void;
singleButton?: boolean;
}) =>
open ? (
<div data-testid="confirm-dialog">
<p data-testid="dialog-title">{title}</p>
<p data-testid="dialog-message">{message}</p>
<button onClick={onConfirm}>Confirm Delete</button>
<button onClick={onCancel}>Cancel Delete</button>
<button onClick={onConfirm}>Confirm</button>
<button onClick={onCancel}>Cancel</button>
</div>
) : null,
}));
import { api } from "@/lib/api";
import { MemoryInspectorPanel } from "../MemoryInspectorPanel";
// ── Typed mock helpers ────────────────────────────────────────────────────────
import { api } from '@/lib/api';
import {
MemoryInspectorPanel,
formatTTL,
isPluginUnavailableError,
type MemoryV2,
type NamespacesResponse,
} from '../MemoryInspectorPanel';
const mockGet = vi.mocked(api.get);
const mockDel = vi.mocked(api.del);
// ── Sample fixtures ───────────────────────────────────────────────────────────
// ── Fixtures ──────────────────────────────────────────────────────────────────
const NOW = "2026-04-17T12:00:00.000Z";
const MEMORY_A: import("../MemoryInspectorPanel").MemoryEntry = {
id: "mem-a",
workspace_id: "ws-1",
content: "Remember to review PRs before merging",
scope: "LOCAL",
namespace: "general",
created_at: NOW,
const NS_RESPONSE: NamespacesResponse = {
readable: [
{ name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' },
{ name: 'team:t-1', kind: 'team', label: 'Team (t-1)' },
],
writable: [{ name: 'workspace:ws-1', kind: 'workspace', label: 'Workspace (ws-1)' }],
};
const MEMORY_B: import("../MemoryInspectorPanel").MemoryEntry = {
id: "mem-b",
workspace_id: "ws-1",
content: "Team knowledge: deploy happens on Fridays",
scope: "TEAM",
namespace: "procedures",
created_at: NOW,
const MEM_BASIC: MemoryV2 = {
id: 'mem-a',
namespace: 'workspace:ws-1',
content: 'Remember the standup is at 10am',
kind: 'fact',
source: 'agent',
pin: false,
created_at: '2026-04-17T12:00:00.000Z',
};
const TWO_MEMORIES = [MEMORY_A, MEMORY_B];
const MEM_PINNED: MemoryV2 = {
id: 'mem-pinned',
namespace: 'team:t-1',
content: 'Team retro every Friday',
kind: 'summary',
source: 'user',
pin: true,
expires_at: new Date(Date.now() + 86_400_000).toISOString(),
created_at: '2026-04-17T12:00:00.000Z',
};
const MEM_RUNTIME_CHECKPOINT: MemoryV2 = {
id: 'mem-checkpoint',
namespace: 'team:t-1',
content: 'Runtime checkpoint',
kind: 'checkpoint',
source: 'runtime',
pin: false,
created_at: '2026-04-17T12:00:00.000Z',
};
const MEM_EXPIRED: MemoryV2 = {
id: 'mem-expired',
namespace: 'workspace:ws-1',
content: 'Stale memory',
kind: 'fact',
source: 'agent',
pin: false,
expires_at: new Date(Date.now() - 1000).toISOString(),
created_at: '2026-04-17T12:00:00.000Z',
};
// ── Setup / teardown ──────────────────────────────────────────────────────────
beforeEach(() => {
vi.clearAllMocks();
mockGet.mockReset();
mockDel.mockReset();
});
afterEach(() => {
cleanup();
});
// ── Helper: flush microtasks + React state updates ─────────────────────────────
async function flushUpdates(): Promise<void> {
await act(async () => {});
// Helper: stub a basic two-call flow (namespaces + memories).
function stubFetch(memories: MemoryV2[], namespaces: NamespacesResponse = NS_RESPONSE) {
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) {
return Promise.resolve(namespaces);
}
return Promise.resolve({ memories });
}) as typeof api.get);
}
// ── Loading & empty state ─────────────────────────────────────────────────────
// ── isPluginUnavailableError helper ─────────────────────────────────────────
describe("MemoryInspectorPanel — loading and empty state", () => {
it("shows loading indicator before data arrives", () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockReturnValue(new Promise(() => {}) as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
expect(screen.getByText(/loading memories/i)).toBeTruthy();
});
it("renders empty state when API returns []", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("No LOCAL memories")).toBeTruthy();
});
it("fetches from the correct workspace memories endpoint with scope=LOCAL", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-abc-123" />);
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-abc-123/memories?scope=LOCAL"
);
});
it("shows error banner when fetch throws", async () => {
mockGet.mockRejectedValue(new Error("Network error"));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("Network error")).toBeTruthy();
});
});
// ── Scope tabs ────────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — scope tabs", () => {
it("renders LOCAL, TEAM, GLOBAL tabs", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByRole("button", { name: "LOCAL" })).toBeTruthy();
expect(screen.getByRole("button", { name: "TEAM" })).toBeTruthy();
expect(screen.getByRole("button", { name: "GLOBAL" })).toBeTruthy();
});
it("LOCAL is active by default", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByRole("button", { name: "LOCAL" }).getAttribute("aria-pressed")).toBe("true");
});
it("clicking TEAM tab re-fetches with scope=TEAM", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: "TEAM" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=TEAM"
);
});
it("clicking GLOBAL tab re-fetches with scope=GLOBAL", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: "GLOBAL" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=GLOBAL"
);
});
it("shows scope-specific empty state when switching tabs", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(screen.getByRole("button", { name: "TEAM" }));
await flushUpdates();
expect(screen.getByText("No TEAM memories")).toBeTruthy();
});
});
// ── Namespace filter ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — namespace filter", () => {
it("renders namespace filter input", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByLabelText("Filter by namespace")).toBeTruthy();
});
it("includes namespace param in API call when set", async () => {
vi.useFakeTimers();
try {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.change(screen.getByLabelText("Filter by namespace"), {
target: { value: "facts" },
});
// Advance past the 300ms debounce
act(() => { vi.advanceTimersByTime(350); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&namespace=facts"
);
} finally {
vi.useRealTimers();
}
});
});
// ── Entry list ───────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — entry list", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
});
it("renders a row for every memory", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText(/Remember to review PRs before merging/)).toBeTruthy();
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
it("displays memory count in toolbar", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("2 memories")).toBeTruthy();
});
it("displays scope badge for each entry", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByTitle("Scope: LOCAL")).toBeTruthy();
expect(screen.getByTitle("Scope: TEAM")).toBeTruthy();
});
it("entries are collapsed by default (pre region not visible)", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
// Expanded region (pre tag) should not exist in DOM yet
expect(screen.queryByRole("region")).toBeNull();
});
});
// ── Expand / collapse ─────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — expand/collapse", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
});
it("clicking a row header expands it and shows the full content in a pre tag", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(
screen.getByText(/Remember to review PRs before merging/).closest("button")!
);
await flushUpdates();
// After expand, a region with the full content <pre> should appear
expect(screen.getByRole("region")).toBeTruthy();
});
it("clicking the header again collapses the row (pre region removed)", async () => {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const headerBtn = screen
.getByText(/Remember to review PRs before merging/)
.closest("button")!;
fireEvent.click(headerBtn); // expand
await flushUpdates();
expect(screen.getByRole("region")).toBeTruthy();
fireEvent.click(headerBtn); // collapse
await flushUpdates();
// After collapse, the region (pre) is removed from the DOM
expect(screen.queryByRole("region")).toBeNull();
});
});
// ── Delete flow ───────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — delete flow", () => {
beforeEach(() => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue(TWO_MEMORIES as any);
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockDel.mockResolvedValue({ status: "deleted" } as any);
});
/** Helper: expand memory-A and click its Delete button */
async function openDeleteForMemoryA() {
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
fireEvent.click(
screen.getByText(/Remember to review PRs before merging/).closest("button")!
);
await flushUpdates();
fireEvent.click(screen.getByRole("button", { name: "Delete memory" }));
await flushUpdates();
}
it("opens ConfirmDialog when Delete is clicked", async () => {
await openDeleteForMemoryA();
expect(screen.getByTestId("confirm-dialog")).toBeTruthy();
expect(screen.getByTestId("dialog-title").textContent).toBe("Delete memory");
});
it("calls api.del with the correct URL-encoded path on confirm", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Confirm Delete"));
await flushUpdates();
expect(mockDel).toHaveBeenCalledWith("/workspaces/ws-1/memories/mem-a");
});
it("removes the entry optimistically after confirm", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Confirm Delete"));
await flushUpdates();
expect(screen.queryByText(/Remember to review PRs before merging/)).toBeNull();
// Sibling entry unaffected
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
it("closes ConfirmDialog without deleting when Cancel is clicked", async () => {
await openDeleteForMemoryA();
fireEvent.click(screen.getByText("Cancel Delete"));
await flushUpdates();
expect(screen.queryByTestId("confirm-dialog")).toBeNull();
expect(mockDel).not.toHaveBeenCalled();
// Sibling memory entry (MEMORY_B) is still in the list
expect(screen.getByText(/Team knowledge: deploy happens on Fridays/)).toBeTruthy();
});
});
// ── Refresh ───────────────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — Refresh button", () => {
it("re-fetches entries when Refresh is clicked", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
expect(screen.getByText("No LOCAL memories")).toBeTruthy();
expect(mockGet).toHaveBeenCalledTimes(1);
fireEvent.click(screen.getByRole("button", { name: "Refresh memories" }));
await flushUpdates();
expect(mockGet).toHaveBeenCalledTimes(2);
});
});
// ── role=alert a11y ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — error elements have role=alert", () => {
it("fetch error banner has role='alert'", async () => {
mockGet.mockRejectedValue(new Error("Network error"));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const alert = screen.getByRole("alert");
expect(alert).toBeTruthy();
expect(alert.textContent).toContain("Network error");
});
});
// ── Semantic search ──────────────────────────────────────────────────────────
describe("MemoryInspectorPanel — semantic search", () => {
afterEach(() => {
vi.useRealTimers();
});
it("debounces search input by 300ms before calling API", async () => {
vi.useFakeTimers();
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
mockGet.mockClear();
fireEvent.change(screen.getByLabelText("Search memories"), {
target: { value: "deploy" },
});
// 200ms — debounce has NOT fired yet
act(() => { vi.advanceTimersByTime(200); });
await flushUpdates();
expect(mockGet).not.toHaveBeenCalled();
// 350ms total — debounce fires
act(() => { vi.advanceTimersByTime(150); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&q=deploy"
);
});
it("renders similarity-badge when entry has similarity_score", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([{ ...MEMORY_A, similarity_score: 0.87 }] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
const badge = document.querySelector('[data-testid="similarity-badge"]');
expect(badge).toBeTruthy();
expect(badge?.textContent).toBe("87%");
});
it("does not render similarity-badge when entry has no similarity_score", async () => {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([MEMORY_A] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
describe('isPluginUnavailableError', () => {
it('matches the literal env var contract from the server handler', () => {
expect(
document.querySelector('[data-testid="similarity-badge"]')
).toBeNull();
isPluginUnavailableError(
new Error('API GET /workspaces/x/v2/memories: 503 {"error":"memory plugin is not configured (set MEMORY_PLUGIN_URL)"}'),
),
).toBe(true);
});
it("clear button resets query immediately and re-fetches without ?q=", async () => {
vi.useFakeTimers();
// eslint-disable-next-line @typescript-eslint/no-explicit-any
mockGet.mockResolvedValue([] as any);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await flushUpdates();
it('does not false-match on generic 503 errors that don\'t mention the env var', () => {
expect(isPluginUnavailableError(new Error('API GET /foo: 503 something else'))).toBe(false);
});
fireEvent.change(screen.getByLabelText("Search memories"), {
target: { value: "deploy" },
it('does not false-match on plain 4xx errors', () => {
expect(isPluginUnavailableError(new Error('API GET /foo: 401 unauthorized'))).toBe(false);
});
it('returns false for non-Error inputs', () => {
expect(isPluginUnavailableError(null)).toBe(false);
expect(isPluginUnavailableError(undefined)).toBe(false);
expect(isPluginUnavailableError('a string')).toBe(false);
expect(isPluginUnavailableError({ message: 'MEMORY_PLUGIN_URL' })).toBe(false);
});
});
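The four cases above pin a fairly narrow contract: only a real `Error` whose message carries both a 503 status and the `MEMORY_PLUGIN_URL` hint counts as "plugin unavailable". A minimal sketch of a predicate that would satisfy those cases is shown here; the name `isPluginUnavailableSketch` and the exact matching rule are assumptions for illustration, not the helper exported from `MemoryInspectorPanel`.

```typescript
// Hypothetical stand-in for the exported helper; the matching rule is an
// assumption inferred from the test cases, not the shipped implementation.
function isPluginUnavailableSketch(err: unknown): boolean {
  // Plain objects, strings, null and undefined are rejected up front.
  if (!(err instanceof Error)) return false;
  const has503 = /\b503\b/.test(err.message);
  // The env var name is the literal hint the server handler puts in its body.
  const mentionsEnvVar = err.message.includes("MEMORY_PLUGIN_URL");
  return has503 && mentionsEnvVar;
}

const pluginDown = new Error(
  'API GET /workspaces/x/v2/memories: 503 {"error":"memory plugin is not configured (set MEMORY_PLUGIN_URL)"}',
);
console.log(isPluginUnavailableSketch(pluginDown));
console.log(isPluginUnavailableSketch(new Error("API GET /foo: 503 something else")));
```

Requiring both signals keeps an unrelated 503 (for example an upstream outage) in the generic error banner rather than the operator-hint banner.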
// ── formatTTL helper ─────────────────────────────────────────────────────────
describe('formatTTL', () => {
it('returns empty string for null/undefined/empty', () => {
expect(formatTTL(null)).toBe('');
expect(formatTTL(undefined)).toBe('');
expect(formatTTL('')).toBe('');
});
it('returns empty for invalid date strings', () => {
expect(formatTTL('not-a-date')).toBe('');
});
it('returns "expired" for past timestamps', () => {
const past = new Date(Date.now() - 5000).toISOString();
expect(formatTTL(past)).toBe('expired');
});
it('formats <60s as seconds', () => {
const future = new Date(Date.now() + 30_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}s$/);
});
it('formats <60m as minutes', () => {
const future = new Date(Date.now() + 30 * 60_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}m$/);
});
it('formats <24h as hours', () => {
const future = new Date(Date.now() + 5 * 3_600_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}h$/);
});
it('formats >24h as days', () => {
const future = new Date(Date.now() + 3 * 86_400_000).toISOString();
expect(formatTTL(future)).toMatch(/^\d{1,2}d$/);
});
});
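The branches exercised above (empty input, invalid date, expired, then seconds, minutes, hours, days) can be sketched as a single function. This is an illustrative version only: the name `formatTTLSketch` and the injectable `nowMs` parameter are assumptions added so the example is deterministic, and the real `formatTTL` may round differently.

```typescript
// Hypothetical sketch of a TTL formatter matching the tested branches.
// `nowMs` is an assumed extra parameter so callers can pin the clock.
function formatTTLSketch(
  expiresAt: string | null | undefined,
  nowMs: number = Date.now(),
): string {
  if (!expiresAt) return ""; // null, undefined, empty string
  const expiresMs = Date.parse(expiresAt);
  if (Number.isNaN(expiresMs)) return ""; // unparseable date string
  const remainingSec = Math.floor((expiresMs - nowMs) / 1000);
  if (remainingSec <= 0) return "expired";
  if (remainingSec < 60) return `${remainingSec}s`;
  const minutes = Math.floor(remainingSec / 60);
  if (minutes < 60) return `${minutes}m`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h`;
  return `${Math.floor(hours / 24)}d`;
}

const fixedNow = Date.parse("2026-04-17T12:00:00.000Z");
console.log(formatTTLSketch("2026-04-17T12:00:30.000Z", fixedNow));
console.log(formatTTLSketch("2026-04-20T12:00:00.000Z", fixedNow));
```

Flooring at each step means a value created as "now plus 24 hours" reads as `23h` a moment later, which is consistent with the badge test accepting either an `h` or a `d` suffix.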
// ── Initial load + dropdown ─────────────────────────────────────────────────
describe('MemoryInspectorPanel — initial load', () => {
it('fetches namespaces and memories on mount', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
const calls = mockGet.mock.calls.map((c) => c[0]);
expect(calls.some((u) => u.includes('/v2/namespaces'))).toBe(true);
expect(calls.some((u) => u.includes('/v2/memories'))).toBe(true);
});
});
it('renders the row contents from the memories response', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
expect(screen.getByText(/Remember the standup is at 10am/)).toBeTruthy();
});
});
it('populates the namespace dropdown with readable entries + "All namespaces"', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Filter by namespace'));
const select = screen.getByLabelText('Filter by namespace') as HTMLSelectElement;
const optionLabels = Array.from(select.options).map((o) => o.textContent ?? '');
expect(optionLabels[0]).toContain('All namespaces');
expect(optionLabels.join('|')).toContain('Workspace (ws-1)');
expect(optionLabels.join('|')).toContain('Team (t-1)');
});
it('selecting a namespace re-fetches with ?namespace=', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Filter by namespace'));
const select = screen.getByLabelText('Filter by namespace') as HTMLSelectElement;
fireEvent.change(select, { target: { value: 'team:t-1' } });
await waitFor(() => {
const calls = mockGet.mock.calls.map((c) => c[0] as string);
expect(calls.some((u) => u.includes('namespace=team%3At-1'))).toBe(true);
});
});
});
// ── Plugin unavailable (503) ────────────────────────────────────────────────
describe('MemoryInspectorPanel — plugin unavailable', () => {
it('renders the operator-hint banner and disables search input', async () => {
mockGet.mockRejectedValue(new Error('HTTP 503: memory plugin is not configured (set MEMORY_PLUGIN_URL)'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('plugin-unavailable-banner'));
const searchInput = screen.getByLabelText('Search memories') as HTMLInputElement;
expect(searchInput.disabled).toBe(true);
});
it('shows the empty-state explaining plugin disabled', async () => {
mockGet.mockRejectedValue(new Error('API GET /workspaces/x/v2/memories: 503 {"error":"memory plugin is not configured (set MEMORY_PLUGIN_URL)"}'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByText(/Memory plugin disabled/i));
});
});
// ── Generic error (non-503) ─────────────────────────────────────────────────
describe('MemoryInspectorPanel — generic errors', () => {
it('surfaces a non-503 error in the error banner', async () => {
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) {
return Promise.resolve(NS_RESPONSE);
}
return Promise.reject(new Error('upstream timeout'));
}) as typeof api.get);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
// Error banner has role=alert
const alerts = screen.getAllByRole('alert');
const found = alerts.some((a) => a.textContent?.includes('upstream timeout'));
expect(found).toBe(true);
});
});
});
// ── Search ──────────────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — search', () => {
it('eventually fires query with ?q= after debounce', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'standup' },
});
act(() => { vi.advanceTimersByTime(350); });
await flushUpdates();
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL&q=deploy"
await waitFor(
() => {
const calls = mockGet.mock.calls.map((c) => c[0] as string);
expect(calls.some((u) => u.includes('q=standup'))).toBe(true);
},
{ timeout: 1500 },
);
mockGet.mockClear();
});
fireEvent.click(screen.getByRole("button", { name: "Clear search" }));
await flushUpdates();
it('sorts results by score descending when query active', async () => {
const lowScore: MemoryV2 = { ...MEM_BASIC, id: 'low', score: 0.2, content: 'low' };
const highScore: MemoryV2 = { ...MEM_BASIC, id: 'high', score: 0.95, content: 'high' };
// Plugin returns in arbitrary order; component sorts.
mockGet.mockImplementation(((url: string) => {
if (url.includes('/v2/namespaces')) return Promise.resolve(NS_RESPONSE);
return Promise.resolve({ memories: [lowScore, highScore] });
}) as typeof api.get);
expect(mockGet).toHaveBeenCalledWith(
"/workspaces/ws-1/memories?scope=LOCAL"
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'something' },
});
await waitFor(
() => {
const rows = screen.getAllByTestId(/^memory-row-/);
// First row should be the high-score one
expect(rows[0].getAttribute('data-testid')).toBe('memory-row-high');
},
{ timeout: 1500 },
);
});
it('clear-button resets the query', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'foo' },
});
fireEvent.click(screen.getByLabelText('Clear search'));
expect((screen.getByLabelText('Search memories') as HTMLInputElement).value).toBe('');
});
it('renders no-results empty-state when search has no matches', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Search memories'));
fireEvent.change(screen.getByLabelText('Search memories'), {
target: { value: 'nothing' },
});
await waitFor(
() => {
expect(screen.getByText(/No memories match your search/i)).toBeTruthy();
},
{ timeout: 1500 },
);
});
});
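The score-ordering test relies on the component sorting whatever order the plugin returns. A minimal sketch of that step follows; `sortByScoreDesc` and the choice to rank rows without a score last are assumptions for illustration, not code taken from the panel.

```typescript
// Hypothetical row shape: only the fields the sort needs.
interface ScoredRow {
  id: string;
  score?: number;
}

// Returns a new array ordered by score, highest first. Rows without a
// score get -1 (an assumed sentinel below any real score) so they sink.
function sortByScoreDesc<T extends ScoredRow>(rows: readonly T[]): T[] {
  const key = (row: T): number => row.score ?? -1;
  return [...rows].sort((a, b) => key(b) - key(a));
}

const unsorted: ScoredRow[] = [
  { id: "low", score: 0.2 },
  { id: "none" },
  { id: "high", score: 0.95 },
];
console.log(sortByScoreDesc(unsorted).map((row) => row.id));
```

Copying before sorting avoids mutating the fetched array, which matters if the same array is held in React state.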
// ── Per-row badges ───────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — row badges', () => {
it('renders kind, source, pin, TTL badges per shape', async () => {
stubFetch([MEM_PINNED, MEM_RUNTIME_CHECKPOINT]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
// Pinned memory: kind=summary, source=user, pin=true, TTL>0
const pinnedRow = screen.getByTestId('memory-row-mem-pinned');
expect(pinnedRow.querySelector('[data-testid="kind-badge"]')?.textContent).toBe('S');
expect(pinnedRow.querySelector('[data-testid="source-badge"]')?.textContent).toBe('user');
expect(pinnedRow.querySelector('[data-testid="pin-badge"]')).toBeTruthy();
expect(pinnedRow.querySelector('[data-testid="ttl-badge"]')?.textContent).toMatch(/^⌛\d+[hd]$/);
// Checkpoint memory: kind=checkpoint, source=runtime, no pin, no TTL
const checkpointRow = screen.getByTestId('memory-row-mem-checkpoint');
expect(checkpointRow.querySelector('[data-testid="kind-badge"]')?.textContent).toBe('C');
expect(checkpointRow.querySelector('[data-testid="source-badge"]')?.textContent).toBe('runtime');
expect(checkpointRow.querySelector('[data-testid="pin-badge"]')).toBeNull();
expect(checkpointRow.querySelector('[data-testid="ttl-badge"]')).toBeNull();
});
});
it('TTL badge shows "expired" for past expires_at', async () => {
stubFetch([MEM_EXPIRED]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
const row = screen.getByTestId('memory-row-mem-expired');
expect(row.querySelector('[data-testid="ttl-badge"]')?.textContent).toBe('⌛expired');
});
});
it('expanding a row shows full content + Forget button', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
const row = screen.getByTestId('memory-row-mem-a');
const headerButton = row.querySelector('button');
expect(headerButton).toBeTruthy();
fireEvent.click(headerButton!);
await waitFor(() => {
expect(screen.getByLabelText('Forget memory')).toBeTruthy();
});
});
});
// ── Delete (Forget) flow ──────────────────────────────────────────────────────
describe('MemoryInspectorPanel — forget flow', () => {
it('opens the confirm dialog on Forget click and removes optimistically on confirm', async () => {
stubFetch([MEM_BASIC]);
mockDel.mockResolvedValue({ status: 'deleted' });
render(<MemoryInspectorPanel workspaceId="ws-1" />);
// Expand row, click Forget
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
const row = screen.getByTestId('memory-row-mem-a');
fireEvent.click(row.querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
// Dialog appears with v2-shaped copy (Forget, not Delete)
expect(screen.getByTestId('dialog-title').textContent).toBe('Forget memory');
fireEvent.click(screen.getByText('Confirm'));
// Optimistic removal happens immediately
await waitFor(() => {
expect(screen.queryByTestId('memory-row-mem-a')).toBeNull();
});
// DELETE called with the right path
await waitFor(() => {
const delPaths = mockDel.mock.calls.map((c) => c[0] as string);
expect(delPaths.some((p) => p.includes('/v2/memories/mem-a'))).toBe(true);
});
});
it('cancelling the dialog leaves the row in place', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
fireEvent.click(screen.getByTestId('memory-row-mem-a').querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByText('Cancel'));
expect(screen.queryByTestId('memory-row-mem-a')).toBeTruthy();
expect(mockDel).not.toHaveBeenCalled();
});
it('rolls back on server failure by reloading entries', async () => {
stubFetch([MEM_BASIC]);
mockDel.mockRejectedValue(new Error('upstream 502'));
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByTestId('memory-row-mem-a'));
fireEvent.click(screen.getByTestId('memory-row-mem-a').querySelector('button')!);
await waitFor(() => screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByLabelText('Forget memory'));
fireEvent.click(screen.getByText('Confirm'));
// After failure, error banner surfaces + reload re-fetches memories
await waitFor(() => {
const alerts = screen.getAllByRole('alert');
const found = alerts.some((a) => a.textContent?.includes('upstream 502'));
expect(found).toBe(true);
});
});
});
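The three forget-flow tests describe an optimistic delete: the row disappears as soon as the user confirms, and a server failure is repaired by reloading the list and surfacing the error. A rough sketch of that control flow, outside React, might look like the following. Every name here (`forgetWithRollback`, `setEntries`, `del`, `reload`) is hypothetical, and the panel's actual handler may be structured differently.

```typescript
interface EntryLike {
  id: string;
}

// Removes the entry immediately, then issues the delete. On failure the
// list is re-fetched (the rollback) and the error message is returned so
// the caller can show it in a banner. Resolves to null on success.
async function forgetWithRollback<T extends EntryLike>(
  entries: readonly T[],
  id: string,
  setEntries: (next: T[]) => void,
  del: (id: string) => Promise<unknown>,
  reload: () => Promise<T[]>,
): Promise<string | null> {
  // Runs synchronously, before the first await: the optimistic removal.
  setEntries(entries.filter((entry) => entry.id !== id));
  try {
    await del(id);
    return null;
  } catch (err) {
    setEntries(await reload());
    return err instanceof Error ? err.message : "Delete failed";
  }
}
```

Rolling back by reloading rather than re-inserting the removed row keeps the list consistent with the server even if other entries changed in the meantime, at the cost of one extra request.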
// ── Empty state when no memories at all ────────────────────────────────────
describe('MemoryInspectorPanel — empty state', () => {
it('renders the "no memories yet" empty state when not searching', async () => {
stubFetch([]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => {
expect(screen.getByText('No memories yet')).toBeTruthy();
});
});
});
// ── Refresh ─────────────────────────────────────────────────────────────────
describe('MemoryInspectorPanel — refresh', () => {
it('Refresh button refetches memories', async () => {
stubFetch([MEM_BASIC]);
render(<MemoryInspectorPanel workspaceId="ws-1" />);
await waitFor(() => screen.getByLabelText('Refresh memories'));
const before = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
fireEvent.click(screen.getByLabelText('Refresh memories'));
await waitFor(() => {
const after = mockGet.mock.calls.filter((c) =>
(c[0] as string).includes('/v2/memories'),
).length;
expect(after).toBe(before + 1);
});
});
});
+88 -3
@@ -7,7 +7,7 @@ import { api } from "@/lib/api";
import { useCanvasStore, type WorkspaceNodeData } from "@/store/canvas";
import { useSocketEvent } from "@/hooks/useSocketEvent";
import { type ChatMessage, type ChatAttachment, createMessage, appendMessageDeduped } from "./chat/types";
import { uploadChatFiles, downloadChatFile } from "./chat/uploads";
import { uploadChatFiles, downloadChatFile, isPlatformAttachment } from "./chat/uploads";
import { AttachmentChip, PendingAttachmentPill } from "./chat/AttachmentViews";
import { extractFilesFromTask } from "./chat/message-parser";
import { AgentCommsPanel } from "./chat/AgentCommsPanel";
@@ -1061,7 +1061,77 @@ function MyChatPanel({ workspaceId, data }: Props) {
: "dark:prose-invert dark:[--tw-prose-invert-body:theme(colors.zinc.100)] dark:[--tw-prose-invert-headings:theme(colors.white)] dark:[--tw-prose-invert-bold:theme(colors.white)] dark:[--tw-prose-invert-code:theme(colors.zinc.100)]"
}`}
>
<ReactMarkdown remarkPlugins={[remarkGfm]}>{msg.content}</ReactMarkdown>
<ReactMarkdown
remarkPlugins={[remarkGfm]}
components={{
// Default ReactMarkdown renders `<a href="...">`
// with no target and no scheme handling, so:
//
// 1. http/https links navigate the canvas tab
// itself away — user loses canvas state.
// 2. workspace://, file://, and bare /workspace/
// paths from agent-authored markdown produce
// an unhandled-protocol click → browser ends
// up at about:blank with no download (the
// reported bug from 2026-05-05).
//
// Override: external URLs open in a new tab with
// rel="noopener noreferrer"; in-container paths
// route through downloadChatFile so the browser
// gets a real Blob with proper auth headers.
a: ({ href, children, ...rest }) => {
const url = String(href ?? "");
// Use the single-source-of-truth (SSOT) helper
// isPlatformAttachment so the markdown link override
// and the chip download path agree on which schemes
// need an auth-routed download. Before the fix, this
// list was duplicated and missed `platform-pending:`,
// producing about:blank for poll-mode uploads.
if (isPlatformAttachment(url)) {
return (
<a
href={url}
{...rest}
onClick={(e) => {
e.preventDefault();
// Construct a synthetic ChatAttachment
// and route through the same
// authenticated download path the
// download chips use. Filename is the
// last path segment so Save-As prefills
// sensibly.
const name = url.split(/[\\/]/).pop() || "download";
downloadChatFile(workspaceId, {
uri: url,
name,
}).catch((err) => {
setError(
err instanceof Error
? `Download failed: ${err.message}`
: "Download failed",
);
});
}}
>
{children}
</a>
);
}
// External (http(s) / mailto / unknown scheme):
// open in new tab so canvas state survives.
return (
<a
href={url}
target="_blank"
rel="noopener noreferrer"
{...rest}
>
{children}
</a>
);
},
}}
>{msg.content}</ReactMarkdown>
</div>
)}
{msg.attachments && msg.attachments.length > 0 && (
@@ -1167,7 +1237,22 @@ function MyChatPanel({ workspaceId, data }: Props) {
value={input}
onChange={(e) => setInput(e.target.value)}
onKeyDown={(e) => {
if (e.key === "Enter" && !e.shiftKey) {
// IME-safe send: while a CJK (Chinese / Japanese / Korean) IME
// is composing, Enter accepts the candidate selection: not a
// newline, not a send. `e.nativeEvent.isComposing` is the
// standard signal (modern WebKit/Blink/Gecko); the keyCode
// 229 fallback covers older Safari / WebKit-based mobile
// browsers that delay setting isComposing on the
// composition-end Enter. Reported 2026-05-05: typing
// Chinese with the system IME, pressing Enter to commit
// a candidate would inadvertently send the half-typed
// message.
if (
e.key === "Enter" &&
!e.shiftKey &&
!e.nativeEvent.isComposing &&
e.keyCode !== 229
) {
e.preventDefault();
sendMessage();
}
@@ -0,0 +1,141 @@
// @vitest-environment jsdom
//
// Pins two regressions reported on production 2026-05-05:
//
// 1. IME composition + Enter key: typing Chinese (or any CJK / IME-
// composed text) and pressing Enter to commit the candidate
// selection used to send the half-typed message. The fix checks
// `event.nativeEvent.isComposing` (and a `keyCode === 229`
// fallback for older WebKit) before treating Enter as send.
//
// 2. Markdown link clicks: the agent's ReactMarkdown-rendered links
// used to:
// - http/https → navigate canvas tab away (user lost canvas state)
// - workspace://path / file:///workspace/... / /workspace/... →
// browser hit about:blank (unhandled protocol).
// Fix: external links get target="_blank" + noopener; in-container
// paths route through downloadChatFile (same auth path as chips).
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, fireEvent, waitFor } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
// Mock the api module so render doesn't try to talk to a real CP.
const apiGet = vi.fn((_path: string): Promise<unknown> => Promise.resolve([]));
const apiPost = vi.fn((_path: string, _body: unknown): Promise<unknown> => Promise.resolve({}));
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
post: (path: string, body: unknown) => apiPost(path, body),
del: vi.fn(),
patch: vi.fn(),
put: vi.fn(),
},
}));
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector?: (s: unknown) => unknown) =>
selector ? selector({ agentMessages: {}, consumeAgentMessages: () => [] }) : {},
),
}));
// Capture the downloadChatFile call so the markdown-link test can
// assert in-container paths route through the authenticated download
// path rather than the browser's bare anchor click.
const downloadChatFileMock = vi.fn((_workspaceId: string, _att: { uri: string; name: string }) => Promise.resolve());
vi.mock("../chat/uploads", async () => {
const actual = await vi.importActual<typeof import("../chat/uploads")>("../chat/uploads");
return {
...actual,
downloadChatFile: (workspaceId: string, att: { uri: string; name: string }) =>
downloadChatFileMock(workspaceId, att),
};
});
beforeEach(() => {
apiGet.mockClear();
apiPost.mockClear();
downloadChatFileMock.mockClear();
// jsdom doesn't implement scrollIntoView; ChatTab calls it after
// every render with a new message.
Element.prototype.scrollIntoView = vi.fn();
// Stub IntersectionObserver — the lazy-history sentinel uses it.
class FakeIO {
observe() {}
unobserve() {}
disconnect() {}
}
(window as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
(globalThis as unknown as { IntersectionObserver: unknown }).IntersectionObserver = FakeIO;
});
import { ChatTab } from "../ChatTab";
const minimalData = {
status: "online" as const,
runtime: "claude-code",
currentTask: null,
} as unknown as Parameters<typeof ChatTab>[0]["data"];
describe("ChatTab — IME-safe Enter key", () => {
it("does NOT send the message when Enter fires during IME composition (isComposing)", async () => {
render(<ChatTab workspaceId="ws-ime" data={minimalData} />);
// Find the textarea by its aria-label.
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "你好" } });
// Simulate the Enter that commits an IME selection: isComposing=true.
fireEvent.keyDown(textarea, { key: "Enter", isComposing: true });
// sendMessage POSTs via api.post; assert it was NOT called.
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
// And the input is preserved — ChatTab clears it only on actual send.
expect((textarea as HTMLTextAreaElement).value).toBe("你好");
});
it("does NOT send when keyCode is 229 (older Safari IME fallback)", async () => {
render(<ChatTab workspaceId="ws-ime2" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "한국어" } });
// keyCode 229 is the older-Safari signal that an IME is composing.
// Some mobile WebKit-based browsers delay setting isComposing on
// the composition-end Enter; the keyCode fallback covers that.
fireEvent.keyDown(textarea, { key: "Enter", keyCode: 229 });
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
});
it("DOES send on a non-composing Enter (the happy path stays intact)", async () => {
render(<ChatTab workspaceId="ws-ok" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "hello world" } });
fireEvent.keyDown(textarea, { key: "Enter" /* no isComposing, no 229 */ });
// The api.post for /a2a fires inside sendMessage. waitFor since
// the call goes through several effects.
await waitFor(() => {
expect(apiPost).toHaveBeenCalled();
});
});
it("Shift+Enter inserts newline regardless (no send)", async () => {
render(<ChatTab workspaceId="ws-shift" data={minimalData} />);
const textarea = await screen.findByLabelText(/Message to agent/i);
fireEvent.change(textarea, { target: { value: "line 1" } });
fireEvent.keyDown(textarea, { key: "Enter", shiftKey: true });
await waitFor(() => {
expect(apiPost).not.toHaveBeenCalled();
});
});
});
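Reviewer note: the Enter-key condition that these tests pin can be read as a single pure predicate. The sketch below is illustrative only; `EnterKeyInfo` and `shouldSendOnEnter` are hypothetical names that do not appear in the diff, and the real handler reads `isComposing` from `e.nativeEvent` rather than from a flat object.

```typescript
// Hypothetical flat view of the event fields the handler inspects.
interface EnterKeyInfo {
  key: string;
  shiftKey: boolean;
  isComposing: boolean; // e.nativeEvent.isComposing in the real handler
  keyCode: number; // legacy field; 229 means an IME is processing the key
}

// True only for a plain Enter pressed outside IME composition.
function shouldSendOnEnter(e: EnterKeyInfo): boolean {
  return (
    e.key === "Enter" &&
    !e.shiftKey &&
    !e.isComposing &&
    e.keyCode !== 229
  );
}
```

Keeping the check as one predicate means the four test cases above map one-to-one onto its four clauses.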
@@ -1,220 +0,0 @@
// @vitest-environment jsdom
//
// Pins the Edit affordance added to MemoryTab. Until this PR the Memory tab
// was Add+Delete only; an entry that needed correction had to be deleted and
// re-added — losing the version-counter and any in-flight optimistic-locking
// invariants other writers depend on.
//
// Each test pins one branch of the new flow. If any fails, the bug is back.
import { describe, it, expect, vi, afterEach, beforeEach } from "vitest";
import { render, screen, cleanup, waitFor, fireEvent } from "@testing-library/react";
import React from "react";
afterEach(cleanup);
const apiGet = vi.fn();
const apiPost = vi.fn();
const apiDel = vi.fn();
vi.mock("@/lib/api", () => ({
api: {
get: (path: string) => apiGet(path),
post: (path: string, body: unknown) => apiPost(path, body),
del: (path: string) => apiDel(path),
patch: vi.fn(),
put: vi.fn(),
},
}));
import { MemoryTab } from "../MemoryTab";
const sampleEntries = [
{
key: "team_brief",
value: { goal: "ship v2" },
version: 3,
expires_at: null,
updated_at: "2026-05-04T10:00:00Z",
},
{
key: "plain_note",
value: "raw text note",
version: 1,
expires_at: "2099-01-01T00:00:00Z",
updated_at: "2026-05-04T10:01:00Z",
},
];
beforeEach(() => {
apiGet.mockReset();
apiPost.mockReset();
apiDel.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/memory") {
return Promise.resolve(sampleEntries);
}
return Promise.reject(new Error(`unmocked api.get: ${path}`));
});
});
async function renderAndExpand(key: string) {
render(<MemoryTab workspaceId="ws-test" />);
await waitFor(() => expect(apiGet).toHaveBeenCalled());
// Reveal the Advanced section that hosts the entry list.
const showAdvanced = await screen.findByRole("button", { name: "Show" });
fireEvent.click(showAdvanced);
// Expand the row.
const row = await screen.findByRole("button", { name: new RegExp(key) });
fireEvent.click(row);
}
describe("MemoryTab Edit affordance", () => {
it("Edit button appears once a row is expanded", async () => {
await renderAndExpand("team_brief");
expect(screen.getAllByRole("button", { name: "Edit" }).length).toBeGreaterThan(0);
});
it("clicking Edit on a JSON-valued entry pre-fills the textarea with pretty JSON", async () => {
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = (await screen.findByLabelText(
"Edit value for team_brief",
)) as HTMLTextAreaElement;
expect(textarea.value).toBe('{\n "goal": "ship v2"\n}');
});
it("clicking Edit on a string-valued entry pre-fills raw (no surrounding quotes)", async () => {
await renderAndExpand("plain_note");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = (await screen.findByLabelText(
"Edit value for plain_note",
)) as HTMLTextAreaElement;
expect(textarea.value).toBe("raw text note");
});
it("Save POSTs with if_match_version + parsed value, then reloads", async () => {
apiPost.mockResolvedValue({ status: "ok", key: "team_brief", version: 4 });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"ship v3"}' } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost).toHaveBeenCalledWith("/workspaces/ws-test/memory", {
key: "team_brief",
value: { goal: "ship v3" },
if_match_version: 3,
});
// Reload after save → second GET.
await waitFor(() => expect(apiGet).toHaveBeenCalledTimes(2));
});
it("Save with non-JSON text falls back to plain string", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: "free-form note" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1].value).toBe("free-form note");
});
it("TTL field is forwarded as ttl_seconds when set", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const ttlInput = await screen.findByLabelText("Edit TTL for team_brief");
fireEvent.change(ttlInput, { target: { value: "3600" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1].ttl_seconds).toBe(3600);
});
it("blank/zero/non-numeric TTL is omitted from the payload", async () => {
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const ttlInput = await screen.findByLabelText("Edit TTL for team_brief");
// Junk + zero both must drop out — payload must not contain ttl_seconds.
fireEvent.change(ttlInput, { target: { value: "abc" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
expect(apiPost.mock.calls[0][1]).not.toHaveProperty("ttl_seconds");
});
it("Cancel discards edits and restores the rendered value", async () => {
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"discarded"}' } });
fireEvent.click(screen.getByRole("button", { name: "Cancel" }));
expect(apiPost).not.toHaveBeenCalled();
// Editor is gone; the JSON pre-block is back.
expect(screen.queryByLabelText("Edit value for team_brief")).toBeNull();
expect(screen.getAllByText(/"goal": "ship v2"/i).length).toBeGreaterThan(0);
});
it("409 response surfaces a retry hint and reloads", async () => {
apiPost.mockRejectedValueOnce(
new Error("HTTP 409: if_match_version mismatch"),
);
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for team_brief");
fireEvent.change(textarea, { target: { value: '{"goal":"ship v3"}' } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
const alert = await screen.findByRole("alert");
expect(alert.textContent).toMatch(/changed since you opened it/i);
// Initial mount load + post-conflict reload.
await waitFor(() => expect(apiGet).toHaveBeenCalledTimes(2));
});
it("non-409 error surfaces the message and does not reload", async () => {
apiPost.mockRejectedValueOnce(new Error("boom"));
await renderAndExpand("team_brief");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
fireEvent.click(screen.getByRole("button", { name: "Save" }));
const alert = await screen.findByRole("alert");
expect(alert.textContent).toBe("boom");
// Only the initial mount load — no retry reload.
expect(apiGet).toHaveBeenCalledTimes(1);
});
it("entry with no version omits if_match_version (back-compat with older shape)", async () => {
// Pre-version-counter shape: drop the `version` field from the row.
apiGet.mockReset();
apiGet.mockImplementation((path: string) => {
if (path === "/workspaces/ws-test/memory") {
return Promise.resolve([
{
key: "old_entry",
value: "legacy",
expires_at: null,
updated_at: "2026-05-04T10:00:00Z",
},
]);
}
return Promise.reject(new Error(`unmocked: ${path}`));
});
apiPost.mockResolvedValue({ status: "ok" });
await renderAndExpand("old_entry");
fireEvent.click(screen.getAllByRole("button", { name: "Edit" })[0]);
const textarea = await screen.findByLabelText("Edit value for old_entry");
fireEvent.change(textarea, { target: { value: "updated" } });
fireEvent.click(screen.getByRole("button", { name: "Save" }));
await waitFor(() => expect(apiPost).toHaveBeenCalledTimes(1));
const payload = apiPost.mock.calls[0][1];
expect(payload).not.toHaveProperty("if_match_version");
expect(payload.value).toBe("updated");
});
});
@@ -1,6 +1,6 @@
"use client";
import { useState, useEffect, useMemo, useRef } from "react";
import { useState, useEffect, useLayoutEffect, useMemo, useRef, useCallback } from "react";
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { api } from "@/lib/api";
@@ -184,13 +184,23 @@ function unwrapErrorText(raw: string | null): string {
export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
const [messages, setMessages] = useState<CommMessage[]>([]);
const [loading, setLoading] = useState(true);
const [loadError, setLoadError] = useState<string | null>(null);
// Dedup by timestamp+type+peer to handle API load + WebSocket race
const seenKeys = useRef(new Set<string>());
const bottomRef = useRef<HTMLDivElement>(null);
// Mirrors the my-chat scroll behaviour from ChatTab (PR #2903) —
// smooth-scroll on a long history gets interrupted by concurrent
// renders and lands the panel mid-conversation. Switch the first
// arrival to instant; subsequent appends animate.
const hasInitialScrollRef = useRef(false);
// Load history
useEffect(() => {
// Load history. Extracted so the error-state retry button can
// re-invoke without remount. ChatTab uses the same shape
// (loadInitial → loadError state → retry button).
const loadInitial = useCallback(() => {
setLoading(true);
setLoadError(null);
seenKeys.current.clear();
api.get<ActivityEntry[]>(`/workspaces/${workspaceId}/activity?source=agent&limit=50`)
.then((entries) => {
const filtered = (entries ?? [])
@@ -234,10 +244,15 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
// the .then body) — the panel just sat on the empty state
// with zero signal.
console.warn("AgentCommsPanel: load activity failed", err);
setLoadError(err instanceof Error ? err.message : String(err));
setLoading(false);
});
}, [workspaceId]);
useEffect(() => {
loadInitial();
}, [loadInitial]);
// Live updates routed through the global ReconnectingSocket. The
// previous pattern of `new WebSocket(WS_URL)` per panel had no
// onclose / no reconnect, so any drop (idle timeout, browser
@@ -358,7 +373,18 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
} catch { /* ignore */ }
});
useEffect(() => {
// useLayoutEffect (not useEffect) so the scroll runs BEFORE paint —
// otherwise the user sees the panel jump for one frame on every
// append. Mirrors ChatTab's MyChatPanel scroll block.
useLayoutEffect(() => {
if (!hasInitialScrollRef.current && messages.length > 0) {
// Instant on first arrival — smooth-scroll on a long history
// gets interrupted by concurrent renders and lands the panel
// mid-conversation (the chat-opens-in-middle bug class).
hasInitialScrollRef.current = true;
bottomRef.current?.scrollIntoView({ behavior: "instant" as ScrollBehavior });
return;
}
bottomRef.current?.scrollIntoView({ behavior: "smooth" });
}, [messages]);
@@ -366,6 +392,27 @@ export function AgentCommsPanel({ workspaceId }: { workspaceId: string }) {
return <div className="text-xs text-ink-soft text-center py-8">Loading agent communications...</div>;
}
if (loadError !== null && messages.length === 0) {
// Mirrors ChatTab my-chat error UI — surfaces the load failure
// with a retry button instead of silently rendering empty state.
return (
<div
role="alert"
className="mx-2 mt-2 rounded-lg border border-red-800/50 bg-red-950/30 px-3 py-2.5"
>
<p className="text-[11px] text-bad mb-1.5">
Failed to load agent communications: {loadError}
</p>
<button
onClick={loadInitial}
className="text-[10px] px-2 py-0.5 rounded bg-red-800/40 text-bad hover:bg-red-700/50 transition-colors"
>
Retry
</button>
</div>
);
}
if (messages.length === 0) {
return (
<div className="text-xs text-ink-soft text-center py-8">
@@ -0,0 +1,115 @@
// @vitest-environment jsdom
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, waitFor } from "@testing-library/react";
// API mock — tests can override per case via apiGetMock.mockImplementationOnce.
const apiGetMock = vi.fn<(url: string) => Promise<unknown>>();
vi.mock("@/lib/api", () => ({
api: {
get: (url: string) => apiGetMock(url),
},
}));
// useSocketEvent — no-op for these render tests; live updates aren't
// what we're verifying here.
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: () => {},
}));
// Canvas store — peer name resolution.
vi.mock("@/store/canvas", () => ({
useCanvasStore: {
getState: () => ({
nodes: [
{ id: "ws-self", data: { name: "Self" } },
{ id: "ws-peer", data: { name: "Peer Agent" } },
],
}),
},
}));
// Toaster shim — AgentCommsPanel imports showToast.
vi.mock("../../Toaster", () => ({
showToast: vi.fn(),
}));
import { AgentCommsPanel } from "../AgentCommsPanel";
// jsdom doesn't implement scrollIntoView. Tests that observe the call
// install a spy here; tests that don't care still need a no-op stub
// so the component doesn't throw.
const scrollSpy = vi.fn<(opts?: ScrollIntoViewOptions | boolean) => void>();
beforeEach(() => {
apiGetMock.mockReset();
scrollSpy.mockReset();
Element.prototype.scrollIntoView = scrollSpy as unknown as Element["scrollIntoView"];
});
afterEach(() => {
vi.clearAllMocks();
});
describe("AgentCommsPanel — initial-state parity with ChatTab my-chat", () => {
it("shows loading text while history fetch is in flight", () => {
apiGetMock.mockReturnValueOnce(new Promise(() => { /* never resolves */ }));
render(<AgentCommsPanel workspaceId="ws-self" />);
expect(screen.getByText("Loading agent communications...")).toBeDefined();
});
it("renders error UI with a Retry button when the history fetch rejects", async () => {
apiGetMock.mockRejectedValueOnce(new Error("network down"));
render(<AgentCommsPanel workspaceId="ws-self" />);
// Wait for the error state to render — loading→error transition is async.
const alert = await waitFor(() => screen.getByRole("alert"));
expect(alert.textContent).toMatch(/Failed to load agent communications/);
expect(alert.textContent).toMatch(/network down/);
// Retry button must be present and trigger a refetch.
const retry = screen.getByRole("button", { name: "Retry" });
apiGetMock.mockResolvedValueOnce([]); // success on retry
fireEvent.click(retry);
// Two calls total: initial load + retry. Pin via mock call count.
await waitFor(() => expect(apiGetMock.mock.calls.length).toBe(2));
});
it("falls back to empty-state copy when load succeeds with zero rows", async () => {
apiGetMock.mockResolvedValueOnce([]);
render(<AgentCommsPanel workspaceId="ws-self" />);
await waitFor(() =>
expect(screen.getByText("No agent-to-agent communications yet.")).toBeDefined(),
);
});
it("scrollIntoView is called with behavior=instant on the first message arrival", async () => {
apiGetMock.mockResolvedValueOnce([
{
id: "act-1",
activity_type: "a2a_send",
source_id: "ws-self",
target_id: "ws-peer",
method: "message/send",
summary: "Delegating",
request_body: { message: { parts: [{ text: "hi" }] } },
response_body: null,
status: "ok",
created_at: "2026-04-25T18:00:00Z",
},
]);
render(<AgentCommsPanel workspaceId="ws-self" />);
// The first-arrival branch of the scroll effect is what makes the
// first call instant (useLayoutEffect only moves it before paint).
// Wait for the panel to render at least one message.
await waitFor(() => expect(scrollSpy.mock.calls.length).toBeGreaterThan(0));
// The pinned contract: SOME call uses behavior: "instant" — the
// first-arrival case. Subsequent appends use "smooth", but those
// can't fire here (no live update yet).
const sawInstant = scrollSpy.mock.calls.some((args) => {
const opts = args[0];
return typeof opts === "object" && opts !== null && "behavior" in opts && opts.behavior === "instant";
});
expect(sawInstant).toBe(true);
});
});
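Reviewer note: the first-arrival scroll rule can be stated without React. The sketch below is a hedged restatement, not the component's code; `createScrollPolicy` and `ScrollMode` are hypothetical names, and the closure variable stands in for `hasInitialScrollRef`.

```typescript
type ScrollMode = "instant" | "smooth";

// Returns a function that picks the scroll behaviour for each render.
// The first call that sees at least one message yields "instant";
// every other call yields "smooth".
function createScrollPolicy(): (messageCount: number) => ScrollMode {
  let hasInitialScroll = false; // plays the role of hasInitialScrollRef.current
  return (messageCount: number): ScrollMode => {
    if (!hasInitialScroll && messageCount > 0) {
      hasInitialScroll = true;
      return "instant";
    }
    return "smooth";
  };
}
```

Note that a render with zero messages does not consume the one-shot flag, which matches the `messages.length > 0` guard in the effect.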
+40 -2
@@ -44,6 +44,8 @@ export async function uploadChatFiles(
* - `workspace:<abs-path>` (our canonical form)
* - `file:///workspace/...` (some agents emit this)
* - `/workspace/...` (bare absolute path inside the container)
* - `platform-pending:<wsid>/<file_id>` (poll-mode upload, staged
* on platform side; resolves to /pending-uploads/<file_id>/content)
* Everything that looks like an allowed-root container path is
* rewritten to the authenticated /chat/download endpoint. HTTP(S)
* URIs pass through unchanged so we can also render links to
@@ -53,6 +55,35 @@ export function resolveAttachmentHref(
workspaceId: string,
uri: string,
): string {
// platform-pending: agent-emitted URI that lives in the platform-side
// staging layer (poll-mode chat uploads, see workspace-server's
// chat_files.go ~line 690 + pendinguploads.Storage). The wire shape
// is `platform-pending:<workspace_id>/<file_id>`. Resolving it
// requires hitting GET /workspaces/<wsid>/pending-uploads/<file_id>/content
// which streams the bytes with full workspace auth. Without this
// case the browser sees an unhandled-protocol click → about:blank,
// which was the user-visible bug from 2026-05-05 (reno-stars).
if (uri.startsWith("platform-pending:")) {
const rest = uri.slice("platform-pending:".length);
const slash = rest.indexOf("/");
// Defensive: if the URI doesn't have the expected wsid/fileid
// shape, fall through to raw-URI handling so the consumer can
// still try to render it (rather than producing a broken /pending-
// uploads/// path).
if (slash > 0) {
const wsid = rest.slice(0, slash);
const fileID = rest.slice(slash + 1);
if (wsid && fileID) {
// Use the URI's own workspace_id (the bytes live in THAT
// workspace's pending-uploads store), not the chat's
// workspace_id — these CAN differ when a user drags a file
// into one workspace's chat that gets forwarded to another
// (cross-workspace delegation, agent forwarding).
return `${PLATFORM_URL}/workspaces/${wsid}/pending-uploads/${fileID}/content`;
}
}
return uri;
}
const containerPath = normalizeWorkspaceUri(uri);
if (containerPath) {
return `${PLATFORM_URL}/workspaces/${workspaceId}/chat/download?path=${encodeURIComponent(containerPath)}`;
@@ -60,6 +91,14 @@ export function resolveAttachmentHref(
return uri;
}
/** Returns true when the URI points at a platform-side resource that
* requires our auth headers — caller should route through
* downloadChatFile rather than letting the browser navigate. */
export function isPlatformAttachment(uri: string): boolean {
if (uri.startsWith("platform-pending:")) return true;
return normalizeWorkspaceUri(uri) !== null;
}
/** Extracts the absolute container path from a workspace-scoped URI,
* or null if the URI isn't a container path. The matching roots
* mirror the server's `allowedRoots` allowlist. */
@@ -96,8 +135,7 @@ export async function downloadChatFile(
attachment: ChatAttachment,
): Promise<void> {
const href = resolveAttachmentHref(workspaceId, attachment.uri);
const isContainerPath = normalizeWorkspaceUri(attachment.uri) !== null;
if (!isContainerPath) {
if (!isPlatformAttachment(attachment.uri)) {
// External URL — let the browser navigate. Opens in new tab so
// the canvas context survives a navigation. `href` here is the
// raw URI (http(s), or anything else the agent sent back).
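Reviewer note: the `platform-pending:` branch of `resolveAttachmentHref` can be isolated as a small parser. The sketch below is illustrative; `resolvePendingUploadHref` is a hypothetical name, the `PLATFORM_URL` value is a placeholder, and it returns `null` for malformed input, whereas the code in the diff returns the raw URI in that case.

```typescript
// Placeholder base URL; the real module supplies its own PLATFORM_URL.
const PLATFORM_URL = "https://platform.example.test";

// Maps `platform-pending:<wsid>/<file_id>` to the pending-uploads
// content endpoint, or returns null when the shape does not match.
function resolvePendingUploadHref(uri: string): string | null {
  const prefix = "platform-pending:";
  if (!uri.startsWith(prefix)) return null;
  const rest = uri.slice(prefix.length);
  const slash = rest.indexOf("/");
  if (slash <= 0) return null; // no separator, or empty workspace id
  const wsid = rest.slice(0, slash);
  const fileID = rest.slice(slash + 1);
  if (!fileID) return null; // empty file id
  // Uses the workspace id carried in the URI, not the chat's own id.
  return `${PLATFORM_URL}/workspaces/${wsid}/pending-uploads/${fileID}/content`;
}
```

Like the diff, this does not URL-encode the two IDs; whether that is needed depends on the server's ID format, which the diff does not show.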
+155 -1
@@ -2,7 +2,7 @@
* @vitest-environment jsdom
*/
import { describe, it, expect, vi, afterEach } from "vitest";
import { fetchSession, redirectToLogin } from "../auth";
import { fetchSession, redirectToLogin, signOut } from "../auth";
afterEach(() => {
vi.unstubAllGlobals();
@@ -110,3 +110,157 @@ describe("redirectToLogin", () => {
expect((window.location as unknown as { href: string }).href).toBe(signupHref);
});
});
describe("signOut", () => {
// Helper — most tests need the same window.location stub.
function stubLocation(): void {
Object.defineProperty(window, "location", {
writable: true,
value: {
href: "https://acme.moleculesai.app/orgs",
pathname: "/orgs",
hostname: "acme.moleculesai.app",
protocol: "https:",
},
});
}
it("POSTs to /cp/auth/signout with credentials:include", async () => {
stubLocation();
const fetchMock = vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
});
vi.stubGlobal("fetch", fetchMock);
await signOut();
expect(fetchMock).toHaveBeenCalledTimes(1);
expect(fetchMock).toHaveBeenCalledWith(
expect.stringContaining("/cp/auth/signout"),
expect.objectContaining({ method: "POST", credentials: "include" }),
);
});
it("navigates to provider logout_url when the response includes one", async () => {
// The hosted-logout path is what actually breaks the SSO re-auth
// loop reported on PR #2913. Without this, AuthKit's browser
// cookie keeps the user signed in via SSO and any subsequent
// /cp/auth/login silently re-auths.
stubLocation();
const hostedLogout =
"https://api.workos.com/user_management/sessions/logout?session_id=cookie&return_to=https%3A%2F%2Fapp.moleculesai.app%2Forgs";
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: hostedLogout }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe(hostedLogout);
});
it("falls back to /cp/auth/login when logout_url is empty (DisabledProvider / dev)", async () => {
// DisabledProvider returns "" — the local /cp/auth/login redirect
// works in dev/test where there's no SSO session to escape.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: "" }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
// Tenant subdomain (acme.moleculesai.app) → auth origin is app.moleculesai.app.
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects even when the POST fails so the user isn't stuck on an authed page", async () => {
// Critical UX invariant: clicking 'Sign out' MUST navigate away from
// the authenticated app, even if the network is down or the cookie
// is already invalid. Anything else looks like the button is
// broken — the precise complaint that triggered this fix.
stubLocation();
vi.stubGlobal("fetch", vi.fn().mockRejectedValue(new Error("network down")));
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("redirects on 401 (session already invalid) just like 200", async () => {
// A user with an already-invalid cookie should still see the
// logout flow complete — no error, no stuck-on-app dead end.
// Note: 401 means res.ok=false → we don't read .json() at all,
// so a missing body is fine.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: false,
status: 401,
json: async () => ({}),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when the response body is malformed", async () => {
// Defensive parsing: a body that isn't valid JSON, or doesn't
// have logout_url, or has logout_url as the wrong type — none of
// these should strand the user on the authed page. Fallback path
// takes over.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => {
throw new Error("not json");
},
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
it("falls back to /cp/auth/login when logout_url is the wrong type", async () => {
// Even valid JSON should be type-checked: a non-string logout_url
// (e.g. server-side bug, version drift) must not crash or open-
// redirect the user.
stubLocation();
vi.stubGlobal(
"fetch",
vi.fn().mockResolvedValue({
ok: true,
status: 200,
json: async () => ({ ok: true, logout_url: 42 }),
}),
);
await signOut();
const after = (window.location as unknown as { href: string }).href;
expect(after).toBe("https://app.moleculesai.app/cp/auth/login");
});
});
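Reviewer note: the three fallback tests above (empty, malformed, wrong-typed `logout_url`) all exercise one narrowing step. The sketch below shows that step as a standalone function; `extractLogoutUrl` is a hypothetical name and is not part of the diff, which performs the same checks inline in `signOut`.

```typescript
// Returns logout_url only when the body is an object carrying a
// non-empty string under that key; otherwise undefined, so the
// caller falls back to the /cp/auth/login redirect.
function extractLogoutUrl(body: unknown): string | undefined {
  if (typeof body !== "object" || body === null) return undefined;
  const candidate = (body as Record<string, unknown>)["logout_url"];
  return typeof candidate === "string" && candidate !== ""
    ? candidate
    : undefined;
}
```

Treating the empty string as "absent" is what lets the DisabledProvider case share the same fallback as a malformed body.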
+77
@@ -67,3 +67,80 @@ export function redirectToLogin(screenHint: "sign-up" | "sign-in" = "sign-in"):
const dest = `${authOrigin}${AUTH_BASE}/${path}?return_to=${encodeURIComponent(returnTo)}`;
window.location.href = dest;
}
/**
* signOut posts to /cp/auth/signout to clear the WorkOS session cookie
* + revoke at the provider, then navigates the browser to the
* provider-supplied hosted logout URL (so the provider's BROWSER-side
* SSO cookie is cleared too — without this, AuthKit silently re-auths
* via SSO on the next /cp/auth/login and the user is "still signed
* in" after pressing Sign out).
*
* Two-layer flow:
* 1. POST /cp/auth/signout → CP clears OUR session cookie + revokes
* session_id at the provider API. Response includes
* `logout_url` — the AuthKit hosted URL the BROWSER must navigate
* to so the provider's own browser cookie is cleared.
* 2. window.location.href = <logout_url> → AuthKit clears its
* session, then redirects the browser to the configured
* return_to (defaults to APP_URL/orgs).
*
* Best-effort by design: a 5xx, network failure, missing logout_url
* (DisabledProvider, dev), or stale cookie still results in the
* browser navigating away — leaving the user on a logged-in-looking
* page after they clicked "Sign out" is the worst possible UX. The
* fallback path navigates to /cp/auth/login on the auth origin, which
* works correctly in environments without a hosted logout flow (dev,
* tests, DisabledProvider).
*
* Throws nothing — callers can disable the button optimistically or
* await this and trust it returns. On a redirect-blocked test
* environment (jsdom under vitest) we still exit cleanly so unit tests
* can spy on the fetch call.
*/
export async function signOut(): Promise<void> {
let logoutURL: string | undefined;
// Fire-and-tolerate the POST. credentials:include is mandatory cross-
// origin so the SaaS canvas (acme.moleculesai.app) can hit
// app.moleculesai.app/cp/auth/signout with the session cookie.
try {
const res = await fetch(`${getAuthOrigin()}${AUTH_BASE}/signout`, {
method: "POST",
credentials: "include",
});
if (res.ok) {
// Body shape: {"ok": true, "logout_url": "..."}. logout_url is
// empty for DisabledProvider (dev/local) — we fall back to
// /cp/auth/login below. Defensive parsing: a malformed body
// shouldn't strand the user on the authed page.
const body: unknown = await res.json().catch(() => null);
if (
body &&
typeof body === "object" &&
"logout_url" in body &&
typeof (body as { logout_url: unknown }).logout_url === "string" &&
(body as { logout_url: string }).logout_url
) {
logoutURL = (body as { logout_url: string }).logout_url;
}
}
} catch {
// Ignore — we still redirect below.
}
if (typeof window === "undefined") return;
if (logoutURL) {
// Hosted logout: AuthKit clears its SSO cookie + redirects to
// return_to (configured server-side). This is the path that
// actually breaks the SSO re-auth loop.
window.location.href = logoutURL;
return;
}
// Fallback: no hosted logout (dev, DisabledProvider, network
// failure). Land on the login screen rather than the current URL:
// returning to a tenant URL after signout would just re-redirect
// through /cp/auth/login due to AuthGate. Send the user straight
// there with no return_to so they don't loop back into the org they
// just left.
const authOrigin = getAuthOrigin();
window.location.href = `${authOrigin}${AUTH_BASE}/login`;
}
+9
@@ -1,5 +1,14 @@
# Workspace Runtime PyPI Package
## Requires Python >= 3.11
The wheel pins `requires_python>=3.11`. On Python 3.10 or older, `pip install
molecule-ai-workspace-runtime` fails with `Could not find a version that
satisfies the requirement (from versions: none)` — the pin filters the only
available artifact before pip even attempts install. Upgrade the interpreter
(`brew install python@3.12` / `apt install python3.12` / etc.) or use a
3.11+ venv.
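A quick pre-flight check can confirm the interpreter before running `pip install`. This is a minimal sketch, not part of the package: the helper name `interpreter_is_supported` is hypothetical, and the only fact taken from the text above is the 3.11 minimum.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_VERSION = (3, 11)


def interpreter_is_supported(version_info=None, minimum=MIN_VERSION):
    # Hypothetical helper: compare only (major, minor), since the pin
    # does not depend on the patch level.
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if interpreter_is_supported():
        print("Python", found, "satisfies the >=3.11 pin")
    else:
        print("Python", found, "is too old; pip will report no matching version")
```

Running this with the same interpreter that backs `pip` shows whether the "from versions: none" error would be caused by the version pin.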
## Overview
The shared workspace runtime infrastructure has **one editable source** and
+89 -2
@@ -56,6 +56,9 @@ TOP_LEVEL_MODULES = {
"a2a_mcp_server",
"a2a_tools",
"a2a_tools_delegation",
"a2a_tools_inbox",
"a2a_tools_memory",
"a2a_tools_messaging",
"a2a_tools_rbac",
"adapter_base",
"agent",
@@ -77,6 +80,7 @@ TOP_LEVEL_MODULES = {
"internal_file_read",
"main",
"mcp_cli",
"mcp_doctor",
"mcp_heartbeat",
"mcp_inbox_pollers",
"mcp_workspace_resolver",
@@ -288,10 +292,37 @@ directory** by the `publish-runtime` GitHub Actions workflow on every
Operators running an agent outside the platform's container fleet
(any runtime that supports MCP stdio — Claude Code, hermes, codex,
etc.) can install this wheel and run the universal MCP server
locally:
locally.
### Requirements
* **Python ≥3.11.** The wheel sets `requires-python = ">=3.11"`. On
older interpreters `pip install` returns the cryptic
`Could not find a version that satisfies the requirement` — that
message is pip filtering this wheel out, NOT the package missing
from PyPI. Upgrade with `brew install python@3.12` /
`apt install python3.12` / `pyenv install 3.12` first.
* **`pipx` recommended over `pip`.** `pipx install` puts
`molecule-mcp` on PATH automatically and isolates the runtime's
deps from your system Python. Plain `pip install --user` works
but the binary lands in `~/.local/bin` (Linux) or
`~/Library/Python/3.X/bin` (macOS) which is often not on PATH on
a fresh shell — `claude mcp add molecule -- molecule-mcp` then
fails with "command not found" at first use.
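To tell a PATH problem apart from a failed install, the standard library can look the binary up the same way a shell would. This is a minimal sketch using `shutil.which`; the helper name `find_on_path` is hypothetical and not something the wheel ships.

```python
import shutil


def find_on_path(command="molecule-mcp"):
    # shutil.which returns the path of the first matching executable
    # found on PATH, or None when nothing matches.
    return shutil.which(command)


if __name__ == "__main__":
    location = find_on_path()
    if location is None:
        print("molecule-mcp is not on PATH; check the pip --user bin directory or use pipx")
    else:
        print("molecule-mcp found at", location)
```

If this prints a location but the MCP client still reports "command not found", the client is probably launched with a different PATH than your interactive shell.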
### Install
```sh
# Recommended:
pipx install molecule-ai-workspace-runtime
# Alternative (manage PATH yourself):
pip install --user molecule-ai-workspace-runtime
```
### Run
```sh
pip install molecule-ai-workspace-runtime
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN=<bearer> \\
@@ -304,10 +335,66 @@ runtimes already get via the workspace's auto-spawned MCP. Register
the binary in your agent's MCP config (e.g. Claude Code's
`claude mcp add molecule -- molecule-mcp` with the env above).
### Keeping the token out of shell history
Inline `MOLECULE_WORKSPACE_TOKEN=<bearer>` ends up in `~/.zsh_history`
and (when registered via `claude mcp add`) plaintext in
`~/.claude.json`. To avoid that, write the token to a 0600 file and
point `MOLECULE_WORKSPACE_TOKEN_FILE` at it:
```sh
umask 077
printf '%s' "<bearer>" > ~/.config/molecule/token
WORKSPACE_ID=<uuid> \\
PLATFORM_URL=https://<tenant>.staging.moleculesai.app \\
MOLECULE_WORKSPACE_TOKEN_FILE=$HOME/.config/molecule/token \\
molecule-mcp
```
Token resolution order: `MOLECULE_WORKSPACE_TOKEN` (inline env) →
`MOLECULE_WORKSPACE_TOKEN_FILE` (path) → `${CONFIGS_DIR}/.auth_token`
(in-container default).
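The resolution order above can be sketched as follows. This is an illustration, not the runtime's actual code: the function name `resolve_token`, the whitespace stripping, the fall-through on a missing file, and reading `CONFIGS_DIR` from the environment are all assumptions. Only the three sources and their order come from the text.

```python
import os
from pathlib import Path


def _read_token_file(path):
    # Assumption: a missing or empty file falls through to the next source.
    if path.is_file():
        text = path.read_text(encoding="utf-8").strip()
        if text:
            return text
    return None


def resolve_token(env=None):
    # Hypothetical sketch of the documented order:
    # inline env, then token file, then CONFIGS_DIR/.auth_token.
    env = os.environ if env is None else env

    inline = env.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
    if inline:
        return inline

    token_file = env.get("MOLECULE_WORKSPACE_TOKEN_FILE", "")
    if token_file:
        token = _read_token_file(Path(token_file).expanduser())
        if token:
            return token

    configs_dir = env.get("CONFIGS_DIR", "")
    if configs_dir:
        return _read_token_file(Path(configs_dir) / ".auth_token")
    return None
```

Because the inline variable wins, unset `MOLECULE_WORKSPACE_TOKEN` when switching to the file-based form, or a stale inline value may shadow the file.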
The token comes from the canvas → Tokens tab. Restarting an external
workspace from the canvas no longer revokes the token (PR #2412), so
operator tokens persist across status nudges.
### Push vs poll delivery (Claude Code specifics)
By default the inbox runs in **poll mode** — every turn the agent
calls `wait_for_message`, which blocks up to ~60s on
`/activity?since_id=…`. Real-time push delivery is also supported,
but on Claude Code it requires THREE conditions, ALL of which must
hold:
1. **The MCP server declares `experimental.claude/channel`** — this
wheel does (see `_build_initialize_result`). Nothing for you to
do.
2. **Claude Code installs the server as a marketplace plugin** — a
plain `claude mcp add molecule -- molecule-mcp` produces a
non-plugin-sourced server, which Claude Code rejects with
`channel_enable requires a marketplace plugin`. Until the
official `moleculesai/claude-code-plugin` marketplace lands
(tracking [#2936](https://github.com/Molecule-AI/molecule-core/issues/2936)),
operators who want push must scaffold their own local marketplace
under
`~/.claude/marketplaces/molecule-local/` containing a
`marketplace.json` + `plugin.json` that points at this wheel.
3. **Claude Code is launched with the dev-channels flag** — pass
`--dangerously-load-development-channels plugin:molecule@<marketplace>`
on the `claude` invocation. Without this flag the channel
capability is silently ignored.
Symptom of any condition failing: messages arrive but only via the
poll path (every ~60s), not real-time. There's currently no
diagnostic surfaced — `molecule-mcp doctor` (tracking
[#2937](https://github.com/Molecule-AI/molecule-core/issues/2937)) is
planned.
If you don't need real-time push, the default poll path works
universally with no extra setup; both modes converge on the same
`inbox_pop` ack so messages never duplicate.
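One way to picture why a shared ack prevents duplicates is the toy model below. It is not the wheel's implementation: the `Inbox` class, its `deliver` and `pop` methods, and the use of a message id as the dedupe key are all hypothetical stand-ins for whatever `inbox_pop` does internally.

```python
from collections import OrderedDict


class Inbox:
    # Hypothetical in-memory model: both delivery paths feed one queue
    # keyed by message id, and a single pop acks the message.

    def __init__(self):
        self._pending = OrderedDict()  # message_id -> body, oldest first
        self._acked = set()            # ids already handed to the agent

    def deliver(self, message_id, body):
        # Called by either the push path or the poll path.
        # Returns True only when the message is newly queued.
        if message_id in self._acked or message_id in self._pending:
            return False
        self._pending[message_id] = body
        return True

    def pop(self):
        # Stand-in for inbox_pop: hand out the oldest message and ack it.
        if not self._pending:
            return None
        message_id, body = self._pending.popitem(last=False)
        self._acked.add(message_id)
        return message_id, body
```

In this model a message that arrives by push and is later seen again by a poll is dropped by `deliver`, so the agent receives it exactly once regardless of which path won the race.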
See [`docs/workspace-runtime-package.md`](https://github.com/Molecule-AI/molecule-core/blob/main/docs/workspace-runtime-package.md)
for the publish flow and architecture.
"""
+295
@@ -0,0 +1,295 @@
#!/usr/bin/env bash
# E2E for poll-mode chat upload (RFC #2891 phases 1-5b).
#
# Round-trip: register a workspace as poll-mode (no callback URL) → POST a
# multi-file chat upload → verify each file becomes (a) one
# `chat_upload_receive` activity row and (b) one /pending-uploads row → fetch
# the bytes back via the poll endpoint → ack → verify the row 404s on
# subsequent fetch. Also pins cross-workspace bleed protection: workspace B
# cannot read workspace A's pending uploads even with its own valid bearer.
#
# Why this exists separately from test_chat_upload_e2e.sh: that script
# covers the PUSH path (the workspace's own /internal/chat/uploads/ingest).
# This script covers the POLL path: the same canvas-side request lands on
# the platform's pendinguploads.Storage instead, and the workspace fetches
# it later. The two paths share zero handler code on the platform side, so
# both need their own E2E.
#
# Requires: platform running on localhost:8080 with migrations applied.
# bash workspace-server/scripts/dev-start.sh
# bash workspace-server/scripts/run-migrations.sh
#
# Idempotent: each run uses fresh per-script workspace UUIDs so reruns
# don't collide. Best-effort cleanup on EXIT — does NOT call
# e2e_cleanup_all_workspaces (see
# `feedback_never_run_cluster_cleanup_tests_on_live_platform.md`).
set -euo pipefail
source "$(dirname "$0")/_lib.sh"
PASS=0
FAIL=0
TIMEOUT="${A2A_TIMEOUT:-30}"
gen_uuid() {
if command -v uuidgen >/dev/null 2>&1; then
uuidgen | tr '[:upper:]' '[:lower:]'
else
python3 -c 'import uuid; print(uuid.uuid4())'
fi
}
WS_A="$(gen_uuid)"
WS_B="$(gen_uuid)"
# Per-run scratch dir collected under one trap so every assertion-failure
# path drops the temp files it made (see test_chat_attachments_e2e.sh).
TMPDIR_E2E=$(mktemp -d -t poll-chat-upload-e2e-XXXXXX)
cleanup() {
local rc=$?
curl -s -X DELETE "$BASE/workspaces/$WS_A?confirm=true" >/dev/null 2>&1 || true
curl -s -X DELETE "$BASE/workspaces/$WS_B?confirm=true" >/dev/null 2>&1 || true
rm -rf "$TMPDIR_E2E"
exit $rc
}
trap cleanup EXIT INT TERM
check() {
local desc="$1" expected="$2" actual="$3"
if echo "$actual" | grep -qF -- "$expected"; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected to contain: $expected"
echo " got: $(echo "$actual" | head -10)"
FAIL=$((FAIL + 1))
fi
}
check_eq() {
local desc="$1" expected="$2" actual="$3"
if [ "$actual" = "$expected" ]; then
echo "PASS: $desc"
PASS=$((PASS + 1))
else
echo "FAIL: $desc"
echo " expected: $expected"
echo " got: $actual"
FAIL=$((FAIL + 1))
fi
}
echo "=== Poll-Mode Chat Upload E2E ==="
echo " base: $BASE"
echo " workspace A: $WS_A"
echo " workspace B: $WS_B"
echo ""
# ---------- Phase 1: register poll-mode workspace ----------
echo "--- Phase 1: Register poll-mode workspace A ---"
REG_A=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_A\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-a\"}
}")
check "register accepts poll mode without URL" '"status":"registered"' "$REG_A"
TOK_A=$(echo "$REG_A" | e2e_extract_token || true)
[ -n "$TOK_A" ] || { echo "FAIL: no auth_token in register response (ws A)"; FAIL=$((FAIL + 1)); exit 1; }
# ---------- Phase 2: multi-file chat upload ----------
echo ""
echo "--- Phase 2: POST /chat/uploads with two files ---"
FILE1="$TMPDIR_E2E/alpha.txt"
FILE2="$TMPDIR_E2E/beta.txt"
EXPECTED1="alpha-secret-$(openssl rand -hex 4)"
EXPECTED2="beta-secret-$(openssl rand -hex 4)"
printf '%s' "$EXPECTED1" > "$FILE1"
printf '%s' "$EXPECTED2" > "$FILE2"
UPLOAD=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$FILE1;filename=alpha.txt;type=text/plain" \
-F "files=@$FILE2;filename=beta.txt;type=text/plain" \
-w "\nHTTP_CODE=%{http_code}\n")
UPLOAD_CODE=$(echo "$UPLOAD" | grep -oE 'HTTP_CODE=[0-9]+' | cut -d= -f2)
UPLOAD_BODY=$(echo "$UPLOAD" | sed '/^HTTP_CODE=/,$d')
check_eq "upload returns 200" "200" "$UPLOAD_CODE"
check "upload response has files array" '"files":' "$UPLOAD_BODY"
# Pull file_ids out of the URI in the response. URI shape is
# `platform-pending:<wsid>/<file_id>` — proves the response came from the
# poll-mode branch, not the push-mode internal-ingest branch.
URI1=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"])')
URI2=$(echo "$UPLOAD_BODY" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][1]["uri"])')
check "URI 1 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI1"
check "URI 2 has platform-pending: scheme" "platform-pending:$WS_A/" "$URI2"
FID1="${URI1##*/}"
FID2="${URI2##*/}"
[ -n "$FID1" ] && [ -n "$FID2" ] || { echo "FAIL: could not extract file IDs"; FAIL=$((FAIL + 1)); exit 1; }
echo " file_id 1: $FID1"
echo " file_id 2: $FID2"
# ---------- Phase 3: activity rows visible to the workspace ----------
echo ""
echo "--- Phase 3: /activity shows two chat_upload_receive rows ---"
# activity_logs INSERTs run in a goroutine — give them a moment.
sleep 1
ACT=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/activity?type=a2a_receive&limit=20")
check "activity feed has the alpha file" "$FID1" "$ACT"
check "activity feed has the beta file" "$FID2" "$ACT"
check "activity rows tagged chat_upload_receive" '"method":"chat_upload_receive"' "$ACT"
check "activity rows record alpha mimetype" '"mimeType":"text/plain"' "$ACT"
CHAT_UPLOAD_COUNT=$(echo "$ACT" | python3 -c '
import json, sys
rows = json.load(sys.stdin)
n = sum(1 for r in rows if (r.get("method") or "") == "chat_upload_receive")
print(n)
')
check_eq "exactly two chat_upload_receive rows" "2" "$CHAT_UPLOAD_COUNT"
# ---------- Phase 4: GET /pending-uploads/:file_id/content ----------
echo ""
echo "--- Phase 4: Fetch content for each pending upload ---"
GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "alpha bytes round-trip" "$EXPECTED1" "$GOT1"
GOT2=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/content")
check_eq "beta bytes round-trip" "$EXPECTED2" "$GOT2"
# Mimetype + Content-Disposition headers should match what was uploaded.
HEAD1=$(curl -s -D - -o /dev/null --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check "alpha response carries text/plain Content-Type" "Content-Type: text/plain" "$HEAD1"
check "alpha response carries Content-Disposition with filename" 'filename="alpha.txt"' "$HEAD1"
# ---------- Phase 5: idempotent re-fetch (until ack) ----------
echo ""
echo "--- Phase 5: Re-fetch before ack returns the same bytes ---"
RE_GOT1=$(curl -s --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
check_eq "re-fetch returns same alpha bytes" "$EXPECTED1" "$RE_GOT1"
# ---------- Phase 6: ack each row ----------
echo ""
echo "--- Phase 6: Ack each pending upload ---"
ACK1=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
check "alpha ack returns acked:true" '"acked":true' "$ACK1"
ACK2=$(curl -s -X POST --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID2/ack")
check "beta ack returns acked:true" '"acked":true' "$ACK2"
# Re-ack is expected to be idempotent. The ideal response is 200: the row
# is already acked, the workspace's at-least-once intent was already
# honored, and a second ack that hits the raced path also returns 200.
RE_ACK1=$(curl -s -w '\n%{http_code}' -X POST --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/ack")
RE_ACK1_CODE=$(printf '%s' "$RE_ACK1" | tail -n1)
# The handler may instead return 404: its Get-before-Ack lookup filters on
# acked_at IS NULL, so it no longer sees the acked row even though the row
# is still in the table. A workspace would not normally re-ack after seeing
# the success, so accept both 200 and 404 here. That pins the contract
# without making the test brittle on the inner ordering.
case "$RE_ACK1_CODE" in
200|404)
echo "PASS: re-ack returns 200 or 404 ($RE_ACK1_CODE)"
PASS=$((PASS + 1))
;;
*)
echo "FAIL: re-ack returned unexpected $RE_ACK1_CODE"
FAIL=$((FAIL + 1))
;;
esac
# ---------- Phase 7: GET content after ack returns 404 ----------
echo ""
echo "--- Phase 7: Acked file 404s on subsequent fetch ---"
POST_ACK=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" -H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/$FID1/content")
POST_ACK_CODE=$(printf '%s' "$POST_ACK" | tail -n1)
check_eq "acked alpha returns HTTP 404" "404" "$POST_ACK_CODE"
# ---------- Phase 8: cross-workspace bleed protection ----------
echo ""
echo "--- Phase 8: Workspace B cannot read workspace A's pending uploads ---"
# Stage a fresh upload on workspace A so we have an UN-acked row to probe.
PROBE_FILE="$TMPDIR_E2E/probe.txt"
printf '%s' "probe-bytes-$(openssl rand -hex 4)" > "$PROBE_FILE"
PROBE_UP=$(curl -s -X POST "$BASE/workspaces/$WS_A/chat/uploads" \
-H "Authorization: Bearer $TOK_A" \
-F "files=@$PROBE_FILE;filename=probe.txt;type=text/plain")
PROBE_FID=$(echo "$PROBE_UP" | python3 -c 'import sys,json; d=json.load(sys.stdin); print(d["files"][0]["uri"].split("/")[-1])')
[ -n "$PROBE_FID" ] || { echo "FAIL: probe upload returned no file_id"; FAIL=$((FAIL + 1)); exit 1; }
# Register a SECOND poll-mode workspace and capture its bearer.
REG_B=$(curl -s -X POST "$BASE/registry/register" \
-H "Content-Type: application/json" \
-d "{
\"id\": \"$WS_B\",
\"delivery_mode\": \"poll\",
\"agent_card\": {\"name\": \"poll-chat-upload-test-b\"}
}")
check "second workspace registers" '"status":"registered"' "$REG_B"
TOK_B=$(echo "$REG_B" | e2e_extract_token || true)
[ -n "$TOK_B" ] || { echo "FAIL: no auth_token (ws B)"; FAIL=$((FAIL + 1)); exit 1; }
# B's bearer hitting B's URL with A's file_id → 404 (handler checks the row's
# workspace_id matches the URL :id, not the bearer's workspace).
CROSS_RESP=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_B/pending-uploads/$PROBE_FID/content")
CROSS_CODE=$(printf '%s' "$CROSS_RESP" | tail -n1)
check_eq "B's URL with A's file_id returns 404" "404" "$CROSS_CODE"
# B's bearer hitting A's URL → 401 (wsAuth pins bearer to :id). This is the
# strictest cross-workspace check: a presented-but-wrong bearer is rejected
# in EVERY platform posture (dev-mode fail-open only triggers when no bearer
# is presented at all — invalid tokens always 401).
WRONG_BEARER=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_B" \
"$BASE/workspaces/$WS_A/pending-uploads/$PROBE_FID/content")
WRONG_CODE=$(printf '%s' "$WRONG_BEARER" | tail -n1)
check_eq "B's bearer on A's URL returns 401" "401" "$WRONG_CODE"
# NB: a fully bearerless request to /pending-uploads/:fid/content returns
# 401 ONLY when the platform has MOLECULE_ENV != development (production /
# staging). On local-dev with MOLECULE_ENV=development the wsauth middleware
# fail-opens for bearerless requests so the canvas at :3000 can talk to the
# platform at :8080 without per-call token plumbing — see middleware/
# devmode.go. The strict bearerless-401 contract is covered by the wsauth
# unit + middleware tests; we don't reassert it here because the result
# depends on platform posture, not the poll-mode upload contract.
# ---------- Phase 9: invalid file_id rejected at the URL parser ----------
echo ""
echo "--- Phase 9: Invalid file_id returns 400 ---"
BAD_FID=$(curl -s -w '\n%{http_code}' --max-time "$TIMEOUT" \
-H "Authorization: Bearer $TOK_A" \
"$BASE/workspaces/$WS_A/pending-uploads/not-a-uuid/content")
BAD_FID_CODE=$(printf '%s' "$BAD_FID" | tail -n1)
check_eq "invalid file_id UUID returns 400" "400" "$BAD_FID_CODE"
# ---------- Results ----------
echo ""
echo "=== Results: $PASS passed, $FAIL failed ==="
[ "$FAIL" -eq 0 ]
+125
@@ -0,0 +1,125 @@
package events
// types.go — typed taxonomy of WebSocket event names emitted by the
// workspace-server.
//
// RFC #2945 PR-B. Pre-consolidation, every BroadcastOnly /
// RecordAndBroadcast call site passed a bare string literal:
//
// h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", payload)
//
// Producers (Go workspace-server, ~30 call sites across handlers/,
// scheduler/, registry/, bundle/) and consumers (canvas TS store +
// component listeners) duplicated the same string with no shared
// definition. A producer renaming an event silently broke every
// consumer — same drift class that produced the reno-stars data-loss
// regression on the persistence side. The fix on that side was the
// AgentMessageWriter SSOT (PR-A); the fix on this side is named
// constants.
//
// Why a typed string (not a plain enum / iota): the event name
// crosses the wire to TypeScript consumers as the literal string in
// `WSMessage.Event`. Iota integers would break the canvas store's
// switch (`case "AGENT_MESSAGE":`); a typed string preserves the
// wire contract while giving Go callers compile-time discipline.
//
// Mirror in canvas: a parity gate (PR-B-2 follow-up) will assert this
// constant set ≡ the TypeScript union members in
// `canvas/src/lib/ws-events.ts`. Today the canvas consumes the names
// via bare-string comparisons; the mirror lands separately to keep
// PR-B narrow.
// EventType is the wire-typed name of a WebSocket event the platform
// broadcasts. Always emit constants from this file rather than bare
// strings — the AST gate in events_types_drift_test.go guards
// against bare-string usage in the broadcaster surfaces.
type EventType string
// Event constants — the canonical taxonomy. New events MUST be added
// here AND mirrored in canvas/src/lib/ws-events.ts (parity gate
// pending in PR-B-2). Group by semantic family so the list stays
// scan-friendly as it grows.
const (
// Chat / agent messaging — surfaces in canvas chat panels.
EventAgentMessage EventType = "AGENT_MESSAGE"
EventA2AResponse EventType = "A2A_RESPONSE"
EventActivityLogged EventType = "ACTIVITY_LOGGED"
EventChannelMessage EventType = "CHANNEL_MESSAGE"
// Workspace lifecycle.
EventWorkspaceProvisioning EventType = "WORKSPACE_PROVISIONING"
EventWorkspaceProvisionFailed EventType = "WORKSPACE_PROVISION_FAILED"
EventWorkspaceOnline EventType = "WORKSPACE_ONLINE"
EventWorkspaceOffline EventType = "WORKSPACE_OFFLINE"
EventWorkspaceDegraded EventType = "WORKSPACE_DEGRADED"
EventWorkspaceHibernated EventType = "WORKSPACE_HIBERNATED"
EventWorkspacePaused EventType = "WORKSPACE_PAUSED"
EventWorkspaceRemoved EventType = "WORKSPACE_REMOVED"
EventWorkspaceAwaitingAgent EventType = "WORKSPACE_AWAITING_AGENT"
EventWorkspaceHeartbeat EventType = "WORKSPACE_HEARTBEAT"
// Agent assignment + identity.
EventAgentAssigned EventType = "AGENT_ASSIGNED"
EventAgentReplaced EventType = "AGENT_REPLACED"
EventAgentRemoved EventType = "AGENT_REMOVED"
EventAgentMoved EventType = "AGENT_MOVED"
EventAgentCardUpdated EventType = "AGENT_CARD_UPDATED"
// Delegation lifecycle.
EventDelegationSent EventType = "DELEGATION_SENT"
EventDelegationStatus EventType = "DELEGATION_STATUS"
EventDelegationComplete EventType = "DELEGATION_COMPLETE"
EventDelegationFailed EventType = "DELEGATION_FAILED"
// Task progression + scheduler.
EventTaskUpdated EventType = "TASK_UPDATED"
EventCronExecuted EventType = "CRON_EXECUTED"
EventCronSkipped EventType = "CRON_SKIPPED"
// Approvals.
EventApprovalRequested EventType = "APPROVAL_REQUESTED"
EventApprovalEscalated EventType = "APPROVAL_ESCALATED"
// Auth / credentials.
EventExternalCredentialsRotated EventType = "EXTERNAL_CREDENTIALS_ROTATED"
)
// AllEventTypes lists every constant in this file. Used by the
// snapshot test (events_types_drift_test.go) to detect when a new
// constant is added without updating the snapshot — the catch-up
// step is mirroring the addition into canvas/src/lib/ws-events.ts so
// canvas consumers can switch on it.
//
// Keep in lexicographic order so the snapshot diff is stable on
// renames and the parity-with-TS comparison is order-independent.
var AllEventTypes = []EventType{
EventA2AResponse,
EventActivityLogged,
EventAgentAssigned,
EventAgentCardUpdated,
EventAgentMessage,
EventAgentMoved,
EventAgentRemoved,
EventAgentReplaced,
EventApprovalEscalated,
EventApprovalRequested,
EventChannelMessage,
EventCronExecuted,
EventCronSkipped,
EventDelegationComplete,
EventDelegationFailed,
EventDelegationSent,
EventDelegationStatus,
EventExternalCredentialsRotated,
EventTaskUpdated,
EventWorkspaceAwaitingAgent,
EventWorkspaceDegraded,
EventWorkspaceHeartbeat,
EventWorkspaceHibernated,
EventWorkspaceOffline,
EventWorkspaceOnline,
EventWorkspacePaused,
EventWorkspaceProvisionFailed,
EventWorkspaceProvisioning,
EventWorkspaceRemoved,
}
@@ -0,0 +1,117 @@
package events
import (
"sort"
"strings"
"testing"
)
// TestAllEventTypes_IsSnapshot pins the canonical event taxonomy.
// Adding a new constant in types.go without updating AllEventTypes
// (or vice versa) fails this test.
//
// The snapshot is also the authoritative input to the canvas-side
// parity gate (PR-B-2 follow-up): the TypeScript union members in
// canvas/src/lib/ws-events.ts MUST match this list exactly. A drift
// gate at CI time will assert set equality once the TS file lands.
func TestAllEventTypes_IsSnapshot(t *testing.T) {
// Every named constant must appear in AllEventTypes. Walking the
// package-level vars via reflection would over-include test
// fixtures, so list the canonical names here. When a constant
// is added in types.go, append the EventType's literal value
// to the expected list below — the failure message names
// exactly what's missing so the diff is one-line obvious.
expected := []string{
"A2A_RESPONSE",
"ACTIVITY_LOGGED",
"AGENT_ASSIGNED",
"AGENT_CARD_UPDATED",
"AGENT_MESSAGE",
"AGENT_MOVED",
"AGENT_REMOVED",
"AGENT_REPLACED",
"APPROVAL_ESCALATED",
"APPROVAL_REQUESTED",
"CHANNEL_MESSAGE",
"CRON_EXECUTED",
"CRON_SKIPPED",
"DELEGATION_COMPLETE",
"DELEGATION_FAILED",
"DELEGATION_SENT",
"DELEGATION_STATUS",
"EXTERNAL_CREDENTIALS_ROTATED",
"TASK_UPDATED",
"WORKSPACE_AWAITING_AGENT",
"WORKSPACE_DEGRADED",
"WORKSPACE_HEARTBEAT",
"WORKSPACE_HIBERNATED",
"WORKSPACE_OFFLINE",
"WORKSPACE_ONLINE",
"WORKSPACE_PAUSED",
"WORKSPACE_PROVISIONING",
"WORKSPACE_PROVISION_FAILED",
"WORKSPACE_REMOVED",
}
sort.Strings(expected)
actual := make([]string, 0, len(AllEventTypes))
for _, e := range AllEventTypes {
actual = append(actual, string(e))
}
sort.Strings(actual)
if len(actual) != len(expected) {
t.Errorf("AllEventTypes count = %d, want %d\nactual: %s\nexpected: %s",
len(actual), len(expected),
strings.Join(actual, ", "),
strings.Join(expected, ", "))
return
}
for i, want := range expected {
if actual[i] != want {
t.Errorf("AllEventTypes[%d] = %q, want %q (full diff:\n actual: %v\n expected: %v\n)",
i, actual[i], want, actual, expected)
}
}
}
// TestEventType_NoEmptyConstants pins that no constant declared in
// types.go has an accidentally-empty value. The catch is the
// "WORKSPACE_X" → forgot-to-fill pattern: a typo in the literal
// would surface as the empty string, and broadcast pipelines would
// silently filter empty-name events without any error signal.
func TestEventType_NoEmptyConstants(t *testing.T) {
for _, e := range AllEventTypes {
if string(e) == "" {
t.Errorf("found empty EventType in AllEventTypes — typo in types.go?")
}
}
}
// TestEventType_AllUppercaseSnakeCase pins the wire format. Mixed
// case or kebab-case would break the canvas TypeScript switch
// statements (every consumer's `case "AGENT_MESSAGE":` is upper-
// snake). The check is the catch for an accidental
// `"agent_message"` typo that wouldn't fail the snapshot gate.
func TestEventType_AllUppercaseSnakeCase(t *testing.T) {
for _, e := range AllEventTypes {
s := string(e)
// Allowed chars: A-Z, 0-9, _ — nothing else, no leading/
// trailing underscores, no consecutive underscores.
if s != strings.ToUpper(s) {
t.Errorf("EventType %q is not all-uppercase — wire format requires upper-snake", s)
}
if strings.HasPrefix(s, "_") || strings.HasSuffix(s, "_") {
t.Errorf("EventType %q has leading/trailing underscore — disallowed", s)
}
if strings.Contains(s, "__") {
t.Errorf("EventType %q has consecutive underscores — disallowed", s)
}
for _, r := range s {
if !((r >= 'A' && r <= 'Z') || (r >= '0' && r <= '9') || r == '_') {
t.Errorf("EventType %q contains disallowed char %q", s, r)
break
}
}
}
}
+22 -70
@@ -465,78 +465,30 @@ func (h *ActivityHandler) Notify(c *gin.Context) {
}
}
// Verify workspace exists
var wsName string
err := db.DB.QueryRowContext(c.Request.Context(),
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`, workspaceID,
).Scan(&wsName)
if err != nil {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
// Single source of truth for chat-bearing agent → user messages —
// see agent_message_writer.go for the contract. Pre-RFC-#2945, the
// broadcast + INSERT pair was inlined here and again in
// mcp_tools.go's send_message_to_user, and the duplication is what
// produced the reno-stars data-loss regression. Both paths now
// route through the same writer; future channels (Slack, Discord,
// Lark) hook in here too.
attachments := make([]AgentMessageAttachment, 0, len(body.Attachments))
for _, a := range body.Attachments {
attachments = append(attachments, AgentMessageAttachment{
URI: a.URI,
Name: a.Name,
MimeType: a.MimeType,
Size: a.Size,
})
}
broadcastPayload := map[string]interface{}{
"message": body.Message,
"workspace_id": workspaceID,
"name": wsName,
}
if len(body.Attachments) > 0 {
broadcastPayload["attachments"] = body.Attachments
}
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", broadcastPayload)
// Persist to activity_logs so the chat history loader restores this
// message after a page reload. Pre-fix, send_message_to_user pushes
// were broadcast-only — survived the WebSocket session but vanished
// when the user refreshed because nothing wrote them to the DB.
//
// Shape chosen to match the existing loader query
// (`type=a2a_receive&source=canvas`):
// - activity_type='a2a_receive' so it joins the same query path
// - source_id=NULL so the canvas-source filter accepts it
// - method='notify' to distinguish from real A2A receives in audits
// - request_body=NULL so the loader doesn't append a duplicate
// "user message" bubble for it
// - response_body={"result": "<text>"} matches extractResponseText's
// simplest branch ({result: string} → take verbatim)
//
// Errors are logged-only — broadcast already succeeded, the user
// sees the message; persistence failure just means the message
// won't survive reload (pre-fix behavior). Don't fail the whole
// notify on a DB hiccup.
// response_body shape — chosen to feed BOTH:
// - extractResponseText: looks at body.result (string) and returns it
// - extractFilesFromTask: looks at body.parts[] for kind=file
// so a chat reload after a notify-with-attachments restores both
// the text bubble AND the download chips.
respPayload := map[string]interface{}{"result": body.Message}
if len(body.Attachments) > 0 {
fileParts := make([]map[string]interface{}, 0, len(body.Attachments))
for _, a := range body.Attachments {
fileMeta := map[string]interface{}{"uri": a.URI, "name": a.Name}
if a.MimeType != "" {
fileMeta["mimeType"] = a.MimeType
}
if a.Size > 0 {
fileMeta["size"] = a.Size
}
fileParts = append(fileParts, map[string]interface{}{
"kind": "file",
"file": fileMeta,
})
writer := NewAgentMessageWriter(db.DB, h.broadcaster)
if err := writer.Send(c.Request.Context(), workspaceID, body.Message, attachments); err != nil {
if errors.Is(err, ErrWorkspaceNotFound) {
c.JSON(http.StatusNotFound, gin.H{"error": "workspace not found"})
return
}
respPayload["parts"] = fileParts
}
respJSON, _ := json.Marshal(respPayload)
preview := body.Message
if len(preview) > 80 {
preview = preview[:80] + "…"
}
if _, err := db.DB.ExecContext(c.Request.Context(), `
INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
`, workspaceID, "Agent message: "+preview, string(respJSON)); err != nil {
log.Printf("Notify: failed to persist message for %s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "internal error"})
return
}
c.JSON(http.StatusOK, gin.H{"status": "sent"})
@@ -0,0 +1,177 @@
package handlers
import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"sort"
"strconv"
"strings"
"testing"
)
// TestAgentMessageBroadcastsArePersisted is a forward-looking AST
// gate: every function in this package that broadcasts an
// `AGENT_MESSAGE` WebSocket event MUST also call
// `INSERT INTO activity_logs` somewhere in its body.
//
// The reno-stars production data-loss bug (CEO Ryan PC's long-form
// onboarding-friction message visible live but missing on reload)
// happened because mcp_tools.go:toolSendMessageToUser broadcast WS
// without a paired INSERT — while the HTTP /notify sibling DID
// persist. The fix added the INSERT; this gate prevents the regression
// class from re-emerging in any future chat-bearing tool.
//
// Why an AST gate vs a code-review checklist (per memory
// feedback_behavior_based_ast_gates.md): "pin invariants by what a
// function calls, not what it's named". The shape that loses data is:
//
// BroadcastOnly(_, "AGENT_MESSAGE", _) without an INSERT companion
//
// Any new tool that emits AGENT_MESSAGE must persist or the next
// canvas refresh drops the message — same shape as reno-stars. A
// reviewer can miss this; the AST walk can't.
//
// Allowlist: empty by intent. If a future use case genuinely needs
// fire-and-forget broadcast (e.g., transient typing indicators that
// should NOT survive reload), add an entry here AND document why.
// "Doesn't need to persist" is rarely the right answer for chat —
// the canvas history is the source of truth.
func TestAgentMessageBroadcastsArePersisted(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
entries, err := os.ReadDir(wd)
if err != nil {
t.Fatalf("readdir %s: %v", wd, err)
}
type violation struct {
file string
fn string
}
var violations []violation
for _, ent := range entries {
name := ent.Name()
if ent.IsDir() || !strings.HasSuffix(name, ".go") || strings.HasSuffix(name, "_test.go") {
continue
}
path := filepath.Join(wd, name)
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, path, nil, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", path, err)
}
for _, decl := range file.Decls {
fn, ok := decl.(*ast.FuncDecl)
if !ok || fn.Body == nil {
continue
}
if !funcEmitsAgentMessageBroadcast(fn) {
continue
}
if !funcInsertsIntoActivityLogs(fn) {
violations = append(violations, violation{file: name, fn: fn.Name.Name})
}
}
}
if len(violations) > 0 {
sort.Slice(violations, func(i, j int) bool {
if violations[i].file != violations[j].file {
return violations[i].file < violations[j].file
}
return violations[i].fn < violations[j].fn
})
var buf strings.Builder
for _, v := range violations {
buf.WriteString(" - ")
buf.WriteString(v.file)
buf.WriteString(":")
buf.WriteString(v.fn)
buf.WriteString("\n")
}
t.Errorf(`function(s) broadcast `+"`AGENT_MESSAGE`"+` without persisting to activity_logs:
%s
This is the reno-stars data-loss regression class: live message
visible to the user, but missing on reload because activity_log was
never written. Every chat-bearing broadcast MUST be paired with:
INSERT INTO activity_logs (workspace_id, activity_type, method,
summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
See activity.go:Notify and mcp_tools.go:toolSendMessageToUser for
the canonical shapes. Don't add an allowlist entry without a
documented reason — the canvas chat history is the source of truth
and silently dropping messages is a P0 user trust break.`,
buf.String())
}
}
// funcEmitsAgentMessageBroadcast walks fn.Body for any CallExpr that
// looks like `*.BroadcastOnly(_, "AGENT_MESSAGE", _)`.
func funcEmitsAgentMessageBroadcast(fn *ast.FuncDecl) bool {
var found bool
ast.Inspect(fn.Body, func(n ast.Node) bool {
call, ok := n.(*ast.CallExpr)
if !ok {
return true
}
sel, ok := call.Fun.(*ast.SelectorExpr)
if !ok || sel.Sel.Name != "BroadcastOnly" {
return true
}
// BroadcastOnly(workspaceID, eventType, payload) — the second
// arg is the event name. Match by string-literal value.
if len(call.Args) < 2 {
return true
}
lit, ok := call.Args[1].(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
return true
}
raw := lit.Value
if unq, err := strconv.Unquote(raw); err == nil {
raw = unq
}
if raw == "AGENT_MESSAGE" {
found = true
return false
}
return true
})
return found
}
// funcInsertsIntoActivityLogs walks fn.Body for any STRING BasicLit
// whose body contains `INSERT INTO activity_logs` (the SQL literal
// passed to ExecContext). Matches the substring rather than a strict
// regex because we don't care about the exact INSERT shape here —
// only that the function persists. Specific shape pinning lives in
// the per-handler test (see TestMCPHandler_SendMessageToUser_*).
func funcInsertsIntoActivityLogs(fn *ast.FuncDecl) bool {
var found bool
ast.Inspect(fn.Body, func(n ast.Node) bool {
lit, ok := n.(*ast.BasicLit)
if !ok || lit.Kind != token.STRING {
return true
}
raw := lit.Value
if unq, err := strconv.Unquote(raw); err == nil {
raw = unq
}
if strings.Contains(raw, "INSERT INTO activity_logs") {
found = true
return false
}
return true
})
return found
}
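The two helpers above can be exercised against a small in-memory source file. The sketch below is illustrative only: the `sample` source, the `classify` helper, and its three labels are hypothetical and not part of the repository. It also shows one property of the literal-based match: a call whose second argument is an expression such as `string(EventAgentMessage)` is not an `*ast.BasicLit`, so a matcher of this shape does not see it.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// sample is a hypothetical source file used only for this demonstration.
const sample = `package demo

func persisted(b B, db DB) {
	b.BroadcastOnly("ws", "AGENT_MESSAGE", nil)
	db.Exec("INSERT INTO activity_logs (workspace_id) VALUES ($1)")
}

func broadcastOnly(b B) {
	b.BroadcastOnly("ws", "AGENT_MESSAGE", nil)
}

func viaConstant(b B) {
	b.BroadcastOnly("ws", string(EventAgentMessage), nil)
}
`

// emitsAgentMessage reports whether fn calls X.BroadcastOnly with the
// string literal "AGENT_MESSAGE" as its second argument.
func emitsAgentMessage(fn *ast.FuncDecl) bool {
	found := false
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "BroadcastOnly" || len(call.Args) < 2 {
			return true
		}
		lit, ok := call.Args[1].(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true // non-literal event names are not matched
		}
		if s, err := strconv.Unquote(lit.Value); err == nil && s == "AGENT_MESSAGE" {
			found = true
			return false
		}
		return true
	})
	return found
}

// insertsActivityLog reports whether any string literal in fn contains
// the substring "INSERT INTO activity_logs".
func insertsActivityLog(fn *ast.FuncDecl) bool {
	found := false
	ast.Inspect(fn.Body, func(n ast.Node) bool {
		lit, ok := n.(*ast.BasicLit)
		if !ok || lit.Kind != token.STRING {
			return true
		}
		if s, err := strconv.Unquote(lit.Value); err == nil &&
			strings.Contains(s, "INSERT INTO activity_logs") {
			found = true
			return false
		}
		return true
	})
	return found
}

// classify labels every function in src as "ok", "violation", or
// "not matched" (hypothetical labels for this sketch).
func classify(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := map[string]string{}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		switch {
		case !emitsAgentMessage(fn):
			out[fn.Name.Name] = "not matched"
		case insertsActivityLog(fn):
			out[fn.Name.Name] = "ok"
		default:
			out[fn.Name.Name] = "violation"
		}
	}
	return out, nil
}

func main() {
	res, err := classify(sample)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"persisted", "broadcastOnly", "viaConstant"} {
		fmt.Printf("%s: %s\n", name, res[name])
	}
}
```

If a producer passes the event name through a typed constant, a gate of this shape would need an additional rule (for example, matching the selector name) to keep covering it.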
@@ -0,0 +1,203 @@
package handlers
// AgentMessageWriter is the SSOT for "agent → user" message delivery in the
// workspace-server. Every chat-bearing path that surfaces a message to the
// canvas — HTTP /notify (Notify handler), MCP tools/call
// send_message_to_user (toolSendMessageToUser), any future channel — MUST
// route through this writer rather than re-implement the broadcast +
// persist contract inline.
//
// Why: pre-consolidation, two handlers duplicated the same "broadcast then
// INSERT activity_logs" sequence. The reno-stars production data-loss
// incident (2026-05-05, RFC #2945, PR #2944) was the symptom — the
// persistence half landed for /notify but lagged for the MCP bridge by
// months, silently dropping every long-form external-agent message until
// reload. The AST gate from #2944 catches drift; this writer eliminates
// the *possibility* of drift by giving both call sites a single
// well-tested function to call.
//
// Contract:
// 1. Look up the workspace by id; ErrWorkspaceNotFound on miss so the
// caller can return 404 with a clean message. Any other lookup error
// is returned wrapped (not as ErrWorkspaceNotFound).
// 2. Broadcast a WS AGENT_MESSAGE event with {message, workspace_id,
// name, attachments?}.
// 3. INSERT a row into activity_logs:
// type='a2a_receive', method='notify', source_id NULL,
// response_body={"result": message[, "parts": [file kind...]]},
// status='ok'
// Best-effort — INSERT failure logs only, returns nil so the broadcast
// success isn't undone on the caller side.
// 4. Returns nil on success.
//
// The shape (especially the JSON response_body) is the wire contract the
// canvas's chat-history hydrator (canvas/src/.../historyHydration.ts)
// reads. Drift here silently breaks chat replay across all consumers, so
// changes to the JSON shape MUST be cross-verified against the hydrator
// in the same PR.
import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"unicode/utf8"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
)
// ErrWorkspaceNotFound is returned by AgentMessageWriter.Send when the
// workspace lookup turns up nothing (or the workspace is in
// status='removed'). Callers translate to HTTP 404 / JSON-RPC error /
// whatever surface they expose. Real DB errors (connection drop, query
// timeout) surface as wrapped errors and should be treated as 503.
var ErrWorkspaceNotFound = errors.New("agent_message: workspace not found")
// truncatePreviewRunes returns at most maxRunes runes of s, plus an ellipsis
// when truncated. Operates on the rune (codepoint) boundary instead of
// byte indices — the previous byte-slice version produced invalid UTF-8
// when maxRunes landed mid-codepoint (CJK, emoji, accented characters
// in agent-authored chat messages), and Postgres JSONB rejects invalid
// UTF-8, dropping the activity_log INSERT silently. The persistence
// failure log fires but the message vanishes from chat history — the
// exact regression class the SSOT consolidation was built to prevent.
//
// maxRunes is in runes, not bytes — `truncatePreviewRunes("你好", 1)` returns
// `"你…"`, not `"\xe4…"`. Set the cap on a UI-friendly basis (visible
// character count, not stored byte count); 80 runes covers the
// activity_logs.summary column comfortably.
func truncatePreviewRunes(s string, maxRunes int) string {
if utf8.RuneCountInString(s) <= maxRunes {
return s
}
// Walk runes until we've consumed maxRunes; cut at that byte index.
count := 0
cut := len(s)
for i := range s {
if count == maxRunes {
cut = i
break
}
count++
}
return s[:cut] + "…"
}
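The difference between byte-index and rune-boundary truncation described above can be seen in a small standalone program. This is a sketch: `truncateBytes` is a hypothetical stand-in for the pre-fix behaviour (the original byte-slice code is not reproduced here), and `truncateRunes` follows the same approach as `truncatePreviewRunes`.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateBytes cuts at a byte index, which can split a multi-byte
// codepoint (hypothetical stand-in for the pre-fix behaviour).
func truncateBytes(s string, maxBytes int) string {
	if len(s) <= maxBytes {
		return s
	}
	return s[:maxBytes] + "…"
}

// truncateRunes cuts at the byte offset where rune number maxRunes
// starts, so the result is always well-formed UTF-8 for valid input.
func truncateRunes(s string, maxRunes int) string {
	if utf8.RuneCountInString(s) <= maxRunes {
		return s
	}
	count := 0
	for i := range s { // i is the byte offset of each rune start
		if count == maxRunes {
			return s[:i] + "…"
		}
		count++
	}
	return s
}

func main() {
	msg := "你好世界" // four runes, three bytes each
	b := truncateBytes(msg, 4)
	r := truncateRunes(msg, 2)
	fmt.Printf("byte cut: %q valid=%v\n", b, utf8.ValidString(b))
	fmt.Printf("rune cut: %q valid=%v\n", r, utf8.ValidString(r))
}
```

The byte cut at index 4 keeps all of the first character plus one stray byte of the second, so `utf8.ValidString` reports false for it; the rune cut stays valid.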
// AgentMessageAttachment is one file attached to an agent → user
// message. Identical to handlers.NotifyAttachment in field set; kept
// distinct so the writer's API doesn't import a handler type with HTTP
// binding tags.
type AgentMessageAttachment struct {
URI string
Name string
MimeType string
Size int64
}
// AgentMessageWriter persists + broadcasts agent → user messages. Construct
// once per process via NewAgentMessageWriter; pass the same instance to
// every handler that delivers chat (Notify, toolSendMessageToUser, etc.).
//
// Takes events.EventEmitter (not the *Broadcaster concrete type) so tests
// can substitute a fake emitter and producers in other packages can wrap
// the real broadcaster behind their own metrics / retries without leaking
// the concrete dependency.
type AgentMessageWriter struct {
db *sql.DB
broadcaster events.EventEmitter
}
// NewAgentMessageWriter binds the writer to the platform's DB pool +
// WebSocket broadcaster.
func NewAgentMessageWriter(db *sql.DB, broadcaster events.EventEmitter) *AgentMessageWriter {
return &AgentMessageWriter{db: db, broadcaster: broadcaster}
}
// Send delivers a single agent → user message: look up, broadcast, then
// persist, in that order. ErrWorkspaceNotFound short-circuits before any
// broadcast or DB write so callers can 404 cleanly; any other lookup
// failure is returned wrapped so callers can treat it as a 503.
//
// Returns nil on success, including on DB-INSERT failure (the broadcast
// already returned successfully and the user has seen the message; the
// persistence failure is logged via log.Printf but the caller's response
// stays 200 so the agent doesn't retry and double-broadcast).
func (w *AgentMessageWriter) Send(
ctx context.Context,
workspaceID, message string,
attachments []AgentMessageAttachment,
) error {
// 1. Workspace lookup. status='removed' filter is the same shape /notify
// used pre-consolidation; deleted workspaces don't get notifications.
//
// Distinguish sql.ErrNoRows ("workspace genuinely not present" — caller
// should 404) from real DB errors (connection drop, statement timeout,
// pool exhaustion — caller should 503). Pre-fix this branch returned
// ErrWorkspaceNotFound for any error, so during a DB outage every
// notify call surfaced as "workspace not found" and masked real
// incidents in the alert path.
var wsName string
err := w.db.QueryRowContext(ctx,
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`,
workspaceID,
).Scan(&wsName)
if errors.Is(err, sql.ErrNoRows) {
return ErrWorkspaceNotFound
}
if err != nil {
return fmt.Errorf("agent_message: workspace lookup: %w", err)
}
// 2. Build broadcast payload + WS-emit. Same shape that ChatTab's
// AGENT_MESSAGE handler in canvas/src/store/canvas-events.ts has
// consumed since the canvas chat shipped — drift here would orphan
// every live chat panel.
broadcastPayload := map[string]interface{}{
"message": message,
"workspace_id": workspaceID,
"name": wsName,
}
if len(attachments) > 0 {
broadcastPayload["attachments"] = attachments
}
w.broadcaster.BroadcastOnly(workspaceID, string(events.EventAgentMessage), broadcastPayload)
// 3. Persist for chat-history hydration. response_body shape MUST stay
// in sync with extractResponseText + extractFilesFromTask in
// canvas/src/components/tabs/chat/historyHydration.ts:
// - extractResponseText reads body.result (string) → renders text
// - extractFilesFromTask reads body.parts[] (kind=file) → renders chips
respPayload := map[string]interface{}{"result": message}
if len(attachments) > 0 {
fileParts := make([]map[string]interface{}, 0, len(attachments))
for _, a := range attachments {
fileMeta := map[string]interface{}{"uri": a.URI, "name": a.Name}
if a.MimeType != "" {
fileMeta["mimeType"] = a.MimeType
}
if a.Size > 0 {
fileMeta["size"] = a.Size
}
fileParts = append(fileParts, map[string]interface{}{
"kind": "file",
"file": fileMeta,
})
}
respPayload["parts"] = fileParts
}
respJSON, _ := json.Marshal(respPayload)
preview := truncatePreviewRunes(message, 80)
if _, err := w.db.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status)
VALUES ($1, 'a2a_receive', 'notify', $2, $3::jsonb, 'ok')
`, workspaceID, "Agent message: "+preview, string(respJSON)); err != nil {
// Best-effort: the broadcast already returned ok and the user
// has seen the message. Logging a structured line lets operators
// notice persistence-failure rates spike if the DB is unhealthy,
// without breaking the tool response or causing the agent to
// retry-and-double-broadcast.
log.Printf("agent_message: failed to persist for %s: %v", workspaceID, err)
}
return nil
}
@@ -0,0 +1,448 @@
package handlers
import (
"context"
"database/sql/driver"
"encoding/json"
"errors"
"strings"
"testing"
"unicode/utf8"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// AgentMessageWriter is the SSOT for agent → user chat delivery
// (RFC #2945 PR-A). These tests pin the contract the writer
// guarantees: workspace lookup, broadcast, INSERT, error semantics —
// every shape that producers (Notify, toolSendMessageToUser, future
// channels) rely on.
//
// Pre-consolidation, the broadcast-then-INSERT logic was duplicated
// across two handlers and they drifted (reno-stars, 2026-05-05). With
// the writer being the only place this logic lives, these tests are
// the regression line for every chat-bearing path simultaneously.
// jsonMatcher is a sqlmock Argument matcher that decodes the actual
// SQL arg as JSON and runs a caller-supplied predicate over the
// resulting structure. Tighter than substring matching (which can
// false-pass on a renamed key) and tolerant of map-key ordering
// (which exact-string matching is not).
type jsonMatcher struct {
predicate func(parsed map[string]any) bool
desc string
}
func (m jsonMatcher) Match(v driver.Value) bool {
s, ok := v.(string)
if !ok {
return false
}
var parsed map[string]any
if err := json.Unmarshal([]byte(s), &parsed); err != nil {
return false
}
return m.predicate(parsed)
}
// stringMatcher pins exact prefix/suffix/equality checks against a
// driver.Value that's actually a string.
type stringMatcher func(string) bool
func (f stringMatcher) Match(v driver.Value) bool {
s, ok := v.(string)
if !ok {
return false
}
return f(s)
}
// capturingEmitter records every BroadcastOnly call so tests can pin
// the WS event shape without a real ws.Hub. RecordAndBroadcast is
// also captured for completeness — the writer doesn't call it today,
// but a future producer might, and a captured-but-unasserted record
// is easier to diagnose than a nil panic.
type capturingEmitter struct {
events []capturedEvent
}
type capturedEvent struct {
workspaceID string
eventType string
payload interface{}
}
func (c *capturingEmitter) BroadcastOnly(workspaceID string, eventType string, payload interface{}) {
c.events = append(c.events, capturedEvent{workspaceID, eventType, payload})
}
func (c *capturingEmitter) RecordAndBroadcast(_ context.Context, eventType string, workspaceID string, payload interface{}) error {
c.events = append(c.events, capturedEvent{workspaceID, eventType, payload})
return nil
}
// TestAgentMessageWriter_Send_Success_NoAttachments pins the happy
// path: workspace lookup, broadcast, INSERT, return nil.
func TestAgentMessageWriter_Send_Success_NoAttachments(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-1").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-1",
sqlmock.AnyArg(), // summary
`{"result":"hi"}`,
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-1", "hi", nil); err != nil {
t.Fatalf("Send returned %v, want nil", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations: %v", err)
}
}
// TestAgentMessageWriter_Send_Success_WithAttachments pins the file
// attachment shape — response_body MUST contain a parts[] array with
// kind=file entries so the canvas hydrator renders download chips.
// Drift here = chips disappear on chat reload.
func TestAgentMessageWriter_Send_Success_WithAttachments(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-att").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Ryan"))
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-att",
sqlmock.AnyArg(),
jsonMatcher{
desc: "response_body has result + parts with kind=file metadata",
predicate: func(p map[string]any) bool {
if p["result"] != "see attached" {
return false
}
parts, ok := p["parts"].([]any)
if !ok || len(parts) != 1 {
return false
}
part, ok := parts[0].(map[string]any)
if !ok {
return false
}
if part["kind"] != "file" {
return false
}
file, ok := part["file"].(map[string]any)
if !ok {
return false
}
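// Comma-ok assertion: a missing or non-numeric size fails the
// match instead of panicking inside the matcher.
size, ok := file["size"].(float64)
if !ok {
return false
}
return file["uri"] == "workspace://x.zip" &&
file["name"] == "x.zip" &&
file["mimeType"] == "application/zip" &&
size == 1234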
},
},
).
WillReturnResult(sqlmock.NewResult(1, 1))
atts := []AgentMessageAttachment{
{URI: "workspace://x.zip", Name: "x.zip", MimeType: "application/zip", Size: 1234},
}
if err := w.Send(context.Background(), "ws-att", "see attached", atts); err != nil {
t.Fatalf("Send returned %v, want nil", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations: %v", err)
}
}
// TestAgentMessageWriter_Send_WorkspaceNotFound pins the ErrWorkspaceNotFound
// short-circuit. MUST NOT broadcast, MUST NOT INSERT; the caller will 404
// or surface a JSON-RPC error.
func TestAgentMessageWriter_Send_WorkspaceNotFound(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-missing").
WillReturnRows(sqlmock.NewRows([]string{"name"}))
err := w.Send(context.Background(), "ws-missing", "lost in the void", nil)
if !errors.Is(err, ErrWorkspaceNotFound) {
t.Errorf("Send returned %v, want ErrWorkspaceNotFound", err)
}
if len(emitter.events) != 0 {
t.Errorf("workspace-not-found path MUST NOT broadcast, got %d events", len(emitter.events))
}
// Implicit: no INSERT expectation registered, so a stray INSERT
// would fail ExpectationsWereMet.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations (INSERT must NOT fire on workspace-not-found): %v", err)
}
}
// TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil pins the
// "best-effort persistence" contract: when the activity_log INSERT
// fails (DB hiccup, transient connection, constraint), the writer
// MUST still return nil. The broadcast already succeeded; the user
// has seen the message; returning an error here would cause the
// caller (and the agent calling the tool) to retry and double-
// broadcast.
func TestAgentMessageWriter_Send_DBInsertFailureStillReturnsNil(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-dbfail").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnError(errors.New("transient db error"))
err := w.Send(context.Background(), "ws-dbfail", "should not be lost from live chat", nil)
if err != nil {
t.Errorf("DB INSERT failure must return nil (broadcast already succeeded), got %v", err)
}
}
// TestAgentMessageWriter_Send_PreviewTruncation pins the summary
// preview cap. Long messages (Ryan's onboarding-friction report was
// ~2k chars) must summarise to at most 80 runes plus an ellipsis so the
// activity table doesn't carry multi-KB summaries that bloat list queries.
func TestAgentMessageWriter_Send_PreviewTruncation(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-trunc").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Ryan"))
longMsg := strings.Repeat("x", 200)
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs(
"ws-trunc",
stringMatcher(func(s string) bool {
if !strings.HasPrefix(s, "Agent message: ") {
return false
}
preview := strings.TrimPrefix(s, "Agent message: ")
if !strings.HasSuffix(preview, "…") {
return false
}
body := strings.TrimSuffix(preview, "…")
return len(body) == 80
}),
sqlmock.AnyArg(),
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-trunc", longMsg, nil); err != nil {
t.Fatalf("Send: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("preview truncation drift: %v", err)
}
}
// TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent pins the
// WS event name + payload shape. The canvas's
// canvas-events.ts:AGENT_MESSAGE handler reads {message, workspace_id,
// name, attachments?} — drift here orphans every live chat panel.
func TestAgentMessageWriter_Send_BroadcastsAgentMessageEvent(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-bc").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("Workspace Name"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
atts := []AgentMessageAttachment{
{URI: "workspace://a.txt", Name: "a.txt"},
}
if err := w.Send(context.Background(), "ws-bc", "hi", atts); err != nil {
t.Fatalf("Send: %v", err)
}
if len(emitter.events) != 1 {
t.Fatalf("expected exactly 1 broadcast, got %d", len(emitter.events))
}
ev := emitter.events[0]
if ev.eventType != "AGENT_MESSAGE" {
t.Errorf("event type = %q, want AGENT_MESSAGE", ev.eventType)
}
if ev.workspaceID != "ws-bc" {
t.Errorf("workspace_id = %q, want ws-bc", ev.workspaceID)
}
pl, ok := ev.payload.(map[string]interface{})
if !ok {
t.Fatalf("payload not a map: %T", ev.payload)
}
if pl["message"] != "hi" {
t.Errorf("payload.message = %v, want hi", pl["message"])
}
if pl["workspace_id"] != "ws-bc" {
t.Errorf("payload.workspace_id = %v, want ws-bc", pl["workspace_id"])
}
if pl["name"] != "Workspace Name" {
t.Errorf("payload.name = %v, want Workspace Name", pl["name"])
}
if pl["attachments"] == nil {
t.Error("payload.attachments missing on attachment-bearing send")
}
}
// TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped pins the
// distinction between sql.ErrNoRows (legit not-found → 404) and real
// DB errors (connection drop → 503). Pre-followup the lookup branch
// returned ErrWorkspaceNotFound for ANY error, so during a DB outage
// every notify call surfaced as "workspace not found" and masked
// real incidents in alerting.
func TestAgentMessageWriter_Send_DBErrorOnLookupReturnsWrapped(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
transientErr := errors.New("connection refused")
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-dbdown").
WillReturnError(transientErr)
err := w.Send(context.Background(), "ws-dbdown", "hi", nil)
if err == nil {
t.Fatal("expected wrapped DB error, got nil")
}
if errors.Is(err, ErrWorkspaceNotFound) {
t.Errorf("DB outage MUST NOT surface as ErrWorkspaceNotFound (masks incidents in alerting); got %v", err)
}
if !errors.Is(err, transientErr) {
t.Errorf("expected wrapped %v, got %v", transientErr, err)
}
}
// TestTruncatePreviewRunes_RuneBoundary pins the multi-byte-safe
// truncation. The previous byte-slice version produced invalid UTF-8
// when the cut landed mid-codepoint (CJK, emoji, accented), and
// Postgres JSONB rejects invalid UTF-8 — INSERT fails, log.Printf
// fires, message vanishes from chat history. Per memory
// feedback_assert_exact_not_substring.md, pin the boundary cases
// directly.
func TestTruncatePreviewRunes_RuneBoundary(t *testing.T) {
cases := []struct {
name string
in string
max int
want string
}{
{"under-max ASCII", "hi", 80, "hi"},
{"under-max CJK", "你好", 80, "你好"},
{"exactly-at-max", "abcde", 5, "abcde"},
{"truncate ASCII", "abcdefghij", 5, "abcde…"},
{"truncate CJK at rune boundary", "你好世界你好世界", 4, "你好世界…"},
{"truncate emoji at rune boundary", "😀😀😀😀😀😀", 3, "😀😀😀…"},
// The pre-fix bug shape: byte-slice on non-ASCII would have
// mangled the codepoint here. With rune-boundary truncation
// the result is well-formed UTF-8.
{"non-zero with emoji prefix", "🚀abcdefghijk", 5, "🚀abcd…"},
}
for _, c := range cases {
t.Run(c.name, func(t *testing.T) {
got := truncatePreviewRunes(c.in, c.max)
if got != c.want {
t.Errorf("truncatePreviewRunes(%q, %d) = %q, want %q", c.in, c.max, got, c.want)
}
// Always-valid UTF-8 invariant. A byte-slice truncation
// could leave partial codepoints; this version must not.
if !utf8.ValidString(got) {
t.Errorf("truncatePreviewRunes(%q, %d) returned invalid UTF-8: %q", c.in, c.max, got)
}
})
}
}
// TestAgentMessageWriter_Send_NonASCIIMessagePersists pins the end-to-end
// path for non-ASCII messages — the original reno-stars regression
// surfaced via byte-slice truncation breaking JSONB INSERT. Every
// handler-level test had ASCII content, so this branch had no
// coverage. Now it does.
func TestAgentMessageWriter_Send_NonASCIIMessagePersists(t *testing.T) {
mock := setupTestDB(t)
w := NewAgentMessageWriter(db.DB, newTestBroadcaster())
// 200-rune CJK message — exceeds the 80-rune cap, would have hit
// the byte-slice bug.
msg := strings.Repeat("你", 200)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-cjk").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WithArgs(
"ws-cjk",
stringMatcher(func(s string) bool {
if !strings.HasPrefix(s, "Agent message: ") {
return false
}
preview := strings.TrimPrefix(s, "Agent message: ")
if !strings.HasSuffix(preview, "…") {
return false
}
body := strings.TrimSuffix(preview, "…")
// 80 runes of 你 = 80 codepoints. Each is 3 bytes UTF-8.
if utf8.RuneCountInString(body) != 80 {
return false
}
// MUST be valid UTF-8 — pre-fix byte-slice would have
// returned half a codepoint here.
return utf8.ValidString(body)
}),
sqlmock.AnyArg(),
).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-cjk", msg, nil); err != nil {
t.Fatalf("Send: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("non-ASCII path drift: %v", err)
}
}
// TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty pins the
// "no key when nil" wire contract — extra empty fields would force
// canvas consumers to defensively check for [] vs undefined; the
// existing AGENT_MESSAGE handler treats absence as "no attachments".
func TestAgentMessageWriter_Send_OmitsAttachmentsKeyWhenEmpty(t *testing.T) {
mock := setupTestDB(t)
emitter := &capturingEmitter{}
w := NewAgentMessageWriter(db.DB, emitter)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-noatt").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("X"))
mock.ExpectExec(`INSERT INTO activity_logs`).
WillReturnResult(sqlmock.NewResult(1, 1))
if err := w.Send(context.Background(), "ws-noatt", "plain text", nil); err != nil {
t.Fatalf("Send: %v", err)
}
if len(emitter.events) != 1 {
t.Fatalf("expected 1 event, got %d", len(emitter.events))
}
pl := emitter.events[0].payload.(map[string]interface{})
if _, present := pl["attachments"]; present {
t.Errorf("attachments key MUST NOT be present when empty (canvas treats absence as 'none'); payload=%v", pl)
}
}
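The `jsonMatcher` used in the tests above is described as tolerant of map-key ordering. The standalone sketch below shows that using only the standard library: the `Match(driver.Value) bool` method is the shape sqlmock expects from an argument matcher, while `hasFileSize` and the sample JSON strings are hypothetical and exist only for this illustration.

```go
package main

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

// jsonMatcher decodes a string argument as JSON and applies a predicate.
type jsonMatcher struct {
	predicate func(parsed map[string]any) bool
}

func (m jsonMatcher) Match(v driver.Value) bool {
	s, ok := v.(string)
	if !ok {
		return false
	}
	var parsed map[string]any
	if err := json.Unmarshal([]byte(s), &parsed); err != nil {
		return false
	}
	return m.predicate(parsed)
}

// hasFileSize is a hypothetical predicate builder: it accepts a body
// with exactly one part whose file.size equals want.
func hasFileSize(want float64) jsonMatcher {
	return jsonMatcher{predicate: func(p map[string]any) bool {
		parts, ok := p["parts"].([]any)
		if !ok || len(parts) != 1 {
			return false
		}
		part, ok := parts[0].(map[string]any)
		if !ok {
			return false
		}
		file, ok := part["file"].(map[string]any)
		if !ok {
			return false
		}
		// encoding/json decodes JSON numbers into float64; the comma-ok
		// form turns a missing size into a non-match rather than a panic.
		size, ok := file["size"].(float64)
		return ok && size == want
	}}
}

func main() {
	m := hasFileSize(1234)
	fmt.Println(m.Match(`{"result":"x","parts":[{"kind":"file","file":{"size":1234}}]}`))
	fmt.Println(m.Match(`{"parts":[{"file":{"size":1234},"kind":"file"}],"result":"x"}`))
	fmt.Println(m.Match(`{"result":"x","parts":[{"kind":"file","file":{}}]}`))
	fmt.Println(m.Match(42))
}
```

The first two inputs differ only in key order and both match; the third lacks `size` and the fourth is not a string, so neither matches.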
@@ -0,0 +1,468 @@
package handlers
// class1_ast_gate_test.go — generic Class 1 leak gate per #2867 PR-A.
//
// What this gate prevents:
// The tenant-hongming leak class — a handler iterates a YAML-derived
// slice (ws.Children, sub_workspaces, etc.) and calls
// `INSERT INTO workspaces` inside the loop body without first
// checking whether a workspace with the same (parent_id, name) is
// already there. Each call to such a handler doubles the tree.
//
// Why this is broader than TestCreateWorkspaceTree_CallsLookupBeforeInsert:
// The existing gate is hard-coded to org_import.go's createWorkspaceTree.
// That catches the specific function that triggered the original
// incident — but a future handler written from scratch in a different
// file would not be covered. This gate walks every production handler
// .go file and applies a structural rule that does not depend on
// function or file names.
//
// The rule (verbatim from #2867 PR-A):
//
// "No handler in handlers/ may iterate a slice (any RangeStmt) AND
// call INSERT INTO workspaces inside the loop body without a
// preceding SELECT id FROM workspaces WHERE name=$1 AND parent_id IS
// NOT DISTINCT FROM $2 in the same function (== a lookupExistingChild
// call, OR an ON CONFLICT clause baked into the same INSERT, OR an
// explicit allowlist annotation)."
//
// Allowlist mechanism: a function whose body contains the exact comment
// string `// class1-gate: idempotent-by-design` is treated as safe.
// Use this only after writing a unit test that pins WHY the function
// is safe. The annotation is intentionally awkward to type — it should
// be rare.
import (
"go/ast"
"go/parser"
"go/token"
"os"
"path/filepath"
"regexp"
"sort"
"strings"
"testing"
)
// reINSERTWorkspaces matches the exact statement shape we care about.
// Tightened (vs bytes.Index "INSERT INTO workspaces") so the audit
// table `workspaces_audit` literal — or any other lookalike — does not
// false-positive trigger this gate. The same regex is used in the
// existing createWorkspaceTree gate (workspaces_insert_allowlist_test.go)
// — keep them in sync if either changes.
var reINSERTWorkspaces = regexp.MustCompile(`(?m)^\s*INSERT INTO workspaces\s*\(`)
// reONCONFLICT matches ON CONFLICT clauses anywhere in the same SQL
// literal. An UPSERT (INSERT ... ON CONFLICT ... DO UPDATE) is
// idempotent by definition, so the gate exempts it.
var reONCONFLICT = regexp.MustCompile(`(?i)\bON CONFLICT\b`)
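The behaviour of the two patterns above can be checked in isolation. In this sketch the regular expressions are copied from the declarations above, while `classifySQL`, its labels, and the sample statements are hypothetical; how the real gate feeds SQL literals to the patterns is not shown in this section.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	reInsertWorkspaces = regexp.MustCompile(`(?m)^\s*INSERT INTO workspaces\s*\(`)
	reOnConflict       = regexp.MustCompile(`(?i)\bON CONFLICT\b`)
)

// classifySQL is a hypothetical helper that applies the two patterns in
// the order the gate describes: find the INSERT, then exempt upserts.
func classifySQL(sql string) string {
	switch {
	case !reInsertWorkspaces.MatchString(sql):
		return "ignored"
	case reOnConflict.MatchString(sql):
		return "exempt"
	default:
		return "candidate"
	}
}

func main() {
	for _, sql := range []string{
		"\n\t\tINSERT INTO workspaces (id, name) VALUES ($1, $2)",
		"INSERT INTO workspaces_audit (id) VALUES ($1)",
		"INSERT INTO workspaces (id) VALUES ($1) on conflict (id) do nothing",
		"SELECT 1; INSERT INTO workspaces (id) VALUES ($1)",
	} {
		fmt.Printf("%-10s %q\n", classifySQL(sql), sql)
	}
}
```

Note the last case: because the pattern is anchored with `^` in multi-line mode, an INSERT that does not begin its own line is not matched, so statements formatted that way would fall outside a check of this shape.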
// gateAllowlistComment is the magic comment a function author writes
// to opt out of this gate. Forces an explicit decision.
const gateAllowlistComment = "// class1-gate: idempotent-by-design"
// preflightCallNames are function names whose presence in a function
// body counts as "did a SELECT-by-(parent_id, name) preflight". Add
// new names here as new preflight helpers are introduced. Keep the
// list TIGHT — any sloppy addition weakens the gate.
var preflightCallNames = map[string]bool{
"lookupExistingChild": true,
}
// TestClass1_NoUnpreflightedInsertInsideRange walks every production
// .go file in this package, parses the AST, and fails the test if any
// FuncDecl violates the rule above.
//
// Failure message must include: file path, function name, line of
// the offending INSERT, line of the enclosing range, and a hint at
// the three escape hatches (preflight call, ON CONFLICT, allowlist
// comment).
func TestClass1_NoUnpreflightedInsertInsideRange(t *testing.T) {
wd, err := os.Getwd()
if err != nil {
t.Fatalf("getwd: %v", err)
}
entries, err := os.ReadDir(wd)
if err != nil {
t.Fatalf("readdir %s: %v", wd, err)
}
type violation struct {
file string
fn string
insertLine int
rangeLine int
}
var violations []violation
scanned := 0
for _, e := range entries {
name := e.Name()
if e.IsDir() || !strings.HasSuffix(name, ".go") {
continue
}
if strings.HasSuffix(name, "_test.go") {
continue
}
path := filepath.Join(wd, name)
src, err := os.ReadFile(path)
if err != nil {
t.Fatalf("read %s: %v", path, err)
}
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, name, src, parser.ParseComments)
if err != nil {
t.Fatalf("parse %s: %v", path, err)
}
scanned++
// Walk every function declaration and apply the rule.
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Body == nil {
continue
}
// Allowlist: skip if the function body contains the magic
// comment. We check via the source range of the function
// — comments inside the body are in file.Comments and
// must overlap the function's Pos/End range.
if functionHasAllowlistComment(file, fd) {
continue
}
// First pass: locate every INSERT INTO workspaces literal
// in this function. We treat each such literal as a
// candidate violation and try to clear it via the rules.
candidates := findInsertWorkspacesLiterals(fd, src, fset)
if len(candidates) == 0 {
continue
}
// Has the function called a preflight helper? Single
// pass — if any preflight name appears, every INSERT in
// the function is considered preflighted. This is more
// permissive than position-aware (preflight could be
// AFTER the INSERT and still satisfy the gate), but the
// existing org_import.go gate already pins the position
// invariant for createWorkspaceTree, and a function that
// preflights AFTER inserting would fail the position
// gate in a separate test.
hasPreflight := functionCallsAny(fd, preflightCallNames)
for _, c := range candidates {
if c.hasONCONFLICT {
continue
}
if hasPreflight {
continue
}
if c.enclosingRangeLine == 0 {
// INSERT not inside any RangeStmt — single-shot,
// not the bug pattern.
continue
}
violations = append(violations, violation{
file: name,
fn: fd.Name.Name,
insertLine: c.insertLine,
rangeLine: c.enclosingRangeLine,
})
}
}
}
if scanned == 0 {
t.Fatal("scanned 0 .go files — wrong working directory? gate would always pass")
}
if len(violations) > 0 {
// Stable sort so the failure message is deterministic across
// reruns.
sort.Slice(violations, func(i, j int) bool {
if violations[i].file != violations[j].file {
return violations[i].file < violations[j].file
}
return violations[i].insertLine < violations[j].insertLine
})
var b strings.Builder
b.WriteString("Class 1 leak gate (#2867 PR-A) — these handler functions iterate a slice and INSERT INTO workspaces inside the loop body without a (parent_id, name) preflight.\n\n")
b.WriteString("This is the bug shape that triggered the tenant-hongming leak (TeamHandler.Expand re-inserting the entire sub_workspaces tree on every call). To fix any reported violation, choose ONE of:\n")
b.WriteString(" 1. Call h.lookupExistingChild(ctx, name, parentID) before the INSERT and skip the INSERT when it returns existing=true. (preferred)\n")
b.WriteString(" 2. Use INSERT ... ON CONFLICT ... DO ... (idempotent UPSERT, like registry.go).\n")
b.WriteString(" 3. Annotate the function with a `// class1-gate: idempotent-by-design` comment AND a unit test that pins why the function is structurally idempotent. (rare; require code review)\n\n")
b.WriteString("Violations:\n")
for _, v := range violations {
b.WriteString(" - ")
b.WriteString(v.file)
b.WriteString(":")
b.WriteString(itoa(v.insertLine))
b.WriteString(" — function ")
b.WriteString(v.fn)
b.WriteString("() INSERTs inside RangeStmt at line ")
b.WriteString(itoa(v.rangeLine))
b.WriteString("\n")
}
t.Fatal(b.String())
}
}
func itoa(n int) string {
// Avoid a strconv import for two call sites; keeps the test focused.
if n == 0 {
return "0"
}
neg := n < 0
if neg {
n = -n
}
var buf [20]byte
i := len(buf)
for n > 0 {
i--
buf[i] = byte('0' + n%10)
n /= 10
}
if neg {
i--
buf[i] = '-'
}
return string(buf[i:])
}
// candidateInsert holds the per-INSERT facts needed to decide whether
// the gate fires.
type candidateInsert struct {
insertLine int
hasONCONFLICT bool
enclosingRangeLine int // 0 means not inside any range
}
// findInsertWorkspacesLiterals walks fd's body and returns one
// candidateInsert per INSERT INTO workspaces string literal.
//
// Position-based detection: collect every RangeStmt's body span first,
// then for each INSERT literal check if its position is inside any
// span. ast.Inspect's f(nil) call marks the end of a subtree but does
// not say which node is being left, so a stack that pushes only
// RangeStmts would pop at the wrong time and silently miscount.
// Position spans are deterministic and easy to reason about.
func findInsertWorkspacesLiterals(fd *ast.FuncDecl, src []byte, fset *token.FileSet) []candidateInsert {
var out []candidateInsert
type span struct{ start, end token.Pos }
var ranges []span
ast.Inspect(fd.Body, func(n ast.Node) bool {
rs, ok := n.(*ast.RangeStmt)
if !ok || rs.Body == nil {
return true
}
ranges = append(ranges, span{rs.Body.Lbrace, rs.Body.Rbrace})
return true
})
enclosingRangeLineFor := func(p token.Pos) int {
// Pick the innermost enclosing range — i.e., the one with the
// largest start that still covers p. Innermost is the one
// whose body actually contains the INSERT, which is the line
// most useful in a violation message.
bestStart := token.NoPos
bestLine := 0
for _, s := range ranges {
if p > s.start && p < s.end && s.start > bestStart {
bestStart = s.start
bestLine = fset.Position(s.start).Line
}
}
return bestLine
}
ast.Inspect(fd.Body, func(n ast.Node) bool {
bl, ok := n.(*ast.BasicLit)
if !ok || bl.Kind != token.STRING {
return true
}
// Strip surrounding backticks/quotes — value includes them.
lit := bl.Value
if len(lit) >= 2 {
lit = lit[1 : len(lit)-1]
}
if !reINSERTWorkspaces.MatchString(lit) {
return true
}
out = append(out, candidateInsert{
insertLine: fset.Position(bl.Pos()).Line,
hasONCONFLICT: reONCONFLICT.MatchString(lit),
enclosingRangeLine: enclosingRangeLineFor(bl.Pos()),
})
return true
})
return out
}
// functionCallsAny returns true if any CallExpr in fd's body has a
// function name (either a SelectorExpr Sel.Name or an Ident name)
// matching a key in names.
func functionCallsAny(fd *ast.FuncDecl, names map[string]bool) bool {
found := false
ast.Inspect(fd.Body, func(n ast.Node) bool {
if found {
return false
}
ce, ok := n.(*ast.CallExpr)
if !ok {
return true
}
switch fun := ce.Fun.(type) {
case *ast.Ident:
if names[fun.Name] {
found = true
return false
}
case *ast.SelectorExpr:
if names[fun.Sel.Name] {
found = true
return false
}
}
return true
})
return found
}
// functionHasAllowlistComment returns true if the function body
// (between fd.Body.Lbrace and fd.Body.Rbrace) contains a comment
// equal to gateAllowlistComment.
func functionHasAllowlistComment(file *ast.File, fd *ast.FuncDecl) bool {
if fd.Body == nil {
return false
}
start := fd.Body.Lbrace
end := fd.Body.Rbrace
for _, cg := range file.Comments {
for _, c := range cg.List {
if c.Pos() < start || c.Pos() > end {
continue
}
if strings.TrimSpace(c.Text) == gateAllowlistComment {
return true
}
}
}
return false
}
// TestClass1_GateFiresOnSyntheticBuggySource — proves the gate actually
// catches the bug shape it's named after. Without this, a regression
// to "always pass" would not be noticed until the leak shipped again.
// Per memory feedback_assert_exact_not_substring.md: tighten the test
// + verify it FAILS on old-shape source before merging.
func TestClass1_GateFiresOnSyntheticBuggySource(t *testing.T) {
const buggySrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func buggyExpand(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
// Bug shape: INSERT inside the range body, no preflight.
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "buggy.go", buggySrc, parser.ParseComments)
if err != nil {
t.Fatalf("parse synthetic source: %v", err)
}
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "buggyExpand" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(buggySrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 INSERT literal, got %d", len(candidates))
}
c := candidates[0]
if c.enclosingRangeLine == 0 {
t.Errorf("synthetic INSERT inside `for _, child := range` should be detected as enclosed by range, got enclosingRangeLine=0 — gate would miss the bug shape")
}
if c.hasONCONFLICT {
t.Errorf("synthetic INSERT has no ON CONFLICT, gate falsely treated it as idempotent")
}
if functionCallsAny(fd, preflightCallNames) {
t.Errorf("synthetic function does not call lookupExistingChild — gate falsely treated it as preflighted")
}
// All three guards say the gate WOULD fire. Pass.
return
}
t.Fatal("buggyExpand FuncDecl not found in synthetic source")
}
// TestClass1_GateAllowsONCONFLICT — pins that an INSERT with ON
// CONFLICT inside a range body is NOT flagged. registry.go's
// upsert pattern is the prod example.
func TestClass1_GateAllowsONCONFLICT(t *testing.T) {
const safeSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func upsertLoop(db fakeDB, ctx context.Context, children []string) {
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2) ON CONFLICT (id) DO UPDATE SET name = $2`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "safe.go", safeSrc, parser.ParseComments)
if err != nil {
t.Fatalf("parse synthetic source: %v", err)
}
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "upsertLoop" {
continue
}
candidates := findInsertWorkspacesLiterals(fd, []byte(safeSrc), fset)
if len(candidates) != 1 {
t.Fatalf("expected 1 candidate, got %d", len(candidates))
}
if !candidates[0].hasONCONFLICT {
t.Errorf("ON CONFLICT clause should be detected, was missed — gate would falsely flag idempotent UPSERTs")
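}
// Found and checked the target function.
return
}
// Without this, a renamed or missing function would make the test
// pass without asserting anything.
t.Fatal("upsertLoop FuncDecl not found in synthetic source")
}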
// TestClass1_GateAllowsAllowlistAnnotation — pins the escape hatch
// works. Annotated functions are skipped at the FuncDecl level.
func TestClass1_GateAllowsAllowlistAnnotation(t *testing.T) {
const annotatedSrc = `package handlers
import "context"
type fakeDB struct{}
func (fakeDB) ExecContext(ctx context.Context, sql string, args ...interface{}) {}
func intentionallyUnpreflighted(db fakeDB, ctx context.Context, children []string) {
// class1-gate: idempotent-by-design
for _, child := range children {
db.ExecContext(ctx, ` + "`INSERT INTO workspaces (id, name) VALUES ($1, $2)`" + `, "x", child)
}
}
`
fset := token.NewFileSet()
file, err := parser.ParseFile(fset, "annotated.go", annotatedSrc, parser.ParseComments)
if err != nil {
t.Fatalf("parse synthetic source: %v", err)
}
for _, decl := range file.Decls {
fd, ok := decl.(*ast.FuncDecl)
if !ok || fd.Name.Name != "intentionallyUnpreflighted" {
continue
}
if !functionHasAllowlistComment(file, fd) {
t.Error("allowlist comment should be detected for the intentionallyUnpreflighted function — escape hatch not working")
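}
// Found and checked the target function.
return
}
// Without this, a renamed or missing function would make the test
// pass without asserting anything.
t.Fatal("intentionallyUnpreflighted FuncDecl not found in synthetic source")
}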
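The position-span detection used by findInsertWorkspacesLiterals can be tried in isolation. The following is a minimal sketch, not the gate itself: `literalsInsideRange`, `sketchSrc`, and the `exec` helper inside the parsed source are hypothetical names introduced for illustration, and the needle match is a plain substring rather than the gate's regex.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strings"
)

// literalsInsideRange (hypothetical helper) parses src and reports, for
// every string literal containing needle, whether the literal sits
// inside the body of a range statement. Results are in source order.
func literalsInsideRange(src, needle string) ([]bool, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sketch.go", src, 0)
	if err != nil {
		return nil, err
	}

	// Pass 1: record the brace span of every range body.
	type span struct{ start, end token.Pos }
	var bodies []span
	ast.Inspect(file, func(n ast.Node) bool {
		if rs, ok := n.(*ast.RangeStmt); ok && rs.Body != nil {
			bodies = append(bodies, span{rs.Body.Lbrace, rs.Body.Rbrace})
		}
		return true
	})

	// Pass 2: test each matching literal's position against the spans.
	var out []bool
	ast.Inspect(file, func(n ast.Node) bool {
		bl, ok := n.(*ast.BasicLit)
		if !ok || bl.Kind != token.STRING || !strings.Contains(bl.Value, needle) {
			return true
		}
		inside := false
		for _, s := range bodies {
			if bl.Pos() > s.start && bl.Pos() < s.end {
				inside = true
				break
			}
		}
		out = append(out, inside)
		return true
	})
	return out, nil
}

// sketchSrc is an illustrative input: one literal outside the loop,
// one inside it.
const sketchSrc = `package p

func f(xs []string) {
	exec("INSERT one")
	for range xs {
		exec("INSERT two")
	}
}

func exec(string) {}
`

func main() {
	got, err := literalsInsideRange(sketchSrc, "INSERT")
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // prints "[false true]"
}
```

Two passes keep the logic independent of traversal order: the spans are complete before any literal is classified, so nesting depth does not need to be tracked.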
@@ -53,13 +53,35 @@ func NewDelegationLedger(handle *sql.DB) *DelegationLedger {
// truncatePreview caps stored preview at 4KB. The full prompt/response is
// already in activity_logs.{request,response}_body — this is the at-a-glance
// view for the dashboard, not a forensic record.
//
// Rune-safe: previous byte-slice form (s[:previewCap]) split on a byte
// boundary, which on a multi-byte codepoint at byte 4096 produced
// invalid UTF-8 — Postgres JSONB rejects → ledger row not inserted →
// audit gap. Issue #2962. Walks the string by rune, stops at the last
// rune-boundary index that fits inside the cap. ASCII-only strings hit
// the cap exactly; CJK/emoji strings stop slightly under the cap,
// never over.
//
// Mirrors the truncatePreviewRunes fix from agent_message_writer.go
// (#2959). Both call sites should consume a shared helper after both
// fixes have landed — followup deduplication tracked in #2962's body.
const previewCap = 4096
func truncatePreview(s string) string {
if len(s) <= previewCap {
return s
}
return s[:previewCap]
// Range over a string yields rune-boundary byte indices. Walk
// until the next index would exceed previewCap; the previous
// index is the safe truncation point.
end := 0
for i := range s {
if i > previewCap {
break
}
end = i
}
return s[:end]
}
// InsertOpts is the agent's record-of-intent. Caller, callee, task preview,
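The comment on truncatePreview mentions a follow-up to share one helper between the two call sites. One possible shape for such a helper is sketched below, under assumptions: `truncateUTF8` is a hypothetical name, and it swaps the forward rune walk used in this change for a backward scan from the cap with `utf8.RuneStart`. For valid UTF-8 input the two approaches should pick the same cut point; for invalid input they may differ.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8 (hypothetical shared helper) returns the longest prefix
// of s that is at most max bytes long and does not end in the middle
// of a multi-byte codepoint.
func truncateUTF8(s string, max int) string {
	if max <= 0 {
		return ""
	}
	if len(s) <= max {
		return s
	}
	end := max
	// s[end] is the first byte that would be dropped. If it is a
	// continuation byte, the cut would split a rune, so back up
	// until the cut lands on a rune start.
	for end > 0 && !utf8.RuneStart(s[end]) {
		end--
	}
	return s[:end]
}

func main() {
	// Seven ASCII bytes followed by a 3-byte rune: a cap of 8 would
	// fall inside the rune, so the rune is dropped entirely.
	in := "aaaaaaa世"
	out := truncateUTF8(in, 8)
	fmt.Println(len(out), utf8.ValidString(out)) // prints "7 true"
}
```

The backward scan inspects at most a few bytes for valid input, whereas the forward walk visits every rune up to the cap. At a 4KB cap either cost is negligible, so the choice is mostly about readability.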
@@ -5,6 +5,7 @@ import (
"errors"
"strings"
"testing"
"unicode/utf8"
"github.com/DATA-DOG/go-sqlmock"
)
@@ -121,6 +122,63 @@ func TestTruncatePreview_ExactlyAtCap(t *testing.T) {
}
}
// TestTruncatePreview_NeverProducesInvalidUTF8 — pins #2962. The old
// byte-slice implementation (s[:previewCap]) split on a byte boundary,
// so a multi-byte codepoint straddling byte 4096 produced invalid
// UTF-8 → Postgres JSONB rejects → ledger row not inserted → audit
// gap. Test feeds a CJK / emoji-padded string longer than previewCap
// and asserts utf8.ValidString on the result.
func TestTruncatePreview_NeverProducesInvalidUTF8(t *testing.T) {
// Build a string of '世' (3 bytes per rune in UTF-8) that's just
// past the cap. With the old implementation, the slice at byte
// previewCap would land mid-rune and ValidString would fail.
// With the rune-aware implementation, the result is always valid
// UTF-8 even if the byte length is < previewCap.
rune3 := "世" // U+4E16, 3 bytes
// Need at least previewCap/3 + 1 runes so we cross the cap with
// margin to spare.
in := strings.Repeat(rune3, (previewCap/3)+10)
if len(in) <= previewCap {
t.Fatalf("test setup: input too short (%d bytes) — must exceed previewCap=%d", len(in), previewCap)
}
got := truncatePreview(in)
if !utf8.ValidString(got) {
t.Errorf("truncatePreview produced invalid UTF-8 — JSONB will reject this row. len(got)=%d", len(got))
}
if len(got) > previewCap {
t.Errorf("truncatePreview exceeded cap: len(got)=%d > previewCap=%d", len(got), previewCap)
}
// Defense-in-depth: the result should also be a clean rune
// prefix of the input — not some garbled sequence.
if !strings.HasPrefix(in, got) {
t.Errorf("truncatePreview should return a prefix of the input")
}
}
// TestTruncatePreview_MultiByteAtBoundary — most-targeted regression.
// Feeds an input where the cap byte falls EXACTLY in the middle of a
// 3-byte codepoint. Pre-fix, this is the case that produces invalid
// UTF-8; post-fix, the truncate stops at the previous rune boundary.
func TestTruncatePreview_MultiByteAtBoundary(t *testing.T) {
// Build a string that's `previewCap-1` ASCII bytes followed by
// '世' (3 bytes). Total = previewCap + 2. The old impl would
// slice at byte previewCap, landing inside the '世' codepoint.
prefix := strings.Repeat("a", previewCap-1)
in := prefix + "世"
if len(in) != previewCap+2 {
t.Fatalf("test setup: expected len %d, got %d", previewCap+2, len(in))
}
got := truncatePreview(in)
if !utf8.ValidString(got) {
t.Errorf("truncatePreview produced invalid UTF-8 at the multi-byte boundary case")
}
// Result should be exactly the ASCII prefix — '世' was past
// the cap so it must be dropped entirely.
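if got != prefix {
// Report lengths only: slicing got for a tail preview would panic
// if a regression returned fewer than 10 bytes.
t.Errorf("expected exact ASCII prefix (len=%d), got len=%d", len(prefix), len(got))
}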
}
// ---------- SetStatus lifecycle ----------
func TestLedgerSetStatus_QueuedToDispatched(t *testing.T) {
@@ -109,6 +109,12 @@ curl -fsS -X POST "{{PLATFORM_URL}}/registry/register" \
"version": "0.1.0"
}
}'
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • 401 / 403 on register — WORKSPACE_AUTH_TOKEN must be the value
# shown at workspace create. Tokens are shown only once.
`
// externalChannelTemplate — Claude Code channel plugin install + .env. For
@@ -172,6 +178,18 @@ claude --dangerously-load-development-channels \
# Multi-workspace: comma-separate IDs and tokens (same order). See
# https://github.com/Molecule-AI/molecule-mcp-claude-channel for
# pairing flow, push-mode upgrade, and v0.2 roadmap.
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/claude-code-channel-plugin
# Common errors:
# • "plugin not installed" — run /plugin marketplace add then
# /plugin install lines above; /reload-plugins or restart.
# • "not on the approved channels allowlist" — custom channels need
# --dangerously-load-development-channels; team/enterprise orgs
# need admin to set channelsEnabled + allowedChannelPlugins.
# • "Inbound messages not arriving" — stderr should show
# "molecule channel: connected — watching N workspace(s)";
# verify ~/.claude/channels/molecule/.env has PLATFORM_URL + token.
`
// externalUniversalMcpTemplate — runtime-agnostic standalone path.
@@ -198,6 +216,13 @@ const externalUniversalMcpTemplate = `# Universal MCP — standalone register +
# Pair with the Claude Code or Python SDK tab if your runtime needs
# inbound A2A delivery (canvas messages → agent conversation turns).
# Requires Python >= 3.11. On 3.10 or older pip says
# "Could not find a version that satisfies the requirement
# (from versions: none)" — the wheel's requires_python pin filters
# the only available artifact before pip even attempts install.
# Upgrade the interpreter (brew install python@3.12 / apt install
# python3.12 / etc.) or use a 3.11+ venv.
# 1. Install the workspace runtime wheel:
pip install molecule-ai-workspace-runtime
@@ -217,6 +242,17 @@ claude mcp add molecule -s user -- env \
#
# Origin/WAF handling is built into the wheel — no manual headers
# needed when calling tools through the MCP server.
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • "Tools not appearing in your agent" — run ` + "`claude mcp list`" + ` (or
# your runtime's equivalent) and confirm the molecule entry. If
# missing, re-run the ` + "`claude mcp add`" + ` line above.
# • "ConnectionRefused / DNS error on first call" — PLATFORM_URL must
# include the scheme (https://) and have NO trailing slash. Verify
# with: curl ${PLATFORM_URL}/healthz
`
// externalPythonTemplate uses molecule-sdk-python's RemoteAgentClient +
@@ -255,6 +291,15 @@ async def main():
if __name__ == "__main__":
asyncio.run(main())
# Need help?
# Where to install: https://pypi.org/project/molecule-ai-workspace-runtime/
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • 401 from /heartbeat — AUTH_TOKEN expired or wrong workspace_id.
# Tokens shown only once at create time; re-create to get a fresh one.
# • AGENT_URL not reachable from platform — public HTTPS URL required
# for inbound A2A. Use ngrok or Cloudflare Tunnel if behind NAT.
`
// externalHermesChannelTemplate — install snippet for operators whose
@@ -322,6 +367,16 @@ hermes gateway --replace
#
# Source + issue tracker:
# https://github.com/Molecule-AI/hermes-channel-molecule
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/external-agent-registration
# Common errors:
# • Gateway start failure — tail ~/.hermes/gateway.log. YAML
# duplicate-key in config.yaml is the most common cause; the
# gateway: block must appear exactly once.
# • Plugin not discovered after install — pip show hermes-channel-molecule
# to confirm install. Some hermes builds need ` + "`hermes plugin reload`" + `
# before the new platform_plugins entry takes effect.
`
// externalCodexTemplate — for operators whose external agent is a
@@ -368,14 +423,23 @@ mkdir -p ~/.codex
# (then open ~/.codex/config.toml in your editor and paste:)
#
# [mcp_servers.molecule]
# command = "python3"
# args = ["-m", "molecule_runtime.a2a_mcp_server"]
# command = "molecule-mcp"
# args = []
# startup_timeout_sec = 30
#
# [mcp_servers.molecule.env]
# WORKSPACE_ID = "{{WORKSPACE_ID}}"
# PLATFORM_URL = "{{PLATFORM_URL}}"
# MOLECULE_WORKSPACE_TOKEN = "<paste from create response>"
#
# Use the "molecule-mcp" console-script wrapper (NOT
# "python3 -m molecule_runtime.a2a_mcp_server"). The wrapper is what
# keeps the workspace ALIVE on the canvas: it POSTs /registry/register
# at startup and runs a 20s heartbeat thread alongside the MCP stdio
# loop. The bare a2a_mcp_server module exposes tools but does NOT
# heartbeat — pointing codex at it leaves the canvas showing this
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
# 3. Run the bridge daemon as a durable background process — this
# is the INBOUND path. Long-polls the platform inbox and runs
@@ -403,6 +467,18 @@ disown
# available to the agent, and the bridge wakes a non-interactive
# codex turn for any inbound canvas/peer message:
codex
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • [mcp_servers.molecule] not loaded — codex must be ≥ 0.57.
# Check with ` + "`codex --version`" + `; upgrade via npm install -g @openai/codex@latest.
# • TOML parse error after re-running setup — TOML rejects duplicate
# [mcp_servers.molecule] tables. Open ~/.codex/config.toml and
# remove the old block before pasting the new one.
# • Canvas messages don't wake codex — step 3 (codex-channel-molecule
# bridge daemon) is required for inbound push. Check
# pgrep -f codex-channel-molecule and tail ~/.codex-channel-molecule/daemon.log.
`
// externalOpenClawTemplate — for operators whose external agent is an
@@ -440,11 +516,20 @@ pip install molecule-ai-workspace-runtime
# 3. Wire the molecule MCP server. {{WORKSPACE_ID}} + {{PLATFORM_URL}}
# are stamped server-side; paste the auth token before running.
#
# Use the "molecule-mcp" console-script wrapper (NOT
# "python3 -m molecule_runtime.a2a_mcp_server"). The wrapper is what
# keeps the workspace ALIVE on the canvas: it POSTs /registry/register
# at startup and runs a 20s heartbeat thread alongside the MCP stdio
# loop. The bare a2a_mcp_server module exposes tools but does NOT
# heartbeat — pointing openclaw at it leaves the canvas showing this
# workspace as awaiting_agent (OFFLINE) within 60-90s even while
# tools work.
WORKSPACE_TOKEN="<paste from create response>"
openclaw mcp set molecule "$(cat <<EOF
{
"command": "python3",
"args": ["-m", "molecule_runtime.a2a_mcp_server"],
"command": "molecule-mcp",
"args": [],
"env": {
"WORKSPACE_ID": "{{WORKSPACE_ID}}",
"PLATFORM_URL": "{{PLATFORM_URL}}",
@@ -464,4 +549,13 @@ disown
# 5. Run an agent turn — molecule tools are now available:
openclaw agent --message "list my peers"
# Need help?
# Documentation: https://doc.moleculesai.app/docs/guides/mcp-server-setup
# Common errors:
# • Gateway not starting — tail ~/.openclaw/gateway.log. The loopback
# bind requires :18789 to be free; check with ` + "`lsof -iTCP:18789`" + `.
# • ` + "`openclaw mcp set`" + ` rejected — the heredoc generates JSON;
# verify with ` + "`jq < ~/.openclaw/mcp/molecule.json`" + ` and re-run
# ` + "`openclaw mcp set`" + ` if the file is malformed.
`
@@ -38,3 +38,40 @@ func TestExternalTemplates_NoMoleculeOrgIDPlaceholder(t *testing.T) {
}
}
}
// TestExternalMcpTemplates_UseMoleculeMcpWrapper pins the invariant
// that operator-facing snippets configuring an MCP server entry point
// use the ``molecule-mcp`` console-script wrapper (mcp_cli.main),
// NOT the bare ``a2a_mcp_server`` module.
//
// Why: a2a_mcp_server exposes the MCP tools but does NOT call
// /registry/register or run the 20s heartbeat thread. mcp_cli wraps
// it with both, which is what flips the canvas presence indicator
// from awaiting_agent (OFFLINE) to online and keeps it that way.
// Originally tracked by molecule-core#2957 — operator hit the
// silent-OFFLINE failure mode when the Codex tab pointed at the bare
// module.
//
// The hermes-channel template intentionally uses the bare module: it
// owns the platform plugin path and runs its own
// register_platform/heartbeat code in-process, so wrapping with
// mcp_cli would double-heartbeat. universalMcp / codex / openclaw
// must all use the wrapper.
func TestExternalMcpTemplates_UseMoleculeMcpWrapper(t *testing.T) {
mustUseWrapper := map[string]string{
"externalUniversalMcpTemplate": externalUniversalMcpTemplate,
"externalCodexTemplate": externalCodexTemplate,
"externalOpenClawTemplate": externalOpenClawTemplate,
}
for name, body := range mustUseWrapper {
if !strings.Contains(body, "molecule-mcp") {
t.Errorf("%s does not reference 'molecule-mcp' — operator-facing MCP snippets must point at the heartbeat-wrapping console script, not the bare a2a_mcp_server module (#2957)", name)
}
if strings.Contains(body, `"-m", "molecule_runtime.a2a_mcp_server"`) {
t.Errorf("%s spawns 'python3 -m molecule_runtime.a2a_mcp_server' — that bypasses the standalone register/heartbeat wrapper, leaving the canvas showing the workspace OFFLINE (#2957). Use 'molecule-mcp' instead.", name)
}
// The check above also covers the bracketed array form
// (["-m", "molecule_runtime.a2a_mcp_server"]), which contains the
// same quoted-args substring, so no separate check is needed.
}
}
+170 -3
@@ -11,18 +11,21 @@ import (
"os"
"testing"
"errors"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/events"
"github.com/gin-gonic/gin"
)
// newMCPHandler is a test helper that constructs an MCPHandler backed by the
// sqlmock DB set up by setupTestDB.
// sqlmock DB set up by setupTestDB. Uses newTestBroadcaster so handlers
// that call BroadcastOnly (send_message_to_user, etc.) don't nil-panic on
// the hub: events.NewBroadcaster(nil) crashes inside hub.Broadcast.
func newMCPHandler(t *testing.T) (*MCPHandler, sqlmock.Sqlmock) {
t.Helper()
mock := setupTestDB(t)
h := NewMCPHandler(db.DB, events.NewBroadcaster(nil))
h := NewMCPHandler(db.DB, newTestBroadcaster())
return h, mock
}
@@ -628,6 +631,170 @@ func TestMCPHandler_SendMessageToUser_Blocked_WhenEnvNotSet(t *testing.T) {
}
}
// TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s pins the
// "best-effort persistence" contract: when the activity_log INSERT
// fails (DB hiccup, constraint violation, transient connection drop),
// the tool MUST still return success to the agent because the WS
// broadcast already succeeded — the user has seen the message.
//
// This matches /notify (activity.go) behavior. Returning an error
// here would cause the agent to retry and re-broadcast, double-
// rendering the message in the user's live chat panel for every
// retry until the DB recovers.
func TestMCPHandler_SendMessageToUser_DBErrorLogsAndStill200s(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-err").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// INSERT fails — must NOT abort the tool response.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WillReturnError(errors.New("transient db error"))
w := mcpPost(t, h, "ws-err", map[string]interface{}{
"jsonrpc": "2.0",
"id": 100,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": "should not be lost from the live chat",
},
},
})
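// The test name promises a 200, so assert the HTTP status as well.
if w.Code != 200 {
t.Errorf("expected HTTP 200 despite the DB error, got %d body=%s", w.Code, w.Body.String())
}
var resp mcpResponse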
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response was not valid JSON-RPC: %v", err)
}
// Tool response is success — INSERT failure logged, broadcast
// already succeeded.
if resp.Error != nil {
t.Errorf("tool response should be success on DB error (broadcast won), got JSON-RPC error: %+v", resp.Error)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("expected DB calls in order: %v", err)
}
}
// TestMCPHandler_SendMessageToUser_ResponseBodyShape pins the
// response_body JSON shape stored in activity_logs. This shape MUST
// match what the canvas hydrater (extractResponseText in
// historyHydration.ts) reads — specifically `{"result": "<text>"}`.
// Any drift in the JSON shape silently breaks chat history without
// failing the INSERT.
//
// Catches the same drift class flagged in
// feedback_assert_exact_not_substring.md: a substring match on
// "result" would pass even if the field were renamed; we assert the
// exact JSON shape.
func TestMCPHandler_SendMessageToUser_ResponseBodyShape(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
const userMessage = "Hi there from the agent"
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-shape").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// Match the response_body argument against its exact expected shape.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-shape",
sqlmock.AnyArg(), // summary
// The response_body MUST be JSON `{"result": "<message>"}`.
// Any other shape (e.g., wrapping in a Task object) breaks
// the canvas hydrater's `body.result` extractor.
`{"result":"`+userMessage+`"}`,
).
WillReturnResult(sqlmock.NewResult(1, 1))
w := mcpPost(t, h, "ws-shape", map[string]interface{}{
"jsonrpc": "2.0",
"id": 101,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": userMessage,
},
},
})
if w.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", w.Code, w.Body.String())
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("response_body shape drift — would silently break canvas chat history: %v", err)
}
}
// TestMCPHandler_SendMessageToUser_PersistsToActivityLog pins the fix
// for the reno-stars / CEO Ryan PC chat-history data-loss bug:
// external claude-code agents using molecule-mcp's send_message_to_user
// tool route through THIS handler (not the HTTP /notify endpoint),
// and the handler used to broadcast WS only — visible live, gone on
// reload because nothing wrote to activity_logs.
//
// Pins:
// - INSERT happens on the success path (broadcast + DB write).
// - INSERT shape mirrors the HTTP /notify handler exactly:
// activity_type='a2a_receive', method='notify', request_body NULL,
// response_body={"result": message}, status='ok'. The canvas
// hydration query (`type=a2a_receive&source=canvas`) treats
// both writers as the same shape — drift here means the bug
// re-surfaces silently.
func TestMCPHandler_SendMessageToUser_PersistsToActivityLog(t *testing.T) {
t.Setenv("MOLECULE_MCP_ALLOW_SEND_MESSAGE", "true")
h, mock := newMCPHandler(t)
// Workspace lookup — the handler verifies the workspace exists
// before it does anything else. Returning a name lets the
// broadcast payload populate; the test doesn't assert on the
// broadcast (no observable WS in this fake), only on the DB.
mock.ExpectQuery("SELECT name FROM workspaces").
WithArgs("ws-msg").
WillReturnRows(sqlmock.NewRows([]string{"name"}).AddRow("CEO Ryan PC"))
// The persistence INSERT — pin the exact shape so a future
// refactor that switches columns or drops `method='notify'`
// breaks the test loud, not silently. Match by regex on the
// table + activity_type + method literals.
mock.ExpectExec(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`).
WithArgs(
"ws-msg",
sqlmock.AnyArg(), // summary "Agent message: ..."
sqlmock.AnyArg(), // response_body JSON
).
WillReturnResult(sqlmock.NewResult(1, 1))
w := mcpPost(t, h, "ws-msg", map[string]interface{}{
"jsonrpc": "2.0",
"id": 99,
"method": "tools/call",
"params": map[string]interface{}{
"name": "send_message_to_user",
"arguments": map[string]interface{}{
"message": "Hello, this should persist!",
},
},
})
var resp mcpResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response was not valid JSON-RPC: %v\nbody=%s", err, w.Body.String())
}
if resp.Error != nil {
t.Errorf("unexpected JSON-RPC error: %+v", resp.Error)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("DB expectations not met (INSERT missing → reno-stars data-loss regression): %v", err)
}
}
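Note on the `ExpectExec` pattern in the test above: it is a regular expression, so it pins the table name and the two literals in order while tolerating whatever column list sits between them. The following is a minimal sketch of that matching using only the standard library. The INSERT statements are hypothetical (the real SQL is not part of this diff), and whitespace is collapsed first on the assumption that the mock's matcher normalizes it, since `.` in Go's `regexp` does not match a newline by default.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// insertShape is the pattern used by the test above.
var insertShape = regexp.MustCompile(`INSERT INTO activity_logs.*'a2a_receive'.*'notify'`)

// whitespace matches runs of spaces, tabs, and newlines.
var whitespace = regexp.MustCompile(`\s+`)

// matchesInsertShape collapses whitespace, then reports whether the statement
// contains the pinned table name and literals in that order.
func matchesInsertShape(sql string) bool {
	flat := whitespace.ReplaceAllString(strings.TrimSpace(sql), " ")
	return insertShape.MatchString(flat)
}

func main() {
	// Hypothetical statements; the column names are illustrative only.
	good := `INSERT INTO activity_logs (workspace_id, activity_type, method, summary, response_body, status)
		VALUES ($1, 'a2a_receive', 'notify', $2, $3, 'ok')`
	drifted := `INSERT INTO activity_logs (workspace_id, activity_type, summary, response_body, status)
		VALUES ($1, 'a2a_receive', $2, $3, 'ok')`

	fmt.Println(matchesInsertShape(good))    // true
	fmt.Println(matchesInsertShape(drifted)) // false
}
```

A refactor that drops the `'notify'` literal no longer matches, which is the drift the test is meant to catch.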
// ─────────────────────────────────────────────────────────────────────────────
// Parse error
// ─────────────────────────────────────────────────────────────────────────────
+17 -13
@@ -11,6 +11,7 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"io"
"log"
@@ -330,20 +331,23 @@ func (h *MCPHandler) toolSendMessageToUser(ctx context.Context, workspaceID stri
return "", fmt.Errorf("send_message_to_user is not enabled on this MCP bridge (set MOLECULE_MCP_ALLOW_SEND_MESSAGE=true)")
}
var wsName string
err := h.database.QueryRowContext(ctx,
`SELECT name FROM workspaces WHERE id = $1 AND status != 'removed'`, workspaceID,
).Scan(&wsName)
if err != nil {
return "", fmt.Errorf("workspace not found")
// Single source of truth for chat-bearing agent → user messages —
// see agent_message_writer.go for the contract. The pre-RFC-#2945
// duplication of broadcast + INSERT logic between this handler and
// activity.go:Notify is what produced the reno-stars data-loss
// regression; both paths now route through the same writer.
//
// MCP send_message_to_user does not currently surface attachments
// (the tool args don't accept them); pass nil. If a future tool
// schema adds an attachments arg, build []AgentMessageAttachment
// and pass through.
writer := NewAgentMessageWriter(h.database, h.broadcaster)
if err := writer.Send(ctx, workspaceID, message, nil); err != nil {
if errors.Is(err, ErrWorkspaceNotFound) {
return "", fmt.Errorf("workspace not found")
}
return "", err
}
h.broadcaster.BroadcastOnly(workspaceID, "AGENT_MESSAGE", map[string]interface{}{
"message": message,
"workspace_id": workspaceID,
"name": wsName,
})
return "Message sent.", nil
}
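Note on the change above: the handler now delegates to one writer and translates a sentinel error with `errors.Is`. The sketch below illustrates that shape with a hypothetical in-memory `sketchWriter`; the real `AgentMessageWriter` lives in agent_message_writer.go and is not shown in this diff, so its fields, the order of persist and broadcast, and the wrapping of the sentinel are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrWorkspaceNotFound reuses the sentinel name from the diff; the real
// definition is not shown there.
var ErrWorkspaceNotFound = errors.New("workspace not found")

// sketchWriter is a hypothetical stand-in for the shared writer. It records
// what would be written to the activity log and what would be broadcast.
type sketchWriter struct {
	workspaces map[string]string // workspace id -> name
	persisted  []string
	broadcasts []string
}

// Send looks up the workspace, then persists and broadcasts in one place so
// no caller can do one without the other.
func (w *sketchWriter) Send(workspaceID, message string) error {
	name, ok := w.workspaces[workspaceID]
	if !ok {
		return fmt.Errorf("send to %s: %w", workspaceID, ErrWorkspaceNotFound)
	}
	w.persisted = append(w.persisted, message)
	w.broadcasts = append(w.broadcasts, name+": "+message)
	return nil
}

// toolSend mirrors the error mapping in the handler: the sentinel becomes a
// short user-facing error, anything else is returned unchanged.
func toolSend(w *sketchWriter, workspaceID, message string) (string, error) {
	if err := w.Send(workspaceID, message); err != nil {
		if errors.Is(err, ErrWorkspaceNotFound) {
			return "", errors.New("workspace not found")
		}
		return "", err
	}
	return "Message sent.", nil
}

func main() {
	w := &sketchWriter{workspaces: map[string]string{"ws-msg": "CEO Ryan PC"}}

	out, err := toolSend(w, "ws-msg", "hello")
	fmt.Println(out, err)

	out, err = toolSend(w, "ws-gone", "hello")
	fmt.Println(out, err)
}
```

Because `errors.Is` walks the wrap chain, the writer can add context with `%w` without breaking the handler's mapping.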
@@ -0,0 +1,416 @@
package handlers
// memories_v2.go — HTTP endpoints that expose Memory v2 plugin state to
// the canvas Memory tab. Read and forget only; commits still go through
// the MCP path (see mcp_tools_memory_v2.go) where SAFE-T1201 redaction +
// org-write audit happen at a single funnel.
//
// Why a separate v2 endpoint set rather than retrofitting memories.go:
//
// - memories.go reads `agent_memories` (legacy v1 table). After the
// 2026-05-05 cutover, agent commits go to the plugin's
// memory_records — agent_memories is frozen. The canvas Memory
// tab reading memories.go shows STALE data.
// - The plugin is loopback-only on each tenant (127.0.0.1:9100), so
// the canvas (browser) cannot call it directly. workspace-server
// proxies through these endpoints.
// - v2 has different shape (namespace tree, kind/source/pin/TTL,
// score) — overloading memories.go would break v1 consumers
// (admin export, the back-compat MCP shim).
//
// All endpoints sit under the same wsAuth group memories.go uses,
// so the existing per-tenant token gates them automatically.
import (
"errors"
"log"
"net/http"
"strconv"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/client"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/namespace"
"github.com/gin-gonic/gin"
)
// MemoriesV2Handler bundles the plugin client + namespace resolver
// behind a slim HTTP surface. Construction matches the rest of the
// handlers package: NewMemoriesV2Handler followed by WithMemoryV2 (or
// the test-only withMemoryV2APIs) at boot.
type MemoriesV2Handler struct {
plugin memoryPluginAPI
resolver namespaceResolverAPI
}
// NewMemoriesV2Handler constructs an unwired handler. Every method
// returns 503 until WithMemoryV2 is called — keeps a partial deploy
// (MEMORY_PLUGIN_URL absent) from crashing the canvas with 500s.
func NewMemoriesV2Handler() *MemoriesV2Handler {
return &MemoriesV2Handler{}
}
// WithMemoryV2 attaches the live plugin client + resolver. Returns
// the receiver for fluent boot-time wiring, mirroring MCPHandler.
func (h *MemoriesV2Handler) WithMemoryV2(plugin *client.Client, resolver *namespace.Resolver) *MemoriesV2Handler {
h.plugin = plugin
h.resolver = resolver
return h
}
// withMemoryV2APIs is the test-only injection path: takes the
// interfaces directly so unit tests don't have to construct a real
// *client.Client / namespace.Resolver. Keep symmetric with
// MCPHandler.withMemoryV2APIs so handler tests can re-use the same
// stubs.
func (h *MemoriesV2Handler) withMemoryV2APIs(plugin memoryPluginAPI, resolver namespaceResolverAPI) *MemoriesV2Handler {
h.plugin = plugin
h.resolver = resolver
return h
}
// available reports whether the v2 deps are wired. Each route checks
// this and returns 503 + a clear hint when the plugin isn't
// configured, matching the MCP-side error.
func (h *MemoriesV2Handler) available() error {
if h == nil || h.plugin == nil || h.resolver == nil {
return errors.New("memory plugin is not configured (set MEMORY_PLUGIN_URL)")
}
return nil
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /workspaces/:id/v2/namespaces
//
// Returns the namespace tree the canvas uses to drive the Memory tab's
// namespace dropdown. Two arrays:
//
// - readable[]: every namespace this workspace can READ from. Drives
// the "show me memories from X" filter dropdown.
// - writable[]: subset of readable that this workspace can WRITE to.
// Used for future canvas-side commit (not in this PR but the
// contract is symmetric so the dropdown can disable read-only
// entries when wiring up commit).
//
// Each entry carries name + kind + a friendly label so the canvas
// doesn't have to parse `workspace:abc-123` itself. Kind ranks the
// dropdown grouping (workspace → team → org → custom).
// ─────────────────────────────────────────────────────────────────────────────
// NamespaceView is the UI-friendly DTO returned by GET v2/namespaces.
// Internal namespace.Namespace has fields the canvas doesn't need
// (resolver-internal flags, raw metadata blobs); this strips it down.
type NamespaceView struct {
Name string `json:"name"`
Kind contract.NamespaceKind `json:"kind"`
// Label is a stable display string the canvas can render directly.
// For workspace:<id> it's "Workspace (<short-id>)"; for team:<id>
// it's "Team (<short-id>)"; org/custom carry the raw suffix.
Label string `json:"label"`
}
// NamespacesResponse is the body of GET v2/namespaces.
type NamespacesResponse struct {
Readable []NamespaceView `json:"readable"`
Writable []NamespaceView `json:"writable"`
}
// Namespaces handles GET /workspaces/:id/v2/namespaces.
func (h *MemoriesV2Handler) Namespaces(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
ctx := c.Request.Context()
readable, err := h.resolver.ReadableNamespaces(ctx, workspaceID)
if err != nil {
log.Printf("v2/namespaces readable error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve readable namespaces"})
return
}
writable, err := h.resolver.WritableNamespaces(ctx, workspaceID)
if err != nil {
log.Printf("v2/namespaces writable error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve writable namespaces"})
return
}
c.JSON(http.StatusOK, NamespacesResponse{
Readable: namespacesToViews(readable),
Writable: namespacesToViews(writable),
})
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /workspaces/:id/v2/memories
//
// Search the plugin for memories visible to this workspace.
//
// Query params (all optional):
// - namespace: a single readable namespace to scope to. Omitted ⇒ all
// readable namespaces (dropdown's "All" mode).
// - q: full-text query string. Empty ⇒ recency-ordered listing.
// - kind: one of fact|summary|checkpoint. Empty ⇒ all kinds.
// - limit: max rows. Defaults to 50, clamped to 100. Matches the
// v1 endpoint's clamp shape (memories.go:memoryRecallMaxLimit).
//
// Server-side ACL invariant: the request is ALWAYS intersected with
// the resolver's readable set on the server. A canvas-supplied
// `namespace=foo:bar` that this workspace can't read returns an empty
// list, NOT 403 — the canvas dropdown is built from /v2/namespaces
// so a forbidden value is a stale-cache bug, not malice. Existence
// non-inference: the empty result for a namespace you can't read is
// indistinguishable from that of a readable namespace with no
// matches, same as the wsAuth-protected v1 endpoints.
// ─────────────────────────────────────────────────────────────────────────────
const memoriesV2DefaultLimit = 50
const memoriesV2MaxLimit = 100
// Search handles GET /workspaces/:id/v2/memories.
func (h *MemoriesV2Handler) Search(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
ctx := c.Request.Context()
requestedNS := c.Query("namespace")
query := c.Query("q")
kindStr := c.Query("kind")
limit := parseLimit(c.Query("limit"))
// Resolve the readable set, then intersect the request.
// IntersectReadable handles both the empty-request case (return
// all readable) and the explicit-namespace case (return [ns] iff
// readable, else []).
var requested []string
if requestedNS != "" {
requested = []string{requestedNS}
}
scopedNamespaces, err := h.resolver.IntersectReadable(ctx, workspaceID, requested)
if err != nil {
log.Printf("v2/memories intersect error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to resolve namespaces"})
return
}
// Empty after intersection — caller asked for a namespace they
// can't read, OR they have no readable namespaces at all. Return
// [] (not 404) so the canvas can render its empty-state without
// special-casing.
if len(scopedNamespaces) == 0 {
c.JSON(http.StatusOK, MemoriesResponse{Memories: []MemoryView{}})
return
}
req := contract.SearchRequest{
Namespaces: scopedNamespaces,
Query: query,
Limit: limit,
}
if kindStr != "" {
req.Kinds = []contract.MemoryKind{contract.MemoryKind(kindStr)}
}
resp, err := h.plugin.Search(ctx, req)
if err != nil {
log.Printf("v2/memories plugin error workspace=%s: %v", workspaceID, err)
c.JSON(http.StatusBadGateway, gin.H{"error": "memory plugin search failed"})
return
}
out := MemoriesResponse{Memories: make([]MemoryView, 0, len(resp.Memories))}
for _, m := range resp.Memories {
out.Memories = append(out.Memories, memoryToView(m))
}
c.JSON(http.StatusOK, out)
}
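Note on the ACL intersection used by Search: the contract described in the comments (empty request means all readable, an explicit namespace is kept only if it is readable) can be sketched as a plain set intersection. `intersectReadable` below is a hypothetical stand-in; the real `IntersectReadable` is a method on the namespace resolver and its implementation is not shown in this diff.

```go
package main

import "fmt"

// intersectReadable returns the namespaces a search may touch.
//   - requested empty: every readable namespace (the dropdown's "All" mode)
//   - requested set:   only the requested entries that are also readable
//
// The result is never nil, so callers can test len() and encode it as [].
func intersectReadable(readable, requested []string) []string {
	if len(requested) == 0 {
		return append([]string{}, readable...)
	}
	allowed := make(map[string]struct{}, len(readable))
	for _, ns := range readable {
		allowed[ns] = struct{}{}
	}
	out := []string{}
	for _, ns := range requested {
		if _, ok := allowed[ns]; ok {
			out = append(out, ns)
		}
	}
	return out
}

func main() {
	readable := []string{"workspace:ws-a", "team:t-1"}

	fmt.Println(intersectReadable(readable, nil))                  // all readable
	fmt.Println(intersectReadable(readable, []string{"team:t-1"})) // the one requested
	fmt.Println(intersectReadable(readable, []string{"org:other"})) // nothing
}
```

The third call is the stale-dropdown case: the handler sees an empty slice and returns an empty memories list without calling the plugin.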
// ─────────────────────────────────────────────────────────────────────────────
// DELETE /workspaces/:id/v2/memories/:memoryId
//
// Forget a memory. The plugin enforces its own ownership model — we
// pass `requested_by_namespace = workspace:<id>` so the audit trail
// records who initiated the forget; the plugin's ACL gate decides
// whether the deletion is allowed.
//
// 404 (not 403) on a missing or non-owned memory: existence-non-
// inferring response, matches the v1 DELETE in memories.go.
// ─────────────────────────────────────────────────────────────────────────────
// Forget handles DELETE /workspaces/:id/v2/memories/:memoryId.
func (h *MemoriesV2Handler) Forget(c *gin.Context) {
if err := h.available(); err != nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": err.Error()})
return
}
workspaceID := c.Param("id")
memoryID := c.Param("memoryId")
ctx := c.Request.Context()
if memoryID == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "memoryId is required"})
return
}
body := contract.ForgetRequest{
RequestedByNamespace: "workspace:" + workspaceID,
}
if err := h.plugin.ForgetMemory(ctx, memoryID, body); err != nil {
// Map plugin not_found → 404. Anything else is upstream error.
var ce *contract.Error
if errors.As(err, &ce) && ce.Code == contract.ErrorCodeNotFound {
c.JSON(http.StatusNotFound, gin.H{"error": "memory not found"})
return
}
log.Printf("v2/memories forget error workspace=%s memory=%s: %v", workspaceID, memoryID, err)
c.JSON(http.StatusBadGateway, gin.H{"error": "memory plugin delete failed"})
return
}
c.JSON(http.StatusOK, gin.H{"status": "deleted"})
}
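Note on the status mapping in Forget: `errors.As` extracts a typed error even when it has been wrapped, so the 404 mapping survives any context added on the way up. The sketch below uses a hypothetical `pluginError` type and the string code `"not_found"`; the real `contract.Error` and `contract.ErrorCodeNotFound` definitions are not part of this diff, so both are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// pluginError is a hypothetical stand-in for the plugin's typed error.
type pluginError struct {
	Code    string
	Message string
}

func (e *pluginError) Error() string { return e.Code + ": " + e.Message }

// wrapForget adds call-site context the way a client layer might.
func wrapForget(memoryID string, err error) error {
	return fmt.Errorf("forget %s: %w", memoryID, err)
}

// forgetStatus maps a plugin result to the HTTP status the handler returns:
// nil → 200, typed not_found → 404, everything else → 502.
func forgetStatus(err error) int {
	if err == nil {
		return http.StatusOK
	}
	var pe *pluginError
	if errors.As(err, &pe) && pe.Code == "not_found" {
		return http.StatusNotFound
	}
	return http.StatusBadGateway
}

func main() {
	notFound := &pluginError{Code: "not_found", Message: "no such memory"}

	fmt.Println(forgetStatus(nil))                          // 200
	fmt.Println(forgetStatus(notFound))                     // 404
	fmt.Println(forgetStatus(wrapForget("m-1", notFound)))  // 404
	fmt.Println(forgetStatus(errors.New("connection reset"))) // 502
}
```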
// ─────────────────────────────────────────────────────────────────────────────
// View shaping helpers
// ─────────────────────────────────────────────────────────────────────────────
// MemoryView is the canvas-facing shape of a v2 memory record. The raw
// contract.Memory carries internal fields we don't expose (raw
// `propagation` blob); MemoryView strips it to what the Memory tab
// renders.
type MemoryView struct {
ID string `json:"id"`
Namespace string `json:"namespace"`
Content string `json:"content"`
Kind contract.MemoryKind `json:"kind"`
Source contract.MemorySource `json:"source"`
Pin bool `json:"pin"`
ExpiresAt *time.Time `json:"expires_at,omitempty"`
CreatedAt time.Time `json:"created_at"`
// Score is the plugin's similarity score (1.0 = exact); only
// populated when ?q= is set and the plugin supports embedding.
Score *float64 `json:"score,omitempty"`
// SourceWorkspaceID is parsed out of `propagation.source_workspace_id`
// when present (cross-workspace propagation) — lets the canvas
// render a "from <peer>" badge so users can tell their own writes
// apart from team-shared memory.
SourceWorkspaceID string `json:"source_workspace_id,omitempty"`
}
// MemoriesResponse is the body of GET v2/memories.
type MemoriesResponse struct {
Memories []MemoryView `json:"memories"`
}
func memoryToView(m contract.Memory) MemoryView {
v := MemoryView{
ID: m.ID,
Namespace: m.Namespace,
Content: m.Content,
Kind: m.Kind,
Source: m.Source,
Pin: m.Pin,
ExpiresAt: m.ExpiresAt,
CreatedAt: m.CreatedAt,
Score: m.Score,
}
if m.Propagation != nil {
// `source_workspace_id` is a propagation contract field
// (RFC #2728 §5). Plugin emits it on writes that originated
// from a different workspace. Best-effort string extraction —
// don't fail rendering if shape drifts.
if raw, ok := m.Propagation["source_workspace_id"]; ok {
if s, ok := raw.(string); ok && s != "" {
v.SourceWorkspaceID = s
}
}
}
return v
}
// namespacesToViews converts resolver namespaces into UI-friendly
// views. Input order is preserved; no sorting happens here, so any
// workspace → team → org → custom grouping must come from the resolver
// or the canvas.
func namespacesToViews(in []namespace.Namespace) []NamespaceView {
views := make([]NamespaceView, 0, len(in))
for _, n := range in {
views = append(views, NamespaceView{
Name: n.Name,
Kind: n.Kind,
Label: namespaceLabel(n.Name, n.Kind),
})
}
return views
}
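Note on ordering: `namespacesToViews` as written appends in input order and does not sort. If the workspace → team → org → custom grouping, then by name, were to be enforced server-side, one possible sketch uses `sort.SliceStable` with a rank table. The `nsView` type and the kind strings below are hypothetical; the actual `contract.NamespaceKind` values are not shown in this diff.

```go
package main

import (
	"fmt"
	"sort"
)

// nsView is a trimmed, hypothetical version of NamespaceView.
type nsView struct {
	Name string
	Kind string
}

// kindRank assigns the dropdown group order. The string values are assumed.
var kindRank = map[string]int{"workspace": 0, "team": 1, "org": 2, "custom": 3}

// rankOf places unknown kinds after every known kind.
func rankOf(kind string) int {
	if r, ok := kindRank[kind]; ok {
		return r
	}
	return len(kindRank)
}

// sortViews orders views by kind rank, then by name within a kind.
func sortViews(views []nsView) {
	sort.SliceStable(views, func(i, j int) bool {
		ri, rj := rankOf(views[i].Kind), rankOf(views[j].Kind)
		if ri != rj {
			return ri < rj
		}
		return views[i].Name < views[j].Name
	})
}

func main() {
	views := []nsView{
		{Name: "org:acme", Kind: "org"},
		{Name: "team:t-2", Kind: "team"},
		{Name: "workspace:b", Kind: "workspace"},
		{Name: "team:t-1", Kind: "team"},
		{Name: "custom:x", Kind: "custom"},
	}
	sortViews(views)
	for _, v := range views {
		fmt.Println(v.Name)
	}
}
```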
// namespaceLabel renders a human-friendly label for a namespace. The
// canvas displays this directly; we keep the formatting server-side
// so the shape stays consistent across UIs (canvas, future TUI, etc.).
//
// Format:
// workspace:abc-123 → "Workspace (abc-123)" (UUID short-prefixed)
// team:t-1 → "Team (t-1)"
// org:acme → "Org (acme)"
// custom:foo → "foo" (operator-defined; raw)
func namespaceLabel(name string, kind contract.NamespaceKind) string {
suffix := ""
if i := indexOfColon(name); i >= 0 && i+1 < len(name) {
suffix = name[i+1:]
}
switch kind {
case contract.NamespaceKindWorkspace:
return "Workspace (" + shortID(suffix) + ")"
case contract.NamespaceKindTeam:
return "Team (" + shortID(suffix) + ")"
case contract.NamespaceKindOrg:
return "Org (" + suffix + ")"
case contract.NamespaceKindCustom:
// Custom namespaces are operator-defined; surface the raw
// suffix so they can label them however they want.
if suffix == "" {
return name
}
return suffix
default:
return name
}
}
// shortID truncates a UUID-like string to the first 8 chars so the
// dropdown stays readable. Keeps the full id available via the
// `name` field for click-to-copy / debugging.
func shortID(s string) string {
if len(s) <= 8 {
return s
}
return s[:8]
}
// indexOfColon is strings.IndexByte without the import, kept inline so
// the helper stays trivially auditable next to namespaceLabel.
func indexOfColon(s string) int {
for i := 0; i < len(s); i++ {
if s[i] == ':' {
return i
}
}
return -1
}
// parseLimit validates the ?limit= query value. Defaults +
// clamps mirror memoriesV2DefaultLimit / memoriesV2MaxLimit.
func parseLimit(raw string) int {
if raw == "" {
return memoriesV2DefaultLimit
}
n, err := strconv.Atoi(raw)
if err != nil || n <= 0 {
return memoriesV2DefaultLimit
}
if n > memoriesV2MaxLimit {
return memoriesV2MaxLimit
}
return n
}
@@ -0,0 +1,669 @@
package handlers
// memories_v2_test.go — comprehensive coverage for the Memory v2
// canvas-facing HTTP surface. Pinned shape:
//
// - 503 path when plugin unwired (every route)
// - GET /v2/namespaces success + readable/writable propagation
// - GET /v2/namespaces error path (resolver failure on either call)
// - GET /v2/memories: empty intersection, namespace passthrough,
// query+kind+limit propagation, plugin error mapping
// - DELETE /v2/memories/:id: success, plugin not_found→404, other
// plugin errors→502, missing memoryId→400
// - View shaping: namespaceLabel for all four kinds + truncation,
// memoryToView with/without propagation source, parseLimit edge
// cases (default, negative, zero, over-cap, non-numeric)
//
// Tests use the same `memoryPluginAPI` / `namespaceResolverAPI` fakes
// the MCP v2 tests use so we don't spin up a real plugin server.
import (
"context"
"encoding/json"
"errors"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/contract"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/memory/namespace"
"github.com/gin-gonic/gin"
)
// ─────────────────────────────────────────────────────────────────────────────
// Fakes
// ─────────────────────────────────────────────────────────────────────────────
type fakePlugin struct {
searchResp *contract.SearchResponse
searchErr error
searchReq contract.SearchRequest // captured for assertion
forgetErr error
forgetID string
forgetReq contract.ForgetRequest
}
func (f *fakePlugin) CommitMemory(ctx context.Context, ns string, body contract.MemoryWrite) (*contract.MemoryWriteResponse, error) {
return nil, errors.New("not implemented in fake")
}
func (f *fakePlugin) Search(ctx context.Context, body contract.SearchRequest) (*contract.SearchResponse, error) {
f.searchReq = body
if f.searchErr != nil {
return nil, f.searchErr
}
return f.searchResp, nil
}
func (f *fakePlugin) ForgetMemory(ctx context.Context, id string, body contract.ForgetRequest) error {
f.forgetID = id
f.forgetReq = body
return f.forgetErr
}
type fakeNSResolver struct {
readable []namespace.Namespace
readableErr error
writable []namespace.Namespace
writableErr error
intersect []string
intersectErr error
intersectIn []string // captured
}
func (f *fakeNSResolver) ReadableNamespaces(ctx context.Context, ws string) ([]namespace.Namespace, error) {
return f.readable, f.readableErr
}
func (f *fakeNSResolver) WritableNamespaces(ctx context.Context, ws string) ([]namespace.Namespace, error) {
return f.writable, f.writableErr
}
func (f *fakeNSResolver) CanWrite(ctx context.Context, ws, ns string) (bool, error) {
return true, nil
}
func (f *fakeNSResolver) IntersectReadable(ctx context.Context, ws string, requested []string) ([]string, error) {
f.intersectIn = requested
return f.intersect, f.intersectErr
}
// ─────────────────────────────────────────────────────────────────────────────
// Test helpers
// ─────────────────────────────────────────────────────────────────────────────
func init() {
gin.SetMode(gin.TestMode)
}
// newWiredHandler returns a handler with both the fake plugin + fake
// resolver attached. Tests that need the unwired (503) path use
// NewMemoriesV2Handler() directly.
func newWiredHandler(p *fakePlugin, r *fakeNSResolver) *MemoriesV2Handler {
return NewMemoriesV2Handler().withMemoryV2APIs(p, r)
}
func doRequest(t *testing.T, h *MemoriesV2Handler, method, path string, params gin.Params) *httptest.ResponseRecorder {
t.Helper()
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Params = params
req := httptest.NewRequest(method, path, nil)
c.Request = req
switch {
case method == http.MethodGet && strings.HasSuffix(path, "/v2/namespaces"):
h.Namespaces(c)
case method == http.MethodGet && strings.Contains(path, "/v2/memories"):
h.Search(c)
case method == http.MethodDelete:
h.Forget(c)
default:
t.Fatalf("doRequest: don't know how to dispatch %s %s", method, path)
}
return rec
}
func mustJSON(t *testing.T, body []byte, out interface{}) {
t.Helper()
if err := json.Unmarshal(body, out); err != nil {
t.Fatalf("json decode: %v\nbody=%s", err, string(body))
}
}
// ─────────────────────────────────────────────────────────────────────────────
// 503 — plugin unwired
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_PluginUnwired_All503(t *testing.T) {
h := NewMemoriesV2Handler() // no WithMemoryV2 / withMemoryV2APIs
cases := []struct {
name string
method string
path string
params gin.Params
}{
{"namespaces", http.MethodGet, "/workspaces/ws-a/v2/namespaces", gin.Params{{Key: "id", Value: "ws-a"}}},
{"search", http.MethodGet, "/workspaces/ws-a/v2/memories", gin.Params{{Key: "id", Value: "ws-a"}}},
{"forget", http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1", gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}}},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
rec := doRequest(t, h, tc.method, tc.path, tc.params)
if rec.Code != http.StatusServiceUnavailable {
t.Errorf("expected 503, got %d", rec.Code)
}
var body map[string]string
mustJSON(t, rec.Body.Bytes(), &body)
if !strings.Contains(body["error"], "MEMORY_PLUGIN_URL") {
t.Errorf("503 body missing operator hint, got: %q", body["error"])
}
})
}
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /v2/namespaces
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Namespaces_Success(t *testing.T) {
resolver := &fakeNSResolver{
readable: []namespace.Namespace{
{Name: "workspace:abc-1234-5678", Kind: contract.NamespaceKindWorkspace},
{Name: "team:t-99", Kind: contract.NamespaceKindTeam},
{Name: "org:acme", Kind: contract.NamespaceKindOrg},
{Name: "custom:special", Kind: contract.NamespaceKindCustom},
},
writable: []namespace.Namespace{
{Name: "workspace:abc-1234-5678", Kind: contract.NamespaceKindWorkspace},
},
}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
var body NamespacesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Readable) != 4 {
t.Errorf("expected 4 readable, got %d", len(body.Readable))
}
if len(body.Writable) != 1 {
t.Errorf("expected 1 writable, got %d", len(body.Writable))
}
// Label shaping pinned exactly — drift would silently break the
// dropdown rendering.
wantLabels := map[string]string{
"workspace:abc-1234-5678": "Workspace (abc-1234)",
"team:t-99": "Team (t-99)",
"org:acme": "Org (acme)",
"custom:special": "special",
}
for _, v := range body.Readable {
want, ok := wantLabels[v.Name]
if !ok {
t.Errorf("unexpected namespace name %q", v.Name)
continue
}
if v.Label != want {
t.Errorf("namespace %q: want label %q, got %q", v.Name, want, v.Label)
}
}
}
func TestMemoriesV2_Namespaces_ReadableError(t *testing.T) {
resolver := &fakeNSResolver{readableErr: errors.New("boom")}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
func TestMemoriesV2_Namespaces_WritableError(t *testing.T) {
resolver := &fakeNSResolver{
readable: []namespace.Namespace{},
writableErr: errors.New("boom"),
}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/namespaces",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// GET /v2/memories — search path
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Search_NoReadableNamespaces_EmptyResult(t *testing.T) {
// Empty intersection (e.g. workspace just provisioned, plugin
// hasn't created namespaces yet, OR caller asked for ns they
// can't read). Expected: 200 with empty memories array, NOT 404.
resolver := &fakeNSResolver{intersect: []string{}}
plugin := &fakePlugin{searchResp: &contract.SearchResponse{Memories: []contract.Memory{}}}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d", rec.Code)
}
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if body.Memories == nil {
t.Error("Memories should be empty array, not nil — JSON would render null")
}
if len(body.Memories) != 0 {
t.Errorf("expected empty memories, got %d", len(body.Memories))
}
// Plugin must NOT be called when intersection is empty.
if plugin.searchReq.Namespaces != nil {
t.Error("plugin Search should not be called when intersection is empty")
}
}
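Note on the nil check in the test above: `encoding/json` encodes a nil slice as `null` and an empty non-nil slice as `[]`, which is why the handler builds `[]MemoryView{}` explicitly. A minimal standard-library sketch, with a hypothetical `memoriesBody` type standing in for `MemoriesResponse`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// memoriesBody is a simplified stand-in for MemoriesResponse.
type memoriesBody struct {
	Memories []string `json:"memories"`
}

// encode marshals the body and returns the JSON text, or "" on error.
func encode(memories []string) string {
	b, err := json.Marshal(memoriesBody{Memories: memories})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	var unset []string // nil slice

	fmt.Println(encode(unset))      // {"memories":null}
	fmt.Println(encode([]string{})) // {"memories":[]}
}
```

A client that iterates over `memories` handles `[]` without special-casing, whereas `null` needs a guard.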
func TestMemoriesV2_Search_FullPath_NamespaceQueryKindLimit(t *testing.T) {
expiresAt := time.Now().Add(24 * time.Hour)
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
score := 0.87
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{
ID: "m-1",
Namespace: "workspace:ws-a",
Content: "fact one",
Kind: contract.MemoryKindFact,
Source: contract.MemorySourceAgent,
Pin: true,
ExpiresAt: &expiresAt,
CreatedAt: time.Now(),
Score: &score,
Propagation: map[string]interface{}{
"source_workspace_id": "ws-peer-42",
},
},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Params = gin.Params{{Key: "id", Value: "ws-a"}}
c.Request = httptest.NewRequest(http.MethodGet,
"/workspaces/ws-a/v2/memories?namespace=workspace:ws-a&q=hello&kind=fact&limit=10", nil)
h.Search(c)
if rec.Code != 200 {
t.Fatalf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
// Resolver received the requested namespace as a single-element list
if len(resolver.intersectIn) != 1 || resolver.intersectIn[0] != "workspace:ws-a" {
t.Errorf("resolver.IntersectReadable received %v, want [workspace:ws-a]", resolver.intersectIn)
}
// Plugin received query + kind + limit propagated through
if plugin.searchReq.Query != "hello" {
t.Errorf("plugin.Query=%q, want hello", plugin.searchReq.Query)
}
if len(plugin.searchReq.Kinds) != 1 || plugin.searchReq.Kinds[0] != contract.MemoryKindFact {
t.Errorf("plugin.Kinds=%v, want [fact]", plugin.searchReq.Kinds)
}
if plugin.searchReq.Limit != 10 {
t.Errorf("plugin.Limit=%d, want 10", plugin.searchReq.Limit)
}
// Response shape — pin/expires_at/score/source_workspace_id all
// surfaced into MemoryView so the canvas doesn't have to dig
// through propagation map.
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Memories) != 1 {
t.Fatalf("expected 1 memory, got %d", len(body.Memories))
}
m := body.Memories[0]
if !m.Pin {
t.Error("Pin not propagated")
}
if m.ExpiresAt == nil {
t.Error("ExpiresAt not propagated")
}
if m.Score == nil || *m.Score != 0.87 {
t.Errorf("Score=%v, want 0.87", m.Score)
}
if m.SourceWorkspaceID != "ws-peer-42" {
t.Errorf("SourceWorkspaceID=%q, want ws-peer-42", m.SourceWorkspaceID)
}
}
func TestMemoriesV2_Search_NoNamespaceQuery_AllReadable(t *testing.T) {
// No ?namespace= → resolver.IntersectReadable receives nil (empty
// requested) and returns ALL readable. Plugin gets full set.
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a", "team:t-1"}}
plugin := &fakePlugin{searchResp: &contract.SearchResponse{}}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d", rec.Code)
}
if resolver.intersectIn != nil {
t.Errorf("requested should be nil for unscoped query, got %v", resolver.intersectIn)
}
if len(plugin.searchReq.Namespaces) != 2 {
t.Errorf("plugin.Namespaces=%v, want both readable", plugin.searchReq.Namespaces)
}
}
func TestMemoriesV2_Search_IntersectError(t *testing.T) {
resolver := &fakeNSResolver{intersectErr: errors.New("db down")}
h := newWiredHandler(&fakePlugin{}, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusInternalServerError {
t.Errorf("expected 500, got %d", rec.Code)
}
}
func TestMemoriesV2_Search_PluginError(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{searchErr: errors.New("plugin down")}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502 (plugin error), got %d", rec.Code)
}
}
func TestMemoriesV2_Search_PropagationMissing_NoSourceWorkspaceID(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{ID: "m-1", Namespace: "workspace:ws-a", Content: "no propagation"},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
if len(body.Memories) != 1 {
t.Fatalf("expected 1 memory, got %d", len(body.Memories))
}
if got := body.Memories[0].SourceWorkspaceID; got != "" {
t.Errorf("SourceWorkspaceID should be empty when propagation is nil, got %q", got)
}
}
func TestMemoriesV2_Search_PropagationWrongType_DoesNotPanic(t *testing.T) {
resolver := &fakeNSResolver{intersect: []string{"workspace:ws-a"}}
plugin := &fakePlugin{
searchResp: &contract.SearchResponse{
Memories: []contract.Memory{
{
ID: "m-1",
Content: "wrong-type propagation",
Propagation: map[string]interface{}{
"source_workspace_id": 12345, // int, not string
},
},
},
},
}
h := newWiredHandler(plugin, resolver)
rec := doRequest(t, h, http.MethodGet, "/workspaces/ws-a/v2/memories",
gin.Params{{Key: "id", Value: "ws-a"}})
if rec.Code != 200 {
t.Fatalf("expected 200 (graceful), got %d", rec.Code)
}
var body MemoriesResponse
mustJSON(t, rec.Body.Bytes(), &body)
// Wrong-typed prop entry → empty SourceWorkspaceID, no panic.
if len(body.Memories) != 1 {
t.Fatalf("expected 1 memory, got %d", len(body.Memories))
}
if body.Memories[0].SourceWorkspaceID != "" {
t.Errorf("expected empty SourceWorkspaceID for non-string propagation, got %q", body.Memories[0].SourceWorkspaceID)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// DELETE /v2/memories/:memoryId
// ─────────────────────────────────────────────────────────────────────────────
func TestMemoriesV2_Forget_Success(t *testing.T) {
plugin := &fakePlugin{} // forgetErr nil
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/mem-42",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "mem-42"}})
if rec.Code != 200 {
t.Errorf("expected 200, got %d body=%s", rec.Code, rec.Body.String())
}
if plugin.forgetID != "mem-42" {
t.Errorf("plugin received memoryID=%q, want mem-42", plugin.forgetID)
}
if plugin.forgetReq.RequestedByNamespace != "workspace:ws-a" {
t.Errorf("requested_by_namespace=%q, want workspace:ws-a", plugin.forgetReq.RequestedByNamespace)
}
}
func TestMemoriesV2_Forget_PluginNotFound_Maps404(t *testing.T) {
plugin := &fakePlugin{
forgetErr: &contract.Error{Code: contract.ErrorCodeNotFound, Message: "no such memory"},
}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusNotFound {
t.Errorf("expected 404, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_PluginOtherError_Maps502(t *testing.T) {
plugin := &fakePlugin{
forgetErr: &contract.Error{Code: contract.ErrorCodeInternal, Message: "db dead"},
}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_NonContractError_Maps502(t *testing.T) {
// A raw error (e.g. transport failure) — not a contract.Error —
// also bubbles up as 502.
plugin := &fakePlugin{forgetErr: errors.New("connection reset")}
h := newWiredHandler(plugin, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/m-1",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: "m-1"}})
if rec.Code != http.StatusBadGateway {
t.Errorf("expected 502, got %d", rec.Code)
}
}
func TestMemoriesV2_Forget_MissingMemoryID_400(t *testing.T) {
h := newWiredHandler(&fakePlugin{}, &fakeNSResolver{})
rec := doRequest(t, h, http.MethodDelete, "/workspaces/ws-a/v2/memories/",
gin.Params{{Key: "id", Value: "ws-a"}, {Key: "memoryId", Value: ""}})
if rec.Code != http.StatusBadRequest {
t.Errorf("expected 400, got %d", rec.Code)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// View-shaping unit tests — pin individual helpers
// ─────────────────────────────────────────────────────────────────────────────
func TestNamespaceLabel_AllKinds(t *testing.T) {
cases := []struct {
name string
kind contract.NamespaceKind
want string
}{
{"workspace:abcdefghij", contract.NamespaceKindWorkspace, "Workspace (abcdefgh)"}, // truncated to 8
{"workspace:abc", contract.NamespaceKindWorkspace, "Workspace (abc)"}, // shorter than 8, kept as-is
{"team:t-99", contract.NamespaceKindTeam, "Team (t-99)"},
{"org:acme", contract.NamespaceKindOrg, "Org (acme)"},
{"custom:my-ns", contract.NamespaceKindCustom, "my-ns"},
{"custom:", contract.NamespaceKindCustom, "custom:"}, // empty suffix → fallback to raw name
{"weird-no-colon", contract.NamespaceKindWorkspace, "Workspace ()"},
{"unknown:x", contract.NamespaceKind("future"), "unknown:x"}, // unknown kind → fallback to raw name
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := namespaceLabel(tc.name, tc.kind)
if got != tc.want {
t.Errorf("namespaceLabel(%q, %q) = %q, want %q", tc.name, tc.kind, got, tc.want)
}
})
}
}
func TestParseLimit(t *testing.T) {
cases := []struct {
raw string
want int
}{
{"", memoriesV2DefaultLimit},
{"10", 10},
{"0", memoriesV2DefaultLimit}, // ≤0 → default, not error
{"-5", memoriesV2DefaultLimit}, // negative → default
{"abc", memoriesV2DefaultLimit}, // non-numeric → default
{"99999", memoriesV2MaxLimit}, // over cap → clamped
{"100", memoriesV2MaxLimit}, // exactly cap → kept
{"99", 99}, // just under cap → kept
}
for _, tc := range cases {
t.Run("raw="+tc.raw, func(t *testing.T) {
if got := parseLimit(tc.raw); got != tc.want {
t.Errorf("parseLimit(%q) = %d, want %d", tc.raw, got, tc.want)
}
})
}
}
func TestMemoryToView_AllFieldsPropagated(t *testing.T) {
now := time.Now()
exp := now.Add(time.Hour)
score := 0.95
m := contract.Memory{
ID: "m-1",
Namespace: "team:t-1",
Content: "hello",
Kind: contract.MemoryKindSummary,
Source: contract.MemorySourceUser,
Pin: true,
ExpiresAt: &exp,
CreatedAt: now,
Score: &score,
Propagation: map[string]interface{}{
"source_workspace_id": "ws-other",
},
}
v := memoryToView(m)
if v.ID != m.ID || v.Namespace != m.Namespace || v.Content != m.Content {
t.Errorf("basic fields: %+v", v)
}
if v.Kind != contract.MemoryKindSummary || v.Source != contract.MemorySourceUser {
t.Errorf("kind/source: %+v", v)
}
if !v.Pin || v.ExpiresAt == nil || v.Score == nil || *v.Score != 0.95 {
t.Errorf("pin/expires/score: %+v", v)
}
if v.SourceWorkspaceID != "ws-other" {
t.Errorf("SourceWorkspaceID=%q, want ws-other", v.SourceWorkspaceID)
}
}
func TestNamespacesToViews_PreservesOrder(t *testing.T) {
in := []namespace.Namespace{
{Name: "team:t1", Kind: contract.NamespaceKindTeam},
{Name: "workspace:w1", Kind: contract.NamespaceKindWorkspace},
}
out := namespacesToViews(in)
if len(out) != 2 {
t.Fatalf("len=%d", len(out))
}
// Resolver determines order; we just preserve it. (Sorting can be
// added at the resolver layer if the canvas needs it.)
if out[0].Name != "team:t1" || out[1].Name != "workspace:w1" {
t.Errorf("order not preserved: %+v", out)
}
}
func TestNamespacesToViews_EmptyInput_EmptySlice(t *testing.T) {
out := namespacesToViews(nil)
if out == nil {
t.Error("expected empty slice, not nil — JSON-marshals as null otherwise")
}
if len(out) != 0 {
t.Errorf("expected len 0, got %d", len(out))
}
}
func TestIndexOfColon(t *testing.T) {
cases := []struct {
s string
want int
}{
{"abc:def", 3},
{":foo", 0},
{"nocolon", -1},
{"", -1},
{"a:b:c", 1}, // first colon only
}
for _, tc := range cases {
if got := indexOfColon(tc.s); got != tc.want {
t.Errorf("indexOfColon(%q) = %d, want %d", tc.s, got, tc.want)
}
}
}
func TestWithMemoryV2_FluentReturnsReceiver(t *testing.T) {
// WithMemoryV2 is the production wiring path (takes *client.Client +
// *namespace.Resolver). withMemoryV2APIs is the test path. The
// production call is structural — assigns the two fields and
// returns the receiver — but we still want a 100% coverage gate
// to catch a future refactor that accidentally drops the fluent
// return (breaking the boot-time chain in router.go).
//
// We can't pass nil for the typed pointers and call available()
// here: in Go, an interface value holding a nil typed pointer is
// itself non-nil, so `available()` would not detect the handler as
// "unwired". The unwired-plugin behaviour is exhaustively
// covered by TestMemoriesV2_PluginUnwired_All503; this test just
// pins the fluent contract.
h := NewMemoriesV2Handler()
got := h.WithMemoryV2(nil, nil)
if got != h {
t.Error("WithMemoryV2 must return receiver for fluent chaining")
}
}
func TestShortID(t *testing.T) {
cases := map[string]string{
"": "",
"short": "short",
"exactly8": "exactly8",
"longer-than-eight": "longer-t",
"abc-1234-5678-90ab": "abc-1234",
}
for in, want := range cases {
if got := shortID(in); got != want {
t.Errorf("shortID(%q) = %q, want %q", in, got, want)
}
}
}
@@ -7,6 +7,7 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"os"
@@ -79,7 +80,16 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
}
}
-ctxLookup := context.Background()
+// 5s timeout bounds the lookup independently of any HTTP request
+// context. createWorkspaceTree runs in goroutines spawned from the
+// /org/import handler, so plumbing the request context here would
+// cascade-cancel into provisionWorkspaceAuto and abort in-flight
+// EC2 provisioning if the client disconnected mid-import, which is
+// the wrong behaviour. A short bounded timeout protects the
+// per-row SELECT against a wedged DB without taking the
+// drop-everything-on-disconnect tradeoff.
+ctxLookup, cancelLookup := context.WithTimeout(context.Background(), 5*time.Second)
+defer cancelLookup()
// Idempotency: if a workspace with the same (parent_id, name) already
// exists, skip the INSERT + canvas_layouts + broadcast + provisioning.
// This is what makes /org/import safe to call multiple times — the
@@ -91,6 +101,15 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
// (parent exists, some children missing) backfill the missing children
// instead of either no-op'ing the whole subtree or duplicating the
// existing children.
//
// /org/import is ADDITIVE-ONLY, never destructive. Children present
// in the existing tree but absent from the new template are
// preserved (no DELETE on diff). The skip path also does NOT
// propagate updates to existing nodes: an initial_memory or schedule
// that a re-import adds to an existing workspace is silently dropped
// (the function bypasses seedInitialMemories, the schedule SQL, and
// channel config for skipped rows). To force-update an existing
// tree, delete it and re-import, or use a future /org/sync route.
existingID, existing, lookupErr := h.lookupExistingChild(ctxLookup, ws.Name, parentID)
if lookupErr != nil {
return fmt.Errorf("idempotency check for %s: %w", ws.Name, lookupErr)
@@ -605,6 +624,12 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
//
// On sql.ErrNoRows: returns ("", false, nil) — caller should INSERT.
// On a real DB error: returns ("", false, err) — caller propagates.
//
// errors.Is is wrap-safe. A future caller wrapping the error
// (database/sql can wrap driver errors with %w in some setups) would
// silently break an `err == sql.ErrNoRows` equality check, causing the
// no-rows path to fall through to the "real DB error" branch and
// abort the import. errors.Is unwraps.
func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, parentID *string) (string, bool, error) {
var existingID string
err := db.DB.QueryRowContext(ctx, `
@@ -614,7 +639,7 @@ func (h *OrgHandler) lookupExistingChild(ctx context.Context, name string, paren
AND status != 'removed'
LIMIT 1
`, name, parentID).Scan(&existingID)
-if err == sql.ErrNoRows {
+if errors.Is(err, sql.ErrNoRows) {
return "", false, nil
}
if err != nil {
@@ -2,7 +2,9 @@ package handlers
import (
"context"
"database/sql"
"errors"
"fmt"
"go/ast"
"go/parser"
"go/token"
@@ -123,6 +125,36 @@ func TestLookupExistingChild_DBError_Propagates(t *testing.T) {
}
}
// TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound — pins the
// wrap-safety of the errors.Is(err, sql.ErrNoRows) check. The previous
// `err == sql.ErrNoRows` equality would fall through to the
// "real DB error" branch on a wrapped no-rows error, aborting the
// import for what is in fact the no-rows happy path. Wrapping by the
// driver or by database/sql is currently a non-issue, but a future
// driver change or a caller that wraps the result via
// fmt.Errorf("…: %w", err) would silently break the equality check.
// errors.Is unwraps.
func TestLookupExistingChild_WrappedNoRows_TreatedAsNotFound(t *testing.T) {
mock := setupTestDB(t)
parent := "parent-1"
wrapped := fmt.Errorf("driver-wrapped: %w", sql.ErrNoRows)
mock.ExpectQuery(`SELECT id FROM workspaces`).
WithArgs("Alpha", &parent).
WillReturnError(wrapped)
h := &OrgHandler{}
id, found, err := h.lookupExistingChild(context.Background(), "Alpha", &parent)
if err != nil {
t.Fatalf("expected wrapped no-rows to be treated as not-found (err=nil), got: %v", err)
}
if found {
t.Errorf("expected found=false on wrapped no-rows, got found=true")
}
if id != "" {
t.Errorf("expected empty id on wrapped no-rows, got %q", id)
}
}
// workspacesInsertRE matches a SQL literal that begins (after optional
// leading whitespace) with `INSERT INTO workspaces` followed by `(` —
// requiring the open-paren rules out lookalikes like
@@ -232,6 +232,20 @@ func Setup(hub *ws.Hub, broadcaster *events.Broadcaster, prov *provisioner.Provi
wsAuth.DELETE("/memories/:memoryId", memsh.Delete)
wsAuth.PATCH("/memories/:memoryId", memsh.Update)
// Memory v2 — canvas reads through the plugin so the Memory
// tab surfaces post-cutover state (memory_records) instead
// of the frozen agent_memories table that memsh.Search hits.
// Wired only when MEMORY_PLUGIN_URL is configured; absent
// plugin → endpoints return 503 with a clear hint instead
// of nil-deref crashing the canvas.
memv2 := handlers.NewMemoriesV2Handler()
if memBundle != nil {
memv2.WithMemoryV2(memBundle.Plugin, memBundle.Resolver)
}
wsAuth.GET("/v2/namespaces", memv2.Namespaces)
wsAuth.GET("/v2/memories", memv2.Search)
wsAuth.DELETE("/v2/memories/:memoryId", memv2.Forget)
// Approvals
apph := handlers.NewApprovalsHandler(broadcaster)
wsAuth.POST("/approvals", apph.Create)
+18
@@ -584,6 +584,24 @@ async def send_a2a_message(peer_id: str, message: str, source_workspace_id: str
else:
detail = "JSON-RPC error with no message"
return f"{_A2A_ERROR_PREFIX}{detail} [target={target_url}]"
elif data.get("status") == "queued" and data.get("delivery_mode") == "poll":
# Workspace-server's poll-mode short-circuit envelope
# (workspace-server/internal/handlers/a2a_proxy.go ~line 402).
# The peer is poll-mode and has no URL to dispatch to, so
# the server queued the message for the peer's next inbox
# poll instead of forwarding it. Delivery is acknowledged
# but pending consumption.
#
# Pre-fix this fell through to the "unexpected response
# shape" error path → callers logged false failures, then
# delegate_task retried, and the peer received duplicate
# delegations. Issue #2967.
method = data.get("method") or "message/send"
logger.info(
"send_a2a_message: queued for poll-mode peer (method=%s, target=%s)",
method, target_url,
)
return f"queued for poll-mode peer (method={method})"
return f"{_A2A_ERROR_PREFIX}unexpected response shape (no result, no error): {str(data)[:200]} [target={target_url}]"
except _TRANSIENT_HTTP_ERRORS as e:
last_exc = e
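The branch order in this hunk can be sketched in isolation. This is a minimal illustration, not the real `send_a2a_message` parser: `classify_a2a_response` and `A2A_ERROR_PREFIX` are hypothetical names, and the handling of the `result` key is an assumption since that part of the function is outside the hunk.

```python
from __future__ import annotations

from typing import Any

# Hypothetical stand-in for the module's _A2A_ERROR_PREFIX constant.
A2A_ERROR_PREFIX = "[A2A_ERROR] "


def classify_a2a_response(data: dict[str, Any]) -> str:
    """Map a decoded proxy response onto a string the agent can render."""
    if "result" in data:
        # Assumption: the real parser extracts text from the result object.
        return str(data["result"])
    if "error" in data:
        err = data["error"]
        detail = err.get("message") if isinstance(err, dict) else None
        return f"{A2A_ERROR_PREFIX}{detail or 'JSON-RPC error with no message'}"
    # Both keys must match, so a payload that merely carries
    # status="queued" still lands on the fallback below.
    if data.get("status") == "queued" and data.get("delivery_mode") == "poll":
        method = data.get("method") or "message/send"
        return f"queued for poll-mode peer (method={method})"
    return f"{A2A_ERROR_PREFIX}unexpected response shape: {str(data)[:200]}"
```

Because the queued case returns a string without the error prefix, a caller that retries only on prefixed results would not re-send the delegation.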
+10 -1
@@ -425,7 +425,16 @@ def _build_initialize_result() -> dict:
"tools": {"listChanged": False},
"experimental": {"claude/channel": {}},
},
-"serverInfo": {"name": "a2a-delegation", "version": "1.0.0"},
+# Identifier convention: this server is what users register with
+# `claude mcp add molecule -- molecule-mcp` (and similar across
+# other MCP hosts), so the canonical name is "molecule". Earlier
+# versions reported "a2a-delegation", which was accurate to the
+# original purpose but a mismatch with how operators actually name it.
+# Mismatch is harmless on tool routing (all MCP hosts dispatch
+# by the user-supplied registration name, NOT serverInfo.name)
+# but matters for any future Claude Code allowlist that gates
+# channel push by hardcoded server name (issue #2934).
+"serverInfo": {"name": "molecule", "version": "1.0.0"},
# Built per-call (not the module-level constant) so an operator
# who sets MOLECULE_MCP_POLL_TIMEOUT_SECS after import — e.g.
# via a wrapper script that exports then re-imports — sees
+32 -473
@@ -129,481 +129,40 @@ from a2a_tools_delegation import ( # noqa: E402 (import after the from-a2a_cli
)
async def _upload_chat_files(
client: httpx.AsyncClient,
paths: list[str],
workspace_id: str | None = None,
) -> tuple[list[dict], str | None]:
"""Upload local file paths through /workspaces/<self>/chat/uploads.
The platform stages each upload under /workspace/.molecule/chat-uploads
(an "allowed root" the canvas knows how to render via the Download
endpoint) and returns metadata the broadcast payload references.
Why we route through upload instead of just passing the agent's path:
the canvas's allowed-root list is /configs, /workspace, /home, /plugins
— files at /tmp or /root would be unreachable. Uploading copies the
bytes into an allowed root regardless of where the agent wrote them.
Returns (attachments, error). On any failure the caller should NOT
fire the notify — partial-attach would surface a half-rendered chip.
"""
if not paths:
return [], None
files_payload: list[tuple[str, tuple[str, bytes, str]]] = []
for p in paths:
if not isinstance(p, str) or not p:
return [], f"Error: invalid attachment path {p!r}"
if not os.path.isfile(p):
return [], f"Error: attachment not found: {p}"
try:
with open(p, "rb") as fh:
data = fh.read()
except OSError as e:
return [], f"Error reading {p}: {e}"
# Sniff mime from filename so the canvas can pick the right
# icon / preview / inline-image renderer. Pre-fix this was
# hardcoded application/octet-stream and chat_files.go's
# Upload trusts whatever Content-Type the multipart part
# carries — `mt := fh.Header.Get("Content-Type")` only falls
# back to extension-sniffing when the header is empty. So a
# hardcoded octet-stream meant every attachment lost its
# real type forever, breaking the canvas chip's icon logic.
mime_type, _ = mimetypes.guess_type(p)
if not mime_type:
mime_type = "application/octet-stream"
files_payload.append(("files", (os.path.basename(p), data, mime_type)))
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/chat/uploads",
files=files_payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
except Exception as e:
return [], f"Error uploading attachments: {e}"
if resp.status_code != 200:
return [], f"Error: chat/uploads returned {resp.status_code}: {resp.text[:200]}"
try:
body = resp.json()
except Exception as e:
return [], f"Error parsing upload response: {e}"
uploaded = body.get("files") or []
if not isinstance(uploaded, list) or len(uploaded) != len(paths):
return [], f"Error: upload returned {len(uploaded) if isinstance(uploaded, list) else 'invalid'} entries for {len(paths)} files"
return uploaded, None
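The mime-sniffing step in `_upload_chat_files` is small enough to show on its own. The sketch below assumes only the standard library; `multipart_part` is a hypothetical helper name, and the tuple layout mirrors the `files_payload` entries in the function above.

```python
from __future__ import annotations

import mimetypes
import os


def multipart_part(path: str, data: bytes) -> tuple[str, tuple[str, bytes, str]]:
    """Build one multipart entry with a Content-Type guessed from the filename."""
    mime_type, _encoding = mimetypes.guess_type(path)
    if not mime_type:
        # No usable extension: fall back to the generic binary type.
        mime_type = "application/octet-stream"
    return ("files", (os.path.basename(path), data, mime_type))
```

The guess is based on the filename only, so a file with a misleading extension would be labelled by that extension rather than by its contents.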
async def tool_send_message_to_user(
message: str,
attachments: list[str] | None = None,
workspace_id: str | None = None,
) -> str:
"""Send a message directly to the user's canvas chat via WebSocket.
Args:
message: The text to display in the user's chat. Required even
when sending attachments — set to a short caption like
"Here's the build output:" or "Done — see attached."
attachments: Optional list of absolute file paths inside this
container. Each is uploaded to the platform and rendered
in the canvas as a clickable download chip. Use this
instead of pasting paths in the message text — paths
render as plain text and the user can't click them.
Examples:
attachments=["/tmp/build-output.zip"]
attachments=["/workspace/report.pdf", "/workspace/data.csv"]
workspace_id: Optional. When the agent is registered in MULTIPLE
workspaces (external multi-workspace MCP path), this
selects which workspace's chat to deliver the message to —
should match the ``arrival_workspace_id`` of the inbound
message you're replying to so the user sees the reply in
the same canvas they typed in. Single-workspace agents
omit this; the message routes to the only registered
workspace.
"""
if not message:
return "Error: message is required"
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=60.0) as client:
uploaded, upload_err = await _upload_chat_files(
client, attachments or [], workspace_id=target_workspace_id,
)
if upload_err:
return upload_err
payload: dict = {"message": message}
if uploaded:
payload["attachments"] = uploaded
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/notify",
json=payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
if resp.status_code == 200:
if uploaded:
return f"Message sent to user with {len(uploaded)} attachment(s)"
return "Message sent to user"
return f"Error: platform returned {resp.status_code}"
except Exception as e:
return f"Error sending message: {e}"
async def tool_list_peers(source_workspace_id: str | None = None) -> str:
"""List all workspaces this agent can communicate with.
Behavior:
- ``source_workspace_id`` set → list peers of that one workspace.
- Unset, single-workspace mode → list peers of WORKSPACE_ID
(the legacy path, unchanged).
- Unset, multi-workspace mode (MOLECULE_WORKSPACES populated) →
aggregate across every registered workspace, prefixing each
peer with its source so the agent / user can see the full peer
surface in one call.
Side-effect: populates ``_peer_to_source`` so subsequent
``tool_delegate_task(target)`` auto-routes through the correct
sending workspace without the agent needing ``source_workspace_id``.
"""
sources: list[str]
aggregate = False
if source_workspace_id:
sources = [source_workspace_id]
else:
registered = list_registered_workspaces()
if len(registered) > 1:
sources = registered
aggregate = True
else:
sources = [WORKSPACE_ID]
all_peers: list[tuple[str, dict]] = [] # (source, peer_record)
diagnostics: list[tuple[str, str]] = [] # (source, diagnostic)
for src in sources:
peers, diagnostic = await get_peers_with_diagnostic(source_workspace_id=src)
if peers:
for p in peers:
all_peers.append((src, p))
elif diagnostic is not None:
diagnostics.append((src, diagnostic))
if not all_peers:
if diagnostics:
joined = "; ".join(f"[{src[:8]}] {d}" for src, d in diagnostics)
return f"No peers found. {joined}"
return (
"You have no peers in the platform registry. "
"(No parent, no children, no siblings registered.)"
)
lines = []
for src, p in all_peers:
status = p.get("status", "unknown")
role = p.get("role", "")
peer_id = p["id"]
# Cache name for use in delegate_task
_peer_names[peer_id] = p["name"]
# Cache the source workspace so tool_delegate_task auto-routes
_peer_to_source[peer_id] = src
if aggregate:
lines.append(
f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role}, via: {src[:8]})"
)
else:
lines.append(f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role})")
return "\n".join(lines)
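The listing loop does two things at once: it formats each peer and it records which workspace the peer was discovered through. A rough sketch of that pairing follows, assuming peer records are plain dicts with `id` and `name` keys; `format_peers` and its `peer_to_source` argument are illustrative names standing in for the module-level cache.

```python
from __future__ import annotations


def format_peers(
    all_peers: list[tuple[str, dict]],
    aggregate: bool,
    peer_to_source: dict[str, str],
) -> str:
    """Render peers one per line and remember each peer's source workspace."""
    lines = []
    for src, peer in all_peers:
        peer_id = peer["id"]
        # Side effect: later delegations can look up the sending workspace.
        peer_to_source[peer_id] = src
        via = f", via: {src[:8]}" if aggregate else ""
        lines.append(
            f"- {peer['name']} (ID: {peer_id}, "
            f"status: {peer.get('status', 'unknown')}, "
            f"role: {peer.get('role', '')}{via})"
        )
    return "\n".join(lines)
```

Only the aggregate view carries the `via:` suffix, which keeps single-workspace output unchanged.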
async def tool_get_workspace_info(source_workspace_id: str | None = None) -> str:
"""Get this workspace's own info.
``source_workspace_id`` selects which registered workspace to
introspect when the agent is registered into multiple workspaces.
Unset → falls back to module-level WORKSPACE_ID.
"""
info = await get_workspace_info(source_workspace_id=source_workspace_id)
return json.dumps(info, indent=2)
async def tool_commit_memory(
content: str,
scope: str = "LOCAL",
source_workspace_id: str | None = None,
) -> str:
"""Save important information to persistent memory.
GLOBAL scope is writable only by root workspaces (tier == 0).
RBAC memory.write permission is required for all scope levels.
The source workspace_id is embedded in every record so the platform
can enforce cross-workspace isolation and audit trail.
``source_workspace_id`` selects which registered workspace this
memory belongs to when the agent is registered into multiple
workspaces (PR-1 / multi-workspace mode). When unset, falls back
to the module-level WORKSPACE_ID — single-workspace operators see
no behaviour change.
"""
if not content:
return "Error: content is required"
content = _redact_secrets(content)
scope = scope.upper()
if scope not in ("LOCAL", "TEAM", "GLOBAL"):
scope = "LOCAL"
# RBAC: require memory.write permission (mirrors builtin_tools/memory.py)
if not _check_memory_write_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.write' "
"permission for this operation."
)
# Scope enforcement: only root workspaces (tier 0) can write GLOBAL memory.
# This prevents tenant workspaces from poisoning org-wide memory (GH#1610).
if scope == "GLOBAL" and not _is_root_workspace():
return (
"Error: RBAC — only root workspaces (tier 0) can write to GLOBAL scope. "
"Non-root workspaces may use LOCAL or TEAM scope."
)
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/memories",
json={
"content": content,
"scope": scope,
# Embed source workspace so the platform can namespace-isolate
# and audit cross-workspace writes (GH#1610 fix).
"workspace_id": src,
},
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if resp.status_code in (200, 201):
return json.dumps({"success": True, "id": data.get("id"), "scope": scope})
return f"Error: {data.get('error', resp.text)}"
except Exception as e:
return f"Error saving memory: {e}"
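The scope rules in `tool_commit_memory` can be read as one small decision function. The sketch below is an assumption-laden illustration: `resolve_scope` is a hypothetical helper, and the `is_root` flag stands in for whatever `_is_root_workspace()` checks in the real module.

```python
from __future__ import annotations

ALLOWED_SCOPES = ("LOCAL", "TEAM", "GLOBAL")


def resolve_scope(requested: str, is_root: bool) -> tuple[str | None, str | None]:
    """Return (scope, error); exactly one of the two is None."""
    scope = requested.upper()
    if scope not in ALLOWED_SCOPES:
        # Unknown scopes fall back to LOCAL rather than failing the call.
        scope = "LOCAL"
    if scope == "GLOBAL" and not is_root:
        return None, "Error: only root workspaces (tier 0) can write to GLOBAL scope."
    return scope, None
```

Falling back to LOCAL means a typo in the scope narrows visibility instead of widening it.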
async def tool_recall_memory(
query: str = "",
scope: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Search persistent memory for previously saved information.
RBAC memory.read permission is required (mirrors builtin_tools/memory.py).
The workspace_id is sent as a query parameter so the platform can
cross-validate it against the auth token and defend against any future
path traversal / cross-tenant read bugs in the platform itself.
``source_workspace_id`` selects which registered workspace's memories
to search when the agent is registered into multiple workspaces.
Unset → defaults to the module-level WORKSPACE_ID.
"""
# RBAC: require memory.read permission (mirrors builtin_tools/memory.py)
if not _check_memory_read_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.read' "
"permission for this operation."
)
src = source_workspace_id or WORKSPACE_ID
params: dict[str, str] = {"workspace_id": src}
if query:
params["q"] = query
if scope:
params["scope"] = scope.upper()
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/memories",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if isinstance(data, list):
if not data:
return "No memories found."
lines = []
for m in data:
lines.append(f"[{m.get('scope', '?')}] {m.get('content', '')}")
return "\n".join(lines)
return json.dumps(data)
except Exception as e:
return f"Error recalling memory: {e}"
# ---------------------------------------------------------------------------
# Inbox tools — inbound delivery for the standalone molecule-mcp path.
# ---------------------------------------------------------------------------
#
# The InboxState singleton is set by mcp_cli before the MCP server starts
# (see workspace/inbox.py for the rationale). In-container runtimes never
# call ``inbox.activate(...)``, so ``inbox.get_state()`` returns None and
# these tools surface an informational error rather than raising.
#
# When-to-use guidance (mirrored in platform_tools/registry.py): agents
# in standalone-runtime mode should call ``wait_for_message`` to block
# on the next inbound message after they've emitted a reply, forming
# the loop ``wait → respond → wait``. ``inbox_peek`` is for inspecting
# the queue without consuming; ``inbox_pop`` removes a handled message.
_INBOX_NOT_ENABLED_MSG = (
"Error: inbox polling is not enabled in this runtime. The standalone "
"molecule-mcp wrapper activates it; in-container runtimes receive "
"messages via push delivery and do not need these tools."
# Messaging tool handlers — extracted to a2a_tools_messaging
# (RFC #2873 iter 4d). Re-imported here so call sites + tests that
# reference ``a2a_tools.tool_send_message_to_user`` /
# ``tool_list_peers`` / ``tool_get_workspace_info`` /
# ``tool_chat_history`` / ``_upload_chat_files`` keep resolving
# identically.
from a2a_tools_messaging import ( # noqa: E402 (import after the top-of-module imports)
_upload_chat_files,
tool_chat_history,
tool_get_workspace_info,
tool_list_peers,
tool_send_message_to_user,
)
async def tool_chat_history(
peer_id: str,
limit: int = 20,
before_ts: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Fetch the prior conversation with one peer.
Hits ``/workspaces/<self>/activity?peer_id=<peer>&limit=<N>``
against the workspace-server, which returns activity rows where
the peer is either the sender (``source_id=peer`` — they sent us
the message) or the recipient (``target_id=peer`` — we sent to
them) of an A2A turn — both sides of the conversation in
chronological order.
Args:
peer_id: The other workspace's UUID. Same value the agent
sees as ``peer_id`` on a peer_agent push or ``workspace_id``
on a delegate_task call.
limit: Maximum rows to return; capped server-side at 500. The
default of 20 covers \"most recent context for this peer\"
without flooding the agent's context window.
before_ts: Optional RFC3339 timestamp; only rows strictly
older are returned. Used to page backward through long
histories — pass the oldest ``ts`` from the previous
response. Empty (default) returns the most recent ``limit``
rows.
source_workspace_id: Which registered workspace's activity log
to query. Auto-routes via ``_peer_to_source`` cache when
unset (the workspace this peer was discovered through);
falls back to module-level WORKSPACE_ID for single-workspace
operators.
Returns a JSON-encoded list of activity rows (or an error string
starting with ``Error:`` so the agent can branch). Each row carries
``activity_type``, ``source_id``, ``target_id``, ``method``,
``summary``, ``request_body``, ``response_body``, ``status``,
``created_at`` — same shape ``inbox_peek`` and the canvas chat
loader already see.
"""
if not peer_id or not isinstance(peer_id, str):
return "Error: peer_id is required"
if not isinstance(limit, int) or limit <= 0:
limit = 20
if limit > 500:
limit = 500
src = source_workspace_id or _peer_to_source.get(peer_id) or WORKSPACE_ID
params: dict[str, str] = {
"peer_id": peer_id,
"limit": str(limit),
}
# Forward verbatim — the server route validates as RFC3339 at the
# trust boundary and translates into a `created_at < $X` clause.
if before_ts:
params["before_ts"] = before_ts
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/activity",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
except Exception as exc: # noqa: BLE001
return f"Error: chat_history request failed: {exc}"
if resp.status_code == 400:
# Trust-boundary rejection (malformed peer_id, etc.) — surface
# the server's reason verbatim so the agent can correct itself.
try:
err = resp.json().get("error", "bad request")
except Exception: # noqa: BLE001
err = "bad request"
return f"Error: {err}"
if resp.status_code >= 400:
return f"Error: chat_history returned HTTP {resp.status_code}"
try:
rows = resp.json()
except Exception: # noqa: BLE001
return "Error: chat_history response was not JSON"
if not isinstance(rows, list):
return "Error: chat_history response was not a list"
# Server returns DESC (most recent first); reverse to chronological
# so the agent reads the conversation top-down like a chat log.
rows.reverse()
return json.dumps(rows)
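Three small rules in `tool_chat_history` are easy to miss when reading the function top to bottom: the precedence for picking the source workspace, the limit clamp, and the reversal into chronological order. The helpers below sketch each rule separately; their names and signatures are illustrative and do not exist in the module.

```python
from __future__ import annotations


def resolve_source(
    explicit: str | None,
    peer_id: str,
    peer_to_source: dict[str, str],
    default_workspace_id: str,
) -> str:
    """Explicit argument first, then the discovery cache, then the default."""
    return explicit or peer_to_source.get(peer_id) or default_workspace_id


def clamp_limit(limit: object, default: int = 20, cap: int = 500) -> int:
    """Non-integers and non-positive values use the default; large values are capped."""
    if not isinstance(limit, int) or limit <= 0:
        return default
    return min(limit, cap)


def to_chronological(rows_desc: list[dict]) -> list[dict]:
    """The server returns newest first; reverse a copy for top-down reading."""
    return list(reversed(rows_desc))
```

Unlike the in-place `rows.reverse()` above, `to_chronological` returns a new list, which is only a stylistic choice for the sketch.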
# Memory tool handlers — extracted to a2a_tools_memory (RFC #2873 iter 4c).
# Re-imported here so call sites + tests that reference
# ``a2a_tools.tool_commit_memory`` / ``tool_recall_memory`` keep
# resolving identically.
from a2a_tools_memory import ( # noqa: E402 (import after the top-of-module imports)
tool_commit_memory,
tool_recall_memory,
)
async def tool_inbox_peek(limit: int = 10) -> str:
"""Return up to ``limit`` pending inbound messages without removing them."""
import inbox # local import — avoids a circular dep at module load
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
messages = state.peek(limit=limit if isinstance(limit, int) else 10)
return json.dumps([m.to_dict() for m in messages])
async def tool_inbox_pop(activity_id: str) -> str:
"""Remove a message from the inbox queue by activity_id."""
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
if not isinstance(activity_id, str) or not activity_id:
return "Error: activity_id is required."
removed = state.pop(activity_id)
if removed is None:
return json.dumps({"removed": False, "activity_id": activity_id})
return json.dumps({"removed": True, "activity_id": activity_id})
async def tool_wait_for_message(timeout_secs: float = 60.0) -> str:
"""Block until a new message arrives or ``timeout_secs`` elapses.
Returns the head message non-destructively; the agent decides
whether to ``inbox_pop`` it after acting.
"""
import asyncio
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
try:
timeout = float(timeout_secs)
except (TypeError, ValueError):
timeout = 60.0
# Cap at 300s — Claude Code's default tool timeout is ~10min, and
# blocking longer than 5min wastes the prompt cache window for
# nothing useful. Operators who want longer can call repeatedly.
timeout = max(0.0, min(timeout, 300.0))
# The threading.Event-based wait would block the asyncio loop.
# Run it on the default executor so the MCP server can keep
# processing other JSON-RPC requests while we sleep.
loop = asyncio.get_running_loop()
message = await loop.run_in_executor(None, state.wait, timeout)
if message is None:
return json.dumps({"timeout": True, "timeout_secs": timeout})
return json.dumps(message.to_dict())
# Inbox tool handlers — extracted to a2a_tools_inbox (RFC #2873 iter 4e).
# Re-imported here so call sites + tests that reference
# ``a2a_tools.tool_inbox_peek`` / ``tool_inbox_pop`` / ``tool_wait_for_message``
# / ``_enrich_inbound_for_agent`` / ``_INBOX_NOT_ENABLED_MSG`` keep
# resolving identically.
from a2a_tools_inbox import ( # noqa: E402 (import after the top-of-module imports)
_INBOX_NOT_ENABLED_MSG,
_enrich_inbound_for_agent,
tool_inbox_peek,
tool_inbox_pop,
tool_wait_for_message,
)
+140
@@ -0,0 +1,140 @@
"""Inbox tool handlers — single-concern slice of the a2a_tools surface.
Standalone-runtime path for inbound-message delivery (push-mode runtimes
get messages via the channel-tag synthesis in a2a_mcp_server). The
``InboxState`` singleton is set by ``mcp_cli`` before the MCP server
starts; in-container runtimes never call ``inbox.activate(...)`` so
``inbox.get_state()`` returns None and these tools surface an
informational error instead of raising.
When-to-use guidance for agents (mirrored in
``platform_tools/registry.py``):
- ``wait_for_message``: block until a new inbound message arrives, then
decide what to do with it; forms the loop ``wait → respond → wait``.
- ``inbox_peek``: inspect the queue non-destructively.
- ``inbox_pop``: remove a handled message by activity_id.
Extracted from ``a2a_tools.py`` in RFC #2873 iter 4e so the kitchen-sink
module shrinks to a back-compat shim. The extraction also makes the
``_enrich_inbound_for_agent`` helper unit-testable in isolation —
previously it was buried in ``a2a_tools`` and only exercised through
the inbox wrappers, leaving its peer-id-empty / cache-miss / registry-
unavailable branches under-covered.
"""
from __future__ import annotations
import asyncio
import json
# Surfaced when the inbox subsystem is not initialised. Returned by the
# three inbox tool wrappers below so the agent gets a clear "this
# runtime delivers via push" message instead of a NameError.
_INBOX_NOT_ENABLED_MSG = (
"Error: inbox polling is not enabled in this runtime. The standalone "
"molecule-mcp wrapper activates it; in-container runtimes receive "
"messages via push delivery and do not need these tools."
)
def _enrich_inbound_for_agent(d: dict) -> dict:
"""Add peer_name / peer_role / agent_card_url to a poll-path message.
The PUSH path (a2a_mcp_server._build_channel_notification) already
enriches the meta dict with these fields, so a Claude Code host
with channel-push sees them. The POLL path goes through
InboxMessage.to_dict, which is intentionally identity-free (the
storage layer doesn't know about the registry cache). Without this
helper, every non-Claude-Code MCP client that uses inbox_peek /
wait_for_message gets a plain message and the receiving agent
can't tell who's writing — breaking the contract documented in
a2a_mcp_server.py:303-345 ("In both paths the same fields apply").
Cache-first non-blocking enrichment (same shape as push): on cache
miss the helper returns the bare message; the next call within the
5-min TTL hits the warm cache. Failure to enrich is non-fatal —
the agent still gets text + peer_id + kind + activity_id, just
without the friendly identity.
"""
peer_id = d.get("peer_id") or ""
if not peer_id:
# canvas_user — no peer to enrich; helper returns the plain
# message unchanged so the canvas reply path still works.
return d
try:
from a2a_client import ( # local import — avoid module-load cycle
_agent_card_url_for,
enrich_peer_metadata_nonblocking,
)
except Exception: # noqa: BLE001
# If a2a_client is unavailable (test harness, partial install),
# degrade gracefully — agent still gets the bare envelope.
return d
record = enrich_peer_metadata_nonblocking(peer_id)
if record is not None:
if name := record.get("name"):
d["peer_name"] = name
if role := record.get("role"):
d["peer_role"] = role
# agent_card_url is constructable from peer_id alone — surface it
# even when registry enrichment misses, so the receiving agent has
# a single endpoint to hit for the peer's full capability list.
d["agent_card_url"] = _agent_card_url_for(peer_id)
return d
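The cache-first behaviour described in the docstring above can be illustrated with a minimal sketch. `PeerCache`, `enrich`, and the injectable clock are hypothetical names invented for this example; the real lookup lives in `a2a_client` and may differ (for instance, it may schedule a background refresh on a miss, which this sketch omits, and the `agent_card_url` field is left out). The 300-second TTL is an assumption taken from the "5-min TTL" wording.

```python
import time
from typing import Callable, Dict, Optional, Tuple

TTL_SECS = 300.0  # assumption: mirrors the 5-min TTL mentioned in the docstring


class PeerCache:
    """Hypothetical TTL cache; not the real a2a_client implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def put(self, peer_id: str, record: dict) -> None:
        # Store the record together with its expiry time.
        self._entries[peer_id] = (self._clock() + TTL_SECS, record)

    def get_nonblocking(self, peer_id: str) -> Optional[dict]:
        # Never performs I/O: a miss or an expired entry returns None.
        entry = self._entries.get(peer_id)
        if entry is None:
            return None
        expires_at, record = entry
        if self._clock() >= expires_at:
            del self._entries[peer_id]
            return None
        return record


def enrich(message: dict, cache: PeerCache) -> dict:
    """Add peer_name / peer_role when the cache is warm; otherwise pass through."""
    peer_id = message.get("peer_id") or ""
    if not peer_id:
        return message  # no peer to enrich
    record = cache.get_nonblocking(peer_id)
    if record is not None:
        if name := record.get("name"):
            message["peer_name"] = name
        if role := record.get("role"):
            message["peer_role"] = role
    return message
```

On a cold cache the message comes back unchanged, which matches the "failure to enrich is non-fatal" contract: the caller never waits on a registry round-trip.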
async def tool_inbox_peek(limit: int = 10) -> str:
"""Return up to ``limit`` pending inbound messages without removing them."""
import inbox # local import — avoids a circular dep at module load
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
messages = state.peek(limit=limit if isinstance(limit, int) else 10)
return json.dumps([_enrich_inbound_for_agent(m.to_dict()) for m in messages])
async def tool_inbox_pop(activity_id: str) -> str:
"""Remove a message from the inbox queue by activity_id."""
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
if not isinstance(activity_id, str) or not activity_id:
return "Error: activity_id is required."
removed = state.pop(activity_id)
if removed is None:
return json.dumps({"removed": False, "activity_id": activity_id})
return json.dumps({"removed": True, "activity_id": activity_id})
async def tool_wait_for_message(timeout_secs: float = 60.0) -> str:
"""Block until a new message arrives or ``timeout_secs`` elapses.
Returns the head message non-destructively; the agent decides
whether to ``inbox_pop`` it after acting.
"""
import inbox
state = inbox.get_state()
if state is None:
return _INBOX_NOT_ENABLED_MSG
try:
timeout = float(timeout_secs)
except (TypeError, ValueError):
timeout = 60.0
# Cap at 300s — Claude Code's default tool timeout is ~10min, and
# blocking longer than 5min wastes the prompt cache window for
# nothing useful. Operators who want longer can call repeatedly.
timeout = max(0.0, min(timeout, 300.0))
# The threading.Event-based wait would block the asyncio loop.
# Run it on the default executor so the MCP server can keep
# processing other JSON-RPC requests while we sleep.
loop = asyncio.get_running_loop()
message = await loop.run_in_executor(None, state.wait, timeout)
if message is None:
return json.dumps({"timeout": True, "timeout_secs": timeout})
return json.dumps(_enrich_inbound_for_agent(message.to_dict()))
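As a rough illustration of the timeout clamp and the executor hand-off used by `tool_wait_for_message`, the sketch below runs a blocking `threading.Event` wait off the event loop. `DemoInboxState`, `clamp_timeout`, and `demo_wait` are hypothetical stand-ins written for this example; the real `InboxState` API is only assumed to expose `wait(timeout)` and `pop(activity_id)` as the handlers above use them.

```python
import asyncio
import json
import threading
from collections import deque
from typing import Optional


class DemoInboxState:
    """Hypothetical stand-in for InboxState (not the real class)."""

    def __init__(self) -> None:
        self._queue = deque()
        self._event = threading.Event()
        self._lock = threading.Lock()

    def push(self, message: dict) -> None:
        with self._lock:
            self._queue.append(message)
        self._event.set()

    def wait(self, timeout: float) -> Optional[dict]:
        # Blocking wait; returns the head message without removing it.
        if not self._event.wait(timeout):
            return None
        with self._lock:
            return self._queue[0] if self._queue else None

    def pop(self, activity_id: str) -> Optional[dict]:
        with self._lock:
            for message in list(self._queue):
                if message["activity_id"] == activity_id:
                    self._queue.remove(message)
                    if not self._queue:
                        self._event.clear()
                    return message
        return None


def clamp_timeout(timeout_secs: object, default: float = 60.0, cap: float = 300.0) -> float:
    """Coerce to float, fall back to the default, then clamp to [0, cap]."""
    try:
        timeout = float(timeout_secs)
    except (TypeError, ValueError):
        timeout = default
    return max(0.0, min(timeout, cap))


async def demo_wait(state: DemoInboxState, timeout_secs: object) -> str:
    timeout = clamp_timeout(timeout_secs)
    loop = asyncio.get_running_loop()
    # The blocking wait runs on the default executor so the loop stays free.
    message = await loop.run_in_executor(None, state.wait, timeout)
    if message is None:
        return json.dumps({"timeout": True, "timeout_secs": timeout})
    return json.dumps(message)
```

An agent loop built on this shape would call the wait, act on the returned message, pop it by `activity_id`, and wait again.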
+141
@@ -0,0 +1,141 @@
"""Memory tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4c). Owns the two
agent-memory MCP tools:
* ``tool_commit_memory`` — write to the workspace's persistent memory.
* ``tool_recall_memory`` — search the workspace's persistent memory.
Both go through the platform's ``/workspaces/:id/memories`` endpoint;
the platform is the source of truth for namespace isolation + audit
trail. Local responsibility here is RBAC enforcement BEFORE hitting
the network so a denied operation surfaces a clear in-band error
instead of an opaque platform 403.
Imports the RBAC primitives from ``a2a_tools_rbac`` (iter 4a).
"""
from __future__ import annotations
import json
import httpx
from a2a_client import PLATFORM_URL, WORKSPACE_ID
from a2a_tools_rbac import (
auth_headers_for_heartbeat as _auth_headers_for_heartbeat,
check_memory_read_permission as _check_memory_read_permission,
check_memory_write_permission as _check_memory_write_permission,
is_root_workspace as _is_root_workspace,
)
from builtin_tools.security import _redact_secrets
async def tool_commit_memory(
content: str,
scope: str = "LOCAL",
source_workspace_id: str | None = None,
) -> str:
"""Save important information to persistent memory.
GLOBAL scope is writable only by root workspaces (tier == 0).
RBAC memory.write permission is required for all scope levels.
The source workspace_id is embedded in every record so the platform
can enforce cross-workspace isolation and audit trail.
``source_workspace_id`` selects which registered workspace this
memory belongs to when the agent is registered into multiple
workspaces (PR-1 / multi-workspace mode). When unset, falls back
to the module-level WORKSPACE_ID — single-workspace operators see
no behaviour change.
"""
if not content:
return "Error: content is required"
content = _redact_secrets(content)
scope = scope.upper()
if scope not in ("LOCAL", "TEAM", "GLOBAL"):
scope = "LOCAL"
# RBAC: require memory.write permission (mirrors builtin_tools/memory.py)
if not _check_memory_write_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.write' "
"permission for this operation."
)
# Scope enforcement: only root workspaces (tier 0) can write GLOBAL memory.
# This prevents tenant workspaces from poisoning org-wide memory (GH#1610).
if scope == "GLOBAL" and not _is_root_workspace():
return (
"Error: RBAC — only root workspaces (tier 0) can write to GLOBAL scope. "
"Non-root workspaces may use LOCAL or TEAM scope."
)
src = source_workspace_id or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{src}/memories",
json={
"content": content,
"scope": scope,
# Embed source workspace so the platform can namespace-isolate
# and audit cross-workspace writes (GH#1610 fix).
"workspace_id": src,
},
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if resp.status_code in (200, 201):
return json.dumps({"success": True, "id": data.get("id"), "scope": scope})
return f"Error: {data.get('error', resp.text)}"
except Exception as e:
return f"Error saving memory: {e}"
async def tool_recall_memory(
query: str = "",
scope: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Search persistent memory for previously saved information.
RBAC memory.read permission is required (mirrors builtin_tools/memory.py).
The workspace_id is sent as a query parameter so the platform can
cross-validate it against the auth token and defend against any future
path traversal / cross-tenant read bugs in the platform itself.
``source_workspace_id`` selects which registered workspace's memories
to search when the agent is registered into multiple workspaces.
Unset → defaults to the module-level WORKSPACE_ID.
"""
# RBAC: require memory.read permission (mirrors builtin_tools/memory.py)
if not _check_memory_read_permission():
return (
"Error: RBAC — this workspace does not have the 'memory.read' "
"permission for this operation."
)
src = source_workspace_id or WORKSPACE_ID
params: dict[str, str] = {"workspace_id": src}
if query:
params["q"] = query
if scope:
params["scope"] = scope.upper()
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/memories",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
data = resp.json()
if isinstance(data, list):
if not data:
return "No memories found."
lines = []
for m in data:
lines.append(f"[{m.get('scope', '?')}] {m.get('content', '')}")
return "\n".join(lines)
return json.dumps(data)
except Exception as e:
return f"Error recalling memory: {e}"
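The local checks that `tool_commit_memory` performs before touching the network can be summarised in a small, pure-function sketch. `normalize_scope` and `precheck_commit` are hypothetical helpers for illustration only; the permission flags are passed in as booleans here, whereas the real code calls the RBAC primitives from `a2a_tools_rbac`, and the error strings are shortened.

```python
from typing import Tuple

VALID_SCOPES = ("LOCAL", "TEAM", "GLOBAL")


def normalize_scope(scope: str) -> str:
    """Upper-case the scope; anything unrecognised falls back to LOCAL."""
    scope = scope.upper()
    return scope if scope in VALID_SCOPES else "LOCAL"


def precheck_commit(content: str, scope: str, can_write: bool, is_root: bool) -> Tuple[bool, str]:
    """Return (True, normalized_scope) or (False, error_string)."""
    if not content:
        return False, "Error: content is required"
    scope = normalize_scope(scope)
    if not can_write:
        return False, "Error: RBAC: missing 'memory.write' permission"
    if scope == "GLOBAL" and not is_root:
        return False, "Error: RBAC: only root workspaces can write GLOBAL scope"
    return True, scope
```

Note that an unknown scope is silently downgraded to LOCAL rather than rejected, so a typo such as "GLOBL" results in a LOCAL write, not an error.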
+324
@@ -0,0 +1,324 @@
"""Messaging tool handlers — single-concern slice of the a2a_tools surface.
Extracted from ``a2a_tools.py`` (RFC #2873 iter 4d). Owns the four
human-and-peer messaging MCP tools + the chat-upload helper they share:
* ``tool_send_message_to_user`` — push a canvas-chat message via the
platform's ``/notify`` endpoint.
* ``tool_list_peers`` — discover peers across one or many registered
workspaces, with side-effect of populating ``_peer_to_source`` for
delegate-task auto-routing.
* ``tool_get_workspace_info`` — JSON-encode the workspace's own info.
* ``tool_chat_history`` — fetch prior conversation rows with a peer.
* ``_upload_chat_files`` — internal helper for the message-attachments
code path; routes local file paths through the platform's
``/chat/uploads`` so the canvas can render them as download chips.
Imports the auth-header primitive from ``a2a_tools_rbac`` (iter 4a).
"""
from __future__ import annotations
import json
import mimetypes
import os
import httpx
from a2a_client import (
PLATFORM_URL,
WORKSPACE_ID,
_peer_names,
_peer_to_source,
get_peers_with_diagnostic,
get_workspace_info,
)
from a2a_tools_rbac import auth_headers_for_heartbeat as _auth_headers_for_heartbeat
from platform_auth import list_registered_workspaces
async def _upload_chat_files(
client: httpx.AsyncClient,
paths: list[str],
workspace_id: str | None = None,
) -> tuple[list[dict], str | None]:
"""Upload local file paths through /workspaces/<self>/chat/uploads.
The platform stages each upload under /workspace/.molecule/chat-uploads
(an "allowed root" the canvas knows how to render via the Download
endpoint) and returns metadata the broadcast payload references.
Why we route through upload instead of just passing the agent's path:
the canvas's allowed-root list is /configs, /workspace, /home, /plugins
— files at /tmp or /root would be unreachable. Uploading copies the
bytes into an allowed root regardless of where the agent wrote them.
Returns (attachments, error). On any failure the caller should NOT
fire the notify — partial-attach would surface a half-rendered chip.
"""
if not paths:
return [], None
files_payload: list[tuple[str, tuple[str, bytes, str]]] = []
for p in paths:
if not isinstance(p, str) or not p:
return [], f"Error: invalid attachment path {p!r}"
if not os.path.isfile(p):
return [], f"Error: attachment not found: {p}"
try:
with open(p, "rb") as fh:
data = fh.read()
except OSError as e:
return [], f"Error reading {p}: {e}"
# Sniff mime from filename so the canvas can pick the right
# icon / preview / inline-image renderer. Pre-fix this was
# hardcoded application/octet-stream and chat_files.go's
# Upload trusts whatever Content-Type the multipart part
# carries — `mt := fh.Header.Get("Content-Type")` only falls
# back to extension-sniffing when the header is empty. So a
# hardcoded octet-stream meant every attachment lost its
# real type forever, breaking the canvas chip's icon logic.
mime_type, _ = mimetypes.guess_type(p)
if not mime_type:
mime_type = "application/octet-stream"
files_payload.append(("files", (os.path.basename(p), data, mime_type)))
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/chat/uploads",
files=files_payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
except Exception as e:
return [], f"Error uploading attachments: {e}"
if resp.status_code != 200:
return [], f"Error: chat/uploads returned {resp.status_code}: {resp.text[:200]}"
try:
body = resp.json()
except Exception as e:
return [], f"Error parsing upload response: {e}"
uploaded = body.get("files") or []
if not isinstance(uploaded, list) or len(uploaded) != len(paths):
return [], f"Error: upload returned {len(uploaded) if isinstance(uploaded, list) else 'invalid'} entries for {len(paths)} files"
return uploaded, None
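The MIME-sniffing step in `_upload_chat_files` can be shown in isolation. `multipart_part` is a hypothetical helper name; it only builds the `(field, (filename, bytes, content_type))` tuple in the same shape the function above appends to `files_payload`, and does not perform any upload.

```python
import mimetypes
import os
from typing import Tuple


def multipart_part(path: str, data: bytes) -> Tuple[str, Tuple[str, bytes, str]]:
    """Build one multipart entry, guessing the content type from the filename."""
    mime_type, _ = mimetypes.guess_type(path)
    if not mime_type:
        # Unknown or missing extension: fall back to a generic binary type.
        mime_type = "application/octet-stream"
    return ("files", (os.path.basename(path), data, mime_type))
```

`mimetypes.guess_type` looks only at the filename, never at the bytes, so a mislabelled file keeps the type its extension implies.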
async def tool_send_message_to_user(
message: str,
attachments: list[str] | None = None,
workspace_id: str | None = None,
) -> str:
"""Send a message directly to the user's canvas chat via WebSocket.
Args:
message: The text to display in the user's chat. Required even
when sending attachments — set to a short caption like
"Here's the build output:" or "Done — see attached."
attachments: Optional list of absolute file paths inside this
container. Each is uploaded to the platform and rendered
in the canvas as a clickable download chip. Use this
instead of pasting paths in the message text — paths
render as plain text and the user can't click them.
Examples:
attachments=["/tmp/build-output.zip"]
attachments=["/workspace/report.pdf", "/workspace/data.csv"]
workspace_id: Optional. When the agent is registered in MULTIPLE
workspaces (external multi-workspace MCP path), this
selects which workspace's chat to deliver the message to —
should match the ``arrival_workspace_id`` of the inbound
message you're replying to so the user sees the reply in
the same canvas they typed in. Single-workspace agents
omit this; the message routes to the only registered
workspace.
"""
if not message:
return "Error: message is required"
target_workspace_id = (workspace_id or "").strip() or WORKSPACE_ID
try:
async with httpx.AsyncClient(timeout=60.0) as client:
uploaded, upload_err = await _upload_chat_files(
client, attachments or [], workspace_id=target_workspace_id,
)
if upload_err:
return upload_err
payload: dict = {"message": message}
if uploaded:
payload["attachments"] = uploaded
resp = await client.post(
f"{PLATFORM_URL}/workspaces/{target_workspace_id}/notify",
json=payload,
headers=_auth_headers_for_heartbeat(target_workspace_id),
)
if resp.status_code == 200:
if uploaded:
return f"Message sent to user with {len(uploaded)} attachment(s)"
return "Message sent to user"
return f"Error: platform returned {resp.status_code}"
except Exception as e:
return f"Error sending message: {e}"
async def tool_list_peers(source_workspace_id: str | None = None) -> str:
"""List all workspaces this agent can communicate with.
Behavior:
- ``source_workspace_id`` set → list peers of that one workspace.
- Unset, single-workspace mode → list peers of WORKSPACE_ID
(the legacy path, unchanged).
- Unset, multi-workspace mode (MOLECULE_WORKSPACES populated) →
aggregate across every registered workspace, prefixing each
peer with its source so the agent / user can see the full peer
surface in one call.
Side-effect: populates ``_peer_to_source`` so subsequent
``tool_delegate_task(target)`` auto-routes through the correct
sending workspace without the agent needing ``source_workspace_id``.
"""
sources: list[str]
aggregate = False
if source_workspace_id:
sources = [source_workspace_id]
else:
registered = list_registered_workspaces()
if len(registered) > 1:
sources = registered
aggregate = True
else:
sources = [WORKSPACE_ID]
all_peers: list[tuple[str, dict]] = [] # (source, peer_record)
diagnostics: list[tuple[str, str]] = [] # (source, diagnostic)
for src in sources:
peers, diagnostic = await get_peers_with_diagnostic(source_workspace_id=src)
if peers:
for p in peers:
all_peers.append((src, p))
elif diagnostic is not None:
diagnostics.append((src, diagnostic))
if not all_peers:
if diagnostics:
joined = "; ".join(f"[{src[:8]}] {d}" for src, d in diagnostics)
return f"No peers found. {joined}"
return (
"You have no peers in the platform registry. "
"(No parent, no children, no siblings registered.)"
)
lines = []
for src, p in all_peers:
status = p.get("status", "unknown")
role = p.get("role", "")
peer_id = p["id"]
# Cache name for use in delegate_task
_peer_names[peer_id] = p["name"]
# Cache the source workspace so tool_delegate_task auto-routes
_peer_to_source[peer_id] = src
if aggregate:
lines.append(
f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role}, via: {src[:8]})"
)
else:
lines.append(f"- {p['name']} (ID: {peer_id}, status: {status}, role: {role})")
return "\n".join(lines)
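The source-selection rules and the routing side effect of `tool_list_peers` can be sketched without any network calls. `select_sources` and `record_peers` are hypothetical names for this example; the peer records are plain dicts with only `id` and `name`, and the output line format is simplified compared with the real one.

```python
from typing import Dict, List, Optional, Tuple


def select_sources(
    source_workspace_id: Optional[str],
    registered: List[str],
    default_id: str,
) -> Tuple[List[str], bool]:
    """Return (workspaces to query, whether results are aggregated)."""
    if source_workspace_id:
        return [source_workspace_id], False
    if len(registered) > 1:
        return list(registered), True
    return [default_id], False


def record_peers(
    peers_by_source: Dict[str, List[dict]],
    peer_to_source: Dict[str, str],
    aggregate: bool,
) -> List[str]:
    """Format peers and remember which workspace each was discovered through."""
    lines = []
    for src, peers in peers_by_source.items():
        for peer in peers:
            peer_to_source[peer["id"]] = src  # later calls can auto-route
            suffix = f", via: {src[:8]}" if aggregate else ""
            lines.append(f"- {peer['name']} (ID: {peer['id']}{suffix})")
    return lines
```

Because the cache is filled as a side effect, a later lookup such as `peer_to_source.get(peer_id)` resolves the sending workspace without the agent passing `source_workspace_id` again.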
async def tool_get_workspace_info(source_workspace_id: str | None = None) -> str:
"""Get this workspace's own info.
``source_workspace_id`` selects which registered workspace to
introspect when the agent is registered into multiple workspaces.
Unset → falls back to module-level WORKSPACE_ID.
"""
info = await get_workspace_info(source_workspace_id=source_workspace_id)
return json.dumps(info, indent=2)
async def tool_chat_history(
peer_id: str,
limit: int = 20,
before_ts: str = "",
source_workspace_id: str | None = None,
) -> str:
"""Fetch the prior conversation with one peer.
Hits ``/workspaces/<self>/activity?peer_id=<peer>&limit=<N>``
against the workspace-server, which returns activity rows where
the peer is either the sender (``source_id=peer`` — they sent us
the message) or the recipient (``target_id=peer`` — we sent to
them) of an A2A turn — both sides of the conversation in
chronological order.
Args:
peer_id: The other workspace's UUID. Same value the agent
sees as ``peer_id`` on a peer_agent push or ``workspace_id``
on a delegate_task call.
limit: Maximum rows to return; capped server-side at 500. The
default of 20 covers "most recent context for this peer"
without flooding the agent's context window.
before_ts: Optional RFC3339 timestamp; only rows strictly
older are returned. Used to page backward through long
histories: pass the oldest ``created_at`` from the previous
response. Empty (default) returns the most recent ``limit``
rows.
source_workspace_id: Which registered workspace's activity log
to query. Auto-routes via ``_peer_to_source`` cache when
unset (the workspace this peer was discovered through);
falls back to module-level WORKSPACE_ID for single-workspace
operators.
Returns a JSON-encoded list of activity rows (or an error string
starting with ``Error:`` so the agent can branch). Each row carries
``activity_type``, ``source_id``, ``target_id``, ``method``,
``summary``, ``request_body``, ``response_body``, ``status``,
``created_at`` — same shape ``inbox_peek`` and the canvas chat
loader already see.
"""
if not peer_id or not isinstance(peer_id, str):
return "Error: peer_id is required"
if not isinstance(limit, int) or limit <= 0:
limit = 20
if limit > 500:
limit = 500
src = source_workspace_id or _peer_to_source.get(peer_id) or WORKSPACE_ID
params: dict[str, str] = {
"peer_id": peer_id,
"limit": str(limit),
}
# Forward verbatim — the server route validates as RFC3339 at the
# trust boundary and translates into a `created_at < $X` clause.
if before_ts:
params["before_ts"] = before_ts
try:
async with httpx.AsyncClient(timeout=10.0) as client:
resp = await client.get(
f"{PLATFORM_URL}/workspaces/{src}/activity",
params=params,
headers=_auth_headers_for_heartbeat(src),
)
except Exception as exc: # noqa: BLE001
return f"Error: chat_history request failed: {exc}"
if resp.status_code == 400:
# Trust-boundary rejection (malformed peer_id, etc.) — surface
# the server's reason verbatim so the agent can correct itself.
try:
err = resp.json().get("error", "bad request")
except Exception: # noqa: BLE001
err = "bad request"
return f"Error: {err}"
if resp.status_code >= 400:
return f"Error: chat_history returned HTTP {resp.status_code}"
try:
rows = resp.json()
except Exception: # noqa: BLE001
return "Error: chat_history response was not JSON"
if not isinstance(rows, list):
return "Error: chat_history response was not a list"
# Server returns DESC (most recent first); reverse to chronological
# so the agent reads the conversation top-down like a chat log.
rows.reverse()
return json.dumps(rows)
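Paging backward with `before_ts` might look like the sketch below. `collect_history` and the `fetch_desc` callback are hypothetical: the callback stands in for the HTTP call and is assumed to return rows newest-first, strictly older than `before_ts`, with a `created_at` field usable as the next cursor. The `max_pages` guard is also an addition for the example.

```python
from typing import Callable, List


def clamp_limit(limit: object, default: int = 20, cap: int = 500) -> int:
    """Mirror the handler's limit handling: bad values use the default, large ones are capped."""
    if not isinstance(limit, int) or limit <= 0:
        return default
    return min(limit, cap)


def collect_history(
    fetch_desc: Callable[[str, int], List[dict]],
    limit: int,
    max_pages: int = 10,
) -> List[dict]:
    """Walk backward page by page, then return rows in chronological order."""
    limit = clamp_limit(limit)
    collected: List[dict] = []
    before_ts = ""  # empty string requests the most recent page
    for _ in range(max_pages):
        page = fetch_desc(before_ts, limit)
        if not page:
            break
        collected.extend(page)
        before_ts = page[-1]["created_at"]  # oldest row of a newest-first page
        if len(page) < limit:
            break  # short page: nothing older remains
    collected.reverse()
    return collected
```

Note that `tool_chat_history` itself already reverses each response, so a caller paging through the tool would take the first row of each result as the cursor rather than the last.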
+27
@@ -93,7 +93,34 @@ def main() -> None:
``{"id": ..., "token": ...}`` entries. One register + heartbeat
+ inbox poller per entry; messages from any workspace land in
the same agent inbox tagged with ``arrival_workspace_id``.
Subcommand:
``molecule-mcp doctor`` runs an onboarding diagnostic against the
current shell environment + platform reachability and exits.
Closes Ryan's #2934 item 6.
"""
# Subcommand dispatch — must come BEFORE env-var validation so
# `molecule-mcp doctor` can run on a partially-configured shell
# and tell the operator what's missing. Argv shapes:
# molecule-mcp → run server (this function's main path)
# molecule-mcp doctor → run diagnostic, exit
# molecule-mcp --help → print usage, exit (no other flags are
# supported yet)
if len(sys.argv) > 1:
if sys.argv[1] in ("doctor", "--doctor"):
import mcp_doctor
sys.exit(mcp_doctor.run())
if sys.argv[1] in ("--help", "-h", "help"):
print(
"molecule-mcp — Molecule AI universal MCP server\n\n"
"Usage:\n"
" molecule-mcp Run the MCP stdio server (registers + heartbeats)\n"
" molecule-mcp doctor Run onboarding diagnostic + exit\n\n"
"Required env: PLATFORM_URL, WORKSPACE_ID (or MOLECULE_WORKSPACES),\n"
" MOLECULE_WORKSPACE_TOKEN (or MOLECULE_WORKSPACE_TOKEN_FILE)\n",
)
sys.exit(0)
if not os.environ.get("PLATFORM_URL", "").strip():
_print_missing_env_help(
["PLATFORM_URL"],
+426
@@ -0,0 +1,426 @@
"""molecule-mcp doctor — diagnostic subcommand for first-run install.
Run via ``molecule-mcp doctor``. Prints a checklist of common
onboarding failure modes and concrete next-step suggestions for each
failed check.
Closes Ryan's #2934 item 6 ("Add a molecule-mcp doctor subcommand —
this single command would have saved me 30 of the 45 minutes").
Pairs with #2935 (Python>=3.11 callout, PATH guidance, TOKEN_FILE
support) — those fixed the snippet, this gives the operator a way to
self-diagnose when something still goes wrong.
Six checks, in operator-encounter order:
  1. Python version: the wheel requires >=3.11 (pip reports
     "no versions found" on older interpreters).
  2. Wheel install: molecule_runtime is importable and its version
     is reported.
  3. PATH for molecule-mcp: pip user-site installs land in
     ~/Library/Python/3.X/bin, which isn't on PATH in a fresh macOS
     shell. Most common cause of "claude mcp add can't find
     molecule-mcp".
  4. Env vars: PLATFORM_URL set; WORKSPACE_ID (or MOLECULE_WORKSPACES)
     set; auth token resolvable (env, *_FILE, or .auth_token).
  5. Platform health: GET ${PLATFORM_URL}/healthz returns 2xx.
     Catches DNS/firewall/wrong-scheme issues before the operator
     hits the real register call.
  6. Token auth: POST ${PLATFORM_URL}/registry/heartbeat with the
     resolved workspace_id + token returns 2xx. End-to-end auth
     verification. Uses heartbeat (an idempotent timestamp update)
     instead of register (an UPSERT that would clobber agent_card
     metadata), so the doctor is safe to run against a live
     workspace.
Each check prints one of:
[OK] <one-line status>
[WARN] <one-line status> next: <fix suggestion>
[FAIL] <one-line status> next: <fix suggestion>
Exit 0 if all pass or only WARNs; exit 1 if any FAIL — so the
subcommand is scriptable from CI / install-checks too.
Out of scope for now (deferred follow-ups):
- Claude Code-specific checks (parse ~/.claude.json, verify each
MCP entry is plugin-sourced + dev-channels flag is set). That's
a separate Claude-Code-specific doctor and lives in the
claude-code-channel plugin, not the universal-MCP doctor.
- Automated remediation (running the suggested fix). Doctor is
a diagnostic tool — it tells the operator what's wrong + how
to fix it, doesn't apply changes.
"""
from __future__ import annotations
import importlib
import importlib.metadata
import os
import shutil
import sys
from typing import Optional
# urllib avoids a hard dep on `requests` for the doctor — the real
# CLI already imports requests via mcp_heartbeat, but doctor should
# keep working even on a partial install where requests is missing
# (that itself is a finding worth surfacing).
from urllib import request as urllib_request
from urllib.error import URLError
# ANSI colors are friendly on TTYs; auto-disable on pipe / NO_COLOR
# for CI logs where the escape sequences clutter the diff.
def _color(name: str) -> str:
if not sys.stdout.isatty() or os.environ.get("NO_COLOR"):
return ""
return {
"green": "\033[32m",
"yellow": "\033[33m",
"red": "\033[31m",
"dim": "\033[2m",
"reset": "\033[0m",
}.get(name, "")
def _ok(label: str, msg: str) -> None:
print(f" {_color('green')}[OK]{_color('reset')} {label}: {msg}")
def _warn(label: str, msg: str, fix: str) -> None:
print(f" {_color('yellow')}[WARN]{_color('reset')} {label}: {msg}")
print(f" {_color('dim')}next:{_color('reset')} {fix}")
def _fail(label: str, msg: str, fix: str) -> None:
print(f" {_color('red')}[FAIL]{_color('reset')} {label}: {msg}")
print(f" {_color('dim')}next:{_color('reset')} {fix}")
# Each check returns a "ok" | "warn" | "fail" verdict so the caller
# can compute an exit code without re-walking the print stream.
Verdict = str # "ok" | "warn" | "fail"
def check_python_version() -> Verdict:
label = "Python version"
major, minor = sys.version_info[:2]
if (major, minor) >= (3, 11):
_ok(label, f"Python {major}.{minor} (wheel requires >=3.11)")
return "ok"
_fail(
label,
f"Python {major}.{minor} is below the wheel's >=3.11 floor",
"upgrade Python (brew install python@3.12 / apt install python3.12) "
"or run molecule-mcp via a 3.11+ venv.",
)
return "fail"
def check_wheel_install() -> Verdict:
label = "Wheel install"
try:
version = importlib.metadata.version("molecule-ai-workspace-runtime")
except importlib.metadata.PackageNotFoundError:
_fail(
label,
"molecule-ai-workspace-runtime not found in this interpreter's site-packages",
"pip install molecule-ai-workspace-runtime "
"(or pipx install molecule-ai-workspace-runtime to get the "
"binary on PATH automatically).",
)
return "fail"
try:
importlib.import_module("molecule_runtime.mcp_cli")
except ImportError as e:
_fail(
label,
f"package found ({version}) but `molecule_runtime.mcp_cli` won't import: {e}",
"reinstall the wheel (pip install --force-reinstall "
"molecule-ai-workspace-runtime); if it still fails, file "
"a bug with the traceback.",
)
return "fail"
_ok(label, f"molecule-ai-workspace-runtime=={version}")
return "ok"
def check_path_for_binary() -> Verdict:
label = "PATH for molecule-mcp"
found = shutil.which("molecule-mcp")
if found:
_ok(label, f"resolves to {found}")
return "ok"
# Not on PATH — work out where pip put it so the suggestion is
# actionable instead of generic.
user_base = os.environ.get("PYTHONUSERBASE")
if not user_base:
try:
import site
user_base = site.getuserbase()
except Exception:
user_base = None
hint = (
f"add `{user_base}/bin` to PATH"
if user_base
else "switch to `pipx install molecule-ai-workspace-runtime` so the "
"binary lands in pipx's managed bin/ on PATH"
)
_fail(
label,
"molecule-mcp not found on PATH",
f"{hint}, or invoke via `python -m molecule_runtime.mcp_cli` directly.",
)
return "fail"
def _resolve_token() -> tuple[Optional[str], Optional[str]]:
"""Return ``(token_value, source_label)`` if the operator's
environment exposes a token, else ``(None, None)``.
Single source of truth used by both ``check_env_vars()`` (which
only needs the source label) and ``check_token_auth()`` (which
needs the actual value to send a Bearer header). Keeping these
in one place means a future env-var addition only updates the
resolver — not two parallel readers that can drift.
"""
val = os.environ.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
if val:
return val, "env MOLECULE_WORKSPACE_TOKEN"
file_var = os.environ.get("MOLECULE_WORKSPACE_TOKEN_FILE", "").strip()
if file_var:
if os.path.isfile(file_var):
try:
from pathlib import Path as _Path
return (
_Path(file_var).read_text().strip(),
f"file {file_var} (via MOLECULE_WORKSPACE_TOKEN_FILE)",
)
except OSError:
return None, None
return None, None
# Per-runtime container path used by the in-platform path; rarely
# set on external setups but check anyway so the message is
# accurate for both shapes.
try:
import configs_dir
candidate = configs_dir.resolve() / ".auth_token"
if candidate.is_file():
try:
return candidate.read_text().strip(), f"file {candidate}"
except OSError:
return None, None
except Exception:
pass
return None, None
def _resolve_token_summary() -> Optional[str]:
"""Return just the source label (no secret value). Convenience
wrapper around :func:`_resolve_token` for callers that don't
need the value itself.
"""
_, label = _resolve_token()
return label
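The precedence in `_resolve_token` (inline env var first, then the `*_FILE` indirection) can be exercised with a small sketch that takes the environment as a plain mapping. `resolve_token` here is a simplified, hypothetical variant written for illustration: it omits the `.auth_token` fallback under the configs directory and does not check `os.path.isfile` before reading.

```python
from pathlib import Path
from typing import Mapping, Optional, Tuple


def resolve_token(env: Mapping[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Return (token, source_label), or (None, None) when nothing resolves."""
    value = env.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
    if value:
        return value, "env MOLECULE_WORKSPACE_TOKEN"
    file_var = env.get("MOLECULE_WORKSPACE_TOKEN_FILE", "").strip()
    if file_var:
        try:
            token = Path(file_var).read_text().strip()
        except OSError:
            # Missing or unreadable file: report "no token" rather than raise.
            return None, None
        return token, f"file {file_var} (via MOLECULE_WORKSPACE_TOKEN_FILE)"
    return None, None
```

Passing the environment in as an argument keeps the resolver testable without mutating `os.environ`, and the returned label can be printed safely because it never contains the secret itself.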
def check_env_vars() -> Verdict:
label = "Env vars"
missing: list[str] = []
if not os.environ.get("PLATFORM_URL", "").strip():
missing.append("PLATFORM_URL")
if not os.environ.get("WORKSPACE_ID", "").strip() and not os.environ.get(
"MOLECULE_WORKSPACES", "",
).strip():
missing.append("WORKSPACE_ID (or MOLECULE_WORKSPACES)")
token_summary = _resolve_token_summary()
if not token_summary and not os.environ.get("MOLECULE_WORKSPACES", "").strip():
# MOLECULE_WORKSPACES is a JSON-array env that bundles its
# own per-workspace tokens — if it's set we trust the
# resolver to validate.
missing.append(
"MOLECULE_WORKSPACE_TOKEN (or MOLECULE_WORKSPACE_TOKEN_FILE, or "
"/configs/.auth_token)",
)
if missing:
_fail(
label,
f"unset: {', '.join(missing)}",
"see the canvas Connect-External-Agent modal — the snippet "
"exports all three. Use MOLECULE_WORKSPACE_TOKEN_FILE for the "
"token to keep secrets out of shell history.",
)
return "fail"
_ok(
label,
f"PLATFORM_URL + WORKSPACE_ID set; token from {token_summary or 'MOLECULE_WORKSPACES'}",
)
return "ok"
def _http_get(url: str, timeout: float = 5.0) -> tuple[Optional[int], Optional[str]]:
"""Best-effort GET that swallows transport errors and returns
(status, error_message). Status is None when the request couldn't
complete; error_message is None when the request returned 2xx.
"""
try:
# Origin header — staging tenants enforce same-origin via WAF;
# /healthz tolerates either way but matching production headers
# surfaces auth-style 401s correctly during the doctor run.
req = urllib_request.Request(
url,
headers={"Origin": os.environ.get("PLATFORM_URL", "").rstrip("/")},
)
with urllib_request.urlopen(req, timeout=timeout) as resp:
return resp.status, None
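except URLError as e:
# HTTPError subclasses URLError and carries the HTTP status in .code;
# transport-level failures have no code, so status stays None for them.
return getattr(e, "code", None), str(e.reason if hasattr(e, "reason") else e)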
except Exception as e:
return None, str(e)
def check_platform_health() -> Verdict:
label = "Platform reachability"
raw = os.environ.get("PLATFORM_URL", "").strip()
# Keep the unstripped value so the trailing-slash warning below can
# still fire; ``base`` is what every request URL is built from.
base = raw.rstrip("/")
if not base:
_warn(label, "skipped (PLATFORM_URL unset — see Env vars)", "set PLATFORM_URL first")
return "warn"
if not base.startswith(("http://", "https://")):
_fail(
label,
f"PLATFORM_URL missing scheme: {base!r}",
"set PLATFORM_URL to include https:// — e.g. "
"PLATFORM_URL=https://your-tenant.staging.moleculesai.app",
)
return "fail"
if raw.endswith("/"):
_warn(
label,
"PLATFORM_URL has trailing slash (will be stripped automatically)",
"remove the trailing slash to match the snippet shape",
)
status, err = _http_get(f"{base}/healthz")
if status is None:
_fail(label, f"GET {base}/healthz failed: {err}", "check DNS + firewall + scheme")
return "fail"
if not (200 <= status < 300):
_fail(label, f"GET {base}/healthz returned HTTP {status}", "verify the tenant subdomain is correct + provisioned")
return "fail"
_ok(label, f"GET {base}/healthz → {status}")
return "ok"
def check_token_auth() -> Verdict:
"""Light auth check via POST /registry/heartbeat.
Why heartbeat and not register: register is an UPSERT — sending
it from doctor would clobber the workspace's actual agent_card
(name, description, version) until the real agent next calls
register. That's an invisible production-disruption: someone
runs ``molecule-mcp doctor`` against a live workspace and the
canvas briefly displays "doctor-probe" as the agent name.
Heartbeat only updates last_heartbeat_at (and clears
awaiting_agent if needed) — that's exactly what a normal
molecule-mcp boot does every 20s, so an extra heartbeat from
the doctor is indistinguishable from background traffic.
Skipped when env vars failed earlier so the operator isn't shown
a redundant 401.
"""
label = "Token auth"
base = os.environ.get("PLATFORM_URL", "").strip().rstrip("/")
workspace_id = os.environ.get("WORKSPACE_ID", "").strip()
token, source_label = _resolve_token()
if not (base and workspace_id and token):
_warn(label, "skipped (Env vars must pass first)", "fix Env vars, re-run")
return "warn"
import json
body = json.dumps({"id": workspace_id}).encode()
req = urllib_request.Request(
f"{base}/registry/heartbeat",
data=body,
method="POST",
headers={
"Authorization": f"Bearer {token}",
"Content-Type": "application/json",
"Origin": base,
},
)
try:
with urllib_request.urlopen(req, timeout=8.0) as resp:
status = resp.status
except URLError as e:
# Pull HTTP code from HTTPError; transport errors don't have one.
status = getattr(e, "code", None)
err = str(e.reason if hasattr(e, "reason") else e)
if status is None:
_fail(label, f"POST {base}/registry/heartbeat failed: {err}", "check network")
return "fail"
except Exception as e:
_fail(label, f"POST heartbeat failed: {e}", "check network")
return "fail"
if status == 401:
_fail(
label,
"401 Unauthorized — token rejected",
"tokens are shown only once at workspace-create time; "
"re-create the workspace OR rotate via canvas Tokens tab.",
)
return "fail"
if status == 404:
_fail(
label,
f"404 — workspace_id {workspace_id} not found on {base}",
"verify WORKSPACE_ID matches a real workspace + the tenant "
"subdomain in PLATFORM_URL.",
)
return "fail"
if not (200 <= status < 300):
_fail(label, f"POST heartbeat returned HTTP {status}", "see platform logs")
return "fail"
_ok(label, f"POST {base}/registry/heartbeat → {status} (token from {source_label})")
return "ok"
# Back-compat alias: the previous name was check_register, but the
# implementation switched to a non-mutating heartbeat probe (see
# check_token_auth's docstring). Kept so external test suites or
# pinned-import scripts don't break on the rename.
check_register = check_token_auth
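The status handling in check_token_auth relies on a standard-library detail: HTTPError is a subclass of URLError and carries the HTTP status in `.code`, while plain transport failures have no such attribute. The following is a minimal sketch of that distinction, not the module's own code; the helper name `status_and_reason` and the example URL are hypothetical.

```python
from urllib.error import HTTPError, URLError


def status_and_reason(exc: URLError):
    """Return (http_status_or_None, reason_text) for a urllib failure."""
    # HTTPError exposes .code; a bare URLError (DNS, refused, timeout) does not.
    status = getattr(exc, "code", None)
    reason = str(exc.reason if hasattr(exc, "reason") else exc)
    return status, reason


# An HTTP-level rejection: the server answered, so a status code exists.
rejected = HTTPError(
    "https://tenant.example/registry/heartbeat", 401, "Unauthorized", None, None
)
# A transport-level failure: no response was received at all.
unreachable = URLError("connection refused")

print(status_and_reason(rejected))
print(status_and_reason(unreachable))
```

With this shape a caller can branch on `status is None` for network problems and on the numeric code for 401/404 hints, which is the same split the doctor check makes.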
CHECKS = [
check_python_version,
check_wheel_install,
check_path_for_binary,
check_env_vars,
check_platform_health,
check_token_auth,
]
def run() -> int:
"""Run all checks and return a process exit code (0 ok, 1 if any fail)."""
print("molecule-mcp doctor — onboarding diagnostic")
print()
verdicts = []
for chk in CHECKS:
try:
verdicts.append(chk())
except Exception as e:
# A buggy check shouldn't kill the rest of the doctor run.
print(f" [BUG] {chk.__name__}: unexpected {type(e).__name__}: {e}")
verdicts.append("fail")
print()
fails = sum(1 for v in verdicts if v == "fail")
warns = sum(1 for v in verdicts if v == "warn")
if fails:
print(f"{fails} check(s) failed, {warns} warning(s). Fix the FAIL items above and re-run.")
return 1
if warns:
print(f"All required checks passed; {warns} warning(s) — review the next-step hints.")
return 0
print("All checks passed.")
return 0
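The run() loop above isolates each check so that one crashing check is reported as a failure without aborting the rest, and only "fail" verdicts affect the exit code. A reduced sketch of that pattern follows; `run_checks` and the three toy checks are hypothetical stand-ins, and the real doctor also prints per-check lines and a summary.

```python
from typing import Callable, List


def run_checks(checks: List[Callable[[], str]]) -> int:
    """Run every check; return 1 if any verdict is "fail", else 0."""
    verdicts = []
    for chk in checks:
        try:
            verdicts.append(chk())
        except Exception as exc:
            # A buggy check is recorded as a failure, and the loop continues.
            print(f"  [BUG] {chk.__name__}: unexpected {type(exc).__name__}: {exc}")
            verdicts.append("fail")
    # Warnings are informational only; they never change the exit code.
    return 1 if "fail" in verdicts else 0


def passing() -> str:
    return "ok"


def warning() -> str:
    return "warn"


def crashing() -> str:
    raise RuntimeError("boom")


exit_clean = run_checks([passing, warning])
exit_crash = run_checks([passing, crashing, warning])
```

Treating warnings as exit code 0 keeps the command usable in scripts that only care about hard failures.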
+98 -4
@@ -35,9 +35,15 @@ def resolve_workspaces() -> tuple[list[tuple[str, str]], list[str]]:
N workspaces). When set, ``WORKSPACE_ID`` / ``MOLECULE_WORKSPACE_TOKEN``
are IGNORED — the JSON is the source of truth.
- 2. Single-workspace fallback — ``WORKSPACE_ID`` env var + token from
- ``MOLECULE_WORKSPACE_TOKEN`` or ``${CONFIGS_DIR}/.auth_token``.
- This is the pre-existing path; back-compat exact.
+ 2. Single-workspace fallback — ``WORKSPACE_ID`` env var + token
+ resolved in this order:
+ a. ``MOLECULE_WORKSPACE_TOKEN`` (inline env — convenient but
+ leaks into shell history + plaintext MCP-host config).
+ b. ``MOLECULE_WORKSPACE_TOKEN_FILE`` (path to a file holding
+ the token — operator can keep it 0600 in their home dir;
+ survives shell-history scrubs).
+ c. ``${CONFIGS_DIR}/.auth_token`` (in-container runtimes —
+ the platform writes this on provision).
Returns ``(workspaces, errors)``:
* ``workspaces``: list of ``(workspace_id, token)`` — non-empty
@@ -98,16 +104,94 @@ def resolve_workspaces() -> tuple[list[tuple[str, str]], list[str]]:
wsid = os.environ.get("WORKSPACE_ID", "").strip()
if not wsid:
return [], ["WORKSPACE_ID (or MOLECULE_WORKSPACES) is required"]
# Token resolution order (#2934): inline env → file path → CONFIGS_DIR
# default. The file-path option exists so operators can keep the
# bearer out of shell history and out of MCP-host config plaintext
# (e.g. ~/.claude.json) — set MOLECULE_WORKSPACE_TOKEN_FILE to a
# 0600 file containing the token. The CONFIGS_DIR/.auth_token
# fallback predates this and stays for in-container runtimes.
tok = os.environ.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
if not tok:
tok, tf_err = _read_token_from_file_env()
if tf_err:
# Operator explicitly pointed TOKEN_FILE somewhere — surface
# the SPECIFIC failure (path doesn't exist, isn't readable,
# or holds a blank file) instead of falling through to the
# generic "set one of these three vars" message. Otherwise
# they get exactly the silent failure mode #2934 flagged
# ("a new user has no chance"). Skip the CONFIGS_DIR
# fallback in this case — the operator's intent is clearly
# to use the file path; deferring to a different source
# would mask their config error.
return [], [tf_err]
if not tok:
tok = read_token_file()
if not tok:
return [], [
- "MOLECULE_WORKSPACE_TOKEN (or CONFIGS_DIR/.auth_token) is required"
+ "MOLECULE_WORKSPACE_TOKEN, MOLECULE_WORKSPACE_TOKEN_FILE, or "
+ "CONFIGS_DIR/.auth_token is required"
]
return [(wsid, tok)], []
def _read_token_from_file_env() -> tuple[str, str]:
"""Read the token from the file path in MOLECULE_WORKSPACE_TOKEN_FILE.
Returns ``(token, error)``:
* env var unset/blank → ``("", "")`` — caller falls through silently
to the next source; the operator didn't ask for this path.
* file open/read fails (missing, permission denied, decode error)
→ ``("", "<specific error>")`` — caller surfaces it directly.
The operator EXPLICITLY pointed at this path, so a generic
fallthrough error would mask their config bug (#2934).
* file is blank → ``("", "<blank file error>")`` — same reasoning.
* file read returns junk with internal whitespace/newlines (e.g.
a CSV cell, accidental multi-token paste) → ``("", "<error>")``
rather than concatenating into a malformed bearer that 401s
against the platform with no context.
* happy path → ``("<token>", "")``.
"""
path = os.environ.get("MOLECULE_WORKSPACE_TOKEN_FILE", "").strip()
if not path:
return "", ""
try:
with open(path, encoding="utf-8") as fh:
raw = fh.read()
except FileNotFoundError:
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE points to {path!r} which "
f"does not exist"
)
except PermissionError:
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} is not readable "
f"(permission denied)"
)
except OSError as exc:
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} could not be read: "
f"{exc}"
)
except UnicodeDecodeError:
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} is not valid UTF-8"
)
tok = raw.strip()
if not tok:
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} is empty"
)
# Reject tokens with internal whitespace — a CSV cell or accidental
# multi-token paste would otherwise become a malformed bearer that
# 401s against the platform with no diagnostic.
if any(ch.isspace() for ch in tok):
return "", (
f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} contains internal "
f"whitespace — expected a single token"
)
return tok, ""
def print_missing_env_help(missing: list[str], have_token_file: bool) -> None:
print("molecule-mcp: missing required environment.\n", file=sys.stderr)
print("Set the following before running molecule-mcp:", file=sys.stderr)
@@ -123,6 +207,16 @@ def print_missing_env_help(missing: list[str], have_token_file: bool) -> None:
"(canvas → Tokens tab)",
file=sys.stderr,
)
print(
" OR set MOLECULE_WORKSPACE_TOKEN_FILE"
" to a path that holds the token",
file=sys.stderr,
)
print(
" (keeps the secret out of shell"
" history and MCP-host config plaintext)",
file=sys.stderr,
)
print("", file=sys.stderr)
print(f"Currently missing: {', '.join(missing)}", file=sys.stderr)
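The resolution order described in this diff (inline env, then token file, then the CONFIGS_DIR default) can be sketched in isolation as below. This is an illustrative reduction under stated assumptions, not the module's code: `resolve_token` is a hypothetical name, the environment is passed in as a mapping, and the CONFIGS_DIR lookup is replaced by a plain `configs_token` argument.

```python
import os
import tempfile
from typing import Mapping, Optional, Tuple


def resolve_token(env: Mapping[str, str], configs_token: Optional[str] = None) -> Tuple[str, str]:
    """Return (token, error); on success the error string is empty."""
    inline = env.get("MOLECULE_WORKSPACE_TOKEN", "").strip()
    if inline:
        return inline, ""
    path = env.get("MOLECULE_WORKSPACE_TOKEN_FILE", "").strip()
    if path:
        # The operator pointed at a file explicitly, so any problem with it
        # is reported directly instead of falling through to another source.
        try:
            with open(path, encoding="utf-8") as fh:
                tok = fh.read().strip()
        except (OSError, UnicodeDecodeError) as exc:
            return "", f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} could not be read: {exc}"
        if not tok:
            return "", f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} is empty"
        if any(ch.isspace() for ch in tok):
            return "", f"MOLECULE_WORKSPACE_TOKEN_FILE={path!r} contains internal whitespace"
        return tok, ""
    if configs_token and configs_token.strip():
        return configs_token.strip(), ""
    return "", "no token source configured"


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "token")
    with open(good, "w", encoding="utf-8") as fh:
        fh.write("abc123\n")
    pasted = os.path.join(tmp, "pasted")
    with open(pasted, "w", encoding="utf-8") as fh:
        fh.write("abc 123\n")

    from_file = resolve_token({"MOLECULE_WORKSPACE_TOKEN_FILE": good})
    inline_wins = resolve_token(
        {"MOLECULE_WORKSPACE_TOKEN": "inline", "MOLECULE_WORKSPACE_TOKEN_FILE": good}
    )
    rejected = resolve_token({"MOLECULE_WORKSPACE_TOKEN_FILE": pasted})
    no_fallback = resolve_token(
        {"MOLECULE_WORKSPACE_TOKEN_FILE": os.path.join(tmp, "missing")},
        configs_token="fallback",
    )
```

The last case shows the design choice the diff comments argue for: a broken explicit file path yields an error even though another token source is available.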
+81
@@ -273,6 +273,87 @@ class TestSendA2AMessage:
assert _TEST_PEER_ID in result
assert "/workspaces/" in result and "/a2a" in result
async def test_poll_queued_envelope_returns_success_string(self):
"""Issue #2967: workspace-server's poll-mode short-circuit returns
{status:"queued", delivery_mode:"poll", method:...} when the peer
has no URL to dispatch to. Pre-fix the bare send_a2a_message parser
only knew about JSON-RPC {result, error} keys, so this fell through
to the 'unexpected response shape' error path → callers retried,
peer got duplicate delegations.
Pin: poll-queued envelope returns a clean success string that does
NOT start with _A2A_ERROR_PREFIX, so callers route it through the
normal-outcome path. Verified discriminating: the not-startswith
assertion on the error prefix FAILS on the old code (which returned an
error-prefixed string) and PASSES on the new code.
"""
import a2a_client
resp = _make_response(200, {
"status": "queued",
"delivery_mode": "poll",
"method": "message/send",
})
mock_client = _make_mock_client(post_resp=resp)
with patch("a2a_client.httpx.AsyncClient", return_value=mock_client):
result = await a2a_client.send_a2a_message(_TEST_PEER_ID, "task")
# Discriminating: pre-fix returned a string that startswith
# _A2A_ERROR_PREFIX, so this assertion would have FAILED on the
# old code. New code returns a queued-success string.
assert not result.startswith(a2a_client._A2A_ERROR_PREFIX), (
f"poll-queued envelope must not be tagged as A2A error; got: {result!r}"
)
assert "queued" in result.lower()
assert "poll" in result.lower()
# The method is included so a structured-log scraper can route by
# protocol verb if needed.
assert "message/send" in result
async def test_poll_queued_envelope_with_other_method(self):
"""Same envelope but a different a2a_method (the future could add
message/sendStream or similar). Pin that the parser doesn't hardcode
message/send — whatever method the server echoed is preserved.
"""
import a2a_client
resp = _make_response(200, {
"status": "queued",
"delivery_mode": "poll",
"method": "message/sendStream",
})
mock_client = _make_mock_client(post_resp=resp)
with patch("a2a_client.httpx.AsyncClient", return_value=mock_client):
result = await a2a_client.send_a2a_message(_TEST_PEER_ID, "task")
assert not result.startswith(a2a_client._A2A_ERROR_PREFIX)
assert "message/sendStream" in result
async def test_status_queued_without_poll_mode_still_falls_through(self):
"""Defensive: only the {status:"queued", delivery_mode:"poll"} pair
triggers the queued-success branch. A response with status:"queued"
but a different delivery_mode (or none) is still 'unexpected':
we don't want to silently swallow a future server bug that emits
a partial envelope. Pin that both keys are required.
"""
import a2a_client
resp = _make_response(200, {
"status": "queued",
# delivery_mode missing
"method": "message/send",
})
mock_client = _make_mock_client(post_resp=resp)
with patch("a2a_client.httpx.AsyncClient", return_value=mock_client):
result = await a2a_client.send_a2a_message(_TEST_PEER_ID, "task")
# Falls through — must STILL be tagged as error.
assert result.startswith(a2a_client._A2A_ERROR_PREFIX)
assert "unexpected response shape" in result
async def test_exception_returns_error_prefix_and_message(self):
"""Network exception → returns _A2A_ERROR_PREFIX + exception text."""
import a2a_client
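The three poll-mode tests above pin one parsing rule: the queued branch sits after the JSON-RPC result/error cases and fires only when both `status` and `delivery_mode` match. A minimal sketch of that branch order follows. The function name `classify`, the prefix value, and the exact success wording are assumptions for illustration; the real parser lives in a2a_client.send_a2a_message and is not shown in this diff.

```python
from typing import Any, Mapping

ERROR_PREFIX = "[A2A_ERROR] "  # hypothetical stand-in for _A2A_ERROR_PREFIX


def classify(payload: Mapping[str, Any]) -> str:
    """Map a response body to a success string or an error-prefixed string."""
    if "result" in payload:
        return f"ok: {payload['result']}"
    if "error" in payload:
        return f"{ERROR_PREFIX}{payload['error']}"
    # Both keys must match; a partial envelope stays on the error path.
    if payload.get("status") == "queued" and payload.get("delivery_mode") == "poll":
        method = payload.get("method", "unknown")
        return f"queued for poll-mode peer (method={method})"
    return f"{ERROR_PREFIX}unexpected response shape: {dict(payload)!r}"


queued = classify({"status": "queued", "delivery_mode": "poll", "method": "message/send"})
partial = classify({"status": "queued", "method": "message/send"})
```

Echoing the server-provided method instead of hardcoding one keeps the success string accurate if other methods are queued later.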
+4 -4
@@ -241,7 +241,7 @@ class TestToolListPeersAggregation:
return [{"id": "2222bbbb-2222-2222-2222-222222222222", "name": "bob", "status": "online", "role": "dev"}], None
return [], None
- with patch("a2a_tools.get_peers_with_diagnostic", side_effect=fake_get_peers):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", side_effect=fake_get_peers):
output = await a2a_tools.tool_list_peers()
assert "alice" in output
@@ -263,7 +263,7 @@ class TestToolListPeersAggregation:
assert source_workspace_id == a2a_client.WORKSPACE_ID
return [{"id": "1111aaaa-1111-1111-1111-111111111111", "name": "alice", "status": "online", "role": "ops"}], None
- with patch("a2a_tools.get_peers_with_diagnostic", side_effect=fake_get_peers):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", side_effect=fake_get_peers):
output = await a2a_tools.tool_list_peers()
assert "alice" in output
@@ -286,7 +286,7 @@ class TestToolListPeersAggregation:
seen.append(source_workspace_id)
return [{"id": "1111aaaa-1111-1111-1111-111111111111", "name": "alice", "status": "online", "role": "ops"}], None
- with patch("a2a_tools.get_peers_with_diagnostic", side_effect=fake_get_peers):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", side_effect=fake_get_peers):
output = await a2a_tools.tool_list_peers(source_workspace_id=ws_a)
assert seen == [ws_a]
@@ -309,7 +309,7 @@ class TestToolListPeersAggregation:
return [], "auth failed"
return [], "platform 5xx"
- with patch("a2a_tools.get_peers_with_diagnostic", side_effect=fake_get_peers):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", side_effect=fake_get_peers):
out = await a2a_tools.tool_list_peers()
assert "[aaaa1111] auth failed" in out
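The patch-target changes in these tests follow the usual unittest.mock rule: patch the name in the module where the function under test looks it up. Since the tests still call a2a_tools.tool_list_peers while patching a2a_tools_messaging, it appears that a2a_tools re-exports functions now defined in a2a_tools_messaging; that is an inference from the diff, not something it states. The sketch below reproduces the effect with two throwaway modules, `impl_mod` and `facade_mod`, both hypothetical.

```python
import sys
import types
from unittest.mock import patch

# "impl_mod" plays the role of the module that defines the functions.
impl = types.ModuleType("impl_mod")
exec(
    "def fetch():\n"
    "    return 'real'\n"
    "\n"
    "def tool():\n"
    "    return fetch()\n",
    impl.__dict__,
)
sys.modules["impl_mod"] = impl

# "facade_mod" only re-exports the same objects under its own namespace.
facade = types.ModuleType("facade_mod")
facade.fetch = impl.fetch
facade.tool = impl.tool
sys.modules["facade_mod"] = facade

# Patching the facade rebinds facade_mod.fetch only; tool() still resolves
# fetch through impl_mod's globals, so the real function runs.
with patch("facade_mod.fetch", return_value="fake"):
    via_facade_patch = facade.tool()

# Patching the defining module changes the name tool() actually looks up.
with patch("impl_mod.fetch", return_value="fake"):
    via_impl_patch = facade.tool()

print(via_facade_patch, via_impl_patch)
```

This is why a module split can leave call sites untouched while every patch target string has to move.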
+77 -77
@@ -453,14 +453,14 @@ class TestToolSendMessageToUser:
async def test_success_200_returns_sent_message(self):
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user("Hello user!")
assert result == "Message sent to user"
async def test_non_200_returns_status_code_in_error(self):
import a2a_tools
mc = _make_http_mock(post_resp=_resp(503, {}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user("Hello user!")
assert "503" in result
assert "Error" in result
@@ -468,7 +468,7 @@ class TestToolSendMessageToUser:
async def test_exception_returns_error_message(self):
import a2a_tools
mc = _make_http_mock(post_exc=RuntimeError("platform unreachable"))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user("Hi!")
assert "Error sending message" in result
assert "platform unreachable" in result
@@ -495,7 +495,7 @@ class TestToolSendMessageToUser:
mc = _make_http_mock(post_resp=notify_resp)
mc.post = AsyncMock(side_effect=[upload_resp, notify_resp])
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user(
"Done — see attached.",
attachments=[str(f)],
@@ -523,7 +523,7 @@ class TestToolSendMessageToUser:
# with a half-rendered attachment chip.
import a2a_tools
mc = _make_http_mock()
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user(
"Hi", attachments=["/no/such/file.zip"],
)
@@ -541,7 +541,7 @@ class TestToolSendMessageToUser:
mc = _make_http_mock()
mc.post = AsyncMock(return_value=upload_resp)
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_send_message_to_user(
"Hi", attachments=[str(f)],
)
@@ -555,7 +555,7 @@ class TestToolSendMessageToUser:
# an `attachments` field added to the notify body.
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
+ with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
await a2a_tools.tool_send_message_to_user("plain text")
body = mc.post.await_args.kwargs.get("json") or {}
assert body == {"message": "plain text"}
@@ -570,7 +570,7 @@ class TestToolListPeers:
async def test_true_empty_returns_no_peers_message_without_diagnostic(self):
"""200 + empty list → 'no peers in the platform registry' (no failure)."""
import a2a_tools
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=([], None)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=([], None)):
result = await a2a_tools.tool_list_peers()
# The new wording explicitly says no peers exist (no parent/sibling/child).
# Avoids the misleading "may be isolated" hint when discovery succeeded.
@@ -582,7 +582,7 @@ class TestToolListPeers:
"""401/403 → tool_list_peers must surface the auth failure + restart hint, not 'isolated'."""
import a2a_tools
diag = "Authentication to platform failed (HTTP 401). Restart the workspace to re-mint."
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=([], diag)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=([], diag)):
result = await a2a_tools.tool_list_peers()
assert "401" in result
assert "Authentication" in result
@@ -593,7 +593,7 @@ class TestToolListPeers:
"""404 → tool_list_peers tells the user re-registration is needed."""
import a2a_tools
diag = "Workspace ID ws-test is not registered with the platform (HTTP 404). Re-register."
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=([], diag)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=([], diag)):
result = await a2a_tools.tool_list_peers()
assert "404" in result
assert "registered" in result.lower()
@@ -602,7 +602,7 @@ class TestToolListPeers:
"""5xx → 'Platform error' surfaced; agent / user can correctly route to oncall."""
import a2a_tools
diag = "Platform error: HTTP 503."
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=([], diag)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=([], diag)):
result = await a2a_tools.tool_list_peers()
assert "503" in result
assert "Platform error" in result
@@ -611,7 +611,7 @@ class TestToolListPeers:
"""Network error → operator can tell that the workspace can't reach the platform at all."""
import a2a_tools
diag = "Cannot reach platform at http://platform.example: timed out"
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=([], diag)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=([], diag)):
result = await a2a_tools.tool_list_peers()
assert "Cannot reach platform" in result
assert "timed out" in result
@@ -624,7 +624,7 @@ class TestToolListPeers:
{"id": "ws-1", "name": "Alpha", "status": "online", "role": "worker"},
{"id": "ws-2", "name": "Beta", "status": "idle", "role": "analyst"},
]
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=(peers, None)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=(peers, None)):
result = await a2a_tools.tool_list_peers()
assert "Alpha" in result
@@ -641,7 +641,7 @@ class TestToolListPeers:
# Clear any prior cache entries for these IDs
a2a_tools._peer_names.pop("ws-cache-test", None)
peers = [{"id": "ws-cache-test", "name": "CacheMe", "status": "online", "role": "w"}]
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=(peers, None)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=(peers, None)):
await a2a_tools.tool_list_peers()
assert a2a_tools._peer_names.get("ws-cache-test") == "CacheMe"
@@ -651,7 +651,7 @@ class TestToolListPeers:
import a2a_tools
peers = [{"id": "ws-3", "name": "Gamma"}] # no status, no role
- with patch("a2a_tools.get_peers_with_diagnostic", return_value=(peers, None)):
+ with patch("a2a_tools_messaging.get_peers_with_diagnostic", return_value=(peers, None)):
result = await a2a_tools.tool_list_peers()
assert "Gamma" in result
@@ -669,7 +669,7 @@ class TestToolGetWorkspaceInfo:
import a2a_tools
info = {"id": "ws-test", "name": "My Workspace", "status": "online"}
- with patch("a2a_tools.get_workspace_info", return_value=info):
+ with patch("a2a_tools_messaging.get_workspace_info", return_value=info):
result = await a2a_tools.tool_get_workspace_info()
parsed = json.loads(result)
@@ -678,7 +678,7 @@ class TestToolGetWorkspaceInfo:
async def test_returns_error_dict_as_json(self):
import a2a_tools
- with patch("a2a_tools.get_workspace_info", return_value={"error": "not found"}):
+ with patch("a2a_tools_messaging.get_workspace_info", return_value={"error": "not found"}):
result = await a2a_tools.tool_get_workspace_info()
parsed = json.loads(result)
@@ -702,9 +702,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-1"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Remember this", scope="local")
data = json.loads(result)
@@ -716,9 +716,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-2"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Remember this", scope="INVALID")
data = json.loads(result)
@@ -728,9 +728,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-3"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("Team info", scope="TEAM")
data = json.loads(result)
@@ -741,9 +741,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-4"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=True):
result = await a2a_tools.tool_commit_memory("Global info", scope="GLOBAL")
data = json.loads(result)
@@ -753,9 +753,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(200, {"id": "mem-5"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
data = json.loads(result)
@@ -766,9 +766,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-6"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
data = json.loads(result)
@@ -779,9 +779,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(400, {"error": "bad request payload"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
assert "Error" in result
@@ -791,9 +791,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_exc=RuntimeError("storage failure"))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("info")
assert "Error saving memory" in result
@@ -808,9 +808,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-poison"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("poisoned GLOBAL memory", scope="GLOBAL")
# Must NOT have called the platform — early rejection
@@ -824,9 +824,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-7"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=False), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=False), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
result = await a2a_tools.tool_commit_memory("should be denied", scope="LOCAL")
mc.post.assert_not_called()
@@ -838,9 +838,9 @@ class TestToolCommitMemory:
import a2a_tools
mc = _make_http_mock(post_resp=_resp(201, {"id": "mem-8"}))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_write_permission", return_value=True), \
- patch("a2a_tools._is_root_workspace", return_value=False):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_write_permission", return_value=True), \
+ patch("a2a_tools_memory._is_root_workspace", return_value=False):
await a2a_tools.tool_commit_memory("test content", scope="LOCAL")
call_kwargs = mc.post.call_args.kwargs
@@ -865,8 +865,8 @@ class TestToolRecallMemory:
{"scope": "TEAM", "content": "We use Python 3.11"},
]
mc = _make_http_mock(get_resp=_resp(200, memories))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_read_permission", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="capital")
assert "[LOCAL]" in result
@@ -878,8 +878,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_read_permission", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="anything")
assert result == "No memories found."
@@ -890,8 +890,8 @@ class TestToolRecallMemory:
payload = {"error": "search unavailable"}
mc = _make_http_mock(get_resp=_resp(200, payload))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_read_permission", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory()
parsed = json.loads(result)
@@ -901,8 +901,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_exc=RuntimeError("search service down"))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_read_permission", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
result = await a2a_tools.tool_recall_memory(query="test")
assert "Error recalling memory" in result
@@ -913,8 +913,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
- with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
- patch("a2a_tools._check_memory_read_permission", return_value=True):
+ with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
+ patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory(query="paris", scope="local")
call_kwargs = mc.get.call_args.kwargs
@@ -928,8 +928,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory()
call_kwargs = mc.get.call_args.kwargs
@@ -942,8 +942,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=True):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=True):
await a2a_tools.tool_recall_memory(scope="team")
call_kwargs = mc.get.call_args.kwargs
@@ -960,8 +960,8 @@ class TestToolRecallMemory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, [{"scope": "GLOBAL", "content": "secret"}]))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools._check_memory_read_permission", return_value=False):
with patch("a2a_tools_memory.httpx.AsyncClient", return_value=mc), \
patch("a2a_tools_memory._check_memory_read_permission", return_value=False):
result = await a2a_tools.tool_recall_memory(query="secret")
mc.get.assert_not_called()
@@ -994,7 +994,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock()
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id="")
mc.get.assert_not_called()
@@ -1006,7 +1006,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
await a2a_tools.tool_chat_history(peer_id=_PEER)
url, kwargs = mc.get.call_args.args[0], mc.get.call_args.kwargs
@@ -1023,7 +1023,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
await a2a_tools.tool_chat_history(peer_id=_PEER, limit=10000)
params = mc.get.call_args.kwargs["params"]
@@ -1035,7 +1035,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
await a2a_tools.tool_chat_history(peer_id=_PEER, limit=0)
assert mc.get.call_args.kwargs["params"]["limit"] == "20"
@@ -1044,7 +1044,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
await a2a_tools.tool_chat_history(
peer_id=_PEER, before_ts="2026-05-01T00:00:00Z",
)
@@ -1063,7 +1063,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, []))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
# Exact-equality on the JSON literal (per assert-exact memory) —
@@ -1084,7 +1084,7 @@ class TestChatHistory:
{"id": "act-1", "created_at": "2026-05-01T00:01:00Z"},
]
mc = _make_http_mock(get_resp=_resp(200, rows))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
out = json.loads(result)
@@ -1097,7 +1097,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(400, {"error": "peer_id must be a UUID"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id="bad")
assert "peer_id must be a UUID" in result
@@ -1108,7 +1108,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(500, {"error": "internal"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
assert result.startswith("Error:")
@@ -1121,7 +1121,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_exc=httpx.ConnectError("network down"))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
assert result.startswith("Error:")
@@ -1135,7 +1135,7 @@ class TestChatHistory:
import a2a_tools
mc = _make_http_mock(get_resp=_resp(200, {"unexpected": "shape"}))
with patch("a2a_tools.httpx.AsyncClient", return_value=mc):
with patch("a2a_tools_messaging.httpx.AsyncClient", return_value=mc):
result = await a2a_tools.tool_chat_history(peer_id=_PEER)
assert result.startswith("Error:")
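The retargeted patches in the hunks above follow the usual `unittest.mock` rule of patching a name in the namespace where it is looked up. A minimal sketch of why a patch on the re-exporting module stops having any effect once the implementation moves. It assumes two hypothetical in-memory modules: `demo_impl` and `demo_facade` are stand-ins for illustration, not the real `a2a_tools_memory` / `a2a_tools`.

```python
import sys
import types
from unittest.mock import patch

# Hypothetical implementation module: `tool` resolves `_check` in its own globals.
impl = types.ModuleType("demo_impl")
exec(
    "def _check():\n"
    "    return False\n"
    "\n"
    "def tool():\n"
    "    return 'allowed' if _check() else 'denied'\n",
    impl.__dict__,
)

# Hypothetical facade that only re-exports the very same objects.
facade = types.ModuleType("demo_facade")
facade._check = impl._check
facade.tool = impl.tool

# Register both so patch() can resolve the dotted target strings.
sys.modules["demo_impl"] = impl
sys.modules["demo_facade"] = facade


def call_with_patch(target: str) -> str:
    """Call the tool through the facade while `target` is patched to return True."""
    with patch(target, return_value=True):
        return facade.tool()


if __name__ == "__main__":
    # Patching the facade rebinds only the facade's name; tool() never sees it.
    print(call_with_patch("demo_facade._check"))
    # Patching the implementation module rebinds the name tool() actually reads.
    print(call_with_patch("demo_impl._check"))
```

The same reasoning would apply to `_check_memory_read_permission`: a patch on the old module path still applies cleanly but no longer influences the code under test.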
@@ -0,0 +1,150 @@
"""Tests for `_enrich_inbound_for_agent` — the poll-path companion to
the push-path enrichment in `a2a_mcp_server._build_channel_notification`.
The MCP poll path (inbox_peek / wait_for_message) returns
`InboxMessage.to_dict()`, which has `activity_id, text, peer_id, kind,
method, created_at` but NOT the registry-resolved `peer_name`,
`peer_role`, or `agent_card_url`. The receiving agent then sees a
plain message and can't tell who's writing — breaking the universal
contract documented in `a2a_mcp_server.py:303-345` ("In both paths
the same fields apply").
The enrichment helper closes that gap. These tests pin:
- canvas_user (peer_id="") passes through unchanged
- peer_agent with cache hit gets peer_name + peer_role + agent_card_url
- peer_agent with cache miss still gets agent_card_url (constructable
from peer_id alone)
- a2a_client unavailable (test harness without registry) degrades
gracefully — agent still gets the bare envelope
"""
from __future__ import annotations
import os
# a2a_client.py reads WORKSPACE_ID at import time and raises if it's
# unset. Stamp a stub before any test pulls in a2a_tools (which transitively
# imports a2a_client). conftest.py mocks the SDK but not this env var.
os.environ.setdefault("WORKSPACE_ID", "00000000-0000-0000-0000-000000000001")
import sys
import types
from unittest.mock import patch
PEER_UUID = "11111111-2222-3333-4444-555555555555"
def test_canvas_user_passes_through_unchanged():
from a2a_tools import _enrich_inbound_for_agent
base = {
"activity_id": "act-1",
"text": "hello from canvas",
"peer_id": "",
"kind": "canvas_user",
"method": "message/send",
"created_at": "2026-05-05T11:00:00Z",
}
out = _enrich_inbound_for_agent(dict(base))
# Plain pass-through — no enrichment fields added for canvas_user.
assert out == base
assert "peer_name" not in out
assert "peer_role" not in out
assert "agent_card_url" not in out
def test_peer_agent_cache_hit_adds_name_role_and_card_url():
from a2a_tools import _enrich_inbound_for_agent
record = {"name": "ops-agent", "role": "sre"}
card_url = f"https://platform.example/registry/{PEER_UUID}/agent-card"
with patch(
"a2a_client.enrich_peer_metadata_nonblocking",
return_value=record,
), patch(
"a2a_client._agent_card_url_for",
return_value=card_url,
):
out = _enrich_inbound_for_agent({
"activity_id": "act-2",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:01:00Z",
})
assert out["peer_name"] == "ops-agent"
assert out["peer_role"] == "sre"
assert out["agent_card_url"] == card_url
def test_peer_agent_cache_miss_still_gets_agent_card_url():
"""agent_card_url is constructable from peer_id alone — surface it
even when registry enrichment misses, so the receiving agent has a
single endpoint to hit for the peer's full capability list."""
from a2a_tools import _enrich_inbound_for_agent
card_url = f"https://platform.example/registry/{PEER_UUID}/agent-card"
with patch(
"a2a_client.enrich_peer_metadata_nonblocking",
return_value=None, # cache miss
), patch(
"a2a_client._agent_card_url_for",
return_value=card_url,
):
out = _enrich_inbound_for_agent({
"activity_id": "act-3",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:02:00Z",
})
assert "peer_name" not in out
assert "peer_role" not in out
assert out["agent_card_url"] == card_url
def test_peer_agent_a2a_client_unavailable_degrades_gracefully(monkeypatch):
"""If a2a_client can't be imported (test harness, partial install),
return the bare envelope — agent still gets text + peer_id + kind +
activity_id, just without the friendly identity."""
from a2a_tools import _enrich_inbound_for_agent
# Swap in an empty a2a_client stub so the helper's attribute lookups fail.
real_module = sys.modules.pop("a2a_client", None)
fake = types.ModuleType("a2a_client")
# Deliberately omit enrich_peer_metadata_nonblocking and
# _agent_card_url_for so the helper's fallback path fires.
sys.modules["a2a_client"] = fake
try:
out = _enrich_inbound_for_agent({
"activity_id": "act-4",
"text": "ping",
"peer_id": PEER_UUID,
"kind": "peer_agent",
"method": "message/send",
"created_at": "2026-05-05T11:03:00Z",
})
finally:
if real_module is not None:
sys.modules["a2a_client"] = real_module
else:
sys.modules.pop("a2a_client", None)
# Bare envelope passes through — receiving agent still has enough
# to act, even if the friendly identity is missing.
assert out["peer_id"] == PEER_UUID
assert out["text"] == "ping"
assert out["kind"] == "peer_agent"
assert "peer_name" not in out
assert "peer_role" not in out
assert "agent_card_url" not in out
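The four cases pinned above can be read as a small decision procedure. The following is a hedged sketch of a helper with that behavior, not the project's actual `_enrich_inbound_for_agent`. The function name `enrich_inbound_sketch` is hypothetical, and the only assumptions about `a2a_client` are the two attribute names the tests patch.

```python
import importlib
import sys
import types


def enrich_inbound_sketch(msg: dict) -> dict:
    """Hypothetical stand-in that mirrors the behavior the tests pin."""
    peer_id = msg.get("peer_id") or ""
    if not peer_id:
        return msg  # canvas_user: nothing to enrich

    try:
        # Lazy import: resolved at call time, so a missing or partial
        # a2a_client degrades to the bare envelope instead of raising.
        client = importlib.import_module("a2a_client")
        lookup = client.enrich_peer_metadata_nonblocking
        card_url_for = client._agent_card_url_for
    except (ImportError, AttributeError):
        return msg

    # The card URL depends only on peer_id, so it is set even on a cache miss.
    msg["agent_card_url"] = card_url_for(peer_id)
    record = lookup(peer_id)
    if record:
        if record.get("name"):
            msg["peer_name"] = record["name"]
        if record.get("role"):
            msg["peer_role"] = record["role"]
    return msg


if __name__ == "__main__":
    # Demonstration with a fake a2a_client injected into sys.modules.
    fake = types.ModuleType("a2a_client")
    fake._agent_card_url_for = lambda pid: f"http://test/agent/{pid}"
    fake.enrich_peer_metadata_nonblocking = lambda pid: {"name": "PeerOne", "role": "worker"}
    sys.modules["a2a_client"] = fake
    print(enrich_inbound_sketch({"peer_id": "ws-1", "kind": "peer_agent"}))
```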
@@ -0,0 +1,181 @@
"""Drift gate + import-contract tests for ``a2a_tools_inbox`` (RFC #2873 iter 4e).
The full behavior matrix for the three inbox tool wrappers lives in
``test_a2a_tools_inbox_wrappers.py`` (kept on the public ``a2a_tools``
module so the same tests pin both the alias and the underlying impl).
This file pins:
1. **Drift gate** — every previously-public symbol on ``a2a_tools``
(``tool_inbox_peek``, ``tool_inbox_pop``, ``tool_wait_for_message``,
``_enrich_inbound_for_agent``, ``_INBOX_NOT_ENABLED_MSG``) is the
EXACT same object as ``a2a_tools_inbox.foo``. Refactor wrapping
silently loses existing test coverage; this gate makes that drift
fail fast.
2. **Import contract** — ``a2a_tools_inbox`` does NOT pull in
``a2a_tools`` at module-load time (the layered architecture: it
depends only on stdlib + a lazy import of ``inbox`` + a lazy
import of ``a2a_client``, never the kitchen-sink module that
re-exports it).
3. **_enrich_inbound_for_agent** branches that the wrapper tests
can't easily reach: peer_id-empty (canvas_user) returns the
dict unchanged; a2a_client unavailable degrades gracefully.
"""
from __future__ import annotations
import sys
import pytest
@pytest.fixture(autouse=True)
def _require_workspace_id(monkeypatch):
monkeypatch.setenv("WORKSPACE_ID", "00000000-0000-0000-0000-000000000000")
monkeypatch.setenv("PLATFORM_URL", "http://test.invalid")
yield
# ============== Drift gate ==============
class TestBackCompatAliases:
def test_tool_inbox_peek_alias(self):
import a2a_tools
import a2a_tools_inbox
assert a2a_tools.tool_inbox_peek is a2a_tools_inbox.tool_inbox_peek
def test_tool_inbox_pop_alias(self):
import a2a_tools
import a2a_tools_inbox
assert a2a_tools.tool_inbox_pop is a2a_tools_inbox.tool_inbox_pop
def test_tool_wait_for_message_alias(self):
import a2a_tools
import a2a_tools_inbox
assert (
a2a_tools.tool_wait_for_message is a2a_tools_inbox.tool_wait_for_message
)
def test_enrich_helper_alias(self):
import a2a_tools
import a2a_tools_inbox
assert (
a2a_tools._enrich_inbound_for_agent
is a2a_tools_inbox._enrich_inbound_for_agent
)
def test_inbox_not_enabled_msg_alias(self):
import a2a_tools
import a2a_tools_inbox
assert (
a2a_tools._INBOX_NOT_ENABLED_MSG is a2a_tools_inbox._INBOX_NOT_ENABLED_MSG
)
# ============== Import contract ==============
class TestImportContract:
def test_inbox_module_does_not_import_a2a_tools_eagerly(self):
# Force a fresh load of a2a_tools_inbox without a2a_tools in sight.
for k in ("a2a_tools_inbox", "a2a_tools"):
sys.modules.pop(k, None)
import a2a_tools_inbox # noqa: F401 — load only
# a2a_tools_inbox MUST NOT have caused a2a_tools to load. The
# extracted module sits BELOW the kitchen-sink in the layering;
# the dependency arrow points the other direction.
assert "a2a_tools" not in sys.modules, (
"a2a_tools_inbox eagerly imported a2a_tools — the kitchen-sink "
"module must not be a load-time dependency of its slices."
)
# ============== _enrich_inbound_for_agent branches ==============
class TestEnrichInboundForAgent:
def test_canvas_user_returns_dict_unchanged(self):
# peer_id empty → canvas_user → no enrichment, no a2a_client touch.
from a2a_tools_inbox import _enrich_inbound_for_agent
msg = {"activity_id": "a-1", "kind": "canvas_user", "peer_id": ""}
result = _enrich_inbound_for_agent(msg)
assert result is msg # same dict, mutated in place if at all
assert "peer_name" not in result
assert "peer_role" not in result
assert "agent_card_url" not in result
def test_missing_peer_id_key_returns_unchanged(self):
from a2a_tools_inbox import _enrich_inbound_for_agent
msg = {"activity_id": "a-2", "kind": "canvas_user"} # no peer_id key
result = _enrich_inbound_for_agent(msg)
assert result is msg
assert "agent_card_url" not in result
def test_a2a_client_unavailable_degrades_gracefully(self, monkeypatch):
# Simulate a2a_client import failing (test harness, partial
# install). The helper must return the bare envelope, not raise.
from a2a_tools_inbox import _enrich_inbound_for_agent
# Force an ImportError by intercepting builtins.__import__.
import builtins
real_import = builtins.__import__
def fake_import(name, *args, **kwargs):
if name == "a2a_client":
raise ImportError("simulated a2a_client unavailable")
return real_import(name, *args, **kwargs)
monkeypatch.setattr(builtins, "__import__", fake_import)
msg = {"activity_id": "a-3", "kind": "peer_agent", "peer_id": "ws-x"}
result = _enrich_inbound_for_agent(msg)
# Bare envelope back — no peer_name, no agent_card_url. Crucially
# the helper did NOT raise, so the inbox tool surfaces the message
# to the agent even when the registry is unreachable.
assert result is msg
assert "peer_name" not in result
assert "agent_card_url" not in result
def test_registry_record_populates_peer_name_and_role(self, monkeypatch):
from a2a_tools_inbox import _enrich_inbound_for_agent
# Stub out the lazy-imported a2a_client functions.
import sys
import types
fake_a2a_client = types.SimpleNamespace(
_agent_card_url_for=lambda pid: f"http://test/agent/{pid}",
enrich_peer_metadata_nonblocking=lambda pid: {
"name": "PeerOne",
"role": "worker",
},
)
monkeypatch.setitem(sys.modules, "a2a_client", fake_a2a_client)
msg = {"activity_id": "a-4", "kind": "peer_agent", "peer_id": "ws-1"}
result = _enrich_inbound_for_agent(msg)
assert result["peer_name"] == "PeerOne"
assert result["peer_role"] == "worker"
assert result["agent_card_url"] == "http://test/agent/ws-1"
def test_registry_miss_keeps_agent_card_url(self, monkeypatch):
# On registry cache miss the helper still surfaces agent_card_url
# because it's constructable from peer_id alone — preserves the
# contract that the receiving agent always has somewhere to
# fetch the peer's full capability list.
from a2a_tools_inbox import _enrich_inbound_for_agent
import sys
import types
fake_a2a_client = types.SimpleNamespace(
_agent_card_url_for=lambda pid: f"http://test/agent/{pid}",
enrich_peer_metadata_nonblocking=lambda pid: None, # cache miss
)
monkeypatch.setitem(sys.modules, "a2a_client", fake_a2a_client)
msg = {"activity_id": "a-5", "kind": "peer_agent", "peer_id": "ws-2"}
result = _enrich_inbound_for_agent(msg)
assert "peer_name" not in result
assert "peer_role" not in result
assert result["agent_card_url"] == "http://test/agent/ws-2"
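The drift gate in this file compares objects with `is` rather than comparing behavior. A minimal sketch of that check as a reusable helper, under the assumption that a facade module is expected to re-export implementation symbols unchanged. `find_drift` and the three demo modules are hypothetical names introduced only for illustration.

```python
import types


def find_drift(facade, impl, names):
    """Return the names whose facade attribute is not the identical impl object."""
    missing = object()
    return [
        name
        for name in names
        if getattr(facade, name, missing) is not getattr(impl, name)
    ]


def _tool_example():
    return "ok"


# Hypothetical implementation module.
impl = types.ModuleType("demo_inbox_impl")
impl.tool_example = _tool_example

# A facade that re-exports the same object passes the gate.
good_facade = types.ModuleType("demo_good_facade")
good_facade.tool_example = impl.tool_example

# A facade that wraps the function behaves identically but fails the gate.
wrapped_facade = types.ModuleType("demo_wrapped_facade")
wrapped_facade.tool_example = lambda: impl.tool_example()

if __name__ == "__main__":
    print(find_drift(good_facade, impl, ["tool_example"]))
    print(find_drift(wrapped_facade, impl, ["tool_example"]))
```

An identity check is stricter than a behavioral one: the wrapped facade returns the same value, yet it is reported as drift because it is a different object.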
@@ -0,0 +1,196 @@
"""Direct unit tests for the three inbox tool wrappers in ``a2a_tools``.
After RFC #2873 iter 4d (messaging extraction), ``a2a_tools.py`` is
mostly back-compat re-exports — the only behavior still defined here
is ``report_activity`` plus three thin wrappers around the inbox state
machine: ``tool_inbox_peek`` / ``tool_inbox_pop`` / ``tool_wait_for_message``.
These wrappers were never exercised at the module level, so per-file
coverage fell to 54% on iter 4d, below the critical-path coverage gate
(75% per-file floor for MCP/inbox/auth). This file pins each wrapper's
behavior directly so the floor is met without changing the gate.
The wrappers are ~40 LOC of glue. The full delivery behavior
(persistence, 410 recovery, etc.) is exercised in test_inbox.py.
"""
from __future__ import annotations
import asyncio
import json
from unittest.mock import MagicMock, patch
import pytest
@pytest.fixture(autouse=True)
def _require_workspace_id(monkeypatch):
monkeypatch.setenv("WORKSPACE_ID", "00000000-0000-0000-0000-000000000000")
monkeypatch.setenv("PLATFORM_URL", "http://test.invalid")
yield
def _run(coro):
return asyncio.run(coro)
# ---------------------------------------------------------------------------
# tool_inbox_peek
# ---------------------------------------------------------------------------
class TestToolInboxPeek:
def test_returns_not_enabled_when_state_none(self):
import a2a_tools
with patch("inbox.get_state", return_value=None):
out = _run(a2a_tools.tool_inbox_peek())
assert "not enabled" in out
def test_returns_json_array_of_messages(self):
import a2a_tools
msg1 = MagicMock()
msg1.to_dict.return_value = {"activity_id": "a1", "kind": "canvas_user"}
msg2 = MagicMock()
msg2.to_dict.return_value = {"activity_id": "a2", "kind": "peer_agent"}
fake_state = MagicMock()
fake_state.peek.return_value = [msg1, msg2]
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_inbox_peek(limit=5))
# peek limit is forwarded
fake_state.peek.assert_called_once_with(limit=5)
parsed = json.loads(out)
assert len(parsed) == 2
assert parsed[0]["activity_id"] == "a1"
def test_non_int_limit_falls_back_to_10(self):
import a2a_tools
fake_state = MagicMock()
fake_state.peek.return_value = []
with patch("inbox.get_state", return_value=fake_state):
_run(a2a_tools.tool_inbox_peek(limit="garbage")) # type: ignore[arg-type]
fake_state.peek.assert_called_once_with(limit=10)
# ---------------------------------------------------------------------------
# tool_inbox_pop
# ---------------------------------------------------------------------------
class TestToolInboxPop:
def test_returns_not_enabled_when_state_none(self):
import a2a_tools
with patch("inbox.get_state", return_value=None):
out = _run(a2a_tools.tool_inbox_pop("act-1"))
assert "not enabled" in out
def test_rejects_empty_activity_id(self):
import a2a_tools
fake_state = MagicMock()
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_inbox_pop(""))
assert "activity_id is required" in out
fake_state.pop.assert_not_called()
def test_rejects_non_str_activity_id(self):
import a2a_tools
fake_state = MagicMock()
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_inbox_pop(123)) # type: ignore[arg-type]
assert "activity_id is required" in out
fake_state.pop.assert_not_called()
def test_returns_removed_true_when_popped(self):
import a2a_tools
fake_state = MagicMock()
fake_state.pop.return_value = MagicMock() # truthy = something was removed
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_inbox_pop("act-7"))
parsed = json.loads(out)
assert parsed == {"removed": True, "activity_id": "act-7"}
fake_state.pop.assert_called_once_with("act-7")
def test_returns_removed_false_when_unknown(self):
import a2a_tools
fake_state = MagicMock()
fake_state.pop.return_value = None
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_inbox_pop("act-missing"))
parsed = json.loads(out)
assert parsed == {"removed": False, "activity_id": "act-missing"}
# ---------------------------------------------------------------------------
# tool_wait_for_message
# ---------------------------------------------------------------------------
class TestToolWaitForMessage:
def test_returns_not_enabled_when_state_none(self):
import a2a_tools
with patch("inbox.get_state", return_value=None):
out = _run(a2a_tools.tool_wait_for_message(timeout_secs=1.0))
assert "not enabled" in out
def test_timeout_payload_when_no_message(self):
import a2a_tools
fake_state = MagicMock()
fake_state.wait.return_value = None
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_wait_for_message(timeout_secs=0.1))
parsed = json.loads(out)
assert parsed["timeout"] is True
assert parsed["timeout_secs"] == 0.1
def test_returns_message_when_delivered(self):
import a2a_tools
msg = MagicMock()
msg.to_dict.return_value = {"activity_id": "a-9", "kind": "peer_agent"}
fake_state = MagicMock()
fake_state.wait.return_value = msg
with patch("inbox.get_state", return_value=fake_state):
out = _run(a2a_tools.tool_wait_for_message(timeout_secs=2.0))
parsed = json.loads(out)
assert parsed["activity_id"] == "a-9"
def test_timeout_clamped_to_300(self):
import a2a_tools
fake_state = MagicMock()
fake_state.wait.return_value = None
with patch("inbox.get_state", return_value=fake_state):
_run(a2a_tools.tool_wait_for_message(timeout_secs=99999))
# An oversized timeout must be clamped to exactly 300 seconds.
passed = fake_state.wait.call_args.args[0]
assert passed == 300.0
def test_timeout_clamped_to_zero_floor(self):
import a2a_tools
fake_state = MagicMock()
fake_state.wait.return_value = None
with patch("inbox.get_state", return_value=fake_state):
_run(a2a_tools.tool_wait_for_message(timeout_secs=-5))
passed = fake_state.wait.call_args.args[0]
assert passed == 0.0
def test_non_numeric_timeout_falls_back_to_60(self):
import a2a_tools
fake_state = MagicMock()
fake_state.wait.return_value = None
with patch("inbox.get_state", return_value=fake_state):
_run(a2a_tools.tool_wait_for_message(timeout_secs="garbage")) # type: ignore[arg-type]
passed = fake_state.wait.call_args.args[0]
assert passed == 60.0
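The argument handling these wrapper tests pin (a non-int `limit` falls back to 10; `timeout_secs` is clamped to the range 0 to 300 and defaults to 60 when non-numeric) could be sketched as two small coercion helpers. This is an illustrative sketch only: `coerce_limit` and `clamp_timeout` are hypothetical names, and the real wrappers may coerce differently for inputs the tests do not cover, such as numeric strings.

```python
def coerce_limit(value, default: int = 10) -> int:
    """Fall back to `default` unless `value` is a real int (bool excluded)."""
    if isinstance(value, bool) or not isinstance(value, int):
        return default
    return value


def clamp_timeout(
    value,
    default: float = 60.0,
    low: float = 0.0,
    high: float = 300.0,
) -> float:
    """Coerce to float, fall back to `default` on garbage, clamp to [low, high]."""
    try:
        seconds = float(value)
    except (TypeError, ValueError):
        return default
    if seconds != seconds:  # NaN compares unequal to itself
        return default
    return max(low, min(high, seconds))


if __name__ == "__main__":
    print(coerce_limit("garbage"), coerce_limit(5))
    print(clamp_timeout(99999), clamp_timeout(-5), clamp_timeout("garbage"))
```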
@@ -0,0 +1,69 @@
"""Drift gate + smoke tests for ``a2a_tools_memory`` (RFC #2873 iter 4c).
The full behavior matrix (RBAC denies, scope enforcement, platform
HTTP error paths) lives in ``test_a2a_tools_impl.py`` (TestToolCommitMemory
+ TestToolRecallMemory) which patches `a2a_tools_memory.foo` after the
iter 4c retarget.
This file pins:
1. **Drift gate** — every previously-public symbol on ``a2a_tools``
(``tool_commit_memory``, ``tool_recall_memory``) is the EXACT same
callable as ``a2a_tools_memory.foo``. Refactor wrapping silently
loses the existing test coverage; this gate makes that drift fail
fast.
2. **Import contract** — ``a2a_tools_memory`` does NOT pull in
``a2a_tools`` at module-load time. The handlers depend on
``a2a_tools_rbac`` (the layered architecture) and ``a2a_client``,
not on the kitchen-sink module that re-exports them.
"""
from __future__ import annotations
import sys
import pytest
@pytest.fixture(autouse=True)
def _require_workspace_id(monkeypatch):
monkeypatch.setenv("WORKSPACE_ID", "00000000-0000-0000-0000-000000000000")
monkeypatch.setenv("PLATFORM_URL", "http://test.invalid")
yield
# ============== Drift gate ==============
class TestBackCompatAliases:
def test_tool_commit_memory_alias(self):
import a2a_tools
import a2a_tools_memory
assert a2a_tools.tool_commit_memory is a2a_tools_memory.tool_commit_memory
def test_tool_recall_memory_alias(self):
import a2a_tools
import a2a_tools_memory
assert a2a_tools.tool_recall_memory is a2a_tools_memory.tool_recall_memory
# ============== Import contract ==============
class TestImportContract:
def test_memory_module_does_not_load_a2a_tools(self, monkeypatch):
"""`a2a_tools_memory` must depend on `a2a_tools_rbac` (the layered
architecture) and `a2a_client`, NEVER on the kitchen-sink
`a2a_tools`. Top-level `from a2a_tools import …` would defeat
the modularization goal and risk a circular import."""
# Drop both modules to control import order
for m in ("a2a_tools", "a2a_tools_memory"):
sys.modules.pop(m, None)
# Import memory module. Should succeed without a2a_tools loaded.
import a2a_tools_memory # noqa: F401
assert "a2a_tools_memory" in sys.modules
def test_a2a_tools_re_exports_memory_handlers(self):
"""The opposite direction: a2a_tools must surface every memory
symbol so existing call sites + tests work unchanged."""
import a2a_tools
assert hasattr(a2a_tools, "tool_commit_memory")
assert hasattr(a2a_tools, "tool_recall_memory")
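The import-contract test in this file asserts only that the slice module loaded. A stricter variant, in the style of the inbox test, also asserts that the facade stayed out of `sys.modules`. A hedged sketch of that check, exercised against throwaway modules written to a temporary directory. All module names here (`demo_slice_ok`, `demo_slice_bad`, `demo_kitchen_sink`) are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def slice_loads_without_facade(slice_name: str, facade_name: str) -> bool:
    """Import `slice_name` fresh and report whether `facade_name` stayed unloaded."""
    for name in (slice_name, facade_name):
        sys.modules.pop(name, None)
    importlib.import_module(slice_name)
    return facade_name not in sys.modules


def demo() -> tuple:
    """Build three throwaway modules and run the check on a good and a bad slice."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Well-layered slice: no dependency on the facade.
        (root / "demo_slice_ok.py").write_text("VALUE = 1\n")
        # Badly layered slice: imports the facade at module load time.
        (root / "demo_slice_bad.py").write_text("import demo_kitchen_sink\n")
        # Facade that re-exports from the good slice.
        (root / "demo_kitchen_sink.py").write_text("from demo_slice_ok import VALUE\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            ok = slice_loads_without_facade("demo_slice_ok", "demo_kitchen_sink")
            bad = slice_loads_without_facade("demo_slice_bad", "demo_kitchen_sink")
        finally:
            sys.path.remove(tmp)
    return ok, bad


if __name__ == "__main__":
    print(demo())
```

Whether the real `a2a_tools_memory` would pass the stricter assertion depends on its transitive imports, which this diff does not show.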
@@ -0,0 +1,92 @@
"""Drift gate + smoke tests for ``a2a_tools_messaging`` (RFC #2873 iter 4d).
The full behavior matrix lives in ``test_a2a_tools_impl.py`` —
TestToolSendMessageToUser + TestToolListPeers + TestToolGetWorkspaceInfo
+ TestChatHistory all patch ``a2a_tools_messaging.foo`` after the iter
4d retarget.
This file pins:
1. **Drift gate** — every previously-public symbol on ``a2a_tools``
is the EXACT same callable / value as ``a2a_tools_messaging.foo``.
Wraps would silently lose existing test coverage; this gate
fails fast on that drift.
2. **Import contract** — ``a2a_tools_messaging`` does NOT pull in
``a2a_tools`` at module-load time (the layered architecture: it
depends on ``a2a_tools_rbac`` + ``a2a_client`` + ``platform_auth``,
never the kitchen-sink module).
"""
from __future__ import annotations
import sys
import pytest
@pytest.fixture(autouse=True)
def _require_workspace_id(monkeypatch):
monkeypatch.setenv("WORKSPACE_ID", "00000000-0000-0000-0000-000000000000")
monkeypatch.setenv("PLATFORM_URL", "http://test.invalid")
yield
# ============== Drift gate ==============
class TestBackCompatAliases:
def test_tool_send_message_to_user_alias(self):
import a2a_tools
import a2a_tools_messaging
assert (
a2a_tools.tool_send_message_to_user
is a2a_tools_messaging.tool_send_message_to_user
)
def test_tool_list_peers_alias(self):
import a2a_tools
import a2a_tools_messaging
assert a2a_tools.tool_list_peers is a2a_tools_messaging.tool_list_peers
def test_tool_get_workspace_info_alias(self):
import a2a_tools
import a2a_tools_messaging
assert (
a2a_tools.tool_get_workspace_info
is a2a_tools_messaging.tool_get_workspace_info
)
def test_tool_chat_history_alias(self):
import a2a_tools
import a2a_tools_messaging
assert a2a_tools.tool_chat_history is a2a_tools_messaging.tool_chat_history
def test_upload_chat_files_alias(self):
import a2a_tools
import a2a_tools_messaging
assert a2a_tools._upload_chat_files is a2a_tools_messaging._upload_chat_files
# ============== Import contract ==============
class TestImportContract:
def test_messaging_module_does_not_load_a2a_tools(self, monkeypatch):
"""`a2a_tools_messaging` must depend on `a2a_tools_rbac` (the
layered architecture), `a2a_client`, and `platform_auth` — but
NEVER on the kitchen-sink `a2a_tools`. Top-level
`from a2a_tools import …` would re-introduce the circular
dependency that motivated the lazy-import contract for the
delegation module."""
for m in ("a2a_tools", "a2a_tools_messaging"):
sys.modules.pop(m, None)
import a2a_tools_messaging # noqa: F401
assert "a2a_tools_messaging" in sys.modules
def test_a2a_tools_re_exports_messaging_handlers(self):
"""Opposite direction: a2a_tools surfaces every messaging
symbol so existing call sites + tests work unchanged."""
import a2a_tools
assert hasattr(a2a_tools, "tool_send_message_to_user")
assert hasattr(a2a_tools, "tool_list_peers")
assert hasattr(a2a_tools, "tool_get_workspace_info")
assert hasattr(a2a_tools, "tool_chat_history")
assert hasattr(a2a_tools, "_upload_chat_files")
@@ -229,3 +229,129 @@ class TestResolveWorkspacesDirect:
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == [("ws-a", "a"), ("ws-b", "b")]
assert errors == []
# ============== Token-from-file env var (issue #2934) ==============
class TestTokenFileEnv:
"""``MOLECULE_WORKSPACE_TOKEN_FILE`` lets operators keep the bearer
out of shell history and out of MCP-host config plaintext (e.g.
~/.claude.json). Resolution order: inline TOKEN env > TOKEN_FILE
env > ${CONFIGS_DIR}/.auth_token.
"""
@pytest.fixture(autouse=True)
def _isolate(self, monkeypatch, tmp_path):
for v in (
"WORKSPACE_ID",
"MOLECULE_WORKSPACE_TOKEN",
"MOLECULE_WORKSPACE_TOKEN_FILE",
"MOLECULE_WORKSPACES",
):
monkeypatch.delenv(v, raising=False)
# Point CONFIGS_DIR at an empty tmp_path so the .auth_token
# fallback returns "" — keeps the test cases unambiguous.
monkeypatch.setenv("CONFIGS_DIR", str(tmp_path))
yield tmp_path
def test_token_file_env_resolves(self, monkeypatch, tmp_path):
token_path = tmp_path / "token.txt"
token_path.write_text("file-tok-123\n") # trailing newline must strip
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(token_path))
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == [("ws-1", "file-tok-123")]
assert errors == []
def test_inline_token_takes_precedence_over_file(self, monkeypatch, tmp_path):
# If both env vars are set, inline wins — matches the docstring's
# documented order. (Operators sometimes set both during a
# rotation; we want predictable behavior.)
token_path = tmp_path / "token.txt"
token_path.write_text("file-tok")
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN", "inline-tok")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(token_path))
out, _ = mcp_workspace_resolver.resolve_workspaces()
assert out == [("ws-1", "inline-tok")]
def test_missing_file_returns_specific_error(self, monkeypatch, tmp_path):
# Operator EXPLICITLY pointed TOKEN_FILE at a non-existent path —
# surface the SPECIFIC failure (not the generic "set one of these
# three vars" message). Otherwise they hit the silent failure mode
# #2934 flagged ("a new user has no chance").
bad_path = tmp_path / "does-not-exist"
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(bad_path))
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == []
assert len(errors) == 1
assert "MOLECULE_WORKSPACE_TOKEN_FILE" in errors[0]
assert "does not exist" in errors[0]
assert str(bad_path) in errors[0]
def test_empty_file_returns_specific_error(self, monkeypatch, tmp_path):
# Blank file — operator's intent was clearly the file path, so a
# generic "no token" error would mask their config bug.
token_path = tmp_path / "empty.txt"
token_path.write_text("")
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(token_path))
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == []
assert len(errors) == 1
assert "MOLECULE_WORKSPACE_TOKEN_FILE" in errors[0]
assert "is empty" in errors[0]
def test_multi_line_file_rejected(self, monkeypatch, tmp_path):
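# Several whitespace-separated tokens in one file (a CSV cell or an
# accidental multi-token paste; the fixture uses a single line with an
# internal space) would otherwise become a malformed bearer that 401s
# against the platform with no diagnostic. Reject upfront with a
# specific error.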
token_path = tmp_path / "junk.txt"
token_path.write_text("tok-a tok-b\n")
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(token_path))
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == []
assert len(errors) == 1
assert "internal whitespace" in errors[0]
def test_token_file_error_skips_configs_dir_fallback(
self, monkeypatch, tmp_path
):
# When TOKEN_FILE is explicitly set but broken, do NOT fall through
# to a valid CONFIGS_DIR/.auth_token — the operator's intent is
# clearly to use the file path; deferring to a different source
# would mask their config error.
configs_dir = tmp_path / "configs"
configs_dir.mkdir()
(configs_dir / ".auth_token").write_text("configs-tok")
monkeypatch.setenv("CONFIGS_DIR", str(configs_dir))
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv(
"MOLECULE_WORKSPACE_TOKEN_FILE", str(tmp_path / "missing")
)
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == []
# Specific TOKEN_FILE error — not the generic "no token" fallback
# and crucially not the silent success of using configs-tok.
assert len(errors) == 1
assert "does not exist" in errors[0]
def test_blank_env_var_treated_as_unset(self, monkeypatch):
# Empty string is treated as "not set" — common pitfall when
# users export an unset shell var.
monkeypatch.setenv("WORKSPACE_ID", "ws-1")
monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", "")
out, errors = mcp_workspace_resolver.resolve_workspaces()
assert out == []
assert errors
def test_help_message_advertises_token_file(self, capsys):
# Help text must mention TOKEN_FILE so a first-run operator
# learns about the safer option without grepping the source.
mcp_workspace_resolver.print_missing_env_help(
["WORKSPACE_ID", "MOLECULE_WORKSPACE_TOKEN"], have_token_file=False
)
err = capsys.readouterr().err
assert "MOLECULE_WORKSPACE_TOKEN_FILE" in err
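The token-file tests above pin three distinct error messages (missing file, empty file, internal whitespace) and treat a blank env var as unset. The resolver's implementation is not part of this hunk, so the following is only a minimal sketch of a reader that would satisfy those assertions. The helper name `read_token_file`, its `(token, error)` return shape, and the exact message wording are assumptions for illustration, not the project's API.

```python
from __future__ import annotations

import os
from collections.abc import Mapping
from pathlib import Path

ENV_NAME = "MOLECULE_WORKSPACE_TOKEN_FILE"


def read_token_file(
    env: Mapping[str, str] | None = None,
) -> tuple[str | None, str | None]:
    """Return (token, error). Hypothetical helper, not the real resolver."""
    env = os.environ if env is None else env
    raw = env.get(ENV_NAME, "").strip()
    if not raw:
        # Blank is treated the same as unset: no token, no error.
        return None, None

    path = Path(raw)
    if not path.is_file():
        return None, f"{ENV_NAME}={path} does not exist"

    token = path.read_text().strip()
    if not token:
        return None, f"{ENV_NAME}={path} is empty"
    if any(ch.isspace() for ch in token):
        # Two tokens pasted together would otherwise become a malformed
        # bearer value, so reject before any request is made.
        return None, f"{ENV_NAME}={path} contains internal whitespace"
    return token, None
```

Returning a specific error instead of falling back to another token source keeps the operator's explicit choice visible, which is the behavior `test_token_file_error_skips_configs_dir_fallback` checks for.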
+198
@@ -0,0 +1,198 @@
"""Tests for the molecule-mcp doctor subcommand (#2934 item 6).
Each `check_*` function is unit-tested in isolation via env
manipulation. The integration test (`test_run_returns_1_when_any_fail`)
pins the end-to-end exit code on a stripped environment: what an
operator running the command for the first time on an untouched shell
sees.
"""
from __future__ import annotations

import os
import sys
from pathlib import Path
from unittest import mock

import pytest

# Workspace tests run from the workspace/ directory; mcp_doctor is
# imported with the same `import mcp_doctor` shape as the rest of
# the runtime (per pyproject's package layout).
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))

import mcp_doctor  # noqa: E402


def test_module_exposes_six_checks():
    """The doctor's checklist is six items today. Pin the count so
    a future PR that drops a check (e.g. silently merges two) gets
    flagged in review.
    """
    assert len(mcp_doctor.CHECKS) == 6


def test_check_python_version_passes_on_311_plus():
    """Pin the floor at 3.11 (matches the wheel's requires_python)."""
    with mock.patch.object(sys, "version_info", (3, 11, 0, "final", 0)):
        assert mcp_doctor.check_python_version() == "ok"
    with mock.patch.object(sys, "version_info", (3, 12, 5, "final", 0)):
        assert mcp_doctor.check_python_version() == "ok"


def test_check_python_version_fails_on_310():
    """3.10 is below the wheel's >=3.11 floor: must FAIL, not WARN.

    pip silently filters the wheel out on 3.10 with `from versions:
    none`, which reads as "package missing"; operators have spent
    45min chasing that. The doctor's job is to call this out
    explicitly.
    """
    with mock.patch.object(sys, "version_info", (3, 10, 12, "final", 0)):
        assert mcp_doctor.check_python_version() == "fail"


def test_check_env_vars_fails_when_all_unset(monkeypatch):
    monkeypatch.delenv("PLATFORM_URL", raising=False)
    monkeypatch.delenv("WORKSPACE_ID", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACES", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN_FILE", raising=False)
    assert mcp_doctor.check_env_vars() == "fail"


def test_check_env_vars_passes_with_token_env(monkeypatch):
    monkeypatch.setenv("PLATFORM_URL", "https://x.moleculesai.app")
    monkeypatch.setenv("WORKSPACE_ID", "ws-test")
    monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN", "tok-abc")
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN_FILE", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACES", raising=False)
    assert mcp_doctor.check_env_vars() == "ok"


def test_check_env_vars_passes_with_token_file(monkeypatch, tmp_path):
    """Ryan #2934 item 3 fix: token from a file (or keychain shim)
    instead of inline env var so secrets stay out of shell history.
    The doctor must accept that path equally with the inline form.
    """
    token_path = tmp_path / "token"
    token_path.write_text("tok-from-file")
    monkeypatch.setenv("PLATFORM_URL", "https://x.moleculesai.app")
    monkeypatch.setenv("WORKSPACE_ID", "ws-test")
    monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN_FILE", str(token_path))
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACES", raising=False)
    assert mcp_doctor.check_env_vars() == "ok"


def test_check_platform_health_warns_when_url_unset(monkeypatch):
    monkeypatch.delenv("PLATFORM_URL", raising=False)
    assert mcp_doctor.check_platform_health() == "warn"


def test_check_platform_health_fails_on_missing_scheme(monkeypatch):
    """A bare hostname is the second-most-common config error after
    missing-token (per the snippet's NOTE on Origin/PLATFORM_URL).
    The error message must say 'missing scheme', not 'DNS error',
    so the operator can diagnose without inspecting the URL string.
    """
    monkeypatch.setenv("PLATFORM_URL", "x.moleculesai.app")
    assert mcp_doctor.check_platform_health() == "fail"


def test_check_register_skipped_without_env(monkeypatch):
    monkeypatch.delenv("PLATFORM_URL", raising=False)
    monkeypatch.delenv("WORKSPACE_ID", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN", raising=False)
    # Skipped (warn), NOT failed; failing here would double-count
    # the env-vars failure noise.
    assert mcp_doctor.check_register() == "warn"


def test_check_token_auth_uses_heartbeat_endpoint(monkeypatch):
    """Pin: doctor MUST hit /registry/heartbeat, not /registry/register.

    register is an UPSERT: using it from doctor would clobber the
    workspace's actual agent_card metadata until the real agent next
    calls register. heartbeat only updates last_heartbeat_at, which
    a normal molecule-mcp boot does every 20s anyway, so the doctor's
    extra heartbeat is indistinguishable from background traffic.

    This test pins the URL via a urllib mock so a future refactor
    that accidentally re-routes through /registry/register fails
    here at PR-review time, not after operators report
    "doctor-probe" briefly appearing as their agent name in canvas.
    """
    monkeypatch.setenv("PLATFORM_URL", "https://x.moleculesai.app")
    monkeypatch.setenv("WORKSPACE_ID", "ws-test")
    monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN", "tok-abc")
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN_FILE", raising=False)

    captured: dict[str, object] = {}

    class _FakeResp:
        status = 200

        def __enter__(self): return self
        def __exit__(self, *a): pass

    def fake_urlopen(req, timeout=None):
        captured["full_url"] = req.full_url
        captured["method"] = req.get_method()
        return _FakeResp()

    monkeypatch.setattr(mcp_doctor.urllib_request, "urlopen", fake_urlopen)

    verdict = mcp_doctor.check_token_auth()
    assert verdict == "ok"
    assert captured["method"] == "POST"
    # The load-bearing assertion: must use heartbeat, never register.
    assert captured["full_url"].endswith("/registry/heartbeat"), (
        f"doctor must use /registry/heartbeat (idempotent), not register "
        f"(UPSERT, clobbers agent_card). Got: {captured['full_url']}"
    )
    assert "/registry/register" not in str(captured["full_url"]), (
        "doctor must NEVER POST to /registry/register; that's an UPSERT "
        "that overwrites agent_card metadata until the real agent next "
        "calls register."
    )


def test_resolve_token_returns_value_and_label_for_env(monkeypatch):
    """The single resolver returns both the value (for Bearer header)
    and a non-secret label (for the env-vars summary). Drift between
    label and value is the previous bug shape."""
    monkeypatch.setenv("PLATFORM_URL", "https://x.moleculesai.app")
    monkeypatch.setenv("MOLECULE_WORKSPACE_TOKEN", "secret-tok-abc")
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN_FILE", raising=False)
    val, label = mcp_doctor._resolve_token()
    assert val == "secret-tok-abc"
    assert label == "env MOLECULE_WORKSPACE_TOKEN"
    # Summary helper must agree with the resolver's source.
    assert mcp_doctor._resolve_token_summary() == label


def test_resolve_token_returns_none_when_missing(monkeypatch):
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN", raising=False)
    monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN_FILE", raising=False)
    val, label = mcp_doctor._resolve_token()
    assert val is None
    assert label is None


def test_run_returns_1_when_any_fail(monkeypatch, capsys):
    """End-to-end: stripped environment → at least one FAIL →
    exit 1. Pin the exit-code contract so this is scriptable from
    CI / install-checks too.
    """
    for k in (
        "PLATFORM_URL",
        "WORKSPACE_ID",
        "MOLECULE_WORKSPACES",
        "MOLECULE_WORKSPACE_TOKEN",
        "MOLECULE_WORKSPACE_TOKEN_FILE",
    ):
        monkeypatch.delenv(k, raising=False)
    code = mcp_doctor.run()
    out = capsys.readouterr().out
    assert code == 1
    # The summary line must mention at least one failure count so
    # an automated wrapper can grep for it.
    assert "check(s) failed" in out
    # And the human-facing label must be present so someone reading
    # CI logs sees what the section is about, not a wall of [FAIL].
    assert "molecule-mcp doctor" in out
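The tests in this file pin three contracts: each check returns `"ok"`, `"warn"`, or `"fail"`; the auth probe POSTs to `/registry/heartbeat` and never to `/registry/register`; and `run()` exits with 1 when any check fails. `mcp_doctor` itself is not shown in this diff, so the following is a rough sketch of how those contracts could fit together. The function `build_heartbeat_request`, the JSON body shape, the `out` parameter, and the exact output lines are assumptions made for illustration.

```python
from __future__ import annotations

import json
import sys
import urllib.request
from typing import Callable, Iterable


def check_python_version(version_info=None) -> str:
    """Fail (not warn) below 3.11, the floor named in the tests."""
    info = sys.version_info if version_info is None else version_info
    return "ok" if tuple(info[:2]) >= (3, 11) else "fail"


def build_heartbeat_request(
    platform_url: str, workspace_id: str, token: str
) -> urllib.request.Request:
    """Build the probe request without sending it.

    Targets /registry/heartbeat because, per the test docstring,
    register is an UPSERT that would clobber agent_card metadata.
    The body shape here is assumed, not taken from the platform API.
    """
    body = json.dumps({"workspace_id": workspace_id}).encode("utf-8")
    return urllib.request.Request(
        platform_url.rstrip("/") + "/registry/heartbeat",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run(
    checks: Iterable[Callable[[], str]],
    out: Callable[[str], None] = print,
) -> int:
    """Run every check, print a summary, return a process exit code."""
    out("molecule-mcp doctor")
    failed = 0
    for check in checks:
        verdict = check()
        out(f"  [{verdict.upper()}] {check.__name__}")
        if verdict == "fail":
            failed += 1
    if failed:
        # Greppable summary line for CI wrappers.
        out(f"{failed} check(s) failed")
        return 1
    out("all checks passed")
    return 0
```

Counting only `"fail"` verdicts toward the exit code lets dependent checks report `"warn"` when their prerequisites are missing, which avoids the double-counting that `test_check_register_skipped_without_env` guards against.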
+5 -5
@@ -63,7 +63,7 @@ async def test_commit_memory_success(monkeypatch):
     mcp = _load_mcp()
     client = FakeClient()
-    monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
+    monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
     result = await mcp.handle_tool_call("commit_memory", {
         "content": "Architecture decision: use Go for backend",
@@ -92,7 +92,7 @@ async def test_commit_memory_default_scope(monkeypatch):
     mcp = _load_mcp()
     client = FakeClient()
-    monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
+    monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
     result = await mcp.handle_tool_call("commit_memory", {
         "content": "Some note",
@@ -108,7 +108,7 @@ async def test_recall_memory_success(monkeypatch):
     mcp = _load_mcp()
     client = FakeClient()
-    monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
+    monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
     result = await mcp.handle_tool_call("recall_memory", {"query": "architecture"})
@@ -127,7 +127,7 @@ async def test_recall_memory_empty(monkeypatch):
         async def get(self, url, params=None, headers=None, **kwargs):
             return FakeResponse(200, [])
-    monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: EmptyClient())
+    monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: EmptyClient())
     result = await mcp.handle_tool_call("recall_memory", {})
     assert "No memories found" in result
@@ -139,7 +139,7 @@ async def test_recall_memory_with_scope_filter(monkeypatch):
     mcp = _load_mcp()
     client = FakeClient()
-    monkeypatch.setattr("a2a_tools.httpx.AsyncClient", lambda **kw: client)
+    monkeypatch.setattr("a2a_tools_memory.httpx.AsyncClient", lambda **kw: client)
     await mcp.handle_tool_call("recall_memory", {"scope": "TEAM"})
+2 -2
@@ -357,7 +357,7 @@ class TestA2AToolCommitMemoryRedactsSecrets:
         fake_client.post = _capture
-        with patch("a2a_tools.httpx.AsyncClient", return_value=fake_client):
+        with patch("a2a_tools_memory.httpx.AsyncClient", return_value=fake_client):
             await a2a_tools.tool_commit_memory(content_with_secret)
         stored = captured.get("content", "")
@@ -385,7 +385,7 @@ class TestA2AToolCommitMemoryRedactsSecrets:
         fake_client.post = _capture
-        with patch("a2a_tools.httpx.AsyncClient", return_value=fake_client):
+        with patch("a2a_tools_memory.httpx.AsyncClient", return_value=fake_client):
             await a2a_tools.tool_commit_memory(f"key={key}")
         stored = captured.get("content", "")
stored = captured.get("content", "")