docs/update-remote-agent-tutorial-sdk-api
273 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
318e0ad742 |
fix(workspace): skip idle prompt when delegation results are pending (#381) (#432)
CI / Detect changes (push) Waiting to run
CI / Platform (Go) (push) Blocked by required conditions
CI / Canvas (Next.js) (push) Blocked by required conditions
CI / Shellcheck (E2E scripts) (push) Blocked by required conditions
CI / Canvas Deploy Reminder (push) Blocked by required conditions
CI / Python Lint & Test (push) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (push) Blocked by required conditions
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (push) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (push) Successful in 13s
E2E API Smoke Test / detect-changes (push) Successful in 1m12s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 1m16s
Handlers Postgres Integration / detect-changes (push) Successful in 1m13s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 18s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 1m3s
publish-runtime-autobump / autobump-and-tag (push) Failing after 1m34s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Has started running
Co-authored-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> Co-committed-by: Molecule AI Infra-Runtime-BE <infra-runtime-be@agents.moleculesai.app> |
||
|
|
39db2e6d73 |
fix(workspace): complete OFFSEC-003 fix — promote full sanitization to main
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 20s
sop-tier-check / tier-check (pull_request) Successful in 20s
CI / Detect changes (pull_request) Successful in 59s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 1m3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 58s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 57s
audit-force-merge / audit (pull_request) Successful in 20s
CI / Platform (Go) (pull_request) Successful in 9s
CI / Canvas (Next.js) (pull_request) Successful in 9s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 10s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 9s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m29s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Failing after 7m40s
Promotes the complete OFFSEC-003 boundary-marker sanitization from staging to main, including: - _delegate_sync_via_polling: sanitize response_preview and error strings before returning (OFFSEC-003 polling-path fix from PR #417). - tool_check_task_status JSON endpoint: sanitize summary + response_preview in both the task_id filter path and the list path. - tool_delegate_task non-polling path: preserve main's existing sanitize_a2a_result(result) wrapper (staging accidentally removed it). Closes #418. Co-Authored-By: Molecule AI · core-be <core-be@agents.moleculesai.app> |
||
|
|
93b7d9a88a |
fix(a2a_tools): add comment + test coverage for string-form error handling in delegate_task
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
sop-tier-check / tier-check (pull_request) Manual override — infra#241 duplicate runner fails immediately. PR only adds comment + tests to a2a_tools.py. core-qa APPROVED.
audit-force-merge / audit (pull_request) Successful in 2s
Staging branch
|
||
|
|
99f3cf7c8f |
[core-be-agent] fix(#354): wire delegation-results consumer into a2a executor
Close the A2A delegation auto-resume gap. Root cause: heartbeat.py's _check_delegations already writes completed delegation rows to DELEGATION_RESULTS_FILE and sends a self-message to wake the agent. executor_helpers.read_delegation_results() was defined to atomically consume that file, but a2a_executor._core_execute() never called it — so delegation results were written but the agent never saw them. Fix: call read_delegation_results() at the top of _core_execute() and prepend the results to the user input context so the agent can act on them without an explicit check_task_status call. The Temporal durable workflow path is also covered because it calls _core_execute() directly. Test: two new cases — delegation results injected when file exists; user input passed through unchanged when file is empty. Closes molecule-core#354. |
||
|
|
3eb3609b0c |
test(workspace): add queue_id-absence and push-vs-poll distinction tests
Incorporates valuable extra coverage from fullstack-engineer's PR #336: - test_push_queued_missing_queue_id_still_parsed: queue_id is optional, absence must not break parsing - test_push_queued_is_distinct_from_poll_queued: both envelope shapes parse correctly and independently, with correct delivery_mode values Also adds push_queued_no_queue_id fixture and regression gate entry. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
0a9b66a3ed |
fix(workspace): push-mode Queued returns delivery_mode="push" (not silent default "poll")
Bug: a2a_response.py:197 returned Queued(method=method) without passing
delivery_mode, silently defaulting to "poll" for push-mode busy-queue
responses. Callers branching on v.delivery_mode would mis-identify push-mode
responses as poll-mode, causing wrong dispatch logic.
Fix: pass delivery_mode="push" explicitly in the push-mode branch.
Tests: add push_queued_full/notify/no_method fixtures and 4 test cases
asserting delivery_mode="push" for all three envelope shapes. Also add
adversarial {"queued": "yes"} and {"queued": False} → Malformed guards.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
a205099652 |
fix(security): OFFSEC-003 — boundary-marker escape + shared sanitizer
Root cause (from infra-lead PR#7 review id=724): Sanitization in PR#7 wrapped peer text in [A2A_RESULT_FROM_PEER] markers, but the markers themselves were not escaped — a malicious peer could inject "[/A2A_RESULT_FROM_PEER]" to close the trust boundary early, making subsequent text appear inside the trusted zone. Fix: - Create workspace/_sanitize_a2a.py (leaf module, no circular import risk) with shared sanitize_a2a_result() + _escape_boundary_markers() - _escape_boundary_markers() escapes boundary open/close markers in the raw peer text before wrapping (primary security control) - Defense-in-depth: also escapes SYSTEM/OVERRIDE/INSTRUCTIONS/IGNORE ALL/YOU ARE NOW patterns (secondary, per PR#7 design intent) - Update a2a_tools_delegation.py: import from _sanitize_a2a; wrap tool_delegate_task return and tool_check_task_status response_preview - Add 15 tests covering boundary escape, injection patterns, integration shapes (workspace/tests/test_a2a_sanitization.py) Follow-up (non-blocking, noted in PR#7 infra-lead review): - Deduplicate if a2a_tools.py also wraps (currently handled in delegation module only — callers get sanitized output regardless) - tool_check_task_status: consider sanitizing 'summary' field too Closes: molecule-ai/molecule-ai-workspace-runtime#7 (wrong-repo PR that this supersedes) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
5ecec3f253 |
Merge pull request 'fix(a2a): reject delegate_task to your own workspace ID (self-deadlock guard)' (#291) from fix/self-delegation-guard into main
Secret scan / Scan diff for credential-shaped strings (push) Successful in 5s
|
||
|
|
f58a11d171 |
Merge pull request 'fix(runtime): MODEL_PROVIDER env is misnamed — accept MODEL/MOLECULE_MODEL, deprecate legacy name' (#280) from fix/model-provider-misnomer into main
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
|
||
|
|
31ed137b74 |
fix(a2a): reject delegate_task / delegate_task_async to your own workspace ID
Self-delegation deadlocks: the sending turn holds `_run_lock`, the receive handler waits for the same lock, the A2A request 30s-times-out, and the whole cycle is wasted (the Dev Lead system prompt warns agents off this by hand — "Never delegate_task to your own workspace ID … there is no peer who is also you"). The platform/runtime had no guard. Now both `tool_delegate_task` and `tool_delegate_task_async` early-return an actionable error when `workspace_id == effective_source` (`source_workspace_id or _peer_to_source[target] or WORKSPACE_ID`) — before `discover_peer`, so no network round-trip is wasted either. A genuinely different target (incl. another of a multi-workspace agent's own registered workspaces) is unaffected. Tests: tests/test_a2a_tools_delegation.py — new TestSelfDelegationGuard (4 cases: rejects own ID; rejects when source_workspace_id explicitly == target; async path rejects; a different target passes the guard through to discover_peer). `pytest tests/test_a2a_tools_delegation.py` → 12 passed. (tests/test_a2a_tools_impl.py's TestToolDelegateTask* suite is red on this PC2/Windows checkout — same on `main` without this change; httpx-mock infra, not this PR — CI validates on Linux.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
fe1b3d9a82 | Merge branch 'main' into fix/a2a-tools-and-workflow-cleanup | ||
|
|
e647efe7c5 |
fix(a2a): handle string error in a2a_tools.py + remove dead staging trigger
Two-part fix from PR #268 (ported by Integration Tester after PR #268 was closed without merge): PART 1 — workspace/builtin_tools/a2a_tools.py: Fixes AttributeError when platform returns a plain string as the error field. Before: data["error"].get("message") ← crashes if error is a string After: isinstance(err, dict) → err.get("message") isinstance(err, str) → use err directly otherwise → str(err) Also guards result.get("parts") against non-dict result. Includes fix for issue #279: empty-parts regression where {"parts": []} returned "(no text)" instead of str(result). PART 2 — .gitea/workflows/ and .github/workflows/ publish-workspace-server-image.yml: Removed dead "staging" branch trigger. Trunk-based migration (2026-05-08) removed the staging branch but the workflow triggers were not updated. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
2ba3af5330 |
fix(runtime): MODEL_PROVIDER env is misnamed — accept MODEL/MOLECULE_MODEL, deprecate the legacy name
`molecule_runtime.config.load_config` read the `MODEL_PROVIDER` env var as the *picked model id* — despite the name, it never carried the provider (that's `LLM_PROVIDER` / the YAML `provider:` field). So `claude-code`, `minimax`, and `opus` were all "valid" values for a var named MODEL_PROVIDER. That footgun bit the dev-team rollout (2026-05-10): the lead persona env files set `MODEL=claude-opus-4-7` (the intended model) *and* `MODEL_PROVIDER=claude-code` (mistaking it for "the runtime"); the loader picked up MODEL_PROVIDER → the claude CLI got `--model claude-code` → 404 on every turn, surfaced only as "Command failed with exit code 1" with empty stderr (the real error is in the stream-json stdout, swallowed by the SDK's placeholder). The 22 IC workspaces "worked" only because their `MODEL_PROVIDER=minimax` happened to fuzzy-match on MiniMax's side — they were actually running `--model minimax`, not `MiniMax-M2.7-highspeed`. New precedence in `_picked_model_from_env`: `MOLECULE_MODEL` (canonical, unambiguous) > `MODEL` (the obviously-correct name, already plumbed by workspace-server's applyRuntimeModelEnv) > `MODEL_PROVIDER` (legacy — still honored so canvas Save+Restart, the secret-mint path, and existing persona env files keep working, but if it's the only one set we log a one-time deprecation pointing at the misnomer) > the YAML `model:` field. Applied at both the top-level `model` and `runtime_config.model` resolution sites; semantics are otherwise unchanged. Bonus: workspaces that already set `MODEL` correctly now get exactly that model instead of whatever fuzzy-match the upstream did with the provider slug. Tests: 5 new cases in test_config.py (MODEL beats MODEL_PROVIDER; MOLECULE_MODEL beats MODEL; MODEL overrides YAML; legacy MODEL_PROVIDER still resolves + warns; no warning when MODEL is set) + an autouse fixture that clears MODEL*/resets the warn-latch so resolution is deterministic regardless of the CI env or test order. `pytest tests/test_config.py` — 66 passed; the config-importing suites (test_preflight, test_skills_loader) — 129 passed. Companion: molecule-dev-department PR #10 fixes the six dev-team lead `workspace.yaml`s from `model: MiniMax-M2.7` to `model: opus`. Follow-ups (not in scope here): plumb `MOLECULE_MODEL` from applyRuntimeModelEnv and the canvas; strip `MODEL`/`MODEL_PROVIDER` from the operator-host persona env files once the org-template `model:` field is authoritative end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
736d9959bc |
fix(a2a): handle push-mode queue envelope in response parser
When a push-mode workspace (one with a public URL) is at capacity, the
platform queues the delegation request and returns:
{"queued": true, "message": "...", "queue_depth": N, "queue_id": "..."}
The existing SSOT parser (a2a_response.py) only handled the poll-mode
envelope (status=queued + delivery_mode=poll). Push-mode queue
responses fell through to Malformed, causing send_a2a_message to log a
warning and return an error — even though delivery was actually queued
successfully.
Fix: add handling for data.get("queued") is True as a Queued variant
with delivery_mode="push". Checked before the poll-mode envelope so the
two cases are mutually exclusive.
Fixes observed 2026-05-10: platform returning push-mode queue
envelopes to Integration Tester when Release Manager workspace was at
capacity.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
7ae3ee786f |
feat(workspace): add static .github-token fallback to git credential helper
Adds a 4th fallback step to the token chain (cache > API > env > static)
so workspace git/gh operations survive a platform outage without requiring
a restart or platform-side fix. Addresses the 2026-05-08 incident where
every workspace lost git+gh auth simultaneously when the
/github-installation-token endpoint returned 500.
Operator places a PAT in ${CONFIGS_DIR:-/configs}/.github-token
(no root needed — /configs is agent-writable). Both _fetch_token
(git credential helper path) and _refresh_gh (gh CLI daemon path)
gain the static fallback so git and gh both recover post-incident.
Pure additive — existing cache > API > env chain is unchanged.
Empty static file is rejected (whitespace-stripped before use).
Static path never writes the cache, so the API recovers transparently
on the next refresh cycle when it comes back online.
Ref: issue #140.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|
|
1492b40b38 |
ci(docker): pin base image digests in all Dockerfiles
Pins all FROM image tags to exact SHA256 digests for reproducible builds. Without digest pinning, a registry push of a new image to the same tag can silently change the layer content between builds — a supply-chain risk especially for prod-deployed images. Pinned images (7 Dockerfiles): - golang:1.25-alpine → sha256:c4ea15b... (workspace-server/Dockerfile, Dockerfile.dev, Dockerfile.tenant, tests/harness/cp-stub/Dockerfile) - alpine:3.20 → sha256:c64c687c... (workspace-server/Dockerfile, tests/harness/cp-stub/Dockerfile) - node:20-alpine → sha256:afdf982... (workspace-server/Dockerfile.tenant) - node:22-alpine → sha256:cb15fca... (canvas/Dockerfile) - python:3.11-slim → sha256:e78299e... (workspace/Dockerfile) - nginx:1.27-alpine → sha256:62223d6... (tests/harness/cf-proxy/Dockerfile) Note: docker-compose.yml service images (postgres, redis, clickhouse, litellm, ollama) are intentionally left on major-version tags — those are runtime-pulled and updated regularly for local-dev ergonomics. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
76ac5a88dc |
[core-be-agent]
fix(tests): clear platform_auth cache before each test Fixes issue #160: workspace tests fail when MOLECULE_WORKSPACE_TOKEN is set in the environment. The bug: platform_auth._cached_token is populated at module import or first get_token() call and persists for the process lifetime. Tests that use monkeypatch.delenv("MOLECULE_WORKSPACE_TOKEN") to simulate "no token in env" were failing because delenv removes the env var but not the module-level cache — subsequent get_token() calls returned the stale cached value. Fix: add a function-scoped autouse fixture in conftest.py that calls platform_auth.clear_cache() before every test. The import is inside the fixture to avoid collection-time import issues when platform_auth is not yet available. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
57aedec1a3 |
fix(tests): isolate token resolution from real .auth_token on disk
Issue #160: workspace tests fail when MOLECULE_WORKSPACE_TOKEN is set in the test environment (or when /configs/.auth_token exists on disk, as it does in a container CI runner). Root cause: - test_resolve_token_returns_none_when_missing: monkeypatch.delenv() removes the env var, but _resolve_token() falls through to configs_dir.resolve()/.auth_token which exists in the container. - Multi-workspace tests: clear_cache() resets _cached_token, but get_token() immediately re-reads /configs/.auth_token and caches the real token before the env var is even checked. Fix: - test_mcp_doctor: patch configs_dir.resolve() to return a bare tmp_path so the disk-file fallback finds nothing. - Multi-workspace tests: patch platform_auth._token_file() to return a non-existent path (via tmp_path) alongside clear_cache(), ensuring the env var wins as intended. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
252f8d0c47 |
tech-debt: rename molecule-monorepo-net -> molecule-core-net
Renames Docker network across all code, configs, scripts, and docs. Per issue #93: the network was named molecule-monorepo-net as a holdover from when the repo was called molecule-monorepo. The canonical repo name is now molecule-core, so the network should be molecule-core-net. Files changed: - docker-compose.yml, docker-compose.infra.yml: network definition - infra/scripts/setup.sh: docker network create - scripts/nuke-and-rebuild.sh: docker network rm - workspace-server/internal/provisioner/provisioner.go: DefaultNetwork - All comments/docs: updated wording Acceptance: grep -rn 'molecule-monorepo-net' returns zero matches. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |
||
|
|
6193f67bc0 |
fix(workspace): set git user.name/email from $GITEA_USER at boot (#156)
Secret scan / Scan diff for credential-shaped strings (push) Successful in 4s
Closes #155. [FORCE-MERGE AUDIT — §SOP-7] - Approver: hongming (Gitea PR review APPROVED 2026-05-09T20:27:01Z) - Chat-go: explicit go in conversation transcript ~20:39 UTC after Hongming clicked approve - Bypassed: required status checks (all pending forever — likely runner pickup issue, separate from this PR's correctness) - Audit channel: orchestrator force-merge log + this commit message Next: workspace runtime image rebuilds via publish-runtime.yml; new workspaces pick up persistent persona git identity. |
||
|
|
a4fc04189c |
fix(workspace): set git user.name/email from $GITEA_USER at boot
branch-protection drift check / Branch protection drift (pull_request) Successful in 10s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 14s
cascade-list-drift-gate / check (pull_request) Successful in 18s
Check migration collisions / Migration version collision check (pull_request) Successful in 23s
CI / Detect changes (pull_request) Successful in 24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 24s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 24s
pr-guards / disable-auto-merge-on-push (pull_request) Failing after 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 22s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 16s
Harness Replays / detect-changes (pull_request) Successful in 25s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 20s
sop-tier-check / tier-check (pull_request) Failing after 21s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 38s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 26s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 41s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m46s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 13s
Harness Replays / Harness Replays (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1m31s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4m8s
CI / Python Lint & Test (pull_request) Successful in 8m54s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Canvas (Next.js) (pull_request) Failing after 10m21s
CI / Platform (Go) (pull_request) Successful in 13m8s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 22m59s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 23m26s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 23m31s
audit-force-merge / audit (pull_request) Successful in 4s
Closes #155. Without this, every commit from a workspace booted via the standard provisioner lands with an empty `user.name`/`user.email` and Gitea attributes the work to whichever PAT pushed (typically the founder's `claude-ceo-assistant`), instead of the persona that actually authored the commit. That's the same fingerprint pattern that got us suspended on GitHub 2026-05-06. GITEA_USER is already injected per-workspace by the provisioner from workspace_secrets (verified: 8/8 Core-* workspaces have it set, correctly-named, on operator + local). Boot picks it up unconditionally; falls through cleanly if unset (e.g. legacy boxes without persona identity wiring). Email uses `bot.moleculesai.app` so agent commits are visually distinct from human-authored commits in Gitea history. The `gitconfig` copy from `/root/.gitconfig` to `/home/agent/.gitconfig` is now unconditional — previously it was nested inside the `molecule-git-token-helper.sh` block, which meant the per-persona identity wouldn't propagate to the agent user when the helper was unavailable. Also added an inline note that the github.com credential-helper block is post-suspension legacy. Full removal tracked under #171; this PR deliberately doesn't touch it (smaller blast radius). Tested: docker exec sets the same config in 8 running Core-* workspaces locally and they pick up correct identity for `git config -l`. Will reset when those containers restart, hence this PR for the persistent fix. |
||
|
|
bd145dcec6 |
docs(workspace-runtime): migrate github.com refs at source so mirror inherits Gitea links (internal#41)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 5s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 12s
CI / Python Lint & Test (pull_request) Failing after 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Failing after 11s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Failing after 41s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Failing after 1m18s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Failing after 1m21s
The molecule-ai-workspace-runtime mirror is regenerated on every
runtime-v* tag from this monorepo's workspace/. Per saved memory
reference_runtime_repo_is_mirror_only, mirror-guard rejects direct
PRs to the mirror; edit at source.
Source-side files that propagate to the mirror's published README +
read by users of the in-monorepo workspace-runtime docs:
- scripts/build_runtime_package.py (the README generator):
* line 281 README_TEMPLATE: 'Shared workspace runtime for Molecule
AI' link → Gitea
* line 399 doc-link to workspace-runtime-package.md → Gitea path
(with /src/branch/main/ shape)
LEFT AS-IS (per Q3 audit-trail decision):
* lines 379, 392 historical issue cross-refs (#2936, #2937)
- workspace/build-all.sh:5 — comment block linking to template-*
repos. Migrated to Gitea path-shape.
- docs/workspace-runtime-package.md:
* lines 101-108 adapter→repo table (8 templates, all PUBLIC on
Gitea) — Gitea URLs
* line 247 starter-repo link — substituted host + added inline
note that starter doesn't survive the suspension migration
(recreation pending; cross-link to this issue)
* line 259 generic git clone command for new templates → Gitea
* line 289 second starter mention — same handling as 247
Files NOT touched in this PR:
- workspace/ Python source code (.py files) — those use github
paths in docstrings + a few log strings; fix bundled with the
cross-repo Go-module-style migration (per #37 Q5 + parked
follow-ups).
- 'Writing a new adapter' section's `gh repo create` command (line
254-256) — gh CLI doesn't talk to Gitea (per #45 parked follow-up).
- 'Writing a new adapter' section's ghcr.io image ref (line 276) —
per #46 ghcr→ECR migration (separate concern).
After this PR merges to staging + a runtime-v* tag is pushed, the
mirror's published README will inherit the Gitea link. Until then
the mirror's README continues to reference github.com/Molecule-AI
(stale but historical-marker-correct since the mirror existed
pre-suspension).
Refs: molecule-ai/internal#41, molecule-ai/internal#37,
molecule-ai/internal#38, molecule-ai/internal#42,
molecule-ai/internal#45, molecule-ai/internal#46
|
||
|
|
166ad20cd7 |
test(e2e): Phase 3.5 — wheel parser classifies real server response (#2967)
Previously Phase 3 only checked the workspace-server's poll-mode short-circuit
emit shape ({"status":"queued","delivery_mode":"poll","method":"..."}); the
matching client-side classification was tested in isolation against fixture
dicts in test_a2a_response.py.
This phase closes the loop by piping the actual on-the-wire response from a
real workspace-server back through the wheel's a2a_response.parse() and
asserting it classifies as the Queued variant with the right method +
delivery_mode. A regression in EITHER the server emit shape OR the client
parser will now fail this E2E, eliminating the gap that allowed the original
"unexpected response shape" production bug to ship despite green unit tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
8b9f809966 |
fix(a2a): SSOT response parser — handle poll-mode queued envelope (#2967)
Introduce ``workspace/a2a_response.py`` as the single source of truth for
the wire shapes the workspace-server proxy can return at
``/workspaces/<id>/a2a``:
* ``Result`` — JSON-RPC success
* ``Error`` — JSON-RPC error or platform-level error (with
restart-in-progress metadata when present)
* ``Queued`` — poll-mode short-circuit envelope: the platform
queued the message into the target's inbox, the
target will fetch via /activity poll
* ``Malformed`` — anything the parser can't classify (logged at
WARNING so a future server change is loud)
``send_a2a_message`` (in ``a2a_client.py``) now dispatches via
``a2a_response.parse(data)`` instead of inline ``"result" in data`` /
``"error" in data`` sniffing. The Queued variant returns a new
``_A2A_QUEUED_PREFIX`` sentinel so callers can distinguish "delivered
async, no synchronous reply" from both success-with-text and failure.
reno-stars production data caught two intermittent failures that
both reduced to the same root cause:
1. **File transfer announce silently failed** — when CEO Ryan PC
(poll-mode external molecule-mcp) sent the harmi.zip
announcement to Reno Stars Business Intelligent (also poll-mode
external), ``send_a2a_message`` saw the platform's poll-queued
envelope ``{"status":"queued","delivery_mode":"poll","method":"..."}``,
didn't recognize it as the synthetic delivery-acknowledgement
it is, and returned ``[A2A_ERROR] unexpected response shape``.
The agent fell back to a chunk-shipping path; receiver did get
the file but operator-facing logs showed a failure that didn't
actually fail.
2. **Duplicated agent comm** — same bug, inverted direction. d76
delegated to 67d, send_a2a_message returned the unexpected-shape
error, delegate_task wrapped it as DELEGATION FAILED, the calling
agent retried with sharper wording, the recipient saw the same
request twice and self-reported "二次请求 — 我先不执行".
External molecule-mcp standalone runtimes are inherently poll-mode
(they have no public URL), so every external↔external A2A pair was
hitting this on every send. The pre-fix client only handled JSON-RPC
``result``/``error`` keys and treated the queued envelope (which has
neither) as malformed. RFC #2339 PR 2 added the queued envelope on
the server side; the client never caught up.
When ``send_a2a_message`` returns the ``_A2A_QUEUED_PREFIX`` sentinel,
``tool_delegate_task`` now transparently falls back to
``_delegate_sync_via_polling`` (RFC #2829 PR-5's durable
``/delegate`` + ``/delegations`` polling path, which DOES work for
poll-mode peers because the platform's executeDelegation goroutine
writes to the inbox queue and the result row arrives when the target
picks it up + replies). The agent gets a real synchronous reply
instead of the empty queued sentinel.
* ``test_a2a_response.py`` — 62 tests, **100% line coverage** on
the parser (verified via ``coverage run --source=a2a_response``).
Includes adversarial-input fuzzing across ~25 pathological
payloads — parser must never raise.
* ``test_a2a_client.py::TestSendA2AMessagePollMode`` — 4 tests for
the new Queued/Error wiring in ``send_a2a_message``.
* ``test_delegation_sync_via_polling.py::TestPollModeAutoFallback``
— 3 tests for the auto-fallback in ``tool_delegate_task``,
including negative cases (push-mode reply must NOT trigger
fallback; genuine error must NOT silently retry).
* **Verified all new tests FAIL on pre-fix source** by stashing
a2a_client.py + a2a_tools_delegation.py and re-running — 5
failures including ImportError for the missing
``_A2A_QUEUED_PREFIX``.
Per the operator-debuggability directive:
* INFO at every Queued classification (expected variant; operator
sees normal poll-mode-peer queueing in log stream).
* INFO at the auto-fallback decision in ``tool_delegate_task``
so a future operator can correlate "send returned queued →
falling back to polling path" without reading the source.
* WARNING at every Malformed classification (server contract
drift; operator MUST see this immediately).
* Existing transient-retry WARNING preserved.
* Mirror Go-side typed model in workspace-server. The wire shape
is documented in ``a2a_response.py``'s module docstring with
file:line pointers to the canonical emitters; a future PR can
introduce ``models/a2a_response.go`` without changing wire
behavior. The fixture corpus in ``test_a2a_response.py`` is
designed so a one-sided edit breaks CI.
* ``send_message_to_user`` and ``chat_upload_receive`` use a
different endpoint (``/notify``) and aren't affected by this
bug; their parsing stays unchanged.
* 135 tests pass across ``test_a2a_response.py`` +
``test_a2a_client.py`` + ``test_delegation_sync_via_polling.py``
+ ``test_a2a_tools_impl.py``.
* ``coverage run --source=a2a_response -m pytest`` reports 100%
line coverage with 0 missing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
146c0e7c60 |
fix(a2a-client): recognize poll-mode 'queued' envelope (#2967)
workspace-server's a2a_proxy poll-mode short-circuit returns
{status: "queued", delivery_mode: "poll", method: <a2a_method>}
when the peer has no URL to dispatch to (poll-mode peers, including
every external molecule-mcp standalone runtime). The bare
send_a2a_message parser only knew about JSON-RPC {result, error}
keys, so this envelope fell through to the "unexpected response shape"
error path. Two production symptoms on the reno-stars tenant traced
to it:
1. File transfer logged as failed when it actually succeeded —
operator-facing logs showed an A2A_ERROR but the receiving
workspace did get the chunked file via the agent's fallback path.
2. delegate_task retried after the false failure → peer received
duplicate delegations → conversation got confused, the second
peer self-diagnosed in a notify ("⚠️ Peer 二次请求 — 我先不执行").
Add a third branch to the parser, BETWEEN the existing JSON-RPC
{result, error} cases and the catch-all "unexpected" fallback. The
queued envelope is delivery-acknowledged-but-pending-consumption —
not an error — so it returns a clean success string the agent can
render as a normal outcome. The success string includes "queued"
and "poll" so an operator scanning logs sees the routing path
without parsing JSON.
Defensive: the new branch only fires when BOTH status="queued" AND
delivery_mode="poll" are present. A partial envelope (one key
missing) still falls through to the catch-all, so a future server
bug that emits a malformed shape gets surfaced instead of silently
swallowed.
Tests:
- test_poll_queued_envelope_returns_success_string — pins the canonical
envelope returns a non-error string. Discriminating: verified to FAIL
on old code (returned [A2A_ERROR] string), PASS on new.
- test_poll_queued_envelope_with_other_method — pins the parser doesn't
hardcode message/send. Discriminating: also FAILS on old code.
- test_status_queued_without_poll_mode_still_falls_through — pins both
keys are required (defensive against future server bugs).
12 existing tests in TestSendA2AMessage still pass — no regression.
Scope: hotfix for the bare send_a2a_message path. The full SSOT
typed-A2AResponse refactor (#158-#163, parents under #2967) covers the
broader vocabulary alignment between Go server and Python client. This
PR ends the production symptoms now without preempting that work.
|
||
|
|
2652ea8342 |
fix(mcp-doctor): heartbeat (idempotent) instead of register (UPSERT)
Self-review caught after #2954 landed: check_register() POSTed to /registry/register with agent_card.name="doctor-probe". The endpoint is an UPSERT, so the doctor probe overwrites the workspace's actual agent_card metadata until the real agent's next register call. An operator running `molecule-mcp doctor` against a live workspace would see their canvas briefly display "doctor-probe" as the agent name — invisible production-disruption. Switches to POST /registry/heartbeat. heartbeat only updates last_heartbeat_at (and clears awaiting_agent if needed) — the same work a normal molecule-mcp boot does every 20s in steady state, so the doctor's extra heartbeat is indistinguishable from background traffic. Function renamed check_register → check_token_auth to match what it actually does. check_register kept as back-compat alias so any external test/import still resolves. Also unified the duplicated token-resolution paths into a single _resolve_token() returning (value, source_label). Pre-fix: check_register and _resolve_token_summary read env in parallel ladders — a future env-var addition would have to touch both. New tests: - test_check_token_auth_uses_heartbeat_endpoint: mocks urlopen, asserts the URL ends in /registry/heartbeat AND does NOT contain /registry/register. Pins the load-bearing invariant so a future refactor can't silently re-route through register. - test_resolve_token_returns_value_and_label_for_env: pins the consolidated resolver returns both pieces of info from the same source-decision. - test_resolve_token_returns_none_when_missing: missing-env happy path. Verification: - 13/13 tests pass (10 existing + 3 new) - Manual stripped-env run still renders 4 FAIL + 2 WARN with actionable hints, exit 1. Refs molecule-core#2934 item 6 (doctor side-effect fix-up). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
f01f374072 |
feat(mcp): add molecule-mcp doctor onboarding diagnostic
Closes #2934 item 6 — the deferred follow-up from Ryan's onboarding- friction report. Quote: "this single command would have saved me 30 of the 45 minutes." When push delivery fails or the install half-works, the operator today has no signal — they hand-grep the Claude Code binary or chase the `from versions: none` red herring. Doctor renders six checks in one screen with concrete next-step suggestions: 1. Python version >=3.11? (wheel's pin) 2. Wheel install molecule-ai-workspace-runtime importable + version surfaced 3. PATH for binary `molecule-mcp` resolves on PATH; if not, prints the resolved user-site bin dir to add (or recommends pipx) 4. Env vars PLATFORM_URL + WORKSPACE_ID + token (env or *_FILE or .auth_token) 5. Platform reach GET ${PLATFORM_URL}/healthz returns 2xx 6. Registry register POST /registry/register with the resolved token returns 2xx — end-to-end auth check Each line: `[OK|WARN|FAIL] <label>: <status>` plus a `next:` hint when not OK. ANSI colors auto-disable on non-TTY / NO_COLOR. Exit code: 0 on all-OK or only-WARN, 1 on any FAIL — scriptable from CI install-checks. ## Files `workspace/mcp_doctor.py` (new) — six check functions + `run()` entry point. Uses urllib (stdlib) so doctor works even on a partial install where `requests` is missing. `workspace/mcp_cli.py` Subcommand dispatch: molecule-mcp doctor → mcp_doctor.run() molecule-mcp --help → usage banner molecule-mcp → server (unchanged) `workspace/tests/test_mcp_doctor.py` (new) — 10 tests covering each check's pass/fail/skip path plus the end-to-end exit-code contract on a stripped env. `scripts/build_runtime_package.py` Adds `mcp_doctor` to TOP_LEVEL_MODULES so the wheel ships the new module. ## Out of scope (deferred follow-ups) - Claude Code-specific checks (parse ~/.claude.json, verify each MCP entry is plugin-sourced + dev-channels flag set). That's a separate Claude-Code-shaped doctor; lives in the channel plugin. - Automated remediation. Doctor is diagnostic — tells the operator what's wrong + how to fix it, doesn't apply changes. ## Verification - python -m pytest tests/test_mcp_doctor.py -v → 10/10 PASS - python -m pytest tests/test_mcp_cli*.py → 67/67 PASS (existing CLI suite still green; subcommand dispatch added before env-validation, doesn't disturb the server-boot path) - manual: `molecule-mcp doctor` on a stripped env renders 4 FAIL + 2 WARN + exit code 1, with each `next:` hint actionable Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3b7ed9cf53 |
Merge pull request #2946 from Molecule-AI/fix/onboarding-followup-2934
mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934) |
||
|
|
da9061c131 |
mcp: surface specific TOKEN_FILE errors + link follow-ups (#2934)
Self-review of #2935 turned up two real defects: 1. Stale README issue references — the build_runtime_package.py README template said "(issue #2934 follow-up)" twice, but the marketplace-plugin and `doctor` items now have dedicated tracking issues. Updated to point at #2936 and #2937 respectively. 2. Silent fallthrough on broken MOLECULE_WORKSPACE_TOKEN_FILE — when an operator EXPLICITLY pointed TOKEN_FILE at a path that didn't exist / wasn't readable / was blank / contained internal whitespace, the resolver silently returned the generic "set one of these three vars" error. That's exactly the silent failure mode #2934 flagged ("a new user has no chance"). Refactor `_read_token_from_file_env` to return `(token, error)`; surface the SPECIFIC failure when the operator's intent was clearly the file path. Skip the CONFIGS_DIR fallback in that case so the operator's config bug isn't masked by a different source happening to work. Adds 2 renames + 2 new tests in test_mcp_cli_split.py: - test_missing_file_returns_specific_error (asserts "does not exist") - test_empty_file_returns_specific_error (asserts "is empty") - test_multi_line_file_rejected (asserts "internal whitespace") - test_token_file_error_skips_configs_dir_fallback (asserts a valid CONFIGS_DIR/.auth_token does NOT silently rescue a broken TOKEN_FILE) All 81 mcp_cli + mcp_cli_multi_workspace + mcp_cli_split tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
c4807a930d |
Merge pull request #2940 from Molecule-AI/refactor/a2a-tools-inbox-extract-rfc2873-iter4e
refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e) |
||
|
|
475da5b64c |
refactor(workspace): extract inbox tools from a2a_tools.py (RFC #2873 iter 4e)
Continues the OSS-shape refactor. After iters 4a-4d (rbac, delegation,
memory, messaging) the only behavior left in ``a2a_tools.py`` was
``report_activity`` plus three thin inbox-tool wrappers and the
``_enrich_inbound_for_agent`` helper. This iter extracts the inbox
slice to ``a2a_tools_inbox.py`` so the kitchen-sink module shrinks
from 280 LOC to ~165 LOC of imports + report_activity + back-compat
re-export blocks.
Extracted symbols:
- ``_INBOX_NOT_ENABLED_MSG`` (sentinel)
- ``_enrich_inbound_for_agent`` (poll-path peer enrichment helper)
- ``tool_inbox_peek``
- ``tool_inbox_pop``
- ``tool_wait_for_message``
Re-exports (`from a2a_tools_inbox import …`) preserve the public
``a2a_tools.tool_inbox_*`` surface so existing tests + call sites
continue to resolve unchanged.
New tests in test_a2a_tools_inbox_split.py:
1. **Drift gate (5)** — every previously-public symbol on a2a_tools
is the EXACT same object as a2a_tools_inbox.foo (`is`, not `==`),
catches a future "wrap with logging" refactor that silently loses
existing test coverage.
2. **Import contract (1)** — a2a_tools_inbox does NOT eagerly import
a2a_tools at module load. Pins the layered architecture: the
extracted slice depends on ``inbox`` + a lazy ``a2a_client``
import, never on the kitchen-sink that re-exports it.
3. **_enrich_inbound_for_agent branches (5)** — peer_id-empty
(canvas_user) returns dict unchanged; missing peer_id key same;
a2a_client unavailable (test harness, partial install) degrades
gracefully with a bare envelope; registry hit populates
peer_name + peer_role + agent_card_url; registry miss still
surfaces agent_card_url (constructable from peer_id alone).
The full timeout-clamp / validation / JSON-shape behavior matrix for
the three wrappers stays in test_a2a_tools_inbox_wrappers.py — those
tests pass identically against both the alias and the underlying impl.
Wiring updates:
- ``scripts/build_runtime_package.py``: add ``a2a_tools_inbox`` to
``TOP_LEVEL_MODULES`` so it ships in the runtime wheel and the
drift gate doesn't fail the next publish.
- ``.github/workflows/ci.yml``: add ``a2a_tools_inbox.py`` to
``CRITICAL_FILES`` so the 75% MCP/inbox/auth per-file floor
applies — this is now where the inbox-delivery code actually
lives.
|
||
|
|
1ad107cc15 |
Merge pull request #2935 from Molecule-AI/fix/onboarding-friction-2934
fix(onboarding): address Claude Code MCP onboarding friction (#2934) |
||
|
|
01deeb36cf |
fix(onboarding): address Claude Code MCP onboarding friction (#2934)
Ryan's bug report (#2934) walked through ~45 min of debugging a stock external-runtime install. This PR fixes the four items he flagged that have a small surface, and stubs out the larger ones for follow-up. Fixed in this PR ================ #1 — Python floor disclosure (README in publish bundle) Add an explicit "Requires Python ≥3.11" section that calls out the cryptic "Could not find a version that satisfies the requirement" failure mode; recommend `pipx install` over `pip install` so the binary lands on PATH automatically; show the explicit `pip install --user` alternative with the PATH caveat. #3 — MOLECULE_WORKSPACE_TOKEN_FILE support (mcp_workspace_resolver.py) Add a third resolution step between the inline env var and the in-container CONFIGS_DIR fallback. Operators can write the bearer to a 0600 file (e.g. ~/.config/molecule/token) and point MOLECULE_WORKSPACE_TOKEN_FILE at it, keeping the secret out of ~/.zsh_history and out of plaintext in MCP-host configs like ~/.claude.json. Inline TOKEN still wins on conflict so rotation flows are predictable. README documents the safer option as the recommended path. 6 new tests pin every leg (file resolves, inline wins, missing/empty file falls through, blank env unset-equivalent, help text advertises it). #4 — Push delivery 3-condition gating (README in publish bundle) Document that real-time push on Claude Code requires (a) the server to declare experimental.claude/channel (we do), (b) the server to be marketplace-plugin-sourced (operators must scaffold their own until the official marketplace lands — see #2934 follow-up), and (c) the --dangerously-load-development-channels flag on the claude invocation. Until any of the three is in place, delivery silently falls back to poll mode with no diagnostic. The README now says all of this explicitly so a new operator doesn't grep the binary for channel_enable to figure it out. #8 — serverInfo.name mismatch (a2a_mcp_server.py) The server reported `serverInfo.name = "a2a-delegation"` while operators register it as `molecule` (the name in `claude mcp add molecule …`). Harmless on tool routing today but matters for any future Claude Code allowlist that gates push by hardcoded server name. Renamed to "molecule" with an inline comment explaining the invariant. Deferred (separate issues to track) =================================== #2 — covered transitively by #1's pipx recommendation; no separate fix. #5 — `moleculesai/claude-code-plugin` marketplace repo (substantial new repo work; the README references it as a documented follow-up). #6 — `molecule-mcp doctor` subcommand (substantial new CLI surface; mentioned in the README's push-vs-poll section as the planned diagnostic for silent push fallback). #7 — `--dangerously-load-development-channels` rename — not in our control; that's Claude Code's flag. Tests ===== 164/164 mcp_cli + a2a_mcp_server tests pass locally (WORKSPACE_ID=00000000-0000-0000-0000-000000000001 pytest …) including 6 new TestTokenFileEnv cases. Wheel builds successfully via scripts/build_runtime_package.py with the new README markers verified in the output. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
abc3affcb6 |
test(a2a_tools): cover inbox tool wrappers to restore 75% per-file floor
After RFC #2873 iter 4d extracted messaging tools to ``a2a_tools_messaging.py``, the only behavior left in ``a2a_tools.py`` is ``report_activity`` (covered by test_a2a_tools_impl) plus three thin wrappers around inbox state — ``tool_inbox_peek``, ``tool_inbox_pop``, ``tool_wait_for_message`` — which were never directly exercised at the module level. Per-file critical-path coverage dropped to 54.4% on the iter 4d branch, breaking the 75% MCP/inbox/auth floor in ci.yml. Adds ``test_a2a_tools_inbox_wrappers.py`` — 14 focused tests on the three wrappers covering: inbox-disabled fallback (via the _INBOX_NOT_ENABLED_MSG sentinel), input validation (empty/non-str activity_id, non-int peek limit), the timeout clamp contract on wait_for_message (300s ceiling, 0s floor, non-numeric fallback to 60s), JSON-shape pinning, and the limit/activity_id forwarding contract. Result: a2a_tools.py back to 100% covered with the existing impl-tests suite, gate green. |
||
|
|
3322524b0f |
Merge remote-tracking branch 'origin/staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d
# Conflicts: # workspace/a2a_tools.py |
||
|
|
0d0840d9d9 | Merge branch 'staging' into refactor/a2a-tools-messaging-extract-rfc2873-iter4d | ||
|
|
fc30b5c9de |
Merge pull request #2905 from Molecule-AI/fix/poll-path-message-enrichment
fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url |
||
|
|
23d3f057d3 |
Merge pull request #2890 from Molecule-AI/refactor/a2a-tools-memory-extract-rfc2873-iter4c
refactor(workspace): extract memory tools from a2a_tools.py (RFC #2873 iter 4c) |
||
|
|
8ca027ddf3 |
fix(tests): drop unused json + pytest imports
Bot lint flagged the two imports as unused (correct — neither is referenced after the file shrank during review). Resolves the two unresolved review threads silently blocking merge per the staging "all conversations resolved" gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
46a4ef83bb |
fix(tests): patch a2a_tools_memory.httpx, not a2a_tools.httpx
Iter 4c (#2890) moved tool_commit_memory + tool_recall_memory into a2a_tools_memory.py, which has its own top-level `import httpx`. test_mcp_memory.py + the secret-redact memory tests still patched `a2a_tools.httpx.AsyncClient`, which after the move is the WRONG module's reference — the real call inside the moved tool resolves to `a2a_tools_memory.httpx.AsyncClient` and reaches the network. CI catches this as 7 failures: JSONDecodeError on empty bodies and "All connection attempts failed" on the recall side. Update 7 patch sites to `a2a_tools_memory.httpx.AsyncClient`. The existing tests in `test_a2a_tools_impl.py` were already updated by the iter-4c PR; only these two files were missed. Verified: pytest workspace/tests/test_mcp_memory.py + test_secret_redact.py — 43/43 pass after the fix (both files were red on the iter-4c branch CI). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
39931acd9c |
fix(inbox-uploads): cancel BatchFetcher futures on wait_all timeout
The deadline contract was incomplete: wait_all logged the timeout but close() then called executor.shutdown(wait=True), which blocked on the leaked workers — undoing the user-facing timeout. The inbox poll loop would stall indefinitely on a hung /content fetch instead of returning to chat-message processing. Fix: wait_all now flips self._timed_out and cancels queued (not-yet- started) futures; close() reads that flag and switches to shutdown(wait=False, cancel_futures=True) on the timeout path. Currently-running workers can't be interrupted by Python's threading model, but they're now detached daemons whose blocking httpx call no longer gates the next poll. Healthy path (no timeout) keeps the existing drain-and-wait so a still-queued ack POST isn't dropped mid-write. Two new tests pin both legs of the contract end-to-end: - close-after-timeout-doesn't-block: hung worker, wait_all(0.05s) fires the timeout, close() returns in <1s instead of waiting ~5s for the worker to come back. - close-without-timeout-still-drains: 2 slow workers, wait_all completes cleanly, close() drains both ack POSTs. Resolves the BatchFetcher timeout-cancellation finding from the post-merge five-axis review of Phase 5b. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
81e83c05b7 |
fix(inbox): drop unused batch_fetcher = None after end-of-batch drain
Lint nit from review bot — _drain_uploads() runs and the function immediately advances to the cursor save + return, so the local re-assign is dead code. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
5b5eacbb29 |
test(inbox): clean up daemon poller thread to prevent test cross-talk
test_start_poller_thread_is_daemon spawned a daemon thread with no stop mechanism — the leaked thread polled every 10ms with the test's patched httpx.Client mock STILL ACTIVE for ~50ms after the test scope. Later tests that re-patched httpx.Client + asserted call counts on fetch_and_stage / Client construction got their assertions inflated by the leaked thread's iterations. Symptoms: test_poll_once_skips_chat_upload_row_from_queue saw fetch_and_stage called twice instead of once on Python 3.11 CI; test_batch_fetcher_owns_client_when_not_supplied saw two Client constructions instead of one in the full local suite. Both surfaced only after Phase 5b's BatchFetcher refactor changed the timing window that allowed the leaked thread to fire mid-test. Fix: extend start_poller_thread with an optional stop_event kwarg (backward compatible — production callers pass None and rely on the daemon flag for process-exit cleanup). The test now signals + joins on stop_event before exiting scope, so the thread is gone before any later test patches httpx. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
30fb507165 |
feat(poll-upload): phase 5b — concurrent BatchFetcher + httpx client reuse
Resolves the two remaining findings from the Phase 1-4 retrospective
review (the Python-side counterparts to phase 5a):
1. Important — inbox_uploads.fetch_and_stage blocked the inbox poll
loop synchronously per row. A user dragging 4 files into chat at
once would stall the poller for 4× per-fetch latency before the
chat message reached the agent. Add BatchFetcher: a thread-pool
wrapper (default 4 workers) that submits fetches concurrently and
exposes wait_all() as the barrier the inbox loop calls before
processing the chat-message row that references the uploads.
The drain barrier is the correctness invariant: rewrite_request_body
must observe a populated URI cache when it walks the chat-message
row's parts. _poll_once now drains the BatchFetcher inline before
the first non-upload row, AND at end-of-batch (case: batch contains
only upload rows; the corresponding chat message arrives in a later
poll, but the future-poll-races-current-fetch race is closed).
2. Nit — fetch_and_stage created two httpx.Client instances per row
(one for GET /content, one for POST /ack). Refactor so a single
client serves both calls. When called from BatchFetcher, the
batch-shared client serves every row's GET + ack — so the second
fetch reuses the TCP+TLS handshake from the first.
Comprehensive tests:
- 13 new inbox_uploads tests:
- fetch_and_stage with supplied client: zero httpx.Client
constructions, GET+POST through the same client, caller's client
not closed (lifecycle owned by caller).
- fetch_and_stage without supplied client: exactly one
httpx.Client constructed (was 2 pre-fix), closed on the way out.
- BatchFetcher: 3 rows × 120ms = parallel completion < 250ms
(vs. ~360ms serial), URI cache hot when wait_all returns,
per-row failure isolation, single-client reuse across all
submits, idempotent close, submit-after-close raises,
owned-vs-supplied client lifecycle, no-op wait_all on empty
batch, graceful httpx-missing degradation.
- 3 new inbox tests:
- poll_once drains uploads before processing the chat-message row
(in-place mutation of row['request_body'] proves the URI was
rewritten BEFORE message_from_activity returned).
- poll_once with only upload rows still drains at end-of-batch.
- poll_once with no upload rows never constructs a BatchFetcher
(zero overhead on the no-upload happy path).
133 total inbox + inbox_uploads tests pass; 0 regressions.
Closes the chat-upload poll-mode-perf gap end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
3d0a7c381b |
fix(workspace): enrich poll-path inbox messages with peer_name/role/card_url
Reported: agents receiving messages via inbox_peek / wait_for_message get a plain envelope — text + peer_id + kind only. The push-path (a2a_mcp_server._build_channel_notification) already enriches the meta dict with peer_name, peer_role, and agent_card_url from the registry cache, but the poll-path returns InboxMessage.to_dict() unchanged. So a Claude Code host with channel-push gets the friendly identity, but every other MCP client (and Claude Code with push disabled — the universal default) sees plain text. This silently breaks the contract documented in a2a_mcp_server.py:303-345: > In both paths the same fields apply: kind, peer_id, peer_name, > peer_role, agent_card_url, activity_id Fix: a2a_tools._enrich_inbound_for_agent() — same shape as the push-path's enrichment, called from tool_inbox_peek and tool_wait_for_message. Cache-first non-blocking (5-min TTL via enrich_peer_metadata_nonblocking, same helper push uses), so a cache miss returns immediately with bare envelope and warms the cache for the next poll. agent_card_url is constructable from peer_id alone and surfaces even on cache miss, so the receiving agent always has a single endpoint to hit for capabilities. Degradation paths: - canvas_user (peer_id="") → pass through unchanged, no enrichment - a2a_client unavailable (test harness without registry) → bare envelope, agent still gets text + peer_id + kind + activity_id Tests: - canvas_user passes through unchanged - peer_agent cache hit → name + role + agent_card_url all present - peer_agent cache miss → agent_card_url still constructed - a2a_client unavailable → bare envelope, no crash All 4 pass against fixed code. Without the fix, the cache-hit and cache-miss tests would fail (peer_name/peer_role/agent_card_url keys absent from to_dict's output). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
6470e5f41b |
Merge pull request #2887 from Molecule-AI/refactor/a2a-tools-delegation-extract-rfc2873-iter4b
refactor(workspace): extract delegation handlers from a2a_tools.py (RFC #2873 iter 4b) |
||
|
|
abba16beb4 |
Merge pull request #2883 from Molecule-AI/refactor/a2a-tools-rbac-extract-rfc2873-iter4a
refactor(workspace): extract RBAC helpers from a2a_tools.py (RFC #2873 iter 4a) |
||
|
|
9c752e0673 |
Merge pull request #2879 from Molecule-AI/refactor/mcp-cli-split-rfc2873-iter3
refactor(workspace): split mcp_cli.py into focused modules (RFC #2873 iter 3) |
||
|
|
8e5d193761 |
fix(tests): retarget get_peers_with_diagnostic patches to a2a_tools_messaging (RFC #2873 iter 4d)
Inherits the iter 4b test retarget commit through rebase. Adds the remaining 4 patch sites in test_a2a_multi_workspace.py that target get_peers_with_diagnostic — that call site moved from a2a_tools to a2a_tools_messaging in this PR. Refs RFC #2873 iter 4d. |
||
|
|
3e0d2e650a |
refactor(workspace): extract messaging tools from a2a_tools.py to a2a_tools_messaging.py (RFC #2873 iter 4d)
Fourth slice of the a2a_tools.py split (stacked on iter 4c). Owns the
four human-and-peer messaging MCP tools + the chat-upload helper:
* _upload_chat_files — stage local paths to /chat/uploads
* tool_send_message_to_user — push canvas-chat via /notify
* tool_list_peers — discover peers across registered workspaces
* tool_get_workspace_info — JSON-encode workspace info
* tool_chat_history — fetch prior conversation rows with a peer
a2a_tools.py shrinks from 508 → 213 LOC (−295). The remaining 213
is just report_activity + back-compat re-exports. Inbox tools
(tool_inbox_peek/pop/wait_for_message) deferred to iter 4e.
Layered architecture: messaging depends on a2a_tools_rbac (iter 4a),
a2a_client, platform_auth — NOT on kitchen-sink a2a_tools. An
import-contract test pins this so future refactors that add
`from a2a_tools import …` fail in CI.
Tests:
* 28 patch sites in TestToolSendMessageToUser + TestToolListPeers +
TestToolGetWorkspaceInfo + TestChatHistory retargeted from
`a2a_tools.{httpx, get_peers_*, get_workspace_info,
_upload_chat_files, _peer_*, list_registered_workspaces}` to
`a2a_tools_messaging.…` because the call sites moved.
* test_a2a_tools_messaging.py adds 7 new tests:
- 5 alias drift gates
- 2 import-contract tests (no top-level a2a_tools dep + a2a_tools
surfaces every messaging symbol)
137 tests total in the a2a_tools suite, all green.
Refs RFC #2873.
|