molecule-ai-workspace-template-claude-code

molecule-ai/molecule-ai-workspace-template-claude-code

Author	SHA1	Message	Date
hongming-pc2	062096b20d	fix(executor): surface the CLI stream error instead of the swallowed-stderr placeholder Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 21s Details CI / Adapter unit tests (pull_request) Successful in 1m38s Details CI / Adapter unit tests (push) Successful in 1m46s Details CI / validate (pull_request) Successful in 7m0s Details CI / validate (push) Successful in 7m3s Details When the `claude` CLI errors mid-stream, claude-agent-sdk throws a bare `Exception("Command failed with exit code 1 …")` whose only text is the useless `Check stderr output for details` placeholder — but the actual failure reason (model 404, rate limit, auth) arrived a moment earlier as a stream-json `ResultMessage(is_error=True)` carrying `result` text and `api_error_status`. That was thrown away. `_run_query` now captures `ResultMessage(is_error=True)` detail (and, as a fallback, the trailing AssistantMessage text) and re-attaches it to the raised exception as `_molecule_stream_detail`. `_format_process_error` surfaces it as `cli_stream_error=…` and, when present, skips the `_probe_claude_cli_error` re-probe (#160) — the probe can't replay the failing `--model`/`--system-prompt` argv, so it may even succeed and mislead. The probe stays as the last resort when there's nothing to salvage. Regression context: the 2026-05-10 dev-team incident — six lead workspaces 404ing on every turn (`--model claude-code` → `api_error_status=404`, "There's an issue with the selected model (claude-code)"), invisible for an hour because the CLI wrote nothing to stderr and this text was discarded. See internal#226 follow-up #5. Tests: tests/test_executor_error_detail.py — 6 cases (format surfaces the salvaged detail; format still probes when there's nothing salvaged; salvaged detail takes precedence over the probe; _run_query annotates from ResultMessage(is_error); _run_query falls back to assistant text; clean success path unaffected). `pytest tests/` → 87 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 03:10:36 -07:00
Claude CEO Assistant	aaa2a79e81	fix(adapter): alias-map yaml_provider for runtime-wheel default Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s Details CI / Adapter unit tests (push) Successful in 1m21s Details CI / Adapter unit tests (pull_request) Successful in 1m18s Details CI / validate (pull_request) Failing after 2m15s Details CI / validate (push) Failing after 5m36s Details The molecule-runtime wheel auto-derives `runtime_config.provider = "anthropic"` from its default model slug `anthropic:claude-opus-4-7` when the per-workspace YAML omits both fields. The adapter receives that derived `anthropic` as `yaml_provider` and rejects it because the providers registry only knows `anthropic-oauth` / `anthropic-api`. The existing alias map (`anthropic` → `anthropic-api`, `claude-code` → `anthropic-oauth`) was applied only on the env-var path; mirroring it on the YAML path resolves the wheel default to a registered provider name. Symptom on staging-cplead-2 (2026-05-09): every workspace booted with `configuration_status=not_configured` and `configuration_error="ValueError: claude-code adapter: workspace config picks provider='anthropic' but it is not in the providers registry"`. Live-patched the running cp-lead workspaces to confirm the fix; this commit lands the durable change in the template repo so freshly-provisioned workspaces don't repeat the wedge. Tests: - test_yaml_provider_anthropic_is_aliased_to_anthropic_api (regression) - test_yaml_provider_claude_code_is_aliased_to_anthropic_oauth (symmetry) - test_yaml_provider_unknown_passes_through_for_actionable_error (guards the silent-fallback bug from #180; unaliased unknowns must still reach _resolve_provider so it raises with the helpful "Known providers: ..." message) All 81 tests pass locally. Refs: staging-cplead-2 incident 2026-05-09 Live-patched workspaces: 941a929e, 99de7cab, a8ba9dc8, a00e74df Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 20:46:02 -07:00
claude-ceo-assistant	8adc3576fd	fix(adapter): map persona-friendly slugs (claude-code, anthropic) to registry names CI / Adapter unit tests (push) Successful in 1m46s Details Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 50s Details CI / Adapter unit tests (pull_request) Successful in 2m16s Details CI / validate (pull_request) Successful in 6m18s Details CI / validate (push) Failing after 18m56s Details Phase 4 verification surfaced a follow-up edge case the initial fix missed: the persona env files use friendlier slugs than the registry's canonical names: * MODEL_PROVIDER=claude-code -> anthropic-oauth (Claude Code subscription) * MODEL_PROVIDER=anthropic -> anthropic-api (direct Anthropic API key) Without an alias map, a lead workspace's MODEL_PROVIDER=claude-code env fell through the slug-detection path; when the YAML didn't pin a provider, the model-prefix matcher saw MODEL=MiniMax-M2.7 and routed the lead to MiniMax — even though CLAUDE_CODE_OAUTH_TOKEN was clearly the intended auth path. Add _PROVIDER_SLUG_ALIASES with the two operator-facing slugs that don't match registry names verbatim. The alias map is consulted before the slug-vs-legacy detection, so claude-code now resolves to anthropic-oauth and the lead boots through OAuth as intended. Tests ----- + test_persona_env_lead_with_minimax_model_routes_via_oauth — lock in the alias-map behavior so a future contributor can't silently re-introduce the lead-mis-routed-to-MiniMax bug. + test_anthropic_alias_resolves_to_anthropic_api — covers the second alias path. Updated test_persona_env_lead_claude_code_resolves_correctly to assert the new (correct) behavior: provider == 'anthropic-oauth', not None. Full adapter suite: 78/78 pass.	2026-05-08 14:23:59 -07:00
claude-ceo-assistant	1742b60e62	fix(adapter): honor MODEL/MODEL_PROVIDER env (persona-env convention) CI / Adapter unit tests (push) Successful in 1m40s Details Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 19s Details CI / Adapter unit tests (pull_request) Failing after 52s Details CI / validate (push) Failing after 2m17s Details CI / validate (pull_request) Successful in 13m19s Details Fixes the 2026-05-08 dev-tree wedge: 22/27 non-lead workspaces (minimax tier) stuck in degraded after /org/import, every chat hanging on `Control request timeout: initialize`. Root cause ---------- The persona env files (`~/.molecule-ai/personas/<name>/env`) declare a TWO- variable convention: - MODEL = model id ("MiniMax-M2.7-highspeed") - MODEL_PROVIDER = provider slug ("minimax") The runtime wheel's legacy `workspace/config.py` interprets MODEL_PROVIDER as the model id — a name chosen long before there was a separate MODEL env. With both set, the legacy code reads MODEL_PROVIDER="minimax" into runtime_config.model. The literal string "minimax" doesn't match any registry prefix (`minimax-` requires a hyphen suffix), falls through to providers[0] (anthropic-oauth), the auth check fails on the absent CLAUDE_CODE_OAUTH_TOKEN, the claude CLI launches anyway, and the SDK's `query.initialize()` 60s control timeout fires. The brief hypothesised `claude_sdk_executor.py` lacked dispatch logic. Phase 1 evidence: dispatch ALREADY exists in adapter.py — model -> provider -> base_url + auth_env routing was correctly built for #180. The bug was upstream: MODEL_PROVIDER's name collision with the persona-env convention silently corrupted the picked model BEFORE adapter.py saw it. Fix --- New helper `_resolve_model_and_provider_from_env` reconciles env vars against YAML inside adapter.setup() and create_executor(): 1. MODEL env -> picked_model (authoritative when set). 2. MODEL_PROVIDER env -> explicit_provider IFF the value matches a registered provider name. Backward-compat: if MODEL is unset and MODEL_PROVIDER doesn't match a registered slug, treat it as a legacy model id (canvas Save+Restart pre-this-fix). 3. YAML runtime_config.{model,provider} fills any field env didn't supply. Contained in the template repo per the brief's scope guidance — does NOT touch the runtime wheel's workspace/config.py (which would need a separate molecule-core PR), and does NOT change the persona-env dispatch policy (Phase 2 mapping 2026-05-08). Tests ----- Eleven new cases in tests/test_env_model_provider_dispatch.py covering: - persona-env shape (minimax, GLM, lead claude-code) -> correct model + slug - legacy MODEL_PROVIDER-as-model-id shape still works - env wins over YAML - YAML fallback when env unset - whitespace/empty defensive handling - case-insensitive provider slug matching Full adapter test suite: 76/76 pass. Verification path ----------------- After image rebuild + workspace re-provision, ws-* containers will boot with provider=minimax (not anthropic-oauth), ANTHROPIC_BASE_URL set to https://api.minimax.io/anthropic, MINIMAX_API_KEY projected onto ANTHROPIC_AUTH_TOKEN, and the SDK init handshake succeeding. Refs: task #181, brief 2026-05-08, related #180 (#7 in this repo)	2026-05-08 14:11:42 -07:00
dev-lead	291f356dab	fix(adapter,tests): isolate _load_providers tests from multi-path lookup Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s Details CI / Adapter unit tests (push) Successful in 1m1s Details CI / Adapter unit tests (pull_request) Successful in 1m2s Details CI / validate (push) Successful in 3m23s Details CI / validate (pull_request) Successful in 3m22s Details The 5 _load_providers tests were single-path-only: they wrote a config.yaml to tmp_path and called _load_providers(str(tmp_path)), expecting the lookup to read tmp_path/config.yaml. After the multi-path fix in #7, _load_providers also checks: 1. _CANONICAL_ADAPTER_DIR/config.yaml (= /opt/adapter/config.yaml) 2. _TEMPLATE_DIR/config.yaml (= dirname(__file__)/config.yaml) 3. ${config_path}/config.yaml (the test's tmp_path) Path 2 finds the repo's bundled config.yaml on the test runner's disk before path 3 — the tests then see the bundled providers list instead of the test's expected behavior. Two surface changes: 1. adapter.py — extract `os.path.dirname(os.path.abspath(__file__))` into a module-level `_TEMPLATE_DIR` constant, mirroring `_CANONICAL_ADAPTER_DIR`. Production behavior identical (resolved once at import). Tests can monkeypatch the module attribute to redirect the path-2 lookup. 2. tests/test_adapter_prevalidate.py — 5 _load_providers tests monkeypatch `_CANONICAL_ADAPTER_DIR` and `_TEMPLATE_DIR` to a non-existent tmp subdir, isolating the test to the workspace config_path branch they always meant to test. The 6th _load_providers test (`test_load_providers_parses_yaml_and_normalizes`) already passed because path 2 returns 7 providers and that's what that test expects — left unchanged. Verification: pytest tests/ 65/65 PASS pytest tests/test_adapter_prevalidate.py -k load_providers 6/6 PASS Closes molecule-core#129 follow-up — the unit tests were the last red on the template repo's CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 13:27:56 -07:00
claude-ceo-assistant	f8d7f8f3a8	test(adapter): install adapter import shims via conftest Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s Details CI / Adapter unit tests (push) Successful in 58s Details CI / Adapter unit tests (pull_request) Successful in 58s Details CI / validate (pull_request) Successful in 2m59s Details CI / validate (push) Successful in 3m0s Details CI runner installs only `pytest pytest-asyncio pyyaml`; without the molecule_runtime/a2a/claude_sdk_executor stubs, the new test_provider_resolution.py fails to collect with ModuleNotFoundError. test_adapter_prevalidate.py owned the same shims via a per-file _install_stubs(), but two files maintaining parallel stub copies eventually disagree on shape (BaseAdapter needing install_plugins_via_registry, etc.). Move the shim install + sys.path bump into tests/conftest.py so every test module shares a single canonical stub set, collected before any test imports adapter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 10:58:51 -07:00
claude-ceo-assistant	a2c7bf3d3b	fix(adapter): honor explicit provider config — fail fast when not in registry (#180 ) Workspace operators set 'provider: minimax' in /configs/config.yaml expecting the adapter to route to MiniMax. Pre-fix behavior: adapter ignored 'provider:' entirely, _resolve_provider model-matched against _BUILTIN_PROVIDERS (anthropic-oauth + anthropic-api only), no model_prefix matched 'MiniMax-M2.7-highspeed', silent fallback to providers[0] (anthropic-oauth) — SDK kept using CLAUDE_CODE_OAUTH_TOKEN, hit OAuth quota under a name the operator never asked for. Fix: _resolve_provider now takes an explicit_provider arg. setup() reads it from runtime_config.provider OR top-level config.yaml provider:. Explicit name in registry → returned. Not in registry → ValueError with the two paths to fix (add provider entry, or switch runtime template). 10 new tests cover: explicit-in-registry returns match, case-insensitive, not-in-registry raises with actionable message, defense-in-depth against silent fallback regression, custom-registry lookup, empty/None treated as no-explicit (back-compat). Closes #180. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 10:58:51 -07:00
Hongming Wang	b9a1fa1b1f	feat: per-vendor env routing for third-party providers (task #244 ) CI / validate (push) Failing after 0s Details CI / Adapter unit tests (push) Failing after 6s Details Third-party Anthropic-compat providers (MiniMax, GLM, Kimi, DeepSeek) all reuse the Anthropic SDK's wire format, which means the claude CLI and claude-code-sdk read the bearer token from ANTHROPIC_AUTH_TOKEN no matter which vendor is being talked to. Pre-#244: * Canvas surfaced the vendor-specific name (MINIMAX_API_KEY, etc.) to the user — so a user who saved only MINIMAX_API_KEY hit a silent 401 on first call. * The boot audit said `MINIMAX_API_KEY=set`, making it look like an SDK bug rather than a routing gap. * A user with multiple vendor keys could only run one workspace at a time because they all fought over the shared ANTHROPIC_AUTH_TOKEN slot. Diagnostic-only audit logging shipped earlier (#32) but the actual routing was never written — task #244 was mismarked complete. Changes: * config.yaml: third-party model `required_env` now references the per-vendor name (MINIMAX_API_KEY, GLM_API_KEY, KIMI_API_KEY, DEEPSEEK_API_KEY) so canvas asks the user for the right key. First-party Anthropic models still use ANTHROPIC_AUTH_TOKEN / CLAUDE_CODE_OAUTH_TOKEN. * config.yaml: each third-party provider's `auth_env` lists the vendor name FIRST (priority order) so projection picks the vendor key over a stale ANTHROPIC_AUTH_TOKEN. * adapter.py: new `_project_vendor_auth(provider)` helper, called from `setup()` right after `_resolve_provider`. Idempotent — only projects when ANTHROPIC_AUTH_TOKEN is unset (operator override always wins). Logs the projection by NAME, never by VALUE (mirrors `_audit_auth_env_presence`). * tests/test_provider_routing.py: 6 new tests pin the contract — vendor-key-set projects, AUTH_TOKEN-already-set is never clobbered, first-party providers skip projection, secret value never leaks into a log record, empty-string vendor env doesn't trigger projection, and the same routing fires for GLM / Kimi / DeepSeek. Mirrors the parallel hermes-side fix from task #249 / hermes PR #38; keeps the two runtimes' multi-vendor UX in lockstep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 22:20:03 -07:00
Hongming Wang	78ae139609	feat(adapter,entrypoint): boot env audit + crash-loop diagnosis logging Adds two operator-visible boot diagnostics that close the diagnosis gap exposed by the 2026-05-02 MiniMax E2E crash-loop. The universal canvas-picked-model fix (Bug B) and per-model required_env (Bug D) live in molecule-core PR #2538 — this PR adds the per-template visibility that complements them so operators can answer "is the key missing or is routing wrong?" from `docker logs` alone. Changes ------- adapter.py: - _AUTH_ENV_AUDIT tuple of 8 vendor env names (CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_API_KEY/AUTH_TOKEN/BASE_URL, MINIMAX/GLM/KIMI/DEEPSEEK_API_KEY). - _audit_auth_env_presence() helper — single INFO line of NAME=set/unset pairs. NEVER logs values; the test pins this with a "fake-secret-MUST- NOT-LEAK" sentinel that must never appear in the log message. - One call site at the end of setup()'s boot banner so every workspace start emits both "which provider got picked" and "which envs are present" in adjacent log lines. entrypoint.sh: - log_boot_context() function fired once before the gosu drop (as root) and once after (as agent) so an operator can spot env values lost across the privilege drop. Emits uid/gid/user/hostname/workspace_id/ platform_url/configs_dir/workspace_dir + the same 8 env names as NAME=set/unset. Mirror of _AUTH_ENV_AUDIT — list pinned in sync by a new AST-style test (test_audit_env_list_matches_entrypoint_sh) that parses entrypoint.sh and asserts set-equality with adapter.py's tuple. tests/test_adapter_logging.py (new): - 4 tests covering the audit contract: every name appears, all-unset scenario, empty-string treated as unset (matches routing semantics), and the cross-file sync gate against entrypoint.sh's for-loop. - Stubs molecule_runtime + a2a so the helpers can be imported without the real wheel installed in CI (mirrors test_adapter_prevalidate.py's scaffolding pattern). Why this complements molecule-core PR #2538 ------------------------------------------- - PR #2538 makes Bug B (canvas-picked model silently dropped) impossible by resolving model centrally in workspace/config.py:load_config — every adapter (claude-code, hermes, codex, future ones) gets the passthrough for free. - PR #2538 makes Bug D (preflight rejects valid auth for non-default models) impossible by REPLACE-not-union per-entry required_env. - This template PR is the per-template observability layer: when one of those universal fixes regresses (or when an operator misconfigs a vendor key), the boot logs say exactly which env was present at each tier. Validated end-to-end on workspace be27badd-00a7-4cef-91e8-af428175c76f (clean boot, MINIMAX_API_KEY=set audited, no crash-loop). Closes part of molecule-monorepo task #248. Sibling of #2538 for molecule-core. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:41:05 -07:00
Hongming Wang	02e4520cf3	chore(executor): runtime_wedge mirror follow-ups from PR #29 review Two review nits: 1. Narrow the import-arm catch in _mark_sdk_wedged and _clear_sdk_wedge_on_success to (ImportError, ModuleNotFoundError). The bare `except Exception:` would have masked an AttributeError / TypeError from a runtime_wedge API rename — silently degrading the mirror to "no-op" and making heartbeat + the smoke gate (#131) blind to claude-code wedges. The structural snapshot test in molecule-core (task #169) catches the rename at PR-time. Older runtimes that don't ship runtime_wedge at all still hit ImportError and silently no-op — the local sticky flag still gates is_wedged() inside this module so internal callers keep working. 2. Add mirror-CALL-failure injection tests. The recorder used by the original tests never raised, so the inner try around _mark_runtime_wedged(reason) (and the symmetric clear) wasn't pinned. New tests inject a recorder whose mark/clear raise on call, then assert: (a) the call attempt was recorded, (b) the local sticky flag stayed correct, (c) the failure was logged at ERROR. Pins both the contract ("mirror is best-effort, local is source of truth") AND the operator-visible signal (an ERROR log line is the only way to see a silent mirror regression). Regression-injection-checked: removing the call-side try arm makes both new tests fail with clear messages. Tests: 7 in test_runtime_wedge_mirror.py, 45 across the whole tests/ tree.	2026-05-01 18:04:24 -07:00
Hongming Wang	b2561aa825	feat(executor): mirror SDK wedge into molecule_runtime.runtime_wedge The local _sdk_wedged_reason flag was only observed inside this module — heartbeat reads runtime_wedge.is_wedged() (universal cross-cutting holder) and so does the new boot-smoke gate from molecule-core PR #2473 / task #131. Without the mirror, a wedged claude-code workspace stayed green-dot on the canvas while every chat hung, AND the publish-image gate could not catch PR-25-class init wedges before the broken image shipped to GHCR. _mark_sdk_wedged now mirrors into runtime_wedge.mark_wedged, and _clear_sdk_wedge_on_success mirrors into runtime_wedge.clear_wedge. Both are best-effort — older runtimes that don't ship runtime_wedge silently no-op the mirror, so a template pinned to an older runtime still boots. Mirror exceptions are logged but don't suppress the local sticky flag, so internal callers (retry loop, cancel handler) see consistent state regardless of the universal-side outcome. Tests cover: mark mirrors with reason, first-call-wins propagates, clear mirrors, no-op when not wedged, ImportError-resilience. Regression-injection-checked: silencing the mirror branch fails the mark+first-wins tests at unit-test time with a clear message naming the missing runtime_wedge call.	2026-05-01 17:52:24 -07:00
Hongming Wang	9eb7d7b6cd	fix(executor): pass tagged server:molecule to dev-channels flag Claude Code 2.1.x changed the flag's signature to take an allowlist of tagged entries — `server:<name>` for manually-configured MCP servers, `plugin:<name>@<marketplace>` for plugin channels. PR #25's `{flag: None}` rendered as a bare `--<flag>` with no value, the CLI rejected with `argument missing`, and the SDK timed out at `initialize`, surfacing upstream as `Control request timeout: initialize` (caught live on workspace dd40faf8 on 2026-05-01 — 100% of A2A turns wedged). Pass `server:molecule` so the SDK forwards `--dangerously-load-development-channels server:molecule`. Live-verified end-to-end: A2A returns coherent replies AND the host claude session renders inbound canvas messages as `<channel source="molecule" ...>` tags inline (push UX without inbox poll). Tests: replace the unconditional `None` pin with a tagged-form pin that asserts the exact `server:molecule` value, plus a defense-in-depth test that pins the invariants (non-None, non-empty, contains tag colon) so any regression to the bare-switch shape fails at unit-test time instead of surfacing as a live SDK initialize wedge. 38/38 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 17:15:49 -07:00
Hongming Wang	874029fca0	Revert "Merge pull request #25 from Molecule-AI/feat/forward-dev-channels-flag" This reverts commit `4d5e85f3a0`, reversing changes made to `b70aa1846b`.	2026-05-01 17:02:55 -07:00
Hongming Wang	a78626ced4	fix(adapter): keep setup() routing — strip prefix only at CLI invocation Live-test revealed a regression in PR #24's setup() strip: the wheel- default `anthropic:claude-opus-4-7` paired with an OAuth workspace (CLAUDE_CODE_OAUTH_TOKEN set, no ANTHROPIC_API_KEY) is the realistic production shape. Stripping in setup() routes those users into the `anthropic-api` provider entry, after which the CLI hangs at `initialize` because no API key env is set. Caught on workspace dd40faf8 on 2026-05-01 — banner went `provider=anthropic-api` and A2A wedged on Control request timeout. Pre-fix routing (let prefixed strings fall through to providers[0] = anthropic-oauth) is actually correct for this combo. The strip is only needed at the CLI invocation site (create_executor) where claude's `--model` arg must be a bare id. Tests: replace `test_setup_strip_routes_prefixed_anthropic_to_anthropic_api` with `test_setup_keeps_prefix_routing_oauth_for_anthropic_prefix`, which pins the inverse — prefixed model + OAuth env stays on oauth and emits no API-key warning. The 5 unit cases on `_strip_provider_prefix` plus the `create_executor` strip pins remain unchanged. 36/36 pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:46:54 -07:00
Hongming Wang	e14f33a670	feat(executor): forward --dangerously-load-development-channels to claude CLI The wheel-side push UX gates (capability + instructions, molecule-core PR #2463) only matter if the host claude CLI is willing to register a non-allowlisted experimental channel. During the channels research preview the CLI requires --dangerously-load-development-channels to bypass its allowlist; without it, every notifications/claude/channel fired by the inbox bridge arrives at the host and is silently dropped. claude-agent-sdk forwards arbitrary CLI flags to the spawned subprocess via ClaudeAgentOptions.extra_args (claude_agent_sdk/_internal/transport/ subprocess_cli.py:340). Wire the flag in unconditionally — the flag is harmless on builds that already allowlist the capability and required on builds during the research preview, so there is no version skew to guard. Remove the line once channels graduate to the default allowlist. Test pins the wiring with a stubbed ClaudeAgentOptions recorder; runs in CI without claude_agent_sdk / a2a / molecule_runtime installed via the same _ensure_module/_ensure_attr pattern as the existing adapter prevalidate test, but tolerates real packages being present locally. Verified by injection: removing the extra_args line makes the test fail with a message naming the missing flag and citing the SDK file that consumes it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:26:58 -07:00
Hongming Wang	6ba4cc6a01	fix(adapter): strip LangChain-style provider prefix before CLI invocation The molecule-runtime wheel's config.py defaults model to `anthropic:claude-opus-4-7` so langchain/crewai consumers get a uniform provider:model string out of the box. The claude CLI's --model arg expects the bare model id and silently exits 1 (no stderr) on prefixed strings — root cause of the 2026-05-01 "Agent error (Exception)" mid-A2A bug. Diagnosed via strace on a live workspace: the CLI received `--model anthropic:claude-opus-4-7` and exit_group(1)'d before any non-fatal output. Add `_strip_provider_prefix` and call it in both setup() (so _resolve_provider routes anthropic:claude-X correctly to anthropic-api instead of falling back to oauth) and create_executor() (so the bare id reaches the CLI). Only known-Claude prefixes are stripped; unknown ones (openai:, bedrock:) pass through so the CLI fails loudly instead of being silently mangled. Coverage: 8 new tests — unit tests for the helper across all branches, end-to-end `create_executor` strip on dict + dataclass shapes, and a caplog-based setup() test that pins provider=anthropic-api routing after the strip (the silent-fallback failure mode this fix eliminates). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:06:53 -07:00
Hongming Wang	2b9b4306eb	fix(adapter): per-entry isolation in _load_providers + tighten _normalize_provider Two correctness issues spotted in self-review of `c6f4912`: 1. String-as-prefix typo split into character tuple. ``model_prefixes: mimo-`` (operator forgot brackets) used to iterate over characters → ``('m','i','m','o','-')``, silently routing every model id starting with 'm', 'i', or '-' through the entry. Now: non-list values coerce to empty tuple (entry survives but matches nothing — operator notices in boot banner, not via misrouted requests). 2. Single bad provider entry nuked the whole registry. _load_providers built the registry via a generator inside tuple(...). One AttributeError mid-comprehension (e.g. ``[mimo-, 123]`` — int's missing .lower()) propagated out, broad except caught it, registry silently fell back to _BUILTIN_PROVIDERS (oauth + anthropic-api only). Every third-party model would then route to anthropic-oauth — exactly the silent-fallback failure mode this PR was meant to eliminate. Now: per-entry try/except drops the bad entry with a warning, rest survives. Also: entries without a string ``name`` field are now dropped with a warning instead of silently using the placeholder ``<unnamed>`` — operator typos surface in boot logs. Tests: 28 passing (3 new regression tests covering both issues plus the no-name path). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 23:58:24 -07:00
Hongming Wang	c6f4912d09	feat(adapter): data-driven provider registry in config.yaml Move the model→endpoint→auth-env mapping out of hardcoded constants in adapter.py + entrypoint.sh into a single `providers:` list at the top of config.yaml. The adapter loads it at boot via _load_providers; canvas Config tab will read the same YAML for its Provider dropdown so UI and adapter never disagree on what's available. Adding a new provider becomes a one-line YAML edit — no Python or shell changes. Includes 5 third-party providers ready out of the box (Anthropic-compat endpoints, Bearer-style ANTHROPIC_AUTH_TOKEN OR ANTHROPIC_API_KEY auth): xiaomi-mimo https://api.xiaomimimo.com/anthropic minimax https://api.minimax.io/anthropic zai https://api.z.ai/api/anthropic (NEW) moonshot https://api.moonshot.ai/anthropic (NEW) deepseek https://api.deepseek.com/anthropic (NEW) Plus 7 new model entries in runtime_config.models (mimo-v2.5, MiniMax-M2, MiniMax-M2.7, GLM-4.6, GLM-4.5, kimi-k2.5, kimi-k2, deepseek-v4-pro, deepseek-v4-flash) so they show up in the Canvas Config dropdown. Operator override unchanged: ANTHROPIC_BASE_URL set as a workspace secret still wins over the registry default — the escape hatch for regional endpoints (Xiaomi token-plan-sgp, MiniMax api.minimaxi.com). entrypoint.sh: drops the `mimo-*` case mapping (adapter handles routing now). _BUILTIN_PROVIDERS retained as malformed-YAML fallback so a bare-bones workspace still boots with oauth + anthropic-api defaults. Tests: 25 passing. New coverage: - YAML parses + normalizes to expected shape - Malformed YAML falls back to builtins (warning, not raise) - Each new provider routes its model id to the right base_url - ANTHROPIC_AUTH_TOKEN alone satisfies third-party auth check - Operator-set ANTHROPIC_BASE_URL overrides registry default - Case-insensitive prefix match (MiniMax-M2 / minimax-m2.7 / GLM-4.6) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 23:29:40 -07:00
Hongming Wang	c646b8cebe	feat(adapter): raise on third-party model without ANTHROPIC_BASE_URL Aligns setup()'s third-party-model-without-URL handling with create_executor()'s pre-validate (#19) — both unrecoverable misconfigurations now raise ValueError at boot instead of one warning and one raising. Why: a third-party (mimo-) model selected without ANTHROPIC_BASE_URL sends every LLM request to api.anthropic.com with a non-Anthropic key, 401-ing every prompt. Workspace boots, looks "online" via heartbeat, but is structurally broken on the user-facing path. The previous warning-only path produced the same end-user symptom as the 2026-04-30 incident (workspace looks alive, every interaction fails) just via a different misconfig shape. Symmetry: create_executor raises when ANTHROPIC_BASE_URL is set to a non-Anthropic host but no model is picked. setup() now raises when a third-party model is picked but no URL is set. Together they catch both halves of the misconfig surface at boot, before the workspace enters "online" status. Adds 4 setup() tests: - raises on third-party + no URL - passes on third-party + URL - passes on OAuth alias (sonnet) + no URL - passes on Anthropic API id (claude-) + no URL Stubs molecule_runtime.plugins.load_plugins as a no-op so the pass-path tests run cleanly without the runtime installed. Test count: 11 (7 create_executor + 4 setup). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:50:25 -07:00
Hongming Wang	0d95b5098a	feat(adapter): pre-validate ANTHROPIC_BASE_URL + missing model combo The 2026-04-30 staging incident traced back to workspaces booting with ANTHROPIC_BASE_URL pointing at a non-Anthropic shim (MiniMax / OpenAI gateway) but no explicit model configured. The adapter silently fell back to "sonnet" — an Anthropic-native alias the upstream didn't recognize — and the SDK --print probe hung 30s before timing out. Platform's phantom-busy sweep then nuked the workspace at 10min, producing "every workspace dead" with the root cause buried in a 30s subprocess hang. Pre-validate the combo at adapter boot: when ANTHROPIC_BASE_URL host is non-Anthropic AND no explicit model is set, raise ValueError with an actionable message pointing to MODEL_PROVIDER / runtime_config.model. Also log the resolved model + base_url_host every boot so future failures explain themselves in the workspace logs without digging into the SDK subprocess. Tests live under tests/ with their own pytest.ini that anchors rootdir there — keeps pytest from importing the package __init__.py (which does the runtime-discovery relative import that requires molecule_runtime installed). 7 tests cover: misconfig raises with the right message, Anthropic-native passes, no-base-url passes, custom-url + explicit model passes, dataclass + dict shapes, unparseable URL no-crash. CI runs them on every push/PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 22:35:49 -07:00

20 Commits