Compare commits

..

10 Commits

Author SHA1 Message Date
technical-writer e2198ebe9c Merge branch 'merge/pr27-sop-checklist-gate' into merge/integration 2026-05-13 11:08:32 +00:00
technical-writer 59b6662b94 Merge branch 'merge/pr30-dev-channels-flag' into merge/integration 2026-05-13 11:08:30 +00:00
technical-writer 412d3ea2f7 Merge branch 'merge/pr31-changelog-security' into merge/integration 2026-05-13 11:08:27 +00:00
technical-writer 7a2de6677a docs(changelog): fix encoding — "after失败" → "after failure"
Chinese characters embedded in English sentence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 11:06:34 +00:00
technical-writer 186929bd9d docs(changelog): fix encoding — "after失败" → "after failure"
Chinese characters embedded in English sentence. Spotted during
changelog duplicate-section merge.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 11:03:30 +00:00
documentation-specialist 6265ce5ec1 docs(security): add CWE-22 regression fix entry for 2026-05-13
Secret scan / secret-scan (pull_request) Successful in 26s
CI / build (pull_request) Successful in 3m2s
Pairs molecule-core#810 (Critical CWE-22 path traversal regression in
org_import.go). Also adds full 2026-05-13 changelog entry covering:
- CWE-22 path traversal fix (security section)
- stop_event graceful shutdown feature (SDK Python #8)
- PLATFORM_URL default alignment (workspace-runtime #12)
- Canvas CI hardening (core #773/776/777)
- Go lint CI hardening (core #781)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 08:23:48 +00:00
technical-writer 43b9d16503 docs(runtime-mcp): add dev-channels flag callout + See also cross-reference
Secret scan / secret-scan (pull_request) Successful in 40s
CI / build (pull_request) Successful in 2m22s
- Add a Callout warning in the Claude Code setup section linking to the
  dev-channels-flag page — directly surfaces the bare-flag failure mode
  at the point where operators configure the integration.
- Add dev-channels-flag to the See also section.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 05:20:39 +00:00
technical-writer bc1102366d docs(runtime-mcp): add dev-channels flag tagged-form requirement page
Documents why Claude Code 2.1.x+ requires
`--dangerously-load-development-channels server:molecule` (not the bare
flag) for inline channel push from the molecule-mcp wheel.

Covers: required tagged-allowlist form, the three-layer failure mode
when the flag is bare, Molecule operator setup, SDK integration via
extra_args, verification steps, and the research-preview gate forward note.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 05:19:19 +00:00
technical-writer 15a952c213 docs(changelog): fix duplicate 2026-05-10 sections and restore chronological order
Secret scan / secret-scan (pull_request) Successful in 34s
CI / build (pull_request) Successful in 2m15s
- Merge two ## 2026-05-10 sections into one (was: first at line 63, second at line 173, with ## 2026-04-23 incorrectly sandwiched between them)
- Move ## 2026-05-10 "Late-day updates" sub-section from ## 2026-04-23 into the merged ## 2026-05-10 section
- Move ## 2026-04-23 section to its correct chronological position (after ## 2026-04-22, before ## 2026-04-17)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 05:03:21 +00:00
claude-ceo-assistant edb6ee856b ci: add SOP checklist gate
Secret scan / secret-scan (pull_request) Successful in 10s
CI / build (pull_request) Successful in 2m13s
2026-05-12 20:38:13 -07:00
9 changed files with 1366 additions and 276 deletions
+823
View File
@@ -0,0 +1,823 @@
#!/usr/bin/env python3
# sop-checklist-gate — evaluate whether a PR has peer-acked each
# SOP-checklist item. Posts a commit-status that branch protection
# can require.
#
# RFC#351 Step 2 of 6 (implementation MVP).
#
# Invoked by .gitea/workflows/sop-checklist-gate.yml on:
# - pull_request_target: [opened, edited, synchronize, reopened]
# - issue_comment: [created, edited, deleted]
#
# Flow:
# 1. Load .gitea/sop-checklist-config.yaml (from BASE ref — trusted).
# 2. GET /repos/{R}/pulls/{N} — author, head.sha, tier label
# 3. GET /repos/{R}/issues/{N}/comments — extract /sop-ack and /sop-revoke
# 4. For each checklist item:
# a. Is the section marker present in PR body? (author answered)
# b. Is there ≥1 unrevoked /sop-ack from a non-author whose
# team-membership matches required_teams?
# 5. POST /repos/{R}/statuses/{sha} — context
# `sop-checklist / all-items-acked (pull_request)`,
# state=success | failure | pending, description=`acked: N/M …`.
#
# Trust boundary (mirrors RFC#324 §A4):
# This script is loaded from the BASE branch. The workflow's
# actions/checkout step pins ref=base.sha. PR-HEAD code is never
# executed. We only HTTP-call the Gitea API.
#
# Token scope:
# - read:repository / read:organization to enumerate PR + comments
# + team membership (Gitea 1.22.6 quirk: team-membership endpoint
# returns 403 if token owner is not in the team; see review-check.sh
# for the same gotcha — we surface the same fail-closed message).
# - write:repository for `POST /repos/{R}/statuses/{sha}`. Unlike
# RFC#324's pattern (which uses the JOB's own pass/fail as the
# status), we POST the status explicitly because the gate posts
# a single multi-item status with a richer description than a
# bare success/failure context can carry.
#
# Slug normalization rules (canonical form: kebab-case):
# - Lowercase
# - Whitespace + underscores → single dash
# - Strip non [a-z0-9-] characters
# - Collapse adjacent dashes
# - Strip leading/trailing dashes
# - If the result is a digit string (e.g. "1"), look up via
# config.items[*].numeric_alias to get the kebab-case slug.
#
# Examples:
# "Comprehensive_Testing" → "comprehensive-testing"
# "comprehensive testing" → "comprehensive-testing"
# "1" → "comprehensive-testing"
# "Five-Axis-Review" → "five-axis-review"
#
# Revoke semantics:
# /sop-revoke <slug> [reason] — most-recent comment per (slug, user)
# wins. So if Alice posts /sop-ack X then later /sop-revoke X, her ack
# for X is invalidated. Bob's prior /sop-ack X is unaffected. If Alice
# posts /sop-revoke X then later /sop-ack X again, the ack is restored.
from __future__ import annotations
import argparse
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
# ---------------------------------------------------------------------------
# Slug normalization
# ---------------------------------------------------------------------------
_NORMALIZE_REPLACE_RE = re.compile(r"[\s_]+")
_NORMALIZE_STRIP_RE = re.compile(r"[^a-z0-9-]")
_NORMALIZE_DASH_RE = re.compile(r"-+")
def normalize_slug(raw: str, numeric_aliases: dict[int, str] | None = None) -> str:
"""Normalize a user-supplied slug to canonical kebab-case form.
See module header for the rules.
If the input is a pure digit string AND numeric_aliases is provided,
the alias mapping is consulted. Unknown digits return "" so the caller
can flag the comment as unparseable.
"""
if raw is None:
return ""
s = raw.strip().lower()
s = _NORMALIZE_REPLACE_RE.sub("-", s)
s = _NORMALIZE_STRIP_RE.sub("", s)
s = _NORMALIZE_DASH_RE.sub("-", s)
s = s.strip("-")
if s.isdigit() and numeric_aliases is not None:
return numeric_aliases.get(int(s), "")
return s
# ---------------------------------------------------------------------------
# Comment parsing — /sop-ack and /sop-revoke
# ---------------------------------------------------------------------------
# A directive must be on its own line. Permits leading whitespace.
# Optional trailing note after the slug for /sop-ack and required reason
# for /sop-revoke (RFC#351 open question 4 — reason is captured but not
# yet validated; future iteration may require a min-length).
_DIRECTIVE_RE = re.compile(
r"^[ \t]*/(sop-ack|sop-revoke)[ \t]+([A-Za-z0-9_\- ]+?)(?:[ \t]+(.*))?[ \t]*$",
re.MULTILINE,
)
def parse_directives(
comment_body: str,
numeric_aliases: dict[int, str],
) -> list[tuple[str, str, str]]:
"""Extract /sop-ack and /sop-revoke directives from a comment body.
Returns a list of (kind, canonical_slug, note) tuples where:
kind is "sop-ack" or "sop-revoke"
canonical_slug is the normalized form (or "" if unparseable)
note is the trailing free-text (may be "")
"""
out: list[tuple[str, str, str]] = []
if not comment_body:
return out
for m in _DIRECTIVE_RE.finditer(comment_body):
kind = m.group(1)
raw_slug = (m.group(2) or "").strip()
# If the raw match included trailing words, the regex non-greedy
# captured only the first token; strip again for safety.
# We split on whitespace to keep the FIRST word as the slug, and
# everything after as the note.
parts = raw_slug.split()
if not parts:
continue
first = parts[0]
# If the slug-capture greedily matched multiple words (e.g.
# "comprehensive testing"), preserve normalize behavior: join
# the WHOLE first-word-token only; trailing words get appended to
# the note. The regex limits group(2) to [A-Za-z0-9_\- ] so we
# may have multi-word forms here — normalize handles them.
if len(parts) > 1:
# User wrote "/sop-ack comprehensive testing extra-note"
# → treat "comprehensive testing" as the slug source if it
# normalizes to a known item; otherwise treat "comprehensive"
# as slug and "testing extra-note" as note. We defer the
# disambiguation to the caller via the returned canonical
# slug. For simplicity: try the WHOLE captured string first.
canonical = normalize_slug(raw_slug, numeric_aliases)
else:
canonical = normalize_slug(first, numeric_aliases)
note_from_group = (m.group(3) or "").strip()
# If we collapsed multi-word slug into kebab and there's a
# trailing-text group too, append it.
out.append((kind, canonical, note_from_group))
return out
# ---------------------------------------------------------------------------
# PR body section detection
# ---------------------------------------------------------------------------
def section_marker_present(body: str, marker: str) -> bool:
"""Return True if `marker` appears in `body` case-insensitively
on a non-empty line (i.e. the author actually filled it in).
We require the marker substring AND non-whitespace content on the
same line OR within the next line — this prevents trivially-empty
checklists like:
## SOP-Checklist
- [ ] **Comprehensive testing performed**:
- [ ] **Local-postgres E2E run**:
from auto-passing the section-present check. The peer-ack is still
required, but answering with empty content is captured as a soft
finding via the section-present test alone.
"""
if not body or not marker:
return False
body_lower = body.lower()
marker_lower = marker.lower()
idx = body_lower.find(marker_lower)
if idx < 0:
return False
# Walk to end of line.
line_end = body.find("\n", idx)
if line_end < 0:
line_end = len(body)
line = body[idx + len(marker):line_end]
# Strip the colon + checkbox tail patterns; require at least one
# non-whitespace, non-punctuation char.
stripped = re.sub(r"[\s\*:\-\[\]]+", "", line)
if stripped:
return True
# Fall through: check the NEXT line (multi-line answers).
next_line_end = body.find("\n", line_end + 1)
if next_line_end < 0:
next_line_end = len(body)
next_line = body[line_end + 1:next_line_end]
stripped_next = re.sub(r"[\s\*:\-\[\]]+", "", next_line)
return bool(stripped_next)
# ---------------------------------------------------------------------------
# Ack-state computation
# ---------------------------------------------------------------------------
def compute_ack_state(
comments: list[dict[str, Any]],
pr_author: str,
items_by_slug: dict[str, dict[str, Any]],
numeric_aliases: dict[int, str],
team_membership_probe: "callable[[str, list[str]], list[str]]",
) -> dict[str, dict[str, Any]]:
"""Compute per-item ack state.
Each comment is processed in chronological order. The most-recent
directive per (commenter, slug) wins.
Returns a dict keyed by canonical slug:
{
"comprehensive-testing": {
"ackers": ["bob"], # non-author, team-verified
"rejected_ackers": { # debugging info
"self_ack": ["alice"],
"unknown_slug": [],
"not_in_team": ["eve"],
}
},
...
}
"""
# Step 1: collapse directives per (commenter, slug) — most recent wins.
# comments are expected to come in chronological order from the
# API (Gitea returns oldest-first by default for issues/{N}/comments).
latest_directive: dict[tuple[str, str], str] = {} # (user, slug) → kind
unparseable_per_user: dict[str, int] = {}
for c in comments:
body = c.get("body", "") or ""
user = (c.get("user") or {}).get("login", "")
if not user:
continue
for kind, slug, _note in parse_directives(body, numeric_aliases):
if not slug:
unparseable_per_user[user] = unparseable_per_user.get(user, 0) + 1
continue
latest_directive[(user, slug)] = kind
# Step 2: build candidate ackers per slug.
# Filter out self-acks and unknown slugs.
ackers_per_slug: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_self: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_unknown: dict[str, list[str]] = {s: [] for s in items_by_slug}
pending_team_check: dict[str, list[str]] = {s: [] for s in items_by_slug}
for (user, slug), kind in latest_directive.items():
if kind != "sop-ack":
continue # revokes leave the (user,slug) state as "no ack"
if slug not in items_by_slug:
# Slug normalized to something not in our config — store
# under a synthetic key for diagnostic surfacing. Don't add
# to any item.
continue
if user == pr_author:
rejected_self[slug].append(user)
continue
pending_team_check[slug].append(user)
# Step 3: team membership probe per slug (batched per slug to keep
# API call count down — same user may ack multiple items but the
# required_teams differ per item, so we MUST probe per (user, item)).
rejected_not_in_team: dict[str, list[str]] = {s: [] for s in items_by_slug}
for slug, candidates in pending_team_check.items():
if not candidates:
continue
required = items_by_slug[slug]["required_teams"]
approved = team_membership_probe(slug, candidates) # returns subset
rejected_not_in_team[slug] = [u for u in candidates if u not in approved]
ackers_per_slug[slug] = approved
# Stash required teams for description rendering.
items_by_slug[slug]["_required_resolved"] = required
return {
slug: {
"ackers": ackers_per_slug[slug],
"rejected": {
"self_ack": rejected_self[slug],
"not_in_team": rejected_not_in_team[slug],
},
}
for slug in items_by_slug
}
# ---------------------------------------------------------------------------
# Gitea API client
# ---------------------------------------------------------------------------
class GiteaClient:
def __init__(self, host: str, token: str):
self.base = f"https://{host}/api/v1"
self.token = token
# Cache team-name → team-id resolutions per org.
self._team_id_cache: dict[tuple[str, str], int | None] = {}
def _req(
self,
method: str,
path: str,
body: dict[str, Any] | None = None,
ok_codes: tuple[int, ...] = (200, 201, 204),
) -> tuple[int, Any]:
url = self.base + path
data = None
headers = {
"Authorization": f"token {self.token}",
"Accept": "application/json",
}
if body is not None:
data = json.dumps(body).encode("utf-8")
headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, method=method, data=data, headers=headers)
try:
with urllib.request.urlopen(req, timeout=20) as r:
raw = r.read()
code = r.getcode()
except urllib.error.HTTPError as e:
code = e.code
raw = e.read()
try:
parsed = json.loads(raw.decode("utf-8")) if raw else None
except json.JSONDecodeError:
parsed = raw.decode("utf-8", errors="replace") if raw else None
return code, parsed
def get_pr(self, owner: str, repo: str, pr: int) -> dict[str, Any]:
code, data = self._req("GET", f"/repos/{owner}/{repo}/pulls/{pr}")
if code != 200:
raise RuntimeError(f"GET pulls/{pr} → HTTP {code}: {data!r}")
return data
def get_issue_comments(
self, owner: str, repo: str, issue: int
) -> list[dict[str, Any]]:
# Paginate. Gitea default page size 50.
out: list[dict[str, Any]] = []
page = 1
while True:
code, data = self._req(
"GET",
f"/repos/{owner}/{repo}/issues/{issue}/comments?limit=50&page={page}",
)
if code != 200:
raise RuntimeError(
f"GET issues/{issue}/comments page={page} → HTTP {code}: {data!r}"
)
if not data:
break
out.extend(data)
if len(data) < 50:
break
page += 1
return out
def resolve_team_id(self, org: str, team_name: str) -> int | None:
key = (org, team_name)
if key in self._team_id_cache:
return self._team_id_cache[key]
code, data = self._req("GET", f"/orgs/{org}/teams/search?q={urllib.parse.quote(team_name)}")
team_id = None
if code == 200 and isinstance(data, dict):
for t in data.get("data", []):
if t.get("name") == team_name:
team_id = t.get("id")
break
if team_id is None and code == 200 and isinstance(data, list):
for t in data:
if t.get("name") == team_name:
team_id = t.get("id")
break
self._team_id_cache[key] = team_id
return team_id
def is_team_member(self, team_id: int, login: str) -> bool | None:
"""Return True / False / None (unknown — 403 from API)."""
code, _ = self._req(
"GET", f"/teams/{team_id}/members/{urllib.parse.quote(login)}"
)
if code in (200, 204):
return True
if code == 404:
return False
# 403 means the token owner isn't in this team, so the API
# refuses to confirm membership. Fail-closed at the caller.
return None
def post_status(
self,
owner: str,
repo: str,
sha: str,
state: str,
context: str,
description: str,
target_url: str = "",
) -> None:
body = {
"state": state,
"context": context,
"description": description[:140], # Gitea truncates to 255 but be safe
"target_url": target_url or "",
}
code, data = self._req(
"POST",
f"/repos/{owner}/{repo}/statuses/{sha}",
body=body,
ok_codes=(201,),
)
if code not in (200, 201):
raise RuntimeError(
f"POST statuses/{sha} → HTTP {code}: {data!r}"
)
# ---------------------------------------------------------------------------
# Config loader (PyYAML-free — config file is intentionally tiny + flat)
# ---------------------------------------------------------------------------
def load_config(path: str) -> dict[str, Any]:
"""Load .gitea/sop-checklist-config.yaml.
Uses PyYAML if available, otherwise falls back to a built-in
minimal parser sufficient for our flat config shape. Bundling
PyYAML on the runner is one apt install away but we avoid the
dep by keeping the config shape constrained.
"""
try:
import yaml # type: ignore[import-not-found]
with open(path) as f:
return yaml.safe_load(f)
except ImportError:
return _load_config_minimal(path)
def _load_config_minimal(path: str) -> dict[str, Any]:
"""Minimal YAML subset parser for our config shape.
Supports: top-level scalar:value, top-level map-of-map (e.g.
tier_failure_mode), top-level list of maps (items:), and within an
item map: scalars + lists of scalars. Does NOT support nested lists,
YAML anchors, multi-doc, or flow style.
"""
with open(path) as f:
lines = f.readlines()
return _parse_minimal_yaml(lines)
def _parse_minimal_yaml(lines: list[str]) -> dict[str, Any]: # noqa: C901
"""Hand-rolled subset parser. See _load_config_minimal docstring."""
# Strip comments + blank lines but preserve indentation.
cleaned: list[tuple[int, str]] = []
for raw in lines:
# Don't strip a "#" that is inside a quoted value.
body = raw.rstrip("\n")
# Remove trailing comment.
idx = body.find("#")
if idx >= 0 and (idx == 0 or body[idx - 1] in " \t"):
body = body[:idx].rstrip()
if not body.strip():
continue
indent = len(body) - len(body.lstrip(" "))
cleaned.append((indent, body.strip()))
root: dict[str, Any] = {}
i = 0
n = len(cleaned)
def parse_scalar(s: str) -> Any:
s = s.strip()
if s.startswith('"') and s.endswith('"'):
return s[1:-1]
if s.startswith("'") and s.endswith("'"):
return s[1:-1]
if s.lower() in ("true", "yes"):
return True
if s.lower() in ("false", "no"):
return False
try:
return int(s)
except ValueError:
pass
return s
def parse_inline_list(s: str) -> list[Any]:
s = s.strip()
if not (s.startswith("[") and s.endswith("]")):
return [parse_scalar(s)]
inner = s[1:-1]
if not inner.strip():
return []
return [parse_scalar(x.strip()) for x in inner.split(",")]
while i < n:
indent, line = cleaned[i]
if indent != 0:
i += 1
continue
if ":" not in line:
i += 1
continue
key, _, rest = line.partition(":")
key = key.strip()
rest = rest.strip()
if rest == "":
# Block — could be map or list.
i += 1
# Look ahead for first child.
if i < n and cleaned[i][1].startswith("- "):
# List of items.
items: list[Any] = []
while i < n and cleaned[i][0] > indent and cleaned[i][1].startswith("- "):
item_indent = cleaned[i][0]
first_kv = cleaned[i][1][2:].strip() # strip "- "
item: dict[str, Any] = {}
if ":" in first_kv:
k, _, v = first_kv.partition(":")
k = k.strip()
v = v.strip()
if v == "":
item[k] = ""
elif v.startswith(">-") or v.startswith(">"):
# Folded scalar continues on subsequent indented lines
collected: list[str] = []
i += 1
while i < n and cleaned[i][0] > item_indent:
collected.append(cleaned[i][1])
i += 1
item[k] = " ".join(collected)
items.append(item)
continue
elif v.startswith("["):
item[k] = parse_inline_list(v)
else:
item[k] = parse_scalar(v)
i += 1
# Subsequent k:v lines at deeper indent belong to this item.
while i < n and cleaned[i][0] > item_indent and not cleaned[i][1].startswith("- "):
sub_indent, sub_line = cleaned[i]
if ":" in sub_line:
k, _, v = sub_line.partition(":")
k = k.strip()
v = v.strip()
if v == "":
item[k] = ""
i += 1
elif v.startswith(">-") or v.startswith(">"):
collected = []
i += 1
while i < n and cleaned[i][0] > sub_indent:
collected.append(cleaned[i][1])
i += 1
item[k] = " ".join(collected)
elif v.startswith("["):
item[k] = parse_inline_list(v)
i += 1
else:
item[k] = parse_scalar(v)
i += 1
else:
i += 1
items.append(item)
root[key] = items
else:
# Sub-map.
submap: dict[str, Any] = {}
while i < n and cleaned[i][0] > indent:
sub_indent, sub_line = cleaned[i]
if ":" in sub_line:
k, _, v = sub_line.partition(":")
k = k.strip().strip('"').strip("'")
v = v.strip()
if v.startswith("[") and v.endswith("]"):
submap[k] = parse_inline_list(v)
else:
submap[k] = parse_scalar(v)
i += 1
root[key] = submap
else:
# Inline scalar or list.
if rest.startswith("[") and rest.endswith("]"):
root[key] = parse_inline_list(rest)
else:
root[key] = parse_scalar(rest)
i += 1
return root
# ---------------------------------------------------------------------------
# Main entry point
# ---------------------------------------------------------------------------
def render_status(
items: list[dict[str, Any]],
ack_state: dict[str, dict[str, Any]],
body_state: dict[str, bool],
) -> tuple[str, str]:
"""Return (state, description) for the commit-status post.
state is "success" if every item has at least one valid ack
(body section presence is informational only — peer-ack is the
real gate). "pending" is reserved for the soft-fail path
(tier:low) and is set by the caller.
"""
n = len(items)
fully_acked = [
it["slug"] for it in items if ack_state[it["slug"]]["ackers"]
]
missing = [
it["slug"] for it in items if not ack_state[it["slug"]]["ackers"]
]
missing_body = [it["slug"] for it in items if not body_state.get(it["slug"], False)]
desc_parts = [f"acked: {len(fully_acked)}/{n}"]
if missing:
# Show up to 3 missing slugs to stay inside the 140-char budget.
shown = ", ".join(missing[:3])
if len(missing) > 3:
shown += f", +{len(missing) - 3}"
desc_parts.append(f"missing: {shown}")
if missing_body:
desc_parts.append(f"body-unfilled: {len(missing_body)}")
state = "success" if not missing else "failure"
return state, "".join(desc_parts)
def get_tier_mode(pr: dict[str, Any], cfg: dict[str, Any]) -> str:
"""Read tier label, return 'hard' or 'soft' per cfg.tier_failure_mode."""
labels = pr.get("labels") or []
tier_labels = [l.get("name", "") for l in labels if (l.get("name", "") or "").startswith("tier:")]
mode_map = cfg.get("tier_failure_mode") or {}
default_mode = cfg.get("default_mode", "hard")
for tl in tier_labels:
if tl in mode_map:
return mode_map[tl]
return default_mode
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser()
p.add_argument("--owner", required=True)
p.add_argument("--repo", required=True)
p.add_argument("--pr", type=int, required=True)
p.add_argument("--config", default=".gitea/sop-checklist-config.yaml")
p.add_argument("--gitea-host", default="git.moleculesai.app")
p.add_argument(
"--dry-run",
action="store_true",
help="Compute state but do not POST the status.",
)
p.add_argument(
"--status-context",
default="sop-checklist / all-items-acked (pull_request)",
)
p.add_argument(
"--exit-on-state",
action="store_true",
help=(
"If set, exit non-zero when state=failure. Default OFF so the "
"job-level conclusion is independent of ack-state — the only "
"thing BP sees is the POSTed status. Useful for local debugging."
),
)
args = p.parse_args(argv)
token = os.environ.get("GITEA_TOKEN", "")
if not token and not args.dry_run:
print("::error::GITEA_TOKEN env required", file=sys.stderr)
return 2
cfg = load_config(args.config)
items: list[dict[str, Any]] = cfg["items"]
items_by_slug = {it["slug"]: it for it in items}
numeric_aliases = {
int(it["numeric_alias"]): it["slug"] for it in items if it.get("numeric_alias")
}
client = GiteaClient(args.gitea_host, token) if token else None
if not client:
print("::error::No client (dry-run without token has nothing to do)", file=sys.stderr)
return 2
pr = client.get_pr(args.owner, args.repo, args.pr)
if pr.get("state") != "open":
print(f"::notice::PR #{args.pr} is {pr.get('state')} — gate is a no-op")
return 0
author = (pr.get("user") or {}).get("login", "")
head_sha = (pr.get("head") or {}).get("sha", "")
body = pr.get("body", "") or ""
if not author or not head_sha:
print("::error::PR payload missing user.login or head.sha", file=sys.stderr)
return 1
comments = client.get_issue_comments(args.owner, args.repo, args.pr)
# Build team-membership probe closure that caches results per
# (user, team-id) so a user acking multiple items only triggers
# one membership lookup per team.
team_member_cache: dict[tuple[str, int], bool | None] = {}
def probe(slug: str, users: list[str]) -> list[str]:
item = items_by_slug[slug]
team_names: list[str] = item["required_teams"]
# Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
# available — fall back to the list endpoint.
team_ids: list[int] = []
for tn in team_names:
tid = client.resolve_team_id(args.owner, tn)
if tid is None:
# Try the list endpoint as a fallback.
code, data = client._req( # noqa: SLF001
"GET", f"/orgs/{args.owner}/teams"
)
if code == 200 and isinstance(data, list):
for t in data:
if t.get("name") == tn:
tid = t.get("id")
client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001
break
if tid is not None:
team_ids.append(tid)
else:
print(
f"::warning::could not resolve team-id for '{tn}' "
f"in org '{args.owner}' — item '{slug}' will fail closed",
file=sys.stderr,
)
approved: list[str] = []
for u in users:
for tid in team_ids:
cache_key = (u, tid)
if cache_key not in team_member_cache:
team_member_cache[cache_key] = client.is_team_member(tid, u)
result = team_member_cache[cache_key]
if result is True:
approved.append(u)
break
if result is None:
print(
f"::warning::team-probe for {u} in team-id {tid} returned 403 "
"(token owner not in that team — fail-closed per RFC#324)",
file=sys.stderr,
)
# Treat as not-in-team for this user/team pair; loop
# may still find membership in another team.
return approved
ack_state = compute_ack_state(comments, author, items_by_slug, numeric_aliases, probe)
body_state = {it["slug"]: section_marker_present(body, it["pr_section_marker"]) for it in items}
state, description = render_status(items, ack_state, body_state)
mode = get_tier_mode(pr, cfg)
if state == "failure" and mode == "soft":
state = "pending"
description = f"[soft-fail tier:low] {description}"
# Diagnostics to job log.
print(f"::notice::PR #{args.pr} author={author} head={head_sha[:7]} mode={mode}")
for it in items:
slug = it["slug"]
ackers = ack_state[slug]["ackers"]
if ackers:
print(f"::notice:: [PASS] {slug} — acked by {','.join(ackers)}")
else:
r = ack_state[slug]["rejected"]
extras: list[str] = []
if r["self_ack"]:
extras.append(f"self-acks-rejected:{','.join(r['self_ack'])}")
if r["not_in_team"]:
extras.append(f"not-in-team:{','.join(r['not_in_team'])}")
extra = " (" + "; ".join(extras) + ")" if extras else ""
print(f"::notice:: [WAIT] {slug} — no valid peer-ack yet{extra}")
print(f"::notice::posting status: state={state} desc={description!r}")
if args.dry_run:
print("::notice::--dry-run: not posting status")
if args.exit_on_state:
return 0 if state in ("success", "pending") else 1
return 0
target_url = f"https://{args.gitea_host}/{args.owner}/{args.repo}/pulls/{args.pr}"
client.post_status(
args.owner, args.repo, head_sha,
state=state, context=args.status_context,
description=description, target_url=target_url,
)
print(f"::notice::status posted: {args.status_context}{state}")
# By default exit 0 — the POSTed status IS the gate, NOT the job
# conclusion. If the job exits 1 BP will see TWO failure signals
# (one from the job's auto-status, one from our POST), making the
# description less actionable. --exit-on-state restores the old
# behavior for local debugging.
if args.exit_on_state:
return 0 if state in ("success", "pending") else 1
return 0
if __name__ == "__main__":
sys.exit(main())
+109
View File
@@ -0,0 +1,109 @@
# SOP-Checklist gate — per-item required reviewer teams.
#
# RFC#351 v1 starter set. Each item lists:
# slug — canonical kebab-case form used in /sop-ack <slug>
# pr_section_marker — substring matched in the PR body to detect that
# the author filled in this item (case-insensitive)
# required_teams — list of Gitea team names; an ack from ANY one of
# these teams (logical OR) satisfies the item.
# Membership is probed at gate-time via
# GET /api/v1/teams/{id}/members/{login}.
# Team-id resolution happens at script start via
# GET /api/v1/orgs/{org}/teams (cheap, one call).
# numeric_alias — 1..7; lets reviewers type `/sop-ack 3` as a
# shortcut for `/sop-ack staging-smoke`.
#
# WHY THESE TEAM MAPPINGS:
# The RFC table referenced persona-role names like `core-qa`,
# `core-be`, `core-devops` — these are individual Gitea user logins,
# not teams. The Gitea team-membership API is /teams/{id}/members/{u},
# so we need actual teams. Orchestrator preflight 2026-05-12 verified
# only these teams exist on molecule-ai: ceo(5), engineers(2),
# managers(6), qa(20), security(21), Owners(1), and bot teams. We
# map the RFC roles to the closest existing team and surface the
# mapping explicitly so it's reviewable.
#
# HOW TO EDIT:
# - Tightening: replace `engineers` with a smaller team after creating
# it (e.g. a new `senior-engineers` team if needed).
# - Loosening: add another team to required_teams (OR semantics).
# - Add an item: append to items list and document the slug below.
#
# AUTHOR SELF-ACK IS FORBIDDEN regardless of which team contains them
# — the gate script enforces commenter != PR author before checking
# team membership.
version: 1
# Tier-aware failure mode (RFC#351 open question 2):
# For tier:high — hard-fail (status `failure`, blocks merge via BP).
# For tier:medium — hard-fail (same as high; medium is non-trivial).
# For tier:low — soft-fail (status `pending` with `acked: N/M` in the
# description). BP can choose to require the context
# or not for low-tier PRs.
# If no tier label is present, default to medium (hard-fail) — every PR
# should have a tier label per sop-tier-check, and absence indicates
# a missing-tier defect we should surface, not silently lower the bar.
tier_failure_mode:
"tier:high": hard
"tier:medium": hard
"tier:low": soft
default_mode: hard # used when no tier:* label is present
items:
- slug: comprehensive-testing
numeric_alias: 1
pr_section_marker: "Comprehensive testing performed"
required_teams: [qa, engineers]
description: >-
What was tested, how, edge cases covered. Ack from any qa-team
member (or engineers fallback while qa is small).
- slug: local-postgres-e2e
numeric_alias: 2
pr_section_marker: "Local-postgres E2E run"
required_teams: [engineers]
description: >-
Link to local CI artifact, or "N/A: pure-frontend change". Ack
from any engineer who can verify the local DB test actually ran.
- slug: staging-smoke
numeric_alias: 3
pr_section_marker: "Staging-smoke verified or pending"
required_teams: [engineers]
description: >-
Link to canary run, or "scheduled post-merge". Ack from any
engineer (core-devops/infra-sre are members of engineers team).
- slug: root-cause
numeric_alias: 4
pr_section_marker: "Root-cause not symptom"
required_teams: [managers, ceo]
description: >-
One-sentence root-cause statement. Ack from managers tier
(team-leads) or ceo. Senior judgment required to attest
root-cause-versus-symptom.
- slug: five-axis-review
numeric_alias: 5
pr_section_marker: "Five-Axis review walked"
required_teams: [engineers]
description: >-
Correctness / readability / architecture / security / performance.
Ack from any non-author engineer.
- slug: no-backwards-compat
numeric_alias: 6
pr_section_marker: "No backwards-compat shim / dead code added"
required_teams: [managers, ceo]
description: >-
Yes/no + justification if no. Senior ack required because
backward-compat shims are how dead-code accretes.
- slug: memory-consulted
numeric_alias: 7
pr_section_marker: "Memory/saved-feedback consulted"
required_teams: [engineers]
description: >-
List of feedback memories applicable to this change. Ack from
any engineer who has the same memory access.
-1
View File
@@ -7,7 +7,6 @@ on:
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
+121
View File
@@ -0,0 +1,121 @@
# sop-checklist-gate — peer-ack merge gate for SOP-checklist items.
#
# RFC#351 Step 2 of 6 (implementation MVP).
#
# === DESIGN ===
#
# Goal: each PR must answer 7 SOP-checklist questions in its body,
# and each item must have at least one /sop-ack <slug> comment from
# a non-author peer in the required team. BP requires the
# `sop-checklist / all-items-acked (pull_request)` status to merge.
#
# Triggers:
# - `pull_request_target`: opened, edited, synchronize, reopened
# → fires when PR opens, body is edited (refire — RFC#351 §4),
# or new code is pushed (head.sha changes → stale status would
# be auto-discarded by BP via dismiss_stale_reviews, but the
# status itself is per-SHA so we re-post on the new head).
# - `issue_comment`: created, edited, deleted
# → fires on any new comment so /sop-ack / /sop-revoke take
# effect immediately (Gitea 1.22.6 doesn't refire on
# pull_request_review per feedback_pull_request_review_no_refire,
# so issue_comment is the canonical refire channel).
#
# Trust boundary (mirrors RFC#324 §A4 + sop-tier-check security note):
# `pull_request_target` (not `pull_request`) — workflow def is loaded
# from BASE branch, so a PR cannot rewrite this workflow to exfiltrate
# the token. The `actions/checkout` step pins `ref: base.sha` so the
# script ALSO comes from BASE. PR-HEAD code is never executed in the
# runner.
#
# Token scope:
# - read:repository, read:organization for PR + comments + team probes
# - write:repository for POST /statuses/{sha}
# - The token owner MUST be a member of every team referenced by the
# config's required_teams (else /teams/{id}/members/{login} returns
# 403 — see review-check.sh same-gotcha doc). For the MVP we use
# the dev-lead token (a member of engineers, managers, qa, security)
# via `SOP_CHECKLIST_GATE_TOKEN`, the org-level `SOP_TIER_CHECK_TOKEN`, or a repo-scoped fallback token. Provisioning of that
# secret is a follow-up authorization step (separate from this PR).
#
# Failure mode: tier-aware (RFC#351 open question 2):
# - tier:high → state=failure (hard-fail; BP blocks merge)
# - tier:medium → state=failure (hard-fail; same)
# - tier:low → state=pending (soft-fail; BP can choose to require
# this context or skip for low-tier PRs)
# - missing/no-tier → state=failure (default-mode: hard — never lower
# the bar per feedback_fix_root_not_symptom)
#
# Slash-command contract (RFC#351 v1 + §A1.1-style notes from RFC#324):
#
# /sop-ack <slug-or-numeric-alias> [optional note]
# — register a peer-ack for one checklist item.
# — slug accepts kebab-case, snake_case, or natural-spaces
# (all normalize to canonical kebab-case).
# — numeric 1..7 maps via config.items[*].numeric_alias.
# — most-recent (user, slug) directive wins.
#
# /sop-revoke <slug-or-numeric-alias> [reason]
# — invalidate the commenter's own prior /sop-ack for this slug.
# — does NOT affect other peers' acks on the same slug.
# — most-recent (user, slug) directive wins, so a later /sop-ack
# re-restores the ack.
#
# The eval is read-only + idempotent (read PR + comments + team
# membership, compute, post status). Re-running on any event is safe —
# the new status overwrites the previous one for the same context.
name: sop-checklist-gate
on:
pull_request_target:
types: [opened, edited, synchronize, reopened]
issue_comment:
types: [created, edited, deleted]
permissions:
contents: read
pull-requests: read
# NOTE: `statuses: write` is the GitHub-Actions name for POST /statuses.
# Gitea 1.22.6 may not gate on this permission key (it just checks the
# token), but listing it explicitly documents intent for the next
# platform-version upgrade.
statuses: write
jobs:
gate:
# Run on pull_request_target events always. On issue_comment events,
# only when the comment is on a PR (issue_comment fires for issues
# too) and the body contains one of the slash-commands.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request != null &&
(contains(github.event.comment.body, '/sop-ack') ||
contains(github.event.comment.body, '/sop-revoke')))
runs-on: ubuntu-latest
steps:
- name: Check out BASE ref (trust boundary — never PR-head)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# For pull_request_target, the default branch is the trust
# anchor. For issue_comment the PR base may differ from the
# default branch (PR targeting `staging`), so we use the
# default-branch ref explicitly — same approach as
# qa-review.yml so the script source is always trusted.
ref: ${{ github.event.repository.default_branch }}
- name: Run sop-checklist-gate
env:
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.SOP_TIER_CHECK_TOKEN || secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
OWNER: ${{ github.repository_owner }}
REPO_NAME: ${{ github.event.repository.name }}
run: |
set -euo pipefail
python3 .gitea/scripts/sop-checklist-gate.py \
--owner "$OWNER" \
--repo "$REPO_NAME" \
--pr "$PR_NUMBER" \
--config .gitea/sop-checklist-config.yaml \
--gitea-host git.moleculesai.app
+105 -77
View File
@@ -8,6 +8,27 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-13
### ✨ New features
- **Graceful shutdown support for remote agents**: `run_heartbeat_loop()` and `run_agent_loop()` in `molecule-sdk-python` now accept a `stop_event: threading.Event` parameter. Set the event from a SIGTERM handler to exit the loop cleanly with return value `"stopped"` — enabling proper graceful shutdown in Kubernetes, Docker, and other container-orchestrated environments. (`molecule-sdk-python` [#8](https://git.moleculesai.app/molecule-ai/molecule-sdk-python/pulls/8))
### 🔧 Fixes
- **PLATFORM_URL defaults aligned across all runtime modules**: all workspace runtime modules (`a2a_cli.py`, `a2a_client.py`, `a2a_mcp_server.py`, and 10 others) now consistently default `PLATFORM_URL` to `http://host.docker.internal:8080`. Previously some modules defaulted to `http://platform:8080`, causing connection failures in containerized deployments where the Docker host is not named `platform`. (`molecule-ai-workspace-runtime` [#12](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/12))
### 🔒 Security
- **CWE-22: Path traversal regression in org template import fixed**: a regression removed the path-traversal guard from `createWorkspaceTree` in `org_import.go`, which could allow a malicious org YAML with `filesDir: "../../../etc"` to read arbitrary server files. The fix replaces the unprotected `parseEnvFile` calls with `loadWorkspaceEnv` which applies `resolveInsideRoot` validation before accessing any path. (`molecule-core` [#810](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/810))
### 🧹 Internal
- **Canvas CI hardening**: publish workflow updated to pipefail-safe shell probes; Gitea cache export no longer masks errors; canvas image published to ECR. (`molecule-core` [#773](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/773), [#776](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/776), [#777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/777))
- **Go lint CI hardening**: `golangci-lint run` no longer masked with `|| true`, so lint failures now fail the build loudly. (`molecule-core` [#781](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/781))
---
## 2026-05-12
### 🔒 Security
@@ -26,6 +47,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-11
### ✨ New features
@@ -60,6 +82,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-10
### ✨ New features
@@ -96,82 +119,6 @@ Entries are published daily at 23:50 UTC.
- **molecule-ai-plugin-molecule-careful-bash**: token exfiltration pattern block (OFFSEC-002) now documented in `known-issues.md`. (`molecule-ai-plugin-molecule-careful-bash` [#3](https://git.moleculesai.app/molecule-ai/molecule-ai-plugin-molecule-careful-bash/pulls/3))
- **molecule-ci**: 7 reusable workflows ported to `.gitea/workflows/`, and Docker build smoke tests now gracefully skip when the daemon is unavailable. (`molecule-ci` [#6](https://git.moleculesai.app/molecule-ai/molecule-ci/pulls/6), [#7](https://git.moleculesai.app/molecule-ai/molecule-ci/pulls/7))
---
## 2026-04-23
### ✨ New features
- **SaaS Federation v2 tutorial**: a clean, self-contained walkthrough for platform operators who want to run multi-tenant workspaces from a single control plane. Covers org onboarding via `POST /cp/orgs`, workspace provisioning per tenant, fleet inspection, quota controls, and suspension/teardown. (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700))
- **External workspace quickstart**: a 5-minute guide to running any HTTP-speaking agent (Python, Node, Go, Rust) on your own machine and having it appear on the canvas alongside platform-provisioned agents. Covers tunnel setup, `POST /workspaces` registration, and a working echo agent. (`molecule-core` [#1760](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1760))
### 🔧 Fixes
- **SSRF guard in SaaS mode**: previously the SSRF protection was blocking all RFC-1918 private IP ranges (`10/8`, `172.16/12`, `192.168/16`) even in SaaS mode — this was a regression from the earlier SaaS-mode work. The fix wires up the `saasMode` flag correctly so private IPs are allowed in SaaS deployments (for internal service calls), while metadata ranges (`169.254/16`), CGNAT, loopback, and link-local remain blocked in every mode. IPv6 ULA (`fd00::/8`) handling is also now correct. (`molecule-core` [#1692](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1692))
- **PUT `/workspaces/:id/files/*path` on SaaS (EC2) workspaces**: fixed a 500 error (`docker not available`) that occurred when saving files from Canvas on SaaS workspaces. The handler now detects non-Docker workspaces via `workspaces.instance_id` and routes writes via EC2 Instance Connect (SSH-backed write with an ephemeral key pair) instead of trying to `docker cp`. (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702))
### 📚 Docs
- **molecli shell completion**: tab completion for `molecule` CLI in bash, zsh, fish, and PowerShell — covers all subcommands and flags. (`docs` [#79](https://git.moleculesai.app/molecule-ai/docs/pulls/79))
- **MCP server structured logging**: `LOG_LEVEL` env var, pino JSON output with AsyncLocalStorage context on every tool call. (`docs` [#78](https://git.moleculesai.app/molecule-ai/docs/pulls/78))
### 🧹 Internal
- SaaS Federation v2 tutorial published — clean rewrite of #1613, now with correct HTTP status codes, fleet metrics endpoint, and security model table (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700)); Files API SSH-backed write path for SaaS EC2 workspaces — fixes 500 on PUT `/workspaces/:id/files/*path` for SaaS users (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702)); Canvas create-workspace dialog now requires hermes runtime model (`molecule-core` [#1714](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1714)).
- EC2 Instance Connect SSH tutorial published (`molecule-core` [#1617](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1617)); AI agent org-scoped key credential model blog published (`molecule-core` [#1614](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1614)); Phase 30 Day 2 social package ready (`molecule-core` [#1662](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1662)).
### 🌅 Late-day updates (17:3023:50 UTC)
#### 🔒 Security
- **Cross-tenant memory poisoning fix** (`molecule-core` [#1791](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1791)): fixes a bug where `commit_memory` with `scope=TEAM` could write to a sibling workspace's memory store under high concurrency. `commit_memory` now validates `target_workspace_id` against the caller's known peer set before any write.
- **CWE-78 shell injection hardening** (`molecule-core` [#1885](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1885)): `shellQuote` now uses `strconv.Quote` for all shell-delimited paths in the EC2 Instance Connect and bastion SSH paths. Defense-in-depth layer hardened; primary protection remains path-validation logic upstream.
#### ✨ New features
- **A2A priority queue — Phase 1** (`molecule-core` [#1892](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1892)): task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. Phase 2 (priority inversion deadlock prevention) on the roadmap.
#### 🔧 Fixes
- **A2A queue nil-safe drain** (`molecule-core` [#1893](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1893), [#1896](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1896)): `DequeueTask` no longer panics when the in-memory queue map is uninitialized — graceful empty-result returned instead.
- **Workspaces stuck in `provisioning` after失败** (`molecule-core` [#1794](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1794)): provisioner now transitions workspaces to `failed` state with a descriptive error message instead of leaving them orphaned in `provisioning`.
- **Dedup settings hooks double-fire** (`molecule-core` [#1797](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1797)): the `dedup_settings_hooks` registry now correctly unsubscribes after one fire — eliminates the 34× duplicate hook execution observed in CI.
- **Semantic memory search returning stale results** (`molecule-core` [#1778](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1778)): pgvector index now refreshes synchronously on `commit_memory` write instead of on a 5-minute background cycle.
- **pgvector migration race in E2E CI** (`molecule-core` [#1777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1777)): `CREATE EXTENSION` wrapped in `IF NOT EXISTS` inside a `DO` block — eliminates E2E CI flakiness on fresh DB spin-up.
- **EC2 Instance Connect endpoint not found in us-west-2** (`molecule-core` [#1779](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1779)): Instance Connect endpoint SDK call now falls back gracefully to direct SSM session when the EIC endpoint is unavailable in a region.
- **Canvas topology overlay edge labels clipped** (`molecule-core` [#1802](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1802)): SVG edge labels now respect viewport bounds; labels that would render off-screen are repositioned.
- **Audit trail panel not loading for large workspaces** (`molecule-core` [#1854](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1854)): audit log fetch now uses cursor-based pagination (100 events per page) instead of returning all events at once.
- **Hermes `response_format` not forwarded to MiniMax** (`molecule-core` [#1861](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1861)): `response_format=json_schema` now propagates through the model config passthrough for hermes/MiniMax-M2.7-highspeed workspaces.
- **Memory Inspector panel memory leak** (`molecule-core` [#1871](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1871)): `useMemoryStore` hook now correctly cancels the SSE subscription on panel unmount.
- **Token revocation cache stale-read window** (`molecule-core` [#1888](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1888)): revoked-token invalidation now propagates within 5 s (down from 60 s) — closes the window where a revoked token could still authenticate.
- **TenantGuard same-origin bypass (regression)** (`molecule-core` [#1898](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1898)): fixes a regression introduced in the Phase 33 cloudflare-removal change that re-opened the TenantGuard same-origin bypass for EC2 tenant Canvas deployments.
#### 📚 Docs
- **Chrome DevTools MCP tutorial** (`docs` [#1798](https://git.moleculesai.app/molecule-ai/docs/pulls/1798)): hands-on guide for debugging Molecule AI agents in-browser using Chrome's built-in MCP inspector.
- **Phase 34 launch page** (`docs` [#1799](https://git.moleculesai.app/molecule-ai/docs/pulls/1799)): public-facing launch collateral for GA scheduled 2026-04-30.
- **Tool Trace demo environment** (`docs` [#1844](https://git.moleculesai.app/molecule-ai/docs/pulls/1844)): interactive demo showing the tool trace inspector in action, with sample run data.
- **Enterprise battlecard** (`docs` [#1864](https://git.moleculesai.app/molecule-ai/docs/pulls/1864)): competitive positioning doc for sales and enterprise evaluation teams.
#### 🧹 Internal
- `a2a-sdk` hot-pinned to `0.3.x` across all workspace template repos (`molecule-core` [#1890](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1890)); SDK upgrade path documented in `KI-009` (`internal` [#1631](https://git.moleculesai.app/molecule-ai/internal/issues/1631)).
- Phase 34 CI matrix expanded to cover Node 22 and Go 1.24 (`molecule-ci`).
#### 🔧 Runtime fixes
- **Heartbeat 401 retry** (`molecule-ai-workspace-runtime` [#40](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/40)): heartbeat worker now retries with fresh token on 401 before declaring the workspace unreachable — eliminates false `disconnected` status during token rotation.
- **LLM token auto-detect** (`molecule-ai-workspace-runtime` [#38](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/38)): hermes runtime now auto-detects `max_tokens` from model context window and request timeout when not explicitly configured.
---
## 2026-05-10
### ✨ New features
- **A2A priority queue — Phase 1**: task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. (`molecule-core` [#225](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/225))
@@ -214,7 +161,7 @@ Entries are published daily at 23:50 UTC.
- **SOP tier-check AND-composition of required team approvals per tier**: tier-check now enforces AND-composition of required team approvals per tier (`tier:high`). (`molecule-core` [#225](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/225))
- **Canvas structural tests for TIER_CONFIG and COMM_TYPE_LABELS**: structural tests added for canvas TIER_CONFIG and COMM_TYPE_LABELS constants. (`molecule-core` [#245](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/245))
---
## 2026-05-09
@@ -243,6 +190,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-08
### 🔧 Fixes
@@ -253,6 +201,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-07
### 📚 Docs
@@ -271,6 +220,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-06
### 🧹 Internal
@@ -280,6 +230,7 @@ Entries are published daily at 23:50 UTC.
---
## 2026-04-22
### ✨ New features
@@ -345,6 +296,83 @@ See the [migration blog post](/blog/cloudflare-tunnel-migration).
---
## 2026-04-23
### ✨ New features
- **SaaS Federation v2 tutorial**: a clean, self-contained walkthrough for platform operators who want to run multi-tenant workspaces from a single control plane. Covers org onboarding via `POST /cp/orgs`, workspace provisioning per tenant, fleet inspection, quota controls, and suspension/teardown. (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700))
- **External workspace quickstart**: a 5-minute guide to running any HTTP-speaking agent (Python, Node, Go, Rust) on your own machine and having it appear on the canvas alongside platform-provisioned agents. Covers tunnel setup, `POST /workspaces` registration, and a working echo agent. (`molecule-core` [#1760](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1760))
### 🔧 Fixes
- **SSRF guard in SaaS mode**: previously the SSRF protection was blocking all RFC-1918 private IP ranges (`10/8`, `172.16/12`, `192.168/16`) even in SaaS mode — this was a regression from the earlier SaaS-mode work. The fix wires up the `saasMode` flag correctly so private IPs are allowed in SaaS deployments (for internal service calls), while metadata ranges (`169.254/16`), CGNAT, loopback, and link-local remain blocked in every mode. IPv6 ULA (`fd00::/8`) handling is also now correct. (`molecule-core` [#1692](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1692))
- **PUT `/workspaces/:id/files/*path` on SaaS (EC2) workspaces**: fixed a 500 error (`docker not available`) that occurred when saving files from Canvas on SaaS workspaces. The handler now detects non-Docker workspaces via `workspaces.instance_id` and routes writes via EC2 Instance Connect (SSH-backed write with an ephemeral key pair) instead of trying to `docker cp`. (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702))
### 📚 Docs
- **molecli shell completion**: tab completion for `molecule` CLI in bash, zsh, fish, and PowerShell — covers all subcommands and flags. (`docs` [#79](https://git.moleculesai.app/molecule-ai/docs/pulls/79))
- **MCP server structured logging**: `LOG_LEVEL` env var, pino JSON output with AsyncLocalStorage context on every tool call. (`docs` [#78](https://git.moleculesai.app/molecule-ai/docs/pulls/78))
### 🧹 Internal
- SaaS Federation v2 tutorial published — clean rewrite of #1613, now with correct HTTP status codes, fleet metrics endpoint, and security model table (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700)); Files API SSH-backed write path for SaaS EC2 workspaces — fixes 500 on PUT `/workspaces/:id/files/*path` for SaaS users (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702)); Canvas create-workspace dialog now requires hermes runtime model (`molecule-core` [#1714](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1714)).
- EC2 Instance Connect SSH tutorial published (`molecule-core` [#1617](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1617)); AI agent org-scoped key credential model blog published (`molecule-core` [#1614](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1614)); Phase 30 Day 2 social package ready (`molecule-core` [#1662](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1662)).
### 🌅 Late-day updates (17:3023:50 UTC)
#### 🔒 Security
- **Cross-tenant memory poisoning fix** (`molecule-core` [#1791](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1791)): fixes a bug where `commit_memory` with `scope=TEAM` could write to a sibling workspace's memory store under high concurrency. `commit_memory` now validates `target_workspace_id` against the caller's known peer set before any write.
- **CWE-78 shell injection hardening** (`molecule-core` [#1885](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1885)): `shellQuote` now uses `strconv.Quote` for all shell-delimited paths in the EC2 Instance Connect and bastion SSH paths. Defense-in-depth layer hardened; primary protection remains path-validation logic upstream.
#### ✨ New features
- **A2A priority queue — Phase 1** (`molecule-core` [#1892](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1892)): task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. Phase 2 (priority inversion deadlock prevention) on the roadmap.
#### 🔧 Fixes
- **A2A queue nil-safe drain** (`molecule-core` [#1893](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1893), [#1896](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1896)): `DequeueTask` no longer panics when the in-memory queue map is uninitialized — graceful empty-result returned instead.
- **Workspaces stuck in `provisioning` after failure** (`molecule-core` [#1794](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1794)): provisioner now transitions workspaces to `failed` state with a descriptive error message instead of leaving them orphaned in `provisioning`.
- **Dedup settings hooks double-fire** (`molecule-core` [#1797](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1797)): the `dedup_settings_hooks` registry now correctly unsubscribes after one fire — eliminates the 34× duplicate hook execution observed in CI.
- **Semantic memory search returning stale results** (`molecule-core` [#1778](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1778)): pgvector index now refreshes synchronously on `commit_memory` write instead of on a 5-minute background cycle.
- **pgvector migration race in E2E CI** (`molecule-core` [#1777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1777)): `CREATE EXTENSION` wrapped in `IF NOT EXISTS` inside a `DO` block — eliminates E2E CI flakiness on fresh DB spin-up.
- **EC2 Instance Connect endpoint not found in us-west-2** (`molecule-core` [#1779](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1779)): Instance Connect endpoint SDK call now falls back gracefully to direct SSM session when the EIC endpoint is unavailable in a region.
- **Canvas topology overlay edge labels clipped** (`molecule-core` [#1802](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1802)): SVG edge labels now respect viewport bounds; labels that would render off-screen are repositioned.
- **Audit trail panel not loading for large workspaces** (`molecule-core` [#1854](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1854)): audit log fetch now uses cursor-based pagination (100 events per page) instead of returning all events at once.
- **Hermes `response_format` not forwarded to MiniMax** (`molecule-core` [#1861](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1861)): `response_format=json_schema` now propagates through the model config passthrough for hermes/MiniMax-M2.7-highspeed workspaces.
- **Memory Inspector panel memory leak** (`molecule-core` [#1871](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1871)): `useMemoryStore` hook now correctly cancels the SSE subscription on panel unmount.
- **Token revocation cache stale-read window** (`molecule-core` [#1888](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1888)): revoked-token invalidation now propagates within 5 s (down from 60 s) — closes the window where a revoked token could still authenticate.
- **TenantGuard same-origin bypass (regression)** (`molecule-core` [#1898](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1898)): fixes a regression introduced in the Phase 33 cloudflare-removal change that re-opened the TenantGuard same-origin bypass for EC2 tenant Canvas deployments.
#### 📚 Docs
- **Chrome DevTools MCP tutorial** (`docs` [#1798](https://git.moleculesai.app/molecule-ai/docs/pulls/1798)): hands-on guide for debugging Molecule AI agents in-browser using Chrome's built-in MCP inspector.
- **Phase 34 launch page** (`docs` [#1799](https://git.moleculesai.app/molecule-ai/docs/pulls/1799)): public-facing launch collateral for GA scheduled 2026-04-30.
- **Tool Trace demo environment** (`docs` [#1844](https://git.moleculesai.app/molecule-ai/docs/pulls/1844)): interactive demo showing the tool trace inspector in action, with sample run data.
- **Enterprise battlecard** (`docs` [#1864](https://git.moleculesai.app/molecule-ai/docs/pulls/1864)): competitive positioning doc for sales and enterprise evaluation teams.
#### 🧹 Internal
- `a2a-sdk` hot-pinned to `0.3.x` across all workspace template repos (`molecule-core` [#1890](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1890)); SDK upgrade path documented in `KI-009` (`internal` [#1631](https://git.moleculesai.app/molecule-ai/internal/issues/1631)).
- Phase 34 CI matrix expanded to cover Node 22 and Go 1.24 (`molecule-ci`).
#### 🔧 Runtime fixes
- **Heartbeat 401 retry** (`molecule-ai-workspace-runtime` [#40](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/40)): heartbeat worker now retries with fresh token on 401 before declaring the workspace unreachable — eliminates false `disconnected` status during token rotation.
- **LLM token auto-detect** (`molecule-ai-workspace-runtime` [#38](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/38)): hermes runtime now auto-detects `max_tokens` from model context window and request timeout when not explicitly configured.
---
## 2026-04-17
A high-velocity day: 80+ PRs merged across platform, canvas, runtimes, security, and channels.
+10
View File
@@ -63,6 +63,15 @@ claude mcp add molecule -s user -- env \
Reconnect with `/mcp` (or restart the Claude Code session) and the tools
appear in the next turn.
<Callout type="warn">
Claude Code 2.1.x+ requires the tagged flag form
`--dangerously-load-development-channels server:molecule`. The bare flag
(`--dangerously-load-development-channels` with no value) causes every A2A
turn to wedge with a `Control request timeout: initialize` error. See
[Dev-channels flag: tagged-form requirement](/docs/runtime-mcp/dev-channels-flag)
for the full failure-mode breakdown and SDK integration notes.
</Callout>
### Hermes Agent
```bash
@@ -382,6 +391,7 @@ needed when you can't run an MCP stdio server inside your agent (rare).
## See also
- [Dev-channels flag: tagged-form requirement](/docs/runtime-mcp/dev-channels-flag) — why `--dangerously-load-development-channels server:molecule` (not the bare flag) is required for inline channel push in Claude Code 2.1.x+
- [External Agents](/docs/external-agents) — manual A2A path for non-MCP runtimes
- [Tokens](/docs/tokens) — token management and rotation
- [Concepts — Workspaces](/docs/concepts#workspaces)
@@ -0,0 +1,176 @@
---
title: "Dev-channels flag — tagged-form requirement"
description: "Why Claude Code 2.1.x+ requires `--dangerously-load-development-channels server:molecule` (not the bare flag) to enable inline channel push from the molecule-mcp wheel."
---
import { Callout } from 'fumadocs-ui/components/callout';
The `molecule-mcp` wheel emits a JSON-RPC `notifications/claude/channel`
notification on every inbound A2A message so Claude Code can render it
as an inline `<channel>` synthetic user turn — zero polling, zero
per-turn stall. During the channels research preview, Claude Code only
processes that notification when the host is launched with the
`--dangerously-load-development-channels` flag *and the flag carries a
matching tagged allowlist entry*.
This page covers the form that flag must take, what breaks when it's
wrong, and when an operator has to think about it.
<Callout type="warn">
The bare flag (no value) is rejected by the post-2.1 CLI parser, and
the failure mode propagates upstream as a `Control request timeout:
initialize` from any SDK that spawns the CLI — every A2A turn wedges
100% of the time. See [Failure mode](#failure-mode) below.
</Callout>
## The flag
```
--dangerously-load-development-channels <entries...>
```
Available in Claude Code **2.1.x and later**. It opts the CLI into
processing experimental `notifications/<channel>` JSON-RPC methods
emitted by registered MCP servers and plugin channels. Without it, the
CLI silently drops those notifications during the allowlist check, even
though the wheel ships the wire shape correctly.
## Required form: tagged allowlist entries
Each entry must carry one of two prefixes:
| Form | Use for |
|---|---|
| `server:<MCP-server-name>` | Manually configured MCP servers — the name matches what you registered with `claude mcp add <name> ...` or the key under `mcpServers` in `~/.claude.json`. |
| `plugin:<plugin-name>@<owner>/<repo>` | Plugin channels installed from a Claude Code plugin marketplace. |
Multiple entries are space-separated:
```bash
claude --dangerously-load-development-channels server:molecule server:telegram
```
Untagged values (`molecule` instead of `server:molecule`) are rejected
with `--dangerously-load-development-channels entries must be tagged`.
## Failure mode
A bare flag (`--dangerously-load-development-channels` with no value)
walks through three layers of damage before surfacing:
1. **CLI**: rejects the invocation with
`error: option '--dangerously-load-development-channels <servers...>' argument missing`.
2. **SDK**: `claude-agent-sdk` (used by `claude_sdk_executor.py` in the
Claude Code workspace template) renders the kwarg as a bare switch when
the value is `None`. The CLI then never responds to the SDK's first
`initialize` control message.
3. **Workspace agent**: the SDK times out with
`Control request timeout: initialize`. Every A2A turn wedges — 100%
reproducible. Caught live on workspace `dd40faf8` on 2026-05-01.
Two small fixes prevent this: pass a tagged value (don't let `None`
render as a bare switch), and verify the CLI accepts your specific
entries before going broad.
## For Molecule operators
Pass `server:molecule` to enable the inbox bridge → MCP
`notifications/claude/channel` push for the `molecule-mcp` wheel.
```bash
claude --dangerously-load-development-channels server:molecule
```
The `molecule` here matches the name you registered the wheel under in
[Step 2 of the runtime-mcp guide](/docs/runtime-mcp#claude-code) (the
key under `mcpServers`, or the first positional arg to `claude mcp add`).
If you registered the wheel as `mol` or `molecule-prod`, use that name
in the tag.
When push is live, the session header prints:
```
Listening for channel messages from: server:molecule
```
…and inbound canvas/peer-agent messages render inline as
`<channel source="molecule" ...>` synthetic user turns instead of
arriving via `inbox_peek`.
### Embedding in an SDK-driven agent
If you spawn `claude` through `claude-agent-sdk` (e.g. the Claude Code
workspace template's `claude_sdk_executor.py`), forward the tagged value
through `extra_args`:
```python
from claude_agent_sdk import ClaudeAgentOptions
ClaudeAgentOptions(
model=self.model,
permission_mode="bypassPermissions",
cwd=self._resolve_cwd(),
mcp_servers=mcp_servers,
system_prompt=self._build_system_prompt(),
resume=self._session_id,
extra_args={"dangerously-load-development-channels": "server:molecule"},
)
```
The SDK forwards `extra_args` keys as `--<key> <value>` to the spawned
CLI. Passing `None` as the value renders as a bare switch and trips the
[Failure mode](#failure-mode) chain above.
## Verification
Verified live on 2026-05-02: with the tagged value in `extra_args`,
the in-workspace agent received `<channel source="molecule" kind="..."
peer_id="..." activity_id="..." ts="...">` tags inline as synthetic
user turns. No `wait_for_message` poll was needed for delivery. A2A
returned coherent replies on every turn.
## When this matters
Only when both of the following apply:
- You're running Claude Code (any version 2.1.x or later) as the
workspace runtime, AND
- The in-workspace `molecule-mcp` server is configured (it is, by
default, in the `claude-code` workspace template).
**Hosted Molecule SaaS handles this automatically** — the executor
passes `extra_args={"dangerously-load-development-channels": "server:molecule"}`
when spawning the CLI. Operators on hosted SaaS do not need to do
anything.
**Self-hosted operators using the Claude Code workspace template** also
get this for free since the template's executor sets `extra_args`. The
flag only needs operator attention when:
- Forking the Claude Code workspace template and stripping `extra_args`
inadvertently.
- Running `claude` directly outside the template (e.g. interactive
sessions on a developer laptop) and wanting inline `<channel>` push.
- Adding a second tagged source (e.g. `server:telegram` alongside
`server:molecule`) — append, don't replace.
Operators on Cursor, Cline, OpenCode, codex, hermes-agent, or any
non-Claude-Code MCP host are unaffected: those clients ignore the
notification and the wheel's poll path delivers via
`wait_for_message` as the universal fallback.
## Forward note
This requirement is a **research-preview gate**. Once Claude Code
graduates `notifications/<channel>` from research preview to a default
allowlist, the `--dangerously-load-development-channels` flag will no
longer be required for the `molecule` server. Drop the `extra_args`
entry in `claude_sdk_executor.py` (and any operator launch wrappers)
when that happens — the wheel emits the wire shape correctly today
and will continue to do so post-graduation.
## See also
- [Bring Your Own Runtime (MCP) — Inbound delivery](/docs/runtime-mcp#inbound-delivery-universal-poll-optional-push)
- [Bring Your Own Runtime (MCP) — Step 2: Claude Code](/docs/runtime-mcp#claude-code)
- [Troubleshooting — Control request timeout: initialize](/docs/runtime-mcp#control-request-timeout-initialize-from-the-workspace-agent)
+22
View File
@@ -9,6 +9,28 @@ This page documents security fixes shipped in the Molecule AI platform. Each ent
---
## 2026-05-13 — CWE-22: Path Traversal Regression in `org_import.go` (Resolved)
**Severity:** Critical (CWE-22)
**PR:** [#810](https://git.moleculesai.app/molecule-ai/molecule-core/pull/810)
**Affected:** `workspace-server/internal/handlers/org_import.go``createWorkspaceTree`
### Vulnerability
A regression removed the `resolveInsideRoot` path-traversal guard from `createWorkspaceTree` at `org_import.go:494`. The function called `parseEnvFile(filepath.Join(orgBaseDir, ws.FilesDir, ".env"))` without validating that `ws.FilesDir` resolved inside `orgBaseDir`.
An attacker who could submit a malicious org YAML with `filesDir: "../../../etc"` could cause the platform to read arbitrary files accessible to the server process via the `.env` loading path.
### Fix
Replaced the two raw `parseEnvFile` calls with `loadWorkspaceEnv(orgBaseDir, ws.FilesDir)`, which applies `resolveInsideRoot` internally before joining paths. This restores the guard that was present before the regression was introduced.
### User-facing summary
The org template import endpoint now validates all workspace file paths before accessing them. Attempts to access files outside the designated org directory return an error and are never processed.
---
## 2026-04-20 — CWE-22: Path Traversal in `copyFilesToContainer`
**Severity:** High (CWE-22)
@@ -1,198 +0,0 @@
---
title: Self-Hosted Workspace Deployment with Docker
---
# Self-Hosted Workspace Deployment with Docker
This guide covers running a Molecule AI workspace agent as a Docker container on a self-hosted server or VM. It covers the Docker image, required environment variables, the built-in healthcheck, graceful shutdown, and Kubernetes deployment considerations.
> **Prerequisites:** A running Molecule AI control plane (self-hosted or SaaS), an `ADMIN_TOKEN` or org-scoped API key with admin scope, and Docker 20.10+ on the host.
## How the workspace container works
The Molecule AI workspace Dockerfile includes:
- A uvicorn server on port 8000 (configurable via `PORT`)
- A healthcheck endpoint at `/.well-known/agent-card.json` (used by Docker and Kubernetes probes)
- Graceful SIGTERM handling via uvicorn — the heartbeat loop and adapter tasks shut down cleanly
```
┌─────────────────────────────────────────────┐
│ Docker host (your VM / bare metal) │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ workspace container │ │
│ │ │ │
│ │ uvicorn (port 8000) │ │
│ │ └─ /.well-known/agent-card.json ← HEALTHCHECK │ │
│ │ │ │
│ │ heartbeat loop + A2A agent │ │
│ └──────────────┬──────────────────────┘ │
│ │ │
│ host.docker.internal:8080 │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────┐ │
│ │ Molecule AI control plane │ │
│ │ (platform on port 8080) │ │
│ └─────────────────────────────────────┘ │
└─────────────────────────────────────────────┘
```
## Step 1: Create an external workspace
First register the workspace as an external (self-managed) agent on the platform.
```bash
ADMIN_TOKEN="your-admin-token"
PLATFORM_URL="https://platform.moleculesai.app" # or http://localhost:8080 for local dev
WORKSPACE=$(curl -s -X POST "${PLATFORM_URL}/workspaces" \
-H "Authorization: Bearer ${ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"name": "self-hosted-agent", "runtime": "external"}')
WORKSPACE_ID=$(echo "$WORKSPACE" | python3 -c "import json,sys; print(json.load(sys.stdin)['id'])")
echo "Workspace ID: $WORKSPACE_ID"
```
Save the returned `WORKSPACE_ID`. The workspace agent obtains its bearer token automatically during its first registration with the platform.
## Step 2: Pull the workspace image
The workspace image is published to the Molecule AI ECR registry. Contact your platform administrator for the registry prefix and credentials, then log in:
```bash
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com"
docker pull "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
```
## Step 3: Configure environment variables
| Variable | Default | Description |
|---|---|---|
| `PLATFORM_URL` | `http://localhost:8080` | Platform API URL. Inside a Docker container, use `http://host.docker.internal:8080` to reach the platform on the host machine. |
| `WORKSPACE_ID` | — | Workspace ID from Step 1 (required; no default) |
| `PORT` | `8000` | Agent server port. Must match `containerPort` in Kubernetes and the port mapped with `-p` in Docker. |
## Step 4: Run the container
### Docker (standalone)
```bash
docker run -d \
--name molecule-workspace \
-p 8000:8000 \
-e PLATFORM_URL="http://host.docker.internal:8080" \
-e WORKSPACE_ID="your-workspace-id" \
-e PORT=8000 \
"${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
```
> **Note for Linux hosts:** Docker does not include `host.docker.internal` by default. On Linux, either add `--add-host=host.docker.internal:host-gateway` to the `docker run` command, or use the host machine's IP address directly (e.g. `http://192.168.1.100:8080`).
### Verify the healthcheck
```bash
# Wait for the container to become healthy (up to ~2 minutes)
docker inspect --format='{{.State.Health.Status}}' molecule-workspace
# Expected output: healthy
# Once healthy, the agent card is reachable:
curl -s http://localhost:8000/.well-known/agent-card.json | python3 -m json.tool
```
### Docker Compose
```yaml
services:
molecule-workspace:
image: "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
ports:
- "8000:8000"
environment:
PLATFORM_URL: "http://host.docker.internal:8080"
WORKSPACE_ID: "your-workspace-id"
PORT: "8000"
# Linux hosts: add host.docker.internal resolution
# extra_hosts:
# - "host.docker.internal:host-gateway"
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/.well-known/agent-card.json"]
interval: 30s
timeout: 5s
retries: 3
start_period: 30s
```
## Step 5: Graceful shutdown
When the container receives SIGTERM (e.g. from `docker stop` or Kubernetes pod deletion), the workspace's uvicorn server initiates graceful shutdown: the heartbeat loop stops, active A2A tasks are given a grace period to complete, and any snapshotable state is persisted before the process exits.
To integrate the heartbeat loop into custom agent code:
```python
import asyncio
import os, signal
from heartbeat import HeartbeatLoop
# SIGTERM is handled by the Docker runtime, which sends the signal to the
# workspace process. The workspace (via uvicorn) initiates graceful shutdown:
# the heartbeat loop is stopped, any active adapter tasks are cancelled, and
# in-flight A2A requests are given a grace period to complete.
#
# For custom integration with the heartbeat loop directly:
async def main():
heartbeat = HeartbeatLoop(
platform_url=os.environ["PLATFORM_URL"],
workspace_id=os.environ["WORKSPACE_ID"],
)
heartbeat.start()
try:
await asyncio.Event().wait() # keep running
finally:
await heartbeat.stop()
print("Heartbeat loop stopped.")
```
The Docker `stop` command sends SIGTERM and waits up to 10 seconds by default before sending SIGKILL. The healthcheck ensures orchestrators detect an unhealthy container before the SIGTERM timeout.
## Kubernetes deployment
For Kubernetes deployments, use the native liveness/readiness probe configuration instead of the Docker HEALTHCHECK:
```yaml
ports:
- name: http
containerPort: 8000
livenessProbe:
httpGet:
path: /.well-known/agent-card.json
port: http
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /.well-known/agent-card.json
port: http
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
terminationGracePeriodSeconds: 120
```
> **Note:** The Kubernetes `terminationGracePeriodSeconds` should exceed the liveness probe failure threshold so that the probe can register a failure before the pod is killed. With `periodSeconds: 30` and `failureThreshold: 3`, the probe does not register a failure until approximately 120150s after the container becomes unhealthy. Set `terminationGracePeriodSeconds: 120` or higher.
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Container shows `unhealthy` after startup | Platform unreachable from container | Verify `PLATFORM_URL` uses `host.docker.internal` (Docker) or the correct host IP |
| `curl: (7) Failed to connect` on healthcheck | Container not fully started | Wait up to 30s; increase `start_period` |
| Agent not appearing on canvas | Wrong `WORKSPACE_ID` or expired token | Re-run registration; check platform logs |
| `host.docker.internal` not resolved | Linux host without the Docker flag | Use `--add-host=host.docker.internal:host-gateway` or the host's LAN IP |