Merge branch 'merge/pr27-sop-checklist-gate' into merge/integration

Merge branch 'merge/pr30-dev-channels-flag' into merge/integration
Merge branch 'merge/pr31-changelog-security' into merge/integration
2026-05-13 11:08:32 +00:00 · 2026-05-13 11:08:30 +00:00 · 2026-05-13 11:08:27 +00:00 · 2026-05-13 11:06:34 +00:00 · 2026-05-13 11:03:30 +00:00 · 2026-05-13 08:23:48 +00:00
9 changed files with 1366 additions and 276 deletions
@@ -0,0 +1,823 @@
+#!/usr/bin/env python3
+# sop-checklist-gate — evaluate whether a PR has peer-acked each
+# SOP-checklist item. Posts a commit-status that branch protection
+# can require.
+#
+# RFC#351 Step 2 of 6 (implementation MVP).
+#
+# Invoked by .gitea/workflows/sop-checklist-gate.yml on:
+#   - pull_request_target: [opened, edited, synchronize, reopened]
+#   - issue_comment:       [created, edited, deleted]
+#
+# Flow:
+#   1. Load .gitea/sop-checklist-config.yaml (from BASE ref — trusted).
+#   2. GET /repos/{R}/pulls/{N}          — author, head.sha, tier label
+#   3. GET /repos/{R}/issues/{N}/comments — extract /sop-ack and /sop-revoke
+#   4. For each checklist item:
+#        a. Is the section marker present in PR body? (author answered)
+#        b. Is there ≥1 unrevoked /sop-ack from a non-author whose
+#           team-membership matches required_teams?
+#   5. POST /repos/{R}/statuses/{sha}    — context
+#      `sop-checklist / all-items-acked (pull_request)`,
+#      state=success | failure | pending, description=`acked: N/M …`.
+#
+# Trust boundary (mirrors RFC#324 §A4):
+#   This script is loaded from the BASE branch. The workflow's
+#   actions/checkout step pins ref=base.sha. PR-HEAD code is never
+#   executed. We only HTTP-call the Gitea API.
+#
+# Token scope:
+#   - read:repository / read:organization to enumerate PR + comments
+#     + team membership (Gitea 1.22.6 quirk: team-membership endpoint
+#     returns 403 if token owner is not in the team; see review-check.sh
+#     for the same gotcha — we surface the same fail-closed message).
+#   - write:repository for `POST /repos/{R}/statuses/{sha}`. Unlike
+#     RFC#324's pattern (which uses the JOB's own pass/fail as the
+#     status), we POST the status explicitly because the gate posts
+#     a single multi-item status with a richer description than a
+#     bare success/failure context can carry.
+#
+# Slug normalization rules (canonical form: kebab-case):
+#   - Lowercase
+#   - Whitespace + underscores → single dash
+#   - Strip non [a-z0-9-] characters
+#   - Collapse adjacent dashes
+#   - Strip leading/trailing dashes
+#   - If the result is a digit string (e.g. "1"), look up via
+#     config.items[*].numeric_alias to get the kebab-case slug.
+#
+#   Examples:
+#       "Comprehensive_Testing"  → "comprehensive-testing"
+#       "comprehensive testing"  → "comprehensive-testing"
+#       "1"                      → "comprehensive-testing"
+#       "Five-Axis-Review"       → "five-axis-review"
+#
+# Revoke semantics:
+#   /sop-revoke <slug> [reason] — most-recent comment per (slug, user)
+#   wins. So if Alice posts /sop-ack X then later /sop-revoke X, her ack
+#   for X is invalidated. Bob's prior /sop-ack X is unaffected. If Alice
+#   posts /sop-revoke X then later /sop-ack X again, the ack is restored.
+
+from __future__ import annotations
+
+import argparse
+import json
+import os
+import re
+import sys
+import urllib.error
+import urllib.parse
+import urllib.request
+from typing import Any
+
+
+# ---------------------------------------------------------------------------
+# Slug normalization
+# ---------------------------------------------------------------------------
+
+_NORMALIZE_REPLACE_RE = re.compile(r"[\s_]+")
+_NORMALIZE_STRIP_RE = re.compile(r"[^a-z0-9-]")
+_NORMALIZE_DASH_RE = re.compile(r"-+")
+
+
+def normalize_slug(raw: str, numeric_aliases: dict[int, str] | None = None) -> str:
+    """Normalize a user-supplied slug to canonical kebab-case form.
+
+    See module header for the rules.
+
+    If the input is a pure digit string AND numeric_aliases is provided,
+    the alias mapping is consulted. Unknown digits return "" so the caller
+    can flag the comment as unparseable.
+    """
+    if raw is None:
+        return ""
+    s = raw.strip().lower()
+    s = _NORMALIZE_REPLACE_RE.sub("-", s)
+    s = _NORMALIZE_STRIP_RE.sub("", s)
+    s = _NORMALIZE_DASH_RE.sub("-", s)
+    s = s.strip("-")
+    if s.isdigit() and numeric_aliases is not None:
+        return numeric_aliases.get(int(s), "")
+    return s
+
+
+# ---------------------------------------------------------------------------
+# Comment parsing — /sop-ack and /sop-revoke
+# ---------------------------------------------------------------------------
+
+# A directive must be on its own line. Permits leading whitespace.
+# Optional trailing note after the slug for /sop-ack and required reason
+# for /sop-revoke (RFC#351 open question 4 — reason is captured but not
+# yet validated; future iteration may require a min-length).
+_DIRECTIVE_RE = re.compile(
+    r"^[ \t]*/(sop-ack|sop-revoke)[ \t]+([A-Za-z0-9_\- ]+?)(?:[ \t]+(.*))?[ \t]*$",
+    re.MULTILINE,
+)
+
+
+def parse_directives(
+    comment_body: str,
+    numeric_aliases: dict[int, str],
+) -> list[tuple[str, str, str]]:
+    """Extract /sop-ack and /sop-revoke directives from a comment body.
+
+    Returns a list of (kind, canonical_slug, note) tuples where:
+      kind is "sop-ack" or "sop-revoke"
+      canonical_slug is the normalized form (or "" if unparseable)
+      note is the trailing free-text (may be "")
+    """
+    out: list[tuple[str, str, str]] = []
+    if not comment_body:
+        return out
+    for m in _DIRECTIVE_RE.finditer(comment_body):
+        kind = m.group(1)
+        raw_slug = (m.group(2) or "").strip()
+        # If the raw match included trailing words, the regex non-greedy
+        # captured only the first token; strip again for safety.
+        # We split on whitespace to keep the FIRST word as the slug, and
+        # everything after as the note.
+        parts = raw_slug.split()
+        if not parts:
+            continue
+        first = parts[0]
+        # If the slug-capture greedily matched multiple words (e.g.
+        # "comprehensive testing"), preserve normalize behavior: join
+        # the WHOLE first-word-token only; trailing words get appended to
+        # the note. The regex limits group(2) to [A-Za-z0-9_\- ] so we
+        # may have multi-word forms here — normalize handles them.
+        if len(parts) > 1:
+            # User wrote "/sop-ack comprehensive testing extra-note"
+            # → treat "comprehensive testing" as the slug source if it
+            # normalizes to a known item; otherwise treat "comprehensive"
+            # as slug and "testing extra-note" as note. We defer the
+            # disambiguation to the caller via the returned canonical
+            # slug. For simplicity: try the WHOLE captured string first.
+            canonical = normalize_slug(raw_slug, numeric_aliases)
+        else:
+            canonical = normalize_slug(first, numeric_aliases)
+        note_from_group = (m.group(3) or "").strip()
+        # If we collapsed multi-word slug into kebab and there's a
+        # trailing-text group too, append it.
+        out.append((kind, canonical, note_from_group))
+    return out
+
+
+# ---------------------------------------------------------------------------
+# PR body section detection
+# ---------------------------------------------------------------------------
+
+
+def section_marker_present(body: str, marker: str) -> bool:
+    """Return True if `marker` appears in `body` case-insensitively
+    on a non-empty line (i.e. the author actually filled it in).
+
+    We require the marker substring AND non-whitespace content on the
+    same line OR within the next line — this prevents trivially-empty
+    checklists like:
+
+        ## SOP-Checklist
+        - [ ] **Comprehensive testing performed**:
+        - [ ] **Local-postgres E2E run**:
+
+    from auto-passing the section-present check. The peer-ack is still
+    required, but answering with empty content is captured as a soft
+    finding via the section-present test alone.
+    """
+    if not body or not marker:
+        return False
+    body_lower = body.lower()
+    marker_lower = marker.lower()
+    idx = body_lower.find(marker_lower)
+    if idx < 0:
+        return False
+    # Walk to end of line.
+    line_end = body.find("\n", idx)
+    if line_end < 0:
+        line_end = len(body)
+    line = body[idx + len(marker):line_end]
+    # Strip the colon + checkbox tail patterns; require at least one
+    # non-whitespace, non-punctuation char.
+    stripped = re.sub(r"[\s\*:\-\[\]]+", "", line)
+    if stripped:
+        return True
+    # Fall through: check the NEXT line (multi-line answers).
+    next_line_end = body.find("\n", line_end + 1)
+    if next_line_end < 0:
+        next_line_end = len(body)
+    next_line = body[line_end + 1:next_line_end]
+    stripped_next = re.sub(r"[\s\*:\-\[\]]+", "", next_line)
+    return bool(stripped_next)
+
+
+# ---------------------------------------------------------------------------
+# Ack-state computation
+# ---------------------------------------------------------------------------
+
+
+def compute_ack_state(
+    comments: list[dict[str, Any]],
+    pr_author: str,
+    items_by_slug: dict[str, dict[str, Any]],
+    numeric_aliases: dict[int, str],
+    team_membership_probe: "callable[[str, list[str]], list[str]]",
+) -> dict[str, dict[str, Any]]:
+    """Compute per-item ack state.
+
+    Each comment is processed in chronological order. The most-recent
+    directive per (commenter, slug) wins.
+
+    Returns a dict keyed by canonical slug:
+       {
+         "comprehensive-testing": {
+           "ackers": ["bob"],         # non-author, team-verified
+           "rejected_ackers": {        # debugging info
+             "self_ack": ["alice"],
+             "unknown_slug": [],
+             "not_in_team": ["eve"],
+           }
+         },
+         ...
+       }
+    """
+    # Step 1: collapse directives per (commenter, slug) — most recent wins.
+    # comments are expected to come in chronological order from the
+    # API (Gitea returns oldest-first by default for issues/{N}/comments).
+    latest_directive: dict[tuple[str, str], str] = {}  # (user, slug) → kind
+    unparseable_per_user: dict[str, int] = {}
+    for c in comments:
+        body = c.get("body", "") or ""
+        user = (c.get("user") or {}).get("login", "")
+        if not user:
+            continue
+        for kind, slug, _note in parse_directives(body, numeric_aliases):
+            if not slug:
+                unparseable_per_user[user] = unparseable_per_user.get(user, 0) + 1
+                continue
+            latest_directive[(user, slug)] = kind
+
+    # Step 2: build candidate ackers per slug.
+    # Filter out self-acks and unknown slugs.
+    ackers_per_slug: dict[str, list[str]] = {s: [] for s in items_by_slug}
+    rejected_self: dict[str, list[str]] = {s: [] for s in items_by_slug}
+    rejected_unknown: dict[str, list[str]] = {s: [] for s in items_by_slug}
+    pending_team_check: dict[str, list[str]] = {s: [] for s in items_by_slug}
+
+    for (user, slug), kind in latest_directive.items():
+        if kind != "sop-ack":
+            continue  # revokes leave the (user,slug) state as "no ack"
+        if slug not in items_by_slug:
+            # Slug normalized to something not in our config — store
+            # under a synthetic key for diagnostic surfacing. Don't add
+            # to any item.
+            continue
+        if user == pr_author:
+            rejected_self[slug].append(user)
+            continue
+        pending_team_check[slug].append(user)
+
+    # Step 3: team membership probe per slug (batched per slug to keep
+    # API call count down — same user may ack multiple items but the
+    # required_teams differ per item, so we MUST probe per (user, item)).
+    rejected_not_in_team: dict[str, list[str]] = {s: [] for s in items_by_slug}
+    for slug, candidates in pending_team_check.items():
+        if not candidates:
+            continue
+        required = items_by_slug[slug]["required_teams"]
+        approved = team_membership_probe(slug, candidates)  # returns subset
+        rejected_not_in_team[slug] = [u for u in candidates if u not in approved]
+        ackers_per_slug[slug] = approved
+        # Stash required teams for description rendering.
+        items_by_slug[slug]["_required_resolved"] = required
+
+    return {
+        slug: {
+            "ackers": ackers_per_slug[slug],
+            "rejected": {
+                "self_ack": rejected_self[slug],
+                "not_in_team": rejected_not_in_team[slug],
+            },
+        }
+        for slug in items_by_slug
+    }
+
+
+# ---------------------------------------------------------------------------
+# Gitea API client
+# ---------------------------------------------------------------------------
+
+
+class GiteaClient:
+    def __init__(self, host: str, token: str):
+        self.base = f"https://{host}/api/v1"
+        self.token = token
+        # Cache team-name → team-id resolutions per org.
+        self._team_id_cache: dict[tuple[str, str], int | None] = {}
+
+    def _req(
+        self,
+        method: str,
+        path: str,
+        body: dict[str, Any] | None = None,
+        ok_codes: tuple[int, ...] = (200, 201, 204),
+    ) -> tuple[int, Any]:
+        url = self.base + path
+        data = None
+        headers = {
+            "Authorization": f"token {self.token}",
+            "Accept": "application/json",
+        }
+        if body is not None:
+            data = json.dumps(body).encode("utf-8")
+            headers["Content-Type"] = "application/json"
+        req = urllib.request.Request(url, method=method, data=data, headers=headers)
+        try:
+            with urllib.request.urlopen(req, timeout=20) as r:
+                raw = r.read()
+                code = r.getcode()
+        except urllib.error.HTTPError as e:
+            code = e.code
+            raw = e.read()
+        try:
+            parsed = json.loads(raw.decode("utf-8")) if raw else None
+        except json.JSONDecodeError:
+            parsed = raw.decode("utf-8", errors="replace") if raw else None
+        return code, parsed
+
+    def get_pr(self, owner: str, repo: str, pr: int) -> dict[str, Any]:
+        code, data = self._req("GET", f"/repos/{owner}/{repo}/pulls/{pr}")
+        if code != 200:
+            raise RuntimeError(f"GET pulls/{pr} → HTTP {code}: {data!r}")
+        return data
+
+    def get_issue_comments(
+        self, owner: str, repo: str, issue: int
+    ) -> list[dict[str, Any]]:
+        # Paginate. Gitea default page size 50.
+        out: list[dict[str, Any]] = []
+        page = 1
+        while True:
+            code, data = self._req(
+                "GET",
+                f"/repos/{owner}/{repo}/issues/{issue}/comments?limit=50&page={page}",
+            )
+            if code != 200:
+                raise RuntimeError(
+                    f"GET issues/{issue}/comments page={page} → HTTP {code}: {data!r}"
+                )
+            if not data:
+                break
+            out.extend(data)
+            if len(data) < 50:
+                break
+            page += 1
+        return out
+
+    def resolve_team_id(self, org: str, team_name: str) -> int | None:
+        key = (org, team_name)
+        if key in self._team_id_cache:
+            return self._team_id_cache[key]
+        code, data = self._req("GET", f"/orgs/{org}/teams/search?q={urllib.parse.quote(team_name)}")
+        team_id = None
+        if code == 200 and isinstance(data, dict):
+            for t in data.get("data", []):
+                if t.get("name") == team_name:
+                    team_id = t.get("id")
+                    break
+        if team_id is None and code == 200 and isinstance(data, list):
+            for t in data:
+                if t.get("name") == team_name:
+                    team_id = t.get("id")
+                    break
+        self._team_id_cache[key] = team_id
+        return team_id
+
+    def is_team_member(self, team_id: int, login: str) -> bool | None:
+        """Return True / False / None (unknown — 403 from API)."""
+        code, _ = self._req(
+            "GET", f"/teams/{team_id}/members/{urllib.parse.quote(login)}"
+        )
+        if code in (200, 204):
+            return True
+        if code == 404:
+            return False
+        # 403 means the token owner isn't in this team, so the API
+        # refuses to confirm membership. Fail-closed at the caller.
+        return None
+
+    def post_status(
+        self,
+        owner: str,
+        repo: str,
+        sha: str,
+        state: str,
+        context: str,
+        description: str,
+        target_url: str = "",
+    ) -> None:
+        body = {
+            "state": state,
+            "context": context,
+            "description": description[:140],  # Gitea truncates to 255 but be safe
+            "target_url": target_url or "",
+        }
+        code, data = self._req(
+            "POST",
+            f"/repos/{owner}/{repo}/statuses/{sha}",
+            body=body,
+            ok_codes=(201,),
+        )
+        if code not in (200, 201):
+            raise RuntimeError(
+                f"POST statuses/{sha} → HTTP {code}: {data!r}"
+            )
+
+
+# ---------------------------------------------------------------------------
+# Config loader (PyYAML-free — config file is intentionally tiny + flat)
+# ---------------------------------------------------------------------------
+
+
+def load_config(path: str) -> dict[str, Any]:
+    """Load .gitea/sop-checklist-config.yaml.
+
+    Uses PyYAML if available, otherwise falls back to a built-in
+    minimal parser sufficient for our flat config shape. Bundling
+    PyYAML on the runner is one apt install away but we avoid the
+    dep by keeping the config shape constrained.
+    """
+    try:
+        import yaml  # type: ignore[import-not-found]
+        with open(path) as f:
+            return yaml.safe_load(f)
+    except ImportError:
+        return _load_config_minimal(path)
+
+
+def _load_config_minimal(path: str) -> dict[str, Any]:
+    """Minimal YAML subset parser for our config shape.
+
+    Supports: top-level scalar:value, top-level map-of-map (e.g.
+    tier_failure_mode), top-level list of maps (items:), and within an
+    item map: scalars + lists of scalars. Does NOT support nested lists,
+    YAML anchors, multi-doc, or flow style.
+    """
+    with open(path) as f:
+        lines = f.readlines()
+    return _parse_minimal_yaml(lines)
+
+
+def _parse_minimal_yaml(lines: list[str]) -> dict[str, Any]:  # noqa: C901
+    """Hand-rolled subset parser. See _load_config_minimal docstring."""
+    # Strip comments + blank lines but preserve indentation.
+    cleaned: list[tuple[int, str]] = []
+    for raw in lines:
+        # Don't strip a "#" that is inside a quoted value.
+        body = raw.rstrip("\n")
+        # Remove trailing comment.
+        idx = body.find("#")
+        if idx >= 0 and (idx == 0 or body[idx - 1] in " \t"):
+            body = body[:idx].rstrip()
+        if not body.strip():
+            continue
+        indent = len(body) - len(body.lstrip(" "))
+        cleaned.append((indent, body.strip()))
+
+    root: dict[str, Any] = {}
+    i = 0
+    n = len(cleaned)
+
+    def parse_scalar(s: str) -> Any:
+        s = s.strip()
+        if s.startswith('"') and s.endswith('"'):
+            return s[1:-1]
+        if s.startswith("'") and s.endswith("'"):
+            return s[1:-1]
+        if s.lower() in ("true", "yes"):
+            return True
+        if s.lower() in ("false", "no"):
+            return False
+        try:
+            return int(s)
+        except ValueError:
+            pass
+        return s
+
+    def parse_inline_list(s: str) -> list[Any]:
+        s = s.strip()
+        if not (s.startswith("[") and s.endswith("]")):
+            return [parse_scalar(s)]
+        inner = s[1:-1]
+        if not inner.strip():
+            return []
+        return [parse_scalar(x.strip()) for x in inner.split(",")]
+
+    while i < n:
+        indent, line = cleaned[i]
+        if indent != 0:
+            i += 1
+            continue
+        if ":" not in line:
+            i += 1
+            continue
+        key, _, rest = line.partition(":")
+        key = key.strip()
+        rest = rest.strip()
+        if rest == "":
+            # Block — could be map or list.
+            i += 1
+            # Look ahead for first child.
+            if i < n and cleaned[i][1].startswith("- "):
+                # List of items.
+                items: list[Any] = []
+                while i < n and cleaned[i][0] > indent and cleaned[i][1].startswith("- "):
+                    item_indent = cleaned[i][0]
+                    first_kv = cleaned[i][1][2:].strip()  # strip "- "
+                    item: dict[str, Any] = {}
+                    if ":" in first_kv:
+                        k, _, v = first_kv.partition(":")
+                        k = k.strip()
+                        v = v.strip()
+                        if v == "":
+                            item[k] = ""
+                        elif v.startswith(">-") or v.startswith(">"):
+                            # Folded scalar continues on subsequent indented lines
+                            collected: list[str] = []
+                            i += 1
+                            while i < n and cleaned[i][0] > item_indent:
+                                collected.append(cleaned[i][1])
+                                i += 1
+                            item[k] = " ".join(collected)
+                            items.append(item)
+                            continue
+                        elif v.startswith("["):
+                            item[k] = parse_inline_list(v)
+                        else:
+                            item[k] = parse_scalar(v)
+                    i += 1
+                    # Subsequent k:v lines at deeper indent belong to this item.
+                    while i < n and cleaned[i][0] > item_indent and not cleaned[i][1].startswith("- "):
+                        sub_indent, sub_line = cleaned[i]
+                        if ":" in sub_line:
+                            k, _, v = sub_line.partition(":")
+                            k = k.strip()
+                            v = v.strip()
+                            if v == "":
+                                item[k] = ""
+                                i += 1
+                            elif v.startswith(">-") or v.startswith(">"):
+                                collected = []
+                                i += 1
+                                while i < n and cleaned[i][0] > sub_indent:
+                                    collected.append(cleaned[i][1])
+                                    i += 1
+                                item[k] = " ".join(collected)
+                            elif v.startswith("["):
+                                item[k] = parse_inline_list(v)
+                                i += 1
+                            else:
+                                item[k] = parse_scalar(v)
+                                i += 1
+                        else:
+                            i += 1
+                    items.append(item)
+                root[key] = items
+            else:
+                # Sub-map.
+                submap: dict[str, Any] = {}
+                while i < n and cleaned[i][0] > indent:
+                    sub_indent, sub_line = cleaned[i]
+                    if ":" in sub_line:
+                        k, _, v = sub_line.partition(":")
+                        k = k.strip().strip('"').strip("'")
+                        v = v.strip()
+                        if v.startswith("[") and v.endswith("]"):
+                            submap[k] = parse_inline_list(v)
+                        else:
+                            submap[k] = parse_scalar(v)
+                    i += 1
+                root[key] = submap
+        else:
+            # Inline scalar or list.
+            if rest.startswith("[") and rest.endswith("]"):
+                root[key] = parse_inline_list(rest)
+            else:
+                root[key] = parse_scalar(rest)
+            i += 1
+    return root
+
+
+# ---------------------------------------------------------------------------
+# Main entry point
+# ---------------------------------------------------------------------------
+
+
+def render_status(
+    items: list[dict[str, Any]],
+    ack_state: dict[str, dict[str, Any]],
+    body_state: dict[str, bool],
+) -> tuple[str, str]:
+    """Return (state, description) for the commit-status post.
+
+    state is "success" if every item has at least one valid ack
+    (body section presence is informational only — peer-ack is the
+    real gate).  "pending" is reserved for the soft-fail path
+    (tier:low) and is set by the caller.
+    """
+    n = len(items)
+    fully_acked = [
+        it["slug"] for it in items if ack_state[it["slug"]]["ackers"]
+    ]
+    missing = [
+        it["slug"] for it in items if not ack_state[it["slug"]]["ackers"]
+    ]
+    missing_body = [it["slug"] for it in items if not body_state.get(it["slug"], False)]
+
+    desc_parts = [f"acked: {len(fully_acked)}/{n}"]
+    if missing:
+        # Show up to 3 missing slugs to stay inside the 140-char budget.
+        shown = ", ".join(missing[:3])
+        if len(missing) > 3:
+            shown += f", +{len(missing) - 3}"
+        desc_parts.append(f"missing: {shown}")
+    if missing_body:
+        desc_parts.append(f"body-unfilled: {len(missing_body)}")
+    state = "success" if not missing else "failure"
+    return state, " — ".join(desc_parts)
+
+
+def get_tier_mode(pr: dict[str, Any], cfg: dict[str, Any]) -> str:
+    """Read tier label, return 'hard' or 'soft' per cfg.tier_failure_mode."""
+    labels = pr.get("labels") or []
+    tier_labels = [l.get("name", "") for l in labels if (l.get("name", "") or "").startswith("tier:")]
+    mode_map = cfg.get("tier_failure_mode") or {}
+    default_mode = cfg.get("default_mode", "hard")
+    for tl in tier_labels:
+        if tl in mode_map:
+            return mode_map[tl]
+    return default_mode
+
+
+def main(argv: list[str] | None = None) -> int:
+    p = argparse.ArgumentParser()
+    p.add_argument("--owner", required=True)
+    p.add_argument("--repo", required=True)
+    p.add_argument("--pr", type=int, required=True)
+    p.add_argument("--config", default=".gitea/sop-checklist-config.yaml")
+    p.add_argument("--gitea-host", default="git.moleculesai.app")
+    p.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Compute state but do not POST the status.",
+    )
+    p.add_argument(
+        "--status-context",
+        default="sop-checklist / all-items-acked (pull_request)",
+    )
+    p.add_argument(
+        "--exit-on-state",
+        action="store_true",
+        help=(
+            "If set, exit non-zero when state=failure. Default OFF so the "
+            "job-level conclusion is independent of ack-state — the only "
+            "thing BP sees is the POSTed status. Useful for local debugging."
+        ),
+    )
+    args = p.parse_args(argv)
+
+    token = os.environ.get("GITEA_TOKEN", "")
+    if not token and not args.dry_run:
+        print("::error::GITEA_TOKEN env required", file=sys.stderr)
+        return 2
+
+    cfg = load_config(args.config)
+    items: list[dict[str, Any]] = cfg["items"]
+    items_by_slug = {it["slug"]: it for it in items}
+    numeric_aliases = {
+        int(it["numeric_alias"]): it["slug"] for it in items if it.get("numeric_alias")
+    }
+
+    client = GiteaClient(args.gitea_host, token) if token else None
+    if not client:
+        print("::error::No client (dry-run without token has nothing to do)", file=sys.stderr)
+        return 2
+
+    pr = client.get_pr(args.owner, args.repo, args.pr)
+    if pr.get("state") != "open":
+        print(f"::notice::PR #{args.pr} is {pr.get('state')} — gate is a no-op")
+        return 0
+
+    author = (pr.get("user") or {}).get("login", "")
+    head_sha = (pr.get("head") or {}).get("sha", "")
+    body = pr.get("body", "") or ""
+
+    if not author or not head_sha:
+        print("::error::PR payload missing user.login or head.sha", file=sys.stderr)
+        return 1
+
+    comments = client.get_issue_comments(args.owner, args.repo, args.pr)
+
+    # Build team-membership probe closure that caches results per
+    # (user, team-id) so a user acking multiple items only triggers
+    # one membership lookup per team.
+    team_member_cache: dict[tuple[str, int], bool | None] = {}
+
+    def probe(slug: str, users: list[str]) -> list[str]:
+        item = items_by_slug[slug]
+        team_names: list[str] = item["required_teams"]
+        # Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
+        # available — fall back to the list endpoint.
+        team_ids: list[int] = []
+        for tn in team_names:
+            tid = client.resolve_team_id(args.owner, tn)
+            if tid is None:
+                # Try the list endpoint as a fallback.
+                code, data = client._req(  # noqa: SLF001
+                    "GET", f"/orgs/{args.owner}/teams"
+                )
+                if code == 200 and isinstance(data, list):
+                    for t in data:
+                        if t.get("name") == tn:
+                            tid = t.get("id")
+                            client._team_id_cache[(args.owner, tn)] = tid  # noqa: SLF001
+                            break
+            if tid is not None:
+                team_ids.append(tid)
+            else:
+                print(
+                    f"::warning::could not resolve team-id for '{tn}' "
+                    f"in org '{args.owner}' — item '{slug}' will fail closed",
+                    file=sys.stderr,
+                )
+        approved: list[str] = []
+        for u in users:
+            for tid in team_ids:
+                cache_key = (u, tid)
+                if cache_key not in team_member_cache:
+                    team_member_cache[cache_key] = client.is_team_member(tid, u)
+                result = team_member_cache[cache_key]
+                if result is True:
+                    approved.append(u)
+                    break
+                if result is None:
+                    print(
+                        f"::warning::team-probe for {u} in team-id {tid} returned 403 "
+                        "(token owner not in that team — fail-closed per RFC#324)",
+                        file=sys.stderr,
+                    )
+                    # Treat as not-in-team for this user/team pair; loop
+                    # may still find membership in another team.
+        return approved
+
+    ack_state = compute_ack_state(comments, author, items_by_slug, numeric_aliases, probe)
+    body_state = {it["slug"]: section_marker_present(body, it["pr_section_marker"]) for it in items}
+
+    state, description = render_status(items, ack_state, body_state)
+    mode = get_tier_mode(pr, cfg)
+    if state == "failure" and mode == "soft":
+        state = "pending"
+        description = f"[soft-fail tier:low] {description}"
+
+    # Diagnostics to job log.
+    print(f"::notice::PR #{args.pr} author={author} head={head_sha[:7]} mode={mode}")
+    for it in items:
+        slug = it["slug"]
+        ackers = ack_state[slug]["ackers"]
+        if ackers:
+            print(f"::notice::  [PASS] {slug} — acked by {','.join(ackers)}")
+        else:
+            r = ack_state[slug]["rejected"]
+            extras: list[str] = []
+            if r["self_ack"]:
+                extras.append(f"self-acks-rejected:{','.join(r['self_ack'])}")
+            if r["not_in_team"]:
+                extras.append(f"not-in-team:{','.join(r['not_in_team'])}")
+            extra = " (" + "; ".join(extras) + ")" if extras else ""
+            print(f"::notice::  [WAIT] {slug} — no valid peer-ack yet{extra}")
+
+    print(f"::notice::posting status: state={state} desc={description!r}")
+
+    if args.dry_run:
+        print("::notice::--dry-run: not posting status")
+        if args.exit_on_state:
+            return 0 if state in ("success", "pending") else 1
+        return 0
+
+    target_url = f"https://{args.gitea_host}/{args.owner}/{args.repo}/pulls/{args.pr}"
+    client.post_status(
+        args.owner, args.repo, head_sha,
+        state=state, context=args.status_context,
+        description=description, target_url=target_url,
+    )
+    print(f"::notice::status posted: {args.status_context} → {state}")
+    # By default exit 0 — the POSTed status IS the gate, NOT the job
+    # conclusion. If the job exits 1 BP will see TWO failure signals
+    # (one from the job's auto-status, one from our POST), making the
+    # description less actionable. --exit-on-state restores the old
+    # behavior for local debugging.
+    if args.exit_on_state:
+        return 0 if state in ("success", "pending") else 1
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
@@ -0,0 +1,109 @@
+# SOP-Checklist gate — per-item required reviewer teams.
+#
+# RFC#351 v1 starter set. Each item lists:
+#   slug              — canonical kebab-case form used in /sop-ack <slug>
+#   pr_section_marker — substring matched in the PR body to detect that
+#                       the author filled in this item (case-insensitive)
+#   required_teams    — list of Gitea team names; an ack from ANY one of
+#                       these teams (logical OR) satisfies the item.
+#                       Membership is probed at gate-time via
+#                       GET /api/v1/teams/{id}/members/{login}.
+#                       Team-id resolution happens at script start via
+#                       GET /api/v1/orgs/{org}/teams (cheap, one call).
+#   numeric_alias     — 1..7; lets reviewers type `/sop-ack 3` as a
+#                       shortcut for `/sop-ack staging-smoke`.
+#
+# WHY THESE TEAM MAPPINGS:
+#   The RFC table referenced persona-role names like `core-qa`,
+#   `core-be`, `core-devops` — these are individual Gitea user logins,
+#   not teams. The Gitea team-membership API is /teams/{id}/members/{u},
+#   so we need actual teams. Orchestrator preflight 2026-05-12 verified
+#   only these teams exist on molecule-ai: ceo(5), engineers(2),
+#   managers(6), qa(20), security(21), Owners(1), and bot teams. We
+#   map the RFC roles to the closest existing team and surface the
+#   mapping explicitly so it's reviewable.
+#
+# HOW TO EDIT:
+#   - Tightening: replace `engineers` with a smaller team after creating
+#     it (e.g. a new `senior-engineers` team if needed).
+#   - Loosening: add another team to required_teams (OR semantics).
+#   - Add an item: append to items list and document the slug below.
+#
+# AUTHOR SELF-ACK IS FORBIDDEN regardless of which team contains them
+# — the gate script enforces commenter != PR author before checking
+# team membership.
+
+version: 1
+
+# Tier-aware failure mode (RFC#351 open question 2):
+#   For tier:high — hard-fail (status `failure`, blocks merge via BP).
+#   For tier:medium — hard-fail (same as high; medium is non-trivial).
+#   For tier:low — soft-fail (status `pending` with `acked: N/M` in the
+#                  description). BP can choose to require the context
+#                  or not for low-tier PRs.
+# If no tier label is present, default to medium (hard-fail) — every PR
+# should have a tier label per sop-tier-check, and absence indicates
+# a missing-tier defect we should surface, not silently lower the bar.
+tier_failure_mode:
+  "tier:high": hard
+  "tier:medium": hard
+  "tier:low": soft
+default_mode: hard  # used when no tier:* label is present
+
+items:
+  - slug: comprehensive-testing
+    numeric_alias: 1
+    pr_section_marker: "Comprehensive testing performed"
+    required_teams: [qa, engineers]
+    description: >-
+      What was tested, how, edge cases covered. Ack from any qa-team
+      member (or engineers fallback while qa is small).
+
+  - slug: local-postgres-e2e
+    numeric_alias: 2
+    pr_section_marker: "Local-postgres E2E run"
+    required_teams: [engineers]
+    description: >-
+      Link to local CI artifact, or "N/A: pure-frontend change". Ack
+      from any engineer who can verify the local DB test actually ran.
+
+  - slug: staging-smoke
+    numeric_alias: 3
+    pr_section_marker: "Staging-smoke verified or pending"
+    required_teams: [engineers]
+    description: >-
+      Link to canary run, or "scheduled post-merge". Ack from any
+      engineer (core-devops/infra-sre are members of engineers team).
+
+  - slug: root-cause
+    numeric_alias: 4
+    pr_section_marker: "Root-cause not symptom"
+    required_teams: [managers, ceo]
+    description: >-
+      One-sentence root-cause statement. Ack from managers tier
+      (team-leads) or ceo. Senior judgment required to attest
+      root-cause-versus-symptom.
+
+  - slug: five-axis-review
+    numeric_alias: 5
+    pr_section_marker: "Five-Axis review walked"
+    required_teams: [engineers]
+    description: >-
+      Correctness / readability / architecture / security / performance.
+      Ack from any non-author engineer.
+
+  - slug: no-backwards-compat
+    numeric_alias: 6
+    pr_section_marker: "No backwards-compat shim / dead code added"
+    required_teams: [managers, ceo]
+    description: >-
+      Yes/no + justification if no. Senior ack required because
+      backward-compat shims are how dead-code accretes.
+
+  - slug: memory-consulted
+    numeric_alias: 7
+    pr_section_marker: "Memory/saved-feedback consulted"
+    required_teams: [engineers]
+    description: >-
+      List of feedback memories applicable to this change. Ack from
+      any engineer who has the same memory access.
@@ -7,7 +7,6 @@ on:
 jobs:
  build:
    runs-on: ubuntu-latest
-    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
@@ -0,0 +1,121 @@
+# sop-checklist-gate — peer-ack merge gate for SOP-checklist items.
+#
+# RFC#351 Step 2 of 6 (implementation MVP).
+#
+# === DESIGN ===
+#
+# Goal: each PR must answer 7 SOP-checklist questions in its body,
+# and each item must have at least one /sop-ack <slug> comment from
+# a non-author peer in the required team. BP requires the
+# `sop-checklist / all-items-acked (pull_request)` status to merge.
+#
+# Triggers:
+#   - `pull_request_target`: opened, edited, synchronize, reopened
+#       → fires when PR opens, body is edited (refire — RFC#351 §4),
+#         or new code is pushed (head.sha changes → stale status would
+#         be auto-discarded by BP via dismiss_stale_reviews, but the
+#         status itself is per-SHA so we re-post on the new head).
+#   - `issue_comment`: created, edited, deleted
+#       → fires on any new comment so /sop-ack / /sop-revoke take
+#         effect immediately (Gitea 1.22.6 doesn't refire on
+#         pull_request_review per feedback_pull_request_review_no_refire,
+#         so issue_comment is the canonical refire channel).
+#
+# Trust boundary (mirrors RFC#324 §A4 + sop-tier-check security note):
+#   `pull_request_target` (not `pull_request`) — workflow def is loaded
+#   from BASE branch, so a PR cannot rewrite this workflow to exfiltrate
+#   the token. The `actions/checkout` step pins `ref: base.sha` so the
+#   script ALSO comes from BASE. PR-HEAD code is never executed in the
+#   runner.
+#
+# Token scope:
+#   - read:repository, read:organization for PR + comments + team probes
+#   - write:repository for POST /statuses/{sha}
+#   - The token owner MUST be a member of every team referenced by the
+#     config's required_teams (else /teams/{id}/members/{login} returns
+#     403 — see review-check.sh same-gotcha doc). For the MVP we use
+#     the dev-lead token (a member of engineers, managers, qa, security)
+#     via `SOP_CHECKLIST_GATE_TOKEN`, the org-level `SOP_TIER_CHECK_TOKEN`, or a repo-scoped fallback token. Provisioning of that
+#     secret is a follow-up authorization step (separate from this PR).
+#
+# Failure mode: tier-aware (RFC#351 open question 2):
+#   - tier:high   → state=failure (hard-fail; BP blocks merge)
+#   - tier:medium → state=failure (hard-fail; same)
+#   - tier:low    → state=pending (soft-fail; BP can choose to require
+#                    this context or skip for low-tier PRs)
+#   - missing/no-tier → state=failure (default-mode: hard — never lower
+#                    the bar per feedback_fix_root_not_symptom)
+#
+# Slash-command contract (RFC#351 v1 + §A1.1-style notes from RFC#324):
+#
+#   /sop-ack <slug-or-numeric-alias> [optional note]
+#       — register a peer-ack for one checklist item.
+#       — slug accepts kebab-case, snake_case, or natural-spaces
+#         (all normalize to canonical kebab-case).
+#       — numeric 1..7 maps via config.items[*].numeric_alias.
+#       — most-recent (user, slug) directive wins.
+#
+#   /sop-revoke <slug-or-numeric-alias> [reason]
+#       — invalidate the commenter's own prior /sop-ack for this slug.
+#       — does NOT affect other peers' acks on the same slug.
+#       — most-recent (user, slug) directive wins, so a later /sop-ack
+#         re-restores the ack.
+#
+# The eval is read-only + idempotent (read PR + comments + team
+# membership, compute, post status). Re-running on any event is safe —
+# the new status overwrites the previous one for the same context.
+
+name: sop-checklist-gate
+
+on:
+  pull_request_target:
+    types: [opened, edited, synchronize, reopened]
+  issue_comment:
+    types: [created, edited, deleted]
+
+permissions:
+  contents: read
+  pull-requests: read
+  # NOTE: `statuses: write` is the GitHub-Actions name for POST /statuses.
+  # Gitea 1.22.6 may not gate on this permission key (it just checks the
+  # token), but listing it explicitly documents intent for the next
+  # platform-version upgrade.
+  statuses: write
+
+jobs:
+  gate:
+    # Run on pull_request_target events always. On issue_comment events,
+    # only when the comment is on a PR (issue_comment fires for issues
+    # too) and the body contains one of the slash-commands.
+    if: |
+      github.event_name == 'pull_request_target' ||
+      (github.event_name == 'issue_comment' &&
+       github.event.issue.pull_request != null &&
+       (contains(github.event.comment.body, '/sop-ack') ||
+        contains(github.event.comment.body, '/sop-revoke')))
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out BASE ref (trust boundary — never PR-head)
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd  # v6.0.2
+        with:
+          # For pull_request_target, the default branch is the trust
+          # anchor. For issue_comment the PR base may differ from the
+          # default branch (PR targeting `staging`), so we use the
+          # default-branch ref explicitly — same approach as
+          # qa-review.yml so the script source is always trusted.
+          ref: ${{ github.event.repository.default_branch }}
+
+      - name: Run sop-checklist-gate
+        env:
+          GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.SOP_TIER_CHECK_TOKEN || secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
+          PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
+          OWNER: ${{ github.repository_owner }}
+          REPO_NAME: ${{ github.event.repository.name }}
+        run: |
+          set -euo pipefail
+          python3 .gitea/scripts/sop-checklist-gate.py \
+            --owner "$OWNER" \
+            --repo "$REPO_NAME" \
+            --pr "$PR_NUMBER" \
+            --config .gitea/sop-checklist-config.yaml \
+            --gitea-host git.moleculesai.app
@@ -8,6 +8,27 @@ Entries are published daily at 23:50 UTC.

 ---

+## 2026-05-13
+
+### ✨ New features
+
+- **Graceful shutdown support for remote agents**: `run_heartbeat_loop()` and `run_agent_loop()` in `molecule-sdk-python` now accept a `stop_event: threading.Event` parameter. Set the event from a SIGTERM handler to exit the loop cleanly with return value `"stopped"` — enabling proper graceful shutdown in Kubernetes, Docker, and other container-orchestrated environments. (`molecule-sdk-python` [#8](https://git.moleculesai.app/molecule-ai/molecule-sdk-python/pulls/8))
+
+### 🔧 Fixes
+
+- **PLATFORM_URL defaults aligned across all runtime modules**: all workspace runtime modules (`a2a_cli.py`, `a2a_client.py`, `a2a_mcp_server.py`, and 10 others) now consistently default `PLATFORM_URL` to `http://host.docker.internal:8080`. Previously some modules defaulted to `http://platform:8080`, causing connection failures in containerized deployments where the Docker host is not named `platform`. (`molecule-ai-workspace-runtime` [#12](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/12))
+
+### 🔒 Security
+
+- **CWE-22: Path traversal regression in org template import fixed**: a regression removed the path-traversal guard from `createWorkspaceTree` in `org_import.go`, which could allow a malicious org YAML with `filesDir: "../../../etc"` to read arbitrary server files. The fix replaces the unprotected `parseEnvFile` calls with `loadWorkspaceEnv` which applies `resolveInsideRoot` validation before accessing any path. (`molecule-core` [#810](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/810))
+
+### 🧹 Internal
+
+- **Canvas CI hardening**: publish workflow updated to pipefail-safe shell probes; Gitea cache export no longer masks errors; canvas image published to ECR. (`molecule-core` [#773](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/773), [#776](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/776), [#777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/777))
+- **Go lint CI hardening**: `golangci-lint run` no longer masked with `|| true`, so lint failures now fail the build loudly. (`molecule-core` [#781](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/781))
+
+---
+
 ## 2026-05-12

 ### 🔒 Security
@@ -26,6 +47,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-05-11

 ### ✨ New features
@@ -60,6 +82,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-05-10

 ### ✨ New features
@@ -96,82 +119,6 @@ Entries are published daily at 23:50 UTC.
 - **molecule-ai-plugin-molecule-careful-bash**: token exfiltration pattern block (OFFSEC-002) now documented in `known-issues.md`. (`molecule-ai-plugin-molecule-careful-bash` [#3](https://git.moleculesai.app/molecule-ai/molecule-ai-plugin-molecule-careful-bash/pulls/3))
 - **molecule-ci**: 7 reusable workflows ported to `.gitea/workflows/`, and Docker build smoke tests now gracefully skip when the daemon is unavailable. (`molecule-ci` [#6](https://git.moleculesai.app/molecule-ai/molecule-ci/pulls/6), [#7](https://git.moleculesai.app/molecule-ai/molecule-ci/pulls/7))

---
-
-## 2026-04-23
-
-### ✨ New features
-
- **SaaS Federation v2 tutorial**: a clean, self-contained walkthrough for platform operators who want to run multi-tenant workspaces from a single control plane. Covers org onboarding via `POST /cp/orgs`, workspace provisioning per tenant, fleet inspection, quota controls, and suspension/teardown. (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700))
- **External workspace quickstart**: a 5-minute guide to running any HTTP-speaking agent (Python, Node, Go, Rust) on your own machine and having it appear on the canvas alongside platform-provisioned agents. Covers tunnel setup, `POST /workspaces` registration, and a working echo agent. (`molecule-core` [#1760](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1760))
-
-### 🔧 Fixes
-
- **SSRF guard in SaaS mode**: previously the SSRF protection was blocking all RFC-1918 private IP ranges (`10/8`, `172.16/12`, `192.168/16`) even in SaaS mode — this was a regression from the earlier SaaS-mode work. The fix wires up the `saasMode` flag correctly so private IPs are allowed in SaaS deployments (for internal service calls), while metadata ranges (`169.254/16`), CGNAT, loopback, and link-local remain blocked in every mode. IPv6 ULA (`fd00::/8`) handling is also now correct. (`molecule-core` [#1692](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1692))
- **PUT `/workspaces/:id/files/*path` on SaaS (EC2) workspaces**: fixed a 500 error (`docker not available`) that occurred when saving files from Canvas on SaaS workspaces. The handler now detects non-Docker workspaces via `workspaces.instance_id` and routes writes via EC2 Instance Connect (SSH-backed write with an ephemeral key pair) instead of trying to `docker cp`. (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702))
-
-### 📚 Docs
-
- **molecli shell completion**: tab completion for `molecule` CLI in bash, zsh, fish, and PowerShell — covers all subcommands and flags. (`docs` [#79](https://git.moleculesai.app/molecule-ai/docs/pulls/79))
- **MCP server structured logging**: `LOG_LEVEL` env var, pino JSON output with AsyncLocalStorage context on every tool call. (`docs` [#78](https://git.moleculesai.app/molecule-ai/docs/pulls/78))
-
-### 🧹 Internal
-
- SaaS Federation v2 tutorial published — clean rewrite of #1613, now with correct HTTP status codes, fleet metrics endpoint, and security model table (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700)); Files API SSH-backed write path for SaaS EC2 workspaces — fixes 500 on PUT `/workspaces/:id/files/*path` for SaaS users (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702)); Canvas create-workspace dialog now requires hermes runtime model (`molecule-core` [#1714](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1714)).
- EC2 Instance Connect SSH tutorial published (`molecule-core` [#1617](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1617)); AI agent org-scoped key credential model blog published (`molecule-core` [#1614](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1614)); Phase 30 Day 2 social package ready (`molecule-core` [#1662](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1662)).
-
-### 🌅 Late-day updates (17:30–23:50 UTC)
-
-#### 🔒 Security
-
- **Cross-tenant memory poisoning fix** (`molecule-core` [#1791](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1791)): fixes a bug where `commit_memory` with `scope=TEAM` could write to a sibling workspace's memory store under high concurrency. `commit_memory` now validates `target_workspace_id` against the caller's known peer set before any write.
- **CWE-78 shell injection hardening** (`molecule-core` [#1885](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1885)): `shellQuote` now uses `strconv.Quote` for all shell-delimited paths in the EC2 Instance Connect and bastion SSH paths. Defense-in-depth layer hardened; primary protection remains path-validation logic upstream.
-
-#### ✨ New features
-
- **A2A priority queue — Phase 1** (`molecule-core` [#1892](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1892)): task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. Phase 2 (priority inversion deadlock prevention) on the roadmap.
-
-#### 🔧 Fixes
-
- **A2A queue nil-safe drain** (`molecule-core` [#1893](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1893), [#1896](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1896)): `DequeueTask` no longer panics when the in-memory queue map is uninitialized — graceful empty-result returned instead.
- **Workspaces stuck in `provisioning` after失败** (`molecule-core` [#1794](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1794)): provisioner now transitions workspaces to `failed` state with a descriptive error message instead of leaving them orphaned in `provisioning`.
- **Dedup settings hooks double-fire** (`molecule-core` [#1797](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1797)): the `dedup_settings_hooks` registry now correctly unsubscribes after one fire — eliminates the 3–4× duplicate hook execution observed in CI.
- **Semantic memory search returning stale results** (`molecule-core` [#1778](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1778)): pgvector index now refreshes synchronously on `commit_memory` write instead of on a 5-minute background cycle.
- **pgvector migration race in E2E CI** (`molecule-core` [#1777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1777)): `CREATE EXTENSION` wrapped in `IF NOT EXISTS` inside a `DO` block — eliminates E2E CI flakiness on fresh DB spin-up.
- **EC2 Instance Connect endpoint not found in us-west-2** (`molecule-core` [#1779](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1779)): Instance Connect endpoint SDK call now falls back gracefully to direct SSM session when the EIC endpoint is unavailable in a region.
- **Canvas topology overlay edge labels clipped** (`molecule-core` [#1802](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1802)): SVG edge labels now respect viewport bounds; labels that would render off-screen are repositioned.
- **Audit trail panel not loading for large workspaces** (`molecule-core` [#1854](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1854)): audit log fetch now uses cursor-based pagination (100 events per page) instead of returning all events at once.
- **Hermes `response_format` not forwarded to MiniMax** (`molecule-core` [#1861](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1861)): `response_format=json_schema` now propagates through the model config passthrough for hermes/MiniMax-M2.7-highspeed workspaces.
- **Memory Inspector panel memory leak** (`molecule-core` [#1871](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1871)): `useMemoryStore` hook now correctly cancels the SSE subscription on panel unmount.
- **Token revocation cache stale-read window** (`molecule-core` [#1888](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1888)): revoked-token invalidation now propagates within 5 s (down from 60 s) — closes the window where a revoked token could still authenticate.
- **TenantGuard same-origin bypass (regression)** (`molecule-core` [#1898](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1898)): fixes a regression introduced in the Phase 33 cloudflare-removal change that re-opened the TenantGuard same-origin bypass for EC2 tenant Canvas deployments.
-
-#### 📚 Docs
-
- **Chrome DevTools MCP tutorial** (`docs` [#1798](https://git.moleculesai.app/molecule-ai/docs/pulls/1798)): hands-on guide for debugging Molecule AI agents in-browser using Chrome's built-in MCP inspector.
- **Phase 34 launch page** (`docs` [#1799](https://git.moleculesai.app/molecule-ai/docs/pulls/1799)): public-facing launch collateral for GA scheduled 2026-04-30.
- **Tool Trace demo environment** (`docs` [#1844](https://git.moleculesai.app/molecule-ai/docs/pulls/1844)): interactive demo showing the tool trace inspector in action, with sample run data.
- **Enterprise battlecard** (`docs` [#1864](https://git.moleculesai.app/molecule-ai/docs/pulls/1864)): competitive positioning doc for sales and enterprise evaluation teams.
-
-#### 🧹 Internal
-
- `a2a-sdk` hot-pinned to `0.3.x` across all workspace template repos (`molecule-core` [#1890](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1890)); SDK upgrade path documented in `KI-009` (`internal` [#1631](https://git.moleculesai.app/molecule-ai/internal/issues/1631)).
- Phase 34 CI matrix expanded to cover Node 22 and Go 1.24 (`molecule-ci`).
-
-#### 🔧 Runtime fixes
-
- **Heartbeat 401 retry** (`molecule-ai-workspace-runtime` [#40](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/40)): heartbeat worker now retries with fresh token on 401 before declaring the workspace unreachable — eliminates false `disconnected` status during token rotation.
- **LLM token auto-detect** (`molecule-ai-workspace-runtime` [#38](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/38)): hermes runtime now auto-detects `max_tokens` from model context window and request timeout when not explicitly configured.
-
---
-
-
-
-
-
-
-## 2026-05-10
-
 ### ✨ New features

 - **A2A priority queue — Phase 1**: task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. (`molecule-core` [#225](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/225))
@@ -214,7 +161,7 @@ Entries are published daily at 23:50 UTC.

 - **SOP tier-check AND-composition of required team approvals per tier**: tier-check now enforces AND-composition of required team approvals per tier (`tier:high`). (`molecule-core` [#225](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/225))
 - **Canvas structural tests for TIER_CONFIG and COMM_TYPE_LABELS**: structural tests added for canvas TIER_CONFIG and COMM_TYPE_LABELS constants. (`molecule-core` [#245](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/245))
-
+---

 ## 2026-05-09

@@ -243,6 +190,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-05-08

 ### 🔧 Fixes
@@ -253,6 +201,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-05-07

 ### 📚 Docs
@@ -271,6 +220,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-05-06

 ### 🧹 Internal
@@ -280,6 +230,7 @@ Entries are published daily at 23:50 UTC.

 ---

+
 ## 2026-04-22

 ### ✨ New features
@@ -345,6 +296,83 @@ See the [migration blog post](/blog/cloudflare-tunnel-migration).



+
+
+---
+
+## 2026-04-23
+
+### ✨ New features
+
+- **SaaS Federation v2 tutorial**: a clean, self-contained walkthrough for platform operators who want to run multi-tenant workspaces from a single control plane. Covers org onboarding via `POST /cp/orgs`, workspace provisioning per tenant, fleet inspection, quota controls, and suspension/teardown. (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700))
+- **External workspace quickstart**: a 5-minute guide to running any HTTP-speaking agent (Python, Node, Go, Rust) on your own machine and having it appear on the canvas alongside platform-provisioned agents. Covers tunnel setup, `POST /workspaces` registration, and a working echo agent. (`molecule-core` [#1760](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1760))
+
+### 🔧 Fixes
+
+- **SSRF guard in SaaS mode**: previously the SSRF protection was blocking all RFC-1918 private IP ranges (`10/8`, `172.16/12`, `192.168/16`) even in SaaS mode — this was a regression from the earlier SaaS-mode work. The fix wires up the `saasMode` flag correctly so private IPs are allowed in SaaS deployments (for internal service calls), while metadata ranges (`169.254/16`), CGNAT, loopback, and link-local remain blocked in every mode. IPv6 ULA (`fd00::/8`) handling is also now correct. (`molecule-core` [#1692](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1692))
+- **PUT `/workspaces/:id/files/*path` on SaaS (EC2) workspaces**: fixed a 500 error (`docker not available`) that occurred when saving files from Canvas on SaaS workspaces. The handler now detects non-Docker workspaces via `workspaces.instance_id` and routes writes via EC2 Instance Connect (SSH-backed write with an ephemeral key pair) instead of trying to `docker cp`. (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702))
+
+### 📚 Docs
+
+- **molecli shell completion**: tab completion for `molecule` CLI in bash, zsh, fish, and PowerShell — covers all subcommands and flags. (`docs` [#79](https://git.moleculesai.app/molecule-ai/docs/pulls/79))
+- **MCP server structured logging**: `LOG_LEVEL` env var, pino JSON output with AsyncLocalStorage context on every tool call. (`docs` [#78](https://git.moleculesai.app/molecule-ai/docs/pulls/78))
+
+### 🧹 Internal
+
+- SaaS Federation v2 tutorial published — clean rewrite of #1613, now with correct HTTP status codes, fleet metrics endpoint, and security model table (`molecule-core` [#1700](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1700)); Files API SSH-backed write path for SaaS EC2 workspaces — fixes 500 on PUT `/workspaces/:id/files/*path` for SaaS users (`molecule-core` [#1702](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1702)); Canvas create-workspace dialog now requires hermes runtime model (`molecule-core` [#1714](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1714)).
+- EC2 Instance Connect SSH tutorial published (`molecule-core` [#1617](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1617)); AI agent org-scoped key credential model blog published (`molecule-core` [#1614](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1614)); Phase 30 Day 2 social package ready (`molecule-core` [#1662](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1662)).
+
+### 🌅 Late-day updates (17:30–23:50 UTC)
+
+#### 🔒 Security
+
+- **Cross-tenant memory poisoning fix** (`molecule-core` [#1791](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1791)): fixes a bug where `commit_memory` with `scope=TEAM` could write to a sibling workspace's memory store under high concurrency. `commit_memory` now validates `target_workspace_id` against the caller's known peer set before any write.
+- **CWE-78 shell injection hardening** (`molecule-core` [#1885](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1885)): `shellQuote` now uses `strconv.Quote` for all shell-delimited paths in the EC2 Instance Connect and bastion SSH paths. Defense-in-depth layer hardened; primary protection remains path-validation logic upstream.
+
+#### ✨ New features
+
+- **A2A priority queue — Phase 1** (`molecule-core` [#1892](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1892)): task dispatch now supports a `priority` field (`low` / `normal` / `high` / `urgent`). High/urgent tasks bypass the normal FIFO queue and are dispatched immediately. Phase 2 (priority inversion deadlock prevention) on the roadmap.
+
+#### 🔧 Fixes
+
+- **A2A queue nil-safe drain** (`molecule-core` [#1893](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1893), [#1896](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1896)): `DequeueTask` no longer panics when the in-memory queue map is uninitialized — graceful empty-result returned instead.
+- **Workspaces stuck in `provisioning` after failure** (`molecule-core` [#1794](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1794)): provisioner now transitions workspaces to `failed` state with a descriptive error message instead of leaving them orphaned in `provisioning`.
+- **Dedup settings hooks double-fire** (`molecule-core` [#1797](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1797)): the `dedup_settings_hooks` registry now correctly unsubscribes after one fire — eliminates the 3–4× duplicate hook execution observed in CI.
+- **Semantic memory search returning stale results** (`molecule-core` [#1778](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1778)): pgvector index now refreshes synchronously on `commit_memory` write instead of on a 5-minute background cycle.
+- **pgvector migration race in E2E CI** (`molecule-core` [#1777](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1777)): `CREATE EXTENSION` wrapped in `IF NOT EXISTS` inside a `DO` block — eliminates E2E CI flakiness on fresh DB spin-up.
+- **EC2 Instance Connect endpoint not found in us-west-2** (`molecule-core` [#1779](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1779)): Instance Connect endpoint SDK call now falls back gracefully to direct SSM session when the EIC endpoint is unavailable in a region.
+- **Canvas topology overlay edge labels clipped** (`molecule-core` [#1802](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1802)): SVG edge labels now respect viewport bounds; labels that would render off-screen are repositioned.
+- **Audit trail panel not loading for large workspaces** (`molecule-core` [#1854](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1854)): audit log fetch now uses cursor-based pagination (100 events per page) instead of returning all events at once.
+- **Hermes `response_format` not forwarded to MiniMax** (`molecule-core` [#1861](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1861)): `response_format=json_schema` now propagates through the model config passthrough for hermes/MiniMax-M2.7-highspeed workspaces.
+- **Memory Inspector panel memory leak** (`molecule-core` [#1871](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1871)): `useMemoryStore` hook now correctly cancels the SSE subscription on panel unmount.
+- **Token revocation cache stale-read window** (`molecule-core` [#1888](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1888)): revoked-token invalidation now propagates within 5 s (down from 60 s) — closes the window where a revoked token could still authenticate.
+- **TenantGuard same-origin bypass (regression)** (`molecule-core` [#1898](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1898)): fixes a regression introduced in the Phase 33 cloudflare-removal change that re-opened the TenantGuard same-origin bypass for EC2 tenant Canvas deployments.
+
+#### 📚 Docs
+
+- **Chrome DevTools MCP tutorial** (`docs` [#1798](https://git.moleculesai.app/molecule-ai/docs/pulls/1798)): hands-on guide for debugging Molecule AI agents in-browser using Chrome's built-in MCP inspector.
+- **Phase 34 launch page** (`docs` [#1799](https://git.moleculesai.app/molecule-ai/docs/pulls/1799)): public-facing launch collateral for GA scheduled 2026-04-30.
+- **Tool Trace demo environment** (`docs` [#1844](https://git.moleculesai.app/molecule-ai/docs/pulls/1844)): interactive demo showing the tool trace inspector in action, with sample run data.
+- **Enterprise battlecard** (`docs` [#1864](https://git.moleculesai.app/molecule-ai/docs/pulls/1864)): competitive positioning doc for sales and enterprise evaluation teams.
+
+#### 🧹 Internal
+
+- `a2a-sdk` hot-pinned to `0.3.x` across all workspace template repos (`molecule-core` [#1890](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/1890)); SDK upgrade path documented in `KI-009` (`internal` [#1631](https://git.moleculesai.app/molecule-ai/internal/issues/1631)).
+- Phase 34 CI matrix expanded to cover Node 22 and Go 1.24 (`molecule-ci`).
+
+#### 🔧 Runtime fixes
+
+- **Heartbeat 401 retry** (`molecule-ai-workspace-runtime` [#40](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/40)): heartbeat worker now retries with fresh token on 401 before declaring the workspace unreachable — eliminates false `disconnected` status during token rotation.
+- **LLM token auto-detect** (`molecule-ai-workspace-runtime` [#38](https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/pulls/38)): hermes runtime now auto-detects `max_tokens` from model context window and request timeout when not explicitly configured.
+
+---
+
+
+
+
+
+
+
 ## 2026-04-17

 A high-velocity day: 80+ PRs merged across platform, canvas, runtimes, security, and channels.
@@ -63,6 +63,15 @@ claude mcp add molecule -s user -- env \
 Reconnect with `/mcp` (or restart the Claude Code session) and the tools
 appear in the next turn.

+<Callout type="warn">
+Claude Code 2.1.x+ requires the tagged flag form
+`--dangerously-load-development-channels server:molecule`. The bare flag
+(`--dangerously-load-development-channels` with no value) causes every A2A
+turn to wedge with a `Control request timeout: initialize` error. See
+[Dev-channels flag: tagged-form requirement](/docs/runtime-mcp/dev-channels-flag)
+for the full failure-mode breakdown and SDK integration notes.
+</Callout>
+
 ### Hermes Agent

 ```bash
@@ -382,6 +391,7 @@ needed when you can't run an MCP stdio server inside your agent (rare).

 ## See also

+- [Dev-channels flag: tagged-form requirement](/docs/runtime-mcp/dev-channels-flag) — why `--dangerously-load-development-channels server:molecule` (not the bare flag) is required for inline channel push in Claude Code 2.1.x+
 - [External Agents](/docs/external-agents) — manual A2A path for non-MCP runtimes
 - [Tokens](/docs/tokens) — token management and rotation
 - [Concepts — Workspaces](/docs/concepts#workspaces)
@@ -0,0 +1,176 @@
+---
+title: "Dev-channels flag — tagged-form requirement"
+description: "Why Claude Code 2.1.x+ requires `--dangerously-load-development-channels server:molecule` (not the bare flag) to enable inline channel push from the molecule-mcp wheel."
+---
+
+import { Callout } from 'fumadocs-ui/components/callout';
+
+The `molecule-mcp` wheel emits a JSON-RPC `notifications/claude/channel`
+notification on every inbound A2A message so Claude Code can render it
+as an inline `<channel>` synthetic user turn — zero polling, zero
+per-turn stall. During the channels research preview, Claude Code only
+processes that notification when the host is launched with the
+`--dangerously-load-development-channels` flag *and the flag carries a
+matching tagged allowlist entry*.
+
+This page covers the form that flag must take, what breaks when it's
+wrong, and when an operator has to think about it.
+
+<Callout type="warn">
+The bare flag (no value) is rejected by the post-2.1 CLI parser, and
+the failure mode propagates upstream as a `Control request timeout:
+initialize` from any SDK that spawns the CLI — every A2A turn wedges
+100% of the time. See [Failure mode](#failure-mode) below.
+</Callout>
+
+## The flag
+
+```
+--dangerously-load-development-channels <entries...>
+```
+
+Available in Claude Code **2.1.x and later**. It opts the CLI into
+processing experimental `notifications/<channel>` JSON-RPC methods
+emitted by registered MCP servers and plugin channels. Without it, the
+CLI silently drops those notifications during the allowlist check, even
+though the wheel ships the wire shape correctly.
+
+## Required form: tagged allowlist entries
+
+Each entry must carry one of two prefixes:
+
+| Form | Use for |
+|---|---|
+| `server:<MCP-server-name>` | Manually configured MCP servers — the name matches what you registered with `claude mcp add <name> ...` or the key under `mcpServers` in `~/.claude.json`. |
+| `plugin:<plugin-name>@<owner>/<repo>` | Plugin channels installed from a Claude Code plugin marketplace. |
+
+Multiple entries are space-separated:
+
+```bash
+claude --dangerously-load-development-channels server:molecule server:telegram
+```
+
+Untagged values (`molecule` instead of `server:molecule`) are rejected
+with `--dangerously-load-development-channels entries must be tagged`.
+
+## Failure mode
+
+A bare flag (`--dangerously-load-development-channels` with no value)
+walks through three layers of damage before surfacing:
+
+1. **CLI**: rejects the invocation with
+   `error: option '--dangerously-load-development-channels <servers...>' argument missing`.
+2. **SDK**: `claude-agent-sdk` (used by `claude_sdk_executor.py` in the
+   Claude Code workspace template) renders the kwarg as a bare switch when
+   the value is `None`. The CLI then never responds to the SDK's first
+   `initialize` control message.
+3. **Workspace agent**: the SDK times out with
+   `Control request timeout: initialize`. Every A2A turn wedges — 100%
+   reproducible. Caught live on workspace `dd40faf8` on 2026-05-01.
+
+Two small fixes prevent this: pass a tagged value (don't let `None`
+render as a bare switch), and verify the CLI accepts your specific
+entries before going broad.
+
+## For Molecule operators
+
+Pass `server:molecule` to enable the inbox bridge → MCP
+`notifications/claude/channel` push for the `molecule-mcp` wheel.
+
+```bash
+claude --dangerously-load-development-channels server:molecule
+```
+
+The `molecule` here matches the name you registered the wheel under in
+[Step 2 of the runtime-mcp guide](/docs/runtime-mcp#claude-code) (the
+key under `mcpServers`, or the first positional arg to `claude mcp add`).
+If you registered the wheel as `mol` or `molecule-prod`, use that name
+in the tag.
+
+When push is live, the session header prints:
+
+```
+Listening for channel messages from: server:molecule
+```
+
+…and inbound canvas/peer-agent messages render inline as
+`<channel source="molecule" ...>` synthetic user turns instead of
+arriving via `inbox_peek`.
+
+### Embedding in an SDK-driven agent
+
+If you spawn `claude` through `claude-agent-sdk` (e.g. the Claude Code
+workspace template's `claude_sdk_executor.py`), forward the tagged value
+through `extra_args`:
+
+```python
+from claude_agent_sdk import ClaudeAgentOptions
+
+ClaudeAgentOptions(
+    model=self.model,
+    permission_mode="bypassPermissions",
+    cwd=self._resolve_cwd(),
+    mcp_servers=mcp_servers,
+    system_prompt=self._build_system_prompt(),
+    resume=self._session_id,
+    extra_args={"dangerously-load-development-channels": "server:molecule"},
+)
+```
+
+The SDK forwards `extra_args` keys as `--<key> <value>` to the spawned
+CLI. Passing `None` as the value renders as a bare switch and trips the
+[Failure mode](#failure-mode) chain above.
+
+## Verification
+
+Verified live on 2026-05-02: with the tagged value in `extra_args`,
+the in-workspace agent received `<channel source="molecule" kind="..."
+peer_id="..." activity_id="..." ts="...">` tags inline as synthetic
+user turns. No `wait_for_message` poll was needed for delivery. A2A
+returned coherent replies on every turn.
+
+## When this matters
+
+Only when both of the following apply:
+
+- You're running Claude Code (any version 2.1.x or later) as the
+  workspace runtime, AND
+- The in-workspace `molecule-mcp` server is configured (it is, by
+  default, in the `claude-code` workspace template).
+
+**Hosted Molecule SaaS handles this automatically** — the executor
+passes `extra_args={"dangerously-load-development-channels": "server:molecule"}`
+when spawning the CLI. Operators on hosted SaaS do not need to do
+anything.
+
+**Self-hosted operators using the Claude Code workspace template** also
+get this for free since the template's executor sets `extra_args`. The
+flag only needs operator attention when:
+
+- Forking the Claude Code workspace template and stripping `extra_args`
+  inadvertently.
+- Running `claude` directly outside the template (e.g. interactive
+  sessions on a developer laptop) and wanting inline `<channel>` push.
+- Adding a second tagged source (e.g. `server:telegram` alongside
+  `server:molecule`) — append, don't replace.
+
+Operators on Cursor, Cline, OpenCode, codex, hermes-agent, or any
+non-Claude-Code MCP host are unaffected: those clients ignore the
+notification and the wheel's poll path delivers via
+`wait_for_message` as the universal fallback.
+
+## Forward note
+
+This requirement is a **research-preview gate**. Once Claude Code
+graduates `notifications/<channel>` from research preview to a default
+allowlist, the `--dangerously-load-development-channels` flag will no
+longer be required for the `molecule` server. Drop the `extra_args`
+entry in `claude_sdk_executor.py` (and any operator launch wrappers)
+when that happens — the wheel emits the wire shape correctly today
+and will continue to do so post-graduation.
+
+## See also
+
+- [Bring Your Own Runtime (MCP) — Inbound delivery](/docs/runtime-mcp#inbound-delivery-universal-poll-optional-push)
+- [Bring Your Own Runtime (MCP) — Step 2: Claude Code](/docs/runtime-mcp#claude-code)
+- [Troubleshooting — Control request timeout: initialize](/docs/runtime-mcp#control-request-timeout-initialize-from-the-workspace-agent)
@@ -9,6 +9,28 @@ This page documents security fixes shipped in the Molecule AI platform. Each ent

 ---

+## 2026-05-13 — CWE-22: Path Traversal Regression in `org_import.go` (Resolved)
+
+**Severity:** Critical (CWE-22)
+**PR:** [#810](https://git.moleculesai.app/molecule-ai/molecule-core/pull/810)
+**Affected:** `workspace-server/internal/handlers/org_import.go` — `createWorkspaceTree`
+
+### Vulnerability
+
+A regression removed the `resolveInsideRoot` path-traversal guard from `createWorkspaceTree` at `org_import.go:494`. The function called `parseEnvFile(filepath.Join(orgBaseDir, ws.FilesDir, ".env"))` without validating that `ws.FilesDir` resolved inside `orgBaseDir`.
+
+An attacker who could submit a malicious org YAML with `filesDir: "../../../etc"` could cause the platform to read arbitrary files accessible to the server process via the `.env` loading path.
+
+### Fix
+
+Replaced the two raw `parseEnvFile` calls with `loadWorkspaceEnv(orgBaseDir, ws.FilesDir)`, which applies `resolveInsideRoot` internally before joining paths. This restores the guard that was present before the regression was introduced.
+
+### User-facing summary
+
+The org template import endpoint now validates all workspace file paths before accessing them. Attempts to access files outside the designated org directory return an error and are never processed.
+
+---
+
 ## 2026-04-20 — CWE-22: Path Traversal in `copyFilesToContainer`

 **Severity:** High (CWE-22)
@@ -1,198 +0,0 @@
---
-title: Self-Hosted Workspace Deployment with Docker
---
-
-# Self-Hosted Workspace Deployment with Docker
-
-This guide covers running a Molecule AI workspace agent as a Docker container on a self-hosted server or VM. It covers the Docker image, required environment variables, the built-in healthcheck, graceful shutdown, and Kubernetes deployment considerations.
-
-> **Prerequisites:** A running Molecule AI control plane (self-hosted or SaaS), an `ADMIN_TOKEN` or org-scoped API key with admin scope, and Docker 20.10+ on the host.
-
-## How the workspace container works
-
-The Molecule AI workspace Dockerfile includes:
-
- A uvicorn server on port 8000 (configurable via `PORT`)
- A healthcheck endpoint at `/.well-known/agent-card.json` (used by Docker and Kubernetes probes)
- Graceful SIGTERM handling via uvicorn — the heartbeat loop and adapter tasks shut down cleanly
-
-```
-┌─────────────────────────────────────────────┐
-│  Docker host (your VM / bare metal)         │
-│                                             │
-│  ┌─────────────────────────────────────┐   │
-│  │  workspace container                 │   │
-│  │                                     │   │
-│  │  uvicorn (port 8000)                │   │
-│  │    └─ /.well-known/agent-card.json  ← HEALTHCHECK │   │
-│  │                                     │   │
-│  │  heartbeat loop + A2A agent            │   │
-│  └──────────────┬──────────────────────┘   │
-│                 │                              │
-│  host.docker.internal:8080                    │
-│                 │                              │
-│                 ▼                              │
-│  ┌─────────────────────────────────────┐   │
-│  │  Molecule AI control plane          │   │
-│  │  (platform on port 8080)            │   │
-│  └─────────────────────────────────────┘   │
-└─────────────────────────────────────────────┘
-```
-
-## Step 1: Create an external workspace
-
-First register the workspace as an external (self-managed) agent on the platform.
-
-```bash
-ADMIN_TOKEN="your-admin-token"
-PLATFORM_URL="https://platform.moleculesai.app"   # or http://localhost:8080 for local dev
-WORKSPACE=$(curl -s -X POST "${PLATFORM_URL}/workspaces" \
-  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
-  -H "Content-Type: application/json" \
-  -d '{"name": "self-hosted-agent", "runtime": "external"}')
-
-WORKSPACE_ID=$(echo "$WORKSPACE" | python3 -c "import json,sys; print(json.load(sys.stdin)['id'])")
-echo "Workspace ID: $WORKSPACE_ID"
-```
-
-Save the returned `WORKSPACE_ID`. The workspace agent obtains its bearer token automatically during its first registration with the platform.
-
-## Step 2: Pull the workspace image
-
-The workspace image is published to the Molecule AI ECR registry. Contact your platform administrator for the registry prefix and credentials, then log in:
-
-```bash
-aws ecr get-login-password --region us-east-1 | \
-  docker login --username AWS --password-stdin "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com"
-
-docker pull "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
-```
-
-## Step 3: Configure environment variables
-
-| Variable | Default | Description |
-|---|---|---|
-| `PLATFORM_URL` | `http://localhost:8080` | Platform API URL. Inside a Docker container, use `http://host.docker.internal:8080` to reach the platform on the host machine. |
-| `WORKSPACE_ID` | — | Workspace ID from Step 1 (required; no default) |
-| `PORT` | `8000` | Agent server port. Must match `containerPort` in Kubernetes and the port mapped with `-p` in Docker. |
-
-## Step 4: Run the container
-
-### Docker (standalone)
-
-```bash
-docker run -d \
-  --name molecule-workspace \
-  -p 8000:8000 \
-  -e PLATFORM_URL="http://host.docker.internal:8080" \
-  -e WORKSPACE_ID="your-workspace-id" \
-  -e PORT=8000 \
-  "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
-```
-
-> **Note for Linux hosts:** Docker does not include `host.docker.internal` by default. On Linux, either add `--add-host=host.docker.internal:host-gateway` to the `docker run` command, or use the host machine's IP address directly (e.g. `http://192.168.1.100:8080`).
-
-### Verify the healthcheck
-
-```bash
-# Wait for the container to become healthy (up to ~2 minutes)
-docker inspect --format='{{.State.Health.Status}}' molecule-workspace
-
-# Expected output: healthy
-# Once healthy, the agent card is reachable:
-curl -s http://localhost:8000/.well-known/agent-card.json | python3 -m json.tool
-```
-
-### Docker Compose
-
-```yaml
-services:
-  molecule-workspace:
-    image: "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
-    ports:
-      - "8000:8000"
-    environment:
-      PLATFORM_URL: "http://host.docker.internal:8080"
-      WORKSPACE_ID: "your-workspace-id"
-      PORT: "8000"
-    # Linux hosts: add host.docker.internal resolution
-    # extra_hosts:
-    #   - "host.docker.internal:host-gateway"
-    restart: unless-stopped
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8000/.well-known/agent-card.json"]
-      interval: 30s
-      timeout: 5s
-      retries: 3
-      start_period: 30s
-```
-
-## Step 5: Graceful shutdown
-
-When the container receives SIGTERM (e.g. from `docker stop` or Kubernetes pod deletion), the workspace's uvicorn server initiates graceful shutdown: the heartbeat loop stops, active A2A tasks are given a grace period to complete, and any snapshotable state is persisted before the process exits.
-
-To integrate the heartbeat loop into custom agent code:
-
-```python
-import asyncio
-import os, signal
-from heartbeat import HeartbeatLoop
-
-# SIGTERM is handled by the Docker runtime, which sends the signal to the
-# workspace process. The workspace (via uvicorn) initiates graceful shutdown:
-# the heartbeat loop is stopped, any active adapter tasks are cancelled, and
-# in-flight A2A requests are given a grace period to complete.
-#
-# For custom integration with the heartbeat loop directly:
-async def main():
-    heartbeat = HeartbeatLoop(
-        platform_url=os.environ["PLATFORM_URL"],
-        workspace_id=os.environ["WORKSPACE_ID"],
-    )
-    heartbeat.start()
-    try:
-        await asyncio.Event().wait()  # keep running
-    finally:
-        await heartbeat.stop()
-        print("Heartbeat loop stopped.")
-```
-
-The Docker `stop` command sends SIGTERM and waits up to 10 seconds by default before sending SIGKILL. The healthcheck ensures orchestrators detect an unhealthy container before the SIGTERM timeout.
-
-## Kubernetes deployment
-
-For Kubernetes deployments, use the native liveness/readiness probe configuration instead of the Docker HEALTHCHECK:
-
-```yaml
-ports:
-  - name: http
-    containerPort: 8000
-livenessProbe:
-  httpGet:
-    path: /.well-known/agent-card.json
-    port: http
-  initialDelaySeconds: 30
-  periodSeconds: 30
-  timeoutSeconds: 5
-  failureThreshold: 3
-readinessProbe:
-  httpGet:
-    path: /.well-known/agent-card.json
-    port: http
-  initialDelaySeconds: 10
-  periodSeconds: 10
-  timeoutSeconds: 5
-  failureThreshold: 3
-terminationGracePeriodSeconds: 120
-```
-
-> **Note:** The Kubernetes `terminationGracePeriodSeconds` should exceed the liveness probe failure threshold so that the probe can register a failure before the pod is killed. With `periodSeconds: 30` and `failureThreshold: 3`, the probe does not register a failure until approximately 120–150s after the container becomes unhealthy. Set `terminationGracePeriodSeconds: 120` or higher.
-
-## Troubleshooting
-
-| Symptom | Cause | Fix |
-|---|---|---|
-| Container shows `unhealthy` after startup | Platform unreachable from container | Verify `PLATFORM_URL` uses `host.docker.internal` (Docker) or the correct host IP |
-| `curl: (7) Failed to connect` on healthcheck | Container not fully started | Wait up to 30s; increase `start_period` |
-| Agent not appearing on canvas | Wrong `WORKSPACE_ID` or expired token | Re-run registration; check platform logs |
-| `host.docker.internal` not resolved | Linux host without the Docker flag | Use `--add-host=host.docker.internal:host-gateway` or the host's LAN IP |
Author	SHA1	Message	Date
technical-writer	e2198ebe9c	Merge branch 'merge/pr27-sop-checklist-gate' into merge/integration	2026-05-13 11:08:32 +00:00
technical-writer	59b6662b94	Merge branch 'merge/pr30-dev-channels-flag' into merge/integration	2026-05-13 11:08:30 +00:00
technical-writer	412d3ea2f7	Merge branch 'merge/pr31-changelog-security' into merge/integration	2026-05-13 11:08:27 +00:00
technical-writer	7a2de6677a	docs(changelog): fix encoding — "after失败" → "after failure" Chinese characters embedded in English sentence. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 11:06:34 +00:00
technical-writer	186929bd9d	docs(changelog): fix encoding — "after失败" → "after failure" Chinese characters embedded in English sentence. Spotted during changelog duplicate-section merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 11:03:30 +00:00
documentation-specialist	6265ce5ec1	docs(security): add CWE-22 regression fix entry for 2026-05-13 Secret scan / secret-scan (pull_request) Successful in 26s Details CI / build (pull_request) Successful in 3m2s Details Pairs molecule-core#810 (Critical CWE-22 path traversal regression in org_import.go). Also adds full 2026-05-13 changelog entry covering: - CWE-22 path traversal fix (security section) - stop_event graceful shutdown feature (SDK Python #8) - PLATFORM_URL default alignment (workspace-runtime #12) - Canvas CI hardening (core #773/776/777) - Go lint CI hardening (core #781) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 08:23:48 +00:00
technical-writer	43b9d16503	docs(runtime-mcp): add dev-channels flag callout + See also cross-reference Secret scan / secret-scan (pull_request) Successful in 40s Details CI / build (pull_request) Successful in 2m22s Details - Add a Callout warning in the Claude Code setup section linking to the dev-channels-flag page — directly surfaces the bare-flag failure mode at the point where operators configure the integration. - Add dev-channels-flag to the See also section. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 05:20:39 +00:00
technical-writer	bc1102366d	docs(runtime-mcp): add dev-channels flag tagged-form requirement page Documents why Claude Code 2.1.x+ requires `--dangerously-load-development-channels server:molecule` (not the bare flag) for inline channel push from the molecule-mcp wheel. Covers: required tagged-allowlist form, the three-layer failure mode when the flag is bare, Molecule operator setup, SDK integration via extra_args, verification steps, and the research-preview gate forward note. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 05:19:19 +00:00
technical-writer	15a952c213	docs(changelog): fix duplicate 2026-05-10 sections and restore chronological order Secret scan / secret-scan (pull_request) Successful in 34s Details CI / build (pull_request) Successful in 2m15s Details - Merge two ## 2026-05-10 sections into one (was: first at line 63, second at line 173, with ## 2026-04-23 incorrectly sandwiched between them) - Move ## 2026-05-10 "Late-day updates" sub-section from ## 2026-04-23 into the merged ## 2026-05-10 section - Move ## 2026-04-23 section to its correct chronological position (after ## 2026-04-22, before ## 2026-04-17) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-13 05:03:21 +00:00
claude-ceo-assistant	edb6ee856b	ci: add SOP checklist gate Secret scan / secret-scan (pull_request) Successful in 10s Details CI / build (pull_request) Successful in 2m13s Details	2026-05-12 20:38:13 -07:00