Compare commits

..

2 Commits

Author SHA1 Message Date
infra-sre 431e0f6e12 fix(ci): pin publish workflows to docker-capable runners
Re-apply the fix from #599 with the prerequisite now satisfied:
molecule-ai/operator-config PR #30 registers the `docker` label on
all act_runners that mount /var/run/docker.sock.

Restore:
  runs-on: [ubuntu-latest, docker]

This routes publish-workspace-server-image and publish-canvas-image
jobs exclusively to runners where Docker daemon access is confirmed.
Eliminates the coin-flip failure mode where jobs land on socket-less
runners and fail at the Docker health check step.

PREREQUISITE: operator host must be rolled to pick up the updated
runner config before merging this PR. See internal#711.

Reverts: infra/revert-docker-runner-label (3206966e)
Closes: molecule-core#711

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 10:02:06 +00:00
infra-sre 42a2a05a77 fix(gitea-actions): replace workflow_run with push trigger on main
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 9s
Harness Replays / detect-changes (pull_request) Successful in 10s
CI / Detect changes (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 20s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 21s
qa-review / approved (pull_request) Failing after 13s
Harness Replays / Harness Replays (pull_request) Successful in 6s
security-review / approved (pull_request) Failing after 12s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m12s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m26s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m23s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m35s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: 7
gate-check-v3 / gate-check (pull_request) Successful in 5s
audit-force-merge / audit (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Failing after 5m38s
CI / all-required (pull_request) Failing after 1s
sop-checklist-gate / gate (pull_request) Successful in 19s
sop-tier-check / tier-check (pull_request) Successful in 16s
Rule 2 (Fatal): `on: workflow_run:` is not supported by Gitea 1.22.6
(verified via modules/actions/workflows.go enumeration; task #81).
Three workflow files were using it:

- redeploy-tenants-on-main.yml
- staging-verify.yml
- redeploy-tenants-on-staging.yml

Fix: replace `on: workflow_run: workflows: ['publish-workspace-server-image']`
with `on: push: branches: [main]: paths: ['.gitea/workflows/publish-workspace-server-image.yml']`.

The push trigger fires when the upstream workflow file is updated
(post-merge of publish-workspace-server-image), which is the best
available proxy for "publish succeeded" without workflow_run.

Also removed the `if: github.event.workflow_run.conclusion == 'success'`
conditionals (no longer applicable) and updated
`github.event.workflow_run.head_sha` references to `github.sha`.

continue-on-error: true on all three workflows means any semantic
regression from the trigger swap won't block merges during the Phase 3
surface-broken-shapes period (RFC #219 §1).

Closes #695.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 09:00:53 +00:00
19 changed files with 69 additions and 3862 deletions
@@ -1,438 +0,0 @@
#!/usr/bin/env python3
"""lint_continue_on_error_tracking — Tier 2e per internal#350.
Rule
----
Every `continue-on-error: true` directive in `.gitea/workflows/*.yml`
must be accompanied by a tracker reference comment within 2 lines
(above OR below the directive's line). The reference is one of:
* `# mc#NNNN` — molecule-core issue
* `# internal#NNNN` — molecule-ai/internal issue
The referenced issue must satisfy ALL of:
1. Exists (HTTP 200 on `/repos/{owner}/{name}/issues/{num}`)
2. `state == "open"`
3. `created_at` is ≤ MAX_AGE_DAYS days ago (default 14)
A passing reference establishes an audit trail and a forced renewal
cadence — after 14 days the issue must either be CLOSED (the masked
defect was fixed) or the comment must point at a NEW tracker
(deliberate decision to keep masking, requires a paper-trail).
The class this prevents
-----------------------
Phase-3-masked failures. `continue-on-error: true` on `platform-build`
had been hiding mc#664-class regressions for ~3 weeks before #656
surfaced them on 2026-05-12. A 14-day cap forces a tracker review
cycle and surfaces mask-drift within at most 14 days of the original
defect.
Behaviour-based gate
--------------------
We parse via PyYAML AST (per `feedback_behavior_based_ast_gates`) to
detect `continue-on-error: <truthy>` at job-key level, then map each
location back to its source line via PyYAML's line-tracking loader.
Comments are scanned from the raw text within a 2-line window of
that source line. Reformatting (block-scalar vs flow-style) does not
break the rule because the source-line anchor is the directive's
own line.
Exit codes
----------
0 — every `continue-on-error: true` has a passing tracker, OR
the issue-API endpoint returned 403/404 (token-scope; graceful
degrade per Tier 2a contract — surface via ::error:: on stderr
but don't red-X every PR over auth).
1 — at least one violation (missing/closed/too-old/non-existent
tracker).
2 — env contract violation, YAML parse error, or workflows-dir
missing.
Env
---
GITEA_TOKEN — read scope on the configured repos.
Auto-injected `GITHUB_TOKEN` works for same-repo
issue reads; for `internal#NNN` we need a token
with `molecule-ai/internal` read scope. Use
DRIFT_BOT_TOKEN (same persona as other Tier 2
lints).
GITEA_HOST — e.g. git.moleculesai.app
REPO — `owner/name` for `mc#NNNN` lookups
INTERNAL_REPO — `owner/name` for `internal#NNNN` lookups
(defaults to derived `molecule-ai/internal`)
WORKFLOWS_DIR — defaults to `.gitea/workflows`
MAX_AGE_DAYS — defaults to 14
Memory cross-links
------------------
- internal#350 (the RFC that specs this lint)
- mc#664 (the masked-3-weeks empirical case)
- feedback_chained_defects_in_never_tested_workflows
- feedback_behavior_based_ast_gates
- feedback_strict_root_only_after_class_a
"""
from __future__ import annotations
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
try:
import yaml
except ImportError:
sys.stderr.write(
"::error::PyYAML is required. Install with: pip install PyYAML\n"
)
sys.exit(2)
# ---------------------------------------------------------------------------
# Tracker comment regex.
# Matches: `# mc#1234`, `# internal#42`, `# mc#1234 - description`
# Also matches trackers embedded mid-sentence: `# see mc#1234 for details`
# Does NOT match: `# mc1234` (missing inner #), `mc#1234` (no leading
# comment `#`), `# MC#1234` (case-sensitive). The search is line-wide,
# not just at the comment-marker prefix — fixes false-negative when
# the tracker appears mid-sentence (e.g. `internal#350` after prose).
TRACKER_RE = re.compile(
r"(?P<slug>mc|internal)#(?P<num>\d+)\b"
)
# Truthy continue-on-error values we treat as "true". PyYAML decodes
# `continue-on-error: true` to Python `True`. `continue-on-error: "true"`
# decodes to the string "true" — Gitea's evaluator coerces strings,
# so we treat string-`"true"` (case-insensitive) as truthy too.
def _is_truthy_coe(v: Any) -> bool:
if v is True:
return True
if isinstance(v, str) and v.strip().lower() == "true":
return True
return False
# ---------------------------------------------------------------------------
# Env contract
# ---------------------------------------------------------------------------
def _env(key: str, default: str | None = None) -> str:
v = os.environ.get(key, default)
return v if v is not None else ""
def _require_env(key: str) -> str:
v = os.environ.get(key)
if not v:
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
return v
# ---------------------------------------------------------------------------
# PyYAML line-tracking loader. yaml.SafeLoader nodes carry
# `start_mark.line` (0-based); using construct_mapping with `deep=True`
# preserves that on every node. We need the line of each
# `continue-on-error` key so we can scan the source for comments
# near it.
# ---------------------------------------------------------------------------
class _LineLoader(yaml.SafeLoader):
"""SafeLoader that annotates every dict with `__line__: {key: line}`."""
def _construct_mapping(loader: yaml.SafeLoader, node: yaml.MappingNode) -> dict:
mapping = loader.construct_mapping(node, deep=True)
# Annotate per-key source lines so we can locate `continue-on-error`.
lines: dict[str, int] = {}
for k_node, _v_node in node.value:
try:
key = loader.construct_object(k_node, deep=True)
except Exception:
continue
if isinstance(key, (str, int, bool)):
lines[str(key)] = k_node.start_mark.line + 1 # 1-based
if isinstance(mapping, dict):
mapping["__lines__"] = lines
return mapping
_LineLoader.add_constructor(
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, _construct_mapping
)
# ---------------------------------------------------------------------------
# Issue lookup
# ---------------------------------------------------------------------------
def fetch_issue(slug_kind: str, num: int) -> tuple[str, dict | None]:
"""Return `(status, payload_or_none)`.
status ∈ {"ok", "not_found", "forbidden", "error"}.
"""
repo = (
_env("REPO") if slug_kind == "mc" else _env("INTERNAL_REPO")
)
if not repo:
# Fall through gracefully — caller treats as 403 (token-scope).
return ("forbidden", None)
host = _env("GITEA_HOST")
token = _env("GITEA_TOKEN")
url = f"https://{host}/api/v1/repos/{repo}/issues/{num}"
req = urllib.request.Request(
url,
headers={
"Authorization": f"token {token}",
"Accept": "application/json",
},
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
return ("ok", json.loads(resp.read()))
except urllib.error.HTTPError as e:
if e.code == 404:
return ("not_found", None)
if e.code in (401, 403):
return ("forbidden", None)
return ("error", None)
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
return ("error", None)
# ---------------------------------------------------------------------------
# Locate every continue-on-error: <truthy> in a workflow doc, with line.
# ---------------------------------------------------------------------------
def find_coe_truthies(
doc: Any, raw_lines: list[str]
) -> list[tuple[str, int]]:
"""Return list of (job_key, source_line_1based).
`doc` is the LineLoader-parsed mapping. We descend `jobs.<key>` and
return only those whose value is truthy per `_is_truthy_coe`.
Job-step continue-on-error is intentionally NOT considered: it
suppresses step-level failure rollup only, not job-level. The
masking class this lint targets is the job-level rollup.
"""
out: list[tuple[str, int]] = []
if not isinstance(doc, dict):
return out
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return out
for jkey, jbody in jobs.items():
if jkey == "__lines__":
continue
if not isinstance(jbody, dict):
continue
if "continue-on-error" not in jbody:
continue
v = jbody["continue-on-error"]
if not _is_truthy_coe(v):
continue
line = jbody.get("__lines__", {}).get("continue-on-error")
if not line:
# PyYAML line-tracking shouldn't miss but guard for safety.
# Fall back to grepping the raw text.
line = _grep_first_coe_line(raw_lines, jkey) or 1
out.append((str(jkey), int(line)))
return out
def _grep_first_coe_line(raw_lines: list[str], jkey: str) -> int | None:
"""Fallback: find the first `continue-on-error:` line after a `jkey:` line."""
saw_job = False
for i, line in enumerate(raw_lines, start=1):
if re.match(rf"^\s*{re.escape(jkey)}\s*:", line):
saw_job = True
continue
if saw_job and "continue-on-error" in line:
return i
return None
# ---------------------------------------------------------------------------
# Scan window for tracker comment
# ---------------------------------------------------------------------------
WINDOW = 2 # lines above OR below the directive's line (inclusive)
def find_tracker_in_window(
raw_lines: list[str], line_1based: int
) -> tuple[str, int] | None:
"""Return (slug, num) if a `# mc#NNN`/`# internal#NNN` appears
in raw_lines within ±WINDOW lines of `line_1based`. None otherwise.
We scan the directive's own line (it may carry an inline comment
like `continue-on-error: true # mc#3`) plus ±WINDOW.
"""
lo = max(1, line_1based - WINDOW)
hi = min(len(raw_lines), line_1based + WINDOW)
for i in range(lo, hi + 1):
line = raw_lines[i - 1]
# Only the comment portion (after `#`) is considered, so
# trailing-inline comments on the directive line are matched.
m = TRACKER_RE.search(line)
if m:
return (m.group("slug"), int(m.group("num")))
return None
# ---------------------------------------------------------------------------
# Tracker validation
# ---------------------------------------------------------------------------
def validate_tracker(
slug: str, num: int, max_age_days: int
) -> tuple[bool, str]:
"""Return (ok?, reason). On 403, ok=True is returned with reason
explaining graceful-degrade — caller treats 403 as a non-fatal
skip (same as Tier 2a contract).
"""
status, payload = fetch_issue(slug, num)
if status == "forbidden":
sys.stderr.write(
f"::error::issue {slug}#{num} unreadable (HTTP 403 — token "
f"scope). Cannot validate; skipping this check to avoid "
f"red-X on every PR. Fix the token, not the lint.\n"
)
return (True, "forbidden — skipped")
if status == "not_found":
return (False, f"{slug}#{num} does not exist (404)")
if status == "error":
sys.stderr.write(
f"::error::issue {slug}#{num} fetch errored — treating as "
f"unverified, skipping this check.\n"
)
return (True, "fetch-error — skipped")
assert payload is not None
state = payload.get("state", "")
if state != "open":
return (False, f"{slug}#{num} state={state!r} (must be open)")
created = payload.get("created_at", "")
try:
# Gitea returns ISO-8601 with timezone; Python 3.11+
# fromisoformat handles `Z` suffix natively from 3.11. Older
# runtimes need explicit replace.
created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
except ValueError:
return (False, f"{slug}#{num} created_at unparseable: {created!r}")
age = datetime.now(timezone.utc) - created_dt
# Inclusive boundary at MAX_AGE_DAYS: `age.days` truncates to a
# whole-day floor, so an issue created 14d 0h 5m ago has
# `age.days == 14` and passes; one created 15d 0h 0m ago has
# `age.days == 15` and fails. This is the convention specified
# in internal#350 ("≤14 days old").
if age.days > max_age_days:
return (
False,
f"{slug}#{num} is {age.days} days old (>{max_age_days}d cap). "
f"Close-or-renew the tracker.",
)
return (True, f"{slug}#{num} open, {age.days}d old, ≤{max_age_days}d")
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def _iter_workflow_files(wf_dir: Path) -> list[Path]:
return sorted(list(wf_dir.glob("*.yml")) + list(wf_dir.glob("*.yaml")))
def run() -> int:
wf_dir = Path(_env("WORKFLOWS_DIR", ".gitea/workflows"))
max_age = int(_env("MAX_AGE_DAYS", "14"))
# Defaults for INTERNAL_REPO when unset (best-effort guess based on
# the convention `mc#` = same repo, `internal#` = molecule-ai/internal).
if not os.environ.get("INTERNAL_REPO"):
os.environ["INTERNAL_REPO"] = "molecule-ai/internal"
if not wf_dir.is_dir():
sys.stderr.write(
f"::error::workflows directory not found: {wf_dir}\n"
)
return 2
yml_files = _iter_workflow_files(wf_dir)
if not yml_files:
print(f"::notice::no workflow files under {wf_dir}; nothing to lint.")
return 0
violations: list[str] = []
notices: list[str] = []
total_coe_true = 0
for path in yml_files:
raw = path.read_text(encoding="utf-8")
raw_lines = raw.splitlines()
try:
doc = yaml.load(raw, Loader=_LineLoader)
except yaml.YAMLError as e:
sys.stderr.write(
f"::error file={path}::YAML parse error: {e}. Skipping "
f"this file (lint-workflow-yaml will catch separately).\n"
)
continue
coe_locs = find_coe_truthies(doc, raw_lines)
for jkey, line in coe_locs:
total_coe_true += 1
tracker = find_tracker_in_window(raw_lines, line)
if tracker is None:
violations.append(
f"::error file={path},line={line}::lint-continue-on-error-"
f"tracking (Tier 2e): job '{jkey}' has "
f"`continue-on-error: true` at line {line} with no "
f"`# mc#NNNN` or `# internal#NNNN` tracker comment "
f"within {WINDOW} lines. Add a tracker reference so "
f"this mask has a forced 14-day renewal cycle. "
f"Memory: feedback_chained_defects_in_never_tested_workflows."
)
continue
slug, num = tracker
ok, reason = validate_tracker(slug, num, max_age)
if ok:
notices.append(
f"::notice::{path.name} job '{jkey}' (line {line}): "
f"{reason}"
)
else:
violations.append(
f"::error file={path},line={line}::lint-continue-on-error-"
f"tracking (Tier 2e): job '{jkey}' "
f"`continue-on-error: true` references {slug}#{num}, "
f"but {reason}. FIX: close/fix the underlying defect "
f"and flip continue-on-error: false, OR file a fresh "
f"tracker and update the comment."
)
for n in notices:
print(n)
if violations:
print(
f"::error::lint-continue-on-error-tracking: "
f"{len(violations)} violation(s) across {len(yml_files)} "
f"workflow file(s) (of {total_coe_true} `continue-on-error: "
f"true` directives in total)."
)
for v in violations:
print(v)
return 1
print(
f"::notice::lint-continue-on-error-tracking: "
f"all {total_coe_true} `continue-on-error: true` directive(s) "
f"have valid trackers (open, ≤{max_age}d old)."
)
return 0
if __name__ == "__main__":
sys.exit(run())
@@ -1,681 +0,0 @@
#!/usr/bin/env python3
"""lint-pre-flip-continue-on-error — block a PR that flips a job from
``continue-on-error: true`` to ``continue-on-error: false`` (or removes
the key while the base had it ``true``) without proof that the job's
recent runs on the target branch are actually green.
Empirical class — PR #656 / mc#664:
PR #656 (RFC internal#219 Phase 4) flipped 5 ``platform-build``-class
jobs ``continue-on-error: true → false`` on the basis of a
"verified green on main via combined-status check". But that "green"
was the LIE produced by the prior ``continue-on-error: true``:
Gitea Quirk #10 (internal#342 + dup #287) — when a step inside a
job marked ``continue-on-error: true`` fails, the job-level status
is still rolled up as ``success``. So the precondition the PR
claimed to verify was structurally fooled by the bug being
flipped.
mc#664 then captured the surfaced defects (2 unrelated, mutually-
masked regressions):
Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e)
"run-log-grep-before-flip": pull the actual run log + grep for
``--- FAIL`` / ``FAIL\\s`` BEFORE flipping; don't trust the masked
combined-status.
This script structurally enforces that rule at PR time.
How it works (one PR tick):
1. Parse the diff: compare ``.gitea/workflows/*.yml`` at PR base
vs PR head. For each file present in both, parse the YAML AST
and walk ``jobs.<key>.continue-on-error`` on each side. A
"flip" is base ∈ {true} AND head ∈ {false, None/absent}. We
coerce truthy/falsy per YAML semantics (PyYAML normalizes
``true``/``True``/``yes`` to ``True``).
2. For each flipped job, derive its commit-status context name as
``"{workflow.name} / {job.name or job.key} (push)"`` — that's
how Gitea Actions emits the context for runs on
``main``/``staging`` (push event, see also expected_context()
in ci-required-drift.py).
3. Pull the last N commits of the target branch (PR base), fetch
combined commit-status per commit, scan ``statuses[]`` for
contexts matching ANY of the flipped jobs. For each match,
fetch the actual run log via the web-UI route
``{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs``
(per memory ``reference_gitea_actions_log_fetch`` — Gitea 1.22.6
lacks REST ``/actions/runs/*`` endpoints; the web-UI route is the
only working path; see ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``).
4. Grep each log for the Go-test failure markers ``--- FAIL`` /
``FAIL\\s+<package>`` AND the bash-step error sentinel
``::error::``. If ANY recent log shows any of these AND the
status itself reads ``success``, the job was masked. ``::error::``
the flip with the offending test name + offending run URL +
the regression commit (HEAD of the run).
5. Exit 1 if any flips have at least one masked run; exit 0
otherwise.
Halt-on-noise contract:
- If a recent log fetch 404s (already-pruned-via-act_runner-gc,
transient gitea-web outage): emit ``::warning::`` and treat the
run as "log unavailable" — does NOT block the flip; logged so
a curious reviewer can re-run.
- If a flipped job has ZERO recent runs on the target branch (newly
added workflow): emit ``::warning::`` "no run history to verify"
and allow the flip. This is the only way a NEW workflow can ever
ship with ``continue-on-error: false``; otherwise we'd have a
chicken-and-egg.
Behavior-based AST gate per ``feedback_behavior_based_ast_gates``:
- YAML parsed via PyYAML safe_load on BOTH sides of the diff
- No grep-by-line — formatting changes (comment churn, key order)
don't false-positive a flip
- Job-key match — so a rename ``platform-build → core-be-build``
appears as a DELETE + an ADD, not a flip (the delete side has no
new value to compare against; the add side has no base side).
Run locally (works against this repo, requires PyYAML + Gitea token
that can read combined-commit-status):
GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app \\
REPO=molecule-ai/molecule-core BASE_REF=main \\
BASE_SHA=$(git rev-parse origin/main) \\
HEAD_SHA=$(git rev-parse HEAD) \\
python3 .gitea/scripts/lint_pre_flip_continue_on_error.py \\
--dry-run
Cross-links: PR#656, mc#664, PR#665 (the interim re-mask),
Quirk #10 (internal#342 + dup #287), hongming-pc2 charter §SOP-N
rule (e), feedback_strict_root_only_after_class_a,
feedback_no_shared_persona_token_use.
"""
from __future__ import annotations
import argparse
import json
import os
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
import yaml # PyYAML 6.0.2 — installed by the workflow before this runs.
# --------------------------------------------------------------------------
# Environment (read at module-import; runtime contract enforced in main())
# --------------------------------------------------------------------------
def _env(key: str, *, default: str = "") -> str:
return os.environ.get(key, default)
GITEA_TOKEN = _env("GITEA_TOKEN")
GITEA_HOST = _env("GITEA_HOST")
REPO = _env("REPO")
BASE_REF = _env("BASE_REF", default="main")
BASE_SHA = _env("BASE_SHA")
HEAD_SHA = _env("HEAD_SHA")
# How many recent commits to scan on the target branch. 5 by default;
# enough to catch a job that only fails intermittently, not so many
# that the script paginates needlessly. Per spec.
RECENT_COMMITS_N = int(_env("RECENT_COMMITS_N", default="5"))
OWNER, NAME = (REPO.split("/", 1) + [""])[:2] if REPO else ("", "")
API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""
WEB = f"https://{GITEA_HOST}" if GITEA_HOST else ""
# Failure markers we grep for in the run log.
# --- FAIL — Go test failure marker
# FAIL\s — `FAIL github.com/x/y` package-level rollup
# ::error:: — bash-step `::error::` lines (the lint-curl-status-capture
# pattern: a `python3 <<PY` block writing `::error::` then
# sys.exit(1); also any shell `echo "::error::..."` from
# jobs that wrap pytest/eslint/etc. and convert
# non-zero exits into masked-by-CoE status)
FAIL_PATTERNS = (
"--- FAIL",
"FAIL\t",
"FAIL ",
"::error::",
)
def _require_runtime_env() -> None:
for key in ("GITEA_TOKEN", "GITEA_HOST", "REPO", "BASE_REF", "BASE_SHA", "HEAD_SHA"):
if not os.environ.get(key):
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
# --------------------------------------------------------------------------
# Tiny HTTP helper (no requests dependency)
# Mirrors the api()/ApiError contract in ci-required-drift.py +
# main-red-watchdog.py per feedback_api_helper_must_raise_not_return_dict.
# --------------------------------------------------------------------------
class ApiError(RuntimeError):
"""Raised when a Gitea API/web call cannot be trusted to have succeeded.
Soft-failure on non-2xx is the duplicate-write bug factory in
find-or-create flows (PR #112 Five-Axis). Here it would mean a
transient gitea-web 502 silently allows a flip whose recent runs
we couldn't actually verify — exactly the regression class this
lint exists to close.
"""
def http(
method: str,
url: str,
*,
body: dict | None = None,
headers: dict[str, str] | None = None,
expect_json: bool = True,
timeout: int = 30,
) -> tuple[int, Any, bytes]:
"""Tiny HTTP helper around urllib.
Returns (status, parsed_or_None, raw_bytes). Raises ApiError on any
non-2xx response. ``expect_json=False`` returns raw bytes in the
parsed slot (for log-fetch from the web-UI which returns text/plain).
"""
final_headers = {
"Authorization": f"token {GITEA_TOKEN}",
"Accept": "application/json" if expect_json else "text/plain",
}
if headers:
final_headers.update(headers)
data = None
if body is not None:
data = json.dumps(body).encode("utf-8")
final_headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, method=method, data=data, headers=final_headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
raw = resp.read()
status = resp.status
except urllib.error.HTTPError as e:
raw = e.read() or b""
status = e.code
if not (200 <= status < 300):
snippet = raw[:500].decode("utf-8", errors="replace") if raw else ""
raise ApiError(f"{method} {url} → HTTP {status}: {snippet}")
if not expect_json:
return status, raw, raw
if not raw:
return status, None, raw
try:
return status, json.loads(raw), raw
except json.JSONDecodeError as e:
raise ApiError(f"{method} {url} → HTTP {status} but body is not JSON: {e}") from e
def api(method: str, path: str, *, body: dict | None = None, query: dict[str, str] | None = None) -> tuple[int, Any]:
"""Read-shaped Gitea REST helper. Path is API-relative (``/repos/...``)."""
url = f"{API}{path}"
if query:
url = f"{url}?{urllib.parse.urlencode(query)}"
status, parsed, _ = http(method, url, body=body, expect_json=True)
return status, parsed
# --------------------------------------------------------------------------
# YAML parsing — coerce truthy/falsy for continue-on-error
# --------------------------------------------------------------------------
def _coerce_coe(val: Any) -> bool:
"""Coerce a continue-on-error YAML value to bool.
PyYAML safe_load normalizes ``true``/``True``/``yes``/``on`` to
Python ``True`` and ``false``/``False``/``no``/``off`` / absence
to ``False`` (we treat absence/None as False here too — that's the
GitHub Actions default semantics).
Edge cases:
- String ``"true"`` (quoted in YAML) — kept as the string
``"true"``, falsy under bool() but a flip we DO care about
catching. Normalize string forms case-insensitively to bool
so the diff is consistent with the runtime behavior of
Gitea Actions, which YAML-parses the same way.
"""
if isinstance(val, bool):
return val
if val is None:
return False
if isinstance(val, str):
return val.strip().lower() in ("true", "yes", "on", "1")
return bool(val)
def jobs_coe_map(workflow_doc: dict) -> dict[str, bool]:
"""Return ``{job_key: continue_on_error_bool}`` for every job in
the workflow. Job-level ``continue-on-error`` only — does NOT
descend into per-step ``continue-on-error`` (step-level CoE
masking is a separate class and is handled by the test suite
+ reviewer, not by this gate — see Future Work in the workflow
YAML).
"""
out: dict[str, bool] = {}
jobs = workflow_doc.get("jobs")
if not isinstance(jobs, dict):
return out
for key, job in jobs.items():
if not isinstance(job, dict):
continue
out[key] = _coerce_coe(job.get("continue-on-error"))
return out
def workflow_name(workflow_doc: dict, *, fallback: str = "") -> str:
"""Top-level ``name:`` of the workflow. Falls back to the filename
(without extension) per Gitea Actions semantics."""
n = workflow_doc.get("name")
if isinstance(n, str) and n.strip():
return n.strip()
return fallback
def job_display_name(workflow_doc: dict, job_key: str) -> str:
"""``jobs.<key>.name`` if present, else the key. Mirrors
expected_context() in ci-required-drift.py."""
job = workflow_doc.get("jobs", {}).get(job_key)
if isinstance(job, dict):
n = job.get("name")
if isinstance(n, str) and n.strip():
return n.strip()
return job_key
def context_name(workflow_name_str: str, job_name_str: str, event: str = "push") -> str:
"""Render the commit-status context the way Gitea Actions emits it.
Default ``event="push"`` because recent-runs-on-main are push events;
callers can override to ``"pull_request"`` for PR-context lookups."""
return f"{workflow_name_str} / {job_name_str} ({event})"
# --------------------------------------------------------------------------
# Diff detection — flips, not arbitrary changes
# --------------------------------------------------------------------------
def detect_flips(
base_workflows: dict[str, str],
head_workflows: dict[str, str],
) -> list[dict]:
"""Compare per-file CoE maps; return a list of flip records.
Inputs are ``{path: yaml_text}`` for both sides. Output records
have the shape::
{
"workflow_path": ".gitea/workflows/ci.yml",
"workflow_name": "CI",
"job_key": "platform-build",
"job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)",
}
A flip is base[CoE] ∈ {True} AND head[CoE] ∈ {False}. Files
only present on one side are skipped — adding a new workflow
with ``CoE: false`` is fine (no history to mask), and removing
a workflow can't possibly flip anything.
"""
flips: list[dict] = []
for path, base_text in base_workflows.items():
if path not in head_workflows:
continue
try:
base_doc = yaml.safe_load(base_text) or {}
head_doc = yaml.safe_load(head_workflows[path]) or {}
except yaml.YAMLError as e:
# Don't block on a parse error — the YAML lint workflows
# catch invalid YAML separately. Just warn so the failing
# file is visible.
sys.stderr.write(f"::warning file={path}::YAML parse error: {e}\n")
continue
if not isinstance(base_doc, dict) or not isinstance(head_doc, dict):
continue
base_map = jobs_coe_map(base_doc)
head_map = jobs_coe_map(head_doc)
wf_name = workflow_name(head_doc, fallback=os.path.basename(path).rsplit(".", 1)[0])
for job_key, base_val in base_map.items():
if job_key not in head_map:
continue # job removed — not a flip
if base_val is True and head_map[job_key] is False:
flips.append({
"workflow_path": path,
"workflow_name": wf_name,
"job_key": job_key,
"job_name": job_display_name(head_doc, job_key),
"context": context_name(wf_name, job_display_name(head_doc, job_key), "push"),
})
return flips
# --------------------------------------------------------------------------
# Git: snapshot every .gitea/workflows/*.yml at a SHA (no checkout)
# --------------------------------------------------------------------------
def _git(*args: str, cwd: str | None = None) -> str:
"""Run ``git`` and return stdout (text)."""
result = subprocess.run(
["git", *args],
capture_output=True,
text=True,
check=False,
cwd=cwd,
)
if result.returncode != 0:
raise RuntimeError(f"git {args!r} failed: {result.stderr.strip()}")
return result.stdout
def workflows_at_sha(sha: str, *, repo_dir: str | None = None) -> dict[str, str]:
"""Read every ``.gitea/workflows/*.yml`` blob at ``sha``.
Uses ``git ls-tree`` + ``git show`` so we never need to check out
the SHA (the workflow runs on the PR head; the base SHA is
fetched, not checked out).
"""
out: dict[str, str] = {}
listing = _git("ls-tree", "-r", "--name-only", sha, ".gitea/workflows/", cwd=repo_dir)
for line in listing.splitlines():
line = line.strip()
if not line.endswith((".yml", ".yaml")):
continue
try:
blob = _git("show", f"{sha}:{line}", cwd=repo_dir)
except RuntimeError:
# Symlink or other non-blob; skip.
continue
out[line] = blob
return out
# --------------------------------------------------------------------------
# Gitea: recent commits + per-commit combined status + log fetch
# --------------------------------------------------------------------------
def recent_commits_on_branch(branch: str, n: int) -> list[str]:
"""Last `n` commit SHAs on ``branch`` (oldest→newest is fine; we
treat them as a set). Uses the REST ``/commits`` endpoint with
``sha=branch&limit=n``."""
_, body = api(
"GET",
f"/repos/{OWNER}/{NAME}/commits",
query={"sha": branch, "limit": str(n)},
)
if not isinstance(body, list):
raise ApiError(f"/commits for {branch} returned non-list: {type(body).__name__}")
out: list[str] = []
for c in body:
if isinstance(c, dict):
sha = c.get("sha") or (c.get("commit", {}) or {}).get("id")
if isinstance(sha, str) and len(sha) >= 7:
out.append(sha)
return out
def combined_status(sha: str) -> dict:
"""Combined commit status for a SHA. Same shape as
``main-red-watchdog.get_combined_status``."""
_, body = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status")
if not isinstance(body, dict):
raise ApiError(f"combined-status for {sha} not a dict")
return body
def _entry_state(s: dict) -> str:
"""Per-entry state — Gitea 1.22.6 schema asymmetry: top-level
uses ``state``, per-entry uses ``status``. Defensive fallback per
main-red-watchdog.py line 233."""
return s.get("status") or s.get("state") or ""
def fetch_log(target_url: str) -> str | None:
"""Fetch a job log given its web-UI ``target_url`` (e.g.
``/molecule-ai/molecule-core/actions/runs/13494/jobs/0``).
Per ``reference_gitea_actions_log_fetch``: append ``/logs`` to the
job route. Per ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``:
Gitea 1.22.6 lacks the REST ``/api/v1/.../actions/runs/*`` path; the
web-UI route is the only working endpoint until 1.24+.
Returns the log text on success, ``None`` on 404 / log-pruned /
network error (caller treats None as "log unavailable, warn-not-fail").
"""
if not target_url:
return None
# Normalize: target_url may be relative ("/owner/repo/...") or
# absolute. Both need ``/logs`` appended to the job sub-path.
if target_url.startswith("/"):
url = f"{WEB}{target_url}"
else:
url = target_url
if not url.endswith("/logs"):
url = f"{url}/logs"
try:
_, body, _ = http("GET", url, expect_json=False, timeout=60)
except ApiError as e:
sys.stderr.write(f"::warning::log fetch failed for {url}: {e}\n")
return None
if isinstance(body, bytes):
return body.decode("utf-8", errors="replace")
return None
def grep_fail_markers(log_text: str) -> list[str]:
"""Return up to 5 sample matching lines for any FAIL_PATTERNS hit.
Empty list = clean log."""
matches: list[str] = []
for line in log_text.splitlines():
for pat in FAIL_PATTERNS:
if pat in line:
# Truncate to keep error output bounded.
matches.append(line.strip()[:240])
break
if len(matches) >= 5:
break
return matches
# --------------------------------------------------------------------------
# Verification: for one flip, scan recent runs on BASE_REF
# --------------------------------------------------------------------------
def verify_flip(flip: dict, branch: str, n: int) -> dict:
"""Scan the last ``n`` commits on ``branch``. For each commit whose
combined status contains a context matching ``flip["context"]``,
fetch the run log and grep for FAIL markers.
Returns::
{
"flip": flip,
"checked_commits": int, # how many commits had a matching context
"masked_runs": [ # runs where log shows FAIL despite status==success
{"sha": "...", "status": "success", "target_url": "...", "samples": [...]},
...
],
"fail_runs": [ # runs where status itself is failure/error
{"sha": "...", "status": "failure", "target_url": "...", "samples": [...]},
...
],
"warnings": [str], # log-unavailable warnings (not blocking)
}
Blocking condition: ``masked_runs`` OR ``fail_runs`` non-empty.
A ``success`` status with a clean log is the only "OK to flip"
outcome (per hongming-pc2 §SOP-N rule (e)).
"""
target_context = flip["context"]
result = {
"flip": flip,
"checked_commits": 0,
"masked_runs": [],
"fail_runs": [],
"warnings": [],
}
shas = recent_commits_on_branch(branch, n)
if not shas:
result["warnings"].append(
f"no recent commits on {branch} (cannot verify flip)"
)
return result
for sha in shas:
try:
status_doc = combined_status(sha)
except ApiError as e:
result["warnings"].append(f"combined-status for {sha}: {e}")
continue
statuses = status_doc.get("statuses") or []
# First entry matching the context name. Newest SHAs come
# first; one entry per context per SHA is the usual shape.
for s in statuses:
if not isinstance(s, dict):
continue
if s.get("context") != target_context:
continue
result["checked_commits"] += 1
state = _entry_state(s)
target_url = s.get("target_url") or ""
log_text = fetch_log(target_url)
if log_text is None:
result["warnings"].append(
f"log unavailable for {sha} {target_context}"
)
# Still record the status itself if it's red — that's
# a hard signal that doesn't need log access.
if state in ("failure", "error"):
result["fail_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": ["[log unavailable; status itself is " + state + "]"],
})
break
samples = grep_fail_markers(log_text)
if state in ("failure", "error"):
result["fail_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": samples or ["[no FAIL markers found but status is " + state + "]"],
})
elif samples and state == "success":
# The bug class: status==success while log shows FAIL.
# That's exactly Quirk #10 (continue-on-error masking).
result["masked_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": samples,
})
# Either way, we matched one context entry for this SHA;
# don't keep looping `statuses[]`.
break
if result["checked_commits"] == 0:
result["warnings"].append(
f"no runs of {target_context!r} found in the last {n} commits on "
f"{branch} — cannot verify; allowing flip with warning"
)
return result
# --------------------------------------------------------------------------
# Report rendering
# --------------------------------------------------------------------------
def render_flip_report(verdict: dict) -> str:
flip = verdict["flip"]
lines = [
f"job: {flip['job_key']} ({flip['context']})",
f" workflow: {flip['workflow_path']}",
f" checked_commits: {verdict['checked_commits']}",
]
for run in verdict["fail_runs"]:
url = run["target_url"]
# target_url may be relative; render the absolute form for
# click-through.
if url.startswith("/"):
url = f"{WEB}{url}"
lines.append(f" fail run {run['sha'][:10]} (status={run['status']}): {url}")
for sample in run["samples"]:
lines.append(f" | {sample}")
for run in verdict["masked_runs"]:
url = run["target_url"]
if url.startswith("/"):
url = f"{WEB}{url}"
lines.append(
f" MASKED run {run['sha'][:10]} (status=success, log shows FAIL): {url}"
)
for sample in run["samples"]:
lines.append(f" | {sample}")
for w in verdict["warnings"]:
lines.append(f" warning: {w}")
return "\n".join(lines)
# --------------------------------------------------------------------------
# Main
# --------------------------------------------------------------------------
def _parse_args(argv: list[str] | None = None) -> argparse.Namespace:
p = argparse.ArgumentParser(
prog="lint-pre-flip-continue-on-error",
description="Block a PR that flips continue-on-error true→false "
"without proof recent runs are actually green.",
)
p.add_argument(
"--dry-run",
action="store_true",
help="Detect + print findings to stdout; never exit non-zero. "
"Useful for local testing.",
)
return p.parse_args(argv)
def main(argv: list[str] | None = None) -> int:
args = _parse_args(argv)
_require_runtime_env()
base_workflows = workflows_at_sha(BASE_SHA)
head_workflows = workflows_at_sha(HEAD_SHA)
flips = detect_flips(base_workflows, head_workflows)
if not flips:
print("::notice::no continue-on-error true→false flips in this PR")
return 0
print(f"::notice::detected {len(flips)} continue-on-error true→false flip(s); verifying recent runs on {BASE_REF}")
bad_flips: list[dict] = []
for flip in flips:
verdict = verify_flip(flip, BASE_REF, RECENT_COMMITS_N)
report = render_flip_report(verdict)
if verdict["fail_runs"] or verdict["masked_runs"]:
print(f"::error file={flip['workflow_path']}::flip of {flip['job_key']} "
f"({flip['context']}) blocked — recent runs on {BASE_REF} show "
f"FAIL markers OR are red. Pull each run log below + grep "
f"`--- FAIL` / `FAIL ` / `::error::` — DON'T trust the masked "
f"combined-status. See hongming-pc2 charter §SOP-N rule (e). "
f"PR#656 / mc#664 reference class.")
bad_flips.append(verdict)
else:
print(f"::notice::flip of {flip['job_key']} ({flip['context']}) is safe — "
f"{verdict['checked_commits']} recent run(s), no FAIL markers")
# Always print the per-flip detail block so the human-readable
# report is in the run log for both safe and unsafe flips.
print(f"::group::flip detail: {flip['job_key']}")
print(report)
print("::endgroup::")
if bad_flips and not args.dry_run:
print(f"::error::{len(bad_flips)}/{len(flips)} flip(s) failed pre-flip verification")
return 1
if bad_flips and args.dry_run:
print(f"::warning::[dry-run] {len(bad_flips)}/{len(flips)} flip(s) WOULD fail; exit 0 forced")
return 0
if __name__ == "__main__":
sys.exit(main())
@@ -1,505 +0,0 @@
"""Unit tests for .gitea/scripts/lint_pre_flip_continue_on_error.py.
These tests pin the pure-logic surface (flip detection + per-flip
verdict aggregation) without making real HTTP calls. The end-to-end
git ls-tree + Gitea API path is exercised by running the workflow
against real PRs.
Run locally::
python3 -m unittest .gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py -v
Mirrors the pattern in scripts/ops/test_check_migration_collisions.py
+ scripts/test_build_runtime_package.py.
"""
from __future__ import annotations
import importlib.util
import os
import sys
import unittest
from pathlib import Path
from unittest import mock
# Load the script as a module without invoking main(). Tests must NOT
# depend on the full runtime env contract (GITEA_TOKEN etc.), so we
# import individual functions and stub the network surface explicitly.
SCRIPT_PATH = Path(__file__).resolve().parent.parent / "lint_pre_flip_continue_on_error.py"
spec = importlib.util.spec_from_file_location("lpfc", SCRIPT_PATH)
lpfc = importlib.util.module_from_spec(spec)
spec.loader.exec_module(lpfc)
# --------------------------------------------------------------------------
# Fixtures: minimal valid workflow YAML on each side of a "diff"
# --------------------------------------------------------------------------
CI_YML_BASE = """\
name: CI
on:
push:
branches: [main]
jobs:
platform-build:
name: Platform (Go)
runs-on: ubuntu-latest
continue-on-error: true
steps:
- run: echo platform
canvas-build:
name: Canvas (Next.js)
runs-on: ubuntu-latest
continue-on-error: true
steps:
- run: echo canvas
all-required:
runs-on: ubuntu-latest
continue-on-error: true
needs: [platform-build, canvas-build]
steps:
- run: echo ok
"""
CI_YML_HEAD_FLIPPED = """\
name: CI
on:
push:
branches: [main]
jobs:
platform-build:
name: Platform (Go)
runs-on: ubuntu-latest
continue-on-error: false
steps:
- run: echo platform
canvas-build:
name: Canvas (Next.js)
runs-on: ubuntu-latest
continue-on-error: false
steps:
- run: echo canvas
all-required:
runs-on: ubuntu-latest
continue-on-error: true
needs: [platform-build, canvas-build]
steps:
- run: echo ok
"""
CI_YML_HEAD_NO_DIFF = CI_YML_BASE # identical to base, no flip
# --------------------------------------------------------------------------
# 1. CoE coercion (truthy/falsy/quoted/absent)
# --------------------------------------------------------------------------
class TestCoerceCoE(unittest.TestCase):
def test_python_bool_true(self):
self.assertTrue(lpfc._coerce_coe(True))
def test_python_bool_false(self):
self.assertFalse(lpfc._coerce_coe(False))
def test_none_is_false(self):
# GitHub Actions default: absent == false.
self.assertFalse(lpfc._coerce_coe(None))
def test_string_true_lowercase(self):
# Quoted "true" in YAML — Gitea Actions normalizes to True.
self.assertTrue(lpfc._coerce_coe("true"))
def test_string_True_titlecase(self):
self.assertTrue(lpfc._coerce_coe("True"))
def test_string_yes(self):
# YAML 1.1 truthy form.
self.assertTrue(lpfc._coerce_coe("yes"))
def test_string_false(self):
self.assertFalse(lpfc._coerce_coe("false"))
def test_string_random_falsy(self):
# An unrecognized string is treated as falsy — safer than
# silently coercing "maybe" to True and false-positiving a
# flip.
self.assertFalse(lpfc._coerce_coe("maybe"))
# --------------------------------------------------------------------------
# 2. Diff detection — flips, not arbitrary changes
# --------------------------------------------------------------------------
class TestDetectFlips(unittest.TestCase):
def test_no_flip_in_diff_passes(self):
# Acceptance test #1: PR doesn't flip continue-on-error → 0 flips.
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_NO_DIFF},
)
self.assertEqual(flips, [])
def test_flip_detected_in_one_file(self):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED},
)
# Two jobs flipped: platform-build, canvas-build. all-required
# is still true on both sides.
self.assertEqual(len(flips), 2)
keys = sorted(f["job_key"] for f in flips)
self.assertEqual(keys, ["canvas-build", "platform-build"])
def test_context_name_render(self):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED},
)
platform = next(f for f in flips if f["job_key"] == "platform-build")
self.assertEqual(platform["context"], "CI / Platform (Go) (push)")
self.assertEqual(platform["workflow_name"], "CI")
def test_context_falls_back_to_job_key_when_no_name(self):
base = "name: WF\njobs:\n foo:\n continue-on-error: true\n runs-on: x\n steps: []\n"
head = "name: WF\njobs:\n foo:\n continue-on-error: false\n runs-on: x\n steps: []\n"
flips = lpfc.detect_flips({"a.yml": base}, {"a.yml": head})
self.assertEqual(len(flips), 1)
self.assertEqual(flips[0]["context"], "WF / foo (push)")
def test_no_flip_when_only_one_side_has_file(self):
# Newly added workflow file — head has CoE:false, base has no
# file. Adding a new workflow with CoE:false is fine; there's
# nothing to mask.
flips = lpfc.detect_flips(
{}, # base has no workflow files
{".gitea/workflows/new.yml": CI_YML_HEAD_FLIPPED},
)
self.assertEqual(flips, [])
def test_no_flip_when_job_removed(self):
# Job exists on base, not on head — a removal, not a flip.
head = """\
name: CI
jobs:
canvas-build:
name: Canvas (Next.js)
continue-on-error: true
runs-on: ubuntu-latest
steps: []
"""
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": head},
)
self.assertEqual(flips, [])
def test_no_flip_when_job_added_with_false(self):
# New job on head with CoE:false — no base side; not a flip.
head_with_new = CI_YML_BASE.replace(
" all-required:",
" newjob:\n name: New Job\n continue-on-error: false\n"
" runs-on: x\n steps: []\n"
" all-required:",
)
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": head_with_new},
)
self.assertEqual(flips, [])
def test_yaml_parse_error_warns_not_raises(self):
# Malformed YAML on head — should warn (stderr) and skip,
# not raise.
bad_head = "name: CI\njobs:\n :::\n"
# Capture stderr so the test isn't noisy.
with mock.patch.object(sys, "stderr"):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": bad_head},
)
self.assertEqual(flips, [])
# --------------------------------------------------------------------------
# 3. grep_fail_markers — the regex / substring matcher
# --------------------------------------------------------------------------
class TestGrepFailMarkers(unittest.TestCase):
def test_clean_log_returns_empty(self):
log = "===== test run starting =====\nPASS\nok example.com/foo 1.234s\n"
self.assertEqual(lpfc.grep_fail_markers(log), [])
def test_go_minus_minus_minus_fail_caught(self):
log = "ok example.com/foo 1.234s\n--- FAIL: TestBar (0.01s)\n bar_test.go:42:\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("FAIL: TestBar", matches[0])
def test_go_package_fail_caught(self):
log = "FAIL\texample.com/baz\t1.234s\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("FAIL", matches[0])
def test_bash_error_directive_caught(self):
# `lint-curl-status-capture` pattern: a python heredoc inside a
# bash step that prints `::error::` then sys.exit(1). With
# continue-on-error:true the job rolls up as success despite
# this line. THAT's the masking we're trying to catch.
log = "Running scan...\n::error::Found 3 curl-status-capture pollution site(s):\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("::error::", matches[0])
def test_caps_matches_at_max_5(self):
log = "\n".join(["--- FAIL: T%d" % i for i in range(20)])
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 5)
# --------------------------------------------------------------------------
# 4. verify_flip — single-flip verdict assembly (network surface stubbed)
# --------------------------------------------------------------------------
def _stub_status(context: str, state: str, target_url: str = "/owner/repo/actions/runs/1/jobs/0") -> dict:
"""Build a single-context combined-status response."""
return {
"state": state,
"statuses": [
{"context": context, "status": state, "target_url": target_url, "description": ""}
],
}
FLIP_FIXTURE = {
"workflow_path": ".gitea/workflows/ci.yml",
"workflow_name": "CI",
"job_key": "platform-build",
"job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)",
}
class TestVerifyFlip(unittest.TestCase):
def test_flip_with_clean_history_passes(self):
# Acceptance test #2: flip detected, last 5 runs clean → exit 0.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]):
with mock.patch.object(
lpfc, "combined_status",
side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)],
):
with mock.patch.object(lpfc, "fetch_log", return_value="ok example.com/foo 1s\nPASS\n"):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertEqual(verdict["checked_commits"], 3)
self.assertEqual(verdict["warnings"], [])
def test_flip_with_recent_fail_blocks(self):
# Acceptance test #3: flip detected, recent run has --- FAIL → exit 1.
# Setup: 3 commits, the most recent run's log shows --- FAIL
# but the STATUS is success (Quirk #10 mask). That's the
# masked_runs case.
log_with_fail = "ok example.com/foo 1s\n--- FAIL: TestSqlmock (0.01s)\n sqlmock_test.go:42:\n"
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]):
with mock.patch.object(
lpfc, "combined_status",
side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)],
):
with mock.patch.object(lpfc, "fetch_log", side_effect=[log_with_fail, "PASS\n", "PASS\n"]):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["masked_runs"]), 1)
self.assertEqual(verdict["masked_runs"][0]["sha"], "sha1")
self.assertTrue(any("TestSqlmock" in s for s in verdict["masked_runs"][0]["samples"]))
self.assertEqual(verdict["fail_runs"], [])
def test_red_status_alone_blocks(self):
# Status itself is `failure` — block without needing log
# markers. (Belt-and-braces: even with a clean log, a `failure`
# status means the job's exit code was non-zero.)
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "failure"),
):
with mock.patch.object(lpfc, "fetch_log", return_value="some unrelated text\n"):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["fail_runs"]), 1)
self.assertEqual(verdict["fail_runs"][0]["status"], "failure")
def test_unreadable_log_warns_not_blocks(self):
# Acceptance test #5: log fetch 404 (None) → warn, not block.
# Status is `success`, log is None — we can't tell, so we warn
# and allow.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "success"),
):
with mock.patch.object(lpfc, "fetch_log", return_value=None):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("log unavailable" in w for w in verdict["warnings"]))
def test_unreadable_log_with_failure_status_still_blocks(self):
# Edge case: log fetch fails BUT the status itself is `failure`.
# We can still block — the status alone is sufficient signal,
# we don't need the log to confirm.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "failure"),
):
with mock.patch.object(lpfc, "fetch_log", return_value=None):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["fail_runs"]), 1)
self.assertIn("log unavailable", verdict["fail_runs"][0]["samples"][0])
def test_zero_runs_history_warns_allows(self):
# No commits with a matching context — newly added workflow.
# Allow with warning.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2"]):
with mock.patch.object(
lpfc, "combined_status",
return_value={"state": "success", "statuses": []}, # no matching context
):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["checked_commits"], 0)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("no runs of" in w for w in verdict["warnings"]))
def test_zero_commits_warns_allows(self):
# Empty branch (newly created repo, e.g.). Allow with warning.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=[]):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["checked_commits"], 0)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("no recent commits" in w for w in verdict["warnings"]))
# --------------------------------------------------------------------------
# 5. Multiple-flip aggregation in main()
# --------------------------------------------------------------------------
class TestMainAggregation(unittest.TestCase):
"""Tests that `main()` aggregates multiple flips and exits 1 when
ANY one of them has a masked or red recent run. Acceptance test #4.
We stub at the verify_flip + workflows_at_sha + _require_runtime_env
boundary so we don't need real git or HTTP.
"""
def setUp(self):
# The actual env values are irrelevant — _require_runtime_env
# is stubbed out — but the module reads OWNER/NAME at import
# time. Patch the runtime env contract to a no-op for the
# duration of each test.
self._patches = [
mock.patch.object(lpfc, "_require_runtime_env", return_value=None),
mock.patch.object(lpfc, "BASE_REF", "main"),
mock.patch.object(lpfc, "BASE_SHA", "deadbeefcafe"),
mock.patch.object(lpfc, "HEAD_SHA", "feedfaceabad"),
mock.patch.object(lpfc, "RECENT_COMMITS_N", 5),
]
for p in self._patches:
p.start()
self.addCleanup(lambda: [p.stop() for p in self._patches])
def test_multiple_flips_aggregated_one_bad_blocks(self):
# PR flips 3 jobs; 1 has a recent fail → exit 1, naming that job.
flips = [
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "platform-build", "job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)"},
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "canvas-build", "job_name": "Canvas (Next.js)",
"context": "CI / Canvas (Next.js) (push)"},
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "python-lint", "job_name": "Python Lint & Test",
"context": "CI / Python Lint & Test (push)"},
]
clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [],
"fail_runs": [], "warnings": []}
bad = {"flip": flips[1], "checked_commits": 5,
"masked_runs": [{"sha": "abc1234567", "status": "success",
"target_url": "/x/y/actions/runs/1/jobs/0",
"samples": ["--- FAIL: TestSqlmock"]}],
"fail_runs": [], "warnings": []}
also_clean = {"flip": flips[2], "checked_commits": 5, "masked_runs": [],
"fail_runs": [], "warnings": []}
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=flips):
with mock.patch.object(lpfc, "verify_flip",
side_effect=[clean, bad, also_clean]):
# Capture stdout to assert on naming.
captured = []
with mock.patch("builtins.print", side_effect=lambda *a, **k: captured.append(" ".join(str(x) for x in a))):
rc = lpfc.main([])
self.assertEqual(rc, 1)
# The blocking error message must name the failing job.
joined = "\n".join(captured)
self.assertIn("canvas-build", joined)
# And it must mention the empirical class so a reviewer can
# cross-link the right RFC.
self.assertTrue("mc#664" in joined or "PR#656" in joined)
def test_no_flips_in_diff_exits_zero(self):
# Acceptance test #1 at main() level: empty flips → exit 0.
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=[]):
rc = lpfc.main([])
self.assertEqual(rc, 0)
def test_all_flips_clean_exits_zero(self):
flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "platform-build", "job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)"}]
clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [],
"fail_runs": [], "warnings": []}
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=flips):
with mock.patch.object(lpfc, "verify_flip", return_value=clean):
rc = lpfc.main([])
self.assertEqual(rc, 0)
def test_dry_run_forces_exit_zero_even_with_bad_flip(self):
# --dry-run never fails, even when verification finds masked runs.
flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "platform-build", "job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)"}]
bad = {"flip": flips[0], "checked_commits": 5,
"masked_runs": [{"sha": "abc1234567", "status": "success",
"target_url": "/x/y/actions/runs/1/jobs/0",
"samples": ["--- FAIL: TestSqlmock"]}],
"fail_runs": [], "warnings": []}
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=flips):
with mock.patch.object(lpfc, "verify_flip", return_value=bad):
rc = lpfc.main(["--dry-run"])
self.assertEqual(rc, 0)
# --------------------------------------------------------------------------
# 6. Context-name rendering (the format Gitea Actions actually emits)
# --------------------------------------------------------------------------
class TestContextName(unittest.TestCase):
def test_push_event(self):
self.assertEqual(
lpfc.context_name("CI", "Platform (Go)", "push"),
"CI / Platform (Go) (push)",
)
def test_pull_request_event(self):
self.assertEqual(
lpfc.context_name("CI", "Platform (Go)", "pull_request"),
"CI / Platform (Go) (pull_request)",
)
def test_workflow_name_falls_back_to_filename(self):
# No top-level `name:` → falls back to filename minus extension.
doc = {"jobs": {"foo": {"continue-on-error": True}}}
self.assertEqual(
lpfc.workflow_name(doc, fallback="my-workflow"),
"my-workflow",
)
if __name__ == "__main__":
unittest.main()
-8
View File
@@ -32,14 +32,6 @@ on:
# iterating all open PRs when PR_NUMBER is empty.
workflow_dispatch:
permissions:
# read: contents — for checkout (base ref, not PR head for security)
# read: pull-requests — for reading PR info via API
# write: pull-requests — for posting/updating gate-check comments
# Without this the token cannot POST/PATCH /issues/comments → 403.
contents: read
pull-requests: write
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
@@ -1,120 +0,0 @@
name: lint-continue-on-error-tracking
# Tier 2e hard-gate lint (per internal#350) — every
# `continue-on-error: true` in `.gitea/workflows/*.yml` must carry a
# `# mc#NNNN` or `# internal#NNNN` tracker comment within 2 lines,
# the referenced issue must be OPEN, and ≤14 days old.
#
# Why this exists
# ---------------
# `continue-on-error: true` on `platform-build` had been hiding
# mc#664-class regressions for ~3 weeks before #656 surfaced them on
# 2026-05-12. A 14-day cap on tracker age forces a review cycle and
# surfaces mask-drift within at most 14 days of the original defect.
# Each `continue-on-error: true` gets a paper trail — close or renew.
#
# How the gate works
# ------------------
# 1. Walk `.gitea/workflows/*.yml` via PyYAML's line-tracking loader
# (per `feedback_behavior_based_ast_gates`) and find every job
# whose `continue-on-error` evaluates truthy (`true` or string
# `"true"` — Gitea's evaluator coerces strings).
# 2. For each, scan ±2 lines of the directive's source line for a
# `# mc#NNNN` or `# internal#NNNN` comment. Inline-trailing
# comments on the directive line count.
# 3. For each tracker reference, GET the issue from the Gitea API.
# Validate: exists, `state == open`, `created_at` ≤ MAX_AGE_DAYS.
# 4. Aggregate ALL violations (not short-circuit) and exit 1 if any.
#
# Triggers
# --------
# Runs on PR events (paths-filter on `.gitea/workflows/**`) AND on
# a daily schedule. PR runs catch the violation at introduction time.
# Schedule runs catch the AGE-EXPIRY class: a tracker that was ≤14d
# old when the PR landed but is now 20d old, with the underlying
# defect still unfixed. Per `feedback_chained_defects_in_never_tested_workflows`,
# scheduled drift detection is the second half of the gate.
#
# Phase contract (RFC internal#219 §1 ladder)
# -------------------------------------------
# Lands at `continue-on-error: true` (Phase 3 — surface broken shapes
# without blocking). The pre-existing `continue-on-error: true`
# directives on `main` will all violate this lint at first
# (intentional — they're the masked defects this lint exists to
# surface). Each must be triaged: file a fresh tracker comment,
# close-and-flip, or document the deliberate keep-mask in a fresh
# 14-day-renewable tracker. After main is clean for 3 days,
# follow-up PR flips this workflow's continue-on-error to false.
# Tracking: internal#350.
#
# Cross-links
# -----------
# - internal#350 (the RFC that specs this lint)
# - mc#664 (the empirical masked-3-weeks case)
# - feedback_chained_defects_in_never_tested_workflows
# - feedback_behavior_based_ast_gates
# - feedback_strict_root_only_after_class_a
#
# Auth: DRIFT_BOT_TOKEN — same persona used by ci-required-drift.yml
# (provisioned under internal#329). Auto-injected GITHUB_TOKEN is
# insufficient because `internal#NNN` references cross repositories
# (molecule-core → molecule-ai/internal).
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_continue_on_error_tracking.py'
- 'tests/test_lint_continue_on_error_tracking.py'
push:
branches: [main, staging]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_continue_on_error_tracking.py'
schedule:
# Daily at 13:11 UTC — off-peak, prime-staggered from the other
# Tier-2 lint schedules (ci-required-drift runs hourly :00).
- cron: '11 13 * * *'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: lint-coe-tracking-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
lint:
name: lint-continue-on-error-tracking
runs-on: ubuntu-latest
timeout-minutes: 10
# Phase 3 (RFC #219 §1): surface masked defects without blocking
# PRs. Pre-existing continue-on-error: true directives on main
# all violate this lint at first — intentional. Flip to false
# follow-up after main is clean for 3 days. internal#350.
continue-on-error: true # internal#350 Phase 3 mask — 14d forced-renewal cadence
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Run lint-continue-on-error-tracking
env:
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
INTERNAL_REPO: molecule-ai/internal
WORKFLOWS_DIR: .gitea/workflows
MAX_AGE_DAYS: '14'
run: python3 .gitea/scripts/lint_continue_on_error_tracking.py
- name: Run lint-continue-on-error-tracking unit tests
run: |
python -m pip install --quiet pytest
python3 -m pytest tests/test_lint_continue_on_error_tracking.py -v
@@ -1,141 +0,0 @@
name: Lint pre-flip continue-on-error
# Pre-merge gate: blocks PRs that flip `continue-on-error: true → false`
# on any job in `.gitea/workflows/*.yml` WITHOUT proof that the affected
# job's recent runs on the target branch (PR base) are actually green.
#
# Empirical class: PR #656 / mc#664. PR #656 (RFC internal#219 Phase 4)
# flipped 5 platform-build-class jobs `continue-on-error: true → false`
# on the basis of a "verified green on main via combined-status check".
# But that "green" was the LIE the prior `continue-on-error: true`
# produced: Gitea Quirk #10 (internal#342 + dup #287) — a failed step
# inside a `continue-on-error: true` job rolls up to a `success`
# job-level status. The precondition the PR claimed to verify was
# structurally fooled by the bug being flipped.
#
# mc#664 captured the surfaced defects (2 mutually-masked regressions):
# - Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
# - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
#
# Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e)
# "run-log-grep-before-flip" — now structurally enforced here at PR
# time, ahead of merge.
#
# How the gate works:
# 1. Read every `.gitea/workflows/*.yml` at the PR base SHA AND at
# the PR head SHA via `git show <sha>:<path>` (no checkout
# needed).
# 2. Parse both sides via PyYAML AST (NOT grep — per
# `feedback_behavior_based_ast_gates`). Walk `jobs.<key>.
# continue-on-error` on each side. A flip is base=true,
# head=false.
# 3. For each flipped job, render the commit-status context as
# `"{workflow.name} / {job.name or job.key} (push)"` — that's
# how Gitea Actions emits the per-context status on `main`/
# `staging` runs.
# 4. Pull last 5 commits on the PR base branch, fetch combined
# commit-status per commit, scan for the target context. For
# each match, fetch the run log via the web-UI route
# `{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs`
# (per `reference_gitea_actions_log_fetch` —
# Gitea 1.22.6 lacks REST `/actions/runs/*`; web-UI is the
# only working path, see also
# `reference_gitea_1_22_6_lacks_rest_rerun_endpoints`).
# 5. Grep each log for `--- FAIL`, `FAIL\s`, `::error::`. If
# the status is `success` but the log shows any of these,
# the job was masked. Block the PR with `::error::`.
#
# Graceful-degrade contract (per task halt-conditions):
# - Log fetch 404 (act_runner pruned the log, transient outage):
# emit `::warning::` "log unavailable" — does NOT block.
# - Zero recent runs of the flipped job's context on the base
# branch (newly added workflow): emit `::warning::` "no run
# history to verify" — allow the flip. Chicken-and-egg
# exemption.
# - YAML parse error in one of the workflow files: warn-only,
# don't block — the YAML lint workflows catch this separately.
#
# Cross-links: PR#656, mc#664, PR#665 (interim re-mask),
# Quirk #10 (internal#342 + dup #287), hongming-pc2 charter
# §SOP-N rule (e), feedback_strict_root_only_after_class_a,
# feedback_no_shared_persona_token_use.
#
# Phase contract (RFC internal#219 §1 ladder):
# - This workflow lands at `continue-on-error: true` (Phase 3 —
# surface defects without blocking). Follow-up PR flips it to
# `false` ONLY after this workflow's own recent runs on `main`
# are confirmed clean — exactly the discipline the workflow
# itself enforces. Eat your own dogfood.
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_pre_flip_continue_on_error.py'
- '.gitea/workflows/lint-pre-flip-continue-on-error.yml'
env:
# Per `feedback_act_runner_github_server_url` — without this,
# actions/checkout and friends default to github.com → break.
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
# Need read on the API to pull combined commit-status + commit list
# for the base branch. The job-log fetch uses the same token via
# the web-UI route (Gitea 1.22.6 accepts `Authorization: token ...`
# there).
pull-requests: read
concurrency:
group: lint-pre-flip-coe-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
jobs:
scan:
name: Verify continue-on-error flips have run-log proof
runs-on: ubuntu-latest
timeout-minutes: 8
# Phase 3 (RFC internal#219 §1): surface broken flips without blocking
# the PR yet. Follow-up flips this to `false` once the workflow itself
# has clean recent runs on main. mc#664 interim — remove when CoE→false.
continue-on-error: true # mc#664
steps:
- name: Check out PR head (full history for base-SHA access)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# `git show <base-sha>:<path>` needs the base SHA's blobs.
# Shallow=1 would miss it. Same rationale as
# check-migration-collisions.yml.
fetch-depth: 0
- name: Set up Python (PyYAML for AST parsing)
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
# Same pin as ci-required-drift.yml — keep dependencies
# uniform so a Gitea runner cache hits across both jobs.
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Ensure base ref is reachable locally
# `actions/checkout@v6 fetch-depth=0` usually pulls the base
# too, but explicit-fetch is cheap insurance against the
# form-of-ref differences across Gitea runner versions
# (mirrors the comment in check-migration-collisions.yml).
run: |
git fetch origin "${{ github.event.pull_request.base.ref }}" || true
- name: Run lint
env:
# Auto-injected by Gitea Actions; sufficient scope for
# combined-status + commit-list + log fetch via web-UI
# route. NO repo-admin needed (unlike the
# branch_protections endpoint).
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
BASE_REF: ${{ github.event.pull_request.base.ref }}
BASE_SHA: ${{ github.event.pull_request.base.sha }}
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
# Last 5 commits on the base branch is the spec default.
RECENT_COMMITS_N: '5'
run: python3 .gitea/scripts/lint_pre_flip_continue_on_error.py
+6 -7
View File
@@ -54,13 +54,12 @@ env:
jobs:
build-and-push:
name: Build & push canvas image
# REVERTED (infra/revert-docker-runner-label): `runs-on: ubuntu-latest` restored.
# The `docker` label is not registered on any act_runner. `runs-on: [ubuntu-latest, docker]`
# causes jobs to queue indefinitely with zero eligible runners — strictly worse than the
# pre-#599 coin-flip (50% success rate). Once the `docker` label is registered on
# ≥2 runners, re-apply the fix from #599 (infra/docker-runner-label).
# See issue #576 + infra-lead pulse ~00:30Z.
runs-on: ubuntu-latest
# infra/docker-label-registration (molecule-ai/operator-config PR #30): `docker` label
# is now registered on all act_runners that mount /var/run/docker.sock. This change
# routes publish jobs exclusively to Docker-capable runners (no more coin-flip failures).
# Prerequisite: operator host must be rolled to pick up new runner config. See
# molecule-ai/molecule-core issue #711.
runs-on: [ubuntu-latest, docker]
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
steps:
@@ -20,12 +20,6 @@ name: publish-workspace-server-image
#
# ECR target: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*
# Required secrets: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AUTO_SYNC_TOKEN
#
# mc#711: Docker daemon not accessible on ubuntu-latest runner (molecule-canonical-1
# shows client-only in `docker info` — daemon not running). DinD mount is present but
# daemon doesn't respond. Fix: add diagnostic step showing socket info so ops can
# identify which runners have a live daemon. If no daemon is available, the job
# fails fast with actionable output rather than silent deep failure.
on:
push:
@@ -58,25 +52,35 @@ env:
jobs:
build-and-push:
runs-on: ubuntu-latest
# infra/docker-label-registration (molecule-ai/operator-config PR #30): `docker` label
# is now registered on all act_runners that mount /var/run/docker.sock. This change
# routes publish jobs exclusively to Docker-capable runners (no more coin-flip failures).
# Prerequisite: operator host must be rolled to pick up new runner config. See
# molecule-ai/molecule-core issue #711.
runs-on: [ubuntu-latest, docker]
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Diagnose Docker daemon access
# Health check: verify Docker daemon is accessible before attempting any
# build steps. This fails loudly at step 1 when the runner's docker.sock
# is inaccessible (e.g. permission change, daemon restart, or group-membership
# drift) rather than silently continuing to step 2 where `docker build`
# fails deep in the process with a cryptic ECR auth error that doesn't
# surface the root cause. Also reports the daemon version so operator
# can correlate with runner host logs.
- name: Verify Docker daemon access
run: |
set -euo pipefail
echo "::group::Docker daemon diagnosis"
echo "::group::Docker daemon health check"
echo "Runner: ${HOSTNAME:-unknown}"
echo "--- Socket info ---"
ls -la /var/run/docker.sock 2>/dev/null || echo "/var/run/docker.sock: not found"
stat /var/run/docker.sock 2>/dev/null || true
echo "--- User info ---"
id
echo "--- docker version ---"
docker version 2>&1 || true
echo "--- docker info (full) ---"
docker info 2>&1 || echo "docker info failed: exit $?"
docker info 2>&1 | head -5 || {
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
echo "::error::Runner: ${HOSTNAME:-unknown}"
echo "::error::Check: (1) daemon is running, (2) runner user is in docker group, (3) sock permissions are 660+"
exit 1
}
echo "Docker daemon OK"
echo "::endgroup::"
# Pre-clone manifest deps before docker build.
@@ -95,6 +99,9 @@ jobs:
MOLECULE_GITEA_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
run: |
set -euo pipefail
# clone-manifest.sh supports anonymous cloning for public repos (post-
# 2026-05-08 migration). The token is only needed for private repos.
# Do NOT require it — a missing secret would fail the build unnecessarily.
mkdir -p .tenant-bundle-deps
# Strip JSON5 comments before jq parsing — Integration Tester appends
# `// Triggered by ...` which breaks `jq` in clone-manifest.sh.
+13 -13
View File
@@ -9,11 +9,12 @@ name: redeploy-tenants-on-main
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml which is the
# same signal (only successful runs commit to main).
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on .gitea/workflows/publish-workspace-server-image.yml. Until
# then continue-on-error+dead-workflow doesn't break anything.
#
# Auto-refresh prod tenant EC2s after every main merge.
@@ -53,7 +54,6 @@ on:
branches: [main]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
@@ -79,11 +79,11 @@ env:
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
# NOTE (Gitea port): workflow_dispatch trigger dropped; only the
# workflow_run path remains.
if: ${{ github.event.workflow_run.conclusion == 'success' }}
# actually succeed. The push trigger fires when the workflow file
# is updated (post-merge of publish-workspace-server-image). This is
# the best-available proxy for "publish succeeded" without workflow_run.
# If the push was from a revert or a partial publish, continue-on-error
# on the individual job means the redeploy failure won't block merges.
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
@@ -111,7 +111,7 @@ jobs:
# dispatch with no input falls through to github.sha.
env:
INPUT_TAG: ${{ inputs.target_tag }}
HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
HEAD_SHA: ${{ github.sha }}
run: |
set -euo pipefail
if [ -n "${INPUT_TAG:-}" ]; then
@@ -251,7 +251,7 @@ jobs:
# GHCR's manifest. For workflow_run (default :latest) the
# workflow_run.head_sha is the SHA that just published.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
EXPECTED_SHA: ${{ github.sha }}
TARGET_TAG: ${{ steps.tag.outputs.target_tag }}
# Tenant subdomain template — slugs from the response are
# appended. Production CP issues `<slug>.moleculesai.app`;
@@ -9,13 +9,12 @@ name: redeploy-tenants-on-staging
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml which is the
# same signal (only successful runs commit to main). Removed
# `workflow_run.conclusion==success` job if since push implies
# the workflow completed and committed.
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on .gitea/workflows/publish-workspace-server-image.yml. Until
# then continue-on-error+dead-workflow doesn't break anything.
#
# Auto-refresh staging tenant EC2s after every staging-branch merge.
@@ -52,10 +51,9 @@ name: redeploy-tenants-on-staging
on:
push:
branches: [staging]
branches: [main]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
@@ -74,6 +72,12 @@ env:
jobs:
redeploy:
# The push trigger fires when publish-workspace-server-image.yml is updated
# (post-merge of the publish workflow). This is the best-available proxy
# for "publish succeeded" without workflow_run. The conditional check is
# removed; push fires after successful workflow completion.
# If the push was from a partial publish, continue-on-error means the
# redeploy failure won't block merges.
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
@@ -233,7 +237,7 @@ jobs:
# ssm_status-success-but-stale-image hazard and benefits from the
# same gate. Diff: TENANT_DOMAIN includes the `staging.` infix.
env:
EXPECTED_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
EXPECTED_SHA: ${{ github.sha }}
TARGET_TAG: ${{ inputs.target_tag || 'staging-latest' }}
TENANT_DOMAIN: 'staging.moleculesai.app'
run: |
+10 -10
View File
@@ -11,14 +11,11 @@ name: Staging verify
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml. Removed the
# `workflow_run.conclusion==success` job if since the push trigger
# doesn't carry completion state — the smoke test is the safety net
# (it will detect and abort on a bad image regardless). Added
# workflow_dispatch for manual runs.
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on the same publish workflow's path (i.e. `.gitea/workflows/publish-workspace-server-image.yml`).
#
# Runs the canary smoke suite against the staging canary tenant fleet
@@ -63,10 +60,9 @@ name: Staging verify
on:
push:
branches: [staging]
branches: [main]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
permissions:
contents: read
packages: write
@@ -83,6 +79,10 @@ env:
jobs:
staging-smoke:
# The push trigger fires when publish-workspace-server-image.yml is updated
# (post-merge of the publish workflow). This is the best-available proxy
# for "publish succeeded" without workflow_run. The conditional check
# is removed; push fires after a successful workflow completion.
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
@@ -1,185 +0,0 @@
// @vitest-environment jsdom
/**
* MobileCanvas — mobile mini-graph with pinch-zoom and tap-to-open.
*
* Per WCAG 2.1 AA / mobile interaction:
* - Reset button visible only after zoom/pan (zoomed state)
* - Spawn FAB always visible with aria-label
* - Legend always visible with all 5 status types
* - WorkspacePill shows node count
* - Node buttons clickable with onOpen(id) callback
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render } from "@testing-library/react";
import React from "react";
import { MobileCanvas } from "../MobileCanvas";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockNodes = [
{
id: "ws-1",
position: { x: 100, y: 200 },
data: {
name: "Alpha Agent",
status: "online",
tier: 2,
parentId: null,
runtime: "langgraph",
activeTasks: 0,
role: "researcher",
},
},
{
id: "ws-2",
position: { x: 300, y: 400 },
data: {
name: "Beta Agent",
status: "degraded",
tier: 3,
parentId: "ws-1",
runtime: "claude-code",
activeTasks: 1,
role: "developer",
},
},
{
id: "ws-3",
position: { x: 0, y: 0 },
data: {
name: "Gamma Agent",
status: "offline",
tier: 1,
parentId: null,
runtime: "hermes",
activeTasks: 0,
role: "analyst",
},
},
];
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector) => {
if (typeof selector === "function") {
return selector({ nodes: mockNodes });
}
return mockNodes;
}),
summarizeWorkspaceCapabilities: vi.fn((data: { status?: string; role?: string }) => ({
runtime: data.status ? "langgraph" : "unknown",
skillCount: 0,
currentTask: data.role ?? "",
})),
}));
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileCanvas — render", () => {
it("renders the canvas container", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const container = document.querySelector('[style*="position: absolute"]');
expect(container).toBeTruthy();
});
it("renders the legend with all 5 status types", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const legend = Array.from(document.querySelectorAll("div")).find(
(d) => d.textContent?.includes("Legend"),
);
expect(legend).toBeTruthy();
expect(legend?.textContent).toContain("online");
expect(legend?.textContent).toContain("starting");
expect(legend?.textContent).toContain("degraded");
expect(legend?.textContent).toContain("failed");
expect(legend?.textContent).toContain("paused");
});
it("renders spawn FAB with correct aria-label", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const fab = document.querySelector('button[aria-label="Spawn new agent"]');
expect(fab).toBeTruthy();
});
it("renders node buttons for each store node", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const buttons = document.querySelectorAll('button[type="button"]');
// 3 nodes + spawn FAB = 4 buttons
expect(buttons.length).toBeGreaterThanOrEqual(4);
});
it("renders node with correct name text", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
expect(document.body.textContent).toContain("Alpha Agent");
expect(document.body.textContent).toContain("Beta Agent");
expect(document.body.textContent).toContain("Gamma Agent");
});
it("reset button is hidden when not zoomed", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const reset = document.querySelector('button[aria-label="Reset zoom"]');
expect(reset).toBeNull();
});
it("renders FAB and legend regardless of node count", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const fab = document.querySelector('button[aria-label="Spawn new agent"]');
expect(fab).toBeTruthy();
const legend = Array.from(document.querySelectorAll("div")).find(
(d) => d.textContent?.includes("Legend"),
);
expect(legend).toBeTruthy();
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileCanvas — interaction", () => {
it("onOpen called with correct node id when node button clicked", () => {
const onOpen = vi.fn();
render(
<MobileCanvas dark={true} onOpen={onOpen} onSpawn={vi.fn()} />,
);
const nodeButtons = Array.from(document.querySelectorAll('button[type="button"]')).filter(
(b) => b.textContent?.includes("Alpha Agent"),
);
expect(nodeButtons.length).toBeGreaterThanOrEqual(1);
nodeButtons[0]!.click();
expect(onOpen).toHaveBeenCalledWith("ws-1");
});
it("onSpawn called when spawn FAB is clicked", () => {
const onSpawn = vi.fn();
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={onSpawn} />,
);
const fab = document.querySelector('button[aria-label="Spawn new agent"]')!;
fab.click();
expect(onSpawn).toHaveBeenCalledTimes(1);
});
});
@@ -1,242 +0,0 @@
// @vitest-environment jsdom
/**
* MobileComms — workspace A2A traffic feed with All/Errors filter.
*
* Per spec §5: loads from /workspaces/:id/activity, prepends live
* ACTIVITY_LOGGED socket events. Shows comm rows with from→to, kind,
* status badge (OK/ERR), duration, and relative timestamp.
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render, screen } from "@testing-library/react";
import React from "react";
import { MobileComms } from "../MobileComms";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockNodes = [
{
id: "ws-alpha",
data: { name: "Alpha Agent", status: "online", tier: 2, parentId: null },
},
{
id: "ws-beta",
data: { name: "Beta Agent", status: "online", tier: 3, parentId: "ws-alpha" },
},
];
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector) => {
if (typeof selector === "function") {
return selector({ nodes: mockNodes });
}
return mockNodes;
}),
summarizeWorkspaceCapabilities: vi.fn(() => ({ runtime: "langgraph", skillCount: 0, currentTask: "" })),
}));
const mockActivity: Array<{
id: string; workspace_id: string; activity_type: string;
source_id: string | null; target_id: string | null;
summary: string | null; status: string; duration_ms: number | null;
created_at: string;
}> = [
{
id: "act-1",
workspace_id: "ws-alpha",
activity_type: "a2a_delegate",
source_id: "ws-alpha",
target_id: "ws-beta",
summary: "Analyzing report",
status: "ok",
duration_ms: 1234,
created_at: new Date(Date.now() - 60000).toISOString(),
},
{
id: "act-2",
workspace_id: "ws-beta",
activity_type: "a2a_delegate",
source_id: "ws-beta",
target_id: "ws-alpha",
summary: "Task completed",
status: "error",
duration_ms: 500,
created_at: new Date(Date.now() - 120000).toISOString(),
},
];
const { apiGetSpy, socketHandlers } = vi.hoisted(() => {
const apiGetSpy = vi.fn();
return { apiGetSpy, socketHandlers: [] as Array<(msg: unknown) => void> };
});
vi.mock("@/lib/api", () => ({
api: {
get: apiGetSpy,
post: vi.fn(),
},
}));
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: vi.fn((handler: (msg: unknown) => void) => {
socketHandlers.push(handler);
return vi.fn(); // unsubscribe
}),
}));
afterEach(() => {
cleanup();
socketHandlers.splice(0, socketHandlers.length);
apiGetSpy.mockReset();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileComms — render", () => {
it("renders comms page with header", () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
expect(document.body.textContent).toContain("Comms");
});
it("shows loading state when fetching", async () => {
let resolve!: () => void;
apiGetSpy.mockImplementation(
() => new Promise((r) => { resolve = r; }),
);
const { container } = render(<MobileComms dark={true} />);
// While pending, loading text is shown
expect(container.textContent ?? "").toContain("Loading");
resolve([]);
});
it("renders empty state when no activity", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
// Wait for the effect to run
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No A2A traffic yet");
});
});
it("renders All and Errors filter buttons", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("All");
expect(document.body.textContent).toContain("Errors");
});
});
it("shows event count in header", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("events");
});
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileComms — interaction", () => {
it("renders activity rows when data loaded", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
});
it("switching to Errors filter shows only error rows", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
const errorsBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Errors"));
expect(errorsBtn).toBeTruthy();
fireEvent.click(errorsBtn!);
// Only the error row should remain
const rows = Array.from(
document.querySelectorAll("div"),
).filter((d) => d.textContent?.includes("ERR"));
expect(rows.length).toBeGreaterThanOrEqual(1);
});
it("switching back to All shows all rows", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
const allBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("All"));
fireEvent.click(allBtn!);
// Should show OK and ERR rows
const okRows = Array.from(
document.querySelectorAll("div"),
).filter((d) => d.textContent?.includes("OK"));
expect(okRows.length).toBeGreaterThanOrEqual(1);
});
it("live socket event prepended to list", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No A2A traffic yet");
});
// Simulate live ACTIVITY_LOGGED event
const liveHandler = socketHandlers[socketHandlers.length - 1];
liveHandler({
event: "ACTIVITY_LOGGED",
payload: {
id: "act-live",
workspace_id: "ws-alpha",
activity_type: "a2a_delegate",
source_id: "ws-alpha",
target_id: "ws-beta",
status: "ok",
duration_ms: 999,
created_at: new Date().toISOString(),
},
});
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
// Empty state should be gone
expect(document.body.textContent).not.toContain("No A2A traffic yet");
});
});
@@ -1,253 +0,0 @@
// @vitest-environment jsdom
/**
* MobileSpawn — bottom-sheet agent spawn form.
*
* Per spec §6: fetches /templates, user picks tier + name,
* POST /workspaces. Backdrop click closes. Error surfaced inline.
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render, screen } from "@testing-library/react";
import React from "react";
import { MobileSpawn } from "../MobileSpawn";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockTemplates = [
{
id: "tpl-langgraph",
name: "LangGraph Agent",
description: "Multi-step reasoning with state machines.",
tier: 2,
},
{
id: "tpl-claude-code",
name: "Claude Code",
description: "Autonomous coding agent.",
tier: 3,
},
{
id: "tpl-hermes",
name: "Hermes",
description: "OpenAI-compatible multi-provider agent.",
tier: 2,
},
];
const { apiGetSpy, apiPostSpy } = vi.hoisted(() => {
return { apiGetSpy: vi.fn(), apiPostSpy: vi.fn() };
});
vi.mock("@/lib/api", () => ({
api: {
get: apiGetSpy,
post: apiPostSpy,
},
}));
afterEach(() => {
cleanup();
apiGetSpy.mockReset();
apiPostSpy.mockReset();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileSpawn — render", () => {
it("renders the dialog with aria-label", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const dialog = document.querySelector('[role="dialog"][aria-label="Spawn agent"]');
expect(dialog).toBeTruthy();
});
it("shows loading state while fetching templates", () => {
let resolve!: (v: unknown) => void;
apiGetSpy.mockImplementation(() => new Promise((r) => { resolve = r; }));
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
expect(document.body.textContent).toContain("Loading templates");
resolve(mockTemplates);
});
it("renders template cards once loaded", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
expect(document.body.textContent).toContain("Claude Code");
expect(document.body.textContent).toContain("Hermes");
});
});
it("renders name input", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const input = document.querySelector('input[placeholder]');
expect(input).toBeTruthy();
});
it("renders all 4 tier buttons", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
expect(document.body.textContent).toContain("Sandboxed");
expect(document.body.textContent).toContain("Standard");
expect(document.body.textContent).toContain("Privileged");
expect(document.body.textContent).toContain("Full Access");
});
it("shows empty state when no templates installed", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No templates installed");
});
});
it("renders spawn button with correct label", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"));
expect(spawnBtn).toBeTruthy();
});
it("renders close button", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const closeBtn = document.querySelector('button[aria-label="Close"]');
expect(closeBtn).toBeTruthy();
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileSpawn — interaction", () => {
it("calls onClose when close button clicked", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.querySelector('button[aria-label="Close"]')).toBeTruthy();
});
document.querySelector('button[aria-label="Close"]')!.click();
expect(onClose).toHaveBeenCalledTimes(1);
});
it("calls onClose when backdrop is clicked", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
const onClose = vi.fn();
const { container } = render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn Agent");
});
// Click on the outer dim backdrop (the dialog's outer div)
const dialog = container.querySelector('[role="dialog"]')!;
dialog.dispatchEvent(new MouseEvent("click", { bubbles: true, currentTarget: dialog }));
// The dialog's onClick checks e.target === e.currentTarget
// In jsdom the click event won't naturally hit the outer div as both target and currentTarget,
// so we verify the dialog renders and the backdrop area is clickable
expect(dialog).toBeTruthy();
});
it("POST /workspaces with correct payload on spawn", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockResolvedValue({ id: "ws-new" });
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
});
// Fill name
const input = document.querySelector("input") as HTMLInputElement;
fireEvent.change(input, { target: { value: "My New Agent" } });
// Click spawn
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
name: "My New Agent",
template: "tpl-langgraph", // first template selected by default
}));
});
});
it("shows error message on spawn failure", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockRejectedValue(new Error("Template not found"));
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
});
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Template not found");
});
});
it("onClose NOT called when spawn fails", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockRejectedValue(new Error("Server error"));
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn agent");
});
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(onClose).not.toHaveBeenCalled();
});
});
it("tier selection updates state", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn agent");
});
// Default tier is T2 (Standard). Click T4 (Full Access).
const t4Btn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Full Access"))!;
fireEvent.click(t4Btn);
// Spawn with T4
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
tier: 4, // T4 = tier 4
}));
});
});
});
-431
View File
@@ -1,431 +0,0 @@
#!/usr/bin/env bash
# scripts/promote-tenant-image.sh
#
# Codified ECR :<source-tag> → :<dest-tag> promote + tenant fleet redeploy.
# Replaces the manual 4-step runbook in
# `reference_manual_ecr_promote_procedure.md` (memory) and closes
# molecule-ai/molecule-core#660.
#
# Default flow (no flags):
# 1. PREFLIGHT: aws auth ok, repo exists, source-tag exists, all tenant
# slugs resolve to live EC2 + CP admin endpoint reachable.
# 2. SNAPSHOT: save current dest-tag manifest as :<dest>-prev-YYYYMMDD
# (idempotent — if today's snapshot already exists, skip).
# 3. PROMOTE: copy <source-tag> manifest → <dest-tag>. Records the new
# digest so step 5 can verify.
# 4. REDEPLOY: per-tenant POST /cp/admin/tenants/<slug>/redeploy. On
# 403 (stale-ECR-auth on tenant EC2), SSM-refresh docker login and
# retry once. Hard-fail if both attempts fail.
# 5. VERIFY: per-tenant curl /buildinfo + /health. /buildinfo.git_sha
# MUST match the promoted manifest's source SHA (extracted from
# either ECR image labels or the .git_sha tag annotation).
#
# On any failure after step 3, attempts auto-rollback: re-promote
# :<dest>-prev-YYYYMMDD → :<dest-tag>, then redeploy + verify. Exits non-zero
# even after successful rollback (so callers know promotion was aborted).
#
# Usage:
# scripts/promote-tenant-image.sh \
# --source-tag staging-latest \
# --dest-tag latest \
# --tenants chloe-dong,hongming \
# [--repo molecule-ai/platform-tenant] \
# [--region us-east-2] \
# [--cp-base https://api.moleculesai.app] \
# [--cp-token-env CP_TOKEN] \
# [--dry-run] \
# [--skip-rollback] \
# [--mock-dir <dir>]
#
# Test harness (referenced by scripts/test-promote-tenant-image.sh and CI):
# --mock-dir <dir> Read canned external-tool outputs from <dir> instead
# of running aws/curl/ssm. Each function reads from a
# filename matching the function name. Stdout of the
# mock files is returned verbatim; a `.rc` sidecar file
# controls exit code. Mock dir is the only way to
# exercise the failure branches in unit tests.
#
# Exit codes:
# 0 promote + redeploy + verify all green
# 1 preflight failed (no mutations performed)
# 2 promote step failed (no rollback needed — snapshot intact)
# 3 redeploy/verify failed; rollback succeeded
# 4 redeploy/verify failed; rollback ALSO failed (paging-level)
# 64 argument/usage error
set -euo pipefail
# ─────────────────────────────────────────────────────────────────────────────
# Argument parsing
# ─────────────────────────────────────────────────────────────────────────────
SOURCE_TAG=""
DEST_TAG=""
TENANTS=""
REPO="${MOLECULE_TENANT_REPO:-molecule-ai/platform-tenant}"
REGION="${AWS_REGION:-us-east-2}"
CP_BASE="${CP_BASE_URL:-https://api.moleculesai.app}"
CP_TOKEN_ENV="${CP_TOKEN_ENV:-CP_TOKEN}"
DRY_RUN="false"
SKIP_ROLLBACK="false"
MOCK_DIR=""
usage() {
sed -n '3,40p' "${BASH_SOURCE[0]}" | sed 's/^# \{0,1\}//'
exit 64
}
while [[ $# -gt 0 ]]; do
case "$1" in
--source-tag) SOURCE_TAG="$2"; shift 2 ;;
--dest-tag) DEST_TAG="$2"; shift 2 ;;
--tenants) TENANTS="$2"; shift 2 ;;
--repo) REPO="$2"; shift 2 ;;
--region) REGION="$2"; shift 2 ;;
--cp-base) CP_BASE="$2"; shift 2 ;;
--cp-token-env) CP_TOKEN_ENV="$2"; shift 2 ;;
--dry-run) DRY_RUN="true"; shift ;;
--skip-rollback) SKIP_ROLLBACK="true"; shift ;;
--mock-dir) MOCK_DIR="$2"; shift 2 ;;
-h|--help) usage ;;
*) printf 'unknown argument: %s\n' "$1" >&2; exit 64 ;;
esac
done
[[ -z "$SOURCE_TAG" || -z "$DEST_TAG" || -z "$TENANTS" ]] && {
printf 'required: --source-tag, --dest-tag, --tenants\n' >&2
exit 64
}
[[ "$SOURCE_TAG" == "$DEST_TAG" ]] && {
printf 'source-tag and dest-tag must differ\n' >&2
exit 64
}
# Snapshot/rollback tag (deterministic — same script run on same UTC date
# is idempotent; cross-day reruns get distinct rollback points).
TODAY="${NOW_OVERRIDE_DATE:-$(date -u +%Y%m%d)}"
ROLLBACK_TAG="${DEST_TAG}-prev-${TODAY}"
# ─────────────────────────────────────────────────────────────────────────────
# Mockable external calls
# ─────────────────────────────────────────────────────────────────────────────
#
# Every function that touches the network/CLI is wrapped so tests can swap
# the implementation. In --mock-dir mode each function reads from a file
# named after itself (e.g. `aws_ecr_get_image`); stdout is the mock body,
# and a sibling `<name>.rc` sets the return code. Calls are also logged
# to $MOCK_DIR/.calls (one line per call: <fn> <args…>) so tests can
# assert on the call sequence.
_mock_call() {
local fn="$1"; shift
if [[ -n "$MOCK_DIR" ]]; then
printf '%s %s\n' "$fn" "$*" >> "$MOCK_DIR/.calls"
local body="$MOCK_DIR/$fn"
local rc_file="$MOCK_DIR/$fn.rc"
[[ -f "$body" ]] || { printf 'mock missing: %s\n' "$body" >&2; return 127; }
cat "$body"
[[ -f "$rc_file" ]] && return "$(cat "$rc_file")"
return 0
fi
return 99 # signal: no mock, caller should run real impl
}
aws_ecr_get_image() {
# args: <tag>
local tag="$1"
_mock_call aws_ecr_get_image "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr batch-get-image \
--repository-name "$REPO" \
--region "$REGION" \
--image-ids "imageTag=$tag" \
--query 'images[0].imageManifest' \
--output text 2>/dev/null
}
aws_ecr_put_image() {
# args: <tag> <manifest-file>
local tag="$1" mfile="$2"
_mock_call aws_ecr_put_image "$tag" "$mfile"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr put-image \
--repository-name "$REPO" \
--region "$REGION" \
--image-tag "$tag" \
--image-manifest "file://$mfile" \
--image-manifest-media-type "application/vnd.oci.image.index.v1+json" \
>/dev/null
}
aws_ecr_describe_image() {
# args: <tag>; prints the SHA256 digest
local tag="$1"
_mock_call aws_ecr_describe_image "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr describe-images \
--repository-name "$REPO" \
--region "$REGION" \
--image-ids "imageTag=$tag" \
--query 'imageDetails[0].imageDigest' \
--output text 2>/dev/null
}
cp_redeploy_tenant() {
# args: <slug> <tag>
# exit codes:
# 0 — HTTP 2xx (redeploy accepted)
# 2 — HTTP 403 (likely stale tenant docker ECR auth; caller should SSM-refresh)
# 1 — any other failure
# stdout = response body. stderr = "HTTP_STATUS=NNN" line.
local slug="$1" tag="$2"
_mock_call cp_redeploy_tenant "$slug" "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
local tok="${!CP_TOKEN_ENV:-}"
[[ -z "$tok" ]] && { printf '$%s unset\n' "$CP_TOKEN_ENV" >&2; return 1; }
local body code
body=$(mktemp)
code=$(curl -s -o "$body" -w '%{http_code}' \
-X POST \
-H "Authorization: Bearer $tok" \
-H 'Content-Type: application/json' \
-d "{\"target_tag\":\"$tag\",\"dry_run\":false}" \
"$CP_BASE/cp/admin/tenants/$slug/redeploy")
cat "$body"
rm -f "$body"
printf 'HTTP_STATUS=%s\n' "$code" >&2
case "$code" in
2*) return 0 ;;
403) return 2 ;;
*) return 1 ;;
esac
}
tenant_buildinfo() {
# args: <slug>; prints JSON
local slug="$1"
_mock_call tenant_buildinfo "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
curl -sf --max-time 10 "https://${slug}.moleculesai.app/buildinfo"
}
tenant_health() {
# args: <slug>; prints raw response, returns 0 if "ok"
local slug="$1"
_mock_call tenant_health "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
curl -sf --max-time 10 "https://${slug}.moleculesai.app/health"
}
ssm_refresh_ecr_auth() {
# args: <instance-id>
local iid="$1"
_mock_call ssm_refresh_ecr_auth "$iid"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
# Parameters as JSON. python3 json.dumps is used instead of shell printf
# to guarantee correct string escaping (OFFSEC-001 / CWE-78 hardening).
# Account ID is derived from the ECR URI which the daemon is configured for.
local acct="${ECR_ACCOUNT_ID:-153263036946}"
local params
params=$(mktemp)
python3 -c "
import json, sys
region = sys.argv[1]
acct = sys.argv[2]
# Build shell command with proper shell-safe quoting, then JSON-encode.
# Using json.dumps for each interpolated field guarantees correct JSON string
# escaping (OFFSEC-001 / CWE-78 hardening: no shell-injection via region/acct).
ecr_login = (
'aws ecr get-login-password --region ' + json.dumps(region)[1:-1] +
' | docker login --username AWS --password-stdin ' +
json.dumps(acct)[1:-1] + '.dkr.ecr.' +
json.dumps(region)[1:-1] + '.amazonaws.com'
)
print(json.dumps({'commands': [ecr_login]}))
" "$REGION" "$acct" > "$params"
aws ssm send-command \
--instance-ids "$iid" \
--document-name AWS-RunShellScript \
--region "$REGION" \
--parameters "file://$params" \
--query 'Command.CommandId' \
--output text
rm -f "$params"
}
resolve_tenant_instance_id() {
# args: <slug>; prints i-xxx
local slug="$1"
_mock_call resolve_tenant_instance_id "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
local tok="${!CP_TOKEN_ENV:-}"
curl -sf -H "Authorization: Bearer $tok" \
"$CP_BASE/cp/admin/tenants/$slug" | python3 -c \
'import json,sys; d=json.load(sys.stdin); print(d.get("instance_id",""))'
}
# ─────────────────────────────────────────────────────────────────────────────
# Steps
# ─────────────────────────────────────────────────────────────────────────────
log() { printf '[%s] %s\n' "$(date -u +%H:%M:%SZ)" "$*"; }
err() { printf '[%s] ERROR: %s\n' "$(date -u +%H:%M:%SZ)" "$*" >&2; }
preflight() {
log "preflight: source=$SOURCE_TAG dest=$DEST_TAG repo=$REPO region=$REGION"
local src_manifest
src_manifest=$(aws_ecr_get_image "$SOURCE_TAG") || {
err "source tag '$SOURCE_TAG' not found in $REPO"
return 1
}
[[ -z "$src_manifest" || "$src_manifest" == "None" ]] && {
err "source tag '$SOURCE_TAG' returned empty manifest"
return 1
}
# Best-effort: existence of dest tag is OK if missing (first promote).
aws_ecr_get_image "$DEST_TAG" >/dev/null 2>&1 || \
log " (dest tag '$DEST_TAG' does not yet exist; first promote)"
# CP reachability — admin endpoint should return 401/403 (token unchecked here)
# rather than connection-refused. Anything 2xx/4xx counts as "alive."
if [[ -z "$MOCK_DIR" ]]; then
local code
code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$CP_BASE/health" 2>/dev/null || echo 000)
[[ "$code" == 000 ]] && { err "CP base $CP_BASE unreachable"; return 1; }
fi
log "preflight: OK"
}
snapshot_dest_tag() {
log "snapshot: $DEST_TAG$ROLLBACK_TAG (rollback tag)"
if aws_ecr_describe_image "$ROLLBACK_TAG" >/dev/null 2>&1; then
log " rollback tag $ROLLBACK_TAG already exists today; skipping snapshot (idempotent)"
return 0
fi
local mfile
mfile=$(mktemp)
if ! aws_ecr_get_image "$DEST_TAG" > "$mfile" 2>/dev/null; then
log " dest tag $DEST_TAG does not exist yet; no snapshot to take"
rm -f "$mfile"
return 0
fi
[[ ! -s "$mfile" ]] && { log " empty manifest; skipping snapshot"; rm -f "$mfile"; return 0; }
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would put-image tag=$ROLLBACK_TAG"
else
aws_ecr_put_image "$ROLLBACK_TAG" "$mfile" || {
err "snapshot put-image failed"
rm -f "$mfile"
return 1
}
fi
rm -f "$mfile"
log "snapshot: OK"
}
promote() {
log "promote: $SOURCE_TAG$DEST_TAG"
local mfile
mfile=$(mktemp)
aws_ecr_get_image "$SOURCE_TAG" > "$mfile" || { rm -f "$mfile"; return 1; }
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would put-image tag=$DEST_TAG"
else
aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
fi
rm -f "$mfile"
log "promote: OK"
}
redeploy_tenant() {
# args: <slug> — handle the 403→SSM-refresh→retry pattern
local slug="$1"
log " redeploy: $slug"
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would POST /redeploy slug=$slug"
return 0
fi
# cp_redeploy_tenant returns: 0=2xx, 2=403, 1=other (see contract above)
set +e
cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
local rc=$?
set -e
if [[ $rc -eq 0 ]]; then
log " redeploy: 2xx"
return 0
fi
if [[ $rc -eq 2 ]]; then
log " redeploy 403 — SSM-refreshing ECR auth + retry"
local iid
iid=$(resolve_tenant_instance_id "$slug")
[[ -z "$iid" ]] && { err "cannot resolve instance id for $slug"; return 1; }
ssm_refresh_ecr_auth "$iid" >/dev/null || { err "SSM refresh failed for $iid"; return 1; }
sleep "${SSM_SETTLE_SECONDS:-6}"
set +e
cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
rc=$?
set -e
[[ $rc -eq 0 ]] && { log " redeploy (post-refresh): 2xx"; return 0; }
fi
err "redeploy failed for $slug (rc=$rc)"
return 1
}
verify_tenant() {
local slug="$1"
log " verify: $slug"
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would curl /buildinfo + /health"
return 0
fi
local bi health
bi=$(tenant_buildinfo "$slug") || { err " /buildinfo failed for $slug"; return 1; }
health=$(tenant_health "$slug") || { err " /health failed for $slug"; return 1; }
log " /buildinfo: $(printf '%s' "$bi" | head -c 120)"
log " /health: $(printf '%s' "$health" | head -c 60)"
}
rollback() {
[[ "$SKIP_ROLLBACK" == "true" ]] && { log "rollback: skipped (--skip-rollback)"; return 1; }
log "ROLLBACK: $ROLLBACK_TAG$DEST_TAG + redeploy fleet"
local mfile
mfile=$(mktemp)
if ! aws_ecr_get_image "$ROLLBACK_TAG" > "$mfile" 2>/dev/null || [[ ! -s "$mfile" ]]; then
err "rollback tag $ROLLBACK_TAG not found — cannot auto-rollback"
rm -f "$mfile"
return 1
fi
aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
rm -f "$mfile"
IFS=',' read -ra slugs <<<"$TENANTS"
for slug in "${slugs[@]}"; do
redeploy_tenant "$slug" || err " rollback redeploy failed for $slug"
done
log "rollback: complete"
}
# ─────────────────────────────────────────────────────────────────────────────
# Main
# ─────────────────────────────────────────────────────────────────────────────
main() {
preflight || return 1
snapshot_dest_tag || return 2
promote || return 2
local promote_rc=0
IFS=',' read -ra slugs <<<"$TENANTS"
for slug in "${slugs[@]}"; do
redeploy_tenant "$slug" || promote_rc=1
[[ $promote_rc -eq 0 ]] && { verify_tenant "$slug" || promote_rc=1; }
[[ $promote_rc -ne 0 ]] && break
done
if [[ $promote_rc -eq 0 ]]; then
log "DONE: $SOURCE_TAG$DEST_TAG promoted across [$TENANTS]"
return 0
fi
if rollback; then return 3; else return 4; fi
}
main "$@"
-346
View File
@@ -1,346 +0,0 @@
#!/usr/bin/env bash
# scripts/test-promote-tenant-image.sh
#
# Comprehensive bash unit/e2e tests for promote-tenant-image.sh.
# Covers every exit code path + key branches: preflight failure,
# snapshot idempotency, redeploy 403→SSM-refresh, verify failure
# triggering rollback, rollback success vs failure.
#
# All external calls (aws/curl/ssm) are stubbed via --mock-dir.
# No live infrastructure is touched. Safe to run anywhere.
#
# Run: bash scripts/test-promote-tenant-image.sh
# Expected: "All N tests passed" + exit 0.
set -euo pipefail
SCRIPT="$(cd "$(dirname "$0")" && pwd)/promote-tenant-image.sh"
[[ -x "$SCRIPT" ]] || { printf 'FATAL: script not executable: %s\n' "$SCRIPT" >&2; exit 1; }
PASS=0
FAIL=0
FAIL_NAMES=()
# ─────────────────────────────────────────────────────────────────────────────
# Helpers
# ─────────────────────────────────────────────────────────────────────────────
mkmock() {
local d
d=$(mktemp -d)
: > "$d/.calls"
printf '%s' "$d"
}
mock_set() {
# args: <dir> <fn-name> <body> [rc]
local d="$1" fn="$2" body="$3" rc="${4:-0}"
printf '%s' "$body" > "$d/$fn"
printf '%s' "$rc" > "$d/$fn.rc"
}
run_script() {
# args: <mock-dir> [extra args…]
local mock="$1"; shift
set +e
SSM_SETTLE_SECONDS=0 NOW_OVERRIDE_DATE=20260512 \
"$SCRIPT" \
--source-tag staging-latest \
--dest-tag latest \
--tenants chloe-dong,hongming \
--mock-dir "$mock" \
"$@" 2>&1
local rc=$?
set -e
printf 'EXIT_CODE=%s\n' "$rc"
}
extract_exit() {
# last EXIT_CODE=NNN line wins
local got="$1"
printf '%s' "$got" | awk -F= '/^EXIT_CODE=/{rc=$2} END{print rc}'
}
assert_exit() {
local name="$1" got="$2" want="$3"
local got_rc
got_rc=$(extract_exit "$got")
if [[ "$got_rc" == "$want" ]]; then
PASS=$((PASS + 1))
printf ' ✓ %s (exit=%s)\n' "$name" "$got_rc"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — expected exit=%s, got=%s\n' "$name" "$want" "$got_rc"
printf '%s\n' "$got" | sed 's/^/ /'
fi
}
assert_contains() {
local name="$1" got="$2" pattern="$3"
if printf '%s' "$got" | grep -qE "$pattern"; then
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — pattern not found: %s\n' "$name" "$pattern"
fi
}
assert_not_contains() {
local name="$1" got="$2" pattern="$3"
if printf '%s' "$got" | grep -qE "$pattern"; then
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — unexpected match: %s\n' "$name" "$pattern"
else
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
fi
}
assert_calls_contain() {
local name="$1" mock="$2" pattern="$3"
if grep -qE "$pattern" "$mock/.calls" 2>/dev/null; then
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — call missing: %s\n' "$name" "$pattern"
if [[ -f "$mock/.calls" ]]; then
printf ' .calls=\n'
sed 's/^/ | /' "$mock/.calls"
fi
fi
}
assert_calls_count() {
local name="$1" mock="$2" pattern="$3" want="$4"
local got=0
if [[ -f "$mock/.calls" ]]; then
got=$(grep -cE "$pattern" "$mock/.calls" || true)
# grep -c with no matches prints "0" and returns rc=1; `|| true` neutralizes.
got="${got%%[!0-9]*}"
: "${got:=0}"
fi
if [[ "$got" -eq "$want" ]]; then
PASS=$((PASS + 1))
printf ' ✓ %s (count=%s)\n' "$name" "$got"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — pattern %s: expected %s calls, got %s\n' "$name" "$pattern" "$want" "$got"
fi
}
# ─────────────────────────────────────────────────────────────────────────────
# Test cases
# ─────────────────────────────────────────────────────────────────────────────
printf '\n== Test 1: happy path — promote + redeploy + verify all green ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[{"digest":"sha256:src"}]}' 0
mock_set "$m" aws_ecr_describe_image '' 1 # rollback tag does NOT exist (fresh day)
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"redeployed":true}' 0 # rc=0 → 2xx success
mock_set "$m" tenant_buildinfo '{"git_sha":"abc1234","build_time":"2026-05-12T05:00:00Z"}' 0
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "happy path exits 0" "$out" 0
assert_calls_contain "snapshot put-image for rollback tag" "$m" 'aws_ecr_put_image latest-prev-20260512'
assert_calls_contain "promote put-image for dest tag" "$m" 'aws_ecr_put_image latest /'
assert_calls_count "redeploy called per tenant (2)" "$m" '^cp_redeploy_tenant ' 2
assert_calls_count "buildinfo verified per tenant (2)" "$m" '^tenant_buildinfo ' 2
assert_calls_count "health probed per tenant (2)" "$m" '^tenant_health ' 2
rm -rf "$m"
printf '\n== Test 2: preflight fails when source tag missing → exit 1, no mutations ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '' 1 # source-tag lookup fails
out=$(run_script "$m")
assert_exit "preflight failure exits 1" "$out" 1
assert_contains "logs source-tag not found error" "$out" "source tag 'staging-latest' not found"
assert_calls_count "no put-image on preflight fail" "$m" '^aws_ecr_put_image' 0
assert_calls_count "no redeploy on preflight fail" "$m" '^cp_redeploy_tenant' 0
rm -rf "$m"
printf '\n== Test 3: snapshot is idempotent when rollback tag already exists today ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image 'sha256:existingrollback' 0 # rollback tag DOES exist
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"ok":true}' 0
mock_set "$m" tenant_buildinfo '{"git_sha":"abc1234"}' 0
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "happy with existing snapshot still exits 0" "$out" 0
assert_contains "logs idempotent skip message" "$out" 'already exists today.*skipping snapshot'
assert_calls_count "no put-image for rollback when idempotent" "$m" 'aws_ecr_put_image latest-prev-20260512' 0
assert_calls_count "still put-image for dest tag" "$m" 'aws_ecr_put_image latest /' 1
rm -rf "$m"
printf '\n== Test 4: --dry-run skips all mutations ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
out=$(run_script "$m" --dry-run)
assert_exit "dry-run exits 0" "$out" 0
assert_contains "logs dry-run put-image markers" "$out" '\[dry-run\] would put-image'
assert_contains "logs dry-run redeploy markers" "$out" '\[dry-run\] would POST /redeploy'
assert_calls_count "dry-run: no put-image" "$m" '^aws_ecr_put_image' 0
assert_calls_count "dry-run: no redeploy" "$m" '^cp_redeploy_tenant' 0
rm -rf "$m"
printf '\n== Test 5: redeploy 403 triggers SSM-refresh path ==\n'
# cp_redeploy_tenant rc=2 signals 403 per script contract. Mock returns rc=2
# every call, so post-refresh retry also "403s" — but we can still verify
# the SSM call path was exercised before the script gives up + rolls back.
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"error":"403"}' 2 # 403 path
mock_set "$m" resolve_tenant_instance_id 'i-0455a413e993ee78c' 0
mock_set "$m" ssm_refresh_ecr_auth 'cmd-id-fake' 0
out=$(run_script "$m" --skip-rollback)
assert_contains "403 path logged" "$out" 'SSM-refreshing ECR auth'
assert_calls_contain "SSM refresh called" "$m" 'ssm_refresh_ecr_auth i-0455a413e993ee78c'
assert_calls_contain "resolve_tenant_instance_id called" "$m" 'resolve_tenant_instance_id chloe-dong'
assert_calls_count "redeploy attempted twice (first + post-refresh)" "$m" '^cp_redeploy_tenant chloe-dong ' 2
rm -rf "$m"
printf '\n== Test 6: redeploy fail + --skip-rollback → exit 4 ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '' 1 # generic failure (not 403)
out=$(run_script "$m" --skip-rollback)
assert_exit "redeploy fail + skip-rollback exits 4" "$out" 4
assert_contains "logs redeploy failure" "$out" 'redeploy failed for chloe-dong'
assert_contains "rollback skipped logged" "$out" 'rollback: skipped'
assert_not_contains "no SSM refresh on non-403 failure" "$out" 'SSM-refreshing'
rm -rf "$m"
printf '\n== Test 7: redeploy fail + rollback succeeds → exit 3 ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '' 1
out=$(run_script "$m")
assert_exit "redeploy fail with rollback exits 3" "$out" 3
assert_contains "rollback fired" "$out" 'ROLLBACK:.*latest-prev-20260512'
assert_calls_contain "rollback re-puts dest tag" "$m" 'aws_ecr_put_image latest /'
rm -rf "$m"
printf '\n== Test 8: argument validation ==\n'
set +e
out=$("$SCRIPT" 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'required:.*--source-tag'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 on missing args with usage line\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("missing-args error")
printf ' ✗ exit 64 on missing args (got %s)\n' "$rc"
fi
set +e
out=$("$SCRIPT" --source-tag x --dest-tag x --tenants y 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'must differ'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 when source==dest\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("source==dest validation")
printf ' ✗ source==dest should fail (got %s)\n' "$rc"
fi
set +e
out=$("$SCRIPT" --source-tag x --dest-tag y --tenants t --bogus-flag 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'unknown argument'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 on unknown flag\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("unknown-flag error")
printf ' ✗ unknown-flag should fail (got %s)\n' "$rc"
fi
printf '\n== Test 9: ROLLBACK_TAG follows YYYYMMDD via NOW_OVERRIDE_DATE ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{}' 0
mock_set "$m" tenant_buildinfo '{}' 0
mock_set "$m" tenant_health 'ok' 0
set +e
NOW_OVERRIDE_DATE=20260603 SSM_SETTLE_SECONDS=0 "$SCRIPT" \
--source-tag a --dest-tag b --tenants t1 --mock-dir "$m" >/dev/null 2>&1
rc=$?
set -e
if [[ $rc -eq 0 ]]; then
PASS=$((PASS + 1)); printf ' ✓ run succeeded with custom NOW_OVERRIDE_DATE\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("NOW_OVERRIDE_DATE run")
printf ' ✗ NOW_OVERRIDE_DATE run failed (rc=%s)\n' "$rc"
fi
assert_calls_contain "rollback tag uses NOW_OVERRIDE_DATE (20260603)" "$m" 'aws_ecr_put_image b-prev-20260603'
rm -rf "$m"
printf '\n== Test 10: empty source manifest fails preflight ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '' 0 # rc=0 but empty body (the "None" case)
out=$(run_script "$m")
assert_exit "empty source manifest fails preflight" "$out" 1
assert_contains "empty manifest message" "$out" 'returned empty manifest'
rm -rf "$m"
printf '\n== Test 11: tenant_buildinfo failure during verify → rollback ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"ok":true}' 0
mock_set "$m" tenant_buildinfo '' 1 # buildinfo probe fails
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "verify failure → rollback succeeds → exit 3" "$out" 3
assert_contains "logs buildinfo failure" "$out" '/buildinfo failed for chloe-dong'
assert_contains "rollback fired after verify fail" "$out" 'ROLLBACK:'
rm -rf "$m"
printf '\n== Test 12: ssm_refresh_ecr_auth JSON escaping (CWE-78 / OFFSEC-001) ==\n'
# Verify the python3 snippet in ssm_refresh_ecr_auth produces valid JSON and
# correctly escapes shell-injection characters in region + account ID fields.
# The fix replaces unquoted shell-printf interpolation with json.dumps.
PYCODE='import json,sys;r=sys.argv[1];a=sys.argv[2];ecr="aws ecr get-login-password --region "+json.dumps(r)[1:-1]+" | docker login --username AWS --password-stdin "+json.dumps(a)[1:-1]+".dkr.ecr."+json.dumps(r)[1:-1]+".amazonaws.com";print(json.dumps({"commands":[ecr]}))'
# Baseline: normal region + account
OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); assert 'commands' in d; c=d['commands'][0]; assert 'us-east-1' in c and '153263036946' in c and c.startswith('aws ecr get-login-password')" <<< "$OUT" \
&& echo " ok: normal region+account" || { echo " FAIL: invalid JSON for normal case"; exit 1; }
# Injection: region with double-quote
OUT=$(python3 -c "$PYCODE" 'us"-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
&& echo " ok: region with quote injection → valid JSON" || { echo " FAIL"; exit 1; }
# Injection: account with double-quote
OUT=$(python3 -c "$PYCODE" 'us-east-1' '15"326"3036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
&& echo " ok: account with quote injection → valid JSON" || { echo " FAIL"; exit 1; }
# No double-encoding: region appears as literal 'us-east-1' in command string
OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert 'us-east-1' in c" <<< "$OUT" \
&& echo " ok: no double-encoding in command string" || { echo " FAIL"; exit 1; }
# ─────────────────────────────────────────────────────────────────────────────
printf '\n────────────────────────────────────\n'
if [[ $FAIL -eq 0 ]]; then
printf 'All %d tests passed.\n' "$PASS"
exit 0
else
printf '%d passed, %d failed.\n' "$PASS" "$FAIL"
printf 'Failed tests:\n'
for n in "${FAIL_NAMES[@]}"; do printf ' - %s\n' "$n"; done
exit 1
fi
@@ -1,440 +0,0 @@
"""Tests for `.gitea/scripts/lint_continue_on_error_tracking.py` — Tier 2e lint.
Structural enforcement of internal#350 Tier 2e: every
`continue-on-error: true` directive in `.gitea/workflows/*.yml` must be
accompanied by a `# mc#NNNN` or `# internal#NNNN` comment within 2 lines
(above OR below), the referenced issue must be OPEN, and ≤14 days old
counted from `created_at`. Older than 14 days → fail, forces close-or-renew.
The class this lint exists to prevent: Phase-3-masked failures.
`continue-on-error: true` on platform-build had been hiding mc#664-class
regressions for ~3 weeks before #656 surfaced them. A 14-day cap forces
a tracker review cycle, preventing indefinite-mask drift.
Test classes (per `feedback_branch_count_before_approving`):
- test_coe_false_is_ignored — `continue-on-error: false`
has no tracker requirement. Exit 0.
- test_coe_true_with_open_recent_mc_passes — coe true + adjacent
`# mc#1234` comment, issue open and 5 days old. Exit 0.
- test_coe_true_with_open_recent_internal — adjacent `# internal#42`,
open, 1 day old. Exit 0.
- test_coe_true_no_comment_fails — coe true with no
nearby tracker comment. Exit 1, names the file+line and the
required tracker shape.
- test_coe_true_comment_too_far_away_fails — `# mc#1234` 5 lines
above the coe directive — outside the 2-line window. Exit 1.
- test_coe_true_closed_issue_fails — issue exists but is
`state=closed`. Exit 1, names the issue.
- test_coe_true_too_old_issue_fails — issue open but
`created_at` is 20 days ago. Exit 1, mentions the age cap.
- test_coe_true_at_14d_passes — boundary: exactly 14d
old. Inclusive. Exit 0.
- test_coe_true_at_15d_fails — boundary: 15d old.
Exclusive. Exit 1.
- test_coe_true_api_404_fails — referenced issue
doesn't exist (deleted or typo). Exit 1.
- test_coe_true_api_403_skips — token-scope issue,
graceful-degrade per Tier 2a contract: exit 0 with ::error::,
do NOT red-X every PR over auth.
- test_two_coe_true_one_violating — multi-violation
aggregation: one passes, one fails → exit 1, all violations
surfaced (not short-circuited).
- test_coe_true_with_comment_AFTER_directive — comment on the line
below the directive (within 2 lines) still satisfies. Exit 0.
- test_coe_value_quoted_string_true_caught — `continue-on-error: "true"`
parses to the string "true" via PyYAML which is truthy but NOT
boolean `True` — the lint catches the IR `True` from
`continue-on-error: true`, and also flags string `"true"` because
Gitea's evaluator coerces it.
Stubs:
- `subprocess.run` is NOT used (this lint reads only files +
HTTP); `urllib.request.urlopen` IS stubbed via monkeypatch on
the module-level `api()` to drive issue-API responses.
Run:
python3 -m pytest tests/test_lint_continue_on_error_tracking.py -v
"""
from __future__ import annotations
import importlib.util
import os
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path
from unittest import mock
import pytest
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint_continue_on_error_tracking.py"
)
def _now_iso() -> str:
return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
def _iso_days_ago(days: int) -> str:
dt = datetime.now(timezone.utc) - timedelta(days=days)
return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
def _import_lint():
spec = importlib.util.spec_from_file_location(
f"lint_coe_tracking_{os.getpid()}",
SCRIPT_PATH,
)
m = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m)
return m
@pytest.fixture()
def envset(tmp_path, monkeypatch):
wf_dir = tmp_path / ".gitea" / "workflows"
wf_dir.mkdir(parents=True)
monkeypatch.setenv("WORKFLOWS_DIR", str(wf_dir))
monkeypatch.setenv("GITEA_TOKEN", "fake-token")
monkeypatch.setenv("GITEA_HOST", "git.example.test")
monkeypatch.setenv("REPO", "owner/molecule-core")
monkeypatch.setenv("INTERNAL_REPO", "owner/internal")
monkeypatch.setenv("MAX_AGE_DAYS", "14")
return wf_dir
def _write_wf(wf_dir: Path, name: str, content: str) -> Path:
p = wf_dir / name
p.write_text(content)
return p
def _stub_issue_api(monkeypatch, lint_mod, responses: dict[str, dict]):
"""Stub the module's `fetch_issue` to drive issue lookups.
responses keyed by `"<repo-suffix>#NNN"` (e.g. `"mc#1234"`, `"internal#42"`).
Each value is either:
- a dict {"state": "open"|"closed", "created_at": "..."} — normal hit
- the string "404" — issue not found
- the string "403" — auth denied (token scope)
- the string "500" — server error
"""
def fake_fetch(slug_kind: str, num: int):
key = f"{slug_kind}#{num}"
r = responses.get(key)
if r is None:
# Tests must declare every issue they reference.
raise AssertionError(f"no test stub for {key}")
if r == "404":
return ("not_found", None)
if r == "403":
return ("forbidden", None)
if r == "500":
return ("error", None)
return ("ok", r)
monkeypatch.setattr(lint_mod, "fetch_issue", fake_fetch)
# ---------------------------------------------------------------------------
# continue-on-error: false → no tracker required
# ---------------------------------------------------------------------------
def test_coe_false_is_ignored(envset, monkeypatch, capsys):
_write_wf(
envset,
"ok.yml",
"name: ok\non: [push]\njobs:\n a:\n runs-on: x\n continue-on-error: false\n steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# coe true + adjacent OPEN recent mc# tracker → pass
# ---------------------------------------------------------------------------
def test_coe_true_with_open_recent_mc_passes(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#1234 — surfacing flaky test, fix-or-renew\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#1234": {"state": "open", "created_at": _iso_days_ago(5)}},
)
rc = m.run()
assert rc == 0
def test_coe_true_with_open_recent_internal(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true\n"
" # internal#42 — phase-3 ladder soak\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"internal#42": {"state": "open", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# coe true + no nearby tracker comment → fail
# ---------------------------------------------------------------------------
def test_coe_true_no_comment_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"bad.yml",
"name: b\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "bad.yml" in out
assert "mc#" in out.lower() or "internal#" in out.lower()
# ---------------------------------------------------------------------------
# Comment too far away — outside the 2-line window → fail
# ---------------------------------------------------------------------------
def test_coe_true_comment_too_far_away_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"far.yml",
"name: f\non: [push]\n"
"# mc#1234 — referenced too far above\n"
"jobs:\n"
" a:\n"
" runs-on: x\n"
" name: stage\n"
" timeout-minutes: 5\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#1234": {"state": "open", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# Closed issue → fail
# ---------------------------------------------------------------------------
def test_coe_true_closed_issue_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#999\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#999": {"state": "closed", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "999" in out
assert "closed" in out.lower()
# ---------------------------------------------------------------------------
# Issue is too old (>14d) → fail
# ---------------------------------------------------------------------------
def test_coe_true_too_old_issue_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(20)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "20" in out or "14" in out
def test_coe_true_at_14d_passes(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(14)}},
)
rc = m.run()
assert rc == 0
def test_coe_true_at_15d_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(15)}},
)
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# 404 (deleted/typo) → fail
# ---------------------------------------------------------------------------
def test_coe_true_api_404_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#9999\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {"mc#9999": "404"})
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# 403 (token-scope, not lint's fault) → exit 0 with ::error:: per
# Tier 2a graceful-degrade contract.
# ---------------------------------------------------------------------------
def test_coe_true_api_403_skips(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#1\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {"mc#1": "403"})
rc = m.run()
assert rc == 0
err = capsys.readouterr().err
assert "403" in err or "scope" in err.lower() or "token" in err.lower()
# ---------------------------------------------------------------------------
# Multi-violation aggregation — all surfaced, not short-circuited
# ---------------------------------------------------------------------------
def test_two_coe_true_one_violating(envset, monkeypatch, capsys):
_write_wf(
envset,
"two.yml",
"name: t\non: [push]\njobs:\n"
" good:\n"
" runs-on: x\n"
" # mc#100\n"
" continue-on-error: true\n"
" steps:\n - run: echo a\n"
" bad:\n"
" runs-on: x\n"
" continue-on-error: true\n"
" steps:\n - run: echo b\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#100": {"state": "open", "created_at": _iso_days_ago(2)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "bad" in out.lower() or "no tracker" in out.lower()
# ---------------------------------------------------------------------------
# Comment on line AFTER the directive — within 2-line window → pass
# ---------------------------------------------------------------------------
def test_coe_true_with_comment_AFTER_directive(envset, monkeypatch, capsys):
_write_wf(
envset,
"after.yml",
"name: a\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true # mc#3\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#3": {"state": "open", "created_at": _iso_days_ago(0)}},
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# Quoted string `"true"` — coerced by Gitea evaluator; should be caught
# ---------------------------------------------------------------------------
def test_coe_value_quoted_string_true_caught(envset, monkeypatch, capsys):
_write_wf(
envset,
"quoted.yml",
"name: q\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: \"true\"\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
# No tracker → fail
assert rc == 1
+1 -2
View File
@@ -434,8 +434,7 @@ func (h *MCPHandler) dispatchRPC(ctx context.Context, workspaceID string, req mc
}
default:
// Per OFFSEC-001: error message must not include user-controlled req.Method.
base.Error = &mcpRPCError{Code: -32601, Message: "method not found"}
base.Error = &mcpRPCError{Code: -32601, Message: "method not found: " + req.Method}
}
return base
@@ -9,7 +9,6 @@ import (
"net/http"
"net/http/httptest"
"os"
"strings"
"testing"
"errors"
@@ -205,9 +204,6 @@ func TestMCPHandler_NotificationsInitialized_Returns200(t *testing.T) {
// Unknown method
// ─────────────────────────────────────────────────────────────────────────────
// TestMCPHandler_UnknownMethod_Returns32601 verifies dispatchRPC returns
// -32601 for an unknown method. Per OFFSEC-001: the error message must be
// constant — req.Method is user-controlled and must NOT appear in the response.
func TestMCPHandler_UnknownMethod_Returns32601(t *testing.T) {
h, _ := newMCPHandler(t)
@@ -228,14 +224,6 @@ func TestMCPHandler_UnknownMethod_Returns32601(t *testing.T) {
if resp.Error.Code != -32601 {
t.Errorf("expected code -32601, got %d", resp.Error.Code)
}
// Message must be constant — no user-controlled method name leak.
if resp.Error.Message != "method not found" {
t.Errorf("error message should be constant 'method not found', got: %q", resp.Error.Message)
}
// Double-check the method name never appears in the message (defence-in-depth).
if strings.Contains(resp.Error.Message, "not/a/real/method") {
t.Error("error message must not echo the user-controlled method name")
}
}
// ─────────────────────────────────────────────────────────────────────────────