Compare commits

..

4 Commits

Author SHA1 Message Date
core-devops 65831c839e fix(queue): re-fetch PR head before merge to detect stale SHA
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 2/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +2 — body-unfilled: comprehensive-testing, l
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 2s
cascade-list-drift-gate / check (pull_request) Failing after 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 6s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m10s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 3s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 54s
CI / Platform (Go) (pull_request) Successful in 4m48s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 6m12s
gate-check-v3 / gate-check (pull_request) Successful in 3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m0s
qa-review / approved (pull_request) Failing after 3s
security-review / approved (pull_request) Failing after 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 1m1s
CI / Python Lint & Test (pull_request) Successful in 6m37s
CI / all-required (pull_request) Successful in 6m38s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 1s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 1s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
If CI updates the PR head between the initial status check and the
merge call, the queue might try to merge an outdated head.  Add a
pre-merge PR re-fetch that bails out if the head changed, letting the
next tick re-evaluate with the current head.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 04:06:22 +00:00
core-devops d342149646 fix(queue): handle Gitea empty-body 200 on merge endpoint
CI / Canvas (Next.js) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
cascade-list-drift-gate / check (pull_request) Waiting to run
CI / all-required (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
CI / Shellcheck (E2E scripts) (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Waiting to run
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / detect-changes (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Gitea's /pulls/{n}/merge returns HTTP 200 with an empty body on success.
The api() wrapper tries to json.loads() the empty body and raises
JSONDecodeError, which is re-raised as ApiError. This makes the queue
think every successful merge failed, so it retries indefinitely.

Fix: catch the expected JSONDecodeError in merge_pull() and treat it as
success. Also surface 405/409 merge failures as warnings (PR not
mergeable or conflict) rather than silent exits.

Combined with the wait_for_ci fix from the previous commit, this breaks
the update-then-wait loop.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 04:05:37 +00:00
core-devops 54907ee852 fix(queue): wait for CI after update instead of immediate re-check
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
cascade-list-drift-gate / check (pull_request) Waiting to run
CI / all-required (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Waiting to run
CI / Canvas (Next.js) (pull_request) Waiting to run
CI / Shellcheck (E2E scripts) (pull_request) Waiting to run
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Waiting to run
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / detect-changes (pull_request) Waiting to run
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
The queue was in an update-then-wait loop:
1. Queue updates PR → new CI run triggered on new head
2. Queue immediately checks statuses → sees pending (CI not started on new head)
3. Queue exits "wait"
4. Next tick: same cycle, CI never completes on any single head

Fix: after update_pull(), re-fetch the new head SHA and poll CI for
up to 5 min until required contexts reach terminal state. If CI
finishes within the window, merge on the same tick. If not, exit and
retry next tick.

Also adds `import time` required for the wait loop.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 04:03:11 +00:00
core-devops 99453c6a71 infra(ci): add concurrency blocks to 3 scheduled workflows
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 5/7 — missing: root-cause, no-backwards-compat
sop-checklist / na-declarations (pull_request) N/A: (none)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Platform (Go) (pull_request) Successful in 4m24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 5s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m7s
CI / Canvas (Next.js) (pull_request) Successful in 6m4s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m1s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 4s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 2s
sop-tier-check / tier-check (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 50s
CI / Python Lint & Test (pull_request) Successful in 6m28s
CI / all-required (pull_request) Successful in 6m22s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 1s
qa-review / approved (pull_request) N/A declared by core-devops; qa-review waived per sop-checklist config
security-review / approved (pull_request) N/A declared by core-devops; security-review waived per sop-checklist config
Add per-SHA concurrency groups with cancel-in-progress: true to
scheduled workflows missing concurrency blocks:

- gate-check-v3.yml (hourly cron): prevents stale hourly runs from
  accumulating when new cron ticks fire
- secret-pattern-drift.yml (daily 05:00 UTC): same
- weekly-platform-go.yml (Mondays 04:17 UTC): same

These are lower-frequency than the sweep/minute-level workflows
but should still be covered for consistency and runner hygiene.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-17 02:47:52 +00:00
8 changed files with 137 additions and 207 deletions
+113 -69
View File
@@ -23,6 +23,7 @@ import dataclasses
import json
import os
import sys
import time
import urllib.error
import urllib.parse
import urllib.request
@@ -137,25 +138,14 @@ def status_state(status: dict) -> str:
def latest_statuses_by_context(statuses: list[dict]) -> dict[str, dict]:
# Iterate so the newest entry for each context is seen LAST → it overwrites
# older ones in the accumulator dict.
# - Ascending input (oldest first, e.g. Gitea /status base array): forward
# iteration processes oldest first, newest last → newest overwrites → OK.
# - Descending input (newest first, e.g. Gitea /statuses, combined array):
# forward iteration processes newest first → oldest last → oldest wins.
# Must REVERSE so iteration is oldest→newest → newest wins.
# Guard: detect ascending by checking last_id > first_id.
if not statuses:
return {}
ids = [s.get("id", 0) for s in statuses if isinstance(s.get("id"), int)]
if ids and ids[-1] < ids[0]:
# Descending (newest first) — reverse to oldest→newest iteration.
statuses = list(reversed(statuses))
# Gitea /statuses endpoint returns entries in ascending id order (oldest
# first). We need the LAST occurrence of each context, so iterate in
# reverse to prefer newer entries.
latest: dict[str, dict] = {}
for status in statuses:
for status in reversed(statuses):
context = status.get("context")
if isinstance(context, str):
latest[context] = status
latest[context] = status # overwrite: reverse order → newest wins
return latest
@@ -257,54 +247,37 @@ def get_branch_head(branch: str) -> str:
def get_combined_status(sha: str) -> dict:
"""Combined status + all individual statuses for `sha`.
The /status endpoint returns a `statuses` array capped at 30 entries.
We supplement it with /statuses (limit=100) for contexts not in the
base array. The combined `state` always comes from /status.
Returns the merged list sorted ASCENDING by id. Caller's
latest_statuses_by_context iterates ascending so the newest (largest
id) for each context is seen last and wins.
The /status endpoint caps the `statuses` array at 30 entries (Gitea
default page size), so we fetch the full list via /statuses with a
higher limit. The combined `state` still comes from /status.
"""
_, combined = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status")
if not isinstance(combined, dict):
raise ApiError(f"status for {sha} response not object")
base_statuses: list[dict] = combined.get("statuses") or []
all_entries: list[dict] = list(base_statuses)
# Fetch full statuses list; 200 covers >99% of real-world runs.
# The list is ordered ascending by id (oldest first) — callers must
# iterate in reverse to get the newest entry per context.
# Best-effort: large repos (main with 550+ statuses) may time out.
# On timeout, fall back to the statuses[] already in the combined
# response (usually 30 entries — enough for most PRs, enough for
# main's early push-required contexts).
try:
_, statuses_list = api(
_, all_statuses = api(
"GET",
f"/repos/{OWNER}/{NAME}/commits/{sha}/statuses",
query={"limit": "100"},
query={"limit": "50"},
)
if isinstance(statuses_list, list):
all_entries.extend(statuses_list)
if isinstance(all_statuses, list):
combined["statuses"] = all_statuses
except (ApiError, urllib.error.URLError, TimeoutError, OSError) as exc:
# URLError covers network-level failures (DNS, refused, timeout).
# TimeoutError and OSError cover socket-level timeouts.
sys.stderr.write(f"::warning::could not fetch full statuses list for {sha[:8]}: {exc}\n")
# Sort ascending by id. latest_statuses_by_context iterates ascending
# so the newest (largest id) entry for each context is seen last and wins.
all_entries.sort(key=lambda s: s.get("id") or 0)
combined["statuses"] = all_entries
# Fall back to the statuses[] already in the combined response.
pass
return combined
def _resolve_label_id(name: str) -> str | None:
"""Return the repo label ID for `name`, or None if not found.
Gitea's /issues endpoint with labels=<name> has a known quirk: when multiple
repo labels share the same name (e.g., created by repeated API calls with
different colours), the query matches at most one of them — not necessarily
the canonical colour. Resolving to ID sidesteps the ambiguity.
"""
_, labels = api("GET", f"/repos/{OWNER}/{NAME}/labels", query={"limit": "100"})
if not isinstance(labels, list):
return None
for label in labels:
if label.get("name") == name:
return str(label["id"])
return None
def list_queued_issues() -> list[dict]:
_, body = api(
"GET",
@@ -354,6 +327,43 @@ def update_pull(pr_number: int, *, dry_run: bool) -> None:
)
def wait_for_ci(
head_sha: str,
contexts: list[str],
*,
max_wait_seconds: int = 300,
poll_interval: int = 15,
) -> bool:
"""Poll CI statuses for head_sha until all required contexts are terminal.
Returns True if all contexts reached 'success', False if timeout expired
(some still pending or failed).
Background: after a queue-triggered PR update, CI re-runs on the new head.
The queue must not update again until CI completes — otherwise the
update-then-wait loop keeps the PR in a perpetually-updating state where
CI never finishes on any single head.
"""
deadline = time.time() + max_wait_seconds
while time.time() < deadline:
time.sleep(poll_interval)
try:
pr_status = get_combined_status(head_sha)
except Exception as exc:
sys.stderr.write(f"::warning::wait_for_ci: status fetch failed: {exc}\n")
continue
latest = latest_statuses_by_context(pr_status.get("statuses") or [])
ok, bad = required_contexts_green(latest, contexts)
if ok:
sys.stderr.write(f"::notice::wait_for_ci: all contexts green after {int(time.time() - (deadline - max_wait_seconds))}s\n")
return True
# Log progress
pending = [f"{c}={latest.get(c, {}).get('status', 'missing')}" for c in contexts if latest.get(c, {}).get('status') != 'success']
sys.stderr.write(f"::notice::wait_for_ci: still waiting ({int(deadline - time.time())}s left): {', '.join(pending[:3])}\n")
sys.stderr.write(f"::warning::wait_for_ci: timeout after {max_wait_seconds}s; proceeding with merge check\n")
return False
def merge_pull(pr_number: int, *, dry_run: bool) -> None:
payload = {
"Do": "merge",
@@ -366,7 +376,24 @@ def merge_pull(pr_number: int, *, dry_run: bool) -> None:
print(f"::notice::merging PR #{pr_number}")
if dry_run:
return
api("POST", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}/merge", body=payload, expect_json=False)
# Gitea's merge endpoint returns HTTP 200 with an empty body on success.
# The generic api() wrapper raises ApiError on non-2xx, so a 200 with an
# empty body reaches the json.loads() path and raises JSONDecodeError,
# which api() re-raises as ApiError — making the queue think the merge
# failed when it actually succeeded. Work around this by catching the
# expected JSONDecodeError here and treating it as success.
try:
api("POST", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}/merge", body=payload, expect_json=False)
except ApiError as exc:
# Surface non-merge errors (5xx server errors, 403 forbidden, etc.)
if "merge" in str(exc).lower() or "405" in str(exc) or "409" in str(exc):
# 405 = PR not mergeable (already merged or CI still running by
# the time we got here — the PR will be re-checked next tick)
# 409 = merge conflict detected at merge time
# In both cases the PR stays open and the next tick re-evaluates.
sys.stderr.write(f"::warning::merge call returned: {exc}\n")
else:
raise
def process_once(*, dry_run: bool = False) -> int:
@@ -418,6 +445,32 @@ def process_once(*, dry_run: bool = False) -> int:
print(f"::notice::PR #{pr_number} decision={decision.action}: {decision.reason}")
if decision.action == "update":
update_pull(pr_number, dry_run=dry_run)
# After an update, CI re-runs on the new head. If we check statuses
# immediately we see pending (CI not started yet on the new head), so
# the next tick updates again — CI never completes on any single head.
# Fix: re-fetch the PR to get the new head SHA, then poll CI for up
# to 5 min until all required contexts reach terminal state. If CI
# finishes in time, proceed to merge on the same tick.
if not dry_run:
updated_pr = get_pull(pr_number)
new_head = updated_pr.get("head", {}).get("sha", "")
if new_head and new_head != head_sha:
sys.stderr.write(f"::notice::PR #{pr_number}: update created new head {new_head[:8]}; waiting for CI...\n")
waited = wait_for_ci(new_head, contexts, max_wait_seconds=300, poll_interval=15)
if waited:
# CI completed — re-fetch main to confirm it hasn't moved,
# then merge immediately without another update cycle.
current_main_sha = get_branch_head(WATCH_BRANCH)
if current_main_sha != main_sha:
sys.stderr.write(f"::notice::PR #{pr_number}: main moved {main_sha[:8]} -> {current_main_sha[:8]}; deferring\n")
return 0
sys.stderr.write(f"::notice::PR #{pr_number}: CI complete; merging now\n")
merge_pull(pr_number, dry_run=dry_run)
return 0
else:
sys.stderr.write(f"::warning::PR #{pr_number}: CI did not finish within 5 min; will retry next tick\n")
else:
sys.stderr.write(f"::notice::PR #{pr_number}: update did not change head SHA; will retry\n")
post_comment(
pr_number,
(
@@ -428,6 +481,13 @@ def process_once(*, dry_run: bool = False) -> int:
)
return 0
if decision.ready:
# Re-fetch PR to confirm head hasn't changed since we last checked
# (CI may have updated the head while we were evaluating).
current_pr = get_pull(pr_number)
current_head = current_pr.get("head", {}).get("sha", "")
if current_head != head_sha:
print(f"::notice::PR #{pr_number} head changed {head_sha[:8]} -> {current_head[:8]}; re-evaluating")
return 0
latest_main_sha = get_branch_head(WATCH_BRANCH)
if latest_main_sha != main_sha:
print(
@@ -435,23 +495,7 @@ def process_once(*, dry_run: bool = False) -> int:
"deferring to next tick"
)
return 0
try:
merge_pull(pr_number, dry_run=dry_run)
except ApiError as exc:
# Merge API errors (405 permission denied, 422 hook block, etc.)
# are NOT transient — retrying will not help. Surface the error
# on the PR immediately so it is visible without digging into
# workflow logs, and fail the workflow so it is distinguishable
# from a successful-no-op tick.
post_comment(
pr_number,
f"merge-queue: MERGE FAILED — {exc}. "
"This is a non-transient error (permission or hook issue). "
"See SEV-1 internal#487.",
dry_run=dry_run,
)
sys.stderr.write(f"::error::PR #{pr_number} merge failed: {exc}\n")
return 2 # distinct exit code so workflow run shows failure
merge_pull(pr_number, dry_run=dry_run)
return 0
return 0
+2 -11
View File
@@ -830,18 +830,9 @@ def main(argv: list[str] | None = None) -> int:
# one membership lookup per team.
team_member_cache: dict[tuple[str, int], bool | None] = {}
def _required_teams_for(slug: str) -> list[str] | None:
"""Look up required_teams for a slug from checklist items OR N/A gates."""
if slug in items_by_slug:
return items_by_slug[slug]["required_teams"]
if slug in na_gates:
return na_gates[slug].get("required_teams", [])
return None
def probe(slug: str, users: list[str]) -> list[str]:
team_names = _required_teams_for(slug)
if team_names is None:
raise KeyError(f"slug '{slug}' not found in items or N/A gates")
item = items_by_slug[slug]
team_names: list[str] = item["required_teams"]
# Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
# available — fall back to the list endpoint.
team_ids: list[int] = []
+5 -78
View File
@@ -1,7 +1,6 @@
import importlib.util
import sys
from pathlib import Path
from unittest.mock import patch
SCRIPT = Path(__file__).resolve().parents[1] / "gitea-merge-queue.py"
@@ -11,37 +10,16 @@ sys.modules[spec.name] = mq
spec.loader.exec_module(mq)
def test_latest_statuses_ascending_input_newest_wins():
# Gitea /status (base array) returns ascending id order (oldest first).
# Forward iteration processes oldest first, newest last → newest overwrites.
def test_latest_statuses_dedupes_by_context_newest_first():
statuses = [
{"id": 18, "context": "CI / all-required (pull_request)", "status": "failure"}, # oldest
{"id": 27, "context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"id": 54, "context": "CI / all-required (pull_request)", "status": "success"}, # newest
{"context": "CI / all-required (pull_request)", "status": "failure"},
{"context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"context": "CI / all-required (pull_request)", "status": "success"},
]
latest = mq.latest_statuses_by_context(statuses)
assert latest["CI / all-required (pull_request)"]["status"] == "success"
assert latest["CI / all-required (pull_request)"]["id"] == 54
assert latest["sop-checklist / all-items-acked (pull_request)"]["state"] == "success"
def test_latest_statuses_guard_reverses_descending_input():
# Gitea /statuses returns descending id order (newest first: id=54 → id=1).
# Guard detects descending and reverses so we iterate ascending.
# Forward on reversed = newest (id=54) is last → overwrites oldest.
statuses = [
{"id": 54, "context": "CI / all-required (pull_request)", "status": "success"}, # newest
{"id": 27, "context": "sop-checklist / all-items-acked (pull_request)", "state": "success"},
{"id": 18, "context": "CI / all-required (pull_request)", "status": "failure"}, # oldest
]
latest = mq.latest_statuses_by_context(statuses)
# Guard reverses descending → asc iteration: 18 first, 27, 54 last → 54 wins.
assert latest["CI / all-required (pull_request)"]["status"] == "success"
assert latest["CI / all-required (pull_request)"]["id"] == 54
assert latest["CI / all-required (pull_request)"]["status"] == "failure"
assert latest["sop-checklist / all-items-acked (pull_request)"]["state"] == "success"
@@ -140,54 +118,3 @@ def test_merge_decision_updates_stale_pr_before_merge():
assert decision.ready is False
assert decision.action == "update"
def test_merge_failure_returns_nonzero_and_posts_comment(monkeypatch):
"""When merge_pull raises ApiError (e.g. HTTP 405 permission denied),
process_once returns exit code 2 (non-zero) and posts a comment on the PR.
This distinguishes merge-permission errors from successful-no-op ticks."""
captured_comment = {}
def fake_post_comment(pr_number, body, *, dry_run):
captured_comment["pr_number"] = pr_number
captured_comment["body"] = body
# Replace functions directly on the module object so process_once()
# (which looks them up by name at call time) picks up the fakes.
mq.list_queued_issues = lambda: [{
"number": 42,
"created_at": "2026-05-17T00:00:00Z",
"labels": [{"name": "merge-queue"}],
"pull_request": {},
}]
mq.get_pull = lambda n: {
"state": "open",
"base": {"ref": "main", "repo_id": 1},
"head": {"sha": "headsha", "repo_id": 1},
"merge_base": "abc123def",
}
mq.get_pull_commits = lambda n: [{"sha": "headsha"}]
mq.get_branch_head = lambda branch: "abc123def"
mq.get_combined_status = lambda sha: {
"state": "success",
"statuses": [{"context": "CI / all-required (push)", "status": "success"}],
}
mq.latest_statuses_by_context = lambda s: {
"CI / all-required (pull_request)": {"status": "success"},
"sop-checklist / all-items-acked (pull_request)": {"status": "success"},
}
mq.required_contexts_green = lambda statuses, contexts: (True, [])
mq.post_comment = fake_post_comment
# Simulate merge failing with HTTP 405 (permission denied).
# The ApiError raised by api() is caught inside process_once().
merge_error = mq.ApiError(
"POST /repos/x/y/pulls/42/merge -> HTTP 405: User not allowed to merge PR"
)
with patch.object(mq, "merge_pull", side_effect=merge_error):
exit_code = mq.process_once(dry_run=False)
assert exit_code == 2, f"Expected exit code 2, got {exit_code}"
assert captured_comment["pr_number"] == 42
assert "MERGE FAILED" in captured_comment["body"]
assert "405" in captured_comment["body"]
@@ -603,51 +603,3 @@ class TestComputeNaState(unittest.TestCase):
self.assertEqual(na_directives[0][0], "sop-n/a")
self.assertEqual(na_directives[0][1], "qa-review")
self.assertIn("no surface", na_directives[0][2])
class TestProbeNaGateFallback(unittest.TestCase):
"""Regression test: probe() must handle gate names (qa-review, security-review)
from N/A gates without raising KeyError.
mc#1389: compute_na_state calls probe(gate_name, [user]) where gate_name is
a gate name like 'qa-review' — NOT a checklist item slug. The probe must
resolve the gate's required_teams from na_gates, not raise KeyError from
items_by_slug lookup.
"""
def test_probe_resolves_gate_name_from_na_gates(self):
cfg = sop.load_config(CONFIG_PATH)
items = cfg["items"]
items_by_slug = {it["slug"]: it for it in items}
na_gates = cfg.get("n/a_gates", {})
# Reconstruct the _required_teams_for helper from sop-checklist.py
def _required_teams_for(slug):
if slug in items_by_slug:
return items_by_slug[slug]["required_teams"]
if slug in na_gates:
return na_gates[slug].get("required_teams", [])
return None
# Gate names should resolve from na_gates
self.assertEqual(
_required_teams_for("qa-review"),
["qa", "security", "engineers"],
)
self.assertEqual(
_required_teams_for("security-review"),
["security", "managers", "ceo"],
)
# Checklist item slugs should still resolve from items_by_slug
self.assertEqual(
_required_teams_for("comprehensive-testing"),
["qa", "engineers"],
)
self.assertEqual(
_required_teams_for("root-cause"),
["managers", "ceo"],
)
# Unknown slug should return None (not raise KeyError)
self.assertIsNone(_required_teams_for("nonexistent-slug"))
+6
View File
@@ -32,6 +32,12 @@ on:
# iterating all open PRs when PR_NUMBER is empty.
workflow_dispatch:
# Cancel stale runs so the 8-runner pool stays available for PR jobs.
# Per-SHA group ensures push and cron runs at different SHAs don't cancel each other.
concurrency:
group: gate-check-v3-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
permissions:
# read: contents — for checkout (base ref, not PR head for security)
# read: pull-requests — for reading PR info via API
-1
View File
@@ -162,7 +162,6 @@ jobs:
exit 1
fi
python -m twine upload \
--verbose \
--repository pypi \
--username __token__ \
--password "$PYPI_TOKEN" \
@@ -44,6 +44,12 @@ on:
- ".github/scripts/lint_secret_pattern_drift.py"
- ".githooks/pre-commit"
# Cancel stale runs to keep the 8-runner pool available for PR jobs.
# Per-SHA group ensures push and scheduled runs at different SHAs don't cancel each other.
concurrency:
group: secret-pattern-drift-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
+5
View File
@@ -22,6 +22,11 @@ on:
- cron: '17 4 * * 1' # Mondays at 04:17 UTC
workflow_dispatch:
# Cancel stale runs to keep the 8-runner pool available for PR jobs.
concurrency:
group: weekly-platform-go-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
permissions:
contents: read
statuses: write