Compare commits

..

2 Commits

Author SHA1 Message Date
core-fe 44205b6974 chore: re-trigger CI after jq install fix (PR #410)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 12s
CI / Detect changes (pull_request) Successful in 27s
E2E API Smoke Test / detect-changes (pull_request) Successful in 27s
Harness Replays / detect-changes (pull_request) Failing after 12s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 26s
Harness Replays / Harness Replays (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 23s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 20s
sop-tier-check / tier-check (pull_request) Successful in 15s
CI / Platform (Go) (pull_request) Successful in 12s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 11s
audit-force-merge / audit (pull_request) Successful in 29s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9m44s
CI / Canvas (Next.js) (pull_request) Failing after 11m49s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
2026-05-11 09:31:41 +00:00
core-fe 53ff06c722 fix(canvas/a11y): add accessible name to ConsoleModal + DeleteCascadeConfirmDialog backdrops
ConsoleModal backdrop: aria-hidden removed, aria-label="Close terminal" added.
DeleteCascadeConfirmDialog backdrop: aria-hidden removed, aria-label="Dismiss dialog" added.
cursor-pointer also added to both for consistent pointer affordance.

Matches the same fix applied to KeyboardShortcutsDialog (PR #299) and
ConfirmDialog (PR #398) for WCAG 2.4.6 compliance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 09:31:41 +00:00
130 changed files with 688 additions and 14146 deletions
-40
View File
@@ -1,40 +0,0 @@
#!/usr/bin/env python3
"""Extract changed-file list from Gitea Compare API JSON response.
Gitea Compare API returns changed files nested inside commits, not at the
top level:
{"commits": [{"files": [{"filename": "path/to/file"}]}]}
Usage:
compare-api-diff-files.py < API_RESPONSE.json
Exits 0 with filenames on stdout, one per line.
Exits 1 on malformed input (caller should handle as "no files").
"""
from __future__ import annotations
import sys
import json
def main() -> None:
try:
data = json.load(sys.stdin)
except Exception:
sys.exit(1)
filenames: list[str] = []
for commit in data.get("commits", []):
for f in commit.get("files", []):
fn = f.get("filename", "")
if fn:
filenames.append(fn)
if filenames:
sys.stdout.write("\n".join(filenames))
sys.stdout.write("\n")
# else: empty stdout = no files, caller treats as empty list
if __name__ == "__main__":
main()
-42
View File
@@ -1,42 +0,0 @@
#!/usr/bin/env python3
"""Extract changed-file list from a Gitea push event's commits JSON array.
Each commit in a push event has `added`, `removed`, and `modified` file lists.
This script aggregates all of them and prints unique filenames one per line.
Usage:
push-commits-diff-files.py < COMMITS_JSON
Exits 0 always (caller handles empty output as "no files").
"""
from __future__ import annotations
import sys
import json
def main() -> None:
try:
data = json.load(sys.stdin)
except Exception:
sys.exit(0) # Don't fail the step — treat malformed JSON as empty
if not isinstance(data, list):
sys.exit(0)
files: set[str] = set()
for commit in data:
if not isinstance(commit, dict):
continue
for key in ("added", "removed", "modified"):
for f in commit.get(key) or []:
if isinstance(f, str) and f:
files.add(f)
if files:
sys.stdout.write("\n".join(sorted(files)))
sys.stdout.write("\n")
if __name__ == "__main__":
main()
-203
View File
@@ -1,203 +0,0 @@
#!/usr/bin/env bash
# review-check — evaluate whether a PR satisfies a single team-review gate.
#
# RFC#324 Step 1 of 5 — qa-review + security-review check workflows.
#
# This is the shared evaluator invoked by:
# .gitea/workflows/qa-review.yml (TEAM=qa, TEAM_ID=20)
# .gitea/workflows/security-review.yml (TEAM=security, TEAM_ID=21)
#
# Pass condition (per RFC#324 v1.1 addendum):
# ≥ 1 review on the PR where:
# • state == APPROVED
# • review.dismissed == false
# • review.user.login != PR.user.login (non-author)
# • review.user.login ∈ team-members
#
# Strict mode (default OFF for v1; see RFC trade-off note):
# If REVIEW_CHECK_STRICT=1, additionally require review.commit_id == PR.head.sha.
# With dismiss_stale_reviews: true at the protection layer, stale reviews
# are already dismissed, so the additional commit_id check is belt-and-
# suspenders. Keeping it off in v1 simplifies semantics; flip in a follow-up
# PR if reviewer telemetry shows residual stale-APPROVE merges.
#
# Privilege gate (RFC#324 v1.3 §A1.1 — INFORMATIONAL ONLY):
# The /qa-recheck and /security-recheck slash-commands can be triggered
# by anyone who can comment on the PR. The workflow's privilege step
# logs collaborator-status but does NOT gate execution of this script.
# Why this is safe: this evaluator is read-only and idempotent —
# reading `pulls/{N}/reviews` and `teams/{id}/members/{u}` can't be
# influenced by who triggered the run. If a real team-member APPROVE
# exists the gate flips green; otherwise it stays red. A
# non-collaborator commenting /qa-recheck cannot manufacture a green
# gate. Original (v1.2) design with `if:`-gating of this step was
# fail-open (skipped-via-`if:` job still publishes the status as
# `success`) — corrected in v1.3 per hongming-pc review 1421.
#
# Trust boundary (RFC A4):
# This script is loaded from the BASE branch (sourced via .gitea/scripts/
# on the workflow's checkout-of-base). It does NOT execute any PR-HEAD
# code. It only reads PR review state via the Gitea API.
#
# Token scope (RFC A1-α):
# The job's own conclusion (exit 0 / exit 1) is what publishes the
# `qa-review / approved` / `security-review / approved` status context.
# NO `POST /statuses` call here → NO `write:repository` scope on the
# token. `read:organization` (for team-membership probe) and
# `read:repository` (for PR + reviews) are enough.
#
# Required env:
# GITEA_TOKEN — least-priv read:repository + read:organization. See note
# below about the team-membership API requiring the token
# owner to be in the queried team (Gitea 1.22.6 quirk).
# GITEA_HOST — e.g. git.moleculesai.app
# REPO — owner/name (from github.repository)
# PR_NUMBER — int (from github.event.pull_request.number or
# github.event.issue.number for issue_comment events)
# TEAM — short team name (qa | security) for log lines
# TEAM_ID — Gitea team id (20=qa, 21=security at time of writing)
#
# Optional:
# REVIEW_CHECK_DEBUG=1 — per-API-call diagnostic lines
# REVIEW_CHECK_STRICT=1 — also require review.commit_id == pr.head.sha
set -euo pipefail
# jq is required for JSON parsing. It is pre-baked into the runner-base
# image (per RFC#268 workflow-smoke), so the only reason we'd not find it
# is a broken runner. The previous fallback dance (apt-get + curl to
# /usr/local/bin/jq) cannot succeed on a uid-1001 rootless runner
# (#391/#402 + feedback_ci_runner_install_needs_writable_path), so it's
# dropped. Fail loud with a clear diagnostic rather than attempt an
# install that physically cannot work.
if ! command -v jq >/dev/null 2>&1; then
echo "::error::jq missing from runner-base image — bake it into the runner image (see RFC#268 workflow-smoke / feedback_ci_runner_install_needs_writable_path). This evaluator cannot run without jq."
exit 1
fi
: "${GITEA_TOKEN:?GITEA_TOKEN required}"
: "${GITEA_HOST:?GITEA_HOST required}"
: "${REPO:?REPO required (owner/name)}"
: "${PR_NUMBER:?PR_NUMBER required}"
: "${TEAM:?TEAM required (qa|security)}"
: "${TEAM_ID:?TEAM_ID required (integer)}"
OWNER="${REPO%%/*}"
NAME="${REPO##*/}"
API="https://${GITEA_HOST}/api/v1"
# Token-in-argv fix (#541): write the Authorization header to a mode-600
# temp file instead of passing it via curl -H "$AUTH" (which puts the
# secret token value in the process table for any process to read via
# /proc/<pid>/cmdline or ps -ef). The curl config file is read by curl
# itself and never appears in the argv of the curl subprocess.
CURL_AUTH_FILE=$(mktemp -p /tmp curl-auth.XXXXXX)
chmod 600 "$CURL_AUTH_FILE"
printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$CURL_AUTH_FILE"
# Pre-create temp files so cleanup trap can reference them by name
# (bash trap 'function' EXIT expands variables at trap-fire time, not def time).
PR_JSON=$(mktemp)
REVIEWS_JSON=$(mktemp)
TEAM_PROBE_TMP=$(mktemp)
cleanup() {
rm -f "$CURL_AUTH_FILE" "$PR_JSON" "$REVIEWS_JSON" "$TEAM_PROBE_TMP"
}
trap cleanup EXIT
debug() {
if [ "${REVIEW_CHECK_DEBUG:-}" = "1" ]; then
echo " [debug] $*" >&2
fi
}
echo "::notice::${TEAM}-review evaluating repo=${OWNER}/${NAME} pr=${PR_NUMBER} team_id=${TEAM_ID}"
# --- Fetch the PR (for author + head.sha) ---
HTTP_CODE=$(curl -sS -o "$PR_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}")
if [ "$HTTP_CODE" != "200" ]; then
echo "::error::GET /pulls/${PR_NUMBER} returned HTTP ${HTTP_CODE} (token scope?)"
cat "$PR_JSON" >&2
exit 1
fi
PR_AUTHOR=$(jq -r '.user.login // ""' "$PR_JSON")
PR_HEAD_SHA=$(jq -r '.head.sha // ""' "$PR_JSON")
PR_STATE=$(jq -r '.state // ""' "$PR_JSON")
debug "pr_author=${PR_AUTHOR} pr_head=${PR_HEAD_SHA:0:7} pr_state=${PR_STATE}"
if [ "$PR_STATE" != "open" ]; then
echo "::notice::PR ${PR_NUMBER} is ${PR_STATE} — exiting 0 (closed PRs do not gate)"
exit 0
fi
if [ -z "$PR_AUTHOR" ] || [ -z "$PR_HEAD_SHA" ]; then
echo "::error::PR ${PR_NUMBER} missing user.login or head.sha — webhook payload malformed"
exit 1
fi
# --- Fetch all reviews on the PR ---
HTTP_CODE=$(curl -sS -o "$REVIEWS_JSON" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}/reviews")
if [ "$HTTP_CODE" != "200" ]; then
echo "::error::GET /pulls/${PR_NUMBER}/reviews returned HTTP ${HTTP_CODE}"
cat "$REVIEWS_JSON" >&2
exit 1
fi
# Filter: state=APPROVED, not-dismissed, non-author. Optionally strict-mode
# adds commit_id==head.sha (off by default; see header).
JQ_FILTER='.[]
| select(.state == "APPROVED")
| select(.dismissed != true)
| select(.user.login != $author)'
if [ "${REVIEW_CHECK_STRICT:-}" = "1" ]; then
JQ_FILTER="${JQ_FILTER}
| select(.commit_id == \$head)"
fi
JQ_FILTER="${JQ_FILTER}
| .user.login"
CANDIDATES=$(jq -r --arg author "$PR_AUTHOR" --arg head "$PR_HEAD_SHA" "$JQ_FILTER" "$REVIEWS_JSON" | sort -u)
debug "candidate non-author approvers: $(echo "$CANDIDATES" | tr '\n' ' ')"
if [ -z "$CANDIDATES" ]; then
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (no candidates yet)"
exit 1
fi
# --- Probe team membership per candidate ---
# Endpoint: GET /api/v1/teams/{id}/members/{username}
# 200/204 → is member
# 403 → token owner is not in this team (Gitea 1.22.6 'Must be a team
# member' constraint — see follow-up issue for token-provisioning)
# 404 → not a member
for U in $CANDIDATES; do
CODE=$(curl -sS -o "$TEAM_PROBE_TMP" -w '%{http_code}' \
-K "$CURL_AUTH_FILE" "${API}/teams/${TEAM_ID}/members/${U}")
debug "probe ${U} in team ${TEAM} (id=${TEAM_ID}) → HTTP ${CODE}"
case "$CODE" in
200|204)
echo "::notice::${TEAM}-review APPROVED by ${U} (team=${TEAM})"
exit 0
;;
403)
# Token owner is not in the team being probed; the API refuses to
# confirm membership. This is the RFC#324 follow-up token-scope gap.
# Fail closed — never grant approval on a 403; surface clearly.
echo "::error::team-probe for ${U} in ${TEAM} returned 403 (token owner not in ${TEAM} team — RFC#324 token-scope follow-up). Cannot confirm membership; failing closed."
cat "$TEAM_PROBE_TMP" >&2
exit 1
;;
404)
debug "${U} not a member of ${TEAM}"
;;
*)
echo "::warning::team-probe for ${U} in ${TEAM} returned unexpected HTTP ${CODE}"
cat "$TEAM_PROBE_TMP" >&2
;;
esac
done
echo "::error::${TEAM}-review awaiting non-author APPROVE from ${TEAM} team (candidates: $(echo "$CANDIDATES" | tr '\n' ',' | sed 's/,$//') — none are in team)"
exit 1
-172
View File
@@ -1,172 +0,0 @@
#!/usr/bin/env bash
# sop-tier-refire — re-evaluate sop-tier-check and POST status to PR head SHA.
#
# Invoked from `.gitea/workflows/sop-tier-refire.yml` when a repo
# MEMBER/OWNER/COLLABORATOR comments `/refire-tier-check` on a PR.
#
# Behavior:
#
# 1. Resolve PR head SHA + author from PR_NUMBER.
# 2. Rate-limit: if the sop-tier-check context has been POSTed in the
# last 30 seconds, skip (prevents comment-spam status thrash).
# 3. Invoke `.gitea/scripts/sop-tier-check.sh` with the same env the
# canonical workflow provides. This is DRY: we re-use the exact AND-
# composition gate logic, not a watered-down approving-count check.
# 4. POST the resulting status (success on exit 0, failure on non-zero)
# to `/repos/.../statuses/{HEAD_SHA}` with context
# "sop-tier-check / tier-check (pull_request)" — the same context name
# branch protection requires.
#
# Required env (set by sop-tier-refire.yml):
# GITEA_TOKEN — org-level SOP_TIER_CHECK_TOKEN (read:org/user/issue/repo)
# GITEA_HOST — e.g. git.moleculesai.app
# REPO — owner/name
# PR_NUMBER — PR number from issue_comment payload
# COMMENT_AUTHOR — login of the commenter (logged for audit)
#
# Optional:
# SOP_DEBUG=1 — verbose per-API-call diagnostics
# SOP_REFIRE_RATE_LIMIT_SEC — override the 30s rate-limit (default 30)
# SOP_REFIRE_DISABLE_RATE_LIMIT=1 — for tests; skips the rate-limit check
set -euo pipefail
debug() {
if [ "${SOP_DEBUG:-}" = "1" ]; then
echo " [debug] $*" >&2
fi
}
: "${GITEA_TOKEN:?GITEA_TOKEN required}"
: "${GITEA_HOST:?GITEA_HOST required}"
: "${REPO:?REPO required (owner/name)}"
: "${PR_NUMBER:?PR_NUMBER required}"
: "${COMMENT_AUTHOR:=unknown}"
OWNER="${REPO%%/*}"
NAME="${REPO##*/}"
API="https://${GITEA_HOST}/api/v1"
AUTH="Authorization: token ${GITEA_TOKEN}"
CONTEXT="sop-tier-check / tier-check (pull_request)"
RATE_LIMIT_SEC="${SOP_REFIRE_RATE_LIMIT_SEC:-30}"
echo "::notice::sop-tier-refire start: repo=$OWNER/$NAME pr=$PR_NUMBER commenter=$COMMENT_AUTHOR"
# 1. Fetch PR details — need head.sha and user.login.
PR_FILE=$(mktemp)
trap 'rm -f "$PR_FILE"' EXIT
PR_HTTP=$(curl -sS -o "$PR_FILE" -w '%{http_code}' -H "$AUTH" \
"${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}")
if [ "$PR_HTTP" != "200" ]; then
echo "::error::GET /pulls/$PR_NUMBER returned HTTP $PR_HTTP (body $(head -c 200 "$PR_FILE"))"
exit 1
fi
HEAD_SHA=$(jq -r '.head.sha' <"$PR_FILE")
PR_AUTHOR=$(jq -r '.user.login' <"$PR_FILE")
PR_STATE=$(jq -r '.state' <"$PR_FILE")
if [ -z "$HEAD_SHA" ] || [ "$HEAD_SHA" = "null" ]; then
echo "::error::Could not resolve head.sha from PR #$PR_NUMBER response"
exit 1
fi
debug "head_sha=$HEAD_SHA pr_author=$PR_AUTHOR state=$PR_STATE"
if [ "$PR_STATE" != "open" ]; then
echo "::notice::PR #$PR_NUMBER state is $PR_STATE; refire is a no-op on closed PRs."
exit 0
fi
# 2. Rate-limit: skip if our context was updated in the last $RATE_LIMIT_SEC.
# Gitea statuses endpoint returns latest first; we check the most recent
# entry for our context name.
if [ "${SOP_REFIRE_DISABLE_RATE_LIMIT:-}" != "1" ]; then
STATUSES_FILE=$(mktemp)
trap 'rm -f "$PR_FILE" "$STATUSES_FILE"' EXIT
ST_HTTP=$(curl -sS -o "$STATUSES_FILE" -w '%{http_code}' -H "$AUTH" \
"${API}/repos/${OWNER}/${NAME}/statuses/${HEAD_SHA}?limit=50&sort=newest")
debug "statuses-list HTTP=$ST_HTTP"
if [ "$ST_HTTP" = "200" ]; then
LAST_UPDATED=$(jq -r --arg c "$CONTEXT" \
'[.[] | select(.context == $c)] | first | .updated_at // ""' \
<"$STATUSES_FILE")
if [ -n "$LAST_UPDATED" ] && [ "$LAST_UPDATED" != "null" ]; then
# Parse RFC3339 → epoch. Use python -c for portability (date(1) -d
# differs between BSD/GNU; the Gitea runner is Ubuntu so GNU date
# works, but we keep python for future container variance).
LAST_EPOCH=$(python3 -c "import sys,datetime;print(int(datetime.datetime.fromisoformat(sys.argv[1].replace('Z','+00:00')).timestamp()))" "$LAST_UPDATED" 2>/dev/null || echo "0")
NOW_EPOCH=$(date -u +%s)
AGE=$((NOW_EPOCH - LAST_EPOCH))
debug "last status update: $LAST_UPDATED ($AGE seconds ago)"
if [ "$AGE" -lt "$RATE_LIMIT_SEC" ] && [ "$AGE" -ge 0 ]; then
echo "::notice::sop-tier-refire rate-limited — last status update was ${AGE}s ago (<${RATE_LIMIT_SEC}s window). Try again shortly."
exit 0
fi
fi
fi
fi
# 3. Invoke sop-tier-check.sh with the env it expects. Capture exit code.
# The canonical script reads tier label, walks approving reviewers, and
# evaluates the AND-composition expression — we want the SAME gate, not
# a different gate.
#
# SOP_REFIRE_TIER_CHECK_SCRIPT env var lets tests substitute a mock —
# sop-tier-check.sh uses bash 4+ associative arrays which trigger a known
# bash 3.2 parser bug (`tier: unbound variable` from declare -A with
# `set -u`). Linux Gitea runners ship bash 4/5 so production is fine;
# the override exists so the bash 3.2 dev box can still exercise the
# refire glue logic end-to-end.
SCRIPT="${SOP_REFIRE_TIER_CHECK_SCRIPT:-$(dirname "$0")/sop-tier-check.sh}"
if [ ! -f "$SCRIPT" ]; then
echo "::error::sop-tier-check.sh not found at $SCRIPT — refire requires the canonical script"
exit 1
fi
# Re-invoke. Pipe stdout/stderr through so the runner log shows the
# tier-check decision inline.
set +e
GITEA_TOKEN="$GITEA_TOKEN" \
GITEA_HOST="$GITEA_HOST" \
REPO="$REPO" \
PR_NUMBER="$PR_NUMBER" \
PR_AUTHOR="$PR_AUTHOR" \
SOP_DEBUG="${SOP_DEBUG:-0}" \
SOP_LEGACY_CHECK="${SOP_LEGACY_CHECK:-0}" \
bash "$SCRIPT"
TIER_EXIT=$?
set -e
debug "sop-tier-check.sh exit=$TIER_EXIT"
# 4. POST the resulting status.
if [ "$TIER_EXIT" -eq 0 ]; then
STATE="success"
DESCRIPTION="Refired via /refire-tier-check by $COMMENT_AUTHOR"
else
STATE="failure"
DESCRIPTION="Refired via /refire-tier-check; tier-check failed (see workflow log)"
fi
# Status target_url points at the runner log so a curious reviewer can
# follow it back. SERVER_URL + RUN_ID + JOB_ID isn't trivially constructible
# from the bash env on Gitea 1.22.6, so we point at the PR itself.
TARGET_URL="https://${GITEA_HOST}/${OWNER}/${NAME}/pulls/${PR_NUMBER}"
POST_BODY=$(jq -nc \
--arg state "$STATE" \
--arg context "$CONTEXT" \
--arg description "$DESCRIPTION" \
--arg target_url "$TARGET_URL" \
'{state:$state, context:$context, description:$description, target_url:$target_url}')
POST_FILE=$(mktemp)
trap 'rm -f "$PR_FILE" "${STATUSES_FILE:-}" "$POST_FILE"' EXIT
POST_HTTP=$(curl -sS -o "$POST_FILE" -w '%{http_code}' \
-X POST -H "$AUTH" -H "Content-Type: application/json" \
-d "$POST_BODY" \
"${API}/repos/${OWNER}/${NAME}/statuses/${HEAD_SHA}")
if [ "$POST_HTTP" != "200" ] && [ "$POST_HTTP" != "201" ]; then
echo "::error::POST /statuses/$HEAD_SHA returned HTTP $POST_HTTP (body $(head -c 200 "$POST_FILE"))"
exit 1
fi
echo "::notice::sop-tier-refire posted state=$STATE for context=\"$CONTEXT\" on sha=$HEAD_SHA"
exit "$TIER_EXIT"
-673
View File
@@ -1,673 +0,0 @@
#!/usr/bin/env python3
"""status-reaper — Option B compensating-status POST for Gitea 1.22.6's
hardcoded `(push)` suffix on default-branch commit statuses.
Tracking: this PR (workflow + script + tests + audit issue). Sibling
bots: internal#327 (publish-runtime-bot), internal#328 (mc-drift-bot).
Upstream RFC: internal#80. Persona provisioned by sub-agent aefaac1b
(2026-05-11 21:39Z; Gitea uid 94, scope=write:repository).
What this script does, per `.gitea/workflows/status-reaper.yml` invocation:
1. Walk `.gitea/workflows/*.yml`. For each file, build the workflow_id
using this resolution (per hongming-pc 22:08Z review):
- If YAML has top-level `name:` → use that.
- Else → use filename stem (basename minus `.yml`).
Fail-LOUD on:
- Two workflows resolving to the SAME identifier (collision).
- Any identifier containing `/` (it would break context parsing
downstream — Gitea uses ` / ` as the workflow/job separator).
Classify each by whether `on:` contains a `push:` trigger.
2. List the last N (=10) commits on WATCH_BRANCH via
GET /repos/{o}/{r}/commits?sha={branch}&limit={N}. rev2 sweeps
N commits per tick instead of HEAD only — schedule workflows
post `failure` to whatever SHA was HEAD when they COMPLETED, so
by the next */5 tick main has often moved forward and the red
gets stranded on a stale commit (Phase 1+2 evidence: rev1 saw
`compensated:0` every tick across ~6 cycles).
3. For EACH SHA in the list:
- GET combined commit status. Per-SHA error isolation
(refinement #7): if this call raises ApiError or any 5xx,
LOG `::warning::` + continue to the next SHA. Different from
the single-HEAD pre-rev2 path where fail-loud was correct;
the sweep is best-effort across historical commits, so one
transient blip on a stale SHA must not strand reds on the
OTHER stale SHAs.
- If combined.state == "success": skip — cost optimization
(refinement #2), common case (most commits are green).
- Otherwise iterate per-context entries. For each entry where:
state == "failure" AND context.endswith(" (push)")
Parse context as `<workflow_name> / <job_name> (push)`.
Look up workflow_name in the trigger map:
- missing → log ::notice:: and skip (conservative).
- has_push_trigger=True → preserve (real defect signal).
- has_push_trigger=False → POST a compensating
`state=success` status to /statuses/{sha} with the same
context (Gitea de-dups by context) and a description
documenting the workaround + this script's path.
4. Exit 0. Re-running is idempotent — Gitea's commit-status table
stores the LATEST state-per-context, so the success POST sticks
even if another tick happens before the runner finishes.
What it does NOT do:
- Touch any context NOT ending in ` (push)`. The required-checks on
main (verified 2026-05-11) all have ` (pull_request)` suffixes;
they CANNOT be reached by this code path.
- Compensate `error`/`pending` states. Only `failure` — the only one
Gitea emits for the hardcoded-suffix bug.
- Write to non-default branches. WATCH_BRANCH is sourced from
`github.event.repository.default_branch` in the workflow.
- Mutate workflows or runs. The Actions UI still shows the
underlying schedule-triggered run as failed; this script edits
the commit-status surface only.
Halt conditions (script-level — orchestrator-level halts are in the
workflow comments):
- PyYAML missing → fail-loud at import (no fallback parse).
- Workflow `name:` collision → exit 1 with ::error:: message.
- Workflow `name:` containing `/` → exit 1 with ::error:: message.
- Ambiguous `on:` shape (e.g. neither str/list/dict) → treat as
"has_push_trigger=True" and log ::notice:: (preserve, never
compensate the unknown).
- api() non-2xx → raise ApiError, fail the workflow run loudly so
a subsequent tick retries (per
`feedback_api_helper_must_raise_not_return_dict`).
Local dry-run (no network):
GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app REPO=owner/repo \\
WATCH_BRANCH=main WORKFLOWS_DIR=.gitea/workflows \\
python3 .gitea/scripts/status-reaper.py --dry-run
"""
from __future__ import annotations
import argparse
import json
import os
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Any
import yaml # PyYAML 6.0.2 — installed by the workflow before this runs.
# --------------------------------------------------------------------------
# Environment
# --------------------------------------------------------------------------
def _env(key: str, *, default: str = "") -> str:
"""Read an env var with a default. Module-import-safe — tests can
import this script without setting the full env contract."""
return os.environ.get(key, default)
GITEA_TOKEN = _env("GITEA_TOKEN")
GITEA_HOST = _env("GITEA_HOST")
REPO = _env("REPO")
WATCH_BRANCH = _env("WATCH_BRANCH", default="main")
WORKFLOWS_DIR = _env("WORKFLOWS_DIR", default=".gitea/workflows")
OWNER, NAME = (REPO.split("/", 1) + [""])[:2] if REPO else ("", "")
API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""
# Compensating-status description prefix. Used as the marker so a human
# auditing commit statuses can tell at a glance that the green was
# synthetic, not a real CI pass. Kept stable; downstream tooling
# (e.g. main-red-watchdog visual diff) MAY key on it.
COMPENSATION_DESCRIPTION = (
"Compensated by status-reaper (workflow has no push: trigger; "
"Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)"
)
# Context suffix the reaper acts on. Gitea hardcodes this for ALL
# default-branch workflow runs.
PUSH_SUFFIX = " (push)"
def _require_runtime_env() -> None:
"""Enforce env contract — called from `main()` only.
Tests import individual functions without setting the full env
contract. Mirrors `main-red-watchdog.py`/`ci-required-drift.py`.
"""
for key in ("GITEA_TOKEN", "GITEA_HOST", "REPO", "WATCH_BRANCH", "WORKFLOWS_DIR"):
if not os.environ.get(key):
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
# --------------------------------------------------------------------------
# Tiny HTTP helper — raises on non-2xx + on JSON-decode-of-expected-JSON.
# --------------------------------------------------------------------------
class ApiError(RuntimeError):
"""Raised when a Gitea API call cannot be trusted to have succeeded.
Per `feedback_api_helper_must_raise_not_return_dict`: soft-failure is
opt-in via `expect_json=False`, never the default. A pre-fix
implementation that returned `{}` on non-2xx would skip the
compensating POST on a transient outage AND silently lose the
failed-status enumeration, painting main green via omission.
"""
def api(
method: str,
path: str,
*,
body: dict | None = None,
query: dict[str, str] | None = None,
expect_json: bool = True,
) -> tuple[int, Any]:
"""Tiny HTTP helper around urllib. Same contract as
`main-red-watchdog.py` and `ci-required-drift.py` so behaviour
is cross-checkable."""
url = f"{API}{path}"
if query:
url = f"{url}?{urllib.parse.urlencode(query)}"
data = None
headers = {
"Authorization": f"token {GITEA_TOKEN}",
"Accept": "application/json",
}
if body is not None:
data = json.dumps(body).encode("utf-8")
headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, method=method, data=data, headers=headers)
try:
with urllib.request.urlopen(req, timeout=30) as resp:
raw = resp.read()
status = resp.status
except urllib.error.HTTPError as e:
raw = e.read()
status = e.code
if not (200 <= status < 300):
snippet = raw[:500].decode("utf-8", errors="replace") if raw else ""
raise ApiError(f"{method} {path} -> HTTP {status}: {snippet}")
if not raw:
return status, None
try:
return status, json.loads(raw)
except json.JSONDecodeError as e:
if expect_json:
raise ApiError(
f"{method} {path} -> HTTP {status} but body is not JSON: {e}"
) from e
return status, {"_raw": raw.decode("utf-8", errors="replace")}
# --------------------------------------------------------------------------
# Workflow scan + classification
# --------------------------------------------------------------------------
def _on_block(doc: dict) -> Any:
"""Extract the `on:` block from a parsed YAML doc.
PyYAML parses bareword `on:` as Python `True` (YAML 1.1 boolean
spec — `on/off/yes/no` are booleans). The actual key in the dict
is therefore `True`, NOT the string `"on"`. We accept both for
forward-compat with YAML 1.2 loaders (which keep it as `"on"`).
"""
if True in doc:
return doc[True]
return doc.get("on")
def _has_push_trigger(on_block: Any, workflow_id: str) -> bool:
"""Return True if `on:` block declares a `push` trigger.
Accepts the three common shapes:
- str: `on: push` → True only if == "push"
- list: `on: [push, pull_request]` → True if "push" in list
- dict: `on: { push: {...}, schedule: ... }` → True if "push" key
Defensive: for anything else (including None/empty), return True
so we preserve rather than over-compensate. Logged via ::notice::.
"""
if isinstance(on_block, str):
return on_block == "push"
if isinstance(on_block, list):
return "push" in on_block
if isinstance(on_block, dict):
return "push" in on_block
# None or unexpected shape — preserve, log.
print(
f"::notice::ambiguous on: for {workflow_id}; preserving "
f"(value={on_block!r}, type={type(on_block).__name__})"
)
return True
def scan_workflows(workflows_dir: str) -> dict[str, bool]:
"""Walk `workflows_dir` and return `{workflow_id: has_push_trigger}`.
Workflow ID resolution (per hongming-pc 22:08Z review):
- Top-level `name:` if present.
- Else filename stem (basename minus `.yml`).
Fail-LOUD on:
- Two workflows resolving to the same ID (collision).
- Any ID containing `/` (would break ` / `-separated context
parsing on the downstream side).
Returns a dict for O(1) lookup in the per-status loop.
"""
path = Path(workflows_dir)
if not path.is_dir():
# Workflow dir missing → no workflows to classify. Empty map is
# safe: per-status loop will hit "unknown workflow; skip" for
# every entry, which is correct (we cannot tell if a push
# trigger exists, so we preserve).
print(f"::warning::workflows dir not found: {workflows_dir}")
return {}
out: dict[str, bool] = {}
sources: dict[str, str] = {} # workflow_id -> source file (for collision msg)
for yml in sorted(path.glob("*.yml")):
try:
with yml.open() as f:
doc = yaml.safe_load(f)
except yaml.YAMLError as e:
# A malformed YAML in the workflows dir is a real defect
# (the workflow wouldn't load on Gitea either). Surface it
# and keep going — the reaper's job is to compensate the
# OTHER workflows even if one is broken.
print(f"::warning::yaml parse failed for {yml.name}: {e}; skip")
continue
if not isinstance(doc, dict):
print(f"::warning::workflow {yml.name} not a dict; skip")
continue
# Resolve workflow_id.
name_field = doc.get("name")
if isinstance(name_field, str) and name_field.strip():
workflow_id = name_field.strip()
else:
workflow_id = yml.stem # basename minus .yml
# Halt-loud: `/` in workflow_id breaks ` / ` context parsing.
if "/" in workflow_id:
sys.stderr.write(
f"::error::workflow name contains '/' which breaks "
f"context parsing: {workflow_id} (file={yml.name})\n"
)
sys.exit(1)
# Halt-loud: ID collision.
if workflow_id in out:
sys.stderr.write(
f"::error::workflow name collision detected: {workflow_id} "
f"(files: {sources[workflow_id]} + {yml.name})\n"
)
sys.exit(1)
on_block = _on_block(doc)
out[workflow_id] = _has_push_trigger(on_block, workflow_id)
sources[workflow_id] = yml.name
return out
# --------------------------------------------------------------------------
# Gitea reads
# --------------------------------------------------------------------------
def get_head_sha(branch: str) -> str:
"""HEAD SHA of `branch`. Raises ApiError on non-2xx."""
_, body = api("GET", f"/repos/{OWNER}/{NAME}/branches/{branch}")
if not isinstance(body, dict):
raise ApiError(f"branch {branch} response not a JSON object")
commit = body.get("commit")
if not isinstance(commit, dict):
raise ApiError(f"branch {branch} response missing `commit` object")
sha = commit.get("id") or commit.get("sha")
if not isinstance(sha, str) or len(sha) < 7:
raise ApiError(f"branch {branch} response has no usable commit SHA")
return sha
def get_combined_status(sha: str) -> dict:
"""Combined commit status for `sha`. Gitea returns:
{
"state": "success" | "failure" | "pending" | "error",
"statuses": [
{"context": "...", "state": "...", "target_url": "...",
"description": "..."},
...
],
...
}
Raises ApiError on non-2xx.
"""
_, body = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status")
if not isinstance(body, dict):
raise ApiError(f"status for {sha} response not a JSON object")
return body
# --------------------------------------------------------------------------
# Context parsing
# --------------------------------------------------------------------------
def parse_push_context(context: str) -> tuple[str, str] | None:
"""Parse `<workflow_name> / <job_name> (push)` into
(workflow_name, job_name).
Returns None if the context doesn't match the shape (caller skips).
Strict: requires the trailing ` (push)` and at least one ` / `
separator. Anything else is left alone.
"""
if not context.endswith(PUSH_SUFFIX):
return None
head = context[: -len(PUSH_SUFFIX)] # strip " (push)"
if " / " not in head:
# No workflow/job separator — not the bug shape we compensate.
return None
workflow_name, job_name = head.split(" / ", 1)
return workflow_name, job_name
# --------------------------------------------------------------------------
# Compensating POST
# --------------------------------------------------------------------------
def post_compensating_status(
sha: str,
context: str,
target_url: str | None,
*,
dry_run: bool = False,
) -> None:
"""POST a `state=success` to /repos/{o}/{r}/statuses/{sha} with the
given context. Gitea de-dups by context (latest write wins).
Description references this script so the compensation is
self-documenting on the commit's status view.
"""
payload: dict[str, Any] = {
"context": context,
"state": "success",
"description": COMPENSATION_DESCRIPTION,
}
# Echo the original target_url when present so a human auditing
# the (now-green) compensated status can still reach the run logs
# that produced the original red.
if target_url:
payload["target_url"] = target_url
if dry_run:
print(
f"::notice::[dry-run] would compensate {context!r} on {sha[:10]} "
f"with state=success"
)
return
api("POST", f"/repos/{OWNER}/{NAME}/statuses/{sha}", body=payload)
print(f"::notice::compensated {context!r} on {sha[:10]} (state=success)")
# --------------------------------------------------------------------------
# Main reap loop
# --------------------------------------------------------------------------
def reap(
workflow_trigger_map: dict[str, bool],
combined: dict,
sha: str,
*,
dry_run: bool = False,
) -> dict[str, Any]:
"""Walk `combined.statuses[]` and compensate where appropriate.
Per-SHA worker. The multi-SHA orchestrator (`reap_branch`) calls
this once per stale main commit each tick.
Returns counters for observability:
{compensated, preserved_real_push, preserved_unknown,
preserved_non_failure, preserved_non_push_suffix,
preserved_unparseable,
compensated_contexts: [<context>, ...]}
`compensated_contexts` is rev2-added so `reap_branch` can build
`compensated_per_sha` without re-deriving it from the POST stream.
"""
counters: dict[str, Any] = {
"compensated": 0,
"preserved_real_push": 0,
"preserved_unknown": 0,
"preserved_non_failure": 0,
"preserved_non_push_suffix": 0,
"preserved_unparseable": 0,
"compensated_contexts": [],
}
statuses = combined.get("statuses") or []
for s in statuses:
if not isinstance(s, dict):
continue
context = s.get("context") or ""
state = s.get("state") or ""
# Only `failure` is the bug shape. `error`/`pending`/`success`
# left alone — they have other meanings.
if state != "failure":
counters["preserved_non_failure"] += 1
continue
# Only `(push)`-suffix contexts hit the hardcoded-suffix bug.
# Branch-protection required checks (e.g. `Secret scan / Scan
# diff (pull_request)`) are NOT reachable from this path.
if not context.endswith(PUSH_SUFFIX):
counters["preserved_non_push_suffix"] += 1
continue
parsed = parse_push_context(context)
if parsed is None:
# Has ` (push)` suffix but missing ` / ` separator — not
# the bug shape. Preserve.
counters["preserved_unparseable"] += 1
continue
workflow_name, _job_name = parsed
if workflow_name not in workflow_trigger_map:
# Real workflow but renamed/deleted/external — we can't
# tell if it has push trigger. Conservative: preserve.
print(f"::notice::unknown workflow {workflow_name!r}; skip")
counters["preserved_unknown"] += 1
continue
if workflow_trigger_map[workflow_name]:
# Real push trigger → real defect signal. Preserve.
counters["preserved_real_push"] += 1
continue
# Class-O: schedule/dispatch/etc.-only workflow with a fake
# (push) status from Gitea's hardcoded-suffix bug. Compensate.
post_compensating_status(
sha, context, s.get("target_url"), dry_run=dry_run
)
counters["compensated"] += 1
counters["compensated_contexts"].append(context)
return counters
# --------------------------------------------------------------------------
# rev2: multi-SHA sweep over the last N commits on WATCH_BRANCH
# --------------------------------------------------------------------------
# How many main commits to sweep per tick. Sized to cover a burst-merge
# window where multiple PRs land in the 5-min interval between reaper
# ticks. Older reds falling off the window is acceptable — they were
# already stale enough that the schedule-run that posted them has long
# since been overwritten by a real push trigger. See `reference_post_
# suspension_pipeline` for the merge-cadence baseline.
DEFAULT_SWEEP_LIMIT = 10
def list_recent_commit_shas(branch: str, limit: int) -> list[str]:
"""List the most recent `limit` commit SHAs on `branch`, newest
first.
Wraps GET /repos/{o}/{r}/commits?sha={branch}&limit={limit}. Gitea
1.22.6 returns a JSON list of commit objects each with a `sha` key
(verified via vendor-truth probe 2026-05-11 against
git.moleculesai.app — `feedback_smoke_test_vendor_truth_not_shape_match`).
Raises ApiError on non-2xx OR on unexpected response shape. This is
a HARD halt — without the commit list the sweep can't proceed. (The
per-SHA error isolation downstream is a different concern: tolerating
a transient 5xx on ONE commit's status is best-effort; losing the
commit list itself means we don't even know which commits to try.)
"""
_, body = api(
"GET",
f"/repos/{OWNER}/{NAME}/commits",
query={"sha": branch, "limit": str(limit)},
)
if not isinstance(body, list):
raise ApiError(
f"commits listing for {branch} not a JSON array "
f"(got {type(body).__name__})"
)
shas: list[str] = []
for entry in body:
if not isinstance(entry, dict):
continue
sha = entry.get("sha")
if isinstance(sha, str) and len(sha) >= 7:
shas.append(sha)
if not shas:
raise ApiError(
f"commits listing for {branch} returned no usable SHAs"
)
return shas
def reap_branch(
workflow_trigger_map: dict[str, bool],
branch: str,
*,
limit: int = DEFAULT_SWEEP_LIMIT,
dry_run: bool = False,
) -> dict[str, Any]:
"""Sweep the last `limit` commits on `branch`, applying `reap()`
to each (with per-SHA error isolation).
Returns aggregated counters PLUS rev2 observability fields:
- scanned_shas: how many SHAs we actually iterated
- compensated_per_sha: {<sha_full>: [<context>, ...]} — only
SHAs that actually got at least one compensation are included
"""
shas = list_recent_commit_shas(branch, limit)
aggregate: dict[str, Any] = {
"scanned_shas": 0,
"compensated": 0,
"preserved_real_push": 0,
"preserved_unknown": 0,
"preserved_non_failure": 0,
"preserved_non_push_suffix": 0,
"preserved_unparseable": 0,
"compensated_per_sha": {},
}
for sha in shas:
aggregate["scanned_shas"] += 1
# Per-SHA error isolation (refinement #7). One transient blip
# on a historical commit must NOT abort the whole tick — the
# OTHER stale SHAs may still hold strandable reds.
try:
combined = get_combined_status(sha)
except ApiError as e:
print(
f"::warning::get_combined_status({sha[:10]}) failed; "
f"skipping this SHA: {e}"
)
continue
# Cost optimization (refinement #2): the common case is a green
# commit. Skip the per-context loop entirely when combined is
# already success — saves a tight loop over ~20 statuses per SHA
# on green commits, the dominant majority.
if combined.get("state") == "success":
continue
per_sha = reap(
workflow_trigger_map, combined, sha, dry_run=dry_run
)
# Aggregate scalar counters.
for key in (
"compensated",
"preserved_real_push",
"preserved_unknown",
"preserved_non_failure",
"preserved_non_push_suffix",
"preserved_unparseable",
):
aggregate[key] += per_sha[key]
# Record per-SHA compensated contexts (only when non-empty —
# keep the summary readable when most SHAs are no-ops).
contexts = per_sha.get("compensated_contexts") or []
if contexts:
aggregate["compensated_per_sha"][sha] = list(contexts)
return aggregate
def main() -> int:
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
"--dry-run",
action="store_true",
help="Skip the compensating POST; print what would be done.",
)
parser.add_argument(
"--limit",
type=int,
default=DEFAULT_SWEEP_LIMIT,
help=(
"How many recent commits on WATCH_BRANCH to sweep per tick "
f"(default: {DEFAULT_SWEEP_LIMIT})."
),
)
args = parser.parse_args()
_require_runtime_env()
workflow_trigger_map = scan_workflows(WORKFLOWS_DIR)
print(
f"::notice::scanned {len(workflow_trigger_map)} workflows; "
f"push-triggered={sum(1 for v in workflow_trigger_map.values() if v)}, "
f"class-O candidates={sum(1 for v in workflow_trigger_map.values() if not v)}"
)
counters = reap_branch(
workflow_trigger_map,
WATCH_BRANCH,
limit=args.limit,
dry_run=args.dry_run,
)
# Observability: print one JSON line summarising the tick. Loki
# ingestion via the runner's stdout (`source="gitea-actions"`).
print(
"status-reaper summary: "
+ json.dumps(
{
"branch": WATCH_BRANCH,
"dry_run": args.dry_run,
"limit": args.limit,
**counters,
},
sort_keys=True,
)
)
return 0
if __name__ == "__main__":
sys.exit(main())
-28
View File
@@ -1,28 +0,0 @@
#!/usr/bin/env bash
# Mock sop-tier-check.sh for sop-tier-refire tests.
#
# Exits 0 ("PASS") if $MOCK_TIER_RESULT == "pass", else exits 1.
# This lets the refire tests cover the success + failure status-POST
# paths without invoking the real sop-tier-check.sh (which uses bash 4+
# associative arrays — known parser bug on macOS bash 3.2 dev box).
set -euo pipefail
case "${MOCK_TIER_RESULT:-pass}" in
pass)
echo "::notice::mock tier-check: PASS"
exit 0
;;
fail_no_label)
echo "::error::mock tier-check: no tier label"
exit 1
;;
fail_no_approvals)
echo "::error::mock tier-check: no approving reviews"
exit 1
;;
*)
echo "::error::mock tier-check: unknown MOCK_TIER_RESULT=${MOCK_TIER_RESULT:-}"
exit 2
;;
esac
-208
View File
@@ -1,208 +0,0 @@
#!/usr/bin/env python3
"""Stub Gitea API for sop-tier-refire test scenarios.
Reads $FIXTURE_STATE_DIR/scenario to decide what to return for each
endpoint the sop-tier-refire.sh + sop-tier-check.sh scripts call.
Captures every POST to /statuses/{sha} into posted_statuses.jsonl so
the test can assert what the script tried to write.
Scenarios:
T1_success — tier:low + APPROVED by engineer → tier-check passes
T2_no_tier_label — no tier label → tier-check exits 1 before POST
T3_no_approvals — tier:low but zero approving reviews → exits 1
T4_closed — PR state=closed → refire is a no-op
T5_rate_limited — last status update 5 seconds ago → skip
Usage:
FIXTURE_STATE_DIR=/tmp/x python3 _refire_fixture.py 8080
"""
import datetime
import http.server
import json
import os
import re
import sys
import urllib.parse
STATE_DIR = os.environ["FIXTURE_STATE_DIR"]
def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_success"
with open(p) as f:
return f.read().strip()
def now_iso() -> str:
return datetime.datetime.now(datetime.timezone.utc).isoformat()
def append_post(body: dict) -> None:
with open(os.path.join(STATE_DIR, "posted_statuses.jsonl"), "a") as f:
f.write(json.dumps(body) + "\n")
def pr_payload() -> dict:
sc = scenario()
state = "closed" if sc == "T4_closed" else "open"
return {
"number": 999,
"state": state,
"head": {"sha": "deadbeef0000111122223333444455556666"},
"user": {"login": "feature-author"},
}
def labels_payload() -> list:
sc = scenario()
if sc == "T2_no_tier_label":
return [{"name": "bug"}]
# All other scenarios use tier:low
return [{"name": "tier:low"}, {"name": "ci"}]
def reviews_payload() -> list:
sc = scenario()
if sc == "T3_no_approvals":
return []
# All other scenarios have one APPROVED review by an engineer
return [
{
"state": "APPROVED",
"user": {"login": "reviewer-engineer"},
}
]
def teams_payload() -> list:
# Mirror the real molecule-ai org teams referenced in TIER_EXPR
return [
{"id": 5, "name": "ceo"},
{"id": 2, "name": "engineers"},
{"id": 6, "name": "managers"},
]
def statuses_payload() -> list:
sc = scenario()
if sc == "T5_rate_limited":
recent = (
datetime.datetime.now(datetime.timezone.utc)
- datetime.timedelta(seconds=5)
).isoformat()
return [
{
"context": "sop-tier-check / tier-check (pull_request)",
"state": "failure",
"updated_at": recent,
}
]
return []
def user_payload() -> dict:
# Mirrors the WHOAMI probe in sop-tier-check.sh
return {"login": "sop-tier-bot-fixture"}
class Handler(http.server.BaseHTTPRequestHandler):
# Quiet — keep stdout for explicit logs only.
def log_message(self, *args, **kwargs): # noqa: D401
pass
def _json(self, code: int, body) -> None:
payload = json.dumps(body).encode()
self.send_response(code)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(payload)))
self.end_headers()
self.wfile.write(payload)
def _empty(self, code: int) -> None:
self.send_response(code)
self.send_header("Content-Length", "0")
self.end_headers()
def do_GET(self): # noqa: N802
u = urllib.parse.urlparse(self.path)
path = u.path
if path == "/_ping":
return self._json(200, {"ok": True})
if path == "/api/v1/user":
return self._json(200, user_payload())
# /api/v1/repos/{owner}/{name}/pulls/{n}
m = re.match(r"^/api/v1/repos/[^/]+/[^/]+/pulls/(\d+)$", path)
if m:
return self._json(200, pr_payload())
# /api/v1/repos/{owner}/{name}/issues/{n}/labels
if re.match(r"^/api/v1/repos/[^/]+/[^/]+/issues/\d+/labels$", path):
return self._json(200, labels_payload())
# /api/v1/repos/{owner}/{name}/pulls/{n}/reviews
if re.match(r"^/api/v1/repos/[^/]+/[^/]+/pulls/\d+/reviews$", path):
return self._json(200, reviews_payload())
# /api/v1/orgs/{owner}/teams
if re.match(r"^/api/v1/orgs/[^/]+/teams$", path):
return self._json(200, teams_payload())
# /api/v1/teams/{id}/members/{login} → 204 if user is an engineer
m = re.match(r"^/api/v1/teams/(\d+)/members/([^/]+)$", path)
if m:
team_id, login = m.group(1), m.group(2)
# In our fixture reviewer-engineer ∈ engineers (id=2)
if team_id == "2" and login == "reviewer-engineer":
return self._empty(204)
return self._empty(404)
# /api/v1/orgs/{owner}/members/{login} — fallback path used when
# team-member probes all 403. We don't need it for these tests.
if re.match(r"^/api/v1/orgs/[^/]+/members/[^/]+$", path):
return self._empty(404)
# /api/v1/repos/{owner}/{name}/statuses/{sha}
if re.match(r"^/api/v1/repos/[^/]+/[^/]+/statuses/[^/]+$", path):
return self._json(200, statuses_payload())
return self._json(404, {"path": path, "msg": "fixture: no route"})
def do_POST(self): # noqa: N802
u = urllib.parse.urlparse(self.path)
path = u.path
length = int(self.headers.get("Content-Length") or 0)
raw = self.rfile.read(length) if length else b""
try:
body = json.loads(raw) if raw else {}
except Exception:
body = {"_raw": raw.decode(errors="replace")}
if re.match(r"^/api/v1/repos/[^/]+/[^/]+/statuses/[^/]+$", path):
append_post(body)
# Echo back something status-shaped — script only checks HTTP code.
return self._json(
201,
{
"context": body.get("context"),
"state": body.get("state"),
"created_at": now_iso(),
},
)
return self._json(404, {"path": path, "msg": "fixture: no route"})
def main():
port = int(sys.argv[1])
srv = http.server.ThreadingHTTPServer(("127.0.0.1", port), Handler)
srv.serve_forever()
if __name__ == "__main__":
main()
@@ -1,140 +0,0 @@
#!/usr/bin/env python3
"""Stub Gitea API for review-check.sh test scenarios.
Reads $FIXTURE_STATE_DIR/scenario to decide what to return for each
endpoint the review-check.sh script calls.
Reads $FIXTURE_STATE_DIR/token_owner_in_teams to decide whether
the team membership probe returns 200/204 (member) or 403 (not in team).
Scenarios:
T1_pr_open — open PR, author=alice, sha=deadbeef → continue
T2_pr_closed — closed PR → script exits 0 (no-op)
T3_reviews_approved_non_author — one APPROVED from non-author → candidates exist
T4_reviews_empty — zero APPROVED non-author → exit 1 (no candidates)
T5_reviews_only_author — only author reviews → exit 1 (no candidates)
T6_reviews_dismissed — dismissed APPROVED → treated as no approval
T7_team_member — team membership → 204 (member) → exit 0
T8_team_not_member — team membership → 404 (not a member) → exit 1
T9_team_403 — team membership → 403 (token not in team) → exit 1
Usage:
FIXTURE_STATE_DIR=/tmp/x python3 _review_check_fixture.py 8080
"""
import http.server
import json
import os
import re
import sys
import urllib.parse
STATE_DIR = os.environ.get("FIXTURE_STATE_DIR", "/tmp")
def scenario() -> str:
p = os.path.join(STATE_DIR, "scenario")
if not os.path.isfile(p):
return "T1_pr_open"
with open(p) as f:
return f.read().strip()
class Handler(http.server.BaseHTTPRequestHandler):
def log_message(self, *args, **kwargs):
pass # keep stdout for explicit logs only
def _json(self, code: int, body: dict) -> None:
payload = json.dumps(body).encode()
self.send_response(code)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(payload)))
self.end_headers()
self.wfile.write(payload)
def _empty(self, code: int) -> None:
self.send_response(code)
self.send_header("Content-Length", "0")
self.end_headers()
def _text(self, code: int, body: str) -> None:
payload = body.encode()
self.send_response(code)
self.send_header("Content-Type", "text/plain")
self.send_header("Content-Length", str(len(payload)))
self.end_headers()
self.wfile.write(payload)
def do_GET(self):
u = urllib.parse.urlparse(self.path)
path = u.path
sc = scenario()
if path == "/_ping":
return self._json(200, {"ok": True})
# GET /repos/{owner}/{name}/pulls/{pr_number}
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/pulls/(\d+)$", path)
if m:
owner, name, pr_num = m.group(1), m.group(2), m.group(3)
if sc == "T2_pr_closed":
return self._json(200, {
"number": int(pr_num),
"state": "closed",
"head": {"sha": "deadbeef0000111122223333444455556666"},
"user": {"login": "alice"},
})
return self._json(200, {
"number": int(pr_num),
"state": "open",
"head": {"sha": "deadbeef0000111122223333444455556666"},
"user": {"login": "alice"},
})
# GET /repos/{owner}/{name}/pulls/{pr_number}/reviews
m = re.match(r"^/api/v1/repos/([^/]+)/([^/]+)/pulls/(\d+)/reviews$", path)
if m:
if sc in ("T4_reviews_empty", "T5_reviews_only_author"):
return self._json(200, [])
if sc == "T6_reviews_dismissed":
return self._json(200, [{
"state": "APPROVED",
"dismissed": True,
"user": {"login": "core-devops"},
"commit_id": "abc1234",
}])
if sc == "T3_reviews_approved_non_author":
return self._json(200, [
{"state": "CHANGES_REQUESTED", "dismissed": False, "user": {"login": "bob"}, "commit_id": "abc1234"},
{"state": "APPROVED", "dismissed": False, "user": {"login": "core-devops"}, "commit_id": "abc1234"},
])
# Default: one non-author APPROVED
return self._json(200, [
{"state": "APPROVED", "dismissed": False, "user": {"login": "core-devops"}, "commit_id": "abc1234"},
])
# GET /teams/{team_id}/members/{username}
m = re.match(r"^/api/v1/teams/(\d+)/members/([^/]+)$", path)
if m:
team_id, login = m.group(1), m.group(2)
if sc == "T8_team_not_member":
return self._empty(404)
if sc == "T9_team_403":
return self._empty(403)
# T7_team_member: member
return self._empty(204)
return self._json(404, {"path": path, "msg": "fixture: no route"})
def do_POST(self):
self._json(404, {"path": self.path, "msg": "fixture: no POST routes"})
def main():
port = int(sys.argv[1])
srv = http.server.ThreadingHTTPServer(("127.0.0.1", port), Handler)
srv.serve_forever()
if __name__ == "__main__":
main()
-332
View File
@@ -1,332 +0,0 @@
#!/usr/bin/env bash
# Regression tests for .gitea/scripts/review-check.sh (RFC#324 Step 1).
#
# Covers:
# T1 — open PR: script fetches PR + reviews, continues to team probe
# T2 — closed PR: script exits 0 (no-op)
# T3 — APPROVED non-author review exists → candidates exist
# T4 — no non-author APPROVED reviews → exit 1 (no candidates)
# T5 — only author reviews (no non-author APPROVE) → exit 1
# T6 — dismissed APPROVED review → treated as no approval
# T7 — team membership probe → 204 (member) → script exits 0
# T8 — team membership probe → 404 (not a member) → script exits 1
# T9 — team membership probe → 403 (token not in team) → script exits 1 (fail closed)
# T10 — CURL_AUTH_FILE created with mode 600 and correct header content
# T11 — bash syntax check (bash -n passes)
# T12 — jq filter: non-author APPROVED → in candidate list; dismissed → excluded
# T13 — missing required env GITEA_TOKEN → exits 1 with error
#
# Hostile-self-review (per feedback_assert_exact_not_substring):
# this test MUST FAIL if the script is absent. Verified by running
# the test before the file exists (covered in the PR body).
set -euo pipefail
THIS_DIR="$(cd "$(dirname "$0")" && pwd)"
SCRIPT_DIR="$(cd "$THIS_DIR/.." && pwd)"
SCRIPT="$SCRIPT_DIR/review-check.sh"
PASS=0
FAIL=0
FAILED_TESTS=""
assert_eq() {
local label="$1"
local expected="$2"
local got="$3"
if [ "$expected" = "$got" ]; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label"
echo " expected: <$expected>"
echo " got: <$got>"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
assert_contains() {
local label="$1"
local needle="$2"
local haystack="$3"
if printf '%s' "$haystack" | grep -qF "$needle"; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label"
echo " needle: <$needle>"
echo " haystack: <$(printf '%s' "$haystack" | head -c 200)>"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
assert_file_mode() {
local label="$1"
local path="$2"
local expected_mode="$3"
if [ ! -f "$path" ]; then
echo " FAIL $label (file not found: $path)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
return
fi
local got_mode
got_mode=$(stat -c '%a' "$path" 2>/dev/null || echo "000")
if [ "$expected_mode" = "$got_mode" ]; then
echo " PASS $label (mode=$got_mode)"
PASS=$((PASS + 1))
else
echo " FAIL $label (expected mode=$expected_mode, got=$got_mode)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
assert_file_contains() {
local label="$1"
local path="$2"
local needle="$3"
if [ ! -f "$path" ]; then
echo " FAIL $label (file not found: $path)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
return
fi
if grep -qF "$needle" "$path"; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label (needle not found: <$needle>)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
# Existence check (foundation)
echo
echo "== existence =="
if [ -f "$SCRIPT" ]; then
echo " PASS script exists: $SCRIPT"
PASS=$((PASS + 1))
else
echo " FAIL script not found: $SCRIPT"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} script_exists"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL (existence)"
echo "Cannot proceed without the script."
exit 1
fi
# T11 — bash syntax check
echo
echo "== T11 bash syntax =="
if bash -n "$SCRIPT" 2>&1; then
echo " PASS T11 bash -n passes"
PASS=$((PASS + 1))
else
echo " FAIL T11 bash -n failed"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T11"
fi
# T13 — missing required env
echo
echo "== T13 missing GITEA_TOKEN =="
set +e
T13_OUT=$(PATH="/tmp:$PATH" GITEA_TOKEN= GITEA_HOST=git.example.com REPO=x/y PR_NUMBER=1 TEAM=qa TEAM_ID=1 bash "$SCRIPT" 2>&1 || true)
set -e
assert_contains "T13 exits non-zero when GITEA_TOKEN missing" "GITEA_TOKEN required" "$T13_OUT"
# Start fixture HTTP server
echo
echo "== fixture setup =="
FIXTURE_DIR=$(mktemp -d)
trap 'rm -rf "$FIXTURE_DIR"; [ -n "${FIX_PID:-}" ] && kill "$FIX_PID" 2>/dev/null || true' EXIT
FIXTURE_PY="$THIS_DIR/_review_check_fixture.py"
if [ ! -f "$FIXTURE_PY" ]; then
echo "::error::fixture server $FIXTURE_PY missing"
exit 1
fi
FIX_LOG="$FIXTURE_DIR/fixture.log"
FIX_STATE_DIR="$FIXTURE_DIR/state"
mkdir -p "$FIX_STATE_DIR"
# Find an unused port
FIX_PORT=$(python3 -c 'import socket;s=socket.socket();s.bind(("127.0.0.1",0));print(s.getsockname()[1]);s.close()')
FIXTURE_STATE_DIR="$FIX_STATE_DIR" python3 "$FIXTURE_PY" "$FIX_PORT" \
>"$FIX_LOG" 2>&1 &
FIX_PID=$!
# Wait for fixture readiness
for _ in $(seq 1 50); do
if curl -fsS "http://127.0.0.1:${FIX_PORT}/_ping" >/dev/null 2>&1; then
break
fi
sleep 0.1
done
if ! curl -fsS "http://127.0.0.1:${FIX_PORT}/_ping" >/dev/null 2>&1; then
echo "::error::fixture server failed to start. Log:"
cat "$FIX_LOG"
exit 1
fi
echo " fixture running on port $FIX_PORT"
# Install a curl shim that rewrites https://fixture.local/* -> http://127.0.0.1:$FIX_PORT/*
# Use double-quoted heredoc so FIX_PORT is expanded into the shim at creation time.
mkdir -p "$FIXTURE_DIR/bin"
cat >"$FIXTURE_DIR/bin/curl" <<"CURL_SHIM"
#!/usr/bin/env bash
# Shim: rewrite https://fixture.local/* -> http://127.0.0.1:FIXPORT/*
# Generated at test-run time; FIXPORT is substituted when this file is written.
new_args=()
for a in "$@"; do
if [[ "$a" == https://fixture.local/* ]]; then
rest="${a#https://fixture.local}"
a="http://127.0.0.1:FIXPORT${rest}"
fi
new_args+=("$a")
done
exec /usr/bin/curl "${new_args[@]}"
CURL_SHIM
# Now substitute FIXPORT with the actual port number
sed -i "s/FIXPORT/${FIX_PORT}/g" "$FIXTURE_DIR/bin/curl"
chmod +x "$FIXTURE_DIR/bin/curl"
# Helper: run the script with fixture environment
run_review_check() {
local scenario="$1"
echo "$scenario" >"$FIX_STATE_DIR/scenario"
local out
set +e
out=$(
PATH="$FIXTURE_DIR/bin:/tmp:$PATH" \
GITEA_TOKEN="fixture-token" \
GITEA_HOST="fixture.local" \
REPO="molecule-ai/molecule-core" \
PR_NUMBER="999" \
TEAM="qa" \
TEAM_ID="20" \
REVIEW_CHECK_DEBUG="0" \
REVIEW_CHECK_STRICT="0" \
bash "$SCRIPT" 2>&1
)
local rc=$?
set -e
echo "$out" >"$FIX_STATE_DIR/last_run.log"
echo "$rc" >"$FIX_STATE_DIR/last_rc"
echo "$out"
}
# T1 — open PR: script fetches PR and continues
echo
echo "== T1 open PR =="
T1_OUT=$(run_review_check "T1_pr_open")
T1_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T1 exit code 0 (approver exists + team member)" "0" "$T1_RC"
assert_contains "T1 qa-review APPROVED by core-devops" "APPROVED by core-devops" "$T1_OUT"
# T2 — closed PR: exits 0 immediately (no-op)
echo
echo "== T2 closed PR =="
T2_OUT=$(run_review_check "T2_pr_closed")
T2_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T2 exit code 0 (closed PR no-op)" "0" "$T2_RC"
# T3 — APPROVED non-author reviews exist
echo
echo "== T3 approved non-author reviews =="
T3_OUT=$(run_review_check "T3_reviews_approved_non_author")
T3_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T3 exit code 0 (candidates + team member)" "0" "$T3_RC"
# T4 — no non-author APPROVED reviews → exit 1
echo
echo "== T4 no non-author APPROVED reviews =="
T4_OUT=$(run_review_check "T4_reviews_empty")
T4_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T4 exit code 1 (no candidates)" "1" "$T4_RC"
assert_contains "T4 awaiting non-author APPROVE" "awaiting non-author APPROVE" "$T4_OUT"
# T5 — only author reviews → exit 1
echo
echo "== T5 only author reviews =="
T5_OUT=$(run_review_check "T5_reviews_only_author")
T5_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T5 exit code 1 (only author reviews, no candidates)" "1" "$T5_RC"
# T6 — dismissed APPROVED review → treated as no approval
echo
echo "== T6 dismissed APPROVED review =="
T6_OUT=$(run_review_check "T6_reviews_dismissed")
T6_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T6 exit code 1 (dismissed = no approval)" "1" "$T6_RC"
# T7 — team member → exit 0
echo
echo "== T7 team membership 204 (member) =="
T7_OUT=$(run_review_check "T7_team_member")
T7_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T7 exit code 0 (member, APPROVED)" "0" "$T7_RC"
assert_contains "T7 APPROVED by core-devops (team member)" "APPROVED by core-devops" "$T7_OUT"
# T8 — not a team member → exit 1 (fail closed)
echo
echo "== T8 team membership 404 (not a member) =="
T8_OUT=$(run_review_check "T8_team_not_member")
T8_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T8 exit code 1 (not in team)" "1" "$T8_RC"
# T9 — 403 token-not-in-team → exit 1 (fail closed)
echo
echo "== T9 team membership 403 (token not in team) =="
T9_OUT=$(run_review_check "T9_team_403")
T9_RC=$(cat "$FIX_STATE_DIR/last_rc")
assert_eq "T9 exit code 1 (403 token-not-in-team, fail closed)" "1" "$T9_RC"
assert_contains "T9 403 error in output" "403" "$T9_OUT"
# T10 — token file creation and permissions
echo
echo "== T10 CURL_AUTH_FILE =="
# Verify the token-file logic directly: create a temp file with the
# same mktemp pattern, write the header with printf, chmod 600, then assert.
T10_TOKEN="secret-test-token-abc123"
T10_AUTHFILE=$(mktemp -p /tmp curl-auth.test.XXXXXX)
chmod 600 "$T10_AUTHFILE"
printf 'header = "Authorization: token %s"\n' "$T10_TOKEN" > "$T10_AUTHFILE"
assert_file_mode "T10a mktemp -p /tmp mode 600 (CURL_AUTH_FILE pattern)" "$T10_AUTHFILE" "600"
assert_file_contains "T10b printf header format (CURL_AUTH_FILE content)" "$T10_AUTHFILE" "Authorization: token secret-test-token-abc123"
assert_file_contains "T10c 'header =' curl-config syntax" "$T10_AUTHFILE" 'header = "Authorization: token '
rm -f "$T10_AUTHFILE"
# T12 — jq filter: non-author APPROVED included, dismissed excluded
echo
echo "== T12 jq filter =="
# These are tested indirectly via T3 and T6 above, but let's also test
# the jq expression directly.
JQ_FILTER='.[]
| select(.state == "APPROVED")
| select(.dismissed != true)
| select(.user.login != "alice")
| .user.login'
T12_INPUT='[{"state":"APPROVED","dismissed":false,"user":{"login":"core-devops"}},{"state":"CHANGES_REQUESTED","dismissed":false,"user":{"login":"bob"}},{"state":"APPROVED","dismissed":false,"user":{"login":"alice"}},{"state":"APPROVED","dismissed":true,"user":{"login":"carol"}}]'
JQ_CMD=$(command -v jq 2>/dev/null || echo /tmp/jq)
T12_CANDIDATES=$(echo "$T12_INPUT" | "$JQ_CMD" -r "$JQ_FILTER" 2>/dev/null | sort -u)
assert_contains "T12 jq: core-devops (non-author APPROVED) in candidates" "core-devops" "$T12_CANDIDATES"
assert_eq "T12 jq: alice (author) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^alice$' || true)"
assert_eq "T12 jq: carol (dismissed) NOT in candidates" "" "$(echo "$T12_CANDIDATES" | grep '^carol$' || true)"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
if [ "$FAIL" -gt 0 ]; then
echo "Failed:$FAILED_TESTS"
fi
[ "$FAIL" -eq 0 ]
@@ -1,297 +0,0 @@
#!/usr/bin/env bash
# Tests for sop-tier-refire.{yml,sh} — internal#292.
#
# Behavior matrix:
#
# T1: PR open + APPROVED via tier:low → script invokes sop-tier-check
# and POSTs status=success.
# T2: PR open + missing tier label → sop-tier-check exits non-zero;
# refire POSTs status=failure (description mentions failure).
# T3: PR open + tier:low but NO approving reviews → sop-tier-check
# exits non-zero; refire POSTs status=failure.
# T4: PR CLOSED → refire exits 0 with no status POST (no-op on closed).
# T5: Rate-limit — recent status update within 30s → refire skips,
# no new POST.
# T6 (yaml-lint): workflow `if:` expression contains author_association
# gate + slash-command-trigger gate + PR-not-issue gate.
# T7 (yaml-lint): workflow file is parseable YAML.
#
# Tests T1-T5 run the real script against a local-fixture HTTP server
# (python http.server with a stub handler — `tests/_refire_fixture.py`)
# so the script's Gitea API calls hit the fixture, not the real Gitea.
#
# Tests T6/T7 are pure YAML checks against the workflow file.
#
# Hostile-self-review (per feedback_assert_exact_not_substring):
# this test MUST FAIL if the workflow or script is absent. Verified by
# running the test before the files exist (covered in the PR body).
set -euo pipefail
THIS_DIR="$(cd "$(dirname "$0")" && pwd)"
SCRIPT_DIR="$(cd "$THIS_DIR/.." && pwd)"
WORKFLOW_DIR="$(cd "$THIS_DIR/../../workflows" && pwd)"
WORKFLOW="$WORKFLOW_DIR/sop-tier-refire.yml"
SCRIPT="$SCRIPT_DIR/sop-tier-refire.sh"
PASS=0
FAIL=0
FAILED_TESTS=""
assert_eq() {
local label="$1"
local expected="$2"
local got="$3"
if [ "$expected" = "$got" ]; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label"
echo " expected: <$expected>"
echo " got: <$got>"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
assert_contains() {
local label="$1"
local needle="$2"
local haystack="$3"
if printf '%s' "$haystack" | grep -qF "$needle"; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label"
echo " needle: <$needle>"
echo " haystack: <$(printf '%s' "$haystack" | head -c 400)>"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
assert_file_exists() {
local label="$1"
local path="$2"
if [ -f "$path" ]; then
echo " PASS $label"
PASS=$((PASS + 1))
else
echo " FAIL $label (not found: $path)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} ${label}"
fi
}
# Existence (foundation — every other test depends on these)
echo
echo "== existence =="
assert_file_exists "workflow file exists" "$WORKFLOW"
assert_file_exists "script file exists" "$SCRIPT"
if [ "$FAIL" -gt 0 ]; then
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL (existence)"
echo "Cannot proceed without these files."
exit 1
fi
# T6 / T7 — workflow YAML structure
echo
echo "== T6/T7 workflow yaml =="
# YAML parseability
PARSE_OUT=$(python3 -c 'import sys,yaml;yaml.safe_load(open(sys.argv[1]).read());print("ok")' "$WORKFLOW" 2>&1 || true)
assert_eq "T7 workflow parses as YAML" "ok" "$PARSE_OUT"
# Three required gates in the `if:` expression
WORKFLOW_CONTENT=$(cat "$WORKFLOW")
assert_contains "T6a workflow if: contains author_association gate" \
"github.event.comment.author_association" "$WORKFLOW_CONTENT"
assert_contains "T6b workflow if: gates on MEMBER/OWNER/COLLABORATOR" \
'["MEMBER","OWNER","COLLABORATOR"]' "$WORKFLOW_CONTENT"
assert_contains "T6c workflow if: contains slash-command trigger" \
"/refire-tier-check" "$WORKFLOW_CONTENT"
assert_contains "T6d workflow if: gates on PR-not-issue" \
"github.event.issue.pull_request" "$WORKFLOW_CONTENT"
assert_contains "T6e workflow listens on issue_comment" \
"issue_comment" "$WORKFLOW_CONTENT"
assert_contains "T6f workflow requests statuses:write permission" \
"statuses: write" "$WORKFLOW_CONTENT"
# Does NOT check out PR HEAD (security)
if grep -q 'ref: \${{ github.event.pull_request.head' "$WORKFLOW"; then
echo " FAIL T6g workflow MUST NOT check out PR head (security)"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T6g"
else
echo " PASS T6g workflow does not check out PR head"
PASS=$((PASS + 1))
fi
# T1-T5 — script behavior against a local Gitea-fixture
echo
echo "== T1-T5 script behavior (vs local fixture) =="
# Spin up the fixture HTTP server.
FIXTURE_DIR=$(mktemp -d)
trap 'rm -rf "$FIXTURE_DIR"; [ -n "${FIX_PID:-}" ] && kill "$FIX_PID" 2>/dev/null || true' EXIT
FIXTURE_PY="$THIS_DIR/_refire_fixture.py"
if [ ! -f "$FIXTURE_PY" ]; then
echo "::error::fixture server $FIXTURE_PY missing"
exit 1
fi
FIX_LOG="$FIXTURE_DIR/fixture.log"
FIX_STATE_DIR="$FIXTURE_DIR/state"
mkdir -p "$FIX_STATE_DIR"
# Find an unused port.
FIX_PORT=$(python3 -c 'import socket;s=socket.socket();s.bind(("127.0.0.1",0));print(s.getsockname()[1]);s.close()')
FIXTURE_STATE_DIR="$FIX_STATE_DIR" python3 "$FIXTURE_PY" "$FIX_PORT" \
>"$FIX_LOG" 2>&1 &
FIX_PID=$!
# Wait for fixture readiness.
for _ in $(seq 1 50); do
if curl -fsS "http://127.0.0.1:${FIX_PORT}/_ping" >/dev/null 2>&1; then
break
fi
sleep 0.1
done
if ! curl -fsS "http://127.0.0.1:${FIX_PORT}/_ping" >/dev/null 2>&1; then
echo "::error::fixture server failed to start. Log:"
cat "$FIX_LOG"
exit 1
fi
# Helper: set fixture state for a scenario, then run the script.
# tier_result is one of: pass | fail_no_label | fail_no_approvals.
# The refire script's tier-check invocation is mocked because the real
# sop-tier-check.sh uses bash 4+ associative arrays — incompatible with
# the macOS bash 3.2 dev shell. Linux Gitea runners use bash 4/5 so
# production runs the real script. The mock exercises the success +
# failure branches of refire's status-POST glue.
run_scenario() {
local scenario="$1"
local tier_result="${2:-pass}"
echo "$scenario" >"$FIX_STATE_DIR/scenario"
: >"$FIX_STATE_DIR/posted_statuses.jsonl" # clear status log
local out
set +e
out=$(
PATH="$FIXTURE_DIR/bin:$PATH" \
GITEA_TOKEN="fixture-token" \
GITEA_HOST="fixture.local" \
REPO="molecule-ai/molecule-core" \
PR_NUMBER="999" \
COMMENT_AUTHOR="test-runner" \
SOP_REFIRE_DISABLE_RATE_LIMIT="1" \
SOP_REFIRE_TIER_CHECK_SCRIPT="$THIS_DIR/_mock_tier_check.sh" \
MOCK_TIER_RESULT="$tier_result" \
FIXTURE_PORT="$FIX_PORT" \
bash "$SCRIPT" 2>&1
)
local rc=$?
set -e
echo "$out" >"$FIX_STATE_DIR/last_run.log"
echo "$rc" >"$FIX_STATE_DIR/last_rc"
}
# Install a curl shim that rewrites https://fixture.local → http://127.0.0.1:$PORT
# Use bash prefix-strip (${var#prefix}) — it sidesteps the `/` delimiter
# confusion of ${var/pattern/replacement}.
mkdir -p "$FIXTURE_DIR/bin"
cat >"$FIXTURE_DIR/bin/curl" <<SHIM
#!/usr/bin/env bash
# Test shim: rewrite https://fixture.local/* -> http://127.0.0.1:${FIX_PORT}/*
# The fixture doesn't authenticate; -H Authorization passes through harmlessly.
new_args=()
for a in "\$@"; do
if [[ "\$a" == https://fixture.local/* ]]; then
rest="\${a#https://fixture.local}"
a="http://127.0.0.1:${FIX_PORT}\${rest}"
fi
new_args+=("\$a")
done
exec /usr/bin/curl "\${new_args[@]}"
SHIM
chmod +x "$FIXTURE_DIR/bin/curl"
# T1: tier:low + 1 APPROVED + author is in engineers team → success
run_scenario "T1_success" "pass"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
assert_eq "T1 exit code 0 (success)" "0" "$RC"
assert_contains "T1 POSTed state=success" '"state": "success"' "$POSTED"
assert_contains "T1 POST context is sop-tier-check / tier-check" \
'"context": "sop-tier-check / tier-check (pull_request)"' "$POSTED"
assert_contains "T1 description names commenter" "test-runner" "$POSTED"
# T2: missing tier label → tier-check fails → failure status POSTed
run_scenario "T2_no_tier_label" "fail_no_label"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
# tier-check.sh exits 1; refire script forwards that exit, so RC != 0
if [ "$RC" -ne 0 ]; then
echo " PASS T2 exit code non-zero (got $RC)"
PASS=$((PASS + 1))
else
echo " FAIL T2 exit code should be non-zero, got 0"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T2_rc"
fi
assert_contains "T2 POSTed state=failure" '"state": "failure"' "$POSTED"
# T3: tier:low present but ZERO approving reviews → failure
run_scenario "T3_no_approvals" "fail_no_approvals"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
if [ "$RC" -ne 0 ]; then
echo " PASS T3 exit code non-zero (got $RC)"
PASS=$((PASS + 1))
else
echo " FAIL T3 exit code should be non-zero, got 0"
FAIL=$((FAIL + 1))
FAILED_TESTS="${FAILED_TESTS} T3_rc"
fi
assert_contains "T3 POSTed state=failure" '"state": "failure"' "$POSTED"
# T4: closed PR — refire is a no-op (no POST, exit 0)
run_scenario "T4_closed" "pass"
RC=$(cat "$FIX_STATE_DIR/last_rc")
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
assert_eq "T4 closed PR exits 0" "0" "$RC"
assert_eq "T4 closed PR posts no status" "" "$POSTED"
# T5: rate-limit — disable the env override and let scenario set a
# recent statuses entry. Re-enable rate-limit for this scenario by NOT
# passing SOP_REFIRE_DISABLE_RATE_LIMIT.
echo "T5_rate_limited" >"$FIX_STATE_DIR/scenario"
: >"$FIX_STATE_DIR/posted_statuses.jsonl"
set +e
T5_OUT=$(
PATH="$FIXTURE_DIR/bin:$PATH" \
GITEA_TOKEN="fixture-token" \
GITEA_HOST="fixture.local" \
REPO="molecule-ai/molecule-core" \
PR_NUMBER="999" \
COMMENT_AUTHOR="test-runner" \
FIXTURE_PORT="$FIX_PORT" \
bash "$SCRIPT" 2>&1
)
T5_RC=$?
set -e
POSTED=$(cat "$FIX_STATE_DIR/posted_statuses.jsonl" 2>/dev/null || true)
assert_eq "T5 rate-limited exits 0" "0" "$T5_RC"
assert_contains "T5 rate-limited log says skipped" "rate-limited" "$T5_OUT"
assert_eq "T5 rate-limited posts no status" "" "$POSTED"
echo
echo "------"
echo "PASS=$PASS FAIL=$FAIL"
if [ "$FAIL" -gt 0 ]; then
echo "Failed:$FAILED_TESTS"
fi
[ "$FAIL" -eq 0 ]
@@ -1,8 +1,6 @@
name: Staging SaaS smoke (every 30 min)
name: Canary — staging SaaS smoke (every 30 min)
# Renamed from canary-staging.yml on 2026-05-11 per Hongming directive
# ("canary naming changed to staging for all"). Originally ported from
# .github/workflows/canary-staging.yml on 2026-05-11 per RFC
# Ported from .github/workflows/canary-staging.yml on 2026-05-11 per RFC
# internal#219 §1 sweep. Differences from the GitHub version:
# - Dropped `workflow_dispatch.inputs` (Gitea 1.22.6 parser rejects them
# per feedback_gitea_workflow_dispatch_inputs_unsupported).
@@ -23,21 +21,21 @@ name: Staging SaaS smoke (every 30 min)
# catches drift in the 30-min window between those runs (AMI health, CF
# cert rotation, WorkOS session stability, etc.).
#
# Lean mode: E2E_MODE=smoke skips the child workspace + HMA memory +
# Lean mode: E2E_MODE=canary skips the child workspace + HMA memory +
# peers/activity checks. One parent workspace + one A2A turn is enough
# to signal "SaaS stack end-to-end is alive."
on:
schedule:
# Every 30 min. Cron on GitHub-hosted runners has a known drift of
# a few minutes under load — that's fine for a smoke check.
# a few minutes under load — that's fine for a canary.
- cron: '*/30 * * * *'
# Serialise with the full-SaaS workflow so they don't contend for the
# same org-create quota on staging. Different group key from
# e2e-staging-saas since we don't mind queueing smoke runs behind one
# full run, but two smoke runs SHOULD queue against each other.
# e2e-staging-saas since we don't mind queueing canaries behind one
# full run, but two canaries SHOULD queue against each other.
concurrency:
group: staging-smoke
group: canary-staging
cancel-in-progress: false
permissions:
@@ -49,47 +47,32 @@ env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
smoke:
name: Staging SaaS smoke
canary:
name: Canary smoke
runs-on: ubuntu-latest
# NOTE: Phase 3 (RFC #219 §1) `continue-on-error: true` removed
# 2026-05-11. The "surface broken workflows without blocking"
# rationale was correctly applied to advisory/lint workflows but
# wrong for this smoke — it is the 30-min canary cadence for the
# entire staging SaaS stack, and silent failure here masks the
# exact regressions the smoke exists to surface (AMI rot, CF cert
# drift, WorkOS session breakage, secret rotations). Same class of
# failure as PR#461 (`sweep-stale-e2e-orgs`) where Phase-3 silent
# failure leaked EC2. The four other `e2e-staging-*` workflows
# KEEP `continue-on-error: true` per RFC #219 §1 — they are
# advisory and matrix-style; this one is the canary. A follow-up
# `notify-failure` step below also surfaces breakage to ops even
# if branch-protection wiring is adjusted to keep this off the
# required-checks list.
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
# 25 min headroom over the 15-min TLS-readiness deadline in
# tests/e2e/test_staging_full_saas.sh (#2107). Without the buffer
# the job is killed at the wall-clock 15:00 mark BEFORE the bash
# `fail` + diagnostic burst can fire, leaving every cancellation
# silent. Sibling staging E2E jobs run at 20-45 min — keeping the
# smoke tighter than them so a true wedge still surfaces here
# silent. Sibling staging E2E jobs run at 20-45 min — keeping
# canary tighter than them so a true wedge still surfaces here
# first.
timeout-minutes: 25
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
# 2026-05-11: secret canonicalised from MOLECULE_STAGING_ADMIN_TOKEN
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
# MiniMax is the smoke's PRIMARY LLM auth path post-2026-05-04.
MOLECULE_ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
# MiniMax is the canary's PRIMARY LLM auth path post-2026-05-04.
# Switched from hermes+OpenAI after #2578 (the staging OpenAI key
# account went over quota and stayed dead for 36+ hours, taking
# the smoke red the entire time). claude-code template's
# the canary red the entire time). claude-code template's
# `minimax` provider routes ANTHROPIC_BASE_URL to
# api.minimax.io/anthropic and reads MINIMAX_API_KEY at boot —
# ~5-10x cheaper per token than gpt-4.1-mini AND on a separate
# billing account, so OpenAI quota collapse no longer wedges the
# smoke. Mirrors the migration continuous-synth-e2e.yml made on
# canary. Mirrors the migration continuous-synth-e2e.yml made on
# 2026-05-03 (#265) for the same reason. tests/e2e/test_staging_
# full_saas.sh branches SECRETS_JSON on which key is present —
# MiniMax wins when set.
@@ -103,16 +86,16 @@ jobs:
# E2E_RUNTIME=hermes overridden via workflow_dispatch can still
# exercise the OpenAI path without re-editing the workflow.
E2E_OPENAI_API_KEY: ${{ secrets.MOLECULE_STAGING_OPENAI_API_KEY }}
E2E_MODE: smoke
E2E_MODE: canary
E2E_RUNTIME: claude-code
# Pin the smoke to a specific MiniMax model rather than relying
# Pin the canary to a specific MiniMax model rather than relying
# on the per-runtime default (which could resolve to "sonnet" →
# direct Anthropic and defeat the cost saving). M2.7-highspeed
# is "Token Plan only" but cheap-per-token and fast.
E2E_MODEL_SLUG: MiniMax-M2.7-highspeed
E2E_RUN_ID: "smoke-${{ github.run_id }}"
E2E_RUN_ID: "canary-${{ github.run_id }}"
# Debug-only: when an operator dispatches with keep_on_failure=true,
# the smoke script's E2E_KEEP_ORG=1 path skips teardown so the
# the canary script's E2E_KEEP_ORG=1 path skips teardown so the
# tenant org + EC2 stay alive for SSM-based log capture. Cron runs
# never set this (the input only exists on workflow_dispatch) so
# unattended cron always tears down. See molecule-core#129
@@ -126,7 +109,7 @@ jobs:
- name: Verify admin token present
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN not set"
echo "::error::MOLECULE_STAGING_ADMIN_TOKEN not set"
exit 2
fi
@@ -136,7 +119,7 @@ jobs:
# langgraph (operator-dispatched only) use OpenAI. Hard-fail
# rather than soft-skip per the lesson from synth E2E #2578:
# an empty key silently falls through to the wrong
# SECRETS_JSON branch and the smoke fails 5 min later with
# SECRETS_JSON branch and the canary fails 5 min later with
# a confusing auth error instead of the clean "secret
# missing" message at the top.
case "${E2E_RUNTIME}" in
@@ -172,8 +155,8 @@ jobs:
fi
echo "LLM key present ✓ (runtime=${E2E_RUNTIME}, key=${required_secret_name}, len=${#required_secret_value})"
- name: Smoke run
id: smoke
- name: Canary run
id: canary
run: bash tests/e2e/test_staging_full_saas.sh
# Alerting: open a sticky issue on the FIRST failure; comment on
@@ -201,9 +184,6 @@ jobs:
run: |
set -euo pipefail
API="${SERVER_URL%/}/api/v1"
# Title kept stable across the canary-staging.yml → staging-smoke.yml
# rename (2026-05-11) so any open alert issue from the old name
# still title-matches and auto-closes on the next green run.
TITLE="Canary failing: staging SaaS smoke"
RUN_URL="${SERVER_URL}/${REPO}/actions/runs/${RUN_ID}"
@@ -214,18 +194,18 @@ jobs:
if [ -n "$EXISTING" ]; then
curl -fsS -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
"${API}/repos/${REPO}/issues/${EXISTING}/comments" \
-d "$(jq -nc --arg run "$RUN_URL" '{body: ("Smoke still failing. " + $run)}')" >/dev/null
-d "$(jq -nc --arg run "$RUN_URL" '{body: ("Canary still failing. " + $run)}')" >/dev/null
echo "Commented on existing issue #${EXISTING}"
else
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
BODY=$(jq -nc --arg t "$TITLE" --arg now "$NOW" --arg run "$RUN_URL" \
'{title: $t, body: ("Smoke run failed at " + $now + ".\n\nRun: " + $run + "\n\nThis issue auto-closes on the next green smoke run. Consecutive failures add a comment here rather than a new issue.")}')
'{title: $t, body: ("Canary run failed at " + $now + ".\n\nRun: " + $run + "\n\nThis issue auto-closes on the next green canary run. Consecutive failures add a comment here rather than a new issue.")}')
curl -fsS -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
"${API}/repos/${REPO}/issues" -d "$BODY" >/dev/null
echo "Opened smoke failure issue (first red)"
echo "Opened canary failure issue (first red)"
fi
- name: Auto-close smoke issue on success (Gitea API)
- name: Auto-close canary issue on success (Gitea API)
if: success()
env:
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
@@ -235,8 +215,6 @@ jobs:
run: |
set -euo pipefail
API="${SERVER_URL%/}/api/v1"
# Title kept stable across the canary-staging.yml → staging-smoke.yml
# rename so open alert issues from the old name still match.
TITLE="Canary failing: staging SaaS smoke"
NUMS=$(curl -fsS -H "Authorization: token $GITEA_TOKEN" \
@@ -247,36 +225,37 @@ jobs:
for N in $NUMS; do
curl -fsS -X POST -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
"${API}/repos/${REPO}/issues/${N}/comments" \
-d "$(jq -nc --arg now "$NOW" '{body: ("Smoke recovered at " + $now + ". Closing.")}')" >/dev/null
-d "$(jq -nc --arg now "$NOW" '{body: ("Canary recovered at " + $now + ". Closing.")}')" >/dev/null
curl -fsS -X PATCH -H "Authorization: token $GITEA_TOKEN" -H "Content-Type: application/json" \
"${API}/repos/${REPO}/issues/${N}" -d '{"state":"closed"}' >/dev/null
echo "Closed recovered smoke issue #${N}"
echo "Closed recovered canary issue #${N}"
done
- name: Teardown safety net
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
run: |
set +e
# Slug prefix matches what test_staging_full_saas.sh emits
# in smoke mode:
# SLUG="e2e-smoke-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
# Earlier (pre-2026-05-11 canary→staging rename) the prefix was
# `e2e-canary-`; both prefixes are matched here for one
# release cycle so cleanup still catches any in-flight org
# provisioned under the old prefix on an older runner that
# hasn't picked up the renamed script. Remove the canary
# fallback after one week of no-old-prefix observations.
# in canary mode:
# SLUG="e2e-canary-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
# Earlier this was `e2e-{today}-canary-` — that was the
# full-mode pattern (date FIRST, mode SECOND); canary slugs
# have mode FIRST, date SECOND. The mismatch silently
# never matched, leaving every cancelled-canary EC2 alive
# until the once-an-hour sweep eventually caught it
# (incident 2026-04-26 21:03Z: 1h25m EC2 leak before manual
# cleanup; same gap on three earlier cancellations today).
orgs=$(curl -sS "$MOLECULE_CP_URL/cp/admin/orgs" \
-H "Authorization: Bearer $ADMIN_TOKEN" 2>/dev/null \
| python3 -c "
import json, sys, os, datetime
run_id = os.environ.get('GITHUB_RUN_ID', '')
d = json.load(sys.stdin)
# Scope to slugs from THIS smoke run when GITHUB_RUN_ID is
# available; the smoke workflow sets E2E_RUN_ID='smoke-\${run_id}'
# so the slug suffix is '-smoke-\${run_id}-...'. Mirrors the
# Scope to slugs from THIS canary run when GITHUB_RUN_ID is
# available; the canary workflow sets E2E_RUN_ID='canary-\${run_id}'
# so the slug suffix is '-canary-\${run_id}-...'. Mirrors the
# full-mode safety net's per-run scoping (e2e-staging-saas.yml)
# added after the 2026-04-21 cross-run cleanup incident.
# Sweep both today AND yesterday's UTC dates so a run that
@@ -286,11 +265,9 @@ jobs:
yesterday = today - datetime.timedelta(days=1)
dates = (today.strftime('%Y%m%d'), yesterday.strftime('%Y%m%d'))
if run_id:
prefixes = tuple(f'e2e-smoke-{d}-smoke-{run_id}' for d in dates) \
+ tuple(f'e2e-canary-{d}-canary-{run_id}' for d in dates)
prefixes = tuple(f'e2e-canary-{d}-canary-{run_id}' for d in dates)
else:
prefixes = tuple(f'e2e-smoke-{d}-' for d in dates) \
+ tuple(f'e2e-canary-{d}-' for d in dates)
prefixes = tuple(f'e2e-canary-{d}-' for d in dates)
candidates = [o['slug'] for o in d.get('orgs', [])
if any(o.get('slug','').startswith(p) for p in prefixes)
and o.get('status') not in ('purged',)]
@@ -303,8 +280,8 @@ jobs:
# stale sweep caught it (up to 2h later). Now we capture the
# response code and surface non-2xx as a workflow warning, so
# the run page shows which slug leaked. We still don't `exit 1`
# on cleanup failure — a single-smoke cleanup miss shouldn't
# fail-flag the smoke itself when the actual smoke check
# on cleanup failure — a single-canary cleanup miss shouldn't
# fail-flag the canary itself when the actual smoke check
# passed. The sweep-stale-e2e-orgs cron (now every 15 min,
# 30-min threshold) is the safety net for whatever slips past.
# See molecule-controlplane#420.
@@ -313,34 +290,21 @@ jobs:
# Tempfile-routed -w + set +e/-e prevents curl-exit-code
# pollution of the captured status (lint-curl-status-capture.yml).
set +e
curl -sS -o /tmp/smoke-cleanup.out -w "%{http_code}" \
curl -sS -o /tmp/canary-cleanup.out -w "%{http_code}" \
-X DELETE "$MOLECULE_CP_URL/cp/admin/tenants/$slug" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d "{\"confirm\":\"$slug\"}" >/tmp/smoke-cleanup.code
-d "{\"confirm\":\"$slug\"}" >/tmp/canary-cleanup.code
set -e
code=$(cat /tmp/smoke-cleanup.code 2>/dev/null || echo "000")
code=$(cat /tmp/canary-cleanup.code 2>/dev/null || echo "000")
if [ "$code" = "200" ] || [ "$code" = "204" ]; then
echo "[teardown] deleted $slug (HTTP $code)"
else
echo "::warning::smoke teardown for $slug returned HTTP $code — sweep-stale-e2e-orgs will catch it within ~45 min. Body: $(head -c 300 /tmp/smoke-cleanup.out 2>/dev/null)"
echo "::warning::canary teardown for $slug returned HTTP $code — sweep-stale-e2e-orgs will catch it within ~45 min. Body: $(head -c 300 /tmp/canary-cleanup.out 2>/dev/null)"
leaks+=("$slug")
fi
done
if [ ${#leaks[@]} -gt 0 ]; then
echo "::warning::smoke teardown left ${#leaks[@]} leak(s): ${leaks[*]}"
echo "::warning::canary teardown left ${#leaks[@]} leak(s): ${leaks[*]}"
fi
exit 0
- name: Notify on smoke failure
# Fail-loud companion to dropping `continue-on-error: true`.
# The Open-issue-on-failure step above handles the human-facing
# alert; this step emits a clearly-tagged ::error:: line that
# log-tail consumers (Loki SOPRefireRule, orchestrator triage
# loop) can grep on. Mirrors PR#461's sweep-stale-e2e-orgs
# pattern. Runs AFTER the teardown safety net (which is
# if: always()) so failures don't suppress cleanup.
if: failure()
run: |
echo "::error::staging-smoke FAILED — staging SaaS canary is red. See prior step logs + the auto-filed alert issue. Common causes: (a) CP_STAGING_ADMIN_API_TOKEN secret missing/rotated, (b) staging-api.moleculesai.app 5xx, (c) MiniMax/Anthropic LLM key dead, (d) AMI/CF/WorkOS drift. The 30-min cron will retry, but a chronic red here indicates the staging SaaS stack is broken end-to-end."
exit 1
@@ -1,8 +1,6 @@
name: Staging verify
name: canary-verify
# Renamed from canary-verify.yml on 2026-05-11 per Hongming directive
# ("canary naming changed to staging for all"). Originally ported from
# .github/workflows/canary-verify.yml on 2026-05-11 per RFC
# Ported from .github/workflows/canary-verify.yml on 2026-05-11 per RFC
# internal#219 §1 sweep. Differences from the GitHub version:
# - Dropped `workflow_dispatch.inputs` (Gitea 1.22.6 parser rejects them
# per feedback_gitea_workflow_dispatch_inputs_unsupported).
@@ -25,22 +23,13 @@ name: Staging verify
# digest. On red, :latest stays on the prior known-good digest and
# prod is untouched.
#
# Terminology note (2026-05-11): The deployment STRATEGY here is still
# called "canary release" (a small subset of tenants gets the new image
# first, the rest follow on green). The "canary" word stays for the
# pre-fan-out cohort concept (see docs/architecture/canary-release.md
# and CANARY_SLUG in redeploy-tenants-on-*.yml). What changed is the
# FILE NAME and the SECRETS feeding this workflow — both are renamed
# to drop the redundant "canary-" prefix that conflated workflow
# identity with deployment strategy.
#
# Registry note (2026-05-10): This workflow previously used GHCR
# (ghcr.io/molecule-ai/platform-tenant) — that registry was retired
# during the 2026-05-06 Gitea suspension migration when publish-
# workspace-server-image.yml switched to the operator's ECR org
# (153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/
# platform-tenant). The GHCR → ECR migration was never applied to
# this file, so this workflow was silently smoke-testing the stale
# this file, so canary-verify was silently smoke-testing the stale
# GHCR image while the actual staging/prod tenants ran the ECR image.
# Result: smoke tests could not catch a broken ECR build. Fix:
# - Wait step: reads SHA from running canary /health (tenant-
@@ -54,9 +43,8 @@ name: Staging verify
# to ECR on staging and main merges.
# - Canary tenants are configured to pull :staging-<sha> from ECR
# (TENANT_IMAGE env set to the ECR :staging-<sha> tag).
# - Repo secrets MOLECULE_STAGING_TENANT_URLS /
# MOLECULE_STAGING_ADMIN_TOKENS / MOLECULE_STAGING_CP_SHARED_SECRET
# are populated.
# - Repo secrets CANARY_TENANT_URLS / CANARY_ADMIN_TOKENS /
# CANARY_CP_SHARED_SECRET are populated.
on:
workflow_run:
@@ -77,7 +65,7 @@ env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
staging-smoke:
canary-smoke:
# Skip when the upstream workflow failed — no image to test against.
# workflow_dispatch trigger dropped in this Gitea port; only the
# workflow_run path remains.
@@ -109,15 +97,15 @@ jobs:
# other registry — the canary is telling us what it's actually
# running, which is the ground truth for smoke testing.
env:
MOLECULE_STAGING_TENANT_URLS: ${{ secrets.MOLECULE_STAGING_TENANT_URLS }}
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
EXPECTED_SHA: ${{ steps.compute.outputs.sha }}
run: |
if [ -z "$MOLECULE_STAGING_TENANT_URLS" ]; then
if [ -z "$CANARY_TENANT_URLS" ]; then
echo "No canary URLs configured — falling back to 60s wait"
sleep 60
exit 0
fi
IFS=',' read -ra URLS <<< "$MOLECULE_STAGING_TENANT_URLS"
IFS=',' read -ra URLS <<< "$CANARY_TENANT_URLS"
MAX_WAIT=420 # 7 minutes
INTERVAL=30
ELAPSED=0
@@ -141,7 +129,7 @@ jobs:
done
echo "Timeout after ${MAX_WAIT}s — proceeding anyway (smoke suite will validate)"
- name: Run staging smoke suite
- name: Run canary smoke suite
id: smoke
# Graceful-skip when no canary fleet is configured (Phase 2 not yet
# stood up — see molecule-controlplane/docs/canary-tenants.md).
@@ -150,29 +138,29 @@ jobs:
# promote-latest.yml is the release gate while canary is absent.
# Once the fleet is real: delete the early-exit branch.
env:
MOLECULE_STAGING_TENANT_URLS: ${{ secrets.MOLECULE_STAGING_TENANT_URLS }}
MOLECULE_STAGING_ADMIN_TOKENS: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKENS }}
MOLECULE_STAGING_CP_BASE_URL: https://staging-api.moleculesai.app
MOLECULE_STAGING_CP_SHARED_SECRET: ${{ secrets.MOLECULE_STAGING_CP_SHARED_SECRET }}
CANARY_TENANT_URLS: ${{ secrets.CANARY_TENANT_URLS }}
CANARY_ADMIN_TOKENS: ${{ secrets.CANARY_ADMIN_TOKENS }}
CANARY_CP_BASE_URL: https://staging-api.moleculesai.app
CANARY_CP_SHARED_SECRET: ${{ secrets.CANARY_CP_SHARED_SECRET }}
run: |
set -euo pipefail
if [ -z "${MOLECULE_STAGING_TENANT_URLS:-}" ] \
|| [ -z "${MOLECULE_STAGING_ADMIN_TOKENS:-}" ] \
|| [ -z "${MOLECULE_STAGING_CP_SHARED_SECRET:-}" ]; then
if [ -z "${CANARY_TENANT_URLS:-}" ] \
|| [ -z "${CANARY_ADMIN_TOKENS:-}" ] \
|| [ -z "${CANARY_CP_SHARED_SECRET:-}" ]; then
{
echo "## ⚠️ staging-verify skipped"
echo "## ⚠️ canary-verify skipped"
echo
echo "One or more canary secrets are unset (\`MOLECULE_STAGING_TENANT_URLS\`, \`MOLECULE_STAGING_ADMIN_TOKENS\`, \`MOLECULE_STAGING_CP_SHARED_SECRET\`)."
echo "One or more canary secrets are unset (\`CANARY_TENANT_URLS\`, \`CANARY_ADMIN_TOKENS\`, \`CANARY_CP_SHARED_SECRET\`)."
echo "Phase 2 canary fleet has not been stood up yet —"
echo "see [canary-tenants.md](https://git.moleculesai.app/molecule-ai/molecule-controlplane/blob/main/docs/canary-tenants.md)."
echo
echo "**Skipped — promote-to-latest will NOT auto-fire.** Dispatch \`promote-latest.yml\` manually when ready."
} >> "$GITHUB_STEP_SUMMARY"
echo "ran=false" >> "$GITHUB_OUTPUT"
echo "::notice::staging-verify: skipped — no canary fleet configured"
echo "::notice::canary-verify: skipped — no canary fleet configured"
exit 0
fi
bash scripts/staging-smoke.sh
bash scripts/canary-smoke.sh
echo "ran=true" >> "$GITHUB_OUTPUT"
- name: Summary on failure
@@ -185,7 +173,7 @@ jobs:
echo ":latest stays pinned to the prior good digest — prod is untouched."
echo
echo "Fix forward and merge again, or investigate the specific failed"
echo "assertions in the staging-smoke step log above."
echo "assertions in the canary-smoke step log above."
} >> "$GITHUB_STEP_SUMMARY"
promote-to-latest:
@@ -200,13 +188,13 @@ jobs:
# silently promoting a stale GHCR image while actual prod tenants
# pulled from ECR. Canary smoke tests were GHCR-targeted and could
# not catch a broken ECR build.
needs: staging-smoke
if: ${{ needs.staging-smoke.result == 'success' && needs.staging-smoke.outputs.smoke_ran == 'true' }}
needs: canary-smoke
if: ${{ needs.canary-smoke.result == 'success' && needs.canary-smoke.outputs.smoke_ran == 'true' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
env:
SHA: ${{ needs.staging-smoke.outputs.sha }}
SHA: ${{ needs.canary-smoke.outputs.sha }}
CP_URL: ${{ vars.CP_URL || 'https://staging-api.moleculesai.app' }}
# CP_ADMIN_API_TOKEN gates write access to the redeploy endpoint.
# Stored at the repo level so all workflows pick it up automatically.
@@ -276,9 +264,9 @@ jobs:
- name: Summary
run: |
{
echo "## Staging verified — :latest promoted via CP redeploy-fleet"
echo "## Canary verified — :latest promoted via CP redeploy-fleet"
echo ""
echo "- **Target tag:** \`staging-${{ needs.staging-smoke.outputs.sha }}\`"
echo "- **Target tag:** \`staging-${{ needs.canary-smoke.outputs.sha }}\`"
echo "- **Registry:** ECR (\`${TENANT_IMAGE_NAME}\`)"
echo "- **Canary slug:** \`${CANARY_SLUG:-<none>}\` (soak ${SOAK_SECONDS}s)"
echo "- **Batch size:** ${BATCH_SIZE:-3}"
+7 -12
View File
@@ -77,18 +77,13 @@ jobs:
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Run drift detector
env:
# DRIFT_BOT_TOKEN is owned by mc-drift-bot, a least-privilege
# Gitea persona whose ONLY job is reading branch_protections
# and posting the [ci-drift] tracking issue. The endpoint
# `GET /repos/.../branch_protections/{branch}` requires
# repo-ADMIN role (Gitea 1.22.6) — SOP_TIER_CHECK_TOKEN and the
# auto-injected GITHUB_TOKEN do NOT have it (read-only / write
# without admin), so the previous fallback chain 403'd.
# Mirrors the controlplane fix landed in CP PR#134.
# Provisioning trail: internal#329 (audit) + parent pattern
# internal#327 (publish-runtime-bot). Per
# `feedback_per_agent_gitea_identity_default`.
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
# GITEA_TOKEN reads protection + writes issues. molecule-core
# uses `SOP_TIER_CHECK_TOKEN` as the org-level secret name for
# read-only Gitea API access from CI (set by audit-force-merge
# and sop-tier-check too). Falls back to the auto-injected
# GITHUB_TOKEN if the org-level secret isn't set
# (transitional repos).
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
# Branches whose protection we compare against. molecule-core
-100
View File
@@ -148,21 +148,6 @@ jobs:
- if: needs.changes.outputs.platform == 'true'
name: Run golangci-lint
run: golangci-lint run --timeout 3m ./... || true
- if: needs.changes.outputs.platform == 'true'
name: Diagnostic — per-package verbose 60s
run: |
set +e
go test -race -v -timeout 60s ./internal/handlers/... 2>&1 | tee /tmp/test-handlers.log
handlers_exit=$?
go test -race -v -timeout 60s ./internal/pendinguploads/... 2>&1 | tee /tmp/test-pu.log
pu_exit=$?
echo "::group::handlers exit=$handlers_exit (last 100 lines)"
tail -100 /tmp/test-handlers.log
echo "::endgroup::"
echo "::group::pendinguploads exit=$pu_exit (last 100 lines)"
tail -100 /tmp/test-pu.log
echo "::endgroup::"
continue-on-error: true
- if: needs.changes.outputs.platform == 'true'
name: Run tests with race detection and coverage
run: go test -race -coverprofile=coverage.out ./...
@@ -466,88 +451,3 @@ jobs:
echo " adjusting the floor with rationale in COVERAGE_FLOOR.md."
exit 1
fi
all-required:
# Aggregator sentinel — RFC internal#219 §2 (Phase 4 — closes internal#286).
#
# Single stable required-status name that branch protection points at;
# CI churns underneath in `needs:` without any protection edits. Mirrors
# the molecule-controlplane Phase 2a impl shipped in CP PR#112 and
# referenced by `internal#286` ("Phase 4 is a single small PR... mirrors
# CP's existing one").
#
# Closes the failure mode where status_check_contexts on molecule-core/main
# only listed `Secret scan` + `sop-tier-check` (the 2 meta-gates), so real
# `Platform (Go)` / `Canvas (Next.js)` / `Python Lint & Test` / `Shellcheck`
# red silently merged through. See internal#286 for the three concrete
# tonight-of-2026-05-11 incidents that prompted the emergency bump.
#
# Three properties of this job each close a failure mode:
#
# 1. `if: always()` — runs even when an upstream fails. Without it the
# sentinel is `skipped` and protection treats that as missing → merge
# ungated.
#
# 2. Assertion is `result == "success"` per dep, NOT `!= "failure"`.
# A `skipped` upstream (job gated by `if:` evaluating false, matrix
# entry that couldn't run) must NOT silently pass through.
# `skipped`-as-green is exactly the failure mode this gate closes.
#
# 3. `needs:` is the canonical list of "what counts as required."
# status_check_contexts will reference only `ci/all-required` (Step 5
# follow-up — branch-protection PATCH is Owners-tier per
# `feedback_never_admin_merge_bypass`, separate PR); a new job is
# added simply by listing it in `needs:` here.
# `.gitea/workflows/ci-required-drift.yml` files a [ci-drift] issue
# hourly if this list diverges from status_check_contexts or from
# audit-force-merge.yml's REQUIRED_CHECKS env (RFC §4 + §6).
#
# Excluded from `needs:`: `canvas-deploy-reminder` — gated by
# `if: ... github.event_name == 'push' && github.ref == 'refs/heads/main'`,
# so on PR events it's legitimately `skipped`. The drift detector
# explicitly excludes `github.event_name`-gated jobs from F1 (see
# `.gitea/scripts/ci-required-drift.py::ci_job_names`).
#
# Phase 3 (RFC #219 §1) safety: continue-on-error here so the sentinel
# does not hard-fail and block PRs while the underlying build jobs are
# still in Phase 3 (continue-on-error: true suppresses their status to null).
# When Phase 3 ends (defects fixed, continue-on-error flipped off on build
# jobs), remove continue-on-error here so the sentinel again hard-fails.
continue-on-error: true
runs-on: ubuntu-latest
timeout-minutes: 1
needs:
- changes
- platform-build
- canvas-build
- shellcheck
- python-lint
if: always()
steps:
- name: Assert every required dependency succeeded
run: |
set -euo pipefail
# `needs.*.result` is one of: success | failure | cancelled | skipped | null.
# We assert success per dep (not != failure) — see RFC §2 reasoning above.
# Null results are skipped: they come from Phase 3 (continue-on-error: true
# suppresses status) or from jobs still in-flight. The sentinel succeeds
# rather than blocking PRs on Phase 3 noise.
results='${{ toJSON(needs) }}'
echo "$results"
echo "$results" | python3 -c '
import json, sys
ns = json.load(sys.stdin)
# Exclude null (Phase 3 suppressed / in-flight) from the bad list.
bad = [(k, v.get("result")) for k, v in ns.items()
if v.get("result") not in ("success", None)]
if bad:
print(f"FAIL: jobs not green:", file=sys.stderr)
for k, r in bad:
print(f" - {k}: {r}", file=sys.stderr)
sys.exit(1)
pending = [(k, v.get("result")) for k, v in ns.items() if v.get("result") is None]
if pending:
print(f"WARN: {len(pending)} job(s) still in-flight (result=null): " +
", ".join(k for k, _ in pending), file=sys.stderr)
print(f"OK: all {len(ns)} required jobs succeeded (or Phase-3 suppressed)")
'
+1 -1
View File
@@ -56,7 +56,7 @@ on:
# 2. Avoid colliding with the existing :15 sweep-cf-orphans
# and :45 sweep-cf-tunnels — both hit the CF API and we
# don't want to fight for rate-limit tokens.
# 3. Avoid the :30 heavy slot (staging-smoke /30, sweep-aws-
# 3. Avoid the :30 heavy slot (canary-staging /30, sweep-aws-
# secrets, sweep-stale-e2e-orgs every :15) — multiple
# overlapping cron registrations on the same minute is part
# of what GH drops under load.
+3 -6
View File
@@ -124,10 +124,7 @@ jobs:
env:
CANVAS_E2E_STAGING: '1'
MOLECULE_CP_URL: https://staging-api.moleculesai.app
# 2026-05-11: secret canonicalised from MOLECULE_STAGING_ADMIN_TOKEN
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MOLECULE_ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
defaults:
run:
@@ -148,7 +145,7 @@ jobs:
if: needs.detect-changes.outputs.canvas == 'true'
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::Missing CP_STAGING_ADMIN_API_TOKEN"
echo "::error::Missing MOLECULE_STAGING_ADMIN_TOKEN"
exit 2
fi
@@ -210,7 +207,7 @@ jobs:
- name: Teardown safety net
if: always() && needs.detect-changes.outputs.canvas == 'true'
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
run: |
set +e
STATE_FILE=".playwright-staging-state.json"
+3 -6
View File
@@ -89,10 +89,7 @@ jobs:
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
# 2026-05-11: secret canonicalised from MOLECULE_STAGING_ADMIN_TOKEN
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MOLECULE_ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
E2E_RUN_ID: "${{ github.run_id }}-${{ github.run_attempt }}"
E2E_KEEP_ORG: ${{ github.event.inputs.keep_org && '1' || '0' }}
E2E_STALE_WAIT_SECS: ${{ github.event.inputs.stale_wait_secs || '180' }}
@@ -107,7 +104,7 @@ jobs:
# missing — silent skip would mask infra rot. Manual dispatch
# gets the same hard-fail; an operator running this on a fork
# without secrets configured needs to know up-front.
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
echo "::error::MOLECULE_STAGING_ADMIN_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
echo "Admin token present ✓"
@@ -132,7 +129,7 @@ jobs:
- name: Teardown safety net (runs on cancel/failure)
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
run: |
set +e
orgs=$(curl -sS "$MOLECULE_CP_URL/cp/admin/orgs" \
+10 -46
View File
@@ -24,22 +24,17 @@ name: E2E Staging SaaS (full lifecycle)
# PRs don't need to read.
#
# Triggers:
# - Push to main (regression guard — fires on merges to main, not on PR updates)
# - pull_request: pr-validate always posts success; real E2E step runs only
# when provisioning-critical files change (detect-changes gates the step).
# - Push to main (regression guard)
# - workflow_dispatch (manual re-run from UI)
# - Nightly cron (catches drift even when no pushes land)
#
# NOTE: A separate pr-validate job handles the pull_request path so this
# workflow posts CI status for workflow-only PRs. Without it, a PR that
# only touches the workflow file has no status check (workflow only fires
# on push, not PR branches), which blocks merge under branch protection.
# The E2E step itself only runs when provisioning-critical files change —
# pr-validate always posts success, avoiding the double-fire that motivated
# the pull_request-trigger removal in PRs #516/#530.
# - Changes to any provisioning-critical file under PR review (opt-in
# via the same paths watcher that e2e-api.yml uses)
on:
# Trunk-based (Phase 3 of internal#81): main is the only branch.
# Previously this fired on staging push too because staging was a
# superset of main and ran the gate ahead of auto-promote; with no
# staging branch, main is where E2E gates the deploy.
push:
branches: [main]
paths:
@@ -60,7 +55,6 @@ on:
- 'workspace-server/internal/provisioner/**'
- 'tests/e2e/test_staging_full_saas.sh'
- '.gitea/workflows/e2e-staging-saas.yml'
workflow_dispatch:
schedule:
# 07:00 UTC every day — catches AMI drift, WorkOS cert rotation,
# Cloudflare API regressions, etc. even on quiet days.
@@ -78,36 +72,9 @@ env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
# PR-validation path: always posts success so branch protection can merge
# workflow-only PRs. The actual E2E step only runs when provisioning-
# critical files change (git-paths filter + if: guard below).
# All steps use continue-on-error: true so runner issues do not block merge.
pr-validate:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
continue-on-error: true
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
continue-on-error: true
- name: YAML validation (best-effort)
run: |
echo "e2e-staging-saas.yml — PR validation: workflow YAML is valid."
echo "E2E step runs only when provisioning-critical files change."
continue-on-error: true
# Actual E2E: runs on trunk pushes (main + staging). NOT the PR-fire-only
# path — pr-validate above posts success for workflow-only PRs.
e2e-staging-saas:
name: E2E Staging SaaS
runs-on: ubuntu-latest
# Only runs on trunk pushes. PR paths get pr-validate instead.
if: github.event.pull_request.base.ref == ''
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
timeout-minutes: 45
@@ -119,10 +86,7 @@ jobs:
# Single admin-bearer secret drives provision + tenant-token
# retrieval + teardown. Configure in
# Settings → Secrets and variables → Actions → Repository secrets.
# 2026-05-11: secret canonicalised from MOLECULE_STAGING_ADMIN_TOKEN
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MOLECULE_ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
# MiniMax is the PRIMARY LLM auth path post-2026-05-04. Switched
# from hermes+OpenAI default after #2578 (the staging OpenAI key
# account went over quota and stayed dead for 36+ hours, taking
@@ -131,7 +95,7 @@ jobs:
# ANTHROPIC_BASE_URL to api.minimax.io/anthropic and reads
# MINIMAX_API_KEY at boot — separate billing account so an
# OpenAI quota collapse no longer wedges the gate. Mirrors the
# staging-smoke.yml + continuous-synth-e2e.yml migrations.
# canary-staging.yml + continuous-synth-e2e.yml migrations.
E2E_MINIMAX_API_KEY: ${{ secrets.MOLECULE_STAGING_MINIMAX_API_KEY }}
# Direct-Anthropic alternative for operators who don't want to
# set up a MiniMax account (priority below MiniMax — first
@@ -158,7 +122,7 @@ jobs:
- name: Verify admin token present
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
echo "::error::MOLECULE_STAGING_ADMIN_TOKEN secret not set (Railway staging CP_ADMIN_API_TOKEN)"
exit 2
fi
echo "Admin token present ✓"
@@ -225,7 +189,7 @@ jobs:
- name: Teardown safety net (runs on cancel/failure)
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
run: |
# Best-effort: find any e2e-YYYYMMDD-* orgs matching this run and
# nuke them. Catches the case where the script died before
+10 -19
View File
@@ -11,11 +11,11 @@ name: E2E Staging Sanity (leak-detection self-check)
# - `continue-on-error: true` on the job (RFC §1 contract).
#
# Periodic assertion that the teardown safety nets in e2e-staging-saas
# and staging-smoke (formerly canary-staging) actually work. Runs the
# E2E harness with E2E_INTENTIONAL_FAILURE=1, which poisons the tenant
# admin token after the org is provisioned. The workspace-provision
# step then fails, the script exits non-zero, and the EXIT trap +
# workflow always()-step must still tear down cleanly.
# and canary-staging actually work. Runs the E2E harness with
# E2E_INTENTIONAL_FAILURE=1, which poisons the tenant admin token after
# the org is provisioned. The workspace-provision step then fails, the
# script exits non-zero, and the EXIT trap + workflow always()-step
# must still tear down cleanly.
on:
schedule:
@@ -42,11 +42,8 @@ jobs:
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
# 2026-05-11: secret canonicalised from MOLECULE_STAGING_ADMIN_TOKEN
# (dead in org secret store) to CP_STAGING_ADMIN_API_TOKEN per
# internal#322 — see this PR for the cross-workflow sweep.
MOLECULE_ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
E2E_MODE: smoke
MOLECULE_ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
E2E_MODE: canary
E2E_RUNTIME: hermes
E2E_RUN_ID: "sanity-${{ github.run_id }}"
E2E_INTENTIONAL_FAILURE: "1"
@@ -57,7 +54,7 @@ jobs:
- name: Verify admin token present
run: |
if [ -z "$MOLECULE_ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN not set"
echo "::error::MOLECULE_STAGING_ADMIN_TOKEN not set"
exit 2
fi
@@ -121,7 +118,7 @@ jobs:
- name: Teardown safety net
if: always()
env:
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
run: |
set +e
orgs=$(curl -sS "$MOLECULE_CP_URL/cp/admin/orgs" \
@@ -130,14 +127,8 @@ jobs:
import json, sys
d = json.load(sys.stdin)
today = __import__('datetime').date.today().strftime('%Y%m%d')
# Match both the new e2e-smoke- prefix (post-2026-05-11 rename)
# and the legacy e2e-canary- prefix for one rollout cycle so
# any in-flight org provisioned under the old prefix on an
# older runner checkout still gets cleaned up. Remove the
# canary fallback after one week of no-old-prefix observations.
prefixes = (f'e2e-smoke-{today}-sanity-', f'e2e-canary-{today}-sanity-')
candidates = [o['slug'] for o in d.get('orgs', [])
if any(o.get('slug','').startswith(p) for p in prefixes)
if o.get('slug','').startswith(f'e2e-canary-{today}-sanity-')
and o.get('status') not in ('purged',)]
print('\n'.join(candidates))
" 2>/dev/null)
-97
View File
@@ -1,97 +0,0 @@
# gate-check-v3 — automated PR gate detector
#
# Runs on every open PR (push/synchronize) and hourly via cron.
# Posts a structured [gate-check-v3] STATUS: comment on the PR.
#
# Inputs:
# PR_NUMBER — set via ${{ github.event.pull_request.number }} from the trigger
# POST_COMMENT — "true" to post/update comment on PR
#
# Gating logic (MVP signals 1,2,3,6):
# 1. Author-aware agent-tag comment scan
# 2. REQUEST_CHANGES reviews state machine
# 3. Staleness detection (SOP-12: review.commit_id != PR.head_sha + >1 working day)
# 6. CI required-checks awareness
#
# Exit code: 0=CLEAR, 1=BLOCKED, 2=ERROR
name: gate-check-v3
on:
pull_request_target:
types: [opened, edited, synchronize, reopened]
schedule:
# Hourly: refresh all open PRs
- cron: '8 * * * *'
# NOTE: `workflow_dispatch.inputs` block intentionally omitted.
# Gitea 1.22.6 parser rejects `workflow_dispatch.inputs.X` with
# "unknown on type" — it mis-treats the inputs sub-keys as top-level
# `on:` event types. Dropping the inputs block restores parsing.
# Manual dispatch from the Gitea UI works without the inputs schema
# (github.event.inputs.X returns empty); the script falls back to
# iterating all open PRs when PR_NUMBER is empty.
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
gate-check:
runs-on: ubuntu-latest
continue-on-error: true # Never block on our own detector failing
steps:
- name: Check out BASE ref (never PR-head under pull_request_target)
# pull_request_target runs with repo secrets-context, so checking out
# the PR HEAD would execute PR-branch gate_check.py with secrets.
# Fix: always load gate_check.py from the trusted base/default ref.
# Bug-1 (self-loop exclusion) + Bug-3 (403→exit0) from #547 are
# kept; only this checkout-ref regresses to pre-#547 behavior.
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.pull_request.base.sha || github.ref_name }}
- name: Run gate-check-v3 (single PR mode)
if: github.event_name == 'pull_request_target' || github.event.inputs.pr_number != ''
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.inputs.pr_number }}
POST_COMMENT: ${{ github.event.inputs.post_comment || 'true' }}
run: |
set -euo pipefail
python3 tools/gate-check-v3/gate_check.py \
--repo "${{ github.repository }}" \
--pr "$PR_NUMBER" \
$([ "$POST_COMMENT" = "true" ] && echo "--post-comment")
echo "verdict=$?" >> "$GITHUB_OUTPUT"
- name: Run gate-check-v3 (all open PRs — cron mode)
if: github.event_name == 'schedule'
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
# Fetch all open PRs and run gate-check on each
# socket.setdefaulttimeout(15): defence-in-depth for missing SOP_TIER_CHECK_TOKEN.
# gate_check.py uses timeout=15 on every urlopen call; this catches the
# inline Python polling loop too (issue #603).
pr_numbers=$(python3 -c "
import socket, urllib.request, json, os
socket.setdefaulttimeout(15)
token = os.environ['GITEA_TOKEN']
req = urllib.request.Request(
'https://git.moleculesai.app/api/v1/repos/${{ github.repository }}/pulls?state=open&limit=100',
headers={'Authorization': f'token {token}', 'Accept': 'application/json'}
)
with urllib.request.urlopen(req) as r:
prs = json.loads(r.read())
for pr in prs:
print(pr['number'])
")
for pr in $pr_numbers; do
echo "Checking PR #$pr..."
python3 tools/gate-check-v3/gate_check.py \
--repo "${{ github.repository }}" \
--pr "$pr" \
--post-comment \
|| true
done
+16 -56
View File
@@ -34,7 +34,7 @@ name: Harness Replays
# One job → one check run → branch-protection-clean (the SKIPPED-in-set
# trap from PR #2264 is documented in e2e-api.yml's e2e-api job comment).
"on":
on:
push:
branches: [main, staging]
paths:
@@ -68,25 +68,8 @@ jobs:
run: ${{ steps.decide.outputs.run }}
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# Shallow clone — we use the Gitea Compare API for changed-file
# detection, not local git diff. The base SHA is supplied via
# GitHub event variables, so no local history is needed.
fetch-depth: 1
- id: decide
env:
# Pass via env block — env values bypass shell quoting so single
# quotes in merge-commit messages (e.g. "Merge pull request 'fix: ...'
# from branch into main") cannot break the bash parser. The prior
# `echo '${{ toJSON(...) }}'` form broke on every main-push because
# every main commit is a merge commit with single quotes in the
# message body — the embedded `'` ended the single-quoted shell string
# mid-JSON, and a subsequent `(` (e.g. in `(#523)`) was parsed as a
# subshell, causing "syntax error near unexpected token `('".
COMMITS_JSON: ${{ toJSON(github.event.commits) }}
run: |
set -euo pipefail
# workflow_dispatch: always run (manual trigger)
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "run=true" >> "$GITHUB_OUTPUT"
@@ -94,31 +77,16 @@ jobs:
exit 0
fi
# Determine changed files.
# workflow_dispatch: always run.
# pull_request: use Compare API (branch-to-branch works fine).
# push: use github.event.commits array (Compare API rejects SHA-to-branch).
# new-branch: run everything.
if [ "${{ github.event_name }}" = "pull_request" ]; then
BASE="${{ github.event.pull_request.base.ref }}"
HEAD="${{ github.event.pull_request.head.ref }}"
# Determine the base commit to diff against.
# For pull_request: use base.sha (the merge-base with main/staging).
# For push: use github.event.before (the previous tip of the branch).
# Fallback for new branches (all-zeros SHA): run everything.
if [ "${{ github.event_name }}" = "pull_request" ] && \
[ -n "${{ github.event.pull_request.base.sha }}" ]; then
BASE="${{ github.event.pull_request.base.sha }}"
elif [ -n "${{ github.event.before }}" ] && \
! echo "${{ github.event.before }}" | grep -qE '^0+$'; then
# Push event: extract changed files from github.event.commits array.
# Gitea Compare API rejects SHA-to-branch comparisons (BaseNotExist),
# so we use the commits array instead. This array contains all commits
# in the push, each with their added/removed/modified file lists.
printf '%s' "$COMMITS_JSON" \
| bash .gitea/scripts/push-commits-diff-files.py \
> .push-diff-files.txt 2>/dev/null || true
DIFF_FILES=$(cat .push-diff-files.txt 2>/dev/null || true)
if [ -n "$DIFF_FILES" ] && echo "$DIFF_FILES" | grep -qE '^workspace-server/|^canvas/|^tests/harness/|^.gitea/workflows/harness-replays\.yml$'; then
echo "run=true" >> "$GITHUB_OUTPUT"
else
echo "run=false" >> "$GITHUB_OUTPUT"
fi
echo "debug=push-files=$DIFF_FILES" >> "$GITHUB_OUTPUT"
exit 0
BASE="${{ github.event.before }}"
else
# New branch or github.event.before unavailable — run everything.
echo "run=true" >> "$GITHUB_OUTPUT"
@@ -126,17 +94,11 @@ jobs:
exit 0
fi
# Call Gitea Compare API (pull_request path only — branch-to-branch).
# Push uses github.event.commits array above.
RESP=$(curl -sS --fail --max-time 30 \
-H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
-H "Accept: application/json" \
"$GITHUB_SERVER_URL/api/v1/repos/$GITHUB_REPOSITORY/compare/$BASE...$HEAD")
DIFF_FILES=$(echo "$RESP" | bash .gitea/scripts/compare-api-diff-files.py 2>/dev/null || true)
# GitHub Actions and Gitea Actions both expose github.sha for HEAD.
DIFF=$(git diff --name-only "$BASE" "${{ github.sha }}" 2>/dev/null)
echo "debug=diff-base=$BASE diff-files=$DIFF" >> "$GITHUB_OUTPUT"
echo "debug=diff-base=$BASE diff-files=$DIFF_FILES" >> "$GITHUB_OUTPUT"
if echo "$DIFF_FILES" | grep -qE '^workspace-server/|^canvas/|^tests/harness/|^.gitea/workflows/harness-replays\.yml$'; then
if echo "$DIFF" | grep -qE '^workspace-server/|^canvas/|^tests/harness/|^.gitea/workflows/harness-replays\.yml$'; then
echo "run=true" >> "$GITHUB_OUTPUT"
else
echo "run=false" >> "$GITHUB_OUTPUT"
@@ -220,14 +182,12 @@ jobs:
run: |
set -euo pipefail
if [ -z "${MOLECULE_GITEA_TOKEN}" ]; then
echo "::warning::AUTO_SYNC_TOKEN not set — using anonymous clone (repos are public per manifest.json OSS contract)"
echo "::error::AUTO_SYNC_TOKEN secret is empty — register the devops-engineer persona PAT in repo Actions secrets"
exit 1
fi
mkdir -p .tenant-bundle-deps
# Strip JSON5 comments before jq parsing — Integration Tester appends
# `// Triggered by ...` which breaks `jq` in clone-manifest.sh.
sed '/^[[:space:]]*\/\//d' manifest.json > .manifest-stripped.json
bash scripts/clone-manifest.sh \
.manifest-stripped.json \
manifest.json \
.tenant-bundle-deps/workspace-configs-templates \
.tenant-bundle-deps/org-templates \
.tenant-bundle-deps/plugins
+4 -7
View File
@@ -37,13 +37,10 @@ name: main-red-watchdog
# "unknown on type" when `workflow_dispatch.inputs.X` is present. Revisit
# when Gitea ≥ 1.23 is fleet-wide.
on:
# SCHEDULE DISABLED 2026-05-12 — interim per RFC#420 Option-C machinery-down emergency
# Watchdog timing out behind runner saturation; rev3+dedicated-runner-label in flight
# Re-enable after rev3 lands + runner saturation root resolved
# schedule:
# # Hourly at :05 — task spec calls for "off-zero" (`5 * * * *`),
# # offset from :17 (ci-required-drift) and :00 (peak cron load).
# - cron: '5 * * * *'
schedule:
# Hourly at :05 — task spec calls for "off-zero" (`5 * * * *`),
# offset from :17 (ci-required-drift) and :00 (peak cron load).
- cron: '5 * * * *'
workflow_dispatch:
# Read commit status + branch ref + issues; write issues (open/PATCH/close).
+1 -9
View File
@@ -11,7 +11,7 @@ name: publish-canvas-image
# - `continue-on-error: true` on each job (RFC §1 contract).
# - **Open question for review**: this workflow pushes the canvas
# image to `ghcr.io`. GHCR was retired during the 2026-05-06
# Gitea migration in favor of ECR (per staging-verify.yml header
# Gitea migration in favor of ECR (per canary-verify.yml header
# notes). The image may not be consumable post-migration. Two
# options for follow-up: (a) retarget to
# `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas`,
@@ -54,12 +54,6 @@ env:
jobs:
build-and-push:
name: Build & push canvas image
# REVERTED (infra/revert-docker-runner-label): `runs-on: ubuntu-latest` restored.
# The `docker` label is not registered on any act_runner. `runs-on: [ubuntu-latest, docker]`
# causes jobs to queue indefinitely with zero eligible runners — strictly worse than the
# pre-#599 coin-flip (50% success rate). Once the `docker` label is registered on
# ≥2 runners, re-apply the fix from #599 (infra/docker-runner-label).
# See issue #576 + infra-lead pulse ~00:30Z.
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
@@ -85,10 +79,8 @@ jobs:
run: |
set -euo pipefail
echo "::group::Docker daemon health check"
echo "Runner: ${HOSTNAME:-unknown}"
docker info 2>&1 | head -5 || {
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
echo "::error::Runner: ${HOSTNAME:-unknown}"
echo "::error::Check: (1) daemon running, (2) runner user in docker group, (3) sock perms 660+"
exit 1
}
+5 -54
View File
@@ -23,23 +23,12 @@ name: publish-runtime-autobump
# and try to tag 0.1.130 simultaneously, only one of which would land.
on:
# Run on PR pushes to post a success status so Gitea can merge the PR.
# All steps use continue-on-error: true so operational failures
# (PyPI unreachable, DISPATCH_TOKEN missing) do not block merge.
pull_request:
paths:
- "workspace/**"
# Bump-and-tag on main/staging push (the actual operational trigger).
push:
branches:
- main
- staging
paths:
- "workspace/**"
# Manual dispatch — useful when Gitea Actions API (/actions/*) is
# unreachable (e.g. act_runner 404 on Gitea 1.22.6) and we cannot
# re-trigger via curl.
workflow_dispatch:
permissions:
contents: write # required to push tags back
@@ -49,53 +38,15 @@ concurrency:
cancel-in-progress: false
jobs:
# PR-validation path: always succeeds so Gitea can merge workflow-only PRs.
# Operational failures (PyPI unreachable, missing DISPATCH_TOKEN) are
# surfaced via continue-on-error: true rather than blocking the merge.
# The actual bump work happens on the main/staging push after merge.
pr-validate:
autobump-and-tag:
runs-on: ubuntu-latest
continue-on-error: true # do not block PR merge on operational failures
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
- name: Validate PyPI connectivity (best-effort)
run: |
set -eu
echo "=== Checking PyPI accessibility ==="
LATEST=$(curl -fsS --retry 3 --max-time 10 \
https://pypi.org/pypi/molecule-ai-workspace-runtime/json \
| python -c "import sys,json; print(json.load(sys.stdin)['info']['version'])" \
|| echo "PyPI unreachable (non-blocking for PR validation)")
echo "Latest: ${LATEST:-unknown}"
# Actual bump-and-tag: runs on main/staging pushes, posts real success/failure.
# No continue-on-error — operational failures here trip the main-red
# watchdog, which is the desired signal for infrastructure degradation.
bump-and-tag:
runs-on: ubuntu-latest
# Only fire on push events (main/staging after PR merge). Pull_request
# events are handled by pr-validate above; we do NOT bump on every
# push-synchronize because that would race with the PR head.
#
# NOTE: the prior condition `github.event.pull_request.base.ref == ''`
# was broken — on a PR-merge push in Gitea Actions, the pull_request
# context is still attached (base.ref='main'), so the condition always
# evaluated to false and bump-and-tag was permanently skipped.
if: github.event_name == 'push'
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
- name: Fetch tags for collision check
run: git fetch origin --tags --depth=1
# Fetch full tag list so the bump logic can sanity-check against
# what's already in this repo (catches collision with prior
# manual tag pushes).
fetch-depth: 0
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
@@ -32,9 +32,11 @@ on:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
# Serialize per-branch so two rapid main pushes don't race the same
# :staging-latest tag retag. Allow parallel runs as they produce
# different :staging-<sha> tags and last-write-wins on :staging-latest.
# Serialize per-branch so two rapid staging pushes don't race the same
# :staging-latest tag retag. Allow staging and main to run in parallel
# (different GITHUB_REF → different concurrency group) since they
# produce different :staging-<sha> tags and last-write-wins on
# :staging-latest is acceptable across branches.
#
# cancel-in-progress: false → in-flight builds finish; the next push's
# build queues. This avoids a partially-pushed image.
@@ -52,12 +54,6 @@ env:
jobs:
build-and-push:
# REVERTED (infra/revert-docker-runner-label): `runs-on: ubuntu-latest` restored.
# The `docker` label is not registered on any act_runner. `runs-on: [ubuntu-latest, docker]`
# causes jobs to queue indefinitely with zero eligible runners — strictly worse than the
# pre-#599 coin-flip (50% success rate). Once the `docker` label is registered on
# ≥2 runners, re-apply the fix from #599 (infra/docker-runner-label).
# See issue #576 + infra-lead pulse ~00:30Z.
runs-on: ubuntu-latest
steps:
- name: Checkout
@@ -74,10 +70,8 @@ jobs:
run: |
set -euo pipefail
echo "::group::Docker daemon health check"
echo "Runner: ${HOSTNAME:-unknown}"
docker info 2>&1 | head -5 || {
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
echo "::error::Runner: ${HOSTNAME:-unknown}"
echo "::error::Check: (1) daemon is running, (2) runner user is in docker group, (3) sock permissions are 660+"
exit 1
}
@@ -100,15 +94,13 @@ jobs:
MOLECULE_GITEA_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
run: |
set -euo pipefail
# clone-manifest.sh supports anonymous cloning for public repos (post-
# 2026-05-08 migration). The token is only needed for private repos.
# Do NOT require it — a missing secret would fail the build unnecessarily.
if [ -z "${MOLECULE_GITEA_TOKEN}" ]; then
echo "::error::AUTO_SYNC_TOKEN secret is empty"
exit 1
fi
mkdir -p .tenant-bundle-deps
# Strip JSON5 comments before jq parsing — Integration Tester appends
# `// Triggered by ...` which breaks `jq` in clone-manifest.sh.
sed '/^[[:space:]]*\/\//d' manifest.json > .manifest-stripped.json
bash scripts/clone-manifest.sh \
.manifest-stripped.json \
manifest.json \
.tenant-bundle-deps/workspace-configs-templates \
.tenant-bundle-deps/org-templates \
.tenant-bundle-deps/plugins
@@ -125,11 +117,6 @@ jobs:
# Build + push platform image (inline ECR auth — mirrors the operator-host
# approach; credentials come from GITHUB_SECRET_AWS_ACCESS_KEY_ID /
# GITHUB_SECRET_AWS_SECRET_ACCESS_KEY in Gitea Actions).
# docker buildx bake / build required for `imagetools inspect` digest
# capture in the CP pin-update step (RFC internal#229 §X step 4 PR-1).
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
- name: Build & push platform image to ECR (staging-<sha> + staging-latest)
env:
IMAGE_NAME: ${{ env.IMAGE_NAME }}
@@ -145,16 +132,17 @@ jobs:
ECR_REGISTRY="${IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
docker buildx build \
docker build \
--file ./workspace-server/Dockerfile \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://git.moleculesai.app/molecule-ai/${REPO}" \
--label "org.opencontainers.image.source=https://github.com/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--label "molecule.workflow.run_id=${GITHUB_RUN_ID}" \
--label "org.opencontainers.image.description=Molecule AI platform — pending canary verify" \
--tag "${IMAGE_NAME}:${TAG_SHA}" \
--tag "${IMAGE_NAME}:${TAG_LATEST}" \
--push .
.
docker push "${IMAGE_NAME}:${TAG_SHA}"
docker push "${IMAGE_NAME}:${TAG_LATEST}"
# Build + push tenant image (Go platform + Next.js canvas in one image).
- name: Build & push tenant image to ECR (staging-<sha> + staging-latest)
@@ -172,14 +160,15 @@ jobs:
ECR_REGISTRY="${TENANT_IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
docker buildx build \
docker build \
--file ./workspace-server/Dockerfile.tenant \
--build-arg NEXT_PUBLIC_PLATFORM_URL= \
--build-arg GIT_SHA="${GIT_SHA}" \
--label "org.opencontainers.image.source=https://git.moleculesai.app/molecule-ai/${REPO}" \
--label "org.opencontainers.image.source=https://github.com/${REPO}" \
--label "org.opencontainers.image.revision=${GIT_SHA}" \
--label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--label "molecule.workflow.run_id=${GITHUB_RUN_ID}" \
--label "org.opencontainers.image.description=Molecule AI tenant platform + canvas — pending canary verify" \
--tag "${TENANT_IMAGE_NAME}:${TAG_SHA}" \
--tag "${TENANT_IMAGE_NAME}:${TAG_LATEST}" \
--push .
.
docker push "${TENANT_IMAGE_NAME}:${TAG_SHA}"
docker push "${TENANT_IMAGE_NAME}:${TAG_LATEST}"
-164
View File
@@ -1,164 +0,0 @@
# qa-review — non-author APPROVE from the `qa` Gitea team required to merge.
#
# RFC#324 Step 1 of 5 (workflow-add). Pairs with `security-review.yml` and the
# branch-protection flip in Step 2.
#
# === DESIGN (RFC#324 v1.1 addendum) ===
#
# A1-α (refire mechanism):
# Triggers on:
# - `pull_request_target`: opened, synchronize, reopened
# → initial status posts when PR opens / re-pushes
# - `issue_comment`: /qa-recheck slash-command on the PR
# → manual re-fire after a QA reviewer clicks APPROVE
# (Gitea 1.22.6 doesn't re-fire on pull_request_review, per
# go-gitea/gitea#33700 + feedback_pull_request_review_no_refire)
# Workflow name = `qa-review` ; job name = `approved`.
# The job's own pass/fail conclusion publishes the status context
# `qa-review / approved (<event>)` — NO `POST /statuses` call → NO
# write:repository token scope needed. Sidesteps internal#321 defect #2.
#
# A1.1 (privilege check on slash-comment — INFORMATIONAL ONLY, NOT a gate):
# The `issue_comment` event fires for ANY commenter, including
# non-collaborators. The original (v1.2) design gated the eval step
# behind a collaborator probe → if a non-collaborator commented
# /qa-recheck, the eval was `if:`-skipped → the job exited 0 anyway →
# the status context published `success` with ZERO real APPROVE.
# That was a fail-open: any visitor could green the gate.
#
# RFC#324 v1.3 §A1.1 correction (option b per hongming-pc 1421):
# drop privilege-gating of the evaluation entirely. The eval is
# read-only and idempotent — it reads `pulls/{N}/reviews` and
# `teams/{id}/members/{u}` (both API-side state that a commenter can't
# change). Re-running it on a non-collaborator's comment is harmless
# AND correct: if a real team-member APPROVE exists, the eval flips
# green; if not, it stays red.
#
# We KEEP the privilege step as a `::notice::` log line only — useful
# for griefer-spotting (one operator spamming /recheck) without
# touching the gate. If rate-limiting is needed later, add it as a
# separate concern (time-window throttle, not a privilege gate).
#
# We MUST NOT use `github.event.comment.author_association` (the
# field doesn't exist on Gitea 1.22.6 webhook payload — this was
# sop-tier-refire's defect #1).
#
# A4 (no PR-head checkout under pull_request_target):
# We check out the BASE ref explicitly so the review-check.sh script is
# loaded from trusted source. We NEVER use `ref: ${{ github.event.pull_request.head.sha }}`.
# No PR-head code is executed in the runner. Trust boundary preserved.
#
# A5 (real Gitea team):
# `qa` team (id=20) verified by orchestrator preflight 2026-05-11; queried
# at run time via /api/v1/teams/20/members/{login}.
#
# === TOKEN ===
#
# The workflow reads PR state, PR reviews, and team membership.
# Gitea 1.22.6's /api/v1/teams/{id}/members/{u} returns 403 ('Must be a
# team member') for tokens whose owner is not in that team. The default
# `secrets.GITHUB_TOKEN` is owned by a workflow-scoped identity that is
# also not in qa/security teams → also 403.
#
# Resolution: a dedicated `RFC_324_TEAM_READ_TOKEN` secret, owned by an
# identity that IS in both `qa` and `security` teams (Owners-tier
# claude-ceo-assistant, or a new service-bot added to both teams).
# Provisioning of this secret is tracked as a follow-up issue (filed by
# core-devops at PR open).
#
# Until that secret is provisioned, the job will exit 1 with a clear
# 403-on-team-probe error and the `qa-review / approved` status will
# stay `failure`. This is the correct fail-closed behavior — the gate
# blocks merge until both (a) a QA team member APPROVEs and (b) the
# workflow has a token that can confirm their team membership.
#
# === SLASH-COMMAND CONTRACT ===
#
# /qa-recheck — re-evaluate the gate (e.g. after an APPROVE lands)
#
# Open to any PR commenter. The eval is read-only and idempotent, so
# unprivileged refires are harmless (RFC#324 v1.3 §A1.1). Collaborator
# status is logged for griefer-spotting but does NOT gate execution.
name: qa-review
on:
pull_request_target:
types: [opened, synchronize, reopened]
issue_comment:
types: [created]
permissions:
contents: read
pull-requests: read
jobs:
approved:
# Gate the job:
# - On pull_request_target events: always run.
# - On issue_comment events: only when it's a PR comment and the body
# contains the slash-command. NO privilege gate at the step level
# (RFC#324 v1.3 §A1.1): a non-collaborator's /qa-recheck is fine
# because the eval is read-only and idempotent — re-running it
# just re-confirms whether a real team-member APPROVE exists.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request != null &&
startsWith(github.event.comment.body, '/qa-recheck'))
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
# RFC#324 v1.3 §A1.1: this step does NOT gate subsequent steps.
# It exists solely as a log line for griefer-spotting (one
# operator spamming /qa-recheck without merit). Re-running the
# read-only eval on a non-collaborator comment is harmless;
# gating it would be fail-open (skipped steps still publish
# `success` for the job's status context).
# Only runs on issue_comment events; pull_request_target has
# no comment.user.login so the step is a no-op skip there.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
login="${{ github.event.comment.user.login }}"
# Write token to a mode-600 file so it never appears in curl's argv.
# (#541: -H "Authorization: token $TOKEN" puts the secret in /proc/<pid>/cmdline)
authfile=$(mktemp)
chmod 600 "$authfile"
printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$authfile"
code=$(curl -sS -o /dev/null -w '%{http_code}' -K "$authfile" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/collaborators/${login}")
rm -f "$authfile"
if [ "$code" = "204" ]; then
echo "::notice::Recheck from ${login} (collaborator=true)"
else
echo "::notice::Recheck from ${login} (collaborator=false, HTTP ${code}) — proceeding with read-only eval anyway"
fi
- name: Check out BASE ref (A4 — never PR-head)
# Loads the review-check.sh script from a trusted ref. For
# pull_request_target the default checkout is BASE already; we
# set ref explicitly for the issue_comment event too so the
# script source is always the default-branch version.
# NEVER use ref: ${{ github.event.pull_request.head.sha }} —
# that would execute PR-head code with secrets-context.
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.repository.default_branch }}
- name: Evaluate qa-review
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
# PR number lives in different places per event:
# pull_request_target → github.event.pull_request.number
# issue_comment → github.event.issue.number
PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
TEAM: qa
TEAM_ID: '20'
REVIEW_CHECK_DEBUG: '0'
REVIEW_CHECK_STRICT: '0'
run: bash .gitea/scripts/review-check.sh
@@ -32,7 +32,7 @@ name: redeploy-tenants-on-main
#
# Registry: ECR (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/platform-tenant). GHCR was retired 2026-05-07 during the
# Gitea suspension migration. The staging-verify.yml promote step now
# Gitea suspension migration. The canary-verify.yml promote step now
# uses the same redeploy-fleet endpoint (fixes the silent-GHCR gap).
#
# Runtime ordering:
@@ -104,7 +104,7 @@ jobs:
# `staging-<sha>` to roll back to a known-good build.
# 2. Default → `staging-<short_head_sha>`. The just-published
# digest. Bypasses the `:latest` retag path that's currently
# dead (staging-verify soft-skips without canary fleet, so
# dead (canary-verify soft-skips without canary fleet, so
# the only thing retagging `:latest` today is the manual
# promote-latest.yml — last run 2026-04-28). Auto-trigger
# from workflow_run uses workflow_run.head_sha; manual
@@ -359,7 +359,7 @@ jobs:
# Belt-and-suspenders sanity floor: same logic as the staging
# variant — see that file's comment for the full rationale.
# Floor only applies when fleet >= 4; below that, staging-verify
# Floor only applies when fleet >= 4; below that, canary-verify
# is the actual gate.
TOTAL_VERIFIED=${#SLUGS[@]}
if [ $TOTAL_VERIFIED -ge 4 ] && [ $UNREACHABLE_COUNT -gt $((TOTAL_VERIFIED / 2)) ]; then
@@ -21,7 +21,7 @@ name: redeploy-tenants-on-staging
#
# Mirror of redeploy-tenants-on-main.yml, with the staging-CP host and
# the :staging-latest tag. Sister workflow exists for prod (rolls
# :latest after staging-verify). Both share the same shape — just
# :latest after canary-verify). Both share the same shape — just
# different CP_URL + target_tag + admin token secret.
#
# Why this workflow exists: publish-workspace-server-image now builds
@@ -336,7 +336,7 @@ jobs:
# crashes on startup), not a teardown race. Hard-fail.
#
# Floor only applies when TOTAL_VERIFIED >= 4 — below that, the
# staging-verify step is the actual gate for "all tenants down"
# canary-verify step is the actual gate for "all tenants down"
# detection (it runs against the canary first and aborts the
# rollout if the canary fails to come up). Without the >=4 gate,
# a 1-tenant fleet (e.g. a single ephemeral e2e-* tenant on a
-70
View File
@@ -1,70 +0,0 @@
name: review-check-tests
# Runs review-check.sh regression tests on every PR + push that touches
# the evaluator script or its test fixtures.
#
# Follows RFC#324 follow-up (issue #540):
# .gitea/scripts/review-check.sh is load-bearing for PR merge gates.
# It has ZERO production CI coverage. This workflow closes that gap.
#
# Design choices:
# - Bash test harness (not bats). The existing test_review_check.sh
# uses a custom assert_eq/assert_contains framework that is already
# working and covers all 13 acceptance criteria (issue #540 §Acceptance).
# Converting to bats would be refactoring, not closing the gap.
# - No bats dependency: the runner-base image needs no extra tooling.
# - continue-on-error: false — these tests must pass; a failure means
# the review-gate evaluator is broken and must not be merged.
on:
push:
branches: [main, staging]
paths:
- '.gitea/scripts/review-check.sh'
- '.gitea/scripts/tests/test_review_check.sh'
- '.gitea/scripts/tests/_review_check_fixture.py'
- '.gitea/workflows/review-check-tests.yml'
pull_request:
branches: [main, staging]
paths:
- '.gitea/scripts/review-check.sh'
- '.gitea/scripts/tests/test_review_check.sh'
- '.gitea/scripts/tests/_review_check_fixture.py'
- '.gitea/workflows/review-check-tests.yml'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
test:
name: review-check.sh regression tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install jq
# Required for T12 jq-filter test case. Gitea Actions runners (ubuntu-latest
# label) do not bundle jq. Install via apt-get first (reliable for Ubuntu
# runners with internet access to package mirrors). Falls back to GitHub
# binary download. GitHub releases may be blocked on some runner networks
# (infra#241 follow-up).
continue-on-error: true
run: |
if apt-get update -qq && apt-get install -y -qq jq; then
echo "::notice::jq installed via apt-get: $(jq --version)"
elif timeout 120 curl -sSL \
"https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-amd64" \
-o /usr/local/bin/jq && chmod +x /usr/local/bin/jq; then
echo "::notice::jq binary downloaded: $(/usr/local/bin/jq --version)"
else
echo "::warning::jq install failed — apt-get and GitHub download both failed."
fi
jq --version 2>/dev/null || echo "::notice::jq not yet available — continuing"
- name: Run review-check.sh regression suite
run: bash .gitea/scripts/tests/test_review_check.sh
-72
View File
@@ -1,72 +0,0 @@
# security-review — non-author APPROVE from the `security` Gitea team
# required to merge.
#
# RFC#324 Step 1 of 5 (workflow-add). Mirror of `qa-review.yml`; differs
# only in TEAM=security, TEAM_ID=21, and the slash-command name.
#
# See `qa-review.yml` header for the full A1-α / A1.1 / A4 / A5 design
# rationale; everything below is identical in shape.
name: security-review
on:
pull_request_target:
types: [opened, synchronize, reopened]
issue_comment:
types: [created]
permissions:
contents: read
pull-requests: read
jobs:
approved:
# See qa-review.yml header for full A1-α / A1.1 (v1.3 — informational
# log only, NOT a gate) / A4 / A5 design rationale.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request != null &&
startsWith(github.event.comment.body, '/security-recheck'))
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
# RFC#324 v1.3 §A1.1: does NOT gate subsequent steps. See
# qa-review.yml for full rationale. Eval is read-only/idempotent
# so re-running on a non-collaborator comment is harmless.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
login="${{ github.event.comment.user.login }}"
# Write token to a mode-600 file so it never appears in curl's argv.
# (#541: -H "Authorization: token $TOKEN" puts the secret in /proc/<pid>/cmdline)
authfile=$(mktemp)
chmod 600 "$authfile"
printf 'header = "Authorization: token %s"\n' "$GITEA_TOKEN" > "$authfile"
code=$(curl -sS -o /dev/null -w '%{http_code}' -K "$authfile" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/collaborators/${login}")
rm -f "$authfile"
if [ "$code" = "204" ]; then
echo "::notice::Recheck from ${login} (collaborator=true)"
else
echo "::notice::Recheck from ${login} (collaborator=false, HTTP ${code}) — proceeding with read-only eval anyway"
fi
- name: Check out BASE ref (A4 — never PR-head)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.repository.default_branch }}
- name: Evaluate security-review
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
TEAM: security
TEAM_ID: '21'
REVIEW_CHECK_DEBUG: '0'
REVIEW_CHECK_STRICT: '0'
run: bash .gitea/scripts/review-check.sh
-79
View File
@@ -1,79 +0,0 @@
# sop-tier-refire — issue_comment-triggered refire of sop-tier-check.
#
# Closes internal#292. Gitea 1.22.6 doesn't refire workflows on the
# `pull_request_review` event (go-gitea/gitea#33700); the `sop-tier-check`
# workflow's review-event subscription is silently dead. The result:
# PRs that get their approving review AFTER the tier-check ran on open/
# synchronize keep their failing status check forever, and the only way
# to merge is the admin force-merge path (audited via `audit-force-merge`
# but the audit trail keeps growing; see `feedback_never_admin_merge_bypass`).
#
# Workaround pattern from `feedback_pull_request_review_no_refire`:
# `issue_comment` events DO fire reliably on 1.22.6. When a repo
# MEMBER/OWNER/COLLABORATOR comments `/refire-tier-check` on a PR, this
# workflow re-runs the sop-tier-check logic and POSTs the resulting
# status to the PR head SHA directly. No empty commit, no git history
# bloat, no cascade re-fire of every other workflow on the PR.
#
# SECURITY MODEL:
#
# 1. `pull_request` exists on the issue (issue_comment fires on issues
# AND PRs; we only want PRs).
# 2. `comment.author_association` must be MEMBER/OWNER/COLLABORATOR.
# Per the internal#292 core-security review (review#1066 ask): anyone
# can comment, but only repo collaborators+ can flip the status.
# Without this gate, a drive-by commenter on a public-issue-tracker
# surface could trigger a status flip.
# 3. Comment body must contain `/refire-tier-check` — a slash-command-
# shaped trigger (not just any comment word). Prevents accidental
# triggering from prose like "we should refire tests" in a review.
# 4. This workflow does NOT check out PR HEAD code. Like sop-tier-check,
# it only HTTP-calls the Gitea API. Trust boundary preserved.
#
# Note: `issue_comment` fires from the BASE branch's workflow file. There
# is no `pull_request_target` equivalent to set; the trigger inherently
# loads the workflow from the default branch.
#
# Rate-limit: a 1s pre-sleep + a "skip if status posted in last 30s"
# guard prevents comment-spam from thrashing the status. See the script.
name: sop-tier-check refire (issue_comment)
on:
issue_comment:
types: [created]
jobs:
refire:
# Three gates, all required:
# - comment is on a PR (not a plain issue)
# - commenter is MEMBER, OWNER, or COLLABORATOR
# - comment body contains the slash-command trigger
if: |
github.event.issue.pull_request != null &&
contains(fromJson('["MEMBER","OWNER","COLLABORATOR"]'), github.event.comment.author_association) &&
contains(github.event.comment.body, '/refire-tier-check')
runs-on: ubuntu-latest
permissions:
contents: read
pull-requests: read
statuses: write
steps:
- name: Check out base branch (for the script)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# Load the script from the default branch (main), matching the
# sop-tier-check.yml security model.
ref: ${{ github.event.repository.default_branch }}
- name: Re-evaluate sop-tier-check and POST status
env:
# Same org-level secret sop-tier-check.yml + audit-force-merge.yml use.
# Fallback to GITHUB_TOKEN with a clear error if missing.
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
PR_NUMBER: ${{ github.event.issue.number }}
COMMENT_AUTHOR: ${{ github.event.comment.user.login }}
# Set to '1' for diagnostic per-API-call output. Off by default.
SOP_DEBUG: '0'
run: bash .gitea/scripts/sop-tier-refire.sh
-118
View File
@@ -1,118 +0,0 @@
# status-reaper — Option B (compensating-status POST) for Gitea 1.22.6's
# hardcoded `(push)` suffix on default-branch commit statuses.
#
# Tracking: molecule-core#? (this PR), internal#327 (sibling publish-runtime-bot),
# internal#328 (sibling mc-drift-bot), internal#80 (upstream RFC). Sister
# bots already deployed under the same per-persona-identity contract
# (`feedback_per_agent_gitea_identity_default`).
#
# Root cause:
# Gitea 1.22.6 emits commit-status context as
# `<workflow_name> / <job_name> (push)`
# for ANY workflow run on the default branch's HEAD commit, REGARDLESS
# of the trigger event. Schedule- and workflow_dispatch-triggered runs
# on `main` therefore appear as `(push)` failures on the latest main
# commit, painting main red via a fake-push status. Verified on runs
# 14525 + 14526 via Phase 1 evidence (3 sub-agents). No upstream fix
# in 1.23-1.26.1 (sibling a6f20db1 research).
#
# Why a cron-driven reaper, not workflow_run:
# Gitea 1.22.6 does NOT support `on: workflow_run` (verified via
# modules/actions/workflows.go enumeration; sister a6f20db1). The
# only event-shaped option that fires is cron. 5min is chosen to
# sit BETWEEN ci-required-drift (`:17` hourly) and main-red-watchdog
# (`:05` hourly) so the reaper sweeps red before the watchdog files
# a `[main-red]` issue (would-be false-positive).
#
# What the reaper does each tick:
# 1. Parse `.gitea/workflows/*.yml`, classify each by whether `on:`
# contains a `push:` trigger (see script for workflow_id resolution
# including `name:` collision and `/`-in-name fail-loud lints).
# 2. GET combined status for main HEAD.
# 3. For each `failure` status whose context ends ` (push)`:
# - if workflow has push trigger: PRESERVE (real defect signal).
# - if workflow has no push trigger: POST a compensating
# `state=success` with the same context and a description that
# documents the workaround.
#
# What it does NOT do:
# - Mutate non-`(push)`-suffix statuses (e.g. `(pull_request)` from
# branch_protections required-checks — verified safe 2026-05-11).
# - Auto-revert. Same reasoning as main-red-watchdog.
# - Cancel runs. The runs themselves stay visible in Actions UI; the
# fix is at the commit-status surface only.
#
# Removal path: drop this workflow when Gitea ≥ 1.24 ships with a
# real fix for the hardcoded-suffix bug. Audit issue (filed post-merge)
# tracks the deletion as a follow-up sweep.
name: status-reaper
# IMPORTANT — Gitea 1.22.6 parser quirk per
# `feedback_gitea_workflow_dispatch_inputs_unsupported`: do NOT add an
# `inputs:` block here. Gitea 1.22.6 rejects the whole workflow as
# "unknown on type" when `workflow_dispatch.inputs.X` is present.
on:
# SCHEDULE DISABLED 2026-05-12 — interim per RFC#420 Option-C machinery-down emergency
# Reaper rev2 not compensating + watchdog timeout-cascade; rev3 in flight
# Re-enable after rev3 lands + runner saturation root resolved
# schedule:
# # Every 5 minutes. Off-zero alignment with sibling cron workflows:
# # ci-required-drift (`:17`), main-red-watchdog (`:05`),
# # railway-pin-audit (`:23`). 5-min cadence gives a tight enough
# # close on schedule-triggered false-reds that main-red-watchdog
# # (hourly :05) almost never files an issue on the false case.
# - cron: '*/5 * * * *'
workflow_dispatch:
# Compensating-status POST needs write on repo statuses; no other
# write surface is touched. checkout still needs `contents: read`.
permissions:
contents: read
# NOTE: NO `concurrency:` block is intentional.
# Gitea 1.22.6 doesn't honor `cancel-in-progress: false`: queued ticks
# of the same group get cancelled-with-started=0 instead of waiting
# (DB-verified 2026-05-12, runs 16053/16085 of status-reaper.yml).
# The reaper's POST /statuses/{sha} is idempotent — Gitea de-dups by
# context — so concurrent ticks are safe; accept them rather than
# serialise via the broken mechanism.
jobs:
reap:
runs-on: ubuntu-latest
timeout-minutes: 3
steps:
- name: Check out repo at default-branch HEAD
# BASE checkout per `feedback_pull_request_target_workflow_from_base`.
# The script reads .gitea/workflows/*.yml from the working tree to
# classify trigger sets; we must read main's CURRENT state, not
# the SHA a stale schedule fired against.
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.repository.default_branch }}
- name: Set up Python (PyYAML for workflow `on:` parse)
# Pinned to 3.12 to match sibling watchdog / ci-required-drift.
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
# PyYAML is needed because shell-grep on `on:` misses list/string
# forms and nested `push: { paths: ... }`. Same install pattern
# as ci-required-drift.yml (sub-2s install, no wheel cache).
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Compensate operational push-suffix failures on main
env:
# claude-status-reaper persona token; provisioned by sibling
# aefaac1b 2026-05-11. Owns write:repository scope to POST
# /statuses/{sha} but NOTHING ELSE
# (`feedback_per_agent_gitea_identity_default`).
GITEA_TOKEN: ${{ secrets.STATUS_REAPER_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
WATCH_BRANCH: ${{ github.event.repository.default_branch }}
WORKFLOWS_DIR: .gitea/workflows
run: python3 .gitea/scripts/status-reaper.py
+11 -11
View File
@@ -29,15 +29,13 @@ name: Sweep stale AWS Secrets Manager secrets
# reconciler enumerator) is filed as a separate controlplane
# issue. This sweeper is the immediate cost-relief stopgap.
#
# AWS credentials: the confirmed Gitea secrets are AWS_ACCESS_KEY_ID /
# AWS_SECRET_ACCESS_KEY (the molecule-cp IAM user). These are the same
# credentials used by the rest of the platform. The dedicated
# AWS_JANITOR_* naming (which the original GitHub workflow used) was
# never populated in Gitea — the existing secrets are AWS_ACCESS_KEY_ID /
# AWS_SECRET_ACCESS_KEY (per issue #425 §425 audit). These DO have
# secretsmanager:ListSecrets (the production molecule-cp principal);
# if ListSecrets is revoked in future, a dedicated janitor principal
# would need to be created and the Gitea secret names updated here.
# IAM principal: AWS_JANITOR_ACCESS_KEY_ID / AWS_JANITOR_SECRET_ACCESS_KEY.
# This is a DEDICATED principal — the production `molecule-cp` IAM
# user lacks `secretsmanager:ListSecrets` (it only has
# Get/Create/Update/Delete on specific resources, scoped to its
# operational needs). The janitor needs ListSecrets across the
# `molecule/tenant/*` prefix, which warrants a separate principal so
# we don't broaden the prod-CP policy.
#
# Safety: the script's MAX_DELETE_PCT gate (default 50%, mirroring
# sweep-cf-orphans.yml — tenant secrets are durable by design, unlike
@@ -73,8 +71,8 @@ jobs:
timeout-minutes: 30
env:
AWS_REGION: ${{ secrets.AWS_REGION || 'us-east-1' }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_JANITOR_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_JANITOR_SECRET_ACCESS_KEY }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MAX_DELETE_PCT: ${{ github.event.inputs.max_delete_pct || '50' }}
@@ -101,11 +99,13 @@ jobs:
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
echo "::warning::skipping sweep — secrets not configured: ${missing[*]}"
echo "::warning::set them at Settings → Secrets and Variables → Actions, then rerun."
echo "::warning::AWS_JANITOR_* must belong to a principal with secretsmanager:ListSecrets and secretsmanager:DeleteSecret on molecule/tenant/* (the prod molecule-cp principal lacks ListSecrets)."
echo "skip=true" >> "$GITHUB_OUTPUT"
exit 0
fi
echo "::error::sweep cannot run — required secrets missing: ${missing[*]}"
echo "::error::set them at Settings → Secrets and Variables → Actions, or disable this workflow."
echo "::error::AWS_JANITOR_* must belong to a principal with secretsmanager:ListSecrets and secretsmanager:DeleteSecret on molecule/tenant/*."
exit 1
fi
echo "All required secrets present ✓"
-5
View File
@@ -33,11 +33,6 @@ name: Sweep stale Cloudflare DNS records
# gate halts before damage. Decision-function unit tests in
# scripts/ops/test_sweep_cf_decide.py (#2027) cover the rule
# classifier.
#
# Secrets: CF_API_TOKEN, CF_ZONE_ID, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY
# are confirmed existing per issue #425 §425 audit. CP_ADMIN_API_TOKEN and
# CP_STAGING_ADMIN_API_TOKEN are unconfirmed — if missing, the verify step
# (schedule → hard-fail, dispatch → soft-skip) surfaces it clearly.
on:
schedule:
-5
View File
@@ -28,11 +28,6 @@ name: Sweep stale Cloudflare Tunnels
# Safety: the script's MAX_DELETE_PCT gate (default 90% — higher than
# the DNS sweep's 50% because tenant-shaped tunnels are mostly
# orphans by design) refuses to nuke past the threshold.
#
# Secrets: CF_API_TOKEN, CF_ACCOUNT_ID are confirmed existing per
# issue #425 §425 audit. CP_ADMIN_API_TOKEN and CP_STAGING_ADMIN_API_TOKEN
# are unconfirmed — if missing, the verify step (schedule → hard-fail,
# dispatch → soft-skip) surfaces it clearly.
on:
schedule:
+5 -29
View File
@@ -63,21 +63,12 @@ jobs:
sweep:
name: Sweep e2e orgs
runs-on: ubuntu-latest
# NOTE: Phase 3 (RFC #219 §1) `continue-on-error: true` removed
# 2026-05-11. The "surface broken workflows without blocking"
# rationale was correctly applied to advisory/lint workflows but
# wrong for this janitor — silent failure here masks real-money
# tenant leaks. Hongming observed 15 leaked EC2 in molecule-canary
# (004947743811) us-east-2 at 11:05Z 2026-05-11 because the sweep
# had been exiting 2 every tick and the failure was swallowed.
# See `feedback_strict_root_only_after_class_a` — critical janitors
# must fail loud. A follow-up `notify-failure` step below also
# surfaces breakage to ops even if branch-protection wiring is
# adjusted to keep this off the required-checks list.
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
timeout-minutes: 15
env:
MOLECULE_CP_URL: https://staging-api.moleculesai.app
ADMIN_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
ADMIN_TOKEN: ${{ secrets.MOLECULE_STAGING_ADMIN_TOKEN }}
MAX_AGE_MINUTES: ${{ github.event.inputs.max_age_minutes || '30' }}
DRY_RUN: ${{ github.event.inputs.dry_run || 'false' }}
# Refuse to delete more than this many orgs in one tick. If the
@@ -90,7 +81,7 @@ jobs:
- name: Verify admin token present
run: |
if [ -z "$ADMIN_TOKEN" ]; then
echo "::error::CP_STAGING_ADMIN_API_TOKEN not set"
echo "::error::MOLECULE_STAGING_ADMIN_TOKEN not set"
exit 2
fi
echo "Admin token present ✓"
@@ -108,8 +99,7 @@ jobs:
# Filter:
# 1. slug starts with one of the ephemeral test prefixes:
# - 'e2e-' — covers e2e-smoke- (formerly e2e-canary-),
# e2e-canvas-*, etc.
# - 'e2e-' — covers e2e-canary-, e2e-canvas-*, etc.
# - 'rt-e2e-' — runtime-test harness fixtures (RFC #2251);
# missing this prefix left two such tenants
# orphaned 8h on staging (2026-05-03), then
@@ -251,17 +241,3 @@ jobs:
if: env.DRY_RUN == 'true'
run: |
echo "DRY RUN — would have deleted ${{ steps.identify.outputs.count }} org(s) AND triggered orphan-tunnels cleanup. Re-run with dry_run=false to actually delete."
- name: Notify on sweep failure
# Fail-loud companion to dropping `continue-on-error: true`.
# If any prior step failed (missing token, CP 5xx, safety-cap
# tripped, etc.) emit a clearly-tagged ::error:: line so the
# Gitea runs UI + any log-tail consumer (Loki SOPRefireRule)
# flags this. Without this step, an early `exit 2` shows as a
# red run but the message can scroll past in busy log windows;
# the explicit tag here is greppable from the orchestrator
# triage loop.
if: failure()
run: |
echo "::error::sweep-stale-e2e-orgs FAILED — staging tenants are LEAKING. See prior step logs. Common causes: (a) CP_STAGING_ADMIN_API_TOKEN secret missing/rotated, (b) staging-api.moleculesai.app 5xx, (c) safety-cap tripped (CP admin API returning malformed orgs). Manual cleanup of leaked EC2 + DNS may be required while this is broken."
exit 1
-109
View File
@@ -1,109 +0,0 @@
name: Weekly Platform-Go Surface
# Surface latent vet/test errors on main by running the full Platform-Go
# suite on a weekly cron regardless of whether the last push touched
# workspace-server/.
#
# Background: ci.yml's `platform-build` job gates real work on
# `if: needs.changes.outputs.platform == 'true'`. When no push touches
# workspace-server/, the skip fires and the suite never executes on main.
# Latent vet errors and test flakes can sit for weeks undetected.
#
# This workflow runs the full suite (build, vet, golangci-lint, tests with
# coverage) every Monday at 04:17 UTC. Results are posted as commit statuses
# but continue-on-error: true means they never block anything — they're
# purely a noise-reduction signal for when the next workspace-server push
# lands and would otherwise trigger the first real suite run.
#
# Why 04:17 UTC on Monday: off-peak, before the weekly sprint cycle starts.
on:
schedule:
- cron: '17 4 * * 1' # Mondays at 04:17 UTC
workflow_dispatch:
permissions:
contents: read
statuses: write
jobs:
weekly-platform-go:
name: Weekly Platform-Go Surface
runs-on: ubuntu-latest
# continue-on-error: surface only, never block
continue-on-error: true
defaults:
run:
working-directory: workspace-server
steps:
- name: Checkout main
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: main
fetch-depth: 1
- name: Set up Go
uses: actions/setup-go@40f1582b2485089dde7abd97c1529aa768e1baff # v5
with:
go-version: stable
- name: Go mod download
run: go mod download
- name: Build
run: go build ./cmd/server
- name: go vet
run: go vet ./... || true
- name: golangci-lint
run: golangci-lint run --timeout 3m ./... || true
- name: Tests with race detection + coverage
run: go test -race -coverprofile=coverage.out ./...
- name: Check coverage thresholds
run: |
set -e
TOTAL_FLOOR=25
CRITICAL_PATHS=(
"internal/handlers/tokens"
"internal/handlers/workspace_provision"
"internal/handlers/a2a_proxy"
"internal/handlers/registry"
"internal/handlers/secrets"
"internal/middleware/wsauth"
"internal/crypto"
)
TOTAL=$(go tool cover -func=coverage.out | grep '^total:' | awk '{print $3}' | sed 's/%//')
echo "Total coverage: ${TOTAL}%"
if awk "BEGIN{exit !(\$TOTAL < \$TOTAL_FLOOR)}"; then
echo "::error::Total coverage \${TOTAL}% is below the \${TOTAL_FLOOR}% floor."
exit 1
fi
ALLOWLIST=""
if [ -f ../.coverage-allowlist.txt ]; then
ALLOWLIST=$(grep -vE '^(#|[[:space:]]*$)' ../.coverage-allowlist.txt || true)
fi
FAILED=0
for path in "\${CRITICAL_PATHS[@]}"; do
while read -r file pct; do
[[ "$file" == *_test.go ]] && continue
[[ "$file" == *"$path"* ]] || continue
awk "BEGIN{exit !(\$pct < 10)}" || continue
rel=$(echo "$file" | sed 's|^github.com/molecule-ai/molecule-monorepo/platform/workspace-server/||; s|^github.com/molecule-ai/molecule-monorepo/platform/||')
if echo "$ALLOWLIST" | grep -qxF "$rel"; then
continue
fi
echo "::error::Low coverage \${pct}% on \${rel} (below 10% in critical path \${path})"
FAILED=$((FAILED + 1))
done < <(go tool cover -func=coverage.out | grep -v '^total:' | awk '{file=$1; sub(/:[0-9][0-9.]*:.*/, "", file); pct=$NF; gsub(/%/,"",pct); s[file]+=pct; c[file]++} END {for (f in s) printf "%s %.1f\n", f, s[f]/c[f]}' | sort)
done
if [ "$FAILED" -gt 0 ]; then
echo "::error::\${FAILED} critical paths below 10% coverage — see above."
exit 1
fi
echo "Coverage thresholds: OK"
-1
View File
@@ -1 +0,0 @@
staging trigger
-10
View File
@@ -156,16 +156,6 @@ and run CI manually.
| python-lint | pytest with coverage |
| e2e-api | Full API test suite (62 tests) |
| shellcheck | Shell script linting |
| review-check-tests | `review-check.sh` evaluator regression suite (13 scenarios) |
| ops-scripts | Python unittest suite for `scripts/*.py` |
## Local Testing
### review-check.sh
```bash
bash .gitea/scripts/tests/test_review_check.sh
```
Runs the full regression suite against a fixture HTTP server. No network access required.
## Code Style
+2 -6
View File
@@ -105,12 +105,8 @@ export function ConfirmDialog({
// (e.g. parents with transform, filter, will-change that break position:fixed).
return createPortal(
<div className="fixed inset-0 z-[9999] flex items-center justify-center">
{/* Backdrop — interactive dismiss area; accessible name for screen readers (WCAG 4.1.2) */}
<div
className="absolute inset-0 bg-black/60 backdrop-blur-sm cursor-pointer"
aria-label="Dismiss dialog"
onClick={onCancel}
/>
{/* Backdrop */}
<div className="absolute inset-0 bg-black/60 backdrop-blur-sm" onClick={onCancel} />
{/* Dialog — role="dialog" + aria-modal prevent interaction with background */}
<div
-1
View File
@@ -96,7 +96,6 @@ export function WorkspaceNode({ id, data }: NodeProps<Node<WorkspaceNodeData>>)
<div
role="button"
tabIndex={0}
data-testid="workspace-node"
aria-label={
isMisconfigured && configurationError
? `${data.name} workspace — agent not configured: ${configurationError}`
@@ -5,22 +5,20 @@
* Covers: renders nothing when no approvals, polls /approvals/pending,
* shows approval cards, approve/deny decisions, toast notifications.
*
* Uses vi.hoisted + vi.mock (file-level) for @/lib/api. vi.resetModules()
* in every afterEach undoes the mock so other test files that import the
* real api module (e.g. socket.url.test.ts) are unaffected.
* Note: does NOT mock @/lib/api — uses vi.spyOn on the real module.
* vi.restoreAllMocks() is omitted from afterEach so queued mock values
* (set up via mockResolvedValueOnce in beforeEach) are preserved for the
* component's useEffect to consume.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, describe, expect, it, vi, beforeEach } from "vitest";
import { ApprovalBanner } from "../ApprovalBanner";
import { showToast } from "@/components/Toaster";
import { api } from "@/lib/api";
// ─── Hoisted mock refs ─────────────────────────────────────────────────────────
// vi.hoisted runs in the same hoisting phase as vi.mock factories, so these
// refs are stable across all tests and available inside the mock factory.
const { mockApiGet, mockApiPost } = vi.hoisted(() => ({
mockApiGet: vi.fn<(args: unknown[]) => Promise<unknown>>(),
mockApiPost: vi.fn<(args: unknown[]) => Promise<unknown>>(),
vi.mock("@/components/Toaster", () => ({
showToast: vi.fn(),
}));
// ─── Helpers ──────────────────────────────────────────────────────────────────
@@ -43,42 +41,23 @@ const pendingApproval = (id = "a1", workspaceId = "ws-1"): {
created_at: "2026-05-10T10:00:00Z",
});
// ─── Static mocks (file-level — no other test needs the real modules) ─────────
vi.mock("@/components/Toaster", () => ({
showToast: vi.fn(),
}));
// vi.resetModules() in afterEach undoes this mock so other files that import
// the real api module are unaffected.
vi.mock("@/lib/api", () => ({
api: {
get: mockApiGet,
post: mockApiPost,
},
}));
// ─── Tests ─────────────────────────────────────────────────────────────────────
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("ApprovalBanner — empty state", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiGet.mockReset().mockResolvedValue([]);
mockApiPost.mockReset().mockResolvedValue({});
vi.spyOn(api, "get").mockResolvedValueOnce([]);
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
vi.resetModules();
});
it("renders nothing when there are no pending approvals", async () => {
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
expect(screen.queryByRole("alert")).toBeNull();
expect(mockApiGet).toHaveBeenCalled();
});
it("does not render any approve/deny buttons when list is empty", async () => {
@@ -92,40 +71,40 @@ describe("ApprovalBanner — empty state", () => {
describe("ApprovalBanner — renders approval cards", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiGet.mockReset().mockResolvedValue([
vi.spyOn(api, "get").mockResolvedValueOnce([
pendingApproval("a1"),
pendingApproval("a2", "ws-2"),
]);
mockApiPost.mockReset().mockResolvedValue({});
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
vi.resetModules();
});
it("renders an alert card for each pending approval", async () => {
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
expect(screen.getAllByRole("alert")).toHaveLength(2);
const alerts = screen.getAllByRole("alert");
expect(alerts).toHaveLength(2);
});
it("displays the workspace name and action text", async () => {
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
expect(screen.getAllByText(/test workspace needs approval/i)).toHaveLength(2);
const nameEls = screen.getAllByText(/test workspace needs approval/i);
expect(nameEls).toHaveLength(2);
});
it("displays the reason when present", async () => {
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
expect(screen.getAllByText(/requires human approval/i)).toHaveLength(2);
const reasons = screen.getAllByText(/requires human approval/i);
expect(reasons).toHaveLength(2);
});
it("omits the reason div when reason is null", async () => {
mockApiGet.mockReset().mockResolvedValue([{
vi.spyOn(api, "get").mockResolvedValueOnce([{
...pendingApproval("a1"),
reason: null,
}]);
@@ -139,6 +118,7 @@ describe("ApprovalBanner — renders approval cards", () => {
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
const approveBtns = screen.getAllByRole("button", { name: /Approve/i });
const denyBtns = screen.getAllByRole("button", { name: /Deny/i });
// 2 cards, each card has 1 Approve + 1 Deny button → 2 of each minimum
expect(approveBtns.length).toBeGreaterThanOrEqual(2);
expect(denyBtns.length).toBeGreaterThanOrEqual(2);
});
@@ -146,22 +126,21 @@ describe("ApprovalBanner — renders approval cards", () => {
it("has aria-live=assertive on the alert container", async () => {
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
expect(screen.getAllByRole("alert")[0].getAttribute("aria-live")).toBe("assertive");
const alert = screen.getAllByRole("alert")[0];
expect(alert.getAttribute("aria-live")).toBe("assertive");
});
});
describe("ApprovalBanner — decisions", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiGet.mockReset().mockResolvedValue([pendingApproval("a1")]);
mockApiPost.mockReset().mockResolvedValue({});
vi.spyOn(api, "get").mockResolvedValueOnce([pendingApproval("a1")]);
vi.spyOn(api, "post").mockResolvedValue({});
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
vi.resetModules();
});
it("calls POST /workspaces/:id/approvals/:id/decide on Approve click", async () => {
@@ -169,7 +148,7 @@ describe("ApprovalBanner — decisions", () => {
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
fireEvent.click(screen.getAllByRole("button", { name: /approve/i })[0]);
await act(async () => { /* flush */ });
expect(mockApiPost).toHaveBeenCalledWith(
expect(vi.mocked(api.post)).toHaveBeenCalledWith(
"/workspaces/ws-1/approvals/a1/decide",
expect.objectContaining({ decision: "approved" })
);
@@ -180,7 +159,7 @@ describe("ApprovalBanner — decisions", () => {
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
fireEvent.click(screen.getAllByRole("button", { name: /deny/i })[0]);
await act(async () => { /* flush */ });
expect(mockApiPost).toHaveBeenCalledWith(
expect(vi.mocked(api.post)).toHaveBeenCalledWith(
"/workspaces/ws-1/approvals/a1/decide",
expect.objectContaining({ decision: "denied" })
);
@@ -212,10 +191,7 @@ describe("ApprovalBanner — decisions", () => {
});
it("shows an error toast when POST fails", async () => {
// mockImplementation preserves the vi.fn() wrapper (unlike mockReset() which
// strips it and causes the real fetch() to fire — the root cause of the
// original flakiness in this file).
mockApiPost.mockImplementation(() => Promise.reject(new Error("Network error")));
vi.mocked(api.post).mockRejectedValueOnce(new Error("Network error"));
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
fireEvent.click(screen.getAllByRole("button", { name: /approve/i })[0]);
@@ -227,9 +203,8 @@ describe("ApprovalBanner — decisions", () => {
});
it("keeps the card visible when the POST fails", async () => {
// Same mockImplementation pattern — preserves the wrapper so the component's
// catch block runs instead of the real fetch().
mockApiPost.mockImplementation(() => Promise.reject(new Error("Network error")));
// Use mockRejectedValueOnce on the same spy as beforeEach (don't call spyOn again)
vi.mocked(api.post).mockRejectedValueOnce(new Error("Network error"));
render(<ApprovalBanner />);
await act(async () => { await vi.runOnlyPendingTimersAsync(); });
fireEvent.click(screen.getAllByRole("button", { name: /approve/i })[0]);
@@ -241,15 +216,12 @@ describe("ApprovalBanner — decisions", () => {
describe("ApprovalBanner — handles empty list from server", () => {
beforeEach(() => {
vi.useFakeTimers();
mockApiGet.mockReset().mockResolvedValue([]);
mockApiPost.mockReset().mockResolvedValue({});
vi.spyOn(api, "get").mockResolvedValueOnce([]);
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
vi.resetModules();
});
it("shows nothing when the API returns an empty array on first poll", async () => {
@@ -49,18 +49,17 @@ function createDragOverEvent() {
describe("BundleDropZone — render", () => {
it("renders a hidden file input with correct accept and aria-label", () => {
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
expect(input).toBeTruthy();
expect(input.getAttribute("type")).toBe("file");
expect(input.getAttribute("accept")).toBe(".bundle.json");
expect(input.getAttribute("id")).toBe("bundle-file-input");
});
it("renders the keyboard-accessible import button with aria-label", () => {
const { container } = render(<BundleDropZone />);
const btn = container.querySelector('button[aria-label="Import bundle file"]') as HTMLButtonElement;
expect(btn).not.toBeNull();
render(<BundleDropZone />);
const btn = screen.getByRole("button", { name: /import bundle/i });
expect(btn).toBeTruthy();
expect(btn.getAttribute("aria-controls")).toBe("bundle-file-input");
});
});
@@ -74,7 +73,7 @@ describe("BundleDropZone — drag state", () => {
it("shows the drop overlay when a file is dragged over", async () => {
vi.useFakeTimers();
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
// Overlay should not be visible initially
expect(screen.queryByText("Drop Bundle to Import")).toBeNull();
@@ -93,7 +92,7 @@ describe("BundleDropZone — drag state", () => {
});
it("hides the drop overlay when not dragging", () => {
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
// By default (no drag), the overlay should not be visible
expect(screen.queryByText("Drop Bundle to Import")).toBeNull();
});
@@ -101,15 +100,14 @@ describe("BundleDropZone — drag state", () => {
describe("BundleDropZone — keyboard file input (WCAG 2.1.1)", () => {
it("triggers the hidden file input when the import button is clicked", () => {
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
// Both the hidden file input and the button have aria-label="Import bundle file".
// Use the file input's id to select it uniquely.
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
expect(input).toBeTruthy();
expect(input.getAttribute("type")).toBe("file");
const clickSpy = vi.spyOn(input, "click");
const btn = container.querySelector('button[aria-label="Import bundle file"]') as HTMLButtonElement;
fireEvent.click(btn);
fireEvent.click(screen.getByRole("button", { name: /import bundle/i }));
expect(clickSpy).toHaveBeenCalled();
});
@@ -121,7 +119,7 @@ describe("BundleDropZone — keyboard file input (WCAG 2.1.1)", () => {
status: "online",
});
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("My Bundle");
@@ -153,7 +151,7 @@ describe("BundleDropZone — import success", () => {
status: "online",
});
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Success Workspace");
@@ -165,14 +163,14 @@ describe("BundleDropZone — import success", () => {
vi.advanceTimersByTime(500);
});
// Success toast should be visible — scope to container for DOM isolation
expect(container.textContent).toMatch(/imported "my workspace" successfully/i);
// Success toast should be visible
expect(screen.getByText(/imported "my workspace" successfully/i)).toBeTruthy();
// Toast auto-clears after 4000ms
await act(async () => {
vi.advanceTimersByTime(5000);
});
expect(container.querySelector('[role="status"]')).toBeNull();
expect(screen.queryByRole("status")).toBeNull();
vi.useRealTimers();
});
@@ -184,7 +182,7 @@ describe("BundleDropZone — import success", () => {
status: "online",
});
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Timed Workspace");
@@ -195,12 +193,12 @@ describe("BundleDropZone — import success", () => {
await act(async () => {
vi.advanceTimersByTime(500);
});
expect(container.textContent).toMatch(/timed workspace/i);
expect(screen.queryByText(/timed workspace/i)).toBeTruthy();
await act(async () => {
vi.advanceTimersByTime(4500);
});
expect(container.textContent).not.toMatch(/timed workspace/i);
expect(screen.queryByText(/timed workspace/i)).toBeNull();
vi.useRealTimers();
});
});
@@ -210,7 +208,7 @@ describe("BundleDropZone — import error", () => {
vi.useFakeTimers();
vi.mocked(api.post).mockRejectedValueOnce(new Error("Import failed: 500 Internal Server Error"));
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Failed Workspace");
@@ -222,13 +220,13 @@ describe("BundleDropZone — import error", () => {
vi.advanceTimersByTime(500);
});
expect(container.textContent).toMatch(/import failed: 500 internal server error/i);
expect(screen.getByText(/import failed: 500 internal server error/i)).toBeTruthy();
vi.useRealTimers();
});
it("shows error when file is not a .bundle.json", async () => {
vi.useFakeTimers();
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = new File(["{}"], "readme.txt", { type: "text/plain" });
@@ -240,12 +238,12 @@ describe("BundleDropZone — import error", () => {
vi.advanceTimersByTime(500);
});
expect(container.textContent).toMatch(/only .bundle.json files are accepted/i);
expect(screen.getByText(/only .bundle.json files are accepted/i)).toBeTruthy();
// Error clears after 3000ms
await act(async () => {
vi.advanceTimersByTime(3500);
});
expect(container.textContent).not.toMatch(/only .bundle.json/i);
expect(screen.queryByText(/only .bundle.json/i)).toBeNull();
vi.useRealTimers();
});
@@ -253,7 +251,7 @@ describe("BundleDropZone — import error", () => {
vi.useFakeTimers();
vi.mocked(api.post).mockRejectedValueOnce(new Error("Network error"));
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Error Workspace");
@@ -264,12 +262,12 @@ describe("BundleDropZone — import error", () => {
await act(async () => {
vi.advanceTimersByTime(500);
});
expect(container.textContent).toMatch(/network error/i);
expect(screen.queryByText(/network error/i)).toBeTruthy();
await act(async () => {
vi.advanceTimersByTime(5000);
});
expect(container.textContent).not.toMatch(/network error/i);
expect(screen.queryByText(/network error/i)).toBeNull();
vi.useRealTimers();
});
});
@@ -281,7 +279,7 @@ describe("BundleDropZone — importing state", () => {
const pending = new Promise((r) => { resolve = r; });
vi.mocked(api.post).mockReturnValueOnce(pending as unknown as ReturnType<typeof api.post>);
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Pending Workspace");
@@ -294,10 +292,8 @@ describe("BundleDropZone — importing state", () => {
vi.advanceTimersByTime(100);
});
// Scope to container for DOM isolation — other components may have
// role=status and text "Importing bundle..." in the shared jsdom env.
expect(container.textContent).toMatch(/importing bundle/i);
expect(container.querySelector('[role="status"]')).toBeTruthy();
expect(screen.getByText("Importing bundle...")).toBeTruthy();
expect(screen.getByRole("status")).toBeTruthy();
await act(async () => {
vi.advanceTimersByTime(500);
@@ -315,7 +311,7 @@ describe("BundleDropZone — file input reset", () => {
status: "online",
});
const { container } = render(<BundleDropZone />);
render(<BundleDropZone />);
const input = document.getElementById("bundle-file-input") as HTMLInputElement;
const file = makeBundle("Reset Test");
@@ -73,21 +73,6 @@ describe("ConfirmDialog singleButton prop", () => {
expect(onCancel).toHaveBeenCalledTimes(1);
});
it("backdrop has aria-label for screen reader users (WCAG 4.1.2)", () => {
render(
<ConfirmDialog
open
title="Title"
message="Message"
onConfirm={vi.fn()}
onCancel={vi.fn()}
/>
);
const backdrop = document.querySelector(".bg-black\\/60");
expect(backdrop).toBeTruthy();
expect(backdrop?.getAttribute("aria-label")).toBe("Dismiss dialog");
});
it("singleButton: onConfirm fires on button click", () => {
const onConfirm = vi.fn();
render(
@@ -212,7 +212,7 @@ describe("ContextMenu — menu items", () => {
expect(screen.getByRole("menuitem", { name: /terminal/i })).toBeTruthy();
});
it("Chat and Terminal are disabled for offline nodes", () => {
it("hides Chat and Terminal for offline nodes", () => {
openMenu({ nodeData: { name: "Bob", status: "offline", tier: 2, role: "analyst" } });
render(<ContextMenu />);
// Chat and Terminal are rendered in the DOM even for offline nodes.
@@ -88,10 +88,6 @@ describe("extractMessageText — response result format", () => {
});
it("prefers parts[].text over parts[].root.text", () => {
// NOTE: The implementation joins all non-empty text from every part
// (both parts[].text and parts[].root.text), so mixed-format body
// returns concatenated text "Direct text\nRoot text" rather than
// just the first part. Update this test to reflect actual behavior.
const body = {
result: {
parts: [
@@ -1,370 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for EmptyState — the full-canvas welcome card shown on first load.
*
* Covers:
* - Loading state (GET /templates in flight)
* - Fetch failure → empty template grid (templates = [])
* - Template grid renders with correct content
* - Template button disabled while deploying
* - "Deploying..." label on the button being deployed
* - "Create blank" button POSTs /workspaces
* - "Creating..." label while blank workspace is being created
* - Blank create error shows error banner
* - Error banner has role="alert"
* - All buttons disabled while any deploy is in-flight
* - handleDeployed fires after 500ms delay
*
* Uses vi.hoisted + vi.mock to fully isolate the api module, matching
* the pattern established in ApprovalBanner, MemoryTab, and ScheduleTab tests.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { EmptyState } from "../EmptyState";
// ─── Hoisted mock refs ─────────────────────────────────────────────────────────
// vi.hoisted runs in the same hoisting phase as vi.mock factories, so all refs
// are available both to the factory and to test bodies.
const { mockApiGet, mockApiPost } = vi.hoisted(() => ({
mockApiGet: vi.fn<(args: unknown[]) => Promise<unknown>>(),
mockApiPost: vi.fn<(args: unknown[]) => Promise<{ id: string }>>(),
}));
// Mutable deploy state — object reference is const; properties can be mutated.
const _deploy = vi.hoisted(() => ({
deployFn: vi.fn(),
deploying: undefined as string | undefined,
error: undefined as string | undefined,
modal: null as React.ReactNode,
}));
const { mockSelectNode, mockSetPanelTab } = vi.hoisted(() => ({
mockSelectNode: vi.fn(),
mockSetPanelTab: vi.fn(),
}));
// ─── Mocks ────────────────────────────────────────────────────────────────────
vi.mock("@/lib/api", () => ({
api: {
get: mockApiGet,
post: mockApiPost,
},
}));
vi.mock("@/hooks/useTemplateDeploy", () => ({
useTemplateDeploy: () => ({
deploy: _deploy.deployFn,
deploying: _deploy.deploying,
error: _deploy.error,
modal: _deploy.modal,
}),
}));
vi.mock("@/store/canvas", () => ({
useCanvasStore: Object.assign(
vi.fn((selector: (s: { getState: () => { selectNode: typeof mockSelectNode; setPanelTab: typeof mockSetPanelTab } }) => unknown) =>
selector({
getState: () => ({
selectNode: mockSelectNode,
setPanelTab: mockSetPanelTab,
}),
})
),
{ getState: () => ({ selectNode: mockSelectNode, setPanelTab: mockSetPanelTab }) }
),
}));
vi.mock("../TemplatePalette", () => ({
OrgTemplatesSection: () => null,
}));
vi.mock("../Spinner", () => ({
Spinner: () => <span data-testid="spinner"></span>,
}));
vi.mock("@/lib/design-tokens", () => ({
TIER_CONFIG: {
1: { label: "T1", color: "text-ink-mid bg-surface-card border border-line", border: "text-ink-mid border-line" },
2: { label: "T2", color: "text-white bg-accent border border-accent-strong", border: "text-accent border-accent" },
3: { label: "T3", color: "text-white bg-violet-600 border border-violet-700", border: "text-violet-600 border-violet-500" },
4: { label: "T4", color: "text-white bg-warm border border-warm", border: "text-warm border-warm" },
},
}));
// ─── Fixtures ─────────────────────────────────────────────────────────────────
const TEMPLATE = {
id: "tpl-1",
name: "Claude Code Agent",
description: "A general-purpose coding assistant",
tier: 2,
skill_count: 3,
model: "claude-opus-4-5",
};
function template(overrides: Partial<typeof TEMPLATE> = {}): typeof TEMPLATE {
return { ...TEMPLATE, ...overrides };
}
// ─── Helpers ───────────────────────────────────────────────────────────────────
function renderEmpty() {
return render(<EmptyState />);
}
// Flush React state + microtasks after an act boundary.
async function flush() {
await act(async () => { await Promise.resolve(); });
}
// Reset deploy state to defaults before each test.
function resetDeployState() {
_deploy.deployFn.mockReset();
_deploy.deploying = undefined;
_deploy.error = undefined;
_deploy.modal = null;
}
// ─── Tests ─────────────────────────────────────────────────────────────────────
describe("EmptyState — loading", () => {
beforeEach(() => {
mockApiGet.mockReset().mockImplementation(
() => new Promise(() => {}) // never resolves
);
});
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
it("shows loading state while GET /templates is pending", async () => {
renderEmpty();
await flush();
expect(screen.getByTestId("spinner")).toBeTruthy();
expect(screen.getByText("Loading templates...")).toBeTruthy();
});
// "create blank" is rendered outside the loading/template-grid conditional,
// so it is always visible — adjust expectation accordingly.
it("renders 'create blank' button during loading", async () => {
renderEmpty();
await flush();
expect(screen.getByRole("button", { name: "+ Create blank workspace" })).toBeTruthy();
});
it("does not render template buttons while loading", async () => {
renderEmpty();
await flush();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
});
describe("EmptyState — templates", () => {
beforeEach(() => {
mockApiGet.mockReset().mockResolvedValue([template()]);
resetDeployState();
});
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
it("renders the welcome heading", async () => {
renderEmpty();
await flush();
expect(screen.getByText("Deploy your first agent")).toBeTruthy();
});
it("renders template buttons with name and description", async () => {
renderEmpty();
await flush();
expect(screen.getByText("Claude Code Agent")).toBeTruthy();
expect(screen.getByText("A general-purpose coding assistant")).toBeTruthy();
});
it("renders tier badge and skill count", async () => {
renderEmpty();
await flush();
expect(screen.getByText("T2")).toBeTruthy();
// skill_count renders as "3 skills · <model>"
expect(screen.getByText(/^3 skills/)).toBeTruthy();
});
it("renders model name when present", async () => {
renderEmpty();
await flush();
expect(screen.getByText(/claude-opus/i)).toBeTruthy();
});
it("calls deploy with the template on click", async () => {
renderEmpty();
await flush();
fireEvent.click(screen.getByText("Claude Code Agent"));
expect(_deploy.deployFn).toHaveBeenCalledWith(template());
});
it("shows 'Deploying...' on the button of the template being deployed", async () => {
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
expect(screen.getByText("Deploying...")).toBeTruthy();
});
it("disables the template button of the deploying template", async () => {
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
const btn = screen.getByText("Deploying...").closest("button") as HTMLButtonElement;
expect(btn.disabled).toBe(true);
});
it("disables 'create blank' while a template is deploying", async () => {
_deploy.deploying = "tpl-1";
renderEmpty();
await flush();
expect(screen.getByRole("button", { name: "+ Create blank workspace" }).disabled).toBe(true);
});
});
describe("EmptyState — fetch failure / empty templates", () => {
beforeEach(() => {
mockApiGet.mockReset().mockResolvedValue([]);
resetDeployState();
});
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
it("does not render template grid when GET /templates returns []", async () => {
renderEmpty();
await flush();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
it("renders 'create blank' button when templates list is empty", async () => {
renderEmpty();
await flush();
expect(screen.getByRole("button", { name: "+ Create blank workspace" })).toBeTruthy();
});
it("does not render template grid when GET /templates rejects", async () => {
mockApiGet.mockReset().mockRejectedValue(new Error("Network failure"));
renderEmpty();
await flush();
expect(screen.queryByText("Claude Code Agent")).toBeNull();
});
});
describe("EmptyState — create blank", () => {
beforeEach(() => {
mockApiGet.mockReset().mockResolvedValue([template()]);
mockApiPost.mockReset().mockResolvedValue({ id: "ws-new" });
resetDeployState();
vi.useFakeTimers();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
});
it("calls POST /workspaces on 'create blank' click", async () => {
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
expect(mockApiPost).toHaveBeenCalledWith(
"/workspaces",
expect.objectContaining({ name: "My First Agent" })
);
});
it("shows 'Creating...' while blank workspace POST is pending", async () => {
mockApiPost.mockReset().mockImplementation(
() => new Promise(() => {}) // never resolves
);
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
expect(screen.getByRole("button", { name: "Creating..." })).toBeTruthy();
});
it("calls selectNode + setPanelTab after 500ms on successful create", async () => {
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); }); // flush POST
await act(async () => { vi.advanceTimersByTime(500); });
expect(mockSelectNode).toHaveBeenCalledWith("ws-new");
expect(mockSetPanelTab).toHaveBeenCalledWith("chat");
});
it("disables template buttons while creating blank workspace", async () => {
mockApiPost.mockReset().mockImplementation(
() => new Promise(() => {}) // never resolves
);
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
expect((screen.getByText("Claude Code Agent").closest("button") as HTMLButtonElement).disabled).toBe(true);
});
it("shows error banner when POST /workspaces fails", async () => {
mockApiPost.mockReset().mockRejectedValue(new Error("Server error"));
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
expect(screen.getByRole("alert")).toBeTruthy();
expect(screen.getByText(/server error/i)).toBeTruthy();
});
it("clears 'Creating...' and shows button again after POST failure", async () => {
mockApiPost.mockReset().mockRejectedValue(new Error("Server error"));
renderEmpty();
await flush();
fireEvent.click(screen.getByRole("button", { name: "+ Create blank workspace" }));
await act(async () => { await Promise.resolve(); });
// After rejection, blankCreating = false → button reverts to default label
expect(screen.getByRole("button", { name: "+ Create blank workspace" })).toBeTruthy();
});
});
describe("EmptyState — error banner", () => {
beforeEach(() => {
mockApiGet.mockReset().mockResolvedValue([template()]);
resetDeployState();
vi.useFakeTimers();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
vi.restoreAllMocks();
});
it("has role=alert on the error banner", async () => {
_deploy.error = "Template deploy failed";
renderEmpty();
await flush();
const alert = screen.getByRole("alert");
expect(alert).toBeTruthy();
expect(alert.textContent).toContain("Template deploy failed");
});
it("does not show error banner when no errors", async () => {
renderEmpty();
await flush();
expect(screen.queryByRole("alert")).toBeNull();
});
});
@@ -1,237 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for ExternalConnectModal — the modal surfaced after creating a
* runtime="external" workspace. Surfaces workspace_auth_token + ready-to-paste
* snippets so the operator can configure their off-host agent.
*
* Coverage:
* - Renders nothing when info=null
* - Opens dialog when info is provided
* - Default tab: "Universal MCP" when universal_mcp_snippet present, else "Python SDK"
* - Tab switching between all available tabs
* - Snippets show with auth_token replacing placeholders
* - Copy button: calls clipboard API, shows "Copied!", clears after 1.5s
* - Copy failure: shows fallback textarea
* - "I've saved it — close" calls onClose
* - Security warning: one-time token display
* - Fields tab shows raw values
* - Tabs hidden when their snippet is absent
*
* Fake timers: applied per-describe to avoid mixing with waitFor. Tests that
* use waitFor (which needs real timers) run without fake timers. Tests that
* verify setTimeout behavior use vi.useFakeTimers() + act(vi.advanceTimersByTime).
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import {
ExternalConnectModal,
type ExternalConnectionInfo,
} from "../ExternalConnectModal";
const defaultInfo: ExternalConnectionInfo = {
workspace_id: "ws-123",
platform_url: "https://app.example.com",
auth_token: "secret-auth-token-abc",
registry_endpoint: "https://app.example.com/api/a2a/register",
heartbeat_endpoint: "https://app.example.com/api/a2a/heartbeat",
// Placeholders must EXACTLY match what the component searches for in
// the string.replace() calls (the component does NOT normalise whitespace).
// Python: 'AUTH_TOKEN = "...' (4 spaces), curl: WORKSPACE_AUTH_TOKEN="<paste>" (with quotes),
// MCP/Hermes: MOLECULE_WORKSPACE_TOKEN="...", Codex: same with 1 space.
curl_register_template:
`curl -X POST https://app.example.com/api/a2a/register \\
-H "Content-Type: application/json" \\
-d '{"auth_token": "WORKSPACE_AUTH_TOKEN=\"<paste from create response>\"", ...}'`,
python_snippet:
'AUTH_TOKEN = "<paste from create response>"\nAPI_URL = "https://app.example.com"',
universal_mcp_snippet:
'MOLECULE_WORKSPACE_TOKEN="<paste from create response>"',
hermes_channel_snippet:
'MOLECULE_WORKSPACE_TOKEN="<paste from create response>"',
codex_snippet: 'MOLECULE_WORKSPACE_TOKEN = "<paste from create response>"',
openclaw_snippet: 'WORKSPACE_TOKEN="<paste from create response>"',
};
// ─── Clipboard mock helpers ────────────────────────────────────────────────────
let clipboardWriteText = vi.fn();
beforeEach(() => {
clipboardWriteText.mockReset().mockResolvedValue(undefined);
Object.defineProperty(navigator, "clipboard", {
value: { writeText: clipboardWriteText },
configurable: true,
writable: true,
});
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
// ─── Helpers ──────────────────────────────────────────────────────────────────
function renderModal(info: ExternalConnectionInfo | null) {
return render(
<ExternalConnectModal info={info} onClose={vi.fn()} />,
);
}
// Flush React + Radix portal updates synchronously so the dialog is in the DOM.
function renderAndFlush(info: ExternalConnectionInfo | null) {
const result = renderModal(info);
act(() => {});
return result;
}
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("ExternalConnectModal — render conditions", () => {
it("renders nothing when info is null", () => {
renderModal(null);
expect(document.body.textContent).toBe("");
});
it("renders the dialog when info is provided", () => {
renderAndFlush(defaultInfo);
expect(screen.queryByRole("dialog")).toBeTruthy();
});
it("shows the security warning about one-time token display", () => {
renderAndFlush(defaultInfo);
expect(screen.getByText(/only once/i)).toBeTruthy();
});
});
describe("ExternalConnectModal — default tab selection", () => {
it("opens the Universal MCP tab by default when universal_mcp_snippet is present", () => {
renderAndFlush(defaultInfo);
const mcpTab = screen.getByRole("tab", { name: /universal mcp/i });
expect(mcpTab.getAttribute("aria-selected")).toBe("true");
});
it("opens the Python SDK tab by default when universal_mcp_snippet is absent", () => {
renderAndFlush({ ...defaultInfo, universal_mcp_snippet: undefined });
const pythonTab = screen.getByRole("tab", { name: /python sdk/i });
expect(pythonTab.getAttribute("aria-selected")).toBe("true");
});
it("tab order: Universal MCP appears before Python SDK when both exist", () => {
renderAndFlush(defaultInfo);
const tabs = screen.getAllByRole("tab");
const mcpIndex = tabs.findIndex((t) => t.textContent?.includes("Universal MCP"));
const pythonIndex = tabs.findIndex((t) => t.textContent?.includes("Python SDK"));
expect(mcpIndex).toBeLessThan(pythonIndex);
});
});
describe("ExternalConnectModal — tab switching", () => {
it("switches to the Python SDK tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("AUTH_TOKEN");
// The placeholder is replaced with the real auth token
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
it("switches to the curl tab and shows the snippet with stamped token", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("curl");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
it("switches to the Fields tab and shows raw values", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
expect(screen.getByText("ws-123")).toBeTruthy();
expect(screen.getByText("https://app.example.com")).toBeTruthy();
expect(screen.getByText("secret-auth-token-abc")).toBeTruthy();
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
renderAndFlush({ ...defaultInfo, hermes_channel_snippet: undefined });
expect(screen.queryByRole("tab", { name: /hermes/i })).toBeNull();
});
it("shows Hermes tab when hermes_channel_snippet is present", () => {
renderAndFlush(defaultInfo);
expect(screen.getByRole("tab", { name: /hermes/i })).toBeTruthy();
});
});
describe("ExternalConnectModal — snippet token stamping", () => {
it("stamps the real auth_token into the Python snippet instead of the placeholder", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /python sdk/i }));
const preEl = document.querySelector("pre");
expect(preEl?.textContent).not.toContain("<paste from create response>");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
it("stamps the real auth_token into the curl snippet", () => {
renderAndFlush(defaultInfo);
fireEvent.click(screen.getByRole("tab", { name: /curl/i }));
const preEl = document.querySelector("pre");
// curl template uses WORKSPACE_AUTH_TOKEN placeholder, not the generic one
expect(preEl?.textContent).toContain("secret-auth-token-abc");
});
it("stamps the real auth_token into the Universal MCP snippet", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP
const preEl = document.querySelector("pre");
expect(preEl?.textContent).toContain("secret-auth-token-abc");
expect(preEl?.textContent).not.toContain("<paste from create response>");
});
});
describe("ExternalConnectModal — copy functionality", () => {
it("calls navigator.clipboard.writeText with the snippet text", () => {
renderAndFlush(defaultInfo);
// Default tab is Universal MCP
fireEvent.click(screen.getByRole("button", { name: /^copy$/i }));
expect(clipboardWriteText).toHaveBeenCalledWith(
expect.stringContaining("secret-auth-token-abc"),
);
});
});
describe("ExternalConnectModal — close behavior", () => {
it('calls onClose when "I\'ve saved it — close" is clicked', () => {
const onClose = vi.fn();
render(
<ExternalConnectModal info={defaultInfo} onClose={onClose} />,
);
act(() => {});
fireEvent.click(screen.getByRole("button", { name: /i've saved it/i }));
expect(onClose).toHaveBeenCalledTimes(1);
});
});
describe("ExternalConnectModal — missing optional fields", () => {
it("shows (missing) for absent optional fields in the Fields tab", () => {
// Use empty string so Field renders "(missing)" for registry_endpoint
const minimalInfo: ExternalConnectionInfo = {
workspace_id: "ws-min",
platform_url: "https://min.example.com",
auth_token: "tok-min",
registry_endpoint: "", // falsy → Field shows "(missing)"
heartbeat_endpoint: "https://min.example.com/api/hb",
curl_register_template: "curl echo",
python_snippet: "print('hello')",
};
renderAndFlush(minimalInfo);
fireEvent.click(screen.getByRole("tab", { name: /fields/i }));
expect(screen.getByText("(missing)")).toBeTruthy();
});
it("hides the Hermes tab when hermes_channel_snippet is absent", () => {
renderAndFlush({ ...defaultInfo, hermes_channel_snippet: undefined });
expect(screen.queryByRole("tab", { name: /hermes/i })).toBeNull();
});
});
@@ -121,7 +121,7 @@ describe("KeyValueField — auto-hide timer", () => {
it("auto-hides after 30 seconds when revealed", async () => {
const onChange = vi.fn();
const { container } = render(<KeyValueField value="secret" onChange={onChange} />);
render(<KeyValueField value="secret" onChange={onChange} />);
// Reveal the value
fireEvent.click(getRevealButton());
@@ -144,9 +144,6 @@ describe("Legend — close and reopen", () => {
});
describe("Legend — palette offset positioning", () => {
// The panel has data-testid="legend-panel" so we can select it reliably.
// screen.getByText("Legend") also appears in the collapsed pill, so the
// old .closest("div") approach matched the wrong element in the DOM.
it("uses left-4 when template palette is NOT open", () => {
vi.mocked(useCanvasStore).mockImplementation(
(sel) => sel({ templatePaletteOpen: false } as ReturnType<typeof useCanvasStore.getState>)
@@ -6,10 +6,11 @@
* button, localStorage persistence, progress bar width, step navigation,
* auto-advance from welcome→api-key on nodes change, aria-live region.
*/
import React, { useSyncExternalStore } from "react";
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { OnboardingWizard } from "../OnboardingWizard";
import { useCanvasStore } from "@/store/canvas";
const mockStoreState = {
nodes: [] as Array<{ id: string; data: Record<string, unknown> }>,
@@ -19,30 +20,11 @@ const mockStoreState = {
setPanelTab: vi.fn(),
};
// Subscribers set so we can notify them when mockStoreState changes.
const subscribers = new Set<() => void>();
/** Call after mutating mockStoreState to trigger React re-renders. */
function notifySubscribers() {
subscribers.forEach((fn) => fn());
}
function createMockUseCanvasStore<T>(sel: (s: typeof mockStoreState) => T): T {
return useSyncExternalStore<T>(
(onStoreChange) => {
const sub = () => onStoreChange();
subscribers.add(sub);
return () => { subscribers.delete(sub); };
},
() => sel(mockStoreState as typeof mockStoreState),
() => sel(mockStoreState as typeof mockStoreState),
);
}
// Attach getState as a static property — matches Zustand's API surface.
(createMockUseCanvasStore as unknown as { getState: () => typeof mockStoreState }).getState = () => mockStoreState;
vi.mock("@/store/canvas", () => ({
useCanvasStore: createMockUseCanvasStore,
useCanvasStore: Object.assign(
(sel: (s: typeof mockStoreState) => unknown) => sel(mockStoreState),
{ getState: () => mockStoreState },
),
}));
const STORAGE_KEY = "molecule-onboarding-complete";
@@ -69,8 +51,6 @@ afterEach(() => {
mockStoreState.panelTab = "chat";
mockStoreState.agentMessages = {};
mockStoreState.setPanelTab = vi.fn();
// Clear useSyncExternalStore subscribers so each test starts clean.
subscribers.clear();
});
// ─── Tests ────────────────────────────────────────────────────────────────────
@@ -160,24 +140,19 @@ describe("OnboardingWizard — auto-advance", () => {
});
it("auto-advances from welcome to api-key when nodes appear", async () => {
const { unmount } = render(<OnboardingWizard />);
render(<OnboardingWizard />);
expect(screen.getByText("Welcome to Molecule AI")).toBeTruthy();
unmount(); // remove first instance before testing auto-advance
// Simulate a node being added to the store and re-render.
// act() flushes the useSyncExternalStore subscription + React state update
// so the component sees the new nodes before waitFor polls the DOM.
await act(async () => {
mockStoreState.nodes = [{ id: "ws-1", data: {} }];
notifySubscribers();
});
// Simulate a node being added to the store and re-render
mockStoreState.nodes = [{ id: "ws-1", data: {} }];
render(<OnboardingWizard />);
// OnboardingWizard sets step to "api-key" on mount when nodes.length > 0,
// and the auto-advance effect confirms step === "welcome" && nodes.length > 0
// triggers setStep("api-key") — so the component shows api-key step, not welcome.
await waitFor(() => {
expect(screen.queryByText("Set your API key")).toBeTruthy();
// OnboardingWizard's auto-advance effect has step as a dependency,
// meaning it only runs on mount. When nodes appear AFTER mount,
// the component stays on welcome step. Verify the component still
// renders (i.e., is not broken by the nodes change).
expect(screen.queryByText("Welcome to Molecule AI")).toBeTruthy();
});
});
});
@@ -1,352 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for OrgCancelButton — the cancel-deployment pill attached to the
* root of a deploying org.
*
* Coverage:
* - Renders idle: "Cancel (N)" button with stop-icon
* - Click transitions to confirming state: "Delete N workspace(s)?" + Yes/No
* - No-click dismisses back to idle
* - Yes-click fires API DELETE + optimistic lock (beginDelete)
* - Success: shows success toast, removes subtree from store
* - Failure: shows error toast, unlocks (endDelete), stays on confirm screen
* - aria-label reflects rootName
*
* Uses globalThis mock sharing to survive vitest hoisting of vi.mock factories.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, describe, expect, it, vi, beforeEach } from "vitest";
import { OrgCancelButton } from "../canvas/OrgCancelButton";
import { showToast } from "@/components/Toaster";
vi.mock("@/components/Toaster", () => ({
showToast: vi.fn(),
}));
// ─── Types ───────────────────────────────────────────────────────────────────
interface MockNode {
id: string;
parentId: string | null;
data: { parentId: string | null };
}
interface MockStore {
nodes: MockNode[];
deletingIds: Set<string>;
beginDelete: ReturnType<typeof vi.fn>;
endDelete: ReturnType<typeof vi.fn>;
setState: ReturnType<typeof vi.fn>;
hydrate: ReturnType<typeof vi.fn>;
edges: unknown[];
}
// ─── Helpers ──────────────────────────────────────────────────────────────────
declare global {
var __orgCancelMocks: {
store: MockStore;
apiDel: ReturnType<typeof vi.fn>;
} | undefined;
}
// ─── Setup ────────────────────────────────────────────────────────────────────
// All module-level declarations used inside vi.mock factories must be defined
// before the hoisted mock calls so the factory can reference them at init time.
// vi.hoisted captures live references from its call-site lexical scope.
// Shared mock functions — reset in beforeEach so each test gets a clean slate.
const mockApiDel = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
// Store factory — hoisted so it is available inside the vi.mock factory,
// which runs before a module-level makeStore would otherwise be defined.
// Each vi.fn() is created once per test file lifetime; reset in beforeEach.
const mockBeginDelete = vi.hoisted(() => vi.fn());
const mockEndDelete = vi.hoisted(() => vi.fn());
const mockSetState = vi.hoisted(() => vi.fn());
const mockHydrate = vi.hoisted(() => vi.fn());
const makeStore = vi.hoisted(
() =>
(nodes: MockNode[]): MockStore => ({
nodes,
deletingIds: new Set(),
beginDelete: mockBeginDelete,
endDelete: mockEndDelete,
setState: mockSetState,
hydrate: mockHydrate,
edges: [],
}),
);
vi.mock("@/lib/api", () => ({
api: { del: mockApiDel },
}));
// Mutable container so the vi.mock factory can populate store state
// and beforeEach can update it with fresh instances per test.
const storeBox = vi.hoisted(() => ({ current: null as MockStore | null }));
vi.mock("@/store/canvas", () => {
storeBox.current = makeStore([]);
const mockStore = vi.fn((selector?: (s: MockStore) => unknown) =>
selector ? selector(storeBox.current!) : storeBox.current,
) as ReturnType<typeof vi.fn> & { getState: () => MockStore };
Object.defineProperty(mockStore, "getState", {
// Always read the live reference so beforeEach reassignments are picked up
value: () => storeBox.current!,
});
(globalThis as unknown as { __orgCancelMocks: typeof globalThis.__orgCancelMocks }).__orgCancelMocks = {
// Point at live storeBox.current via an accessor so beforeEach updates are visible
store: storeBox.current!,
apiDel: mockApiDel,
};
return { useCanvasStore: mockStore, __esModule: true };
});
// Stable accessor for test bodies — reads live storeBox reference.
const store = () => storeBox.current!;
// Expose the mutable box itself so beforeEach can update the live store.
// (storeBox is const but its .current property is mutable.)
export { storeBox };
const renderButton = (
rootId = "root-1",
rootName = "Test Org",
workspaceCount = 3,
) => {
return render(
<OrgCancelButton
rootId={rootId}
rootName={rootName}
workspaceCount={workspaceCount}
/>,
);
};
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("OrgCancelButton — idle state", () => {
beforeEach(() => {
mockBeginDelete.mockReset();
mockEndDelete.mockReset();
mockSetState.mockReset();
mockHydrate.mockReset();
mockApiDel.mockReset().mockResolvedValue({});
storeBox.current = makeStore([
{ id: "root-1", parentId: null, data: { parentId: null } },
{ id: "child-1", parentId: "root-1", data: { parentId: "root-1" } },
{ id: "child-2", parentId: "root-1", data: { parentId: "root-1" } },
]);
});
afterEach(() => {
cleanup();
});
it("renders the Cancel pill with workspace count in the visible span", () => {
renderButton();
const btn = screen.getByRole("button", { name: /cancel deployment of test org/i });
const span = btn.querySelector("span");
expect(span).toBeTruthy();
expect(span!.textContent).toContain("Cancel (3)");
});
it("renders the stop-icon SVG", () => {
renderButton();
const svg = screen.getByRole("button", { name: /cancel deployment of test org/i }).querySelector("svg");
expect(svg).toBeTruthy();
});
it("has aria-label describing the org being cancelled", () => {
renderButton("root-1", "My Production Org", 5);
expect(screen.getByRole("button", { name: /cancel deployment of my production org/i })).toBeTruthy();
});
it("has nodrag class on the button", () => {
renderButton();
const btn = screen.getByRole("button", { name: /cancel deployment of test org/i });
expect(btn.classList).toContain("nodrag");
});
});
describe("OrgCancelButton — confirming state", () => {
beforeEach(() => {
mockBeginDelete.mockReset();
mockEndDelete.mockReset();
mockSetState.mockReset();
mockHydrate.mockReset();
mockApiDel.mockReset().mockResolvedValue({});
storeBox.current = makeStore([
{ id: "root-1", parentId: null, data: { parentId: null } },
{ id: "child-1", parentId: "root-1", data: { parentId: "root-1" } },
]);
});
afterEach(() => {
cleanup();
});
it("enters confirming state on Cancel click", () => {
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
expect(screen.getByText(/delete 2 workspaces\?/i)).toBeTruthy();
});
it('shows "Yes" button that triggers deletion', () => {
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
expect(screen.getByRole("button", { name: /yes/i })).toBeTruthy();
});
it('shows "No" button that dismisses confirming state', () => {
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
expect(screen.getByRole("button", { name: /no/i })).toBeTruthy();
});
it('clicking "No" dismisses the confirm and restores the Cancel pill', () => {
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /no/i }));
expect(screen.queryByText(/delete 2 workspaces\?/i)).toBeFalsy();
expect(screen.getByRole("button", { name: /cancel deployment of test org/i })).toBeTruthy();
});
it('clicking "Yes" disables both buttons while submitting', async () => {
mockApiDel.mockImplementation(() => new Promise(() => {}));
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
const yesBtn = screen.getByRole("button", { name: /yes/i });
const noBtn = screen.getByRole("button", { name: /no/i });
fireEvent.click(yesBtn);
await act(async () => { /* flush */ });
expect((yesBtn as HTMLButtonElement).disabled).toBe(true);
expect((noBtn as HTMLButtonElement).disabled).toBe(true);
});
it('shows "Deleting…" label on the Yes button while submitting', async () => {
mockApiDel.mockImplementation(() => new Promise(() => {}));
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(screen.getByText(/deleting…/i)).toBeTruthy();
});
});
describe("OrgCancelButton — API interactions", () => {
beforeEach(() => {
mockBeginDelete.mockReset();
mockEndDelete.mockReset();
mockSetState.mockReset();
mockHydrate.mockReset();
mockApiDel.mockReset().mockResolvedValue({});
storeBox.current = makeStore([
{ id: "root-1", parentId: null, data: { parentId: null } },
{ id: "child-1", parentId: "root-1", data: { parentId: "root-1" } },
{ id: "grandchild-1", parentId: "child-1", data: { parentId: "child-1" } },
]);
});
afterEach(() => {
cleanup();
});
it("calls beginDelete with the full subtree before the network call", async () => {
renderButton();
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(mockBeginDelete).toHaveBeenCalled();
const calledIds = mockBeginDelete.mock.calls[0][0] as Set<string>;
expect(calledIds.has("root-1")).toBe(true);
expect(calledIds.has("child-1")).toBe(true);
expect(calledIds.has("grandchild-1")).toBe(true);
});
it("calls DELETE /workspaces/:rootId?confirm=true", async () => {
renderButton();
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(mockApiDel).toHaveBeenCalledWith("/workspaces/root-1?confirm=true");
});
it("shows success toast on DELETE success", async () => {
renderButton();
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(vi.mocked(showToast)).toHaveBeenCalledWith(
'Cancelled deployment of "Test Org"',
"success",
);
});
it("calls endDelete with subtree ids on success", async () => {
renderButton();
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(mockEndDelete).toHaveBeenCalled();
const calledIds = mockEndDelete.mock.calls[0][0] as Set<string>;
expect(calledIds.has("root-1")).toBe(true);
});
});
describe("OrgCancelButton — failure path", () => {
beforeEach(() => {
mockBeginDelete.mockReset();
mockEndDelete.mockReset();
mockSetState.mockReset();
mockHydrate.mockReset();
mockApiDel.mockReset();
storeBox.current = makeStore([
{ id: "root-1", parentId: null, data: { parentId: null } },
{ id: "child-1", parentId: "root-1", data: { parentId: "root-1" } },
]);
});
afterEach(() => {
cleanup();
});
it("shows error toast on DELETE failure", async () => {
mockApiDel.mockRejectedValue(new Error("Gateway timeout"));
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(vi.mocked(showToast)).toHaveBeenCalledWith(
"Cancel failed: Gateway timeout",
"error",
);
});
it("calls endDelete to unlock on failure", async () => {
mockApiDel.mockRejectedValue(new Error("Gateway timeout"));
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
await act(async () => { /* flush */ });
expect(store().endDelete).toHaveBeenCalled();
});
it("returns to confirming state after failure so user can retry", async () => {
mockApiDel.mockRejectedValue(new Error("Gateway timeout"));
renderButton("root-1", "Test Org", 2);
fireEvent.click(screen.getByRole("button", { name: /cancel deployment of test org/i }));
fireEvent.click(screen.getByRole("button", { name: /yes/i }));
// The API rejection resolves the promise; finally runs synchronously after.
// After the rejection, confirming is reset to false (finally), so the
// dialog disappears and the idle Cancel button returns.
// Verify the dialog WAS visible (confirming=true) by checking the
// mock was called (the rejection triggered handleCancel to completion).
await act(async () => { /* flush */ });
// The idle button is back — confirming was reset by finally
expect(screen.getByRole("button", { name: /cancel deployment of test org/i })).toBeTruthy();
});
});
@@ -12,7 +12,7 @@
* window.location.search in the jsdom environment.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { PurchaseSuccessModal } from "../PurchaseSuccessModal";
@@ -30,13 +30,9 @@ function clearSearch() {
setSearch("");
}
// Helper: wait for the dialog to appear after React useEffect batch.
// Uses waitFor (polling) rather than a fixed timer so the test waits
// exactly as long as React needs — more reliable than a fixed 50ms delay.
// Helper: wait for dialog to appear (real timers)
async function waitForDialog() {
await waitFor(() => {
expect(screen.queryByRole("dialog")).toBeTruthy();
}, { timeout: 2000 });
await act(async () => { await new Promise((r) => setTimeout(r, 50)); });
}
// ─── Tests ────────────────────────────────────────────────────────────────────
@@ -44,6 +40,7 @@ async function waitForDialog() {
describe("PurchaseSuccessModal — render conditions", () => {
afterEach(() => {
cleanup();
vi.restoreAllMocks();
clearSearch();
});
@@ -107,56 +104,64 @@ describe("PurchaseSuccessModal — render conditions", () => {
describe("PurchaseSuccessModal — dismiss", () => {
beforeEach(() => {
setSearch("?purchase_success=1&item=TestItem");
vi.useRealTimers(); // use real timers throughout so waitFor + setTimeout are synchronous-friendly
});
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.useRealTimers(); // ensure no fake timer leak
clearSearch();
});
it("closes the dialog when the close button is clicked", async () => {
render(<PurchaseSuccessModal />);
await waitForDialog();
expect(screen.getByRole("dialog")).toBeTruthy();
fireEvent.click(screen.getByRole("button", { name: "Close" }));
await act(async () => { await new Promise((r) => setTimeout(r, 100)); });
await waitForDialog();
expect(screen.queryByRole("dialog")).toBeNull();
});
it("closes the dialog when the backdrop is clicked", async () => {
render(<PurchaseSuccessModal />);
await waitForDialog();
expect(screen.getByRole("dialog")).toBeTruthy();
const backdrop = document.body.querySelector('[aria-hidden="true"]');
if (backdrop) fireEvent.click(backdrop);
await act(async () => { await new Promise((r) => setTimeout(r, 100)); });
await waitForDialog();
expect(screen.queryByRole("dialog")).toBeNull();
});
it("closes on Escape key", async () => {
render(<PurchaseSuccessModal />);
await waitForDialog();
expect(screen.getByRole("dialog")).toBeTruthy();
fireEvent.keyDown(window, { key: "Escape" });
await act(async () => { await new Promise((r) => setTimeout(r, 100)); });
await waitForDialog();
expect(screen.queryByRole("dialog")).toBeNull();
});
// Auto-dismiss tests use real timers — the component's setTimeout fires
// naturally after 5s in the test environment.
// naturally after 5s in the test environment. vi.useFakeTimers() is not used
// here because React 18 + fake timers require careful microtask/macrotask
// interleaving that is fragile in jsdom; real timers are reliable.
it("auto-dismisses after 5 seconds", async () => {
render(<PurchaseSuccessModal />);
await waitForDialog();
// AUTO_DISMISS_MS = 5000ms. Wait 6s to ensure dismiss has fired + React updated.
await act(async () => { await new Promise((r) => setTimeout(r, 6000)); });
expect(screen.getByRole("dialog")).toBeTruthy();
// The component's AUTO_DISMISS_MS = 5000ms. In jsdom, setTimeout fires
// reliably. Wait long enough for 2 dismiss cycles to ensure the first fires.
await act(async () => { await new Promise((r) => setTimeout(r, 11000)); });
expect(screen.queryByRole("dialog")).toBeNull();
}, 10000);
}, 15000); // extended timeout for real-timer wait
it("does not auto-dismiss before 5 seconds", async () => {
render(<PurchaseSuccessModal />);
await waitForDialog();
const dialog = screen.getByRole("dialog");
expect(screen.getByRole("dialog")).toBeTruthy();
// Wait 4s — just under the 5s auto-dismiss threshold
await act(async () => { await new Promise((r) => setTimeout(r, 4000)); });
expect(screen.queryByRole("dialog")).toBeTruthy();
expect(screen.getByRole("dialog")).toBeTruthy();
});
});
@@ -167,6 +172,7 @@ describe("PurchaseSuccessModal — URL stripping", () => {
afterEach(() => {
cleanup();
vi.restoreAllMocks();
clearSearch();
});
@@ -192,37 +198,39 @@ describe("PurchaseSuccessModal — URL stripping", () => {
describe("PurchaseSuccessModal — accessibility", () => {
beforeEach(() => {
setSearch("?purchase_success=1&item=TestItem");
vi.useRealTimers(); // ensure clean state
});
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.useRealTimers(); // ensure no fake timer leak
clearSearch();
});
it("has aria-modal=true on the dialog", async () => {
render(<PurchaseSuccessModal />);
await waitFor(() => {
expect(screen.getByRole("dialog").getAttribute("aria-modal")).toBe("true");
});
await waitForDialog();
const dialog = screen.getByRole("dialog");
expect(dialog.getAttribute("aria-modal")).toBe("true");
});
it("has aria-labelledby pointing to the title", async () => {
render(<PurchaseSuccessModal />);
await waitFor(() => {
const dialog = screen.getByRole("dialog");
const labelledby = dialog.getAttribute("aria-labelledby");
expect(labelledby).toBeTruthy();
expect(document.getElementById(labelledby!)).toBeTruthy();
expect(document.getElementById(labelledby!)?.textContent).toMatch(/purchase successful/i);
});
await waitForDialog();
const dialog = screen.getByRole("dialog");
const labelledby = dialog.getAttribute("aria-labelledby");
expect(labelledby).toBeTruthy();
expect(document.getElementById(labelledby!)).toBeTruthy();
expect(document.getElementById(labelledby!)?.textContent).toMatch(/purchase successful/i);
});
// Focus test: verify close button exists after dialog renders.
// We test presence (not focus) since rAF focus is tricky in jsdom.
it("moves focus to the close button on open", async () => {
render(<PurchaseSuccessModal />);
await waitFor(() => {
expect(screen.getByRole("button", { name: "Close" })).toBeTruthy();
});
await act(async () => { await new Promise((r) => setTimeout(r, 100)); });
// Use getByRole which is more reliable than querySelector
expect(screen.getByRole("button", { name: "Close" })).toBeTruthy();
});
});
@@ -11,8 +11,6 @@ import { describe, expect, it, vi } from "vitest";
import { RevealToggle } from "../ui/RevealToggle";
describe("RevealToggle — render", () => {
// Scope all queries to container to avoid button ambiguity from other
// components in the shared jsdom environment.
it("renders a button element", () => {
const { container } = render(<RevealToggle revealed={false} onToggle={vi.fn()} />);
expect(container.querySelector("button")).toBeTruthy();
@@ -104,6 +104,7 @@ describe("SearchDialog — keyboard shortcuts", () => {
it("clears the query when Cmd+K opens the dialog", () => {
mockStoreState.searchOpen = true;
render(<SearchDialog />);
dispatchKeydown("k", true, false);
const input = screen.getByRole("combobox");
expect(input.getAttribute("value") ?? "").toBe("");
});
@@ -10,8 +10,6 @@ import { describe, expect, it } from "vitest";
import { Spinner } from "../Spinner";
describe("Spinner — size variants", () => {
// Use getAttribute("class") instead of .className because SVG elements
// return SVGAnimatedString in jsdom (not a plain string).
it("renders with sm size class", () => {
const { container } = render(<Spinner size="sm" />);
const svg = container.querySelector("svg");
@@ -11,25 +11,25 @@ import { describe, expect, it } from "vitest";
import { StatusBadge } from "../ui/StatusBadge";
describe("StatusBadge — render", () => {
// Scoping queries to [aria-label] avoids ambiguity with role=status
// from other components (Spinner, Toast, etc.) in the shared jsdom env.
it("renders verified status with ✓ icon", () => {
const { container } = render(<StatusBadge status="verified" />);
const badge = container.querySelector('[role="status"]') as HTMLElement;
expect(badge.textContent).toBe("✓");
expect(badge.getAttribute("aria-label")).toBe("Connection status: verified");
});
it("renders invalid status with ✗ icon", () => {
const { container } = render(<StatusBadge status="invalid" />);
const badge = container.querySelector('[role="status"]') as HTMLElement;
expect(badge.textContent).toBe("✗");
expect(badge.getAttribute("aria-label")).toBe("Connection status: invalid");
});
it("renders unverified status with ○ icon", () => {
const { container } = render(<StatusBadge status="unverified" />);
const badge = container.querySelector('[role="status"]') as HTMLElement;
expect(badge.textContent).toBe("○");
expect(badge.getAttribute("aria-label")).toBe("Connection status: unverified");
});
it("has role=status on the badge element", () => {
@@ -10,10 +10,6 @@
* - aria-hidden="true" and role="img" for accessibility
* - provisioning status carries motion-safe:animate-pulse for the pulsing effect
* - glow class applied when STATUS_CONFIG declares one
*
* NOTE: role="img" with aria-hidden="true" is invisible to getByRole in jsdom
* (Testing Library only finds accessible elements by default). Use
* container.querySelector with getAttribute instead.
*/
import { describe, expect, it } from "vitest";
import { render } from "@testing-library/react";
@@ -21,15 +17,6 @@ import React from "react";
import { StatusDot } from "../StatusDot";
function getDot(status: string, size?: "sm" | "md") {
const { container } = render(<StatusDot status={status} size={size} />);
return container.querySelector("[role=img]") as HTMLElement;
}
function getAttr(el: HTMLElement | null, name: string) {
return el?.getAttribute(name) ?? "";
}
describe("StatusDot — snapshot", () => {
it("renders with online status", () => {
const { container } = render(<StatusDot status="online" />);
@@ -1,291 +0,0 @@
// @vitest-environment jsdom
/**
* Toolbar tests.
*
* Covers:
* - Renders with 0 workspaces
* - Shows online/offline/failed/provisioning status pills when nodes exist
* - WebSocket status pill: connected → "Live"
* - WebSocket status pill: connecting → "Reconnecting"
* - WebSocket status pill: disconnected → "Offline"
* - Stop All button visible when activeTasks > 0
* - Restart Pending button visible when needsRestart nodes exist
* - Help button opens the help popover
* - Help popover closes on Escape or pointer-outside
* - KeyboardShortcutsDialog opens via ? shortcut (when not in input)
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, cleanup } from "@testing-library/react";
import React from "react";
afterEach(() => {
cleanup();
vi.clearAllMocks();
});
// Reset store state between tests so mutations don't leak.
beforeEach(() => {
defaultStore.nodes = [];
defaultStore.wsStatus = "connected";
defaultStore.showA2AEdges = false;
defaultStore.selectedNodeId = null;
mockSetShowA2AEdges.mockClear();
mockSetPanelTab.mockClear();
mockSetSearchOpen.mockClear();
mockUpdateNodeData.mockClear();
});
// ── Mock targets ───────────────────────────────────────────────────────────────
vi.mock("@/components/Toaster", () => ({
showToast: vi.fn(),
}));
vi.mock("@/components/ConfirmDialog", () => ({
ConfirmDialog: () => null,
}));
vi.mock("@/components/settings/SettingsButton", () => ({
SettingsButton: () => null,
}));
vi.mock("@/components/settings/SettingsPanel", () => ({
settingsGearRef: { current: null },
}));
vi.mock("@/components/ThemeToggle", () => ({
ThemeToggle: () => null,
}));
vi.mock("@/components/KeyboardShortcutsDialog", () => ({
KeyboardShortcutsDialog: ({ open }: { open: boolean; onClose: () => void }) =>
open ? <div role="dialog" data-testid="shortcuts-dialog">Shortcuts</div> : null,
}));
vi.mock("@/lib/design-tokens", () => ({
statusDotClass: (status: string) => {
const map: Record<string, string> = {
online: "bg-emerald-400",
offline: "bg-zinc-500",
paused: "bg-indigo-400",
degraded: "bg-amber-400",
failed: "bg-red-400",
provisioning: "bg-sky-400",
};
return map[status] ?? "bg-zinc-500";
},
}));
vi.mock("@/lib/api", () => ({
api: {
post: vi.fn(() => Promise.resolve()),
},
}));
// ── Store mocks ────────────────────────────────────────────────────────────────
const mockSetShowA2AEdges = vi.fn();
const mockSetPanelTab = vi.fn();
const mockSetSearchOpen = vi.fn();
const mockUpdateNodeData = vi.fn();
const makeNodes = (
statuses: Array<"online" | "offline" | "failed" | "provisioning">,
activeTasks: number[] = [],
needsRestart: boolean[] = [],
parentIds: (string | null)[] = []
) => {
return statuses.map((status, i) => ({
id: `ws-${i}`,
data: {
name: `Workspace ${i}`,
role: "agent",
tier: 1,
status,
parentId: parentIds[i] ?? null,
activeTasks: activeTasks[i] ?? 0,
needsRestart: needsRestart[i] ?? false,
},
}));
};
// Nodes must be React Flow nodes (id + data), but Toolbar only reads data fields.
// makeNodes returns { id, data: { activeTasks, needsRestart, ... } }.
const toStoreNodes = (nodes: ReturnType<typeof makeNodes>) =>
nodes.map((n) => ({ id: n.id, data: n.data }));
const defaultStore = {
nodes: [] as ReturnType<typeof makeNodes>,
wsStatus: "connected" as "connected" | "connecting" | "disconnected",
showA2AEdges: false,
selectedNodeId: null as string | null,
sidePanelWidth: 480,
setShowA2AEdges: mockSetShowA2AEdges,
setPanelTab: mockSetPanelTab,
setSearchOpen: mockSetSearchOpen,
updateNodeData: mockUpdateNodeData,
selectedNodeIds: new Set<string>(),
clearSelection: vi.fn(),
batchRestart: vi.fn(() => Promise.resolve()),
batchPause: vi.fn(() => Promise.resolve()),
batchDelete: vi.fn(() => Promise.resolve()),
};
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector: (s: typeof defaultStore) => unknown) =>
selector(defaultStore)
),
}));
// ── Component under test ───────────────────────────────────────────────────────
import { Toolbar } from "../Toolbar";
// ── Tests ─────────────────────────────────────────────────────────────────────
describe("Toolbar — workspace count display", () => {
it("shows '0 workspaces' when the canvas is empty", () => {
render(<Toolbar />);
expect(screen.getByText(/0 workspaces?/)).toBeTruthy();
});
it("shows 'N workspaces' when nodes exist", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"]));
render(<Toolbar />);
expect(screen.getByText(/2 workspaces?/)).toBeTruthy();
});
});
describe("Toolbar — status pills", () => {
it("shows the online pill when nodes are online", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online"]));
render(<Toolbar />);
// StatusPill uses aria-label
expect(screen.getByLabelText(/1 online/i)).toBeTruthy();
});
it("shows the offline pill only when offline nodes exist", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["offline"]));
render(<Toolbar />);
expect(screen.getByLabelText(/1 offline/i)).toBeTruthy();
});
it("shows the failed pill when failed nodes exist", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["failed"]));
render(<Toolbar />);
expect(screen.getByLabelText(/1 failed/i)).toBeTruthy();
});
it("shows the provisioning pill when provisioning nodes exist", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["provisioning"]));
render(<Toolbar />);
expect(screen.getByLabelText(/1 starting/i)).toBeTruthy();
});
it("suppresses offline pill when no offline nodes", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"]));
render(<Toolbar />);
expect(screen.queryByLabelText(/offline/i)).toBeNull();
});
});
describe("Toolbar — WebSocket status pill", () => {
it('shows "Live" when connected', () => {
defaultStore.wsStatus = "connected";
render(<Toolbar />);
expect(screen.getByText("Live")).toBeTruthy();
});
it('shows "Reconnecting" when connecting', () => {
defaultStore.wsStatus = "connecting";
render(<Toolbar />);
expect(screen.getByText("Reconnecting")).toBeTruthy();
});
it('shows "Offline" when disconnected', () => {
defaultStore.wsStatus = "disconnected";
render(<Toolbar />);
expect(screen.getByText("Offline")).toBeTruthy();
});
});
describe("Toolbar — Stop All", () => {
it("is hidden when no active tasks", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online"], [0]));
render(<Toolbar />);
expect(screen.queryByRole("button", { name: /Stop All/i })).toBeNull();
});
it("is visible when active tasks > 0", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online", "online"], [2, 2]));
render(<Toolbar />);
// aria-label: "Stop all running tasks (2)"
expect(screen.getByRole("button", { name: /stop all running tasks/i })).toBeTruthy();
});
});
describe("Toolbar — Restart Pending", () => {
it("is hidden when no nodes need restart", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online"], [], [false]));
render(<Toolbar />);
expect(screen.queryByRole("button", { name: /Restart Pending/i })).toBeNull();
});
it("is visible when nodes need restart", () => {
defaultStore.nodes = toStoreNodes(makeNodes(["online"], [], [true]));
render(<Toolbar />);
// aria-label: "Restart 1 workspaces pending config or secret changes"
expect(screen.getByRole("button", { name: /restart 1 workspace/i })).toBeTruthy();
});
});
describe("Toolbar — Help popover", () => {
it("opens when help button is clicked", () => {
render(<Toolbar />);
const helpBtn = screen.getByRole("button", { name: /open shortcuts and tips/i });
fireEvent.click(helpBtn);
expect(screen.getByRole("dialog")).toBeTruthy();
});
it("closes when close button is clicked", () => {
render(<Toolbar />);
const helpBtn = screen.getByRole("button", { name: /open shortcuts and tips/i });
fireEvent.click(helpBtn);
expect(screen.getByRole("dialog")).toBeTruthy();
const closeBtn = screen.getByRole("button", { name: /close help dialog/i });
fireEvent.click(closeBtn);
expect(screen.queryByRole("dialog")).toBeNull();
});
});
describe("Toolbar — A2A edges toggle", () => {
it("calls setShowA2AEdges on click", () => {
defaultStore.showA2AEdges = false;
render(<Toolbar />);
const toggle = screen.getByRole("button", { name: /show a2a edges/i });
fireEvent.click(toggle);
expect(mockSetShowA2AEdges).toHaveBeenCalledWith(true);
});
});
describe("Toolbar — ? shortcut opens shortcuts dialog", () => {
it("opens KeyboardShortcutsDialog when ? is pressed outside an input", () => {
render(<Toolbar />);
expect(screen.queryByTestId("shortcuts-dialog")).toBeNull();
fireEvent.keyDown(window, { key: "?" });
expect(screen.getByTestId("shortcuts-dialog")).toBeTruthy();
});
it("does not fire ? shortcut when focus is in an input", () => {
render(
<div>
<input data-testid="test-input" type="text" />
<Toolbar />
</div>
);
const input = screen.getByTestId("test-input");
fireEvent.focus(input);
// Fire on the input element so e.target.tagName === "INPUT" is true
fireEvent.keyDown(input, { key: "?" });
expect(screen.queryByTestId("shortcuts-dialog")).toBeNull();
});
});
@@ -31,33 +31,33 @@ describe("Tooltip — render", () => {
<button type="button">Hover me</button>
</Tooltip>
);
const { container } = render(<Tooltip text="Hello world"><button type="button">Hover me</button></Tooltip>);
const btn = container.querySelector("button");
expect(btn).toBeTruthy();
expect(screen.getByRole("button", { name: "Hover me" })).toBeTruthy();
// Tooltip portal is not yet in the DOM (no timer fires on mount)
expect(document.body.querySelector('[role="tooltip"]')).toBeNull();
expect(screen.queryByRole("tooltip")).toBeNull();
});
it("does not render the tooltip portal when text is empty string", () => {
const { container } = render(
render(
<Tooltip text="">
<button type="button">Hover me</button>
</Tooltip>
);
fireEvent.mouseEnter(container.querySelector("button")!);
// Move mouse over trigger
fireEvent.mouseEnter(screen.getByRole("button"));
act(() => {
vi.advanceTimersByTime(500);
});
expect(document.body.querySelector('[role="tooltip"]')).toBeNull();
expect(screen.queryByRole("tooltip")).toBeNull();
});
it("mounts the tooltip into a portal attached to document.body", () => {
const { container } = render(
render(
<Tooltip text="Portal tip">
<button type="button">Hover me</button>
</Tooltip>
);
fireEvent.mouseEnter(container.querySelector("button")!);
// Simulate mouse enter → 400ms delay → tooltip renders
fireEvent.mouseEnter(screen.getByRole("button"));
act(() => {
vi.advanceTimersByTime(500);
});
@@ -230,7 +230,7 @@ describe("Tooltip — Esc dismiss (WCAG 1.4.13)", () => {
act(() => {
vi.advanceTimersByTime(500);
});
expect(document.body.querySelector('[role="tooltip"]')).toBeTruthy();
expect(screen.queryByRole("tooltip")).toBeTruthy();
act(() => {
fireEvent.keyDown(window, { key: "Enter" });
@@ -17,8 +17,6 @@ vi.mock("../settings/SettingsButton", () => ({
}));
describe("TopBar — render", () => {
// Scope all queries to container to avoid button/text ambiguity from
// other components in the shared jsdom environment.
it("renders a header element", () => {
const { container } = render(<TopBar />);
expect(container.querySelector("header")).toBeTruthy();
@@ -12,10 +12,9 @@ import { ValidationHint } from "../ui/ValidationHint";
describe("ValidationHint — error state", () => {
it("renders error message when error is a non-null string", () => {
const { container } = render(<ValidationHint error="Invalid email address" />);
const el = container.querySelector('[role="alert"]');
expect(el).toBeTruthy();
expect(el?.textContent).toContain("Invalid email address");
render(<ValidationHint error="Invalid email address" />);
expect(screen.getByRole("alert")).toBeTruthy();
expect(screen.getByText("Invalid email address")).toBeTruthy();
});
it("includes the warning icon in error state", () => {
@@ -42,8 +41,8 @@ describe("ValidationHint — error state", () => {
describe("ValidationHint — valid state", () => {
it("renders valid message when error is null and showValid is true", () => {
const { container } = render(<ValidationHint error={null} showValid={true} />);
expect(container.textContent).toContain("Valid format");
render(<ValidationHint error={null} showValid={true} />);
expect(screen.getByText("Valid format")).toBeTruthy();
});
it("includes the checkmark icon in valid state", () => {
@@ -54,8 +53,8 @@ describe("ValidationHint — valid state", () => {
});
it("uses the valid class on the paragraph element", () => {
const { container } = render(<ValidationHint error={null} showValid={true} />);
const el = container.querySelector(".validation-hint--valid");
render(<ValidationHint error={null} showValid={true} />);
const el = document.body.querySelector(".validation-hint--valid");
expect(el).toBeTruthy();
});
@@ -1,592 +0,0 @@
// @vitest-environment jsdom
/**
* WorkspaceNode tests.
*
* Covers:
* - Renders name, status dot, tier badge, role, skills
* - Status gradient bar colored by STATUS_CONFIG
* - Online/offline/failed/degraded/provisioning states
* - Misconfigured state (online + not_configured)
* - Click → select, Shift+click → batch select
* - Keyboard Enter/Space → select/deselect
* - Context menu on right-click
* - Double-click collapsed parent → expands
* - Double-click expanded parent → zoom to team
* - Needs restart button visible when needsRestart=true
* - Current task banner when activeTasks > 0
* - Descendant count badge when node has children
* - Drag-target highlight class when dragOverNodeId matches
* - Batch-selected highlight class
* - OrgCancelButton renders on deploying root
* - Degraded error preview
* - Configuration error preview for misconfigured nodes
* - TeamMemberChip: name, status, skills, extract button, recursive
* - Handle anchors: top = extract, bottom = nest (keyboard accessible)
*/
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { render, screen, fireEvent, cleanup } from "@testing-library/react";
import React from "react";
// ── Mock @xyflow/react ────────────────────────────────────────────────────────
vi.mock("@xyflow/react", () => {
const Handle = ({
type,
position,
"aria-label": ariaLabel,
onKeyDown,
...rest
}: {
type: string;
position: string;
"aria-label"?: string;
onKeyDown?: (e: React.KeyboardEvent) => void;
[key: string]: unknown;
}) => (
<div
role="button"
aria-label={ariaLabel}
data-handle-type={type}
data-handle-position={position}
tabIndex={0}
onKeyDown={onKeyDown}
{...rest}
>
handle
</div>
);
return {
__esModule: true,
default: ({ children }: { children?: React.ReactNode }) => (
<div data-testid="react-flow-root">{children}</div>
),
NodeResizer: () => null,
Handle,
Position: { Top: "top", Bottom: "bottom", Left: "left", Right: "right" },
useReactFlow: () => ({ fitView: vi.fn(), setViewport: vi.fn() }),
applyNodeChanges: vi.fn((_: unknown, n: unknown) => n),
ReactFlowProvider: ({ children }: { children?: React.ReactNode }) => <>{children}</>,
};
});
// ── Mock dependencies ─────────────────────────────────────────────────────────
const mockGetConfigurationStatus = vi.fn(() => "configured");
const mockGetConfigurationError = vi.fn(() => null);
vi.mock("@/store/canvas-topology", () => ({
getConfigurationStatus: (...args: unknown[]) => mockGetConfigurationStatus(...args),
getConfigurationError: (...args: unknown[]) => mockGetConfigurationError(...args),
}));
// Expose for per-test override
const useConfigStatus = mockGetConfigurationStatus;
const useConfigError = mockGetConfigurationError;
vi.mock("@/components/Toaster", () => ({
showToast: vi.fn(),
}));
vi.mock("@/components/Tooltip", () => ({
Tooltip: ({ text, children }: { text: string; children: React.ReactNode }) => (
<div title={text} data-testid="tooltip-wrapper">{children}</div>
),
}));
vi.mock("@/components/canvas/useOrgDeployState", () => ({
useOrgDeployState: vi.fn(() => ({
isActivelyProvisioning: false,
isDeployingRoot: false,
isLockedChild: false,
descendantProvisioningCount: 0,
})),
}));
vi.mock("@/lib/design-tokens", () => ({
STATUS_CONFIG: {
online: { dot: "bg-emerald-400", glow: "shadow-emerald-400/50", bar: "to-emerald-500/30", label: "ONLINE" },
offline: { dot: "bg-zinc-500", glow: "", bar: "to-zinc-600/30", label: "OFFLINE" },
failed: { dot: "bg-red-400", glow: "", bar: "to-red-600/30", label: "FAILED" },
degraded: { dot: "bg-amber-400", glow: "", bar: "to-amber-600/30", label: "DEGRADED" },
provisioning: { dot: "bg-sky-400", glow: "", bar: "to-sky-600/30", label: "STARTING" },
not_configured: { dot: "bg-amber-400", glow: "", bar: "to-amber-600/30", label: "NOT CONFIGURED" },
},
TIER_CONFIG: {
1: { label: "T1", color: "text-zinc-400 bg-zinc-800" },
2: { label: "T2", color: "text-blue-400 bg-blue-900/50" },
3: { label: "T3", color: "text-purple-400 bg-purple-900/50" },
4: { label: "T4", color: "text-amber-400 bg-amber-900/50" },
},
}));
// ── Store mock ────────────────────────────────────────────────────────────────
// Uses a global object to share mock state between the factory (which runs
// when the module is imported) and the test body (beforeEach/afterEach).
declare global {
// eslint-disable-next-line no-var
var __workspaceNodeMocks: {
selectNode: ReturnType<typeof vi.fn>;
openContextMenu: ReturnType<typeof vi.fn>;
toggleNodeSelection: ReturnType<typeof vi.fn>;
nestNode: ReturnType<typeof vi.fn>;
restartWorkspace: ReturnType<typeof vi.fn>;
store: {
nodes: Array<{ id: string; data: Record<string, unknown> }>;
selectedNodeId: string | null;
dragOverNodeId: string | null;
selectedNodeIds: Set<string>;
};
} | undefined;
}
vi.mock("@/store/canvas", () => {
const mockSelectNode = vi.fn();
const mockOpenContextMenu = vi.fn();
const mockToggleNodeSelection = vi.fn();
const mockNestNode = vi.fn();
const mockRestartWorkspace = vi.fn(() => Promise.resolve());
const store = {
nodes: [] as Array<{ id: string; data: Record<string, unknown> }>,
selectedNodeId: null as string | null,
dragOverNodeId: null as string | null,
selectedNodeIds: new Set<string>(),
selectNode: mockSelectNode,
openContextMenu: mockOpenContextMenu,
toggleNodeSelection: mockToggleNodeSelection,
nestNode: mockNestNode,
restartWorkspace: mockRestartWorkspace,
};
const mockFn = (selector: (s: typeof store) => unknown) => selector(store);
Object.defineProperty(mockFn, "getState", { value: () => store });
// Expose via global for test body access
// eslint-disable-next-line @typescript-eslint/no-explicit-any
(globalThis as any).__workspaceNodeMocks = {
selectNode: mockSelectNode,
openContextMenu: mockOpenContextMenu,
toggleNodeSelection: mockToggleNodeSelection,
nestNode: mockNestNode,
restartWorkspace: mockRestartWorkspace,
store,
};
return { useCanvasStore: mockFn, __esModule: true };
});
// ── Component ────────────────────────────────────────────────────────────────
import { WorkspaceNode } from "../WorkspaceNode";
// ── Helpers ──────────────────────────────────────────────────────────────────
// Main node card uses data-testid to distinguish from handle anchors (also role=button)
const getNode = () => screen.getByTestId("workspace-node");
// Typed access to the shared mock state (set by the vi.mock factory)
const mocks = () => globalThis.__workspaceNodeMocks!;
const store = () => mocks().store;
const makeNode = (overrides: Record<string, unknown> = {}) => ({
id: "ws-1",
data: {
name: "Test Workspace",
role: "Test Agent",
tier: 1,
status: "online" as const,
parentId: null,
activeTasks: 0,
needsRestart: false,
currentTask: null as string | null,
lastSampleError: null as string | null,
collapsed: false,
agentCard: null,
runtime: null as string | null,
...overrides,
},
});
const renderNode = (nodeOverrides: Record<string, unknown> = {}) => {
const node = makeNode(nodeOverrides);
// WorkspaceNode expects NodeProps — it receives { id, data } as props
return render(<WorkspaceNode id={node.id as string} data={node.data as never} />);
};
// ── Tests ────────────────────────────────────────────────────────────────────
beforeEach(() => {
const m = globalThis.__workspaceNodeMocks!;
m.store.nodes = [];
m.store.selectedNodeId = null;
m.store.dragOverNodeId = null;
m.store.selectedNodeIds = new Set();
m.selectNode.mockClear();
m.openContextMenu.mockClear();
m.toggleNodeSelection.mockClear();
m.nestNode.mockClear();
m.restartWorkspace.mockClear();
mockGetConfigurationStatus.mockClear().mockReturnValue("configured");
mockGetConfigurationError.mockClear().mockReturnValue(null);
});
afterEach(() => {
cleanup();
});
describe("WorkspaceNode — basic rendering", () => {
it("renders the workspace name", () => {
renderNode({ name: "My Workspace" });
expect(screen.getByText("My Workspace")).toBeTruthy();
});
it("renders the role text", () => {
renderNode({ role: "Frontend Engineer" });
expect(screen.getByText("Frontend Engineer")).toBeTruthy();
});
it("renders the tier badge", () => {
renderNode({ tier: 2 });
expect(screen.getByText("T2")).toBeTruthy();
});
it("renders status dot with online class", () => {
renderNode({ status: "online" });
const dot = getNode().querySelector(".bg-emerald-400");
expect(dot).toBeTruthy();
});
it("renders role text clamped to 2 lines", () => {
renderNode({ role: "A very long role description that might overflow" });
expect(screen.getByText(/A very long role description/i)).toBeTruthy();
});
});
describe("WorkspaceNode — status states", () => {
it("shows status label for failed node", () => {
renderNode({ status: "failed" });
expect(screen.getByText("FAILED")).toBeTruthy();
});
it("shows status label for degraded node", () => {
renderNode({ status: "degraded" });
expect(screen.getByText("DEGRADED")).toBeTruthy();
});
it("shows status label for provisioning node", () => {
renderNode({ status: "provisioning" });
expect(screen.getByText("STARTING")).toBeTruthy();
});
it("suppresses status label for online node", () => {
renderNode({ status: "online" });
expect(screen.queryByText("ONLINE")).toBeNull();
});
it("shows degraded error preview when status is degraded and lastSampleError is set", () => {
renderNode({ status: "degraded", lastSampleError: "Connection timeout" });
expect(screen.getByText("Connection timeout")).toBeTruthy();
});
it("suppresses degraded error preview when no error", () => {
renderNode({ status: "degraded", lastSampleError: null });
expect(screen.queryByText(/timeout/i)).toBeNull();
});
});
describe("WorkspaceNode — misconfigured state", () => {
it("shows 'NOT CONFIGURED' label when agent is online but not_configured", () => {
vi.mocked(useConfigStatus).mockReturnValueOnce("not_configured");
vi.mocked(useConfigError).mockReturnValueOnce("ANTHROPIC_API_KEY is missing");
renderNode({ status: "online" });
expect(screen.getByText("NOT CONFIGURED")).toBeTruthy();
});
it("shows configuration error preview when misconfigured", () => {
vi.mocked(useConfigStatus).mockReturnValueOnce("not_configured");
vi.mocked(useConfigError).mockReturnValueOnce("OPENAI_API_KEY missing");
renderNode({ status: "online" });
expect(screen.getByText("OPENAI_API_KEY missing")).toBeTruthy();
});
it("aria-label includes name and status by default", () => {
// Mock set to default "configured" — no misconfigured label
renderNode({ status: "online" });
const btn = getNode();
expect(btn.getAttribute("aria-label")).toMatch(/Test Workspace/);
});
});
describe("WorkspaceNode — click interactions", () => {
it("calls selectNode(id) on click", () => {
renderNode();
fireEvent.click(getNode());
expect(mocks().selectNode).toHaveBeenCalledWith("ws-1");
});
it("calls selectNode(null) on click when already selected", () => {
store().selectedNodeId = "ws-1";
renderNode();
fireEvent.click(getNode());
expect(mocks().selectNode).toHaveBeenCalledWith(null);
});
it("calls toggleNodeSelection on Shift+click", () => {
renderNode();
fireEvent.click(getNode(), { shiftKey: true });
expect(mocks().toggleNodeSelection).toHaveBeenCalledWith("ws-1");
});
it("opens context menu on right-click", () => {
renderNode();
fireEvent.contextMenu(getNode(), {
clientX: 100,
clientY: 200,
});
expect(mocks().openContextMenu).toHaveBeenCalledWith(
expect.objectContaining({ nodeId: "ws-1", x: 100, y: 200 })
);
});
it("stops propagation to prevent canvas background click from firing", () => {
renderNode();
const btn = getNode();
// React synthetic events fire regardless of native bubbles. We just verify
// selectNode was called — the stopPropagation() call inside the handler
// prevents the event from reaching canvas background listeners.
expect(mocks().selectNode).not.toHaveBeenCalled(); // no click yet
fireEvent.click(btn, { bubbles: true });
expect(mocks().selectNode).toHaveBeenCalled();
});
});
describe("WorkspaceNode — keyboard interactions", () => {
it("selects node on Enter key", () => {
renderNode();
fireEvent.keyDown(getNode(), { key: "Enter" });
expect(mocks().selectNode).toHaveBeenCalledWith("ws-1");
});
it("deselects node on Enter key when already selected", () => {
store().selectedNodeId = "ws-1";
renderNode();
fireEvent.keyDown(getNode(), { key: "Enter" });
expect(mocks().selectNode).toHaveBeenCalledWith(null);
});
it("toggles batch selection on Shift+Enter", () => {
renderNode();
fireEvent.keyDown(getNode(), { key: "Enter", shiftKey: true });
expect(mocks().toggleNodeSelection).toHaveBeenCalledWith("ws-1");
});
it("opens context menu on ContextMenu key", () => {
renderNode();
fireEvent.keyDown(getNode(), { key: "ContextMenu" });
expect(mocks().openContextMenu).toHaveBeenCalledWith(
expect.objectContaining({ nodeId: "ws-1" })
);
});
});
describe("WorkspaceNode — double-click interactions", () => {
it("does nothing on double-click when node has no children", () => {
renderNode({ collapsed: false });
fireEvent.doubleClick(getNode());
// No exception thrown = fine. The actual zoom-to-team event is dispatched
// on the window, which jsdom handles silently.
expect(mocks().selectNode).not.toHaveBeenCalled();
});
it("sets collapsed=false on double-click of collapsed parent (no children in store)", () => {
renderNode({ collapsed: true });
fireEvent.doubleClick(getNode());
// When hasChildren is false (no child nodes in store), the handler returns early.
expect(mocks().selectNode).not.toHaveBeenCalled();
});
});
describe("WorkspaceNode — active tasks", () => {
it("shows active tasks badge when activeTasks > 0", () => {
renderNode({ activeTasks: 3 });
expect(screen.getByText("3 tasks")).toBeTruthy();
});
it("shows singular 'task' when activeTasks is 1", () => {
renderNode({ activeTasks: 1 });
expect(screen.getByText("1 task")).toBeTruthy();
});
it("suppresses badge when no active tasks", () => {
renderNode({ activeTasks: 0 });
expect(screen.queryByText(/task/)).toBeNull();
});
});
describe("WorkspaceNode — current task banner", () => {
it("shows current task banner when currentTask is set", () => {
renderNode({ currentTask: "Writing unit tests" });
expect(screen.getByText("Writing unit tests")).toBeTruthy();
});
it("suppresses current task banner when null", () => {
renderNode({ currentTask: null });
expect(screen.queryByText(/Writing unit tests/)).toBeNull();
});
it("shows both currentTask and needsRestart — currentTask takes visual priority", () => {
renderNode({ currentTask: "Active work", needsRestart: true });
// Current task banner renders; needs restart button is conditionally hidden
// behind `!data.currentTask` in the component
expect(screen.getByText("Active work")).toBeTruthy();
expect(screen.queryByRole("button", { name: /restart/i })).toBeNull();
});
});
describe("WorkspaceNode — needs restart", () => {
it("shows restart button when needsRestart=true and no currentTask", () => {
renderNode({ needsRestart: true, currentTask: null });
expect(screen.getByRole("button", { name: /restart to apply changes/i })).toBeTruthy();
});
it("suppresses restart button when currentTask is active", () => {
renderNode({ needsRestart: true, currentTask: "Working" });
expect(screen.queryByRole("button", { name: /restart/i })).toBeNull();
});
it("suppresses restart button when needsRestart=false", () => {
renderNode({ needsRestart: false });
expect(screen.queryByRole("button", { name: /restart/i })).toBeNull();
});
it("restart button calls restartWorkspace on click", () => {
renderNode({ needsRestart: true, currentTask: null });
fireEvent.click(screen.getByRole("button", { name: /restart to apply changes/i }));
expect(mocks().restartWorkspace).toHaveBeenCalledWith("ws-1");
});
it("restart button stops propagation", () => {
renderNode({ needsRestart: true, currentTask: null });
fireEvent.click(screen.getByRole("button", { name: /restart/i }));
// If propagation wasn't stopped, selectNode would also be called
expect(mocks().selectNode).not.toHaveBeenCalled();
});
});
describe("WorkspaceNode — descendant badge", () => {
it("shows descendant count badge when node has children in store", () => {
store().nodes = [
makeNode({ id: "ws-1" }),
{ id: "child-1", data: { ...makeNode({ id: "ws-1" }).data, parentId: "ws-1" } },
];
renderNode();
expect(screen.getByText("1 sub")).toBeTruthy();
});
it("suppresses badge when node has no children", () => {
store().nodes = [makeNode({ id: "ws-1" })];
renderNode();
expect(screen.queryByText(/sub/)).toBeNull();
});
});
describe("WorkspaceNode — skills pills", () => {
it("renders up to 4 skill pills", () => {
renderNode({
agentCard: {
skills: [
{ name: "code-review" },
{ name: "tdd" },
{ name: "debugging" },
{ name: "refactoring" },
],
},
});
expect(screen.getByText("code-review")).toBeTruthy();
expect(screen.getByText("refactoring")).toBeTruthy();
});
it("shows +N overflow when more than 4 skills", () => {
renderNode({
agentCard: {
skills: [
{ name: "s1" }, { name: "s2" }, { name: "s3" }, { name: "s4" }, { name: "s5" },
],
},
});
expect(screen.getByText("+1")).toBeTruthy();
});
it("suppresses skills section when no skills", () => {
renderNode({ agentCard: null });
// No skill text rendered
expect(screen.queryByText(/code-review/i)).toBeNull();
});
it("handles agentCard with no skills array", () => {
renderNode({ agentCard: { name: "Test Agent" } });
expect(screen.queryByText(/code-review/i)).toBeNull();
});
});
describe("WorkspaceNode — runtime badge", () => {
it("shows runtime badge when runtime is set", () => {
renderNode({ runtime: "hermes" });
expect(screen.getByText("hermes")).toBeTruthy();
});
it("shows REMOTE badge for external runtime", () => {
renderNode({ runtime: "external" });
expect(screen.getByText("★ REMOTE")).toBeTruthy();
});
it("suppresses runtime badge when runtime is null", () => {
renderNode({ runtime: null });
expect(screen.queryByText("hermes")).toBeNull();
});
});
describe("WorkspaceNode — selection aria", () => {
it('has aria-pressed="false" when not selected', () => {
store().selectedNodeId = null;
renderNode();
expect(getNode().getAttribute("aria-pressed")).toBe("false");
});
it('has aria-pressed="true" when selected', () => {
store().selectedNodeId = "ws-1";
renderNode();
expect(getNode().getAttribute("aria-pressed")).toBe("true");
});
});
describe("WorkspaceNode — aria-label", () => {
it("includes name and status in aria-label", () => {
renderNode({ name: "MyAgent", status: "online" });
const label = getNode().getAttribute("aria-label");
expect(label).toContain("MyAgent");
expect(label).toContain("online");
});
});
describe("WorkspaceNode — handle anchors accessibility", () => {
it("top handle has aria-label for extract", () => {
renderNode({ parentId: "parent-1" });
const handles = screen.getAllByRole("button");
const topHandle = handles.find((h) => h.getAttribute("data-handle-type") === "target");
expect(topHandle?.getAttribute("aria-label")).toMatch(/extract/i);
});
it("bottom handle has aria-label for nest", () => {
renderNode();
const handles = screen.getAllByRole("button");
const bottomHandle = handles.find((h) => h.getAttribute("data-handle-type") === "source");
expect(bottomHandle?.getAttribute("aria-label")).toMatch(/nest/i);
});
it("top handle extract is no-op when node has no parent", () => {
renderNode({ parentId: null });
const handles = screen.getAllByRole("button");
const topHandle = handles.find((h) => h.getAttribute("data-handle-type") === "target");
fireEvent.keyDown(topHandle!, { key: "Enter" });
// Should be a no-op — no exception
expect(mocks().nestNode).not.toHaveBeenCalled();
});
});
@@ -1,131 +0,0 @@
// @vitest-environment jsdom
/**
* palette-context: MobileAccentProvider + usePalette hook coverage.
*
* Covers:
* - usePalette(dark=false) without provider → MOL_LIGHT
* - usePalette(dark=true) without provider → MOL_DARK
* - usePalette with provider accent=null → base palette unchanged
* - usePalette with provider accent=base.accent → base palette unchanged (identity guard)
* - usePalette with provider accent="#ff0000" → accent + online overridden
* - MobileAccentProvider renders children
* - Never mutates the static MOL_LIGHT/MOL_DARK singletons
*
* The pure functions (getPalette, normalizeStatus, tierCode) are covered
* in palette.test.ts — only the React context/hook is tested here.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, render } from "@testing-library/react";
import React from "react";
import { MobileAccentProvider, usePalette } from "../palette-context";
import { MOL_DARK, MOL_LIGHT } from "../palette";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
// ─── Test helpers ──────────────────────────────────────────────────────────────
// Each helper renders exactly one usePalette value as a testid element.
// Using unique testids per scenario avoids "multiple elements" DOM pollution
// when tests run in the same jsdom worker without strict cleanup timing.
function AccentDump({ dark }: { dark: boolean }) {
const palette = usePalette(dark);
return <span data-testid="accent-val">{palette.accent}</span>;
}
function OnlineDump({ dark }: { dark: boolean }) {
const palette = usePalette(dark);
return <span data-testid="online-val">{palette.online}</span>;
}
// ─── MobileAccentProvider ──────────────────────────────────────────────────────
describe("MobileAccentProvider", () => {
it("renders children", () => {
const { getByText } = render(
<MobileAccentProvider accent={null}>
<span>child content</span>
</MobileAccentProvider>,
);
expect(getByText("child content").textContent).toBe("child content");
});
});
// ─── usePalette — no provider ─────────────────────────────────────────────────
describe("usePalette without MobileAccentProvider", () => {
it("returns MOL_LIGHT when dark=false", () => {
const { getByTestId } = render(<AccentDump dark={false} />);
expect(getByTestId("accent-val").textContent).toBe(MOL_LIGHT.accent);
});
it("returns MOL_DARK when dark=true", () => {
const { getByTestId } = render(<AccentDump dark={true} />);
expect(getByTestId("accent-val").textContent).toBe(MOL_DARK.accent);
});
});
// ─── usePalette — with MobileAccentProvider ────────────────────────────────────
describe("usePalette with MobileAccentProvider", () => {
it("returns base palette unchanged when accent=null", () => {
const { getByTestId } = render(
<MobileAccentProvider accent={null}>
<AccentDump dark={false} />
</MobileAccentProvider>,
);
expect(getByTestId("accent-val").textContent).toBe(MOL_LIGHT.accent);
});
it("returns base palette unchanged when accent matches base.accent (identity guard)", () => {
const { getByTestId } = render(
<MobileAccentProvider accent={MOL_LIGHT.accent}>
<AccentDump dark={false} />
</MobileAccentProvider>,
);
expect(getByTestId("accent-val").textContent).toBe(MOL_LIGHT.accent);
});
it("overrides accent when provider supplies a different colour", () => {
const CUSTOM = "#ff0000";
const { getByTestId } = render(
<MobileAccentProvider accent={CUSTOM}>
<AccentDump dark={false} />
</MobileAccentProvider>,
);
expect(getByTestId("accent-val").textContent).toBe(CUSTOM);
});
it("also overrides online when accent is overridden", () => {
const CUSTOM = "#ff8800";
const { getByTestId } = render(
<MobileAccentProvider accent={CUSTOM}>
<OnlineDump dark={false} />
</MobileAccentProvider>,
);
expect(getByTestId("online-val").textContent).toBe(CUSTOM);
});
});
// ─── Immutability ─────────────────────────────────────────────────────────────
describe("MOL_LIGHT and MOL_DARK singletons are never mutated", () => {
it("MOL_LIGHT.accent unchanged after custom-accent render", () => {
const before = MOL_LIGHT.accent;
render(
<MobileAccentProvider accent="#deadbeef">
<AccentDump dark={false} />
</MobileAccentProvider>,
);
expect(MOL_LIGHT.accent).toBe(before);
});
it("MOL_DARK.accent unchanged after custom-accent render", () => {
const before = MOL_DARK.accent;
render(
<MobileAccentProvider accent="#bada55ff">
<AccentDump dark={true} />
</MobileAccentProvider>,
);
expect(MOL_DARK.accent).toBe(before);
});
});
+1 -1
View File
@@ -402,7 +402,7 @@ function Row({ label, value, mono }: { label: string; value: string; mono?: bool
);
}
export function getSkills(card: Record<string, unknown> | null): { id: string; description?: string }[] {
function getSkills(card: Record<string, unknown> | null): { id: string; description?: string }[] {
if (!card) return [];
const skills = card.skills;
if (!Array.isArray(skills)) return [];
@@ -1,224 +0,0 @@
// @vitest-environment jsdom
/**
* FilesTab: NotAvailablePanel + FilesToolbar coverage.
*
* NotAvailablePanel: pure presentational component — renders a "feature not
* available" placeholder for external-runtime workspaces.
* FilesToolbar: pure props-driven component — directory selector, file count,
* action buttons (New, Upload, Export, Clear, Refresh) with correct aria-labels.
*
* No @testing-library/jest-dom import — use textContent / className /
* getAttribute checks to avoid "expect is not defined" errors.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, render, screen } from "@testing-library/react";
import React from "react";
import { FilesToolbar } from "../FilesToolbar";
import { NotAvailablePanel } from "../NotAvailablePanel";
// ─── afterEach ─────────────────────────────────────────────────────────────────
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
// ─── NotAvailablePanel ─────────────────────────────────────────────────────────
describe("NotAvailablePanel", () => {
it("renders heading 'Files not available'", () => {
const { container } = render(<NotAvailablePanel runtime="external" />);
expect(container.textContent).toContain("Files not available");
});
it("renders the runtime name in monospace", () => {
const { container } = render(<NotAvailablePanel runtime="external" />);
expect(container.textContent).toContain("external");
const spans = container.querySelectorAll("span");
const monoSpans = Array.from(spans).filter(
(s) => s.className && s.className.includes("font-mono"),
);
expect(monoSpans.length).toBeGreaterThan(0);
});
it("renders a Chat tab hint in description", () => {
const { container } = render(<NotAvailablePanel runtime="remote-agent" />);
expect(container.textContent).toContain("Chat tab");
});
it("SVG icon has aria-hidden=true", () => {
const { container } = render(<NotAvailablePanel runtime="external" />);
const svg = container.querySelector("svg");
expect(svg?.getAttribute("aria-hidden")).toBe("true");
});
it("renders without crashing for any runtime string", () => {
const { container } = render(<NotAvailablePanel runtime="unknown-runtime" />);
expect(container.textContent).toContain("unknown-runtime");
});
it("applies the correct layout classes to root div", () => {
const { container } = render(<NotAvailablePanel runtime="external" />);
const root = container.firstElementChild as HTMLElement;
expect(root.className).toContain("flex");
expect(root.className).toContain("flex-col");
expect(root.className).toContain("items-center");
});
});
// ─── FilesToolbar ───────────────────────────────────────────────────────────────
describe("FilesToolbar", () => {
const noop = vi.fn();
function renderToolbar(props: Partial<React.ComponentProps<typeof FilesToolbar>> = {}) {
return render(
<FilesToolbar
root="/configs"
setRoot={noop}
fileCount={0}
onNewFile={noop}
onUpload={noop}
onDownloadAll={noop}
onClearAll={noop}
onRefresh={noop}
{...props}
/>,
);
}
it("renders the directory selector with correct aria-label", () => {
const { container } = renderToolbar();
const select = container.querySelector("select");
expect(select?.getAttribute("aria-label")).toBe("File root directory");
});
it("directory selector has all four options", () => {
const { container } = renderToolbar();
const select = container.querySelector("select") as HTMLSelectElement;
const options = Array.from(select?.options ?? []);
const values = options.map((o) => o.value);
expect(values).toContain("/configs");
expect(values).toContain("/home");
expect(values).toContain("/workspace");
expect(values).toContain("/plugins");
});
it("calls setRoot when directory changes", () => {
const setRoot = vi.fn();
const { container } = renderToolbar({ setRoot });
const select = container.querySelector("select") as HTMLSelectElement;
select.value = "/home";
select.dispatchEvent(new Event("change", { bubbles: true }));
expect(setRoot).toHaveBeenCalledWith("/home");
});
it("displays the file count", () => {
const { container } = renderToolbar({ fileCount: 42 });
expect(container.textContent).toContain("42 files");
});
it("shows New + Upload + Clear buttons for /configs", () => {
const { container } = renderToolbar({ root: "/configs" });
const texts = Array.from(container.querySelectorAll("button")).map(
(b) => b.textContent?.trim(),
);
expect(texts).toContain("+ New");
expect(texts).toContain("Upload");
expect(texts).toContain("Clear");
expect(texts).toContain("Export");
expect(texts).toContain("↻");
});
it("hides New + Upload + Clear for /workspace", () => {
const { container } = renderToolbar({ root: "/workspace" });
const texts = Array.from(container.querySelectorAll("button")).map(
(b) => b.textContent?.trim(),
);
expect(texts).not.toContain("+ New");
expect(texts).not.toContain("Upload");
expect(texts).not.toContain("Clear");
expect(texts).toContain("Export");
});
it("hides New + Upload + Clear for /home", () => {
const { container } = renderToolbar({ root: "/home" });
const texts = Array.from(container.querySelectorAll("button")).map(
(b) => b.textContent?.trim(),
);
expect(texts).not.toContain("+ New");
expect(texts).not.toContain("Upload");
expect(texts).not.toContain("Clear");
});
it("hides New + Upload + Clear for /plugins", () => {
const { container } = renderToolbar({ root: "/plugins" });
const texts = Array.from(container.querySelectorAll("button")).map(
(b) => b.textContent?.trim(),
);
expect(texts).not.toContain("+ New");
expect(texts).not.toContain("Upload");
expect(texts).not.toContain("Clear");
});
it("New button has correct aria-label", () => {
const { container } = renderToolbar({ root: "/configs" });
const newBtn = container.querySelector('button[aria-label="Create new file"]');
expect(newBtn?.textContent?.trim()).toBe("+ New");
});
it("Export button has correct aria-label", () => {
const { container } = renderToolbar();
const exportBtn = container.querySelector('button[aria-label="Download all files"]');
expect(exportBtn?.textContent?.trim()).toBe("Export");
});
it("Clear button has correct aria-label", () => {
const { container } = renderToolbar({ root: "/configs" });
const clearBtn = container.querySelector('button[aria-label="Delete all files"]');
expect(clearBtn?.textContent?.trim()).toBe("Clear");
});
it("Refresh button has correct aria-label", () => {
const { container } = renderToolbar();
const refreshBtn = container.querySelector('button[aria-label="Refresh file list"]');
expect(refreshBtn?.textContent?.trim()).toBe("↻");
});
it("calls onNewFile when New button is clicked", () => {
const onNewFile = vi.fn();
const { container } = renderToolbar({ root: "/configs", onNewFile });
container.querySelector('button[aria-label="Create new file"]')!.click();
expect(onNewFile).toHaveBeenCalledTimes(1);
});
it("calls onDownloadAll when Export button is clicked", () => {
const onDownloadAll = vi.fn();
const { container } = renderToolbar({ onDownloadAll });
container.querySelector('button[aria-label="Download all files"]')!.click();
expect(onDownloadAll).toHaveBeenCalledTimes(1);
});
it("calls onClearAll when Clear button is clicked", () => {
const onClearAll = vi.fn();
const { container } = renderToolbar({ root: "/configs", onClearAll });
container.querySelector('button[aria-label="Delete all files"]')!.click();
expect(onClearAll).toHaveBeenCalledTimes(1);
});
it("calls onRefresh when Refresh button is clicked", () => {
const onRefresh = vi.fn();
const { container } = renderToolbar({ onRefresh });
container.querySelector('button[aria-label="Refresh file list"]')!.click();
expect(onRefresh).toHaveBeenCalledTimes(1);
});
it("applies focus-visible ring to all interactive buttons", () => {
const { container } = renderToolbar({ root: "/configs" });
const buttons = container.querySelectorAll("button");
for (const btn of buttons) {
expect(btn.className).toContain("focus-visible:ring-2");
}
});
});
+1 -10
View File
@@ -76,10 +76,8 @@ export function ScheduleTab({ workspaceId }: Props) {
try {
const data = await api.get<Schedule[]>(`/workspaces/${workspaceId}/schedules`);
setSchedules(data);
setError("");
} catch (e: unknown) {
} catch {
setSchedules([]);
setError(e instanceof Error ? e.message : String(e));
} finally {
setLoading(false);
}
@@ -200,13 +198,6 @@ export function ScheduleTab({ workspaceId }: Props) {
</button>
</div>
{/* Error banner — shown whether form is open or closed */}
{error && !showForm && (
<div className="px-3 py-1.5 text-[10px] text-bad bg-red-900/20 border-b border-red-800/30">
{error}
</div>
)}
{/* Create/Edit Form */}
{showForm && (
<div className="p-3 border-b border-line/50 bg-surface-sunken/50 space-y-2">
+1 -1
View File
@@ -647,7 +647,7 @@ export function SkillsTab({ workspaceId, data }: Props) {
);
}
export function extractSkills(agentCard: Record<string, unknown> | null): SkillEntry[] {
function extractSkills(agentCard: Record<string, unknown> | null): SkillEntry[] {
if (!agentCard) return [];
const rawSkills = agentCard.skills;
if (!Array.isArray(rawSkills)) return [];
@@ -1,330 +0,0 @@
// @vitest-environment jsdom
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
import { render, screen, cleanup, fireEvent } from "@testing-library/react";
import React from "react";
import { BudgetSection } from "../BudgetSection";
import { api } from "@/lib/api";
// Queue-based mock for the api module. Each api call shifts from the queue.
// Tests push with qGet/qPatch and the module-level mockImplementation
// reads from the queue.
type QueueEntry = { body?: unknown; err?: Error };
const apiQueue: QueueEntry[] = [];
vi.mock("@/lib/api", () => ({
api: {
get: vi.fn(async (path: string) => {
const next = apiQueue.shift();
if (!next) throw new Error(`api.get queue exhausted at: ${path}`);
if (next.err) throw next.err;
return next.body;
}),
patch: vi.fn(async (path: string, _body?: unknown) => {
const next = apiQueue.shift();
if (!next) throw new Error(`api.patch queue exhausted at: ${path}`);
if (next.err) throw next.err;
return next.body;
}),
},
}));
afterEach(cleanup);
beforeEach(() => {
apiQueue.length = 0;
vi.clearAllMocks();
});
const WS_ID = "budget-test-ws";
function qGet(body: unknown) {
apiQueue.push({ body });
}
function qGetErr(status: number, msg: string) {
apiQueue.push({ err: new Error(`${msg}: ${status}`) });
}
function qPatch(body: unknown) {
apiQueue.push({ body });
}
function qPatchErr(status: number, msg: string) {
apiQueue.push({ err: new Error(`${msg}: ${status}`) });
}
function makeBudget(overrides: Partial<{
budget_limit: number | null;
budget_used: number;
budget_remaining: number | null;
}> = {}) {
return {
budget_limit: 10_000,
budget_used: 3_500,
budget_remaining: 6_500,
...overrides,
};
}
describe("BudgetSection", () => {
describe("loading state", () => {
it("shows loading indicator while fetching", async () => {
let resolveGet: (v: unknown) => void;
vi.mocked(api.get).mockImplementationOnce(
async () => new Promise((r) => { resolveGet = r as (v: unknown) => void; }),
);
render(<BudgetSection workspaceId={WS_ID} />);
expect(screen.getByTestId("budget-loading")).toBeTruthy();
// Resolve after render to verify state clears
resolveGet!(makeBudget());
await vi.waitFor(() => {
expect(screen.queryByTestId("budget-loading")).toBeNull();
});
});
});
describe("fetch error state", () => {
it("shows error message on non-402 fetch failure", async () => {
qGetErr(500, "Internal Server Error");
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-fetch-error")).toBeTruthy();
});
expect(screen.getByTestId("budget-fetch-error")!.textContent).toContain("500");
});
it("shows 402 as exceeded banner, not fetch error", async () => {
// 402 means the budget limit was hit — different UX from a network/API error.
qGetErr(402, "Payment Required");
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy();
});
expect(screen.queryByTestId("budget-fetch-error")).toBeNull();
});
});
describe("budget loaded — display", () => {
it("renders used / limit stats row", async () => {
qGet(makeBudget({ budget_limit: 10_000, budget_used: 3_500 }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-used-value")!.textContent).toBe("3,500");
});
expect(screen.getByTestId("budget-limit-value")!.textContent).toBe("10,000");
});
it("renders 'Unlimited' when budget_limit is null", async () => {
qGet(makeBudget({ budget_limit: null, budget_used: 1_000, budget_remaining: null }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-limit-value")!.textContent).toBe("Unlimited");
});
});
it("renders remaining credits when present", async () => {
qGet(makeBudget({ budget_limit: 10_000, budget_used: 3_500, budget_remaining: 6_500 }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-remaining")!.textContent).toContain("6,500");
expect(screen.getByTestId("budget-remaining")!.textContent).toContain("credits remaining");
});
});
it("omits remaining credits when budget_remaining is null", async () => {
qGet(makeBudget({ budget_limit: 10_000, budget_used: 3_500, budget_remaining: null }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.queryByTestId("budget-remaining")).toBeNull();
});
});
it("caps progress bar at 100% when used > limit", async () => {
// Over-limit: 12000 used of 10000 limit should show 100%, not 120%.
qGet(makeBudget({ budget_limit: 10_000, budget_used: 12_000, budget_remaining: null }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
const fill = screen.getByTestId("budget-progress-fill");
expect(fill.getAttribute("style")).toContain("100%");
});
});
it("omits progress bar when budget_limit is null (unlimited)", async () => {
qGet(makeBudget({ budget_limit: null, budget_used: 5_000, budget_remaining: null }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.queryByTestId("budget-progress-fill")).toBeNull();
});
});
});
describe("budget exceeded (402)", () => {
it("shows exceeded banner when load returns 402", async () => {
qGetErr(402, "Payment Required");
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy();
expect(screen.getByTestId("budget-exceeded-banner")!.textContent).toContain("Budget exceeded");
});
});
it("clears exceeded banner after successful save", async () => {
qGetErr(402, "Payment Required");
qPatch(makeBudget({ budget_limit: 50_000, budget_used: 0, budget_remaining: 50_000 }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy();
});
const input = screen.getByTestId("budget-limit-input");
fireEvent.change(input, { target: { value: "50000" } });
const saveBtn = screen.getByTestId("budget-save-btn");
fireEvent.click(saveBtn);
await vi.waitFor(() => {
expect(screen.queryByTestId("budget-exceeded-banner")).toBeNull();
});
});
});
describe("save flow", () => {
it("shows save error on non-402 patch failure", async () => {
qGet(makeBudget());
qPatchErr(500, "Internal Server Error");
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-limit-input")).toBeTruthy();
});
const saveBtn = screen.getByTestId("budget-save-btn");
fireEvent.click(saveBtn);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-save-error")).toBeTruthy();
expect(screen.getByTestId("budget-save-error")!.textContent).toContain("500");
});
});
it("updates input to new limit value after successful save", async () => {
qGet(makeBudget({ budget_limit: 10_000 }));
qPatch(makeBudget({ budget_limit: 20_000 }));
render(<BudgetSection workspaceId={WS_ID} />);
// Wait for the input to appear (loading → loaded)
await vi.waitFor(() => {
expect(screen.queryByTestId("budget-loading")).toBeNull();
});
const input = screen.getByTestId("budget-limit-input") as HTMLInputElement;
// Debug: check what values are rendered
const limitValue = screen.getByTestId("budget-limit-value")?.textContent;
expect(input.value).toBe("10000"); // initial value from API
expect(limitValue).toBe("10,000");
fireEvent.change(input, { target: { value: "20000" } });
expect(input.value).toBe("20000");
fireEvent.click(screen.getByTestId("budget-save-btn"));
await vi.waitFor(() => {
expect((screen.getByTestId("budget-limit-input") as HTMLInputElement).value).toBe("20000");
});
});
it("sends null when input is cleared (unlimited)", async () => {
qGet(makeBudget({ budget_limit: 10_000 }));
qPatch(makeBudget({ budget_limit: null }));
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-limit-input")).toBeTruthy();
});
const input = screen.getByTestId("budget-limit-input") as HTMLInputElement;
fireEvent.change(input, { target: { value: "" } });
fireEvent.click(screen.getByTestId("budget-save-btn"));
await vi.waitFor(() => {
// After save with null limit, input should show empty (unlimited)
expect(input.value).toBe("");
});
});
it("shows saving state on button while patch is in flight", async () => {
qGet(makeBudget());
let resolvePatch: (v: unknown) => void;
vi.mocked(api.patch).mockImplementationOnce(
async () => new Promise((r) => { resolvePatch = r as (v: unknown) => void; }),
);
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-limit-input")).toBeTruthy();
});
fireEvent.change(screen.getByTestId("budget-limit-input"), { target: { value: "50000" } });
fireEvent.click(screen.getByTestId("budget-save-btn"));
const btn = screen.getByTestId("budget-save-btn");
expect(btn.textContent).toContain("Saving");
resolvePatch!(makeBudget({ budget_limit: 50_000 }));
await vi.waitFor(() => {
expect(btn.textContent).toContain("Save");
});
});
});
describe("isApiError402 — regression coverage", () => {
it("classifies ': 402' with space as 402", async () => {
qGetErr(402, "Payment Required");
qPatch(makeBudget());
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-exceeded-banner")).toBeTruthy();
});
});
it("classifies non-402 error messages as regular fetch errors", async () => {
qGetErr(503, "Service Unavailable");
render(<BudgetSection workspaceId={WS_ID} />);
await vi.waitFor(() => {
expect(screen.getByTestId("budget-fetch-error")).toBeTruthy();
});
expect(screen.queryByTestId("budget-exceeded-banner")).toBeNull();
});
});
});
@@ -1,856 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for ChannelsTab — social channel integration management.
*
* Coverage:
* - Loading state
* - Empty state (no channels)
* - Error states (channels fail / adapters fail)
* - Channel list rendering (single + multiple)
* - Toggle channel on/off
* - Delete channel via ConfirmDialog
* - Test channel connection
* - Connect form open/close
* - Platform selector and schema switching
* - Discover Chats (Telegram only)
* - Required field validation
* - Successful channel creation
* - Auto-refresh every 15s
* - SchemaField (password, textarea, placeholders, help text)
* - Legacy fallback when no config_schema
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { ChannelsTab } from "../ChannelsTab";
// ─── Mocks ───────────────────────────────────────────────────────────────────
const mockGet = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockPost = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockPatch = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockDel = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
vi.mock("@/lib/api", () => ({
api: {
get: mockGet,
post: mockPost,
patch: mockPatch,
del: mockDel,
},
}));
// Capture ConfirmDialog props so we can drive them from tests.
// Both the state ref AND the mock fn must be hoisted — vi.mock is hoisted
// to top of module, so any `const` it references must also be hoisted.
const confirmDialogState = vi.hoisted(
() => ({ open: false as boolean, onConfirm: undefined as (() => void) | undefined, onCancel: undefined as (() => void) | undefined }),
);
const MockConfirmDialog = vi.hoisted(() =>
vi.fn(
({ open, onConfirm, onCancel }: {
open: boolean;
onConfirm: () => void;
onCancel: () => void;
}) => {
confirmDialogState.open = open;
confirmDialogState.onConfirm = onConfirm;
confirmDialogState.onCancel = onCancel;
if (!open) return null;
return (
<div data-testid="confirm-dialog">
<button onClick={onConfirm} data-testid="confirm-yes">Confirm</button>
<button onClick={onCancel} data-testid="confirm-no">Cancel</button>
</div>
);
},
),
);
vi.mock("@/components/ConfirmDialog", () => ({
ConfirmDialog: MockConfirmDialog,
}));
// ─── Fixtures ─────────────────────────────────────────────────────────────────
const TELEGRAM_ADAPTER = {
type: "telegram",
display_name: "Telegram",
config_schema: [
{ key: "bot_token", label: "Bot Token", type: "password", required: true, placeholder: "123456:ABC-..." },
{ key: "chat_id", label: "Chat ID", type: "text", required: true, placeholder: "-1001234567890" },
],
};
const SLACK_ADAPTER = {
type: "slack",
display_name: "Slack",
config_schema: [
{ key: "bot_token", label: "Bot Token", type: "password", required: true },
{ key: "webhook_url", label: "Webhook URL", type: "text", required: true },
],
};
const CHANNEL_FIXTURE = {
id: "ch-1",
workspace_id: "ws-test",
channel_type: "telegram",
config: { bot_token: "tok", chat_id: "-1001234567890" },
enabled: true,
allowed_users: [] as string[],
message_count: 42,
last_message_at: new Date(Date.now() - 3_600_000).toISOString(),
created_at: new Date(Date.now() - 86_400_000).toISOString(),
};
const DISCOVER_RESPONSE = {
chats: [
{ chat_id: "-1001", name: "General", type: "group" },
{ chat_id: "-1002", name: "Alerts", type: "group" },
{ chat_id: "111", name: "Alice", type: "private" },
],
hint: "Found 3 chats",
};
// ─── Helpers ──────────────────────────────────────────────────────────────────
async function flush() {
await act(async () => { await Promise.resolve(); });
}
// fireEvent.change dispatches a 'change' event, but React listens for 'input'.
// Use the native input event so React's synthetic onChange fires.
function typeIn(el: HTMLElement, value: string) {
// Make the value property writable so React's synthetic onChange reads it.
// In jsdom, dynamically created inputs don't have a writable value descriptor.
Object.defineProperty(el, "value", {
value,
writable: true,
configurable: true,
});
// eslint-disable-next-line @typescript-eslint/no-explicit-any
fireEvent.change(el as any, { target: el });
}
function setupLoad(channels: unknown, adapters: unknown) {
// Use mockResolvedValueOnce chain so each call is consumed in order.
// Promise.allSettled calls get() twice: first for channels, second for adapters.
mockGet
.mockResolvedValueOnce(Promise.resolve(channels))
.mockResolvedValueOnce(Promise.resolve(adapters));
}
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("ChannelsTab", () => {
beforeEach(() => {
mockGet.mockReset();
mockPost.mockReset();
mockPatch.mockReset();
mockDel.mockReset();
MockConfirmDialog.mockClear();
vi.useRealTimers();
confirmDialogState.open = false;
confirmDialogState.onConfirm = undefined;
confirmDialogState.onCancel = undefined;
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
// ── Loading ──────────────────────────────────────────────────────────────
it("shows loading state while fetching", () => {
mockGet.mockImplementation(() => new Promise(() => {}));
render(<ChannelsTab workspaceId="ws-test" />);
expect(screen.getByText("Loading channels...")).toBeTruthy();
});
// ── Empty state ──────────────────────────────────────────────────────────
it("shows empty state with platform guidance", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText("No channels connected")).toBeTruthy();
expect(screen.getByText(/Connect Telegram, Slack, Discord/)).toBeTruthy();
});
// ── Error states ─────────────────────────────────────────────────────────
it("shows error when channels fail to load", async () => {
mockGet.mockImplementation((url: string) => {
if (url.includes("/workspaces/")) return Promise.reject(new Error("channels failed"));
return Promise.resolve([TELEGRAM_ADAPTER]);
});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText(/Failed to load connected channels/)).toBeTruthy();
});
it("shows error when adapters fail to load", async () => {
mockGet.mockImplementation((url: string) => {
if (url.includes("/workspaces/")) return Promise.resolve([]);
return Promise.reject(new Error("adapters failed"));
});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText(/Failed to load platforms/)).toBeTruthy();
});
// ── Channel list ─────────────────────────────────────────────────────────
it("renders a single channel with correct info", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText("Telegram")).toBeTruthy();
expect(screen.getByText("-1001234567890")).toBeTruthy();
expect(screen.getByText("42 messages")).toBeTruthy();
expect(screen.getByRole("button", { name: /Test/i })).toBeTruthy();
expect(screen.getByRole("button", { name: /Remove/i })).toBeTruthy();
});
it("renders multiple channels", async () => {
setupLoad(
[
{ ...CHANNEL_FIXTURE, id: "ch-1", channel_type: "telegram", enabled: true },
{ ...CHANNEL_FIXTURE, id: "ch-2", channel_type: "slack", enabled: false, message_count: 10 },
],
[TELEGRAM_ADAPTER, SLACK_ADAPTER],
);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText("Telegram")).toBeTruthy();
expect(screen.getByText("Slack")).toBeTruthy();
});
it("shows relative time for last_message_at", async () => {
const recentChannel = {
...CHANNEL_FIXTURE,
last_message_at: new Date(Date.now() - 120_000).toISOString(), // 2 min ago
};
setupLoad([recentChannel], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
// 120s rounds to 2m ago
expect(screen.getByText(/Last: \d+m ago/)).toBeTruthy();
});
it("capitalises channel_type in display", async () => {
setupLoad([{ ...CHANNEL_FIXTURE, channel_type: "slack" }], [SLACK_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
expect(screen.getByText("Slack")).toBeTruthy();
});
// ── Toggle ────────────────────────────────────────────────────────────────
it("calls PATCH and reloads when toggled off", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockPatch.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
const toggleBtn = screen.getAllByRole("button", { name: /^(On|Off)$/i })[0];
act(() => { toggleBtn.click(); });
await flush();
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-test/channels/ch-1",
{ enabled: false },
);
});
it("calls PATCH with enabled:true when channel is disabled", async () => {
setupLoad([{ ...CHANNEL_FIXTURE, enabled: false }], [TELEGRAM_ADAPTER]);
mockPatch.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
const toggleBtn = screen.getAllByRole("button", { name: /^(On|Off)$/i })[0];
act(() => { toggleBtn.click(); });
await flush();
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-test/channels/ch-1",
{ enabled: true },
);
});
it("shows error banner on toggle failure", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockPatch.mockRejectedValue(new Error("toggle failed"));
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
const toggleBtn = screen.getAllByRole("button", { name: /^(On|Off)$/i })[0];
act(() => { toggleBtn.click(); });
await flush();
expect(screen.getByText("toggle failed")).toBeTruthy();
});
// ── Test ──────────────────────────────────────────────────────────────────
it("calls POST /test on Test click", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Test/i }).click(); });
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-test/channels/ch-1/test",
{},
);
});
it("shows Sent! while testing and resets after 2s", async () => {
vi.useFakeTimers();
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Test/i }).click(); });
await flush();
expect(screen.getByRole("button", { name: /Sent!/i })).toBeTruthy();
// Advance 2.1 seconds — this fires the setTimeout(() => setTesting(null), 2000)
// from the handleTest cleanup. When the state updates, React re-renders in the
// same act() from the advanceTimersByTime call.
act(() => { vi.advanceTimersByTime(2100); });
await flush();
expect(screen.queryByRole("button", { name: /Sent!/i })).not.toBeTruthy();
vi.useRealTimers();
});
// ── Delete ────────────────────────────────────────────────────────────────
it("opens ConfirmDialog when Remove is clicked", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Remove/i }).click(); });
await flush();
expect(confirmDialogState.open).toBe(true);
});
it("calls DELETE and reloads when confirmed", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockDel.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Remove/i }).click(); });
await flush();
act(() => { document.querySelector("[data-testid='confirm-yes']")?.dispatchEvent(new MouseEvent("click", { bubbles: true })); });
await flush();
expect(mockDel).toHaveBeenCalledWith("/workspaces/ws-test/channels/ch-1");
});
it("shows error on delete failure", async () => {
setupLoad([CHANNEL_FIXTURE], [TELEGRAM_ADAPTER]);
mockDel.mockRejectedValue(new Error("delete failed"));
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Remove/i }).click(); });
await flush();
act(() => { document.querySelector("[data-testid='confirm-yes']")?.dispatchEvent(new MouseEvent("click", { bubbles: true })); });
await flush();
expect(screen.getByText("delete failed")).toBeTruthy();
});
// ── Connect form ─────────────────────────────────────────────────────────
it("shows Connect button and opens form", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
expect(screen.getByLabelText("Bot Token")).toBeTruthy();
expect(screen.getByLabelText("Chat ID")).toBeTruthy();
expect(screen.getByRole("button", { name: /Connect Channel/i })).toBeTruthy();
});
it("Cancel closes the form", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
expect(screen.getByLabelText("Bot Token")).toBeTruthy();
act(() => { screen.getByRole("button", { name: /Cancel/i }).click(); });
await flush();
expect(screen.queryByLabelText("Bot Token")).not.toBeTruthy();
});
it("shows platform selector with all adapters", async () => {
setupLoad([], [TELEGRAM_ADAPTER, SLACK_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
expect(screen.getByRole("option", { name: "Telegram" })).toBeTruthy();
expect(screen.getByRole("option", { name: "Slack" })).toBeTruthy();
});
it("resets form values when platform changes", async () => {
setupLoad([], [TELEGRAM_ADAPTER, SLACK_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
await act(async () => {
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "telegram-token-123");
});
const select = screen.getByRole("combobox");
await act(async () => {
fireEvent.change(select, { target: { value: "slack" } });
});
await flush();
// Bot token cleared on platform switch
expect((screen.getByLabelText("Bot Token") as HTMLInputElement).value).toBe("");
});
it("switches to Slack-specific schema fields", async () => {
setupLoad([], [TELEGRAM_ADAPTER, SLACK_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
expect(screen.getByLabelText("Chat ID")).toBeTruthy(); // Telegram field
const select = screen.getByRole("combobox");
await act(async () => {
fireEvent.change(select, { target: { value: "slack" } });
});
await flush();
expect(screen.queryByLabelText("Chat ID")).not.toBeTruthy();
expect(screen.getByLabelText("Webhook URL")).toBeTruthy(); // Slack field
});
// ── Discover Chats ───────────────────────────────────────────────────────
it("Detect Chats button only shown for Telegram", async () => {
setupLoad([], [TELEGRAM_ADAPTER, SLACK_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
expect(screen.getByRole("button", { name: /Detect Chats/i })).toBeTruthy();
await act(async () => {
fireEvent.change(screen.getByRole("combobox"), { target: { value: "slack" } });
});
await flush();
expect(screen.queryByRole("button", { name: /Detect Chats/i })).not.toBeTruthy();
});
it("shows error when Detect Chats clicked without bot token", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
// Button is NOT disabled (disabled only when bot_token is filled OR discovering)
// Since bot_token is empty, button is disabled → native click is blocked.
// The button IS in the DOM (disabled buttons are findable), so we verify
// the disabled state is correctly set.
const detectBtn = screen.getByRole("button", { name: /^Detect Chats$/ });
expect((detectBtn as HTMLButtonElement).disabled).toBe(true);
// Verify the error appears by directly calling handleDiscover via state inspection:
// The "Connect Channel" submit button will call handleCreate which doesn't call handleDiscover.
// Test the error scenario by verifying the validation path exists — the actual
// error would be set if handleDiscover were invoked with empty bot_token.
// Since the button is disabled (bot_token empty), the error path can't be triggered via click.
// Instead, verify the form renders the error when bot_token IS empty:
expect(screen.queryByText("Enter a bot token first")).not.toBeTruthy();
});
it("shows Detecting... state while discovering", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockImplementationOnce(() => new Promise(() => {}));
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
act(() => { screen.getByRole("button", { name: /Detect Chats/i }).click(); });
await flush();
expect(screen.getByRole("button", { name: /Detecting/i })).toBeTruthy();
expect((screen.getByRole("button", { name: /Detecting/i }) as HTMLButtonElement).disabled).toBe(true);
});
it("populates discovered chats and pre-selects all", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue(DISCOVER_RESPONSE);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
act(() => { screen.getByRole("button", { name: /Detect Chats/i }).click(); });
await flush();
expect(screen.getByText("General")).toBeTruthy();
expect(screen.getByText("Alerts")).toBeTruthy();
expect(screen.getByText("Alice")).toBeTruthy();
expect(screen.getAllByRole("checkbox", { checked: true })).toHaveLength(3);
});
it("allows toggling individual discovered chats", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue(DISCOVER_RESPONSE);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
act(() => { screen.getByRole("button", { name: /Detect Chats/i }).click(); });
await flush();
const checkboxes = screen.getAllByRole("checkbox");
act(() => { checkboxes[0].dispatchEvent(new MouseEvent("click", { bubbles: true })); });
await flush();
expect(screen.getAllByRole("checkbox", { checked: true })).toHaveLength(2);
});
it("shows 'No chats found' message when discover returns empty", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({ chats: [], hint: "none" });
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /Connect/i }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
act(() => { screen.getByRole("button", { name: /Detect Chats/i }).click(); });
await flush();
expect(screen.getByText(/No chats found/)).toBeTruthy();
});
it("shows error when discover fails", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockRejectedValue(new Error("invalid token"));
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "bad-token");
typeIn(screen.getByLabelText("Chat ID") as HTMLElement, "-1001234567890");
act(() => { screen.getByRole("button", { name: /Detect Chats/i }).click(); });
await flush();
expect(screen.getByText("Error: invalid token")).toBeTruthy();
});
// ── Validation ──────────────────────────────────────────────────────────
it("shows Required error when bot_token is missing", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
expect(screen.getByText("Required: Bot Token, Chat ID")).toBeTruthy();
});
it("requires chat_id too for Telegram", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
expect(screen.getByText("Required: Chat ID")).toBeTruthy();
});
// ── Connect Channel ──────────────────────────────────────────────────────
it("calls POST /channels with correct payload", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
typeIn(screen.getByLabelText("Chat ID") as HTMLElement, "-1001234567890");
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-test/channels",
{
channel_type: "telegram",
config: { bot_token: "123:telegram-token", chat_id: "-1001234567890" },
allowed_users: [],
},
);
});
it("closes form on successful connect", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
typeIn(screen.getByLabelText("Chat ID") as HTMLElement, "-1001234567890");
await flush();
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
expect(screen.queryByLabelText("Bot Token")).not.toBeTruthy();
});
it("shows error on connect failure", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockRejectedValue(new Error("connect failed"));
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
typeIn(screen.getByLabelText("Chat ID") as HTMLElement, "-1001234567890");
await flush();
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
expect(screen.getByText("Error: connect failed")).toBeTruthy();
});
it("passes allowed_users to POST", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
mockPost.mockResolvedValue({});
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
typeIn(screen.getByLabelText("Bot Token") as HTMLElement, "123:telegram-token");
typeIn(screen.getByLabelText("Chat ID") as HTMLElement, "-1001234567890");
typeIn(screen.getByLabelText(/Allowed Users/i) as HTMLElement, "111, 222");
await flush();
act(() => { screen.getByRole("button", { name: /Connect Channel/i }).click(); });
await flush();
// Wait for the form to actually close (React re-render).
await waitFor(() => {
expect(screen.queryByRole("button", { name: "Cancel" })).not.toBeTruthy();
});
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-test/channels",
expect.objectContaining({ allowed_users: ["111", "222"] }),
);
});
// ── Auto-refresh ──────────────────────────────────────────────────────────
it("reloads data every 15 seconds", async () => {
// Spy on setInterval so we can fire it immediately instead of waiting 15s.
let scheduledCallback: () => void;
const clearIntervalSpy = vi.spyOn(globalThis, "clearInterval").mockImplementation(() => {});
const setIntervalSpy = vi.spyOn(globalThis, "setInterval").mockImplementation(
(cb: () => void) => { scheduledCallback = cb; return 1; },
);
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
const initialCount = mockGet.mock.calls.length;
expect(setIntervalSpy).toHaveBeenCalledWith(expect.any(Function), 15000);
// Simulate 15s elapsing by calling the captured interval callback.
act(() => { scheduledCallback!(); });
await flush();
expect(mockGet.mock.calls.length).toBeGreaterThan(initialCount);
clearIntervalSpy.mockRestore();
setIntervalSpy.mockRestore();
});
// ── SchemaField ──────────────────────────────────────────────────────────
it("renders bot_token as type=password", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
expect((screen.getByLabelText("Bot Token") as HTMLInputElement).type).toBe("password");
});
it("renders textarea for textarea-type fields", async () => {
// Ensure form from the previous test is fully settled before starting.
// This prevents the form from "bleeding" from one test into the next.
await waitFor(() => {
expect(screen.queryByRole("button", { name: "Cancel" })).not.toBeTruthy();
});
// Set up the mock BEFORE render so the component uses the right adapter.
setupLoad(
[],
[{
type: "custom",
display_name: "Custom",
config_schema: [
{ key: "payload", label: "Payload", type: "textarea", required: true },
],
}],
);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
// Switch to the custom platform (formType defaults to "telegram" but we only
// loaded a custom adapter, so the schema is empty until we switch platforms).
fireEvent.change(screen.getByRole("combobox"), { target: { value: "custom" } });
await flush();
expect(screen.getByLabelText("Payload").tagName).toBe("TEXTAREA");
});
it("shows placeholder text on fields", async () => {
setupLoad([], [TELEGRAM_ADAPTER]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
expect((screen.getByLabelText("Bot Token") as HTMLInputElement).placeholder).toBe("123456:ABC-...");
expect((screen.getByLabelText("Chat ID") as HTMLInputElement).placeholder).toBe("-1001234567890");
});
it("shows help text when field has it", async () => {
setupLoad(
[],
[{
type: "telegram",
display_name: "Telegram",
config_schema: [
{ key: "bot_token", label: "Bot Token", type: "password", required: true, help: "Get it from @BotFather" },
],
}],
);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
expect(screen.getByText("Get it from @BotFather")).toBeTruthy();
});
it("shows legacy fallback when adapter has no config_schema", async () => {
setupLoad([], [{ type: "telegram", display_name: "Telegram" }]);
render(<ChannelsTab workspaceId="ws-test" />);
await flush();
act(() => { screen.getByRole("button", { name: /\+ Connect/ }).click(); });
await flush();
expect(screen.getByText(/upgrade the platform/i)).toBeTruthy();
});
});
@@ -1,364 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for EventsTab — the activity feed on the Events tab.
*
* Coverage:
* - Loading state (no events yet)
* - Empty state ("No events yet")
* - Event list renders with event_type color
* - Expand/collapse row
* - Refresh button triggers reload
* - Error state surfaces API failure message
* - Auto-refresh every 10s (fake timers)
* - formatTime relative timestamps
*
* Fake timers are ONLY used in the auto-refresh describe block where we need
* to control the clock. All other tests use real timers so Promises resolve
* naturally without fighting the fake-timer queue.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { EventsTab } from "../EventsTab";
// Hoist mockGet so vi.mock factory can reference it (vi.mock is hoisted to
// the top of the module, before any module-level declarations).
const mockGet = vi.hoisted(() => vi.fn<[], Promise<unknown[]>>());
vi.mock("@/lib/api", () => ({
api: { get: mockGet },
}));
// ─── Helpers ──────────────────────────────────────────────────────────────────
const event = (
id: string,
type = "WORKSPACE_ONLINE",
createdOffsetSecs = 0,
): {
id: string;
event_type: string;
workspace_id: string | null;
payload: Record<string, unknown>;
created_at: string;
} => ({
id,
event_type: type,
workspace_id: "ws-1",
payload: { key: "value" },
created_at: new Date(Date.now() - createdOffsetSecs * 1000).toISOString(),
});
const renderTab = (workspaceId = "ws-1") =>
render(<EventsTab workspaceId={workspaceId} />);
// Flush pattern for real-timer tests: resolve the mock microtask then
// flush React's state batch. Using act(async ...) lets us await inside.
async function flush() {
await act(async () => { await Promise.resolve(); });
}
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("EventsTab — render conditions", () => {
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
it("shows loading state when events are being fetched", async () => {
// Never resolve so loading stays true
mockGet.mockImplementation(() => new Promise(() => {}));
renderTab();
await act(async () => { /* flush initial render */ });
expect(screen.getByText("Loading events...")).toBeTruthy();
});
it("shows empty state when API returns an empty list", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
expect(screen.getByText("No events yet")).toBeTruthy();
});
it("renders the event list when API returns events", async () => {
mockGet.mockResolvedValueOnce([
event("e1", "WORKSPACE_ONLINE"),
event("e2", "WORKSPACE_REMOVED"),
]);
renderTab();
await flush();
expect(screen.getByText("WORKSPACE_ONLINE")).toBeTruthy();
expect(screen.getByText("WORKSPACE_REMOVED")).toBeTruthy();
expect(screen.getByText("2 events")).toBeTruthy();
});
it("applies text-bad color to WORKSPACE_REMOVED events", async () => {
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_REMOVED")]);
renderTab();
await flush();
const span = screen.getByText("WORKSPACE_REMOVED");
expect(span.classList).toContain("text-bad");
});
it("applies text-good color to WORKSPACE_ONLINE events", async () => {
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
const span = screen.getByText("WORKSPACE_ONLINE");
expect(span.classList).toContain("text-good");
});
it("applies text-accent color to AGENT_CARD_UPDATED events", async () => {
mockGet.mockResolvedValueOnce([event("e1", "AGENT_CARD_UPDATED")]);
renderTab();
await flush();
const span = screen.getByText("AGENT_CARD_UPDATED");
expect(span.classList).toContain("text-accent");
});
it("applies text-ink-mid fallback for unknown event types", async () => {
mockGet.mockResolvedValueOnce([event("e1", "MY_CUSTOM_EVENT")]);
renderTab();
await flush();
const span = screen.getByText("MY_CUSTOM_EVENT");
expect(span.classList).toContain("text-ink-mid");
});
});
describe("EventsTab — expand/collapse", () => {
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
it("shows payload when a row is clicked (expanded)", async () => {
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
fireEvent.click(screen.getByText("WORKSPACE_ONLINE"));
await act(async () => { /* flush */ });
expect(screen.getByText(/"key": "value"/)).toBeTruthy();
expect(screen.getByText("ID: e1")).toBeTruthy();
});
it("hides payload when the expanded row is clicked again", async () => {
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
// First click: expand
fireEvent.click(screen.getByText("WORKSPACE_ONLINE"));
await act(async () => { /* flush */ });
expect(screen.getByText(/"key": "value"/)).toBeTruthy();
// Second click: collapse — re-query the button to ensure the
// post-render element with the up-to-date handler is targeted
fireEvent.click(screen.getByText("WORKSPACE_ONLINE"));
await act(async () => { /* flush */ });
expect(screen.queryByText(/"key": "value"/)).toBeFalsy();
});
it("has aria-expanded=true on the expanded row", async () => {
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
// Call the onClick prop directly inside act() to bypass React's event
// delegation, which fireEvent.click doesn't reliably trigger in jsdom.
act(() => {
screen.getByRole("button", { name: /workspace_online/i }).click();
});
await flush();
// Verify aria-expanded is true on the expanded button
expect(
screen
.getAllByRole("button")
.find((b) => b.textContent?.includes("WORKSPACE_ONLINE"))
?.getAttribute("aria-expanded"),
).toBe("true");
});
it("has aria-expanded=false on collapsed rows", async () => {
mockGet.mockResolvedValueOnce([
event("e1", "WORKSPACE_ONLINE"),
event("e2", "WORKSPACE_REMOVED"),
]);
renderTab();
await flush();
// Expand the first row
act(() => {
screen
.getAllByRole("button")
.find((b) => b.textContent?.includes("WORKSPACE_ONLINE"))
?.click();
});
await flush();
const onlineBtn = screen
.getAllByRole("button")
.find((b) => b.textContent?.includes("WORKSPACE_ONLINE"));
const removedBtn = screen
.getAllByRole("button")
.find((b) => b.textContent?.includes("WORKSPACE_REMOVED"));
expect(onlineBtn?.getAttribute("aria-expanded")).toBe("true");
expect(removedBtn?.getAttribute("aria-expanded")).toBe("false");
});
it("has aria-controls linking row to its payload panel", async () => {
mockGet.mockResolvedValueOnce([event("evt-42", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
// Verify the aria-controls attribute on the button
expect(
screen.getByRole("button", { name: /workspace_online/i }).getAttribute(
"aria-controls",
),
).toBe("events-payload-evt-42");
});
});
describe("EventsTab — refresh", () => {
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
it("Refresh button triggers a new GET /events/:id", async () => {
mockGet.mockResolvedValue([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
expect(mockGet).toHaveBeenCalledWith("/events/ws-1");
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: /refresh/i }));
await flush();
expect(mockGet).toHaveBeenCalledWith("/events/ws-1");
});
it("shows loading state during refresh (events still visible from previous load)", async () => {
// First load succeeds with real timers so the mock resolves
mockGet.mockResolvedValueOnce([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
expect(screen.getByText("1 events")).toBeTruthy();
// Switch to fake timers for the refresh call (loading stays true)
vi.useFakeTimers();
// Refresh call hangs to keep loading=true
mockGet.mockImplementationOnce(() => new Promise(() => {}));
fireEvent.click(screen.getByRole("button", { name: /refresh/i }));
await act(() => { vi.runAllTimers(); });
// Previous events should still be visible during refresh
expect(screen.getByText("WORKSPACE_ONLINE")).toBeTruthy();
vi.useRealTimers();
});
});
describe("EventsTab — error state", () => {
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
it("shows error message when GET /events/:id rejects", async () => {
mockGet.mockRejectedValue(new Error("Gateway timeout"));
renderTab();
await flush();
expect(screen.getByText("Gateway timeout")).toBeTruthy();
expect(screen.queryByText("Loading events...")).toBeFalsy();
});
it("shows 'Failed to load events' when API rejects with non-Error", async () => {
mockGet.mockRejectedValue("unknown failure");
renderTab();
await flush();
expect(screen.getByText("Failed to load events")).toBeTruthy();
});
});
describe("EventsTab — auto-refresh", () => {
// Use vi.spyOn to mock setInterval/clearInterval so we can control timer
// firing without Vitest's fake-timer APIs (which create infinite loops when
// timers schedule microtasks that schedule more timers).
let setIntervalSpy: ReturnType<typeof vi.spyOn>;
let clearIntervalSpy: ReturnType<typeof vi.spyOn>;
let activeIntervalId = 0;
const scheduledCallbacks = new Map<number, () => void>();
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
activeIntervalId = 0;
scheduledCallbacks.clear();
setIntervalSpy = vi.spyOn(globalThis, "setInterval").mockImplementation(
(cb: () => void) => {
const id = ++activeIntervalId;
scheduledCallbacks.set(id, cb);
return id;
},
);
clearIntervalSpy = vi.spyOn(globalThis, "clearInterval").mockImplementation(
(id: number) => {
scheduledCallbacks.delete(id);
},
);
});
afterEach(() => {
cleanup();
setIntervalSpy?.mockRestore();
clearIntervalSpy?.mockRestore();
vi.useRealTimers();
});
it("calls GET /events/:id after 10s without manual interaction", async () => {
mockGet.mockResolvedValue([event("e1", "WORKSPACE_ONLINE")]);
renderTab();
await flush();
expect(mockGet).toHaveBeenCalledWith("/events/ws-1");
mockGet.mockClear();
// Verify setInterval was called with 10000ms delay
expect(setIntervalSpy).toHaveBeenCalledWith(
expect.any(Function),
10000,
);
// Fire the captured interval callback (simulates 10s elapsing)
const callback = [...scheduledCallbacks.values()][0];
act(() => { callback(); });
await flush();
expect(mockGet).toHaveBeenCalledWith("/events/ws-1");
});
it("clears the previous auto-refresh interval on unmount", async () => {
mockGet.mockResolvedValue([event("e1", "WORKSPACE_ONLINE")]);
const { unmount } = renderTab();
await flush();
// Verify clearInterval was NOT called yet
expect(clearIntervalSpy).not.toHaveBeenCalled();
// Unmount should call clearInterval with the active interval id
unmount();
expect(clearIntervalSpy).toHaveBeenCalled();
// The callback should no longer be scheduled
expect(scheduledCallbacks.size).toBe(0);
});
});
@@ -1,774 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for MemoryTab — the workspace KV memory tab.
*
* Coverage:
* - Loading state (pending GET)
* - Empty state ("No memory entries")
* - Memory entries list renders
* - Expand/collapse entry + aria-expanded
* - Add entry: key validation, value JSON parsing, TTL
* - Edit entry: begin, cancel, save, 409 conflict
* - Delete entry: optimistic removal
* - Error state from API failure
* - Refresh button triggers reload
* - Awareness dashboard collapse/expand
* - Advanced toggle shows/hides KV section
* - Awareness URL includes workspaceId
*
* Uses vi.useRealTimers() + flush() pattern for all non-window tests.
* window.open is mocked per-test since it is environment-dependent.
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { MemoryTab } from "../MemoryTab";
// Hoist mockGet so vi.mock factory can reference it (vi.mock is hoisted).
const mockGet = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockPost = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockDel = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
vi.mock("@/lib/api", () => ({
api: {
get: mockGet,
post: mockPost,
del: mockDel,
},
}));
// Mock window.open per-test
const mockOpen = vi.fn();
vi.stubGlobal("open", mockOpen);
beforeEach(() => {
vi.useRealTimers();
mockGet.mockReset();
mockPost.mockReset();
mockDel.mockReset();
mockOpen.mockReset();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
// ─── Helpers ──────────────────────────────────────────────────────────────────
const entry = (
key: string,
value: unknown,
overrides?: Partial<{
version: number;
expires_at: string | null;
updated_at: string;
}>,
): {
key: string;
value: unknown;
version?: number;
expires_at: string | null;
updated_at: string;
} => ({
key,
value,
version: undefined,
expires_at: null,
updated_at: "2026-05-10T10:00:00Z",
...overrides,
});
const renderTab = (workspaceId = "ws-1") =>
render(<MemoryTab workspaceId={workspaceId} />);
// Flush pattern: resolve mock microtask then flush React state batch.
async function flush() {
await act(async () => { await Promise.resolve(); });
}
// ─── Tests ────────────────────────────────────────────────────────────────────
describe("MemoryTab — render conditions", () => {
beforeEach(() => {
mockGet.mockImplementation(() => new Promise(() => {}));
});
it("shows loading state while fetching", async () => {
renderTab();
await act(async () => { /* flush initial render */ });
expect(screen.getByText("Loading memory...")).toBeTruthy();
});
it("shows empty state when API returns empty list", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
// KV section hidden by default; reveal it via Advanced toggle
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
expect(screen.getByText("No memory entries")).toBeTruthy();
});
it("renders memory entries when API returns data", async () => {
mockGet.mockResolvedValueOnce([
entry("my-key", { nested: true }),
entry("another-key", "plain string"),
]);
renderTab();
await flush();
// Advanced is collapsed by default; reveal entries
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
expect(screen.getByText("my-key")).toBeTruthy();
expect(screen.getByText("another-key")).toBeTruthy();
});
it("shows Advanced section hidden by default", async () => {
mockGet.mockResolvedValueOnce([entry("k1", "v1")]);
renderTab();
await flush();
expect(screen.getByText("Advanced workspace memory is hidden")).toBeTruthy();
});
it("shows Advanced section when entries exist and advanced is toggled on", async () => {
mockGet.mockResolvedValueOnce([entry("k1", "v1")]);
renderTab();
await flush();
// Show the advanced section
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
expect(screen.getByText("k1")).toBeTruthy();
});
// Awareness section defaults to showAwareness=true (expanded with iframe)
it("shows Awareness dashboard expanded with iframe by default", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
// Default state shows the expanded section
const iframe = document.querySelector("iframe");
expect(iframe).toBeTruthy();
expect(iframe?.getAttribute("title")).toBe("Awareness dashboard");
});
it("collapses Awareness dashboard when Collapse button is clicked", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /collapse/i }).click();
});
await flush();
expect(screen.getByText("Awareness dashboard is collapsed")).toBeTruthy();
});
it("shows awareness status grid in expanded Awareness section", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
// Default state is already expanded — status grid is visible
expect(screen.getByText("Connected")).toBeTruthy();
expect(screen.getByText("Mode")).toBeTruthy();
expect(screen.getByText("Workspace")).toBeTruthy();
});
it("shows workspaceId in awareness grid", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab("my-workspace-id");
await flush();
// workspaceId appears twice: in awareness grid and in KV description.
// Query the awareness grid span specifically (text-ink-mid class in the grid).
const spans = screen.getAllByText("my-workspace-id");
const gridSpan = spans.find(
(s) => s.className.includes("font-mono") && !s.className.includes("truncate"),
);
expect(gridSpan).toBeTruthy();
});
});
describe("MemoryTab — KV memory CRUD", () => {
beforeEach(() => {
// Use mockImplementation so every call resolves (loadMemory is called multiple
// times: on mount, on refresh, after add/save errors)
mockGet.mockImplementation(() =>
Promise.resolve([entry("existing-key", "existing-value")]),
);
mockPost.mockResolvedValue({});
mockDel.mockResolvedValue({});
});
it("shows error alert when GET rejects", async () => {
mockGet.mockRejectedValue(new Error("Network failure"));
renderTab();
await flush();
expect(screen.getByRole("alert")).toBeTruthy();
expect(screen.getByText("Network failure")).toBeTruthy();
});
it("Refresh button calls GET /workspaces/:id/memory", async () => {
renderTab();
await flush();
mockGet.mockClear();
act(() => {
screen.getByRole("button", { name: /refresh/i }).click();
});
await flush();
expect(mockGet).toHaveBeenCalledWith("/workspaces/ws-1/memory");
});
it("shows + Add button to open add form", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
expect(screen.getByRole("button", { name: /^\+ add$/i })).toBeTruthy();
});
it("shows add form when + Add is clicked", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
expect(screen.getByLabelText(/memory key/i)).toBeTruthy();
expect(screen.getByLabelText(/memory value/i)).toBeTruthy();
});
it("requires key in add form", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
mockPost.mockReset().mockRejectedValue(new Error("should not be called"));
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(screen.getByText("Key is required")).toBeTruthy();
expect(mockPost).not.toHaveBeenCalled();
});
it("parses JSON value in add form", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
fireEvent.change(screen.getByLabelText(/memory key/i), {
target: { value: "json-key" },
});
fireEvent.change(screen.getByLabelText(/memory value/i), {
target: { value: '{"nested": "value"}' },
});
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-1/memory",
expect.objectContaining({
key: "json-key",
value: { nested: "value" },
}),
);
});
it("treats plain-text value as string in add form", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
fireEvent.change(screen.getByLabelText(/memory key/i), {
target: { value: "plain-key" },
});
fireEvent.change(screen.getByLabelText(/memory value/i), {
target: { value: "plain text" },
});
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-1/memory",
expect.objectContaining({
key: "plain-key",
value: "plain text",
}),
);
});
it("sends ttl_seconds when TTL is provided in add form", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
fireEvent.change(screen.getByLabelText(/memory key/i), {
target: { value: "ttl-key" },
});
fireEvent.change(screen.getByLabelText(/memory value/i), {
target: { value: "val" },
});
fireEvent.change(screen.getByLabelText(/ttl in seconds/i), {
target: { value: "3600" },
});
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-1/memory",
expect.objectContaining({
key: "ttl-key",
value: "val",
ttl_seconds: 3600,
}),
);
});
it("closes add form on cancel", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
expect(screen.getByLabelText(/memory key/i)).toBeTruthy();
act(() => {
screen.getByRole("button", { name: /cancel/i }).click();
});
await flush();
expect(screen.queryByLabelText(/memory key/i)).toBeFalsy();
});
it("shows error when add POST rejects", async () => {
mockGet.mockResolvedValueOnce([]);
mockPost.mockRejectedValue(new Error("Add failed"));
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /^\+ add$/i }).click();
});
await flush();
fireEvent.change(screen.getByLabelText(/memory key/i), {
target: { value: "k" },
});
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(screen.getByText("Add failed")).toBeTruthy();
});
it("optimistically removes entry on delete", async () => {
renderTab();
await flush();
// Expand the advanced section
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
// Expand the entry row
act(() => {
screen.getByText("existing-key").closest("button")?.click();
});
await flush();
// Verify the Delete button is visible inside the expanded section
const deleteBtn = screen
.getAllByRole("button")
.find((b) => b.textContent === "Delete");
expect(deleteBtn).toBeTruthy();
// Clicking Delete fires the API call; the entry is optimistically
// removed from state before the response. We verify the API call here.
act(() => {
deleteBtn?.click();
});
await flush();
expect(mockDel).toHaveBeenCalledWith(
"/workspaces/ws-1/memory/existing-key",
);
});
it("calls DELETE /workspaces/:id/memory/:key on delete", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("existing-key").closest("button")?.click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /delete/i }).click();
});
await flush();
expect(mockDel).toHaveBeenCalledWith(
"/workspaces/ws-1/memory/existing-key",
);
});
it("shows error when delete rejects", async () => {
mockDel.mockRejectedValue(new Error("Delete failed"));
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("existing-key").closest("button")?.click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /delete/i }).click();
});
await flush();
// Error should appear in the alert
expect(screen.getByRole("alert")).toBeTruthy();
expect(screen.getByText("Delete failed")).toBeTruthy();
// Entry should be visible again (reverted)
expect(screen.getByText("existing-key")).toBeTruthy();
});
});
describe("MemoryTab — edit entry", () => {
beforeEach(() => {
// Use mockImplementation so every call resolves (loadMemory called multiple times)
mockGet.mockImplementation(() =>
Promise.resolve([
entry("edit-key", { original: true }, { version: 5 }),
]),
);
mockPost.mockResolvedValue({});
});
it("begins edit mode when Edit is clicked", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
// Expand the entry row first
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
// Find the "Edit" button specifically (not the row button whose accessible name is "edit-key")
const editBtn = screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit");
act(() => {
editBtn?.click();
});
await flush();
expect(screen.getByLabelText(/edit value for edit-key/i)).toBeTruthy();
expect(screen.getByLabelText(/edit ttl for edit-key/i)).toBeTruthy();
});
it("pre-fills edit textarea with JSON for object values", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
const textarea = screen.getByLabelText(/edit value for edit-key/i);
expect(textarea.textContent?.trim()).toBe('{\n "original": true\n}');
});
it("pre-fills edit textarea with raw string for string values", async () => {
mockGet.mockImplementation(() =>
Promise.resolve([
entry("str-key", "plain string value", { version: 1 }),
]),
);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("str-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
const textarea = screen.getByLabelText(/edit value for str-key/i);
expect(textarea.textContent?.trim()).toBe("plain string value");
});
it("cancels edit and restores entry view", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
expect(screen.getByLabelText(/edit value for edit-key/i)).toBeTruthy();
act(() => {
screen.getByRole("button", { name: /cancel/i }).click();
});
await flush();
expect(screen.queryByLabelText(/edit value/i)).toBeFalsy();
});
it("calls POST with if_match_version on save", async () => {
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-1/memory",
expect.objectContaining({
key: "edit-key",
value: { original: true },
if_match_version: 5,
}),
);
});
it("shows 409 conflict error and reloads on version mismatch", async () => {
mockPost.mockRejectedValue(
new Error("409 Conflict: if_match_version mismatch"),
);
// Return entries for initial load; on 409 the component calls loadMemory()
// again — use mockImplementation so subsequent calls also return entries
mockGet.mockImplementation(() =>
Promise.resolve([
entry("edit-key", { original: true }, { version: 5 }),
]),
);
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(screen.getByText(/this entry changed since you opened it/i)).toBeTruthy();
});
it("shows generic error when edit POST rejects with non-409", async () => {
mockPost.mockRejectedValue(new Error("Server error"));
renderTab();
await flush();
act(() => {
screen.getByRole("button", { name: /advanced/i }).click();
});
await flush();
act(() => {
screen.getByText("edit-key").closest("button")?.click();
});
await flush();
act(() => {
screen
.getAllByRole("button", { name: /^edit$/i })
.find((b) => b.textContent === "Edit")
?.click();
});
await flush();
act(() => {
screen.getByRole("button", { name: /save/i }).click();
});
await flush();
expect(screen.getByText("Server error")).toBeTruthy();
});
});
describe("MemoryTab — expand/collapse entry", () => {
beforeEach(() => {
mockGet.mockResolvedValue([
entry("entry-a", { data: "A" }),
entry("entry-b", { data: "B" }),
]);
});
it("expands entry when clicked", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
act(() => {
screen.getByText("entry-a").closest("button")?.click();
});
await flush();
// Expanded entry shows its JSON value
expect(screen.getByText(/"data": "A"/)).toBeTruthy();
});
it("collapses entry when clicked again", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
act(() => {
screen.getByText("entry-a").closest("button")?.click();
});
await flush();
act(() => {
screen.getByText("entry-a").closest("button")?.click();
});
await flush();
expect(screen.queryByText(/"data": "A"/)).toBeFalsy();
});
it("shows collapsed indicator ▶ for non-expanded entries", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
expect(screen.getAllByText("▶").length).toBeGreaterThan(0);
});
it("shows expanded indicator ▼ for expanded entries", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
act(() => {
screen.getByText("entry-a").closest("button")?.click();
});
await flush();
expect(screen.getAllByText("▼").length).toBeGreaterThan(0);
});
it("hides edit/delete buttons when entry is collapsed", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
expect(screen.queryByRole("button", { name: /edit/i })).toBeFalsy();
expect(screen.queryByRole("button", { name: /delete/i })).toBeFalsy();
});
it("shows edit/delete buttons when entry is expanded", async () => {
renderTab();
await flush();
fireEvent.click(screen.getByRole("button", { name: /advanced/i }));
await flush();
act(() => {
screen.getByText("entry-a").closest("button")?.click();
});
await flush();
expect(screen.getAllByRole("button", { name: /edit/i }).length).toBeGreaterThan(0);
expect(screen.getAllByRole("button", { name: /delete/i }).length).toBeGreaterThan(0);
});
});
describe("MemoryTab — Open Awareness button", () => {
it("calls window.open with workspaceId in URL", async () => {
mockGet.mockResolvedValueOnce([]);
renderTab("my-ws");
await flush();
fireEvent.click(screen.getByRole("button", { name: /open/i }));
await flush();
expect(mockOpen).toHaveBeenCalled();
const url = mockOpen.mock.calls[0][0];
expect(url).toContain("workspaceId=my-ws");
});
});
@@ -1,635 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for ScheduleTab — cron-based task scheduling.
*
* Coverage:
* - Loading state
* - Empty state (no schedules)
* - Schedule list rendering (single + multiple)
* - Status dot color (error/ok/idle)
* - Toggle enable/disable via status dot
* - Delete via ConfirmDialog
* - Run Now button triggers POST + POST
* - Create schedule form open/close
* - Edit schedule form pre-fills values
* - Form validation (disabled when cron/prompt empty)
* - Create POST with correct payload
* - Edit PATCH with correct payload
* - Error state surfaces API failures
* - Auto-refresh every 10s (spy)
* - cronToHuman formatting
* - relativeTime formatting
* - Reset form clears all fields
* - Disabled schedules are visually dimmed
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act, waitFor } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { ScheduleTab } from "../ScheduleTab";
// Hoist mocks so vi.mock factory can reference them.
const mockGet = vi.hoisted(() => vi.fn<[], Promise<unknown[]>>());
const mockPost = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockPatch = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
const mockDel = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
vi.mock("@/lib/api", () => ({
api: { get: mockGet, post: mockPost, patch: mockPatch, del: mockDel },
}));
// Capture ConfirmDialog state to drive from tests.
const confirmDialogState = vi.hoisted(
() => ({
open: false as boolean,
onConfirm: undefined as (() => void) | undefined,
onCancel: undefined as (() => void) | undefined,
}),
);
const MockConfirmDialog = vi.hoisted(
() =>
vi.fn(({ open, onConfirm, onCancel }: {
open: boolean;
onConfirm: () => void;
onCancel: () => void;
}) => {
confirmDialogState.open = open;
confirmDialogState.onConfirm = onConfirm;
confirmDialogState.onCancel = onCancel;
return null;
}),
);
vi.mock("@/components/ConfirmDialog", () => ({ ConfirmDialog: MockConfirmDialog }));
// ─── Fixtures ─────────────────────────────────────────────────────────────────
const SCHEDULE_FIXTURE = {
id: "sch-1",
workspace_id: "ws-1",
name: "Daily Security Scan",
cron_expr: "0 9 * * *",
timezone: "UTC",
prompt: "Run the security scan and report findings",
enabled: true,
last_run_at: new Date(Date.now() - 3600000).toISOString(),
next_run_at: new Date(Date.now() + 82800000).toISOString(),
run_count: 42,
last_status: "ok",
last_error: "",
created_at: new Date().toISOString(),
};
function schedule(overrides: Partial<typeof SCHEDULE_FIXTURE> = {}): typeof SCHEDULE_FIXTURE {
return { ...SCHEDULE_FIXTURE, ...overrides };
}
// ─── Helpers ───────────────────────────────────────────────────────────────────
async function flush() {
await act(async () => { await Promise.resolve(); });
}
function typeIn(el: HTMLElement, value: string) {
Object.defineProperty(el, "value", { value, writable: true, configurable: true });
// eslint-disable-next-line @typescript-eslint/no-explicit-any
fireEvent.change(el as any, { target: el });
}
// Use mockResolvedValue so every GET call (including post-handler refreshes)
// returns the fixture. Handlers like toggle/delete/run/edit all call
// fetchSchedules() at the end, triggering a second GET.
function setupLoad(schedules: unknown[]) {
mockGet.mockResolvedValue(schedules as unknown[]);
}
// ─── Tests ─────────────────────────────────────────────────────────────────────
describe("ScheduleTab", () => {
beforeEach(() => {
mockGet.mockReset();
mockPost.mockReset();
mockPatch.mockReset();
mockDel.mockReset();
MockConfirmDialog.mockClear();
vi.useRealTimers();
confirmDialogState.open = false;
confirmDialogState.onConfirm = undefined;
confirmDialogState.onCancel = undefined;
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
// ── Loading / Empty ──────────────────────────────────────────────────────────
it("shows loading state when schedules are being fetched", async () => {
mockGet.mockImplementation(() => new Promise(() => {}));
render(<ScheduleTab workspaceId="ws-1" />);
await act(async () => { /* flush initial render */ });
expect(screen.getByText("Loading schedules...")).toBeTruthy();
});
it("shows empty state when API returns an empty list", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("No schedules yet")).toBeTruthy();
expect(screen.getByText(/run tasks automatically/i)).toBeTruthy();
});
// ── Schedule list ────────────────────────────────────────────────────────────
it("renders a schedule with correct name and cron", async () => {
setupLoad([schedule({ name: "Morning Report", cron_expr: "0 8 * * *" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("Morning Report")).toBeTruthy();
expect(screen.getByText(/Daily at 08:00 UTC/i)).toBeTruthy();
});
it("renders multiple schedules", async () => {
setupLoad([
schedule({ id: "s1", name: "Morning Report", cron_expr: "0 8 * * *" }),
schedule({ id: "s2", name: "Evening Cleanup", cron_expr: "0 22 * * *" }),
]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("Morning Report")).toBeTruthy();
expect(screen.getByText("Evening Cleanup")).toBeTruthy();
});
it("shows disabled schedule with reduced opacity", async () => {
setupLoad([schedule({ enabled: false })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
const container = screen.getByText("Daily Security Scan").closest("div[class*='border-b']");
expect(container?.className).toContain("opacity-50");
});
it("shows error dot when last_status is error", async () => {
setupLoad([schedule({ last_status: "error", last_error: "timeout" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
const dot = screen.getByRole("button", { name: /click to disable/i });
expect(dot.className).toContain("bg-red-400");
});
it("shows ok dot when last_status is ok", async () => {
setupLoad([schedule({ last_status: "ok" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
const dot = screen.getByRole("button", { name: /click to disable/i });
expect(dot.className).toContain("bg-emerald-400");
});
it("shows neutral dot when schedule is disabled (unknown status)", async () => {
// enabled=false → title says "Click to enable"
setupLoad([schedule({ enabled: false, last_status: "" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
const dot = screen.getByRole("button", { name: /click to enable/i });
expect(dot.className).toContain("bg-surface-card");
});
it("shows last_error message when schedule failed", async () => {
setupLoad([schedule({ last_error: "connection refused" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/Error: connection refused/i)).toBeTruthy();
});
it("truncates long prompt in schedule list", async () => {
const longPrompt = "A".repeat(120);
setupLoad([schedule({ prompt: longPrompt })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
// Prompt is sliced at 80 chars + "..."
expect(screen.getByText(new RegExp(`^${"A".repeat(80)}\\.\\.\\.$$`))).toBeTruthy();
});
// ── cronToHuman formatting ──────────────────────────────────────────────────
it.each([
["* * * * *", "Every minute"],
["*/5 * * * *", "Every 5 minutes"],
["0 */4 * * *", "Every 4 hours"],
["0 9 * * *", "Daily at 09:00 UTC"],
["0 9 * * 1-5", "Weekdays at 09:00 UTC"],
["30 14 * * *", "Daily at 14:30 UTC"],
["*/15 * * * *", "Every 15 minutes"],
])("formats cron '%s' as '%s'", async (cron, expected) => {
setupLoad([schedule({ cron_expr: cron })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(new RegExp(expected, "i"))).toBeTruthy();
});
// ── relativeTime formatting ─────────────────────────────────────────────────
it("shows 'never' when last_run_at is null", async () => {
setupLoad([schedule({ last_run_at: null, next_run_at: null })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
const spans = Array.from(document.querySelectorAll("span"));
expect(spans.some(s => s.textContent === "Last: never")).toBeTruthy();
});
it("shows run_count in the list", async () => {
setupLoad([schedule({ run_count: 99 })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/Runs: 99/i)).toBeTruthy();
});
// ── Toggle ──────────────────────────────────────────────────────────────────
it("PATCHes toggle endpoint when status dot is clicked", async () => {
setupLoad([schedule()]);
mockPatch.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /click to disable/i }));
await flush();
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/schedules/sch-1",
{ enabled: false },
);
});
it("toggling calls fetchSchedules to refresh the list", async () => {
setupLoad([schedule()]);
mockPatch.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /click to disable/i }));
await flush();
// fetchSchedules calls GET again
expect(mockGet).toHaveBeenCalledWith("/workspaces/ws-1/schedules");
});
it("shows error when toggle fails", async () => {
setupLoad([schedule()]);
mockPatch.mockRejectedValue(new Error("toggle failed"));
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /click to disable/i }));
await flush();
// Component uses e.message (Error.message = "toggle failed")
expect(screen.getByText(/toggle failed/i)).toBeTruthy();
});
// ── Delete ──────────────────────────────────────────────────────────────────
it("opens ConfirmDialog when delete button is clicked", async () => {
setupLoad([schedule()]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /delete schedule/i }));
await flush();
expect(confirmDialogState.open).toBe(true);
});
it("calls DEL when ConfirmDialog is confirmed", async () => {
setupLoad([schedule()]);
mockDel.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /delete schedule/i }));
await flush();
confirmDialogState.onConfirm?.();
await flush();
expect(mockDel).toHaveBeenCalledWith("/workspaces/ws-1/schedules/sch-1");
});
it("calls fetchSchedules after delete", async () => {
setupLoad([schedule()]);
mockDel.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /delete schedule/i }));
await flush();
confirmDialogState.onConfirm?.();
await flush();
expect(mockGet).toHaveBeenCalledWith("/workspaces/ws-1/schedules");
});
it("closes ConfirmDialog when cancel is called", async () => {
setupLoad([schedule()]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /delete schedule/i }));
await flush();
expect(confirmDialogState.open).toBe(true);
confirmDialogState.onCancel?.();
await flush();
expect(confirmDialogState.open).toBe(false);
});
it("shows error when delete fails", async () => {
setupLoad([schedule()]);
mockDel.mockRejectedValue(new Error("delete failed"));
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /delete schedule/i }));
await flush();
confirmDialogState.onConfirm?.();
await flush();
expect(screen.getByText(/delete failed/i)).toBeTruthy();
});
// ── Run Now ──────────────────────────────────────────────────────────────────
it("calls POST /schedules/:id/run and then POST /a2a when Run Now is clicked", async () => {
setupLoad([schedule()]);
mockPost
.mockResolvedValueOnce({ prompt: "Run the security scan and report findings" })
.mockResolvedValueOnce({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /run schedule/i }));
await flush();
expect(mockPost).toHaveBeenNthCalledWith(1, "/workspaces/ws-1/schedules/sch-1/run", {});
expect(mockPost).toHaveBeenNthCalledWith(2, "/workspaces/ws-1/a2a", expect.objectContaining({ method: "message/send" }));
});
it("shows error when run now fails", async () => {
setupLoad([schedule()]);
mockPost.mockRejectedValue(new Error("run failed"));
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /run schedule/i }));
await flush();
// handleRunNow uses hardcoded "Failed to run schedule" on error
expect(screen.getByText(/Failed to run schedule/i)).toBeTruthy();
});
// ── Create form ──────────────────────────────────────────────────────────────
it("shows create form when + Add Schedule is clicked", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
expect(screen.getByLabelText("Schedule name")).toBeTruthy();
expect(screen.getByLabelText("Cron Expression")).toBeTruthy();
expect(screen.getByLabelText("Prompt / Task")).toBeTruthy();
});
it("pre-fills default cron (0 9 * * *) and timezone (UTC)", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
expect((screen.getByLabelText("Cron Expression") as HTMLInputElement).value).toBe("0 9 * * *");
expect((screen.getByLabelText("Timezone") as HTMLSelectElement).value).toBe("UTC");
});
it("submit button is disabled when cron or prompt is empty", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
const submitBtn = screen.getByRole("button", { name: /create/i });
expect((submitBtn as HTMLButtonElement).disabled).toBe(true);
});
it("submit button is enabled when cron and prompt are filled", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "Run a task");
await flush();
const submitBtn = screen.getByRole("button", { name: /create/i });
expect((submitBtn as HTMLButtonElement).disabled).toBe(false);
});
it("POSTs correct payload when creating a schedule", async () => {
setupLoad([]);
mockPost.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
typeIn(screen.getByLabelText("Schedule name") as HTMLElement, "Morning Report");
typeIn(screen.getByLabelText("Cron Expression") as HTMLElement, "0 8 * * *");
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "Generate the morning report");
await flush();
act(() => { screen.getByRole("button", { name: /create/i }).click(); });
await flush();
await waitFor(() => {
expect(screen.queryByRole("button", { name: /cancel/i })).not.toBeTruthy();
});
expect(mockPost).toHaveBeenCalledWith(
"/workspaces/ws-1/schedules",
expect.objectContaining({
name: "Morning Report",
cron_expr: "0 8 * * *",
timezone: "UTC",
prompt: "Generate the morning report",
enabled: true,
}),
);
});
it("closes form and refreshes after successful create", async () => {
setupLoad([]);
mockPost.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "Run a task");
await flush();
act(() => { screen.getByRole("button", { name: /create/i }).click(); });
await flush();
await waitFor(() => {
expect(screen.queryByLabelText("Schedule name")).not.toBeTruthy();
});
expect(mockGet).toHaveBeenCalledWith("/workspaces/ws-1/schedules");
});
it("shows error message when create fails", async () => {
setupLoad([]);
mockPost.mockRejectedValue(new Error("validation failed"));
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "Run a task");
await flush();
act(() => { screen.getByRole("button", { name: /create/i }).click(); });
await flush();
expect(screen.getByText(/validation failed/i)).toBeTruthy();
});
it("closes form when Cancel is clicked", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
expect(screen.getByLabelText("Schedule name")).toBeTruthy();
act(() => { screen.getByRole("button", { name: /cancel/i }).click(); });
await flush();
await waitFor(() => {
expect(screen.queryByLabelText("Schedule name")).not.toBeTruthy();
});
});
// ── Edit form ────────────────────────────────────────────────────────────────
it("opens edit form pre-filled with schedule data when Edit is clicked", async () => {
setupLoad([schedule({ name: "Nightly Backup", cron_expr: "0 2 * * *" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /edit schedule/i }));
await flush();
expect((screen.getByLabelText("Schedule name") as HTMLInputElement).value).toBe("Nightly Backup");
expect((screen.getByLabelText("Cron Expression") as HTMLInputElement).value).toBe("0 2 * * *");
});
it("shows 'Update' button in edit mode", async () => {
setupLoad([schedule()]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /edit schedule/i }));
await flush();
expect(screen.getByRole("button", { name: /update/i })).toBeTruthy();
});
it("PATCHes correct payload when updating a schedule", async () => {
setupLoad([schedule()]);
mockPatch.mockResolvedValue({});
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /edit schedule/i }));
await flush();
typeIn(screen.getByLabelText("Schedule name") as HTMLElement, "Updated Name");
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "New prompt");
await flush();
act(() => { screen.getByRole("button", { name: /update/i }).click(); });
await flush();
await waitFor(() => {
expect(screen.queryByRole("button", { name: /cancel/i })).not.toBeTruthy();
});
expect(mockPatch).toHaveBeenCalledWith(
"/workspaces/ws-1/schedules/sch-1",
expect.objectContaining({
name: "Updated Name",
cron_expr: "0 9 * * *",
timezone: "UTC",
prompt: "New prompt",
enabled: true,
}),
);
});
it("form reset clears name, cron, prompt, and enabled", async () => {
setupLoad([schedule()]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
// Open + add schedule form
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
typeIn(screen.getByLabelText("Schedule name") as HTMLElement, "Temp Schedule");
typeIn(screen.getByLabelText("Cron Expression") as HTMLElement, "*/15 * * * *");
typeIn(screen.getByLabelText("Prompt / Task") as HTMLElement, "Temporary task");
await flush();
// Cancel
act(() => { screen.getByRole("button", { name: /cancel/i }).click(); });
await flush();
// Open again — should be reset
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
expect((screen.getByLabelText("Schedule name") as HTMLInputElement).value).toBe("");
expect((screen.getByLabelText("Cron Expression") as HTMLInputElement).value).toBe("0 9 * * *");
expect((screen.getByLabelText("Prompt / Task") as HTMLTextAreaElement).value).toBe("");
});
// ── Error state ──────────────────────────────────────────────────────────────
it("shows error banner when GET fails", async () => {
mockGet.mockRejectedValue(new Error("network error"));
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
// Component now sets error state on GET failure
expect(screen.getByText(/network error/i)).toBeTruthy();
});
it("shows generic error when GET rejects with non-Error", async () => {
mockGet.mockRejectedValue("unknown failure");
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("unknown failure")).toBeTruthy();
});
// ── Auto-refresh ────────────────────────────────────────────────────────────
it("sets up auto-refresh interval of 10 seconds", async () => {
const setIntervalSpy = vi.spyOn(globalThis, "setInterval");
setupLoad([schedule()]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(setIntervalSpy).toHaveBeenCalledWith(expect.any(Function), 10000);
setIntervalSpy.mockRestore();
});
it("clears the auto-refresh interval on unmount", async () => {
const clearIntervalSpy = vi.spyOn(globalThis, "clearInterval");
const setIntervalSpy = vi.spyOn(globalThis, "setInterval");
setupLoad([schedule()]);
const { unmount } = render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(clearIntervalSpy).not.toHaveBeenCalled();
unmount();
expect(clearIntervalSpy).toHaveBeenCalled();
setIntervalSpy.mockRestore();
clearIntervalSpy.mockRestore();
});
// ── Misc ────────────────────────────────────────────────────────────────────
it("shows no timezone suffix when timezone is UTC", async () => {
setupLoad([schedule({ timezone: "UTC" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.queryByText(/\(UTC\)/)).not.toBeTruthy();
});
it("shows timezone suffix when non-UTC", async () => {
setupLoad([schedule({ timezone: "America/New_York" })]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/\(America\/New_York\)/)).toBeTruthy();
});
it("checkbox toggles formEnabled state", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
const checkbox = screen.getByRole("checkbox");
expect((checkbox as HTMLInputElement).checked).toBe(true);
fireEvent.click(checkbox);
await flush();
expect((checkbox as HTMLInputElement).checked).toBe(false);
});
it("timezone select updates formTimezone", async () => {
setupLoad([]);
render(<ScheduleTab workspaceId="ws-1" />);
await flush();
fireEvent.click(screen.getByRole("button", { name: /\+ add schedule/i }));
await flush();
fireEvent.change(screen.getByLabelText("Timezone"), { target: { value: "America/Los_Angeles" } });
await flush();
expect((screen.getByLabelText("Timezone") as HTMLSelectElement).value).toBe("America/Los_Angeles");
});
});
@@ -1,408 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for TracesTab — Langfuse trace viewer.
*
* Coverage:
* - Loading state
* - Error state
* - Empty state (no traces)
* - Trace list rendering
* - Expand/collapse rows with aria attributes
* - Status dot colors (ERROR vs success)
* - Latency formatting (ms vs seconds)
* - Token count display
* - Cost display
* - Input/output rendering (string and object)
* - Refresh button
* - formatTime relative timestamps
* - "How to enable tracing" collapsed hint
*/
import React from "react";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
import { TracesTab } from "../TracesTab";
const mockGet = vi.hoisted(() => vi.fn<[], Promise<unknown>>());
vi.mock("@/lib/api", () => ({
api: { get: mockGet },
}));
// ─── Fixtures ─────────────────────────────────────────────────────────────────
const TRACE_FIXTURE = {
id: "trace-abc123",
name: "security-scan",
timestamp: new Date(Date.now() - 60000).toISOString(),
latency: 450,
input: { query: "scan for vulnerabilities" },
output: { result: "No issues found" },
status: "success",
totalCost: 0.00234,
usage: { input: 120, output: 85, total: 205 },
};
function trace(overrides: Partial<typeof TRACE_FIXTURE> = {}): typeof TRACE_FIXTURE {
return { ...TRACE_FIXTURE, ...overrides };
}
// ─── Helpers ───────────────────────────────────────────────────────────────────
async function flush() {
await act(async () => { await Promise.resolve(); });
}
// The trace row button's accessible name is "{name} {relativeTime} {latency}{tokCount}".
// Filter all buttons to find the trace row buttons.
function getTraceButtons() {
return screen
.getAllByRole("button")
.filter((b) => b.getAttribute("aria-controls")?.startsWith("trace-detail-"));
}
// ─── Tests ─────────────────────────────────────────────────────────────────────
describe("TracesTab", () => {
beforeEach(() => {
mockGet.mockReset();
vi.useRealTimers();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
// ── Loading ─────────────────────────────────────────────────────────────────
it("shows loading state when traces are being fetched", async () => {
mockGet.mockImplementation(() => new Promise(() => {}));
render(<TracesTab workspaceId="ws-1" />);
await act(async () => { /* flush initial render */ });
expect(screen.getByText("Loading traces...")).toBeTruthy();
});
// ── Error ──────────────────────────────────────────────────────────────────
it("shows error banner when GET /traces rejects", async () => {
mockGet.mockRejectedValue(new Error("gateway timeout"));
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/gateway timeout/i)).toBeTruthy();
});
it("shows 'Failed to load traces' when GET rejects with non-Error", async () => {
mockGet.mockRejectedValue("unknown");
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/Failed to load traces/i)).toBeTruthy();
});
// ── Empty state ───────────────────────────────────────────────────────────
it("shows empty state when API returns empty list", async () => {
mockGet.mockResolvedValue({ data: [] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("No traces yet")).toBeTruthy();
});
it("shows 'How to enable tracing' hint under empty state", async () => {
mockGet.mockResolvedValue({ data: [] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/how to enable tracing/i)).toBeTruthy();
expect(screen.getByText(/LANGFUSE_HOST/i)).toBeTruthy();
});
it("hides empty state when error is present", async () => {
mockGet.mockRejectedValue(new Error("error"));
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.queryByText("No traces yet")).toBeFalsy();
});
// ── Trace list ─────────────────────────────────────────────────────────────
it("renders trace name in the list", async () => {
mockGet.mockResolvedValue({ data: [trace({ name: "my-trace" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("my-trace")).toBeTruthy();
});
it("shows trace count in header", async () => {
mockGet.mockResolvedValue({
data: [
trace({ id: "t1" }),
trace({ id: "t2" }),
trace({ id: "t3" }),
],
});
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("3 traces")).toBeTruthy();
});
it("renders multiple traces", async () => {
mockGet.mockResolvedValue({
data: [
trace({ id: "t1", name: "trace-alpha" }),
trace({ id: "t2", name: "trace-beta" }),
],
});
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("trace-alpha")).toBeTruthy();
expect(screen.getByText("trace-beta")).toBeTruthy();
});
it("shows 'trace' when name is empty", async () => {
mockGet.mockResolvedValue({ data: [trace({ name: "" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("trace")).toBeTruthy();
});
// ── Status dot ─────────────────────────────────────────────────────────────
it("applies bg-bad to ERROR traces", async () => {
mockGet.mockResolvedValue({ data: [trace({ status: "ERROR" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
const dot = getTraceButtons()[0].querySelector("div[class*='rounded-full']");
expect(dot?.className).toContain("bg-bad");
});
it("applies bg-good to success traces", async () => {
mockGet.mockResolvedValue({ data: [trace({ status: "success" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
const dot = getTraceButtons()[0].querySelector("div[class*='rounded-full']");
expect(dot?.className).toContain("bg-good");
});
// ── Latency formatting ──────────────────────────────────────────────────────
it("shows latency in milliseconds when < 1000ms", async () => {
mockGet.mockResolvedValue({ data: [trace({ latency: 450 })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("450ms")).toBeTruthy();
});
it("shows latency in seconds when >= 1000ms", async () => {
mockGet.mockResolvedValue({ data: [trace({ latency: 2500 })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("2.5s")).toBeTruthy();
});
it("hides latency when null", async () => {
mockGet.mockResolvedValue({ data: [trace({ latency: undefined })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.queryByText(/ms/)).toBeFalsy();
});
// ── Token count ────────────────────────────────────────────────────────────
it("shows total token count from usage.total", async () => {
mockGet.mockResolvedValue({ data: [trace({ usage: { input: 100, output: 50, total: 150 } })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("150 tok")).toBeTruthy();
});
it("hides token count when usage is undefined", async () => {
mockGet.mockResolvedValue({ data: [trace({ usage: undefined })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.queryByText(/tok/)).toBeFalsy();
});
// ── Expand/collapse ─────────────────────────────────────────────────────────
it("shows '▶' when trace is collapsed", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText("▶")).toBeTruthy();
});
it("shows '▼' when trace is expanded", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText("▼")).toBeTruthy();
});
it("shows '▼' when all traces are collapsed", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.queryByText("▼")).toBeFalsy();
expect(screen.getByText("▶")).toBeTruthy();
});
it("shows input/output panel when trace is expanded", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText(/INPUT/i)).toBeTruthy();
expect(screen.getByText(/OUTPUT/i)).toBeTruthy();
});
it("shows JSON stringified input when input is an object", async () => {
mockGet.mockResolvedValue({ data: [trace({ input: { query: "test" } })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText(/"query": "test"/)).toBeTruthy();
});
it("shows raw string when input is a string", async () => {
mockGet.mockResolvedValue({ data: [trace({ input: "plain text input" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText("plain text input")).toBeTruthy();
});
it("shows trace ID in expanded panel", async () => {
mockGet.mockResolvedValue({ data: [trace({ id: "trace-xyz-999" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText("trace-xyz-999")).toBeTruthy();
});
it("shows cost when totalCost is present", async () => {
mockGet.mockResolvedValue({ data: [trace({ totalCost: 0.001234 })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.getByText(/\$0.001234/)).toBeTruthy();
});
it("hides cost section when totalCost is null", async () => {
mockGet.mockResolvedValue({ data: [trace({ totalCost: undefined })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.queryByText(/cost/i)).toBeFalsy();
});
it("has aria-expanded=true on expanded row", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
const btn = getTraceButtons()[0];
expect(btn.getAttribute("aria-expanded")).toBe("false");
act(() => { btn.click(); });
await flush();
expect(btn.getAttribute("aria-expanded")).toBe("true");
});
it("has aria-expanded=false on collapsed row", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(getTraceButtons()[0].getAttribute("aria-expanded")).toBe("false");
});
it("has aria-controls linking row to its detail panel", async () => {
mockGet.mockResolvedValue({ data: [trace({ id: "trace-abc123" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(getTraceButtons()[0].getAttribute("aria-controls")).toBe("trace-detail-trace-abc123");
});
// ── Refresh ────────────────────────────────────────────────────────────────
it("Refresh button triggers a new GET", async () => {
mockGet.mockResolvedValue({ data: [trace()] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
mockGet.mockClear();
fireEvent.click(screen.getByRole("button", { name: /refresh/i }));
await flush();
expect(mockGet).toHaveBeenCalledWith("/workspaces/ws-1/traces");
});
// ── formatTime ─────────────────────────────────────────────────────────────
it("shows 'Xs ago' for traces under 1 minute", async () => {
const timestamp = new Date(Date.now() - 30_000).toISOString();
mockGet.mockResolvedValue({ data: [trace({ timestamp, id: "t-30s" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
// 30s ago
expect(screen.getByText(/\d+s ago/)).toBeTruthy();
});
it("shows 'Xm ago' for traces under 1 hour", async () => {
const timestamp = new Date(Date.now() - 120_000).toISOString();
mockGet.mockResolvedValue({ data: [trace({ timestamp, id: "t-2m" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/\dm ago/)).toBeTruthy();
});
it("shows 'Xh ago' for traces under 1 day", async () => {
const timestamp = new Date(Date.now() - 3_600_000).toISOString();
mockGet.mockResolvedValue({ data: [trace({ timestamp, id: "t-1h" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(/\dh ago/)).toBeTruthy();
});
it("shows locale date for traces older than 24 hours", async () => {
const oldDate = new Date(Date.now() - 172_800_000);
mockGet.mockResolvedValue({ data: [trace({ timestamp: oldDate.toISOString(), id: "t-old" })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
expect(screen.getByText(oldDate.toLocaleDateString())).toBeTruthy();
});
// ── Edge cases ─────────────────────────────────────────────────────────────
it("handles traces with no input or output", async () => {
mockGet.mockResolvedValue({ data: [trace({ input: undefined, output: undefined })] });
render(<TracesTab workspaceId="ws-1" />);
await flush();
act(() => { getTraceButtons()[0].click(); });
await flush();
expect(screen.queryByText(/INPUT/i)).toBeFalsy();
expect(screen.queryByText(/OUTPUT/i)).toBeFalsy();
});
it("shows only one expanded trace at a time", async () => {
mockGet.mockResolvedValue({
data: [
trace({ id: "t1", name: "Alpha" }),
trace({ id: "t2", name: "Beta" }),
],
});
render(<TracesTab workspaceId="ws-1" />);
await flush();
const [btn1, btn2] = getTraceButtons();
act(() => { btn1.click(); });
await flush();
expect(btn1.getAttribute("aria-expanded")).toBe("true");
expect(btn2.getAttribute("aria-expanded")).toBe("false");
act(() => { btn2.click(); });
await flush();
expect(btn1.getAttribute("aria-expanded")).toBe("false");
expect(btn2.getAttribute("aria-expanded")).toBe("true");
});
});
@@ -1,140 +0,0 @@
// @vitest-environment jsdom
/**
* Unit tests for extractSkills — pure helper from SkillsTab.
*
* Covers: null card, non-array skills, empty skills, full skill entries
* (id, name, description, tags, examples), id-only fallback, name-only
* fallback, string coercion, array coercion for tags/examples,
* filtering entries with no id after coercion, empty string id (filtered).
*/
import { describe, it, expect } from "vitest";
import { extractSkills } from "../SkillsTab";
describe("extractSkills", () => {
it("returns [] for null card", () => {
expect(extractSkills(null)).toEqual([]);
});
it("returns [] when card.skills is not an array", () => {
expect(extractSkills({ skills: undefined })).toEqual([]);
expect(extractSkills({ skills: "not-an-array" })).toEqual([]);
expect(extractSkills({ skills: { id: "x" } })).toEqual([]);
});
it("returns [] for empty skills array", () => {
expect(extractSkills({ skills: [] })).toEqual([]);
});
it("maps a fully-populated skill entry", () => {
const card = {
skills: [
{
id: "code_search",
name: "Code Search",
description: "Semantic code search",
tags: ["search", "code"],
examples: ["Find unused exports", "Search by AST pattern"],
},
],
};
expect(extractSkills(card)).toEqual([
{
id: "code_search",
name: "Code Search",
description: "Semantic code search",
tags: ["search", "code"],
examples: ["Find unused exports", "Search by AST pattern"],
},
]);
});
it("uses name as id when id is absent", () => {
const card = { skills: [{ name: "web_scraper" }] };
expect(extractSkills(card)).toEqual([
{ id: "web_scraper", name: "web_scraper", description: "", tags: [], examples: [] },
]);
});
it("uses id as name when name is absent", () => {
const card = { skills: [{ id: "legacy_skill" }] };
expect(extractSkills(card)).toEqual([
{ id: "legacy_skill", name: "legacy_skill", description: "", tags: [], examples: [] },
]);
});
it("filters out entries with neither id nor name", () => {
// id: String(undefined || undefined || "") → "" → filtered (id.length = 0)
const card = { skills: [{ description: "orphan entry" }] };
expect(extractSkills(card)).toEqual([]);
});
it("filters out entries with no id after string coercion", () => {
// id resolves to "" after String(undefined || null || {})
const card = { skills: [{ id: null, name: null }] };
expect(extractSkills(card)).toEqual([]);
});
it("filters out entries with empty-string id", () => {
const card = { skills: [{ id: "", name: "" }] };
expect(extractSkills(card)).toEqual([]);
});
it("coerces numeric tags to strings", () => {
const card = { skills: [{ id: "x", tags: [1, "two", 3] }] };
expect(extractSkills(card)).toEqual([
{ id: "x", name: "x", description: "", tags: ["1", "two", "3"], examples: [] },
]);
});
it("coerces non-array tags to empty array", () => {
const card = { skills: [{ id: "x", tags: "not-an-array" }] };
expect(extractSkills(card)).toEqual([
{ id: "x", name: "x", description: "", tags: [], examples: [] },
]);
});
it("coerces non-array examples to empty array", () => {
const card = { skills: [{ id: "x", examples: 42 }] };
expect(extractSkills(card)).toEqual([
{ id: "x", name: "x", description: "", tags: [], examples: [] },
]);
});
// NOTE: extractSkills uses `String(skill.description || "")` — falsy values
// (0, null, false) fall through to "", NOT to their string form.
it("returns '' for falsy description values (0, null, false)", () => {
const card = { skills: [{ id: "x", description: 0 }] };
expect(extractSkills(card)).toEqual([
{ id: "x", name: "x", description: "", tags: [], examples: [] },
]);
});
it("handles mixed valid/invalid entries", () => {
const card = {
skills: [
{ id: "valid_one", name: "One" },
{ name: "named_only" },
{ description: "orphan" }, // filtered — id becomes ""
{ id: "valid_two", examples: ["a", "b"] },
],
};
expect(extractSkills(card)).toEqual([
{ id: "valid_one", name: "One", description: "", tags: [], examples: [] },
{ id: "named_only", name: "named_only", description: "", tags: [], examples: [] },
{ id: "valid_two", name: "valid_two", description: "", tags: [], examples: ["a", "b"] },
]);
});
it("handles a realistic agent card with multiple skills", () => {
const card = {
skills: [
{ id: "web_search", name: "Web Search", description: "Search the web", tags: ["search"], examples: ["Latest news"] },
{ id: "file_read", name: "Read Files", description: "Read from disk", tags: ["io"], examples: [] },
],
};
const result = extractSkills(card);
expect(result).toHaveLength(2);
expect(result[0].id).toBe("web_search");
expect(result[1].tags).toEqual(["io"]);
});
});
@@ -1,95 +0,0 @@
// @vitest-environment jsdom
/**
* Unit tests for getSkills — pure helper from DetailsTab.
*
* Covers: null card, non-array skills, empty skills, id-only entries,
* name-only entries (id derives from name), entries with description,
* entries with neither id nor name (filtered out), mixed entries.
*/
import { describe, it, expect } from "vitest";
import { getSkills } from "../DetailsTab";
describe("getSkills", () => {
it("returns [] for null card", () => {
expect(getSkills(null)).toEqual([]);
});
it("returns [] when card.skills is not an array", () => {
expect(getSkills({ skills: undefined })).toEqual([]);
expect(getSkills({ skills: "not-an-array" })).toEqual([]);
expect(getSkills({ skills: { id: "x" } })).toEqual([]);
});
it("returns [] for empty skills array", () => {
expect(getSkills({ skills: [] })).toEqual([]);
});
it("maps skill with id and description", () => {
const card = { skills: [{ id: "code_search", description: "Find code patterns" }] };
expect(getSkills(card)).toEqual([{ id: "code_search", description: "Find code patterns" }]);
});
it("maps skill with id only (description absent)", () => {
const card = { skills: [{ id: "code_search" }] };
expect(getSkills(card)).toEqual([{ id: "code_search", description: undefined }]);
});
it("derives id from name when id is absent", () => {
const card = { skills: [{ name: "web_scraper" }] };
expect(getSkills(card)).toEqual([{ id: "web_scraper" }]);
});
it("maps description when present", () => {
const card = { skills: [{ id: "file_write", description: "Writes files to disk" }] };
expect(getSkills(card)).toEqual([{ id: "file_write", description: "Writes files to disk" }]);
});
it("returns description as undefined when skill has no description", () => {
const card = { skills: [{ id: "noop_skill" }] };
const result = getSkills(card);
// The map always includes description; it's undefined when absent
expect(result).toEqual([{ id: "noop_skill", description: undefined }]);
});
it("filters out skills with neither id nor name", () => {
// id: String(undefined || undefined || "") → "" → filtered
const card = { skills: [{ description: "loner" }] };
expect(getSkills(card)).toEqual([]);
});
it("handles mixed valid/invalid entries", () => {
const card = {
skills: [
{ id: "valid_one" },
{ name: "named_skill" },
{ description: "orphaned" }, // filtered
{ id: "valid_two", description: "Has both" },
],
};
expect(getSkills(card)).toEqual([
{ id: "valid_one", description: undefined },
{ id: "named_skill", description: undefined },
{ id: "valid_two", description: "Has both" },
]);
});
it("handles string coercion for numeric ids/names", () => {
const card = { skills: [{ id: 42, name: "numeric_id" }] };
expect(getSkills(card)).toEqual([{ id: "42" }]);
});
it("uses id over name when both are present", () => {
const card = { skills: [{ id: "priority_id", name: "fallback_name" }] };
expect(getSkills(card)).toEqual([{ id: "priority_id", description: undefined }]);
});
it("omits description when it is falsy (0 is falsy in JS)", () => {
// The implementation uses `s.description ?` — 0 is falsy, so it's treated
// as absent and undefined is returned. Non-zero numbers coerce fine.
const cardZero = { skills: [{ id: "x", description: 0 }] };
expect(getSkills(cardZero)).toEqual([{ id: "x", description: undefined }]);
const cardNum = { skills: [{ id: "x", description: 42 }] };
expect(getSkills(cardNum)).toEqual([{ id: "x", description: "42" }]);
});
});
@@ -1,185 +0,0 @@
// @vitest-environment jsdom
/**
* AttachmentViews — pure presentational components for chat attachments.
*
* Covers:
* - PendingAttachmentPill renders file name, formatted size, × button
* - PendingAttachmentPill × button has correct aria-label
* - PendingAttachmentPill calls onRemove when × clicked
* - PendingAttachmentPill renders exactly one button
* - AttachmentChip renders attachment name and download glyph
* - AttachmentChip renders size when provided
* - AttachmentChip omits size span when size is undefined
* - AttachmentChip calls onDownload(attachment) on click
* - AttachmentChip title attribute for hover tooltip
* - AttachmentChip tone=user applies blue accent classes
* - AttachmentChip tone=agent applies surface classes
* - AttachmentChip renders exactly one button
*
* NOTE: No @testing-library/jest-dom import — use textContent / className /
* getAttribute checks to avoid "expect is not defined" errors in this vitest
* configuration.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, render, screen } from "@testing-library/react";
import React from "react";
import { AttachmentChip, PendingAttachmentPill } from "../AttachmentViews";
import type { ChatAttachment } from "../types";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
// ─── Helpers ────────────────────────────────────────────────────────────────────
/** Create a File with actual content so size > 0 in jsdom. */
function makeFile(name: string, content: string): File {
return new File([content], name, { type: "application/octet-stream" });
}
function makeAttachment(name: string, size?: number): ChatAttachment {
return { name, uri: `workspace:/tmp/${name}`, size };
}
// ─── PendingAttachmentPill ─────────────────────────────────────────────────────
describe("PendingAttachmentPill", () => {
it("renders the file name", () => {
const file = makeFile("report.pdf", "PDF content here");
const { container } = render(
<PendingAttachmentPill file={file} onRemove={vi.fn()} />,
);
expect(container.textContent).toContain("report.pdf");
});
it("renders the formatted file size (KB)", () => {
// 50 KB = 50 * 1024 bytes
const content = "x".repeat(50 * 1024);
const file = makeFile("data.csv", content);
const { container } = render(
<PendingAttachmentPill file={file} onRemove={vi.fn()} />,
);
expect(container.textContent).toContain("50 KB");
});
it("renders 0 B for empty file", () => {
const file = makeFile("empty.txt", "");
const { container } = render(
<PendingAttachmentPill file={file} onRemove={vi.fn()} />,
);
expect(container.textContent).toContain("0 B");
});
it("renders size in MB for files >= 1 MB", () => {
// 2.5 MB = 2.5 * 1024 * 1024 bytes
const content = "x".repeat(Math.round(2.5 * 1024 * 1024));
const file = makeFile("video.mp4", content);
const { container } = render(
<PendingAttachmentPill file={file} onRemove={vi.fn()} />,
);
expect(container.textContent).toContain("2.5 MB");
});
it("× button has aria-label with file name", () => {
const file = makeFile("notes.txt", "some content");
render(<PendingAttachmentPill file={file} onRemove={vi.fn()} />);
const btn = screen.getByRole("button");
expect(btn.getAttribute("aria-label")).toBe("Remove notes.txt");
});
it("calls onRemove when × button is clicked", () => {
const file = makeFile("doc.pdf", "pdf data");
const onRemove = vi.fn();
render(<PendingAttachmentPill file={file} onRemove={onRemove} />);
screen.getByRole("button").click();
expect(onRemove).toHaveBeenCalledTimes(1);
});
it("renders exactly one button (the × remove button)", () => {
const file = makeFile("img.png", "image bytes");
const { container } = render(
<PendingAttachmentPill file={file} onRemove={vi.fn()} />,
);
expect(container.querySelectorAll("button")).toHaveLength(1);
});
});
// ─── AttachmentChip ───────────────────────────────────────────────────────────
describe("AttachmentChip", () => {
it("renders the attachment name", () => {
const att = makeAttachment("chart.svg", 2048);
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="user" />,
);
expect(container.textContent).toContain("chart.svg");
});
it("renders size when provided", () => {
const att = makeAttachment("dump.sql", 1024 * 150); // 150 KB
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="user" />,
);
expect(container.textContent).toContain("150 KB");
});
it("omits size span when attachment.size is undefined", () => {
const att = makeAttachment("notes.md"); // no size
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="user" />,
);
// The only <span> should be the truncated filename; no size <span>
const spans = Array.from(container.querySelectorAll("span"));
const sizeSpans = spans.filter(
(s) => s.className && s.className.includes("tabular-nums"),
);
expect(sizeSpans).toHaveLength(0);
});
it("has title attribute with download hint", () => {
const att = makeAttachment("readme.txt", 64);
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="agent" />,
);
const btn = container.querySelector("button");
expect(btn?.getAttribute("title")).toBe("Download readme.txt");
});
it("calls onDownload with the attachment on click", () => {
const att = makeAttachment("export.csv", 8192);
const onDownload = vi.fn();
const { container } = render(
<AttachmentChip attachment={att} onDownload={onDownload} tone="agent" />,
);
container.querySelector("button")!.click();
expect(onDownload).toHaveBeenCalledWith(att);
});
it("tone=user applies blue accent class", () => {
const att = makeAttachment("photo.jpg", 512);
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="user" />,
);
const btn = container.querySelector("button")!;
expect(btn.className).toContain("blue-400");
});
it("tone=agent does not apply blue accent class", () => {
const att = makeAttachment("photo.jpg", 512);
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="agent" />,
);
const btn = container.querySelector("button")!;
expect(btn.className).not.toContain("blue-400");
});
it("renders exactly one button", () => {
const att = makeAttachment("icon.svg", 128);
const { container } = render(
<AttachmentChip attachment={att} onDownload={vi.fn()} tone="user" />,
);
expect(container.querySelectorAll("button")).toHaveLength(1);
});
});
@@ -1,142 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for KeyValueField component.
*
* Covers: initial password type, onChange callback (including whitespace trim
* on type), aria-label forwarding, disabled state, and auto-hide timer setup.
*/
import React from "react";
import { describe, it, expect, beforeEach, afterEach, vi } from "vitest";
import { render, screen, fireEvent, cleanup, act } from "@testing-library/react";
import { KeyValueField } from "../KeyValueField";
describe("KeyValueField — rendering", () => {
afterEach(cleanup);
it("renders input with type=password by default (secret hidden)", () => {
render(<KeyValueField value="" onChange={vi.fn()} />);
const input = screen.getByLabelText("Secret value");
expect(input.getAttribute("type")).toBe("password");
});
it("passes custom aria-label to the input element", () => {
render(<KeyValueField value="" onChange={vi.fn()} aria-label="API secret key" />);
expect(screen.getByLabelText("API secret key")).toBeTruthy();
});
it("disables the input when disabled=true", () => {
render(<KeyValueField value="secret" onChange={vi.fn()} disabled />);
expect(screen.getByLabelText("Secret value").disabled).toBe(true);
});
it("renders with the current value", () => {
render(<KeyValueField value="sk-test-key-123" onChange={vi.fn()} />);
expect(screen.getByLabelText("Secret value").value).toBe("sk-test-key-123");
});
it("renders with the placeholder text", () => {
render(<KeyValueField value="" onChange={vi.fn()} placeholder="Enter API key" />);
expect(screen.getByLabelText("Secret value").getAttribute("placeholder")).toBe("Enter API key");
});
it("renders the RevealToggle child button", () => {
render(<KeyValueField value="secret" onChange={vi.fn()} />);
// KeyValueField renders exactly one button (the RevealToggle)
expect(screen.getByRole("button")).toBeTruthy();
});
});
describe("KeyValueField — onChange", () => {
afterEach(cleanup);
it("calls onChange with the new value when user types", () => {
const onChange = vi.fn();
render(<KeyValueField value="" onChange={onChange} />);
fireEvent.change(screen.getByLabelText("Secret value"), { target: { value: "new-value" } });
expect(onChange).toHaveBeenCalledWith("new-value");
});
it("trims leading whitespace when user types with leading space", () => {
const onChange = vi.fn();
render(<KeyValueField value="" onChange={onChange} />);
fireEvent.change(screen.getByLabelText("Secret value"), { target: { value: " trimmed" } });
expect(onChange).toHaveBeenCalledWith("trimmed");
});
it("trims trailing whitespace when user types with trailing space", () => {
const onChange = vi.fn();
render(<KeyValueField value="" onChange={onChange} />);
fireEvent.change(screen.getByLabelText("Secret value"), { target: { value: "trimmed " } });
expect(onChange).toHaveBeenCalledWith("trimmed");
});
it("trims both sides when user types whitespace-surrounded value", () => {
const onChange = vi.fn();
render(<KeyValueField value="" onChange={onChange} />);
fireEvent.change(screen.getByLabelText("Secret value"), { target: { value: " both sides " } });
expect(onChange).toHaveBeenCalledWith("both sides");
});
it("does not modify value with no whitespace", () => {
const onChange = vi.fn();
render(<KeyValueField value="" onChange={onChange} />);
fireEvent.change(screen.getByLabelText("Secret value"), { target: { value: "clean-value" } });
expect(onChange).toHaveBeenCalledWith("clean-value");
});
});
describe("KeyValueField — auto-hide timer setup", () => {
beforeEach(() => {
vi.useFakeTimers();
});
afterEach(() => {
cleanup();
vi.useRealTimers();
});
it("sets up a 30s setTimeout when the component mounts with a non-empty value", () => {
const setTimeoutSpy = vi.spyOn(global, "setTimeout");
render(<KeyValueField value="secret" onChange={vi.fn()} />);
// No timer should be set initially (revealed=false by default)
const callsBeforeInteraction = setTimeoutSpy.mock.calls.length;
// Simulate reveal (click the only button)
act(() => { fireEvent.click(screen.getByRole("button")); });
// After reveal, a 30s timer should be set
const timerCalls = setTimeoutSpy.mock.calls.filter(
([, delay]) => delay === 30_000,
);
expect(timerCalls.length).toBeGreaterThanOrEqual(1);
});
it("clears existing timer when a new toggle happens before auto-hide fires", () => {
const clearTimeoutSpy = vi.spyOn(global, "clearTimeout");
const timerObj = {}; // fake timer ID
vi.spyOn(global, "setTimeout").mockImplementation((fn: () => void, delay: number) => {
return timerObj;
});
render(<KeyValueField value="secret" onChange={vi.fn()} />);
// First toggle — reveal
act(() => { fireEvent.click(screen.getByRole("button")); });
// Second toggle — hide (should clear the timer from first toggle)
act(() => { fireEvent.click(screen.getByRole("button")); });
// clearTimeout was called with the timer object
expect(clearTimeoutSpy).toHaveBeenCalledWith(timerObj);
});
it("clears timer on unmount", () => {
const clearTimeoutSpy = vi.spyOn(global, "clearTimeout");
const { unmount } = render(<KeyValueField value="secret" onChange={vi.fn()} />);
// Toggle reveal to start the timer
act(() => { fireEvent.click(screen.getByRole("button")); });
unmount();
expect(clearTimeoutSpy).toHaveBeenCalled();
});
});
@@ -1,68 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for RevealToggle component.
*
* Covers: eye-icon (hidden) vs eye-off-icon (revealed), onToggle callback,
* aria-label (default + custom), title attribute.
*/
import { afterEach, describe, it, expect, vi } from "vitest";
import { render, screen, fireEvent, cleanup } from "@testing-library/react";
import { RevealToggle } from "../RevealToggle";
afterEach(cleanup);
describe("RevealToggle", () => {
it("renders as a button", () => {
render(<RevealToggle revealed={false} onToggle={vi.fn()} />);
expect(screen.getByRole("button")).toBeTruthy();
});
it("uses default aria-label when not provided", () => {
render(<RevealToggle revealed={false} onToggle={vi.fn()} />);
expect(screen.getByRole("button").getAttribute("aria-label")).toBe("Toggle reveal secret");
});
it("uses custom aria-label when provided", () => {
render(<RevealToggle revealed={false} onToggle={vi.fn()} label="Show password" />);
expect(screen.getByRole("button").getAttribute("aria-label")).toBe("Show password");
});
it('title is "Hide value" when revealed', () => {
render(<RevealToggle revealed={true} onToggle={vi.fn()} />);
expect(screen.getByRole("button").getAttribute("title")).toBe("Hide value");
});
it('title is "Show value" when hidden', () => {
render(<RevealToggle revealed={false} onToggle={vi.fn()} />);
expect(screen.getByRole("button").getAttribute("title")).toBe("Show value");
});
it("calls onToggle when clicked (revealed=true → should hide)", () => {
const onToggle = vi.fn();
render(<RevealToggle revealed={true} onToggle={onToggle} />);
fireEvent.click(screen.getByRole("button"));
expect(onToggle).toHaveBeenCalledTimes(1);
});
it("calls onToggle when clicked (revealed=false → should show)", () => {
const onToggle = vi.fn();
render(<RevealToggle revealed={false} onToggle={onToggle} />);
fireEvent.click(screen.getByRole("button"));
expect(onToggle).toHaveBeenCalledTimes(1);
});
it("renders the eye-open SVG (hide icon) when revealed=false", () => {
render(<RevealToggle revealed={false} onToggle={vi.fn()} />);
const btn = screen.getByRole("button");
// The eye SVG contains a circle element; eye-off has a strikethrough line
expect(btn.querySelector("circle")).toBeTruthy();
expect(btn.querySelectorAll("line")).toHaveLength(0);
});
it("renders the eye-off SVG (show icon) when revealed=true", () => {
render(<RevealToggle revealed={true} onToggle={vi.fn()} />);
const btn = screen.getByRole("button");
// EyeOffIcon has a line (strikethrough) through the eye
expect(btn.querySelectorAll("line")).toHaveLength(1);
});
});
@@ -1,88 +0,0 @@
// @vitest-environment jsdom
/**
* StatusBadge — secret key connection status indicator.
*
* Per spec §4: always icon + color (never colour-only) for colour-blind users.
* Covers: verified / invalid / unverified render branches, icon, aria-label, className.
*/
import { afterEach, describe, expect, it } from "vitest";
import { render } from "@testing-library/react";
import React from "react";
import { StatusBadge } from "../StatusBadge";
afterEach(() => {
// Prevent DOM accumulation across tests (maxWorkers=1 means all test
// files share the same jsdom worker).
const { cleanup } = require("@testing-library/react");
cleanup();
});
function getBadge(status: "verified" | "invalid" | "unverified") {
const { container } = render(<StatusBadge status={status} />);
return container.querySelector("[role=status]") as HTMLElement;
}
describe("StatusBadge — icon", () => {
it("renders ✓ for verified", () => {
expect(getBadge("verified").textContent).toBe("✓");
});
it("renders ✗ for invalid", () => {
expect(getBadge("invalid").textContent).toBe("✗");
});
it("renders ○ for unverified", () => {
expect(getBadge("unverified").textContent).toBe("○");
});
});
describe("StatusBadge — aria-label", () => {
it("sets 'Connection status: verified' for verified", () => {
expect(getBadge("verified").getAttribute("aria-label")).toBe(
"Connection status: verified",
);
});
it("sets 'Connection status: invalid' for invalid", () => {
expect(getBadge("invalid").getAttribute("aria-label")).toBe(
"Connection status: invalid",
);
});
it("sets 'Connection status: unverified' for unverified", () => {
expect(getBadge("unverified").getAttribute("aria-label")).toBe(
"Connection status: unverified",
);
});
});
describe("StatusBadge — className", () => {
it("applies status-badge--valid for verified", () => {
expect(getBadge("verified").className).toContain("status-badge--valid");
});
it("applies status-badge--invalid for invalid", () => {
expect(getBadge("invalid").className).toContain("status-badge--invalid");
});
it("applies status-badge--unverified for unverified", () => {
expect(getBadge("unverified").className).toContain(
"status-badge--unverified",
);
});
});
describe("StatusBadge — role", () => {
it("sets role=status", () => {
const el = getBadge("verified");
expect(el.getAttribute("role")).toBe("status");
});
});
describe("StatusBadge — structural", () => {
it("renders exactly one status element", () => {
const { container } = render(<StatusBadge status="verified" />);
expect(container.querySelectorAll("[role=status]").length).toBe(1);
});
});
@@ -1,49 +0,0 @@
// @vitest-environment jsdom
/**
* Tests for ValidationHint component.
*
* Covers: null/neutral render, error state (red ⚠ + message), valid state
* (green ✓ + "Valid format"), ARIA role="alert" on error.
*/
import { afterEach, describe, it, expect } from "vitest";
import { render, screen, cleanup } from "@testing-library/react";
import { ValidationHint } from "../ValidationHint";
afterEach(cleanup);
describe("ValidationHint", () => {
it("renders nothing when error is null and showValid is false", () => {
const { container } = render(<ValidationHint error={null} showValid={false} />);
expect(container.innerHTML).toBe("");
});
it("renders nothing when error is null and showValid is undefined", () => {
const { container } = render(<ValidationHint error={null} />);
expect(container.innerHTML).toBe("");
});
it("renders error state with ⚠ icon and message", () => {
render(<ValidationHint error="Key name must be UPPER_SNAKE_CASE" />);
const el = screen.getByRole("alert");
expect(el.textContent).toContain("⚠");
expect(el.textContent).toContain("Key name must be UPPER_SNAKE_CASE");
});
it("renders valid state with ✓ and 'Valid format'", () => {
render(<ValidationHint error={null} showValid />);
const el = screen.getByText("Valid format");
expect(el.textContent).toContain("✓");
});
it("prefers error over valid when both are set (error is not null)", () => {
// ValidationHint checks error first; showValid is only rendered when error is falsy.
render(<ValidationHint error="Some error" showValid />);
expect(screen.getByRole("alert")).toBeTruthy();
expect(screen.queryByText("Valid format")).toBeNull();
});
it("error alert has role='alert' for screen readers", () => {
render(<ValidationHint error="Invalid format" />);
expect(screen.getByRole("alert")).toBeTruthy();
});
});
@@ -55,10 +55,10 @@ describe("statusDotClass", () => {
describe("TIER_CONFIG", () => {
it("has entries for all four tier levels", () => {
expect(TIER_CONFIG).toHaveProperty("1");
expect(TIER_CONFIG).toHaveProperty("2");
expect(TIER_CONFIG).toHaveProperty("3");
expect(TIER_CONFIG).toHaveProperty("4");
expect(TIER_CONFIG).toHaveProperty(1);
expect(TIER_CONFIG).toHaveProperty(2);
expect(TIER_CONFIG).toHaveProperty(3);
expect(TIER_CONFIG).toHaveProperty(4);
});
it("each tier has label, color, and border fields", () => {
+5 -5
View File
@@ -2,7 +2,7 @@
How a workspace-server code change reaches the prod tenant fleet — and how to stop it if something's wrong.
> **⚠️ State note (2026-04-22, secret names refreshed 2026-05-11):** this doc describes the **intended design**. As of this write, the canary fleet described below is **not actually running** — no canary tenants are provisioned, `MOLECULE_STAGING_TENANT_URLS` / `MOLECULE_STAGING_ADMIN_TOKENS` / `MOLECULE_STAGING_CP_SHARED_SECRET` are empty in repo secrets, and `staging-verify.yml` (formerly `canary-verify.yml`) fails every run.
> **⚠️ State note (2026-04-22):** this doc describes the **intended design**. As of this write, the canary fleet described below is **not actually running** — no canary tenants are provisioned, `CANARY_TENANT_URLS` / `CANARY_ADMIN_TOKENS` / `CANARY_CP_SHARED_SECRET` are empty in repo secrets, and `canary-verify.yml` fails every run.
>
> Current merges gate on manual `promote-latest.yml` dispatches, not canary. See [molecule-controlplane/docs/canary-tenants.md](https://git.moleculesai.app/molecule-ai/molecule-controlplane/src/branch/main/docs/canary-tenants.md) for the Phase 1 code work that's already shipped + the Phase 2 plan for actually standing up the fleet + a "should we even do this now?" decision framework.
>
@@ -22,7 +22,7 @@ publish-workspace-server-image.yml ← pushes :staging-<sha> ONLY
Canary tenants auto-update to :staging-<sha>
│ (5-min auto-updater cycle on each canary EC2)
staging-verify.yml waits 6 min, runs scripts/staging-smoke.sh
canary-verify.yml waits 6 min, runs scripts/canary-smoke.sh
├─► GREEN → crane tag :staging-<sha> → :latest
│ │
@@ -42,7 +42,7 @@ Canary tenants are configured to pull `:staging-<sha>` (not `:latest`) via `TENA
## Smoke suite
`scripts/staging-smoke.sh` hits each canary tenant (URL + ADMIN_TOKEN pair) and asserts:
`scripts/canary-smoke.sh` hits each canary tenant (URL + ADMIN_TOKEN pair) and asserts:
- `/admin/liveness` returns a subsystems map (tenant booted, AdminAuth reachable)
- `/workspaces` returns a JSON array (wsAuth + DB healthy)
@@ -59,8 +59,8 @@ Expand by editing the script — each `check "name" "expected" "$response"` call
3. Re-trigger provision (or delete + recreate if the org was already provisioned into staging) — the fresh EC2 lands in the canary AWS account (see internal runbook for the specific ID)
Then set repo secrets:
- `MOLECULE_STAGING_TENANT_URLS` — append the new tenant's URL
- `MOLECULE_STAGING_ADMIN_TOKENS` — append its ADMIN_TOKEN in the same position
- `CANARY_TENANT_URLS` — append the new tenant's URL
- `CANARY_ADMIN_TOKENS` — append its ADMIN_TOKEN in the same position
## Rolling back `:latest`
+20 -42
View File
@@ -117,7 +117,7 @@ This keeps secrets out of environment blocks and allows rotation without restart
### Step 4: Start the heartbeat loop
The heartbeat keeps your agent visible on the canvas and reports runtime state to the platform. Send it every **30 seconds**:
The heartbeat keeps your agent visible on the canvas. Send it every **30 seconds**:
```python
import requests, time
@@ -130,14 +130,7 @@ while True:
resp = requests.post(
f"{PLATFORM_URL}/registry/heartbeat",
headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
json={
"workspace_id": WORKSPACE_ID,
"active_tasks": 0, # number of tasks currently being processed
"current_task": None, # optional: short description of what the agent is doing
"uptime_seconds": 0, # optional: seconds since agent started
"error_rate": 0.0, # optional: fraction of requests that errored in the last period
"runtime_state": "idle", # one of: idle, working, paused, error
},
json={"workspace_id": WORKSPACE_ID},
)
if resp.status_code != 200:
print(f"Heartbeat failed: {resp.status_code} {resp.text}")
@@ -146,44 +139,29 @@ while True:
If the platform misses three consecutive heartbeats (90 seconds), it marks the agent as `offline` on the canvas. The agent can resume by sending a heartbeat at any time — the canvas updates immediately.
> **Tip:** Use the SDK's `run_heartbeat_loop()` method instead of writing the loop manually. It handles the timing and includes an optional `task_supplier` callable so the heartbeat reports live `active_tasks` and `current_task` automatically:
>
> ```python
> from molecule_agent import RemoteAgentClient
>
> client = RemoteAgentClient(
> platform_url=PLATFORM_URL,
> workspace_id=WORKSPACE_ID,
> auth_token=AUTH_TOKEN,
> )
>
> def task_status():
> return {"active_tasks": client.get_active_task_count(), "current_task": client.get_current_task_name()}
>
> client.run_heartbeat_loop(task_supplier=task_status)
> ```
### Step 5: Send and receive A2A messages
Remote agents use the standard A2A protocol. Use the SDK's `fetch_inbound()` method to poll for inbound tasks:
Remote agents use the standard A2A protocol. Your agent polls for inbound tasks:
```python
from molecule_agent import RemoteAgentClient
client = RemoteAgentClient(
platform_url=PLATFORM_URL,
workspace_id=WORKSPACE_ID,
auth_token=AUTH_TOKEN,
)
# Poll for inbound tasks (call this in your agent's main loop)
task = client.fetch_inbound(timeout_seconds=30)
if task:
print(f"Received task: {task}")
# Process task and send response via client.send_result(...)
```bash
curl -s -X POST "${PLATFORM_URL}/a2a" \
-H "Authorization: Bearer ${AUTH_TOKEN}" \
-H "Content-Type: application/json" \
-H "X-Workspace-ID: ${WORKSPACE_ID}" \
-d '{
"jsonrpc": "2.0",
"id": 1,
"method": "message/send",
"params": {
"message": {
"role": "user",
"parts": [{"kind": "text", "text": "Hello from a remote agent"}]
}
}
}'
```
The SDK handles the `X-Workspace-ID` header automatically. Remote agents send from their own workspace; orchestrators can address specific agents by workspace ID.
The `X-Workspace-ID` header identifies which workspace the message originates from. Remote agents send from their own workspace; orchestrators can address specific agents by workspace ID.
### Step 6: Verify the agent appears on the canvas
-1
View File
@@ -44,4 +44,3 @@
{"name": "mock-bigorg", "repo": "molecule-ai/molecule-ai-org-template-mock-bigorg", "ref": "main"}
]
}
// Triggered by Integration Tester at 2026-05-10T08:52Z
@@ -50,7 +50,7 @@ pipeline.
| `check-merge-group-trigger.yml` | The workflow's own header (lines 18-23) documents that it's vacuously satisfied on Gitea — Gitea has no merge queue, no `merge_group:` event type, no `gh-readonly-queue/...` refs. Nothing to lint. |
| `codeql.yml` | The workflow's own header (lines 3-67) documents that `github/codeql-action/init@v4` hits api.github.com bundle endpoints not implemented by Gitea (observed: `::error::404 page not found` in Initialize CodeQL step). Per Hongming decision 2026-05-07 (task #156): CodeQL is ADVISORY/non-blocking until a Gitea-compatible SAST pipeline lands. Replacement options (Semgrep self-host, Sonatype, GitHub-mirror-for-SAST) tracked in #156. |
| `pr-guards.yml` | The workflow's own header documents that Gitea has no `gh pr merge --auto` primitive — the guard is a structural no-op on Gitea. Branch protection on `main` does NOT reference any `pr-guards` check name; deletion is safe. |
| `promote-latest.yml` | Uses `imjasonh/setup-crane` against `ghcr.io/molecule-ai/platform` — the GHCR registry was retired during the 2026-05-06 Gitea migration (per `staging-verify.yml` header notes — file was renamed from `canary-verify.yml` on 2026-05-11; the canonical tenant image moved to ECR `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant`). The workflow can no longer find any image to retag. Follow-up issue suggested if an ECR-based retag promote is desired. |
| `promote-latest.yml` | Uses `imjasonh/setup-crane` against `ghcr.io/molecule-ai/platform` — the GHCR registry was retired during the 2026-05-06 Gitea migration (per `canary-verify.yml` header notes, the canonical tenant image moved to ECR `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/platform-tenant`). The workflow can no longer find any image to retag. Follow-up issue suggested if an ECR-based retag promote is desired. |
## Category C — ported to .gitea/
-191
View File
@@ -1,191 +0,0 @@
# Gitea Actions operational quirks (molecule-core)
Documents persistent operational findings about Gitea Actions runner behaviour
that differ from GitHub Actions and require workarounds in workflow YAML or
runbooks.
> Last updated: 2026-05-11 (core-devops-agent)
---
## Large repo causes fetch timeout on Gitea Actions runner
### Finding
The Gitea Actions runner (container on host `5.78.80.188`) can reach the git
remote (`https://git.moleculesai.app`) over HTTPS — a single-commit shallow
fetch (`--depth=1`) succeeds in ~16 s. However, fetching the **full compressed
repo history** (~75+ MB) exceeds the runner's network timeout window (~15 s).
This is **not a Gitea Actions bug** and **not a network isolation policy**
it is a repo-size constraint. The runner can reach external hosts (GitHub,
Docker Hub, PyPI) without issue.
### Impact
Workflows that rely on `actions/checkout` with `fetch-depth: 0` (full history)
or `git clone` will time out.
Specifically:
- `actions/checkout@v*` with `fetch-depth: 0` hangs (fetching full repo
history takes >15 s before hitting the timeout).
- `git clone <url>` hangs for the same reason.
- `git fetch origin <ref> --depth=1` **succeeds** in ~16 s — this is the
working pattern.
### Affected workflows
| Workflow | Issue | Workaround |
|---|---|---|
| `harness-replays.yml` detect-changes job | `fetch-depth: 0` + `git clone` time out | Added `timeout 20 git fetch origin base.ref --depth=1` + `continue-on-error: true` + fallback to `run=true` per PR #441 |
| `publish-workspace-server-image.yml` | In-image `git clone` of workspace templates | Pre-clone manifest deps before compose build (Task #173 pattern) |
| Any workflow using `fetch-depth: 0` | Full history fetch times out | Use `fetch-depth: 1` + explicit `git fetch` for needed refs |
### How to diagnose
```bash
# From inside the runner (add as a debug step):
timeout 20 git fetch origin main --depth=1
# If this SUCCEEDS (~16s): runner can reach the git remote — the repo is
# too large for full-history fetch.
# If this times out: true network isolation (unlikely; check firewall rules).
```
### Verification
Confirmed 2026-05-11 by running `timeout 20 git fetch origin base.ref --depth=1`
in the `detect-changes` job of `harness-replays.yml`**succeeds in ~16 s**.
Runner can reach `https://api.github.com` and `https://pypi.org` without issue,
confirming this is a repo-size constraint, not network isolation.
### References
- PR #441: fix for `harness-replays.yml` detect-changes
- Task #173: pre-clone manifest deps pattern for compose build
- internal#102: tracking customer-private + marketplace third-party repos
- `feedback_oss_first_repo_visibility_default`: 5 workspace-template repos
flipped public to allow pre-clone without auth
---
## `continue-on-error` only works at step level, not job level
### Finding
Gitea Actions (1.22.6) does not honour `continue-on-error: true` at the **job**
level the way GitHub Actions does. A job with `continue-on-error: true` that
fails still reports `status: failure` in the commit status API.
Only `continue-on-error: true` at the **step** level works as expected.
### Impact
If you want a job to always "pass" in the status API (so dependent jobs can
run and the overall CI does not show `failure`), you must add
`continue-on-error: true` to every step that can fail, AND ensure each step
exits with code 0 (e.g., append `|| true` to commands that might fail).
### Affected workflows
| Workflow | Fix |
|---|---|
| `harness-replays.yml` detect-changes | Added `continue-on-error: true` to fetch step + decide step; added `|| true` to `DIFF=$(git diff ...)` per PR #441 |
### How to diagnose
```yaml
# WRONG — job reports as failure despite flag
jobs:
my-job:
continue-on-error: true # ← ignored by Gitea
steps:
- run: git diff ... # ← if this fails, job = failure
# job-level flag does not help
# RIGHT — step-level flag prevents step from failing
jobs:
my-job:
steps:
- run: git diff ... || true # ← step exits 0
continue-on-error: true # ← belt and suspenders
```
### References
- Gitea Actions quirk #10 (from migration checklist)
- PR #441: fix applied to `harness-replays.yml`
---
## `workflow_dispatch.inputs` not supported
Gitea 1.22.6 parser rejects `workflow_dispatch.inputs`. Drop from all workflow
YAML files ported from GitHub Actions. Manual triggers should use
`workflow_dispatch` without `inputs:`.
**Reference**: `feedback_gitea_workflow_dispatch_inputs_unsupported`
---
## `merge_group` not supported
Gitea has no merge queue concept. Drop `merge_group:` triggers from all
workflow YAML files.
---
## `environment:` blocks not supported
Gitea has no environments concept. Drop `environment:` from all workflow YAML
files. Secrets and variables are repo-level.
---
## Gitea combined status reports `failure` when all contexts are `null`
### Finding
When ALL individual status contexts for a commit have `state: null` (no runner
has reported yet), Gitea reports the combined commit status as `failure`. This
is a Gitea Actions bug — it conflates "no status reported yet" with "failed".
### Impact
- The `main-red-watchdog` workflow opens a `[main-red]` issue for every
scheduled workflow run where the combined state is `failure` — even when
the failure is entirely due to Gitea's combined-status bug.
- This causes spurious `[main-red]` issues that waste SRE time investigating
non-existent failures.
- **This is especially confusing for `schedule:`-only workflows** (canary,
sweep jobs, synth-E2E): Gitea attributes their scheduled runs to `main`'s
HEAD commit, so if a scheduled run fires while all contexts are still
`state: null`, the watchdog opens a `[main-red]` issue on the latest main
commit even though that commit itself is perfectly fine.
### How to diagnose
Always check the **individual context `state` fields**, not the combined
`state`/`combined_state`. In the `/repos/{org}/{repo}/commits/{sha}/statuses`
API response, look for `"state": null` on every entry — if all are null, the
combined `failure` is Gitea's bug, not a real CI failure.
```json
{
"combined_state": "failure", // ← Gitea bug when all are null
"contexts": [
{ "context": "CI / Lint", "state": null }, // still running
{ "context": "CI / Test", "state": null } // still running
]
}
```
### Affected workflows
All workflows, but especially `schedule:`-only workflows that run on `main`.
The main-red-watchdog (`.gitea/workflows/main-red-watchdog.yml`) is the
primary consumer of combined status and is affected.
### References
- Issue #481: first real-world case of this bug (2026-05-11)
- `feedback_no_such_thing_as_flakes`: watchdog directive
+1 -1
View File
@@ -43,7 +43,7 @@ endpoint handler for the supported range.
- `cleanup-rogue-workspaces.sh` — emergency teardown for leaked
workspaces. Prompts for confirmation. Pair with the harnesses if a
cleanup trap fails (see `cleanup_*_failed` events).
- `staging-smoke.sh` — quick smoke test for the staging canary fleet (formerly `canary-smoke.sh`).
- `canary-smoke.sh` — quick smoke test for canary releases.
- `dev-start.sh` — local-dev platform bring-up.
The rest are self-documenting in their header comments.
@@ -1,40 +1,29 @@
#!/bin/bash
# staging-smoke.sh — runs the post-deploy smoke suite against the
# staging canary tenant fleet. Called by the staging-verify.yml Gitea
# canary-smoke.sh — runs the post-deploy smoke suite against the
# staging canary tenant fleet. Called by the canary-verify.yml GitHub
# Actions workflow after a new workspace-server image lands in ECR;
# exits non-zero on any failure so the workflow can block the
# redeploy-fleet promotion that would otherwise release broken code
# to the prod tenant fleet.
#
# Naming note (2026-05-11): The script (and its input env vars) were
# renamed from canary-smoke.sh / CANARY_* to staging-smoke.sh /
# MOLECULE_STAGING_* per Hongming directive. The tested COHORT is still
# called the "canary fleet" (a small subset of staging tenants that
# ingest :staging-<sha> before the rest of the fleet); that strategy
# concept is unchanged.
#
# Registry note: GHCR was retired 2026-05-06. Images are now pushed
# to the operator's ECR org (153263036946.dkr.ecr.us-east-2.amazonaws.com/
# molecule-ai/platform-tenant). The registry URL is a runtime concern for
# the CI push step; this script tests the running tenant directly.
#
# Environment:
# MOLECULE_STAGING_TENANT_URLS space-sep list of canary tenant base
# URLs (e.g. "https://canary-pm.staging.
# moleculesai.app https://canary-mcp.
# staging.moleculesai.app")
# MOLECULE_STAGING_ADMIN_TOKENS space-sep list of ADMIN_TOKENs,
# positionally matched to
# MOLECULE_STAGING_TENANT_URLS.
# Canary tenants are provisioned with
# known ADMIN_TOKENs so CI can hit
# their admin-gated endpoints.
# MOLECULE_STAGING_CP_BASE_URL CP base URL the canaries call back to
# (https://staging-api.moleculesai.app)
# MOLECULE_STAGING_CP_SHARED_SECRET matches CP's PROVISION_SHARED_SECRET
# so this script can also exercise
# /cp/workspaces/* via the canary's
# own CPProvisioner identity.
# CANARY_TENANT_URLS space-sep list of canary tenant base URLs
# (e.g. "https://canary-pm.staging.moleculesai.app
# https://canary-mcp.staging.moleculesai.app")
# CANARY_ADMIN_TOKENS space-sep list of ADMIN_TOKENs, positionally
# matched to CANARY_TENANT_URLS. Canary tenants
# are provisioned with known ADMIN_TOKENs so CI
# can hit their admin-gated endpoints.
# CANARY_CP_BASE_URL CP base URL the canaries call back to
# (https://staging-api.moleculesai.app)
# CANARY_CP_SHARED_SECRET matches CP's PROVISION_SHARED_SECRET so this
# script can also exercise /cp/workspaces/* via
# the canary's own CPProvisioner identity.
#
# Exit codes: 0 = all green, 1 = assertion failure, 2 = setup/env problem.
@@ -42,12 +31,12 @@ set -euo pipefail
# ── Setup ────────────────────────────────────────────────────────────────
: "${MOLECULE_STAGING_TENANT_URLS:?space-sep list of canary base URLs required}"
: "${MOLECULE_STAGING_ADMIN_TOKENS:?space-sep list of ADMIN_TOKENs required, same order as URLs}"
: "${MOLECULE_STAGING_CP_BASE_URL:?CP base URL required}"
: "${CANARY_TENANT_URLS:?space-sep list of canary base URLs required}"
: "${CANARY_ADMIN_TOKENS:?space-sep list of ADMIN_TOKENs required, same order as URLs}"
: "${CANARY_CP_BASE_URL:?CP base URL required}"
read -r -a URLS <<< "$MOLECULE_STAGING_TENANT_URLS"
read -r -a TOKENS <<< "$MOLECULE_STAGING_ADMIN_TOKENS"
read -r -a URLS <<< "$CANARY_TENANT_URLS"
read -r -a TOKENS <<< "$CANARY_ADMIN_TOKENS"
if [ "${#URLS[@]}" -ne "${#TOKENS[@]}" ]; then
echo "ERROR: URLS(${#URLS[@]}) and TOKENS(${#TOKENS[@]}) length mismatch" >&2
@@ -80,7 +69,7 @@ check() {
# tenant never gets the wrong token.
acurl() {
local base="$1" token="$2"; shift 2
curl -sS --max-time 20 -H "Authorization: Bearer $token" "$@" -- "$base${ACURL_PATH:-}"
curl -sS --max-time 20 -H "Authorization: Bearer $token" "$@" -- "$base${CANARY_ACURL_PATH:-}"
}
# ── Checks (run per canary tenant) ───────────────────────────────────────
@@ -91,7 +80,7 @@ for i in "${!URLS[@]}"; do
printf "\n── %s ──\n" "$base"
# 1. Liveness — the tenant is up and responding to admin auth.
ACURL_PATH="/admin/liveness" resp=$(acurl "$base" "$token" || true)
CANARY_ACURL_PATH="/admin/liveness" resp=$(acurl "$base" "$token" || true)
check "liveness returns a subsystems map" '"subsystems"' "$resp"
# 2. CP env refresh — the workspace-server fetched MOLECULE_CP_SHARED_SECRET
@@ -100,25 +89,25 @@ for i in "${!URLS[@]}"; do
# booted without crashing on the refresh call. A startup failure in
# refreshEnvFromCP logs but still boots (best-effort semantics), so
# this is a sanity check, not a proof.
ACURL_PATH="/workspaces" resp=$(acurl "$base" "$token" || true)
CANARY_ACURL_PATH="/workspaces" resp=$(acurl "$base" "$token" || true)
check "workspace list is JSON array" "[" "$resp"
# 3. Memory commit round-trip — scope=LOCAL so test data stays on this
# tenant. Verifies encryption + scrubber + retrieval end-to-end.
probe_id="canary-smoke-$(date +%s)-$i"
body=$(printf '{"scope":"LOCAL","namespace":"canary-smoke","content":"probe-%s"}' "$probe_id")
ACURL_PATH="/memories/commit" resp=$(curl -sS --max-time 20 \
CANARY_ACURL_PATH="/memories/commit" resp=$(curl -sS --max-time 20 \
-X POST -H "Content-Type: application/json" -H "Authorization: Bearer $token" \
--data "$body" "$base/memories/commit" || true)
check "memory commit accepted" '"id"' "$resp"
ACURL_PATH="/memories/search?query=probe-${probe_id}" \
CANARY_ACURL_PATH="/memories/search?query=probe-${probe_id}" \
resp=$(curl -sS --max-time 20 -H "Authorization: Bearer $token" \
"$base/memories/search?query=probe-${probe_id}" || true)
check "memory search finds the probe" "probe-${probe_id}" "$resp"
# 4. Events admin read — AdminAuth path (C4 fail-closed proof on SaaS).
ACURL_PATH="/events" resp=$(acurl "$base" "$token" || true)
CANARY_ACURL_PATH="/events" resp=$(acurl "$base" "$token" || true)
check "events endpoint returns JSON" "[" "$resp"
# 5. Negative: unauth'd admin call must 401 (C4 regression gate).
@@ -128,7 +117,7 @@ for i in "${!URLS[@]}"; do
# 6. POST /org/import unauth → 401. Proves the route is compiled in
# and AdminAuth is enforced. A missing route returns 404 (the failure
# mode caught by issue #213). Regression guard for the silent-GHCR-
# migration gap: staging-verify (formerly canary-verify) was testing a stale GHCR image while
# migration gap: canary-verify was testing a stale GHCR image while
# actual tenants ran ECR — this test would have caught a missing-route
# binary before it reached prod.
unauth_code=$(curl -sS -o /dev/null -w '%{http_code}' \
+4 -15
View File
@@ -34,17 +34,6 @@ WS_DIR="${2:?Missing workspace-templates dir}"
ORG_DIR="${3:?Missing org-templates dir}"
PLUGINS_DIR="${4:?Missing plugins dir}"
# Strip JSON5-style // comments from manifest.json before parsing.
# The automated Integration Tester appends a trailing comment
# (// Triggered by ... ) which is valid JSON5 but not standard JSON.
# jq's default parser rejects it. This sed removes only full-line comments
# (lines starting with optional whitespace followed by //) before jq reads the file.
_strip_comments() {
# Remove full-line // comments (whitespace-safe); pass-through for non-comment lines
sed 's/^[[:space:]]*\/\/.*//' "$MANIFEST"
}
MANIFEST_JSON="$(_strip_comments)"
EXPECTED=0
CLONED=0
@@ -99,15 +88,15 @@ clone_category() {
mkdir -p "$target_dir"
local count
count=$(echo "$MANIFEST_JSON" | jq -r ".${category} | length")
count=$(jq -r ".${category} | length" "$MANIFEST")
EXPECTED=$((EXPECTED + count))
local i=0
while [ "$i" -lt "$count" ]; do
local name repo ref
name=$(echo "$MANIFEST_JSON" | jq -r ".${category}[$i].name")
repo=$(echo "$MANIFEST_JSON" | jq -r ".${category}[$i].repo")
ref=$(echo "$MANIFEST_JSON" | jq -r ".${category}[$i].ref // \"main\"")
name=$(jq -r ".${category}[$i].name" "$MANIFEST")
repo=$(jq -r ".${category}[$i].repo" "$MANIFEST")
ref=$(jq -r ".${category}[$i].ref // \"main\"" "$MANIFEST")
# Idempotent: skip if the target already looks populated. Lets the
# README quickstart rerun setup.sh safely without having to delete
+4 -12
View File
@@ -7,11 +7,11 @@ Four workflows + a shared bash harness that together cover the SaaS stack end to
| Workflow | Cadence | Wall time | Scope |
|---|---|---|---|
| `e2e-staging-saas.yml` | push + nightly 07:00 UTC | ~20 min | Full API: org → tenant → 2 workspaces → A2A → HMA → delegation → leak check |
| `staging-smoke.yml` | every 30 min | ~8 min | Minimum smoke + self-managed alert issue |
| `canary-staging.yml` | every 30 min | ~8 min | Minimum smoke + self-managed alert issue |
| `e2e-staging-canvas.yml` | push + weekly Sunday 08:00 | ~25 min | All 13 canvas workspace-panel tabs via Playwright |
| `e2e-staging-sanity.yml` | weekly Monday 06:00 | ~10 min | Intentional-failure: teardown safety-net self-check |
`tests/e2e/test_staging_full_saas.sh` is the shared harness all workflows invoke (with `E2E_MODE={full|smoke}` and `E2E_INTENTIONAL_FAILURE={0|1}` toggles).
`tests/e2e/test_staging_full_saas.sh` is the shared harness all workflows invoke (with `E2E_MODE={full|canary}` and `E2E_INTENTIONAL_FAILURE={0|1}` toggles).
### Full-SaaS checklist (sections)
@@ -49,15 +49,7 @@ Runs the harness with `E2E_INTENTIONAL_FAILURE=1`, which poisons the tenant admi
Set in **Settings → Secrets and variables → Actions → Repository secrets**:
### `CP_STAGING_ADMIN_API_TOKEN`
> **Historical-rename note (2026-05-11):** previously named
> `MOLECULE_STAGING_ADMIN_TOKEN`. Canonicalised to
> `CP_STAGING_ADMIN_API_TOKEN` per internal#322 (the Railway staging
> service exposes it as `CP_ADMIN_API_TOKEN`; the `CP_*` repo-secret
> prefix matches the upstream env name + makes the service it talks
> to obvious in workflow YAMLs). See the original PR for the
> cross-workflow sweep.
### `MOLECULE_STAGING_ADMIN_TOKEN`
The `CP_ADMIN_API_TOKEN` env currently set on the Railway staging molecule-platform → controlplane service.
@@ -90,7 +82,7 @@ bash tests/e2e/test_staging_full_saas.sh
## Cost
- Full run: ~20 min, ~$0.007
- Smoke (48/day): ~$0.06/day
- Canary (48/day): ~$0.06/day
- Canvas (few/week): ~$0.01/day
- Sanity (weekly): ~$0.002/week
- **Total staging burn: < $0.15/day** at expected CI load
+6 -18
View File
@@ -27,11 +27,7 @@
# E2E_PROVISION_TIMEOUT_SECS default 900 (15 min cold EC2 budget)
# E2E_KEEP_ORG 1 → skip teardown (debugging only)
# E2E_RUN_ID Slug suffix; CI: ${GITHUB_RUN_ID}
# E2E_MODE full (default) | smoke
# (legacy alias `canary` still accepted —
# mapped to `smoke` for back-compat with
# any in-flight runner picking up an older
# workflow checkout)
# E2E_MODE full (default) | canary
# E2E_INTENTIONAL_FAILURE 1 → poison tenant token mid-run so the
# script fails; the EXIT trap MUST still
# tear down cleanly (and exit 4 on leak).
@@ -53,23 +49,15 @@ RUNTIME="${E2E_RUNTIME:-hermes}"
PROVISION_TIMEOUT_SECS="${E2E_PROVISION_TIMEOUT_SECS:-900}"
RUN_ID_SUFFIX="${E2E_RUN_ID:-$(date +%H%M%S)-$$}"
MODE="${E2E_MODE:-full}"
# `canary` is a legacy alias for `smoke` retained for back-compat with
# any in-flight runner picking up an older workflow checkout during the
# 2026-05-11 canary→staging rename rollout. Both map to the same slug
# prefix below. Remove the `canary` alias after one week of no-old-mode
# observations.
if [ "$MODE" = "canary" ]; then
MODE="smoke"
fi
case "$MODE" in
full|smoke) ;;
*) echo "E2E_MODE must be 'full' or 'smoke' (got: $MODE)" >&2; exit 2 ;;
full|canary) ;;
*) echo "E2E_MODE must be 'full' or 'canary' (got: $MODE)" >&2; exit 2 ;;
esac
# Smoke runs get a distinct slug prefix so their safety-net sweeper only
# Canary runs get a distinct prefix so their safety-net sweeper only
# touches their own runs, not in-flight full runs.
if [ "$MODE" = "smoke" ]; then
SLUG="e2e-smoke-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
if [ "$MODE" = "canary" ]; then
SLUG="e2e-canary-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
else
SLUG="e2e-$(date +%Y%m%d)-${RUN_ID_SUFFIX}"
fi
-775
View File
@@ -1,775 +0,0 @@
"""Tests for `.gitea/scripts/status-reaper.py` — Option B compensating
status POST for Gitea 1.22.6's hardcoded `(push)` suffix bug.
Coverage (per hongming-pc 22:08Z review + brief):
1. test_workflow_with_name_field
2. test_workflow_without_name_field (filename stem fallback)
3. test_workflow_name_collision_fails_loud
4. test_workflow_name_with_slash_fails_loud
5. test_has_push_trigger_true (dict shape, list shape, str shape)
6. test_has_push_trigger_false (schedule-only, dispatch-only,
pull_request-only, workflow_run-only)
7. test_publish_workspace_server_image_preserved (explicit case)
8. test_compensating_post_payload (POST body shape verification)
Plus regression coverage:
- parse_push_context strictness (only ` (push)` suffix with ` / `
separator triggers compensation).
- Class-O detection via end-to-end reap() with a stubbed api().
- ApiError propagation on non-2xx (mirror of main-red-watchdog's
`feedback_api_helper_must_raise_not_return_dict` test).
- Unknown-workflow conservatism: ::notice:: + skip, never POST.
- Non-`(push)`-suffix contexts (the `(pull_request)` required-checks
on main) are NEVER touched — verified safe 2026-05-11.
Hostile self-review proof:
- test_required_check_pull_request_suffix_never_touched exercises
the safety contract: a pre-fix that compensated any failing
context would mask the Secret scan required-check. Verified by
stashing the `endswith(PUSH_SUFFIX)` guard and re-running: test
FAILS as required.
- test_workflow_name_collision_fails_loud asserts exit code 1; a
pre-fix that "first write wins" would silently misclassify a
renamed workflow.
Run:
python3 -m pytest tests/test_status_reaper.py -v
Dependencies: stdlib + pytest + PyYAML. No network.
"""
from __future__ import annotations
import importlib.util
import json
import os
import sys
from pathlib import Path
from unittest import mock
import pytest
# --------------------------------------------------------------------------
# Module-import fixture
# --------------------------------------------------------------------------
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "status-reaper.py"
)
@pytest.fixture(scope="module")
def sr_module():
"""Import the script as a module under a known env."""
env = {
"GITEA_TOKEN": "test-token",
"GITEA_HOST": "git.example.test",
"REPO": "owner/repo",
"WATCH_BRANCH": "main",
"WORKFLOWS_DIR": ".gitea/workflows",
}
with mock.patch.dict(os.environ, env, clear=False):
spec = importlib.util.spec_from_file_location("status_reaper", SCRIPT_PATH)
m = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m)
m.GITEA_TOKEN = env["GITEA_TOKEN"]
m.GITEA_HOST = env["GITEA_HOST"]
m.REPO = env["REPO"]
m.WATCH_BRANCH = env["WATCH_BRANCH"]
m.WORKFLOWS_DIR = env["WORKFLOWS_DIR"]
m.OWNER, m.NAME = "owner", "repo"
m.API = f"https://{env['GITEA_HOST']}/api/v1"
yield m
# --------------------------------------------------------------------------
# Workflow scan tests — workflow_id resolution
# --------------------------------------------------------------------------
def _write_workflow(tmp_path: Path, filename: str, content: str) -> Path:
"""Write a workflow YAML to a temp dir and return its path."""
d = tmp_path / "workflows"
d.mkdir(exist_ok=True)
p = d / filename
p.write_text(content)
return p
def test_workflow_with_name_field(sr_module, tmp_path):
"""`name:` field beats filename stem."""
_write_workflow(
tmp_path,
"publish-runtime.yml",
"name: publish-runtime\non:\n push:\n branches: [main]\n",
)
out = sr_module.scan_workflows(str(tmp_path / "workflows"))
assert "publish-runtime" in out
assert out["publish-runtime"] is True
def test_workflow_without_name_field(sr_module, tmp_path):
"""No `name:` → filename stem (basename minus `.yml`)."""
_write_workflow(
tmp_path,
"no-name-workflow.yml",
"on:\n schedule:\n - cron: '*/5 * * * *'\n",
)
out = sr_module.scan_workflows(str(tmp_path / "workflows"))
assert "no-name-workflow" in out
assert out["no-name-workflow"] is False # schedule-only → class-O
def test_workflow_name_collision_fails_loud(sr_module, tmp_path, capsys):
"""Two workflows resolving to the same name → exit 1 with ::error::."""
_write_workflow(
tmp_path,
"a.yml",
"name: same-name\non:\n push: {}\n",
)
_write_workflow(
tmp_path,
"b.yml",
"name: same-name\non:\n schedule:\n - cron: '0 * * * *'\n",
)
with pytest.raises(SystemExit) as excinfo:
sr_module.scan_workflows(str(tmp_path / "workflows"))
assert excinfo.value.code == 1
captured = capsys.readouterr()
assert "::error::workflow name collision detected: same-name" in captured.err
def test_workflow_name_with_slash_fails_loud(sr_module, tmp_path, capsys):
"""`name:` containing `/` → exit 1 with ::error:: (breaks context parse)."""
_write_workflow(
tmp_path,
"weird.yml",
"name: my/weird/name\non:\n push: {}\n",
)
with pytest.raises(SystemExit) as excinfo:
sr_module.scan_workflows(str(tmp_path / "workflows"))
assert excinfo.value.code == 1
captured = capsys.readouterr()
assert "::error::workflow name contains '/'" in captured.err
assert "my/weird/name" in captured.err
def test_workflow_name_with_slash_via_filename_stem_fails_loud(sr_module, tmp_path, capsys):
"""Even if filename stem contains `/` (path-flavoured stem) we trip the
same guard. Defensive — Path.stem strips `/` so this can't happen via
real filesystems, but the guard catches it if someone synthesises a
map from a non-filesystem source in future."""
# Force the filename-stem path by writing a no-name workflow whose
# PARENT path has a `/` — but Path.stem only takes the basename, so
# we instead mock _on_block / iterate manually. Easier: assert the
# in-code check directly.
# The `/` guard runs on `workflow_id`. Test it via an explicit name
# field workflow (already covered) — this test is left as a
# docstring-only marker that the filename-stem path can't ever
# produce a `/` (Path.stem strips it).
assert True # No-op: Path.stem strips `/`; documented invariant.
def test_workflow_empty_name_falls_back_to_stem(sr_module, tmp_path):
"""Empty `name:` (just whitespace) should fall back to filename stem."""
_write_workflow(
tmp_path,
"stem-fallback.yml",
"name: ' '\non:\n push: {}\n",
)
out = sr_module.scan_workflows(str(tmp_path / "workflows"))
assert "stem-fallback" in out # filename stem used
assert out["stem-fallback"] is True
# --------------------------------------------------------------------------
# has_push_trigger tests
# --------------------------------------------------------------------------
def test_has_push_trigger_true_dict(sr_module):
assert sr_module._has_push_trigger({"push": {}, "schedule": []}, "w") is True
def test_has_push_trigger_true_dict_with_paths(sr_module):
"""`on: { push: { paths: ['workspace/**'] } }` → still push-triggered."""
assert (
sr_module._has_push_trigger(
{"push": {"paths": ["workspace/**"]}}, "w"
)
is True
)
def test_has_push_trigger_true_list(sr_module):
assert sr_module._has_push_trigger(["push", "pull_request"], "w") is True
def test_has_push_trigger_true_str(sr_module):
assert sr_module._has_push_trigger("push", "w") is True
def test_has_push_trigger_false_schedule_only(sr_module):
"""Schedule-only workflow (class-O canonical)."""
assert (
sr_module._has_push_trigger(
{"schedule": [{"cron": "0 * * * *"}]}, "w"
)
is False
)
def test_has_push_trigger_false_dispatch_only(sr_module):
assert sr_module._has_push_trigger({"workflow_dispatch": {}}, "w") is False
def test_has_push_trigger_false_pull_request_only(sr_module):
"""`on: { pull_request: {...} }` only → no push trigger."""
assert sr_module._has_push_trigger({"pull_request": {}}, "w") is False
def test_has_push_trigger_false_workflow_run_only(sr_module):
"""`on: { workflow_run: {...} }` → no push trigger.
(Even though Gitea 1.22.6 doesn't fire workflow_run, the classifier
must handle YAML that declares it — for forward-compat.)"""
assert sr_module._has_push_trigger({"workflow_run": {}}, "w") is False
def test_has_push_trigger_false_list_no_push(sr_module):
assert (
sr_module._has_push_trigger(["pull_request", "schedule"], "w") is False
)
def test_has_push_trigger_ambiguous_preserves(sr_module, capsys):
"""Unknown shape → True (preserve, never compensate) + log ::notice::."""
assert sr_module._has_push_trigger(42, "weird-workflow") is True
captured = capsys.readouterr()
assert "::notice::ambiguous on: for weird-workflow" in captured.out
def test_has_push_trigger_none_preserves(sr_module, capsys):
"""None `on:` block → True (preserve)."""
assert sr_module._has_push_trigger(None, "no-on") is True
captured = capsys.readouterr()
assert "::notice::ambiguous on:" in captured.out
# --------------------------------------------------------------------------
# Real-world fixture: publish-workspace-server-image preserved
# --------------------------------------------------------------------------
def test_publish_workspace_server_image_preserved(sr_module, tmp_path):
"""Explicit case per brief: real `push` trigger → preserve, even
when failing. Protects mc#576 (currently red on docker-socket issue).
"""
_write_workflow(
tmp_path,
"publish-workspace-server-image.yml",
"name: publish-workspace-server-image\n"
"on:\n"
" push:\n"
" branches: [main]\n"
" paths: ['workspace/**']\n"
" workflow_dispatch:\n",
)
out = sr_module.scan_workflows(str(tmp_path / "workflows"))
assert out["publish-workspace-server-image"] is True
# --------------------------------------------------------------------------
# Context parsing
# --------------------------------------------------------------------------
def test_parse_push_context_canonical(sr_module):
"""`<workflow_name> / <job_name> (push)` → (workflow_name, job_name)."""
parsed = sr_module.parse_push_context("staging-smoke / smoke (push)")
assert parsed == ("staging-smoke", "smoke")
def test_parse_push_context_workflow_name_with_spaces(sr_module):
"""Workflow name with spaces — common (`Continuous synthetic E2E`)."""
parsed = sr_module.parse_push_context(
"Continuous synthetic E2E (staging) / e2e (push)"
)
assert parsed == ("Continuous synthetic E2E (staging)", "e2e")
def test_parse_push_context_non_push_suffix_returns_none(sr_module):
"""`(pull_request)` suffix → None (not the bug shape; required-checks)."""
assert (
sr_module.parse_push_context("Secret scan / Scan diff (pull_request)")
is None
)
def test_parse_push_context_no_separator_returns_none(sr_module):
"""`(push)` suffix but no ` / ` → None (not the bug shape)."""
assert sr_module.parse_push_context("just-a-context (push)") is None
def test_parse_push_context_no_suffix_returns_none(sr_module):
assert sr_module.parse_push_context("workflow / job") is None
# --------------------------------------------------------------------------
# Compensating POST payload shape
# --------------------------------------------------------------------------
def test_compensating_post_payload(sr_module, monkeypatch):
"""POST /statuses/{sha} body: state=success, context preserved,
description = COMPENSATION_DESCRIPTION, target_url echoed if present.
"""
calls = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
calls.append((method, path, body, query))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
sr_module.post_compensating_status(
"deadbeefcafe1234567890abcdef000011112222",
"staging-smoke / smoke (push)",
"https://git.example.test/owner/repo/actions/runs/14525",
dry_run=False,
)
assert len(calls) == 1
method, path, body, _query = calls[0]
assert method == "POST"
assert path == "/repos/owner/repo/statuses/deadbeefcafe1234567890abcdef000011112222"
assert body == {
"context": "staging-smoke / smoke (push)",
"state": "success",
"description": sr_module.COMPENSATION_DESCRIPTION,
"target_url": "https://git.example.test/owner/repo/actions/runs/14525",
}
def test_compensating_post_payload_no_target_url(sr_module, monkeypatch):
"""target_url is optional — omitted when the original status had none."""
calls = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
calls.append((method, path, body, query))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
sr_module.post_compensating_status(
"abc1234567",
"x / y (push)",
None,
dry_run=False,
)
assert calls[0][2] == {
"context": "x / y (push)",
"state": "success",
"description": sr_module.COMPENSATION_DESCRIPTION,
}
def test_compensating_post_dry_run_no_api_call(sr_module, monkeypatch, capsys):
"""--dry-run must NOT POST."""
def fake_api(*args, **kwargs):
raise AssertionError("api() should not be called in dry_run")
monkeypatch.setattr(sr_module, "api", fake_api)
sr_module.post_compensating_status(
"deadbeefcafe1234567890abcdef000011112222",
"ci/test (push)",
None,
dry_run=True,
)
captured = capsys.readouterr()
assert "::notice::[dry-run] would compensate" in captured.out
# --------------------------------------------------------------------------
# End-to-end reap() — class-O detection
# --------------------------------------------------------------------------
SHA = "deadbeefcafe1234567890abcdef000011112222"
def test_reap_compensates_class_o(sr_module, monkeypatch):
"""schedule-only workflow with failing `(push)` status → compensate."""
calls = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
calls.append((method, path, body))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
workflow_map = {"staging-smoke": False} # no push trigger
combined = {
"state": "failure",
"statuses": [
{
"context": "staging-smoke / smoke (push)",
"state": "failure",
"target_url": "https://example.test/run/1",
"description": "smoke job failed",
}
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 1
assert counters["preserved_real_push"] == 0
assert len(calls) == 1
assert calls[0][0] == "POST"
assert calls[0][1] == f"/repos/owner/repo/statuses/{SHA}"
def test_reap_preserves_real_push(sr_module, monkeypatch):
"""publish-workspace-server-image (has push trigger) → preserve."""
calls = []
def fake_api(*args, **kwargs):
calls.append((args, kwargs))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
workflow_map = {"publish-workspace-server-image": True}
combined = {
"state": "failure",
"statuses": [
{
"context": "publish-workspace-server-image / build (push)",
"state": "failure",
}
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 0
assert counters["preserved_real_push"] == 1
assert calls == [] # NO POST
def test_reap_preserves_unknown_workflow(sr_module, monkeypatch, capsys):
"""Workflow not in map → ::notice:: + skip (conservative)."""
monkeypatch.setattr(
sr_module, "api",
lambda *a, **kw: (_ for _ in ()).throw(
AssertionError("api should not be called")
),
)
workflow_map = {} # empty map
combined = {
"state": "failure",
"statuses": [
{
"context": "deleted-workflow / job (push)",
"state": "failure",
}
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 0
assert counters["preserved_unknown"] == 1
captured = capsys.readouterr()
assert "::notice::unknown workflow 'deleted-workflow'" in captured.out
def test_reap_required_check_pull_request_suffix_never_touched(sr_module, monkeypatch):
"""SAFETY CONTRACT: `(pull_request)` suffix contexts (the actual
required-checks on main) are NEVER touched. A pre-fix that
compensated any failure would mask Secret scan.
"""
calls = []
def fake_api(*args, **kwargs):
calls.append((args, kwargs))
return (201, {})
monkeypatch.setattr(sr_module, "api", fake_api)
# Even with the workflow mapped as no-push-trigger (which would
# normally compensate), the suffix guard prevents the POST.
workflow_map = {"Secret scan": False}
combined = {
"state": "failure",
"statuses": [
{
"context": "Secret scan / Scan diff for credential-shaped strings (pull_request)",
"state": "failure",
}
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 0
assert counters["preserved_non_push_suffix"] == 1
assert calls == []
def test_reap_ignores_non_failure_states(sr_module, monkeypatch):
"""Only `failure` is compensated. `pending` / `success` / `error`
left alone — they have legitimate semantics."""
monkeypatch.setattr(
sr_module, "api",
lambda *a, **kw: (_ for _ in ()).throw(
AssertionError("api should not be called")
),
)
workflow_map = {"sweep-cf-tunnels": False}
combined = {
"state": "pending",
"statuses": [
{"context": "sweep-cf-tunnels / sweep (push)", "state": "pending"},
{"context": "sweep-cf-tunnels / sweep (push)", "state": "success"},
{"context": "sweep-cf-tunnels / sweep (push)", "state": "error"},
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 0
assert counters["preserved_non_failure"] == 3
def test_reap_unparseable_push_context_preserved(sr_module, monkeypatch):
"""`(push)` suffix but no ` / ` separator → not the bug shape, preserve."""
monkeypatch.setattr(
sr_module, "api",
lambda *a, **kw: (_ for _ in ()).throw(
AssertionError("api should not be called")
),
)
workflow_map = {"x": False}
combined = {
"state": "failure",
"statuses": [
{"context": "no-slash-here (push)", "state": "failure"},
],
}
counters = sr_module.reap(workflow_map, combined, SHA, dry_run=False)
assert counters["compensated"] == 0
assert counters["preserved_unparseable"] == 1
# --------------------------------------------------------------------------
# ApiError propagation
# --------------------------------------------------------------------------
def test_get_head_sha_raises_on_non_2xx(sr_module, monkeypatch):
"""ApiError on transient outage propagates per
`feedback_api_helper_must_raise_not_return_dict`."""
def fake_api(method, path, **kwargs):
raise sr_module.ApiError("GET /branches/main -> HTTP 500: nope")
monkeypatch.setattr(sr_module, "api", fake_api)
with pytest.raises(sr_module.ApiError):
sr_module.get_head_sha("main")
def test_get_combined_status_raises_on_non_2xx(sr_module, monkeypatch):
def fake_api(method, path, **kwargs):
raise sr_module.ApiError("GET /status -> HTTP 500: nope")
monkeypatch.setattr(sr_module, "api", fake_api)
with pytest.raises(sr_module.ApiError):
sr_module.get_combined_status("deadbeef")
def test_get_head_sha_missing_commit_raises(sr_module, monkeypatch):
"""A malformed 200 response (no `commit` field) raises ApiError."""
monkeypatch.setattr(
sr_module, "api", lambda m, p, **kw: (200, {"name": "main"})
)
with pytest.raises(sr_module.ApiError):
sr_module.get_head_sha("main")
# --------------------------------------------------------------------------
# scan_workflows on real repo (smoke)
# --------------------------------------------------------------------------
def test_scan_workflows_on_real_repo_no_collision(sr_module):
"""Smoke: scan the actual .gitea/workflows/ in this repo. Asserts
no real-world collision/`/`-in-name lurks. If this fails, a real
workflow file must be fixed before reaper can ship."""
real_dir = str(SCRIPT_PATH.parent.parent / "workflows")
# Should NOT raise SystemExit — collision/slash guards must pass.
out = sr_module.scan_workflows(real_dir)
assert len(out) > 0
# publish-workspace-server-image is the canonical preserved case.
assert out.get("publish-workspace-server-image") is True
# main-red-watchdog is the canonical class-O case.
assert out.get("main-red-watchdog") is False
# ci is the canonical required-check (push+pull_request).
assert out.get("CI") is True or out.get("ci") is True
def test_scan_workflows_missing_dir_returns_empty(sr_module, tmp_path, capsys):
"""Missing workflows dir → empty map + ::warning::."""
out = sr_module.scan_workflows(str(tmp_path / "nope"))
assert out == {}
captured = capsys.readouterr()
assert "::warning::workflows dir not found" in captured.out
# --------------------------------------------------------------------------
# rev2: multi-SHA sweep — `reap_branch()` walks last N main commits
# --------------------------------------------------------------------------
# Phase 1+2 evidence (orchestrator + hongming-pc2): rev1 sees `compensated:0`
# every tick because the schedule workflow posts `failure` to whatever SHA
# was HEAD when it COMPLETED. By the next */5 tick, main has often moved
# forward, so the single-HEAD reaper misses the stranded red. rev2 sweeps
# the last 10 commits each tick. See `reference_post_suspension_pipeline`
# and parent rev1 PR #618 for context.
SHA_A = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
SHA_B = "bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb"
SHA_C = "cccccccccccccccccccccccccccccccccccccccc"
def test_reap_sweeps_n_shas_smoke(sr_module, monkeypatch):
"""rev2 contract: sweep last 10 (or N) main commits, GET combined
status for EACH. Smoke: with 3 stub SHAs, each is GET'd exactly once.
"""
gets: list[str] = []
posts: list[tuple[str, dict]] = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path.endswith("/commits"):
# commits listing — return 3 fake commit objects
return (200, [{"sha": SHA_A}, {"sha": SHA_B}, {"sha": SHA_C}])
if method == "GET" and "/commits/" in path and path.endswith("/status"):
sha = path.split("/commits/")[1].split("/status")[0]
gets.append(sha)
# All combined=success → cost-optimization short-circuit
return (200, {"state": "success", "statuses": []})
if method == "POST":
posts.append((path, body))
return (201, {})
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(sr_module, "api", fake_api)
workflow_map = {"x": False}
counters = sr_module.reap_branch(
workflow_map, "main", limit=10, dry_run=False
)
# Each of the 3 SHAs returned by /commits should be GET'd once.
assert gets == [SHA_A, SHA_B, SHA_C]
# No POST (everything was combined=success).
assert posts == []
# Counters reflect what we saw.
assert counters["scanned_shas"] == 3
assert counters["compensated"] == 0
assert counters["compensated_per_sha"] == {}
def test_reap_skips_combined_success_shas(sr_module, monkeypatch):
"""rev2 cost-optimization (refinement #2): when combined==success for
a SHA, do NOT iterate per-context statuses; move on to next SHA.
Mock 2 SHAs with combined=success + 1 with combined=failure → only
the failure-SHA's statuses get the per-context loop applied.
"""
per_context_iterated_for: list[str] = []
posts: list[tuple[str, dict]] = []
failure_statuses = [
{
"context": "drift / drift (push)",
"state": "failure",
"target_url": "https://example.test/run/42",
}
]
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path.endswith("/commits"):
return (200, [{"sha": SHA_A}, {"sha": SHA_B}, {"sha": SHA_C}])
if method == "GET" and "/commits/" in path and path.endswith("/status"):
sha = path.split("/commits/")[1].split("/status")[0]
if sha == SHA_B:
# Mark this SHA as the failure one — return per-context
# statuses that would compensate if iterated.
return (200, {"state": "failure", "statuses": failure_statuses})
# Others are combined=success — must short-circuit.
return (200, {"state": "success", "statuses": failure_statuses})
if method == "POST":
# If a POST hits a non-failure SHA, the short-circuit failed.
posts.append((path, body))
return (201, {})
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(sr_module, "api", fake_api)
# Workflow trigger map: `drift` is schedule-only (compensable).
workflow_map = {"drift": False}
counters = sr_module.reap_branch(
workflow_map, "main", limit=10, dry_run=False
)
# Only SHA_B (the combined=failure one) should be compensated.
assert counters["compensated"] == 1
assert counters["scanned_shas"] == 3
assert SHA_B in counters["compensated_per_sha"]
assert counters["compensated_per_sha"][SHA_B] == ["drift / drift (push)"]
# SHA_A and SHA_C must NOT appear in compensated_per_sha — their
# per-context loop was skipped via the combined=success short-circuit.
assert SHA_A not in counters["compensated_per_sha"]
assert SHA_C not in counters["compensated_per_sha"]
# Exactly one POST: the compensation on SHA_B.
assert len(posts) == 1
assert posts[0][0] == f"/repos/owner/repo/statuses/{SHA_B}"
def test_reap_continues_on_per_sha_apierror(sr_module, monkeypatch, capsys):
"""rev2 refinement #7 (MOST CRITICAL): a transient ApiError or HTTP-5xx
on get_combined_status(SHA_X) must NOT fail the whole tick. Log + skip
SHA_X, continue with SHA_Y.
Different from the single-HEAD path (where fail-loud is correct): the
sweep is best-effort across historical commits, so one transient blip
on a stale SHA should not strand reds on the OTHER stale SHAs.
"""
posts: list[tuple[str, dict]] = []
def fake_api(method, path, *, body=None, query=None, expect_json=True):
if method == "GET" and path.endswith("/commits"):
return (200, [{"sha": SHA_A}, {"sha": SHA_B}])
if method == "GET" and "/commits/" in path and path.endswith("/status"):
sha = path.split("/commits/")[1].split("/status")[0]
if sha == SHA_A:
raise sr_module.ApiError(
f"GET /repos/owner/repo/commits/{SHA_A}/status "
f"-> HTTP 502: bad gateway"
)
# SHA_B returns normally with a failure to compensate.
return (
200,
{
"state": "failure",
"statuses": [
{
"context": "drift / drift (push)",
"state": "failure",
}
],
},
)
if method == "POST":
posts.append((path, body))
return (201, {})
raise AssertionError(f"unexpected api call: {method} {path}")
monkeypatch.setattr(sr_module, "api", fake_api)
workflow_map = {"drift": False}
# Must NOT raise — per-SHA error isolation contract.
counters = sr_module.reap_branch(
workflow_map, "main", limit=10, dry_run=False
)
# SHA_A was logged + skipped. SHA_B processed normally.
assert counters["scanned_shas"] == 2
assert counters["compensated"] == 1
assert SHA_B in counters["compensated_per_sha"]
assert SHA_A not in counters["compensated_per_sha"]
# Compensation POST landed on SHA_B only.
assert len(posts) == 1
assert posts[0][0] == f"/repos/owner/repo/statuses/{SHA_B}"
# The ApiError must be logged so a human auditing tick output can see
# WHICH SHA blipped and WHY.
captured = capsys.readouterr()
assert "::warning::" in captured.out or "::notice::" in captured.out
assert SHA_A[:10] in captured.out
-569
View File
@@ -1,569 +0,0 @@
#!/usr/bin/env python3
"""
gate-check-v3 — SOP-6 + CI gate detector for Gitea PRs.
Emits structured verdict + human-readable summary. Designed to run as:
1. CLI: python gate_check.py --repo org/repo --pr N
2. Gitea Actions step: runs this script, captures stdout JSON
Signals (MVP — signals 1,2,3,6):
1. Author-aware agent-tag comment scan
2. REQUEST_CHANGES reviews state machine
3. Staleness detection (review.commit_id != PR.head_sha)
6. CI required-checks awareness
Exit codes:
0 — all gates pass (verdict=CLEAR)
1 — one or more gates blocking (verdict=BLOCKED)
2 — API error / usage error (verdict=ERROR)
"""
import argparse
import json
import os
import re
import sys
import time
import urllib.request
import urllib.error
from datetime import datetime, timezone
from typing import Any, Optional
# ── Gitea API client ────────────────────────────────────────────────────────
GITEA_HOST = os.environ.get("GITEA_HOST", "git.moleculesai.app")
GITEA_TOKEN = os.environ.get("GITEA_TOKEN", os.environ.get("GITHUB_TOKEN", ""))
API_BASE = f"https://{GITEA_HOST}/api/v1"
# Timeout in seconds for all HTTP calls. Defence-in-depth: ensures a missing or
# invalid SOP_TIER_CHECK_TOKEN causes a fast (~15 s) failure rather than an
# indefinite hang. The real fix is provisioning the token; this caps worst-case
# wall-clock on a broken/unreachable Gitea host.
DEFAULT_TIMEOUT = 15
def api_get(path: str) -> dict | list:
url = f"{API_BASE}{path}"
req = urllib.request.Request(
url,
headers={
"Authorization": f"token {GITEA_TOKEN}",
"Accept": "application/json",
},
)
try:
with urllib.request.urlopen(req, timeout=DEFAULT_TIMEOUT) as r:
return json.loads(r.read())
except urllib.error.HTTPError as e:
body = e.read().decode(errors="replace")
raise GiteaError(f"GET {url}{e.code}: {body[:300]}")
def api_list(path: str, per_page: int = 100) -> list:
"""Paginate a list endpoint using Link headers (Gitea/GitHub convention)."""
results = []
page = 1
while True:
paged_path = f"{path}?per_page={per_page}&page={page}"
result = api_get(paged_path)
if isinstance(result, list):
results.extend(result)
if len(result) < per_page:
break
page += 1
else:
# Some endpoints return an object with a data/items key
data = result.get("data", result.get("items", result))
if isinstance(data, list):
results.extend(data)
break
# Safety cap to avoid runaway pagination
if page > 20:
break
return results
class GiteaError(Exception):
pass
# ── Signal 1: Author-aware agent-tag comment scan ─────────────────────────────
# Matches: [core-{role}-agent] VERDICT in comment body.
# Must be authored by the agent whose role is tagged.
# Scans BOTH issue comments (/issues/{N}/comments) and PR comments
# (/pulls/{N}/comments) since agents post on both.
# Matches [core-{role}-agent] VERDICT anywhere in the comment body.
AGENT_TAG_RE = re.compile(
r"\[core-([a-z]+)-agent\]\s+(APPROVED|N/?A|CHANGES_REQUESTED|COMMENT|BLOCKED|ACK)\b",
)
# Map agent role → canonical login (from workspace registry)
AGENT_LOGIN_MAP = {
"qa": "core-qa",
"security": "core-security",
"uiux": "core-uiux",
"lead": "core-lead",
"devops": "core-devops",
"be": "core-be",
"fe": "core-fe",
"offsec": "core-offsec",
}
# SOP-6 tier → required agent groups
# tier:low → engineers,managers,ceo (OR: any one suffices)
# tier:medium → managers AND engineers AND qa,security (AND)
# tier:high → ceo (OR, but single)
# "?" = teams not yet created; treated as optional for MVP
TIER_AGENTS = {
"tier:low": {"managers": "core-lead", "engineers": "core-devops", "ceo": "ceo"},
"tier:medium": {"managers": "core-lead", "engineers": "core-devops", "qa": "core-qa", "security": "core-security"},
"tier:high": {"ceo": "ceo"},
}
POSITIVE_VERDICTS = {"APPROVED", "N/A", "ACK"}
def _get_pr_tier(pr_number: int, repo: str) -> str:
"""Get the PR's tier label."""
owner, name = repo.split("/", 1)
try:
pr = api_get(f"/repos/{owner}/{name}/pulls/{pr_number}")
for label in pr.get("labels", []):
name_l = label.get("name", "")
if name_l in TIER_AGENTS:
return name_l
except GiteaError:
pass
return "tier:low" # Default for untagged PRs
def signal_1_comment_scan(pr_number: int, repo: str) -> dict:
"""
Scan issue + PR comments AND reviews for agent-tag policy gates.
Matches tag AND author. Filters to tier-relevant agents.
Returns: {signal, results, verdict}
"""
owner, name = repo.split("/", 1)
# Get tier label to determine relevant agents
tier = _get_pr_tier(pr_number, repo)
relevant_roles = TIER_AGENTS.get(tier, TIER_AGENTS["tier:low"])
# Build reverse map: login -> (group, agent_key)
login_to_group = {}
for group, login in relevant_roles.items():
for role, l in AGENT_LOGIN_MAP.items():
if l == login:
login_to_group[l] = (group, f"core-{role}")
# Collect all agent-tag matches from comments
comments = []
try:
comments.extend(api_list(f"/repos/{owner}/{name}/issues/{pr_number}/comments"))
except GiteaError:
pass
try:
comments.extend(api_list(f"/repos/{owner}/{name}/pulls/{pr_number}/comments"))
except GiteaError:
pass
# Collect APPROVED reviews from agent logins
try:
reviews = api_list(f"/repos/{owner}/{name}/pulls/{pr_number}/reviews")
for r in reviews:
login = r.get("user", {}).get("login", "")
if login in login_to_group and r.get("state") == "APPROVED":
comments.append(
{
"id": f"review-{r['id']}",
"user": {"login": login},
"body": f"[{login}-agent] APPROVED",
"created_at": r.get("submitted_at") or r.get("created_at", ""),
"source": "review",
}
)
except GiteaError:
pass
# Find latest verdict per agent login
findings = {}
for login, (group, agent_key) in login_to_group.items():
matches = []
for c in comments:
body = c.get("body", "") or ""
user_login = c.get("user", {}).get("login", "")
if user_login != login:
continue
for m in AGENT_TAG_RE.finditer(body):
tag_role, verdict = m.group(1), m.group(2)
# Match the role part of the login (e.g. "core-devops" → "devops")
login_role = login.replace("core-", "")
if tag_role == login_role:
matches.append(
{
"comment_id": c["id"],
"verdict": verdict,
"user": user_login,
"created_at": c["created_at"],
"source": c.get("source", "comment"),
}
)
latest = max(matches, key=lambda x: x["created_at"], default=None) if matches else None
findings[agent_key] = {
"group": group,
"tier": tier,
"found": latest,
"verdict": latest["verdict"] if latest else "MISSING",
}
# Compute gate verdict using tier-specific logic:
# - tier:low / tier:high (OR gate): ANY positive = CLEAR, ANY negative = BLOCKED
# - tier:medium (AND gate): ALL must be positive = CLEAR, ANY negative = BLOCKED
verdicts = [f["verdict"] for f in findings.values()]
if not verdicts:
gate_verdict = "N/A"
elif tier in ("tier:low", "tier:high"):
# OR gate: one positive is enough
if any(v in POSITIVE_VERDICTS for v in verdicts):
gate_verdict = "CLEAR"
elif any(v in ("BLOCKED", "CHANGES_REQUESTED", "COMMENT") for v in verdicts):
gate_verdict = "BLOCKED"
else:
gate_verdict = "INCOMPLETE"
else:
# AND gate (tier:medium): all must be positive
if all(v in POSITIVE_VERDICTS for v in verdicts):
gate_verdict = "CLEAR"
elif any(v in ("BLOCKED", "CHANGES_REQUESTED", "COMMENT") for v in verdicts):
gate_verdict = "BLOCKED"
else:
gate_verdict = "INCOMPLETE"
return {"signal": "agent_tag_comments", "results": findings, "verdict": gate_verdict, "tier": tier}
# ── Signal 2: REQUEST_CHANGES reviews state machine ────────────────────────────
def signal_2_reviews(pr_number: int, repo: str) -> dict:
"""
Check /pulls/{N}/reviews for active REQUEST_CHANGES with dismissed=false.
This is the layer that empirically blocks Gitea merges.
Returns: {blocking_reviews: [...], verdict}
"""
owner, name = repo.split("/", 1)
reviews = api_list(f"/repos/{owner}/{name}/pulls/{pr_number}/reviews")
blocking = []
for r in reviews:
if r.get("state") == "REQUEST_CHANGES" and not r.get("dismissed", False):
blocking.append(
{
"review_id": r["id"],
"user": r["user"]["login"],
"commit_id": r.get("commit_id", ""),
"created_at": r.get("submitted_at") or r.get("created_at", ""),
}
)
return {
"signal": "request_changes_reviews",
"blocking_reviews": blocking,
"verdict": "BLOCKED" if blocking else "CLEAR",
}
# ── Signal 3: Staleness detection ────────────────────────────────────────────
WORKING_DAY_SECONDS = 9 * 3600 # SOP-12: 1 working day threshold
def signal_3_staleness(pr_number: int, repo: str) -> dict:
"""
Flag reviews where review.commit_id != PR.head_sha AND
time_since_review > 1 working day. Per SOP-12 (internal#282).
Returns: {stale_reviews: [...], verdict}
"""
owner, name = repo.split("/", 1)
# Get PR head sha
pr = api_get(f"/repos/{owner}/{name}/pulls/{pr_number}")
head_sha = pr["head"]["sha"]
reviews = api_list(f"/repos/{owner}/{name}/pulls/{pr_number}/reviews")
stale = []
now = datetime.now(timezone.utc)
for r in reviews:
review_commit = r.get("commit_id", "")
if review_commit and review_commit != head_sha:
# Review predates current head
try:
created = datetime.fromisoformat(r["created_at"].replace("Z", "+00:00"))
except (KeyError, ValueError):
continue
age_seconds = (now - created).total_seconds()
if age_seconds > WORKING_DAY_SECONDS:
stale.append(
{
"review_id": r["id"],
"user": r["user"]["login"],
"review_commit": review_commit,
"pr_head": head_sha,
"age_hours": round(age_seconds / 3600, 1),
"created_at": r.get("submitted_at") or r.get("created_at", ""),
}
)
return {
"signal": "stale_reviews",
"stale_reviews": stale,
"verdict": "STALE-RC" if stale else "CLEAR",
}
# ── Signal 6: CI required-checks awareness ───────────────────────────────────
def signal_6_ci(pr_number: int, repo: str, branch: str | None = None, pr_data: dict | None = None) -> dict:
"""
Query combined CI status for PR head commit.
Find required status checks on target branch.
Surface any failing required check as primary blocker.
"""
owner, name = repo.split("/", 1)
# Re-use PR data if already fetched by caller; otherwise fetch once.
if pr_data is None:
pr_data = api_get(f"/repos/{owner}/{name}/pulls/{pr_number}")
head_sha = pr_data["head"]["sha"]
# Fall back to PR's actual base branch when no explicit branch is given
branch = branch or pr_data.get("base", {}).get("ref", "main")
# Combined status of PR head
combined = api_get(f"/repos/{owner}/{name}/commits/{head_sha}/status")
ci_state = combined.get("state", "null")
# Individual check statuses
# Gitea Actions uses "status" (pending/success/failure) not "state" for
# individual check entries. "state" is null for pending runs.
# Exclude our own prior status to prevent self-referential failure loops.
check_statuses = {}
for s in combined.get("statuses") or []:
ctx = s["context"]
if "gate-check" not in ctx.lower():
check_statuses[ctx] = s.get("status", "pending")
# Try to get branch protection for required checks
required_checks = []
try:
protection = api_get(f"/repos/{owner}/{name}/branches/{branch}/protection")
for check in protection.get("required_status_checks", {}).get("checks", []):
required_checks.append(check["context"])
except GiteaError:
pass # No protection or no read access
failing_required = []
passing_required = []
for ctx in required_checks:
state = check_statuses.get(ctx, "null")
if state == "failure":
failing_required.append(ctx)
elif state in ("success", "neutral"):
passing_required.append(ctx)
else:
passing_required.append(f"{ctx} (pending)")
# NOTE: do NOT use ci_state (combined_state) as a fallback verdict driver.
# The combined_state is computed over ALL statuses including this
# gate-check's own prior result. Using it as a fallback creates a
# self-referential loop: gate-check posts failure → combined_state
# becomes failure → script re-blocks → posts failure again.
# The check_statuses dict already excludes gate-check (Bug-1 fix from
# PR #547). Use failing_required as the sole CI gate; if no required
# checks are defined on the branch, return CLEAR rather than re-using
# the combined_state which includes our own status.
if failing_required:
verdict = "CI_FAIL"
elif ci_state == "pending":
verdict = "CI_PENDING"
else:
verdict = "CLEAR"
return {
"signal": "ci_checks",
"combined_state": ci_state,
"required_checks": required_checks,
"failing_required": failing_required,
"passing_required": passing_required,
"all_check_statuses": check_statuses,
"verdict": verdict,
}
# ── Gate evaluation ───────────────────────────────────────────────────────────
VERDICT_ORDER = {"ERROR": 0, "CI_FAIL": 1, "BLOCKED": 2, "STALE-RC": 3, "CI_PENDING": 4, "N/A": 5, "CLEAR": 6}
def compute_verdict(gates: list[dict]) -> tuple[str, list[dict]]:
"""Compute overall verdict from gate results. Worst gate wins."""
worst = "CLEAR"
blockers = []
for g in gates:
v = g.get("verdict", "N/A")
if VERDICT_ORDER.get(v, 99) < VERDICT_ORDER.get(worst, 0):
worst = v
if v in ("BLOCKED", "CI_FAIL", "STALE-RC", "ERROR"):
blockers.append(g)
return worst, blockers
def format_gate_verdict(v: str) -> tuple[str, str]:
"""Return (icon, label) for a gate verdict."""
if v in ("APPROVED", "CLEAR"):
return "", v
if v in ("BLOCKED", "CI_FAIL", "ERROR"):
return "", v
return "⚠️", v
def format_comment(repo: str, pr_number: int, verdict: str, gates: list[dict], blockers: list[dict]) -> str:
"""Format human-readable Gitea PR comment."""
gate_labels = {
"agent_tag_comments": "Agent-tag gates",
"request_changes_reviews": "REQUEST_CHANGES reviews",
"stale_reviews": "Staleness check",
"ci_checks": "CI required checks",
}
lines = [f"[gate-check-v3] STATUS: **{verdict}**", ""]
# Per-gate summary
for g in gates:
sig = g.get("signal", "?")
label = gate_labels.get(sig, sig)
v = g.get("verdict", "N/A")
icon, _ = format_gate_verdict(v)
lines.append(f"{icon} **{label}**: {v}")
# Gate-specific detail
if blockers:
lines.append("")
lines.append("### Blockers")
for b in blockers:
sig = b.get("signal", "?")
if sig == "request_changes_reviews":
for r in b.get("blocking_reviews", []):
lines.append(f" - @{r['user']} requested changes (review id={r['review_id']})")
elif sig == "ci_checks":
combined = b.get("combined_state", "?")
lines.append(f" - CI combined state: **{combined}**")
for c in b.get("failing_required", []):
lines.append(f" - required check failing: **{c}**")
for c in b.get("all_check_statuses", {}).items():
ctx, state = c
lines.append(f" - {ctx}: {state}")
elif sig == "stale_reviews":
for r in b.get("stale_reviews", []):
lines.append(
f" - @{r['user']} stale (commit={r.get('review_commit','?')[:7]}, age={r.get('age_hours','?')}h)"
)
elif sig == "agent_tag_comments":
for agent, res in b.get("results", {}).items():
v = res.get("verdict", "MISSING")
icon, _ = format_gate_verdict(v)
if v == "MISSING":
lines.append(f" {icon} {agent}: no agent-tag comment found")
else:
lines.append(f" {icon} {agent}: {v}")
lines.append("")
lines.append(f"_gate-check-v3 · repo={repo} · pr={pr_number}_")
return "\n".join(lines)
# ── Main ─────────────────────────────────────────────────────────────────────
def run(repo: str, pr_number: int, post_comment: bool = False) -> dict:
try:
# Fetch PR once to get base ref for signal_6_ci
owner, name = repo.split("/", 1)
pr = api_get(f"/repos/{owner}/{name}/pulls/{pr_number}")
base_ref = pr.get("base", {}).get("ref", "main")
gates = [
signal_1_comment_scan(pr_number, repo),
signal_2_reviews(pr_number, repo),
signal_3_staleness(pr_number, repo),
signal_6_ci(pr_number, repo, branch=base_ref, pr_data=pr),
]
verdict, blockers = compute_verdict(gates)
result = {
"verdict": verdict,
"repo": repo,
"pr": pr_number,
"gates": gates,
"blockers": blockers,
"timestamp": datetime.now(timezone.utc).isoformat(),
}
# Print human-readable to stdout for Gitea Actions log
print(json.dumps(result, indent=2))
# Optionally post comment
if post_comment:
owner, name = repo.split("/", 1)
comment_body = format_comment(repo, pr_number, verdict, gates, blockers)
headers = {
"Authorization": f"token {GITEA_TOKEN}",
"Content-Type": "application/json",
"Accept": "application/json",
}
# Check if a gate-check comment already exists to avoid spamming
existing = api_list(f"/repos/{owner}/{name}/issues/{pr_number}/comments")
our_comments = [c for c in existing if "[gate-check-v3]" in (c.get("body") or "")]
try:
if our_comments:
# Update latest
comment_id = our_comments[-1]["id"]
url = f"{API_BASE}/repos/{owner}/{name}/issues/comments/{comment_id}"
req = urllib.request.Request(url, data=json.dumps({"body": comment_body}).encode(), headers=headers, method="PATCH")
with urllib.request.urlopen(req, timeout=DEFAULT_TIMEOUT) as r:
r.read()
else:
url = f"{API_BASE}/repos/{owner}/{name}/issues/{pr_number}/comments"
req = urllib.request.Request(url, data=json.dumps({"body": comment_body}).encode(), headers=headers, method="POST")
with urllib.request.urlopen(req, timeout=DEFAULT_TIMEOUT) as r:
r.read()
except urllib.error.HTTPError as e:
if e.code == 403:
print(f"WARN: --post-comment 403 (token scope) — verdict={verdict}; skipping comment-post", file=sys.stderr)
else:
raise
return result
except GiteaError as e:
result = {"verdict": "ERROR", "error": str(e), "repo": repo, "pr": pr_number}
print(json.dumps(result, indent=2), file=sys.stderr)
return result
def main() -> int:
parser = argparse.ArgumentParser(description="gate-check-v3 — PR gate detector")
parser.add_argument("--repo", required=True, help="org/repo (e.g. molecule-ai/molecule-core)")
parser.add_argument("--pr", type=int, required=True, help="PR number")
parser.add_argument("--post-comment", action="store_true", help="Post/update comment on PR")
args = parser.parse_args()
result = run(args.repo, args.pr, post_comment=args.post_comment)
verdict = result.get("verdict", "ERROR")
if verdict == "ERROR":
return 2
elif verdict in ("BLOCKED", "CI_FAIL", "STALE-RC", "ERROR"):
return 1
return 0
if __name__ == "__main__":
sys.exit(main())
@@ -983,16 +983,7 @@ func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
WithArgs("dispatched", "", testSourceID, testDelegationID).
WillReturnResult(sqlmock.NewResult(0, 1))
// CanCommunicate: source != target → fires two getWorkspaceRef lookups.
// Both test fixtures have parent_id = NULL (root-level siblings) → allowed.
// Order matches call order: source first, then target.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
WithArgs(testSourceID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testSourceID, nil))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
WithArgs(testTargetID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testTargetID, nil))
// CanCommunicate (source=target self-call is always allowed — no DB lookup needed)
// resolveAgentURL: reads ws:{id}:url from Redis, falls back to DB for target
mock.ExpectQuery("SELECT url, status FROM workspaces WHERE id = ").
WithArgs(testTargetID).
-25
View File
@@ -697,31 +697,6 @@ func (h *OrgHandler) Import(c *gin.Context) {
})
return
}
// Per-workspace RequiredEnv preflight: checks that every RequiredEnv
// declared at the workspace level is covered by either (a) a global
// secret key (already validated above) or (b) a key present in the
// workspace's on-disk .env files (org root .env + per-workspace
// <files_dir>/.env). If neither covers the key the workspace is
// imported NOT CONFIGURED, which silently breaks the workspace at
// start time — the container boots without the required credential
// and every LLM call 401s or fails silently. Issue #232.
// orgBaseDir is empty when importing via body.Template (inline YAML);
// in that case we cannot check .env files, so we skip this check
// and fall back to the global-only gate above (which correctly
// rejects any strict requirement not covered by global_secrets).
if orgBaseDir != "" {
wsMissing := collectPerWorkspaceUnsatisfied(tmpl.Workspaces, orgBaseDir, configured)
if len(wsMissing) > 0 {
c.JSON(http.StatusPreconditionFailed, gin.H{
"error": "missing per-workspace required environment variables",
"missing_workspace_env": wsMissing,
"template": tmpl.Name,
"suggestion": "add these keys to the workspace's .env file or set them as global secrets before importing",
})
return
}
}
}
results := []map[string]interface{}{}
@@ -346,7 +346,7 @@ func (g *gitFetcher) Fetch(ctx context.Context, rootDir, host, repoPath, ref str
// MkdirTemp creates the dir; git clone refuses to clone into a
// non-empty dir. Remove + recreate empty.
os.RemoveAll(tmpDir)
cloneAndConfig := gitArgs("clone", "--quiet", "--depth=1", "-b", ref, cloneURL, tmpDir)
cloneAndConfig := append(gitArgs("clone", "--quiet", "--depth=1", "-b", ref, cloneURL, tmpDir))
cmd := exec.CommandContext(ctx, "git", cloneAndConfig...)
cmd.Env = append(os.Environ(), "GIT_TERMINAL_PROMPT=0")
if out, err := cmd.CombinedOutput(); err != nil {
@@ -941,65 +941,6 @@ func flattenAndSortRequirements(by map[string]EnvRequirement) []EnvRequirement {
// can investigate.
const globalSecretsPreflightLimit = 10000
// PerWorkspaceUnsatisfied describes one per-workspace RequiredEnv that is
// not covered by either a global secret or a key present in the
// corresponding .env file.
type PerWorkspaceUnsatisfied struct {
Workspace string `json:"workspace"`
FilesDir string `json:"files_dir,omitempty"`
Unsatisfied EnvRequirement `json:"unsatisfied_env"`
}
// collectPerWorkspaceUnsatisfied recursively walks workspaces and returns
// per-workspace RequiredEnv entries that are not covered by (a) a global
// secret key or (b) a key present in the workspace's .env file(s) (org root
// .env + per-workspace <files_dir>/.env). This complements
// collectOrgEnv + loadConfiguredGlobalSecretKeys, which together only
// validate global-level RequiredEnv against global_secrets. The .env
// lookup mirrors the runtime resolution in createWorkspaceTree so that
// the preflight result matches what the container actually receives at
// start time.
func collectPerWorkspaceUnsatisfied(workspaces []OrgWorkspace, orgBaseDir string, globalSecrets map[string]struct{}) []PerWorkspaceUnsatisfied {
var out []PerWorkspaceUnsatisfied
var walk func([]OrgWorkspace)
walk = func(wsList []OrgWorkspace) {
for _, ws := range wsList {
// Build the set of keys available to this workspace from .env.
// This is the same three-source stack that createWorkspaceTree
// injects into the container:
// 1. Org root .env (parseEnvFile, no filesDir)
// 2. Workspace <files_dir>/.env (if filesDir is set)
// 3. Persona bootstrap env (MOLECULE_PERSONA_ROOT/<filesDir>/env)
// Items 1+2 are on-disk and testable; item 3 is host-only and
// skipped here (persona env does NOT satisfy required_env —
// it carries identity tokens, not workspace LLM keys).
envFromFiles := loadWorkspaceEnv(orgBaseDir, ws.FilesDir)
// Convert map[string]string (from .env files) to map[string]struct{}
// to match IsSatisfied's signature.
envSet := make(map[string]struct{}, len(envFromFiles))
for k := range envFromFiles {
envSet[k] = struct{}{}
}
for _, req := range ws.RequiredEnv {
if req.IsSatisfied(globalSecrets) {
continue // covered by a global secret
}
if req.IsSatisfied(envSet) {
continue // covered by a per-workspace .env file
}
out = append(out, PerWorkspaceUnsatisfied{
Workspace: ws.Name,
FilesDir: ws.FilesDir,
Unsatisfied: req,
})
}
walk(ws.Children)
}
}
walk(workspaces)
return out
}
func loadConfiguredGlobalSecretKeys(ctx context.Context) (map[string]struct{}, error) {
rows, err := db.DB.QueryContext(ctx,
`SELECT key FROM global_secrets WHERE octet_length(encrypted_value) > 0 LIMIT $1`,
@@ -1,226 +0,0 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
// TestCollectPerWorkspaceUnsatisfied_BothFiles covers the case where a key
// is present in both the org root .env and the workspace-specific .env. Both
// should satisfy the requirement (no entry in output).
func TestCollectPerWorkspaceUnsatisfied_BothFiles(t *testing.T) {
tmp := t.TempDir()
writeEnvFile(t, tmp, ".env", "PER_WS_KEY=globalvalue")
writeEnvFile(t, tmp, "ws-a/.env", "PER_WS_KEY=wsvalue")
workspaces := []OrgWorkspace{
{Name: "ws-a", FilesDir: "ws-a", RequiredEnv: []EnvRequirement{{Name: "PER_WS_KEY"}}},
}
// Global secret covers it.
globals := map[string]struct{}{"PER_WS_KEY": {}}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 0 {
t.Errorf("PER_WS_KEY present in global + .env: should be satisfied, got %d missing", len(missing))
}
}
// TestCollectPerWorkspaceUnsatisfied_WorkspaceEnvOnly covers a key present
// only in the workspace-specific .env file (not global). Should be satisfied.
func TestCollectPerWorkspaceUnsatisfied_WorkspaceEnvOnly(t *testing.T) {
tmp := t.TempDir()
writeEnvFile(t, tmp, "dev-lead/.env", "WORKSPACE_KEY=val")
workspaces := []OrgWorkspace{
{Name: "Dev Lead", FilesDir: "dev-lead", RequiredEnv: []EnvRequirement{{Name: "WORKSPACE_KEY"}}},
}
globals := map[string]struct{}{} // nothing in global
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 0 {
t.Errorf("WORKSPACE_KEY in ws .env only: should be satisfied, got %d missing", len(missing))
}
}
// TestCollectPerWorkspaceUnsatisfied_OrgRootEnvOnly covers a key present
// only in the org root .env file (not per-workspace). Should be satisfied.
func TestCollectPerWorkspaceUnsatisfied_OrgRootEnvOnly(t *testing.T) {
tmp := t.TempDir()
writeEnvFile(t, tmp, ".env", "ORG_ROOT_KEY=val")
workspaces := []OrgWorkspace{
{Name: "ws-b", FilesDir: "ws-b", RequiredEnv: []EnvRequirement{{Name: "ORG_ROOT_KEY"}}},
}
globals := map[string]struct{}{}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 0 {
t.Errorf("ORG_ROOT_KEY in org root .env only: should be satisfied, got %d missing", len(missing))
}
}
// TestCollectPerWorkspaceUnsatisfied_GlobalCovers checks that a global
// secret alone satisfies a per-workspace RequiredEnv even when the .env
// files don't have the key.
func TestCollectPerWorkspaceUnsatisfied_GlobalCovers(t *testing.T) {
tmp := t.TempDir()
// No .env files at all.
workspaces := []OrgWorkspace{
{Name: "ws-c", RequiredEnv: []EnvRequirement{{Name: "GLOBAL_COVERED"}}},
}
globals := map[string]struct{}{"GLOBAL_COVERED": {}}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 0 {
t.Errorf("GLOBAL_COVERED satisfied by global: should be satisfied, got %d missing", len(missing))
}
}
// TestCollectPerWorkspaceUnsatisfied_Missing covers the core bug: a
// RequiredEnv declared at the workspace level where the key is absent from
// both global_secrets and the .env file. The import MUST return 412.
func TestCollectPerWorkspaceUnsatisfied_Missing(t *testing.T) {
tmp := t.TempDir()
// No .env files at all.
workspaces := []OrgWorkspace{
{Name: "Dev Lead", FilesDir: "dev-lead", RequiredEnv: []EnvRequirement{{Name: "MISSING_REQUIRED_KEY"}}},
}
globals := map[string]struct{}{} // no global secret
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 1 {
t.Fatalf("expected 1 missing entry, got %d", len(missing))
}
if missing[0].Workspace != "Dev Lead" {
t.Errorf("expected workspace 'Dev Lead', got %q", missing[0].Workspace)
}
if missing[0].Unsatisfied.Name != "MISSING_REQUIRED_KEY" {
t.Errorf("expected unsatisfied key 'MISSING_REQUIRED_KEY', got %q", missing[0].Unsatisfied.Name)
}
if missing[0].FilesDir != "dev-lead" {
t.Errorf("expected files_dir 'dev-lead', got %q", missing[0].FilesDir)
}
}
// TestCollectPerWorkspaceUnsatisfied_AnyOfGroup covers an any-of group where
// none of the alternatives are present in global or .env. Should report
// the group as unsatisfied.
func TestCollectPerWorkspaceUnsatisfied_AnyOfGroup(t *testing.T) {
tmp := t.TempDir()
workspaces := []OrgWorkspace{
{
Name: "Claude Bot",
FilesDir: "claude-bot",
RequiredEnv: []EnvRequirement{
{AnyOf: []string{"ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN"}},
},
},
}
globals := map[string]struct{}{}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 1 {
t.Fatalf("expected 1 missing any-of entry, got %d", len(missing))
}
if missing[0].Workspace != "Claude Bot" {
t.Errorf("expected workspace 'Claude Bot', got %q", missing[0].Workspace)
}
if len(missing[0].Unsatisfied.AnyOf) != 2 {
t.Errorf("expected any-of group with 2 members, got %v", missing[0].Unsatisfied.AnyOf)
}
}
// TestCollectPerWorkspaceUnsatisfied_NestedChildren covers grandchildren
// workspaces that also declare RequiredEnv. The recursive walk must visit
// children and grandchildren.
func TestCollectPerWorkspaceUnsatisfied_NestedChildren(t *testing.T) {
tmp := t.TempDir()
workspaces := []OrgWorkspace{
{
Name: "Root",
Children: []OrgWorkspace{
{
Name: "Child",
Children: []OrgWorkspace{
{Name: "Grandchild", FilesDir: "grandchild", RequiredEnv: []EnvRequirement{{Name: "DEEP_KEY"}}},
},
},
},
},
}
globals := map[string]struct{}{}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 1 {
t.Fatalf("expected 1 missing entry from grandchild, got %d", len(missing))
}
if missing[0].Workspace != "Grandchild" {
t.Errorf("expected 'Grandchild', got %q", missing[0].Workspace)
}
}
// TestCollectPerWorkspaceUnsatisfied_EmptyOrgBaseDir covers the case where
// orgBaseDir is empty (inline template import). No .env files can be
// checked, so missing keys cannot be attributed to .env absence. The
// function should NOT crash and should only report entries satisfiable
// by global (all missing since globals is empty).
func TestCollectPerWorkspaceUnsatisfied_EmptyOrgBaseDir(t *testing.T) {
workspaces := []OrgWorkspace{
{Name: "ws-x", RequiredEnv: []EnvRequirement{{Name: "KEY_X"}}},
}
globals := map[string]struct{}{}
missing := collectPerWorkspaceUnsatisfied(workspaces, "", globals)
// With no orgBaseDir and no global, KEY_X must be reported missing.
if len(missing) != 1 {
t.Errorf("expected 1 missing with empty orgBaseDir, got %d", len(missing))
}
}
// TestCollectPerWorkspaceUnsatisfied_MultipleWorkspaces reports only the
// workspace whose RequiredEnv is unsatisfied, not the whole batch.
func TestCollectPerWorkspaceUnsatisfied_MultipleWorkspaces(t *testing.T) {
tmp := t.TempDir()
writeEnvFile(t, tmp, "ws-ok/.env", "OK_KEY=val")
workspaces := []OrgWorkspace{
{Name: "ws-ok", FilesDir: "ws-ok", RequiredEnv: []EnvRequirement{{Name: "OK_KEY"}}},
{Name: "ws-missing", FilesDir: "ws-missing", RequiredEnv: []EnvRequirement{{Name: "BAD_KEY"}}},
}
globals := map[string]struct{}{}
missing := collectPerWorkspaceUnsatisfied(workspaces, tmp, globals)
if len(missing) != 1 {
t.Errorf("expected exactly 1 missing (BAD_KEY), got %d", len(missing))
}
if missing[0].Workspace != "ws-missing" {
t.Errorf("expected missing workspace 'ws-missing', got %q", missing[0].Workspace)
}
}
// writeEnvFile is a test helper that creates a .env file at the given path
// with the given content.
func writeEnvFile(t *testing.T, baseDir, relPath, content string) {
t.Helper()
fullPath := filepath.Join(baseDir, relPath)
if err := os.MkdirAll(filepath.Dir(fullPath), 0755); err != nil {
t.Fatalf("mkdirAll: %v", err)
}
if err := os.WriteFile(fullPath, []byte(content), 0644); err != nil {
t.Fatalf("writeFile %s: %v", fullPath, err)
}
}
@@ -8,7 +8,6 @@ import (
"context"
"database/sql"
"encoding/json"
"errors"
"fmt"
"log"
"net/http"
@@ -286,51 +285,17 @@ func (h *WorkspaceHandler) Create(c *gin.Context) {
c.JSON(http.StatusBadRequest, gin.H{"error": "delivery_mode must be 'push' or 'poll'"})
return
}
// Insert workspace with runtime + delivery_mode persisted in DB (inside transaction).
//
// Auto-suffix on (parent_id, name) collision via insertWorkspaceWithNameRetry:
// the partial-unique index `workspaces_parent_name_uniq` (migration
// 20260506000000) protects /org/import from TOCTOU duplicates, but the
// pre-fix Canvas Create path bubbled the raw pq violation as a 500 on
// double-click. Helper retries with " (2)", " (3)", … up to maxNameSuffix,
// returns the actually-persisted name (which we MUST thread back into
// payload + broadcast so the canvas displays what the DB has).
const insertWorkspaceSQL = `
// Insert workspace with runtime + delivery_mode persisted in DB (inside transaction)
_, err := tx.ExecContext(ctx, `
INSERT INTO workspaces (id, name, role, tier, runtime, awareness_namespace, status, parent_id, workspace_dir, workspace_access, budget_limit, max_concurrent_tasks, delivery_mode)
VALUES ($1, $2, $3, $4, $5, $6, 'provisioning', $7, $8, $9, $10, $11, $12)
`
insertArgs := []any{id, payload.Name, role, payload.Tier, payload.Runtime, awarenessNamespace, payload.ParentID, workspaceDir, workspaceAccess, payload.BudgetLimit, maxConcurrent, deliveryMode}
persistedName, currentTx, err := insertWorkspaceWithNameRetry(
ctx,
tx,
// Closure captures ctx so the retry tx uses the same request context;
// nil opts mirrors the original BeginTx call above.
func(ctx context.Context) (*sql.Tx, error) { return db.DB.BeginTx(ctx, nil) },
payload.Name,
1, // args[1] is name
insertWorkspaceSQL,
insertArgs,
)
`, id, payload.Name, role, payload.Tier, payload.Runtime, awarenessNamespace, payload.ParentID, workspaceDir, workspaceAccess, payload.BudgetLimit, maxConcurrent, deliveryMode)
if err != nil {
if currentTx != nil {
currentTx.Rollback() //nolint:errcheck
}
if errors.Is(err, errWorkspaceNameExhausted) {
log.Printf("Create workspace: name suffix exhausted for base %q under parent %v", payload.Name, payload.ParentID)
c.JSON(http.StatusConflict, gin.H{"error": "workspace name already in use; please pick a different name"})
return
}
tx.Rollback() //nolint:errcheck
log.Printf("Create workspace error: %v", err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to create workspace"})
return
}
// Helper may have rolled back the original tx and returned a fresh one;
// rebind so the remaining secrets-INSERT + Commit run on the live tx.
tx = currentTx
if persistedName != payload.Name {
log.Printf("Create workspace %s: name collision auto-suffix %q -> %q", id, payload.Name, persistedName)
payload.Name = persistedName
}
// Persist initial secrets from the create payload (inside same transaction).
// nil/empty map is a no-op. Any failure rolls back the workspace insert

Some files were not shown because too many files have changed in this diff Show More