Compare commits

1 Commits

Author SHA1 Message Date
Hongming Wang 227787bbbd test: bash coverage for entrypoint.sh log_boot_context()
CI / validate (push) Failing after 0s
CI / Adapter unit tests (push) Failing after 6s
The Python adapter audit (test_adapter_logging.py) pins the
adapter.py side, but the entrypoint shell function fires earlier and
twice (pre-gosu + post-gosu). When the SDK import wedge keeps the
adapter from running at all, the shell emission is the operator's
only visibility into the boot env.

Eight new tests cover:
- env NAME=set / env NAME=unset shape for every audited var
- value-leak guard: secret strings never appear in output
- WORKSPACE_ID + PLATFORM_URL passthrough by value (not secret)
- <unset> fallback for missing platform identifiers
- uid/gid line shape (used to verify the privilege drop)
- dated boot banner shape (used to count restarts in a crash loop)
- cross-file gate: shell for-loop names == fixture tuple, mirroring
  test_audit_env_list_matches_entrypoint_sh's adapter.py↔shell gate

Strategy: regex-extract the function body from entrypoint.sh and run
it in a fresh /bin/sh with controlled env. We never source the whole
entrypoint because it would chown /workspace and exec molecule-runtime.

Closes the gap from task #251 (follow-up to PR #32 boot-debug logging).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:26:35 -07:00
6 changed files with 228 additions and 235 deletions
+1 -1
@@ -2,7 +2,7 @@ name: CI
 on: [push, pull_request]
 jobs:
   validate:
-    uses: molecule-ai/molecule-ci/.github/workflows/validate-workspace-template.yml@main
+    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@main
   tests:
     name: Adapter unit tests
+8 -41
@@ -32,47 +32,14 @@ permissions:
   packages: write
 jobs:
-  # The `.runtime-version` file is the push-mode cascade signal post-
-  # 2026-05-06: when molecule-core/publish-runtime.yml ships a new
-  # version to PyPI, it does NOT call repository_dispatch (Gitea 1.22.6
-  # has no such endpoint — empirically verified molecule-core#20).
-  # Instead it git-pushes an updated `.runtime-version` to each template,
-  # which trips this workflow's `on: push: branches: [main]` trigger.
-  # This job reads that file and forwards the version to the reusable
-  # build workflow so the Dockerfile pip-installs the exact published
-  # version, not whatever requirements.txt currently bounds.
-  resolve-version:
-    runs-on: ubuntu-latest
-    timeout-minutes: 2
-    outputs:
-      version: ${{ steps.read.outputs.version }}
-    steps:
-      - uses: actions/checkout@v4
-      - id: read
-        run: |
-          if [ -f .runtime-version ]; then
-            v="$(head -n1 .runtime-version | tr -d '[:space:]')"
-            echo "version=$v" >> "$GITHUB_OUTPUT"
-            echo "resolved runtime version: $v"
-          else
-            echo "no .runtime-version file present — falling through to Dockerfile default"
-          fi
   publish:
-    needs: resolve-version
-    uses: molecule-ai/molecule-ci/.github/workflows/publish-template-image.yml@main
+    uses: Molecule-AI/molecule-ci/.github/workflows/publish-template-image.yml@main
     secrets: inherit
     with:
-      # Resolution chain (highest priority first):
-      # 1. client_payload.runtime_version — legacy GitHub
-      # repository_dispatch path (will return if Gitea ever adds
-      # the dispatch API; left in place for forward-compat).
-      # 2. inputs.runtime_version manual workflow_dispatch run from
-      # the Actions UI for ad-hoc rebuilds against a specific
-      # version.
-      # 3. needs.resolve-version.outputs.version — the
-      # `.runtime-version` file in this repo, written by
-      # molecule-core/publish-runtime.yml's push-mode cascade.
-      # 4. '' — fall through to the Dockerfile default
-      # (requirements.txt pin).
-      runtime_version: ${{ github.event.client_payload.runtime_version || inputs.runtime_version || needs.resolve-version.outputs.version || '' }}
+      # When the cascade fires, client_payload.runtime_version is the
+      # exact version PyPI just published. Forwarded to the reusable
+      # workflow as a docker --build-arg so the cache key changes
+      # per-version and pip install resolves freshly.
+      # On other events (push to main / manual without input), this is
+      # empty and the Dockerfile's default (requirements.txt pin) applies.
+      runtime_version: ${{ github.event.client_payload.runtime_version || inputs.runtime_version || '' }}
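The `runtime_version` expression in this hunk is a first-non-empty fallback chain. As a rough illustration only, the same selection can be sketched in Python; the function name and parameter names below are hypothetical and do not appear in the workflow, and the sketch assumes that an empty or missing value is falsy, as it is in workflow expressions.

```python
def resolve_runtime_version(
    client_payload: str = "",
    manual_input: str = "",
    file_version: str = "",
) -> str:
    """Return the first non-empty candidate, highest priority first.

    Hypothetical helper mirroring an `a || b || c || ''` expression:
    an empty string means "not provided" and falls through to the next
    candidate; if nothing is provided the result is the empty string.
    """
    return client_payload or manual_input or file_version or ""


if __name__ == "__main__":
    # Only the file-based version is available.
    print(resolve_runtime_version(file_version="0.1.129"))
    # A dispatch payload outranks everything else.
    print(resolve_runtime_version("1.2.3", "9.9.9", "0.1.129"))
```

With the third candidate dropped, as on the added side of the hunk, the chain reduces to the first two arguments and otherwise yields the empty string.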
+12 -191
@@ -1,201 +1,22 @@
 name: Secret scan
-# Hard CI gate. Refuses any PR / push whose diff additions contain a
-# recognisable credential. Defense-in-depth for the #2090-class incident
-# (2026-04-24): GitHub's hosted Copilot Coding Agent leaked a ghs_*
-# installation token into tenant-proxy/package.json via `npm init`
-# slurping the URL from a token-embedded origin remote. We can't fix
-# upstream's clone hygiene, so we gate here.
+# Calls the canonical reusable workflow in molecule-core. Defense
+# against the #2090-class leak (a hosted-agent commit slipping a
+# credential-shaped string into a PR). Pattern set lives in
+# molecule-core so we do not maintain a parallel copy here.
 #
-# Inlined copy from molecule-ai/molecule-core/.github/workflows/secret-scan.yml.
-# Cross-repo workflow_call to a private repo doesn't fully work on Gitea 1.22.6
-# (workflow file fails parse-time at 0s with no logs); inline keeps the gate
-# functional until Gitea is upgraded or the canonical scanner moves to a public
-# repo. When that lands, this file reverts to the 3-line wrapper:
-#
-# jobs:
-#   secret-scan:
-#     uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
-#
-# Pin to @staging not @main — staging is the active default branch,
-# main lags via the staging-promotion workflow. Updates ride along
-# automatically on the next consumer workflow run.
-#
-# Same regex set as the runtime's bundled pre-commit hook
-# (molecule-ai-workspace-runtime: molecule_runtime/scripts/pre-commit-checks.sh).
-# Keep the two sides aligned when adding patterns.
+# Pinned to @staging because that is the active default branch on the
+# upstream repo (main lags behind via the staging-promotion workflow).
+# Updates ride along automatically as the upstream regex set evolves.
 on:
   pull_request:
     types: [opened, synchronize, reopened]
   push:
-    branches: [main, staging]
+    branches: [main, staging, master]
   merge_group:
     types: [checks_requested]
 jobs:
-  scan:
-    name: Scan diff for credential-shaped strings
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-        with:
-          fetch-depth: 2 # need previous commit to diff against on push events
-      # For pull_request events the diff base may be many commits behind
-      # HEAD and absent from the shallow clone. Fetch it explicitly.
-      - name: Fetch PR base SHA (pull_request events only)
-        if: github.event_name == 'pull_request'
-        run: git fetch --depth=1 origin ${{ github.event.pull_request.base.sha }}
-      # For merge_group events the queue's pre-merge ref is a commit on
-      # `gh-readonly-queue/...` whose parent is the queue's base_sha.
-      # That parent isn't part of the queue branch's shallow clone, so
-      # we fetch it explicitly. Without this the diff falls through to
-      # "no BASE → scan entire tree" mode and false-positives on legit
-      # test fixtures (e.g. canvas/src/lib/validation/__tests__/secret-formats.test.ts).
-      - name: Refuse if credential-shaped strings appear in diff additions
-        env:
-          # Plumb event-specific SHAs through env so the script doesn't
-          # need conditional `${{ ... }}` interpolation per event type.
-          # github.event.before/after only exist on push events;
-          # merge_group has its own base_sha/head_sha; pull_request has
-          # pull_request.base.sha / pull_request.head.sha.
-          PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
-          PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }}
-          PUSH_BEFORE: ${{ github.event.before }}
-          PUSH_AFTER: ${{ github.event.after }}
-        run: |
-          # Pattern set covers GitHub family (the actual #2090 vector),
-          # Anthropic / OpenAI / Slack / AWS. Anchored on prefixes with low
-          # false-positive rates against agent-generated content. Mirror of
-          # molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh
-          # — keep aligned.
-          SECRET_PATTERNS=(
-            'ghp_[A-Za-z0-9]{36,}'  # GitHub PAT (classic)
-            'ghs_[A-Za-z0-9]{36,}'  # GitHub App installation token
-            'gho_[A-Za-z0-9]{36,}'  # GitHub OAuth user-to-server
-            'ghu_[A-Za-z0-9]{36,}'  # GitHub OAuth user
-            'ghr_[A-Za-z0-9]{36,}'  # GitHub OAuth refresh
-            'github_pat_[A-Za-z0-9_]{82,}'  # GitHub fine-grained PAT
-            'sk-ant-[A-Za-z0-9_-]{40,}'  # Anthropic API key
-            'sk-proj-[A-Za-z0-9_-]{40,}'  # OpenAI project key
-            'sk-svcacct-[A-Za-z0-9_-]{40,}'  # OpenAI service-account key
-            'sk-cp-[A-Za-z0-9_-]{60,}'  # MiniMax API key (F1088 vector — caught only after the fact)
-            'xox[baprs]-[A-Za-z0-9-]{20,}'  # Slack tokens
-            'AKIA[0-9A-Z]{16}'  # AWS access key ID
-            'ASIA[0-9A-Z]{16}'  # AWS STS temp access key ID
-          )
-          # Determine the diff base. Each event type stores its SHAs in
-          # a different place — see the env block above.
-          case "${{ github.event_name }}" in
-            pull_request)
-              BASE="$PR_BASE_SHA"
-              HEAD="$PR_HEAD_SHA"
-              ;;
-            *)
-              BASE="$PUSH_BEFORE"
-              HEAD="$PUSH_AFTER"
-              ;;
-          esac
-          # On push events with shallow clones, BASE may be present in
-          # the event payload but absent from the local object DB
-          # (fetch-depth=2 doesn't always reach the previous commit
-          # across true merges). Try fetching it on demand. If the
-          # fetch fails — e.g. the SHA was force-overwritten — we fall
-          # through to the empty-BASE branch below, which scans the
-          # entire tree as if every file were new. Correct, just slow.
-          if [ -n "$BASE" ] && ! echo "$BASE" | grep -qE '^0+$'; then
-            if ! git cat-file -e "$BASE" 2>/dev/null; then
-              git fetch --depth=1 origin "$BASE" 2>/dev/null || true
-            fi
-          fi
-          # Files added or modified in this change.
-          if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$' || ! git cat-file -e "$BASE" 2>/dev/null; then
-            # New branch / no previous SHA / BASE unreachable — check the
-            # entire tree as added content. Slower, but correct on first
-            # push.
-            CHANGED=$(git ls-tree -r --name-only HEAD)
-            DIFF_RANGE=""
-          else
-            CHANGED=$(git diff --name-only --diff-filter=AM "$BASE" "$HEAD")
-            DIFF_RANGE="$BASE $HEAD"
-          fi
-          if [ -z "$CHANGED" ]; then
-            echo "No changed files to inspect."
-            exit 0
-          fi
-          # Self-exclude: this workflow file legitimately contains the
-          # pattern strings as regex literals. Without an exclude it would
-          # block its own merge.
-          SELF=".github/workflows/secret-scan.yml"
-          OFFENDING=""
-          # `while IFS= read -r` (not `for f in $CHANGED`) so filenames
-          # containing whitespace don't word-split silently — a path
-          # with a space would otherwise produce two iterations on
-          # tokens that aren't real filenames, breaking the
-          # self-exclude + diff lookup.
-          while IFS= read -r f; do
-            [ -z "$f" ] && continue
-            [ "$f" = "$SELF" ] && continue
-            if [ -n "$DIFF_RANGE" ]; then
-              ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true)
-            else
-              # No diff range (new branch first push) — scan the full file
-              # contents as if every line were new.
-              ADDED=$(cat "$f" 2>/dev/null || true)
-            fi
-            [ -z "$ADDED" ] && continue
-            for pattern in "${SECRET_PATTERNS[@]}"; do
-              if echo "$ADDED" | grep -qE "$pattern"; then
-                OFFENDING="${OFFENDING}${f} (matched: ${pattern})\n"
-                break
-              fi
-            done
-          done <<< "$CHANGED"
-          if [ -n "$OFFENDING" ]; then
-            echo "::error::Credential-shaped strings detected in diff additions:"
-            # `printf '%b' "$OFFENDING"` interprets backslash escapes
-            # (the literal `\n` we appended above becomes a newline)
-            # WITHOUT treating OFFENDING as a format string. Plain
-            # `printf "$OFFENDING"` is a format-string sink: a filename
-            # containing `%` would be interpreted as a conversion
-            # specifier, corrupting the error message (or printing
-            # `%(missing)` artifacts).
-            printf '%b' "$OFFENDING"
-            echo ""
-            echo "The actual matched values are NOT echoed here, deliberately —"
-            echo "round-tripping a leaked credential into CI logs widens the blast"
-            echo "radius (logs are searchable + retained)."
-            echo ""
-            echo "Recovery:"
-            echo " 1. Remove the secret from the file. Replace with an env var"
-            echo " reference (e.g. \${{ secrets.GITHUB_TOKEN }} in workflows,"
-            echo " process.env.X in code)."
-            echo " 2. If the credential was already pushed (this PR's commit"
-            echo " history reaches a public ref), treat it as compromised —"
-            echo " ROTATE it immediately, do not just remove it. The token"
-            echo " remains valid in git history forever and may be in any"
-            echo " log/cache that consumed this branch."
-            echo " 3. Force-push the cleaned commit (or stack a revert) and"
-            echo " re-run CI."
-            echo ""
-            echo "If the match is a false positive (test fixture, docs example,"
-            echo "or this workflow's own regex literals): use a clearly-fake"
-            echo "placeholder like ghs_EXAMPLE_DO_NOT_USE that doesn't satisfy"
-            echo "the length suffix, OR add the file path to the SELF exclude"
-            echo "list in this workflow with a short reason."
-            echo ""
-            echo "Mirror of the regex set lives in the runtime's bundled"
-            echo "pre-commit hook (molecule-ai-workspace-runtime:"
-            echo "molecule_runtime/scripts/pre-commit-checks.sh) — keep aligned."
-            exit 1
-          fi
-          echo "✓ No credential-shaped strings in this change."
+  secret-scan:
+    uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
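The core check of the inline scanner removed in this hunk (match only the added lines of a diff against a set of credential-shaped patterns, and report the pattern rather than the matched value) can be sketched in Python. This is an illustrative sketch, not the canonical scanner: the function names are hypothetical, and only a subset of the patterns listed above is included.

```python
import re
from typing import List, Optional

# Subset of the prefixes from the removed inline scanner (illustration only).
SECRET_PATTERNS = [
    re.compile(r"ghp_[A-Za-z0-9]{36,}"),       # GitHub PAT (classic)
    re.compile(r"ghs_[A-Za-z0-9]{36,}"),       # GitHub App installation token
    re.compile(r"sk-ant-[A-Za-z0-9_-]{40,}"),  # Anthropic API key
    re.compile(r"AKIA[0-9A-Z]{16}"),           # AWS access key ID
]


def added_lines(unified_diff: str) -> List[str]:
    """Keep only added lines, skipping the '+++' file header.

    Roughly the same filter as the shell version's grep for a leading
    plus sign that is not followed by a second plus sign.
    """
    return [
        line[1:]
        for line in unified_diff.splitlines()
        if line.startswith("+") and not line.startswith("++")
    ]


def first_match(unified_diff: str) -> Optional[str]:
    """Return the source of the first pattern that matches an added line.

    The matched value itself is deliberately never returned, so a leaked
    credential is not copied into logs.
    """
    for line in added_lines(unified_diff):
        for pattern in SECRET_PATTERNS:
            if pattern.search(line):
                return pattern.pattern
    return None


if __name__ == "__main__":
    fake = "ghp_" + "A" * 36  # clearly fake, built at runtime
    diff = "--- a/cfg\n+++ b/cfg\n-old\n+token = " + fake + "\n"
    print(first_match(diff))
```

Removed lines are ignored on purpose: deleting a credential-shaped string should not block the change that cleans it up.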
-1
@@ -1 +0,0 @@
0.1.129
+1 -1
@@ -24,7 +24,7 @@ common problems.
 ## Step 1 — Clone the Repository
 ```bash
-git clone https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code.git
+git clone https://github.com/Molecule-AI/molecule-ai-workspace-template-claude-code.git
 cd molecule-ai-workspace-template-claude-code
 ```
+206
@@ -0,0 +1,206 @@
r"""Tests for entrypoint.sh's log_boot_context() shell function.
The Python-side audit (test_adapter_logging.py) pins what `_audit_auth_env_presence`
in adapter.py emits. But the shell function fires FIRST — twice, even (once
pre-gosu as root, once post-gosu as agent). When the adapter never runs at
all because the SDK import fails, the entrypoint emission is the operator's
ONLY visibility into the boot env. So this contract needs its own test.
The cross-file gate `test_audit_env_list_matches_entrypoint_sh` proves the
NAME LIST matches; this file proves the SHELL CODE actually emits the
right lines for those names. Without this, a typo in the for-loop body
(e.g. `eval "val=\$$var"` → `val=$var`, which would print the literal
name not its value) silently breaks the audit.
Strategy: extract the `log_boot_context()` function body from entrypoint.sh
and run it in a fresh subprocess with controlled env. Asserts on stdout.
We never source entrypoint.sh wholesale because it would chown /workspace
and exec molecule-runtime — neither is appropriate in a test sandbox.
"""
from __future__ import annotations
import os
import re
import subprocess
from pathlib import Path
import pytest
TEMPLATE_DIR = Path(__file__).resolve().parent.parent
ENTRYPOINT = TEMPLATE_DIR / "entrypoint.sh"
def _extract_function() -> str:
    """Pull just the log_boot_context() function definition out of entrypoint.sh.

    Returns the literal function definition (`log_boot_context() { ... }`) as
    a string, suitable for `sh -c "<func>; log_boot_context"`. Bails with a
    clear message if the function can't be located — that itself is a
    regression worth a loud test failure.
    """
    text = ENTRYPOINT.read_text()
    # `log_boot_context() {` on its own line, then everything up to the
    # matching closing `}` at column 0. The function is small and shape-stable;
    # we don't try to be a full shell parser.
    match = re.search(r"^log_boot_context\(\)\s*\{.*?^\}\s*$", text, re.DOTALL | re.MULTILINE)
    if not match:
        pytest.fail("Could not locate log_boot_context() in entrypoint.sh")
    return match.group(0)
def _run_function(env: dict[str, str]) -> str:
    """Run log_boot_context() in a fresh /bin/sh with the given env. Returns stdout."""
    func = _extract_function()
    script = f"{func}\nlog_boot_context\n"
    # Start from an empty env and restore only PATH, so command lookups
    # (`id`, `hostname`, `date`, `ls`) still work but no inherited auth
    # vars leak into the test.
    safe_env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    safe_env.update(env)
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        env=safe_env,
        capture_output=True,
        text=True,
        timeout=10,
        check=False,
    )
    assert result.returncode == 0, (
        f"log_boot_context exited rc={result.returncode}\n"
        f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
    )
    return result.stdout
# Audit names — kept in lockstep with adapter.py's _AUTH_ENV_AUDIT and the
# entrypoint.sh for-loop. test_audit_env_list_matches_entrypoint_sh and
# test_loop_var_list_matches_audit (below) gate any drift across the three
# locations.
_AUDIT_NAMES = (
    "CLAUDE_CODE_OAUTH_TOKEN",
    "ANTHROPIC_API_KEY",
    "ANTHROPIC_AUTH_TOKEN",
    "ANTHROPIC_BASE_URL",
    "MINIMAX_API_KEY",
    "GLM_API_KEY",
    "KIMI_API_KEY",
    "DEEPSEEK_API_KEY",
)
def test_emits_set_for_present_env():
    """A set var must produce `env NAME=set` — proves the eval-deref works."""
    out = _run_function({"MINIMAX_API_KEY": "secret-MUST-NOT-LEAK"})
    assert "env MINIMAX_API_KEY=set" in out


def test_emits_unset_for_absent_env():
    """An unset var must produce `env NAME=unset` — proves the empty-string branch."""
    out = _run_function({})
    for name in _AUDIT_NAMES:
        assert f"env {name}=unset" in out, (
            f"missing `env {name}=unset` line — for-loop body may be miscoded"
        )
def test_never_leaks_value():
    """The audit prints NAMES, not VALUES. Regression here = secret leak.

    Same threat model as the Python-side test: an operator-visible boot log
    that contains the actual key would defeat the whole point of the audit
    (the audit exists so we can answer 'is the key present' WITHOUT exposing
    the key). An `eval "val=\\$$var"` typo collapsing to `echo $var` would
    trip this test.
    """
    secret = "sk-FAKE-MUST-NEVER-APPEAR-IN-BOOT-LOG"
    out = _run_function({
        "MINIMAX_API_KEY": secret,
        "CLAUDE_CODE_OAUTH_TOKEN": secret,
        "ANTHROPIC_BASE_URL": "https://api.example.com",
    })
    assert secret not in out, f"boot-context log leaked the env VALUE:\n{out}"
    # ANTHROPIC_BASE_URL is the most-likely-to-be-logged-by-mistake field
    # because operators sometimes WANT to see it; pin that it's still
    # name-only.
    assert "https://api.example.com" not in out
def test_emits_workspace_id_and_platform_url():
    """WORKSPACE_ID and PLATFORM_URL appear by VALUE — these are not secrets.

    They're the operator-visible identifiers a support engineer needs to
    correlate logs with platform records. Pinning the field shape so a
    later refactor doesn't accidentally redact them.
    """
    out = _run_function({
        "WORKSPACE_ID": "ws-test-1234",
        "PLATFORM_URL": "https://test.example.com",
    })
    assert "workspace_id=ws-test-1234" in out
    assert "platform_url=https://test.example.com" in out
def test_emits_unset_marker_when_workspace_id_missing():
    """Missing WORKSPACE_ID falls back to the literal `<unset>` placeholder.

    A support engineer reading the boot log must be able to see at a glance
    that a platform identifier is missing instead of puzzling over a blank
    field. Note that the shell `${VAR:-<unset>}` default fires for both an
    unset and an empty variable, so 'never injected by the platform' and
    'injected as an empty string' are reported identically.
    """
    out = _run_function({})
    assert "workspace_id=<unset>" in out
    assert "platform_url=<unset>" in out
def test_emits_uid_and_gid():
    """uid/gid line is critical — answers 'did the privilege drop happen?'

    The two-emission pattern (pre-gosu as root, post-gosu as agent) only
    works as a diagnostic if uid/gid is in every emission. Pin the field
    shape; we don't pin the literal value because CI runs vary.
    """
    out = _run_function({})
    assert re.search(r"uid=\d+\s+gid=\d+", out), (
        f"missing or malformed uid/gid line:\n{out}"
    )
def test_emits_boot_marker():
    """Each emission starts with the dated `entrypoint boot` banner.

    Operators grep for this to count restarts in a crash loop.
    """
    out = _run_function({})
    # Format: "----- entrypoint boot 2026-05-02T12:34:56Z -----"
    assert re.search(
        r"-----\s+entrypoint boot \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\s+-----",
        out,
    ), f"missing boot banner:\n{out}"
def test_loop_var_list_matches_audit():
    """The for-loop's literal NAME list must match _AUDIT_NAMES (this file).

    Companion to test_audit_env_list_matches_entrypoint_sh in
    test_adapter_logging.py: that test cross-checks adapter.py vs
    entrypoint.sh; this one cross-checks entrypoint.sh vs the test
    fixture above. If a maintainer adds a vendor to entrypoint.sh
    without updating the audit name tuple in this file, the existing
    `test_emits_unset_for_absent_env` would still pass (because all
    audited names also appear in the loop), but the maintainer would
    have a false sense of coverage. This test catches that.
    """
    text = ENTRYPOINT.read_text()
    loop_line = next(
        (line for line in text.splitlines()
         if "for var in" in line and "CLAUDE_CODE_OAUTH_TOKEN" in line),
        None,
    )
    assert loop_line, "entrypoint.sh missing the auth-env for-loop"
    names_in_shell = tuple(
        loop_line.split("for var in", 1)[1].split(";", 1)[0].split()
    )
    assert set(names_in_shell) == set(_AUDIT_NAMES), (
        f"_AUDIT_NAMES in this file ({set(_AUDIT_NAMES)}) and the for-loop "
        f"in entrypoint.sh ({set(names_in_shell)}) disagree — update the "
        "test fixture or the shell loop to bring them back in sync."
    )
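The extract-and-run strategy used by this test file can be tried in isolation. The sketch below is a minimal, self-contained illustration under stated assumptions: `SAMPLE_ENTRYPOINT` is a hypothetical stand-in whose function body is only inferred from the assertions above (the real entrypoint.sh is not part of this diff), and `extract_function` / `run_function` are illustrative names, not the helpers defined in the test file.

```python
import os
import re
import subprocess

# Hypothetical stand-in for entrypoint.sh. The real function body is not
# shown in this diff; this shape is only inferred from the test assertions.
SAMPLE_ENTRYPOINT = r'''#!/bin/sh
log_boot_context() {
  echo "workspace_id=${WORKSPACE_ID:-<unset>}"
  for var in ANTHROPIC_API_KEY MINIMAX_API_KEY; do
    eval "val=\$$var"
    if [ -n "$val" ]; then echo "env $var=set"; else echo "env $var=unset"; fi
  done
}
chown -R agent /workspace   # side effect that must never run in a test
'''


def extract_function(text: str, name: str) -> str:
    """Return `name() { ... }` up to the first closing brace at column 0."""
    pattern = rf"^{re.escape(name)}\(\)\s*\{{.*?^\}}\s*$"
    match = re.search(pattern, text, re.DOTALL | re.MULTILINE)
    if match is None:
        raise ValueError(f"could not locate {name}() in the script text")
    return match.group(0)


def run_function(func_src: str, name: str, env: dict) -> str:
    """Define the function in a fresh /bin/sh, call it, and return stdout."""
    # Only PATH is inherited, so no real credentials reach the subprocess.
    safe_env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
    safe_env.update(env)
    result = subprocess.run(
        ["/bin/sh", "-c", f"{func_src}\n{name}\n"],
        env=safe_env,
        capture_output=True,
        text=True,
        timeout=10,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    body = extract_function(SAMPLE_ENTRYPOINT, "log_boot_context")
    print(run_function(body, "log_boot_context", {"MINIMAX_API_KEY": "not-a-real-key"}))
```

Because only the matched function text is passed to the shell, everything after the closing brace (here the `chown` line) is never executed. The regex relies on the closing brace sitting at column 0, so a reformatted function would need a different extraction approach.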