test: bash coverage for entrypoint.sh log_boot_context()

The Python adapter audit (test_adapter_logging.py) pins the adapter.py side, but the entrypoint shell function fires earlier and twice (pre-gosu + post-gosu). When the SDK import wedge keeps the adapter from running at all, the shell emission is the operator's only visibility into the boot env.

Eight new tests cover:
- env NAME=set / env NAME=unset shape for every audited var
- value-leak guard: secret strings never appear in output
- WORKSPACE_ID + PLATFORM_URL passthrough by value (not secret)
- <unset> fallback for missing platform identifiers
- uid/gid line shape (used to verify the privilege drop)
- dated boot banner shape (used to count restarts in a crash loop)
- cross-file gate: shell for-loop names == fixture tuple, mirroring
  test_audit_env_list_matches_entrypoint_sh's adapter.py↔shell gate

Strategy: regex-extract the function body from entrypoint.sh and run it in a fresh /bin/sh with controlled env. We never source the whole entrypoint because it would chown /workspace and exec molecule-runtime.

Closes the gap from task #251 (follow-up to PR #32 boot-debug logging).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 22:26:35 -07:00
6 changed files with 228 additions and 235 deletions
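
The extract-and-run strategy from the commit message can be sketched in isolation. This is a minimal illustration, not the shipped test: `FAKE_ENTRYPOINT` is a hypothetical stand-in for entrypoint.sh (the real function body is not part of this diff), and the helper names `extract_function` / `run_function` are assumptions chosen for the example.

```python
from __future__ import annotations

import re
import subprocess

# Hypothetical stand-in for entrypoint.sh. The real script is not shown in
# this diff; only the general shape (function, then side effects) is assumed.
FAKE_ENTRYPOINT = r'''#!/bin/sh
log_boot_context() {
    echo "workspace_id=${WORKSPACE_ID:-<unset>}"
    for var in ANTHROPIC_API_KEY MINIMAX_API_KEY; do
        eval "val=\$$var"
        if [ -n "$val" ]; then
            echo "env $var=set"
        else
            echo "env $var=unset"
        fi
    done
}

chown -R agent /workspace   # side effect that must never run in a test
exec molecule-runtime
'''


def extract_function(text: str, name: str) -> str:
    """Return `name() { ... }` up to the first closing brace at column 0."""
    pattern = rf"^{re.escape(name)}\(\)\s*\{{.*?^\}}\s*$"
    match = re.search(pattern, text, re.DOTALL | re.MULTILINE)
    if match is None:
        raise LookupError(f"{name}() not found")
    return match.group(0)


def run_function(text: str, name: str, env: dict[str, str]) -> str:
    """Define only the extracted function in a fresh /bin/sh and call it."""
    script = f"{extract_function(text, name)}\n{name}\n"
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        env={"PATH": "/usr/bin:/bin", **env},  # no inherited auth vars
        capture_output=True,
        text=True,
        timeout=10,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_function(FAKE_ENTRYPOINT, "log_boot_context",
                       {"MINIMAX_API_KEY": "not-a-real-key"}))
```

Because only the matched function text reaches the shell, the `chown` and `exec` lines after it are never executed, which is the property the commit relies on.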
@@ -2,7 +2,7 @@ name: CI
 on: [push, pull_request]
 jobs:
  validate:
-    uses: molecule-ai/molecule-ci/.github/workflows/validate-workspace-template.yml@main
+    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@main

  tests:
    name: Adapter unit tests
@@ -32,47 +32,14 @@ permissions:
  packages: write

 jobs:
-  # The `.runtime-version` file is the push-mode cascade signal post-
-  # 2026-05-06: when molecule-core/publish-runtime.yml ships a new
-  # version to PyPI, it does NOT call repository_dispatch (Gitea 1.22.6
-  # has no such endpoint — empirically verified molecule-core#20).
-  # Instead it git-pushes an updated `.runtime-version` to each template,
-  # which trips this workflow's `on: push: branches: [main]` trigger.
-  # This job reads that file and forwards the version to the reusable
-  # build workflow so the Dockerfile pip-installs the exact published
-  # version, not whatever requirements.txt currently bounds.
-  resolve-version:
-    runs-on: ubuntu-latest
-    timeout-minutes: 2
-    outputs:
-      version: ${{ steps.read.outputs.version }}
-    steps:
-      - uses: actions/checkout@v4
-      - id: read
-        run: |
-          if [ -f .runtime-version ]; then
-            v="$(head -n1 .runtime-version | tr -d '[:space:]')"
-            echo "version=$v" >> "$GITHUB_OUTPUT"
-            echo "resolved runtime version: $v"
-          else
-            echo "no .runtime-version file present — falling through to Dockerfile default"
-          fi
-
  publish:
-    needs: resolve-version
-    uses: molecule-ai/molecule-ci/.github/workflows/publish-template-image.yml@main
+    uses: Molecule-AI/molecule-ci/.github/workflows/publish-template-image.yml@main
    secrets: inherit
    with:
-      # Resolution chain (highest priority first):
-      #   1. client_payload.runtime_version — legacy GitHub
-      #      repository_dispatch path (will return if Gitea ever adds
-      #      the dispatch API; left in place for forward-compat).
-      #   2. inputs.runtime_version — manual workflow_dispatch run from
-      #      the Actions UI for ad-hoc rebuilds against a specific
-      #      version.
-      #   3. needs.resolve-version.outputs.version — the
-      #      `.runtime-version` file in this repo, written by
-      #      molecule-core/publish-runtime.yml's push-mode cascade.
-      #   4. '' — fall through to the Dockerfile default
-      #      (requirements.txt pin).
-      runtime_version: ${{ github.event.client_payload.runtime_version || inputs.runtime_version || needs.resolve-version.outputs.version || '' }}
+      # When the cascade fires, client_payload.runtime_version is the
+      # exact version PyPI just published. Forwarded to the reusable
+      # workflow as a docker --build-arg so the cache key changes
+      # per-version and pip install resolves freshly.
+      # On other events (push to main / manual without input), this is
+      # empty and the Dockerfile's default (requirements.txt pin) applies.
+      runtime_version: ${{ github.event.client_payload.runtime_version || inputs.runtime_version || '' }}
@@ -1,201 +1,22 @@
 name: Secret scan

-# Hard CI gate. Refuses any PR / push whose diff additions contain a
-# recognisable credential. Defense-in-depth for the #2090-class incident
-# (2026-04-24): GitHub's hosted Copilot Coding Agent leaked a ghs_*
-# installation token into tenant-proxy/package.json via `npm init`
-# slurping the URL from a token-embedded origin remote. We can't fix
-# upstream's clone hygiene, so we gate here.
+# Calls the canonical reusable workflow in molecule-core. Defense
+# against the #2090-class leak (a hosted-agent commit slipping a
+# credential-shaped string into a PR). Pattern set lives in
+# molecule-core so we do not maintain a parallel copy here.
 #
-# Inlined copy from molecule-ai/molecule-core/.github/workflows/secret-scan.yml.
-# Cross-repo workflow_call to a private repo doesn't fully work on Gitea 1.22.6
-# (workflow file fails parse-time at 0s with no logs); inline keeps the gate
-# functional until Gitea is upgraded or the canonical scanner moves to a public
-# repo. When that lands, this file reverts to the 3-line wrapper:
-#
-#   jobs:
-#     secret-scan:
-#       uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
-#
-# Pin to @staging not @main — staging is the active default branch,
-# main lags via the staging-promotion workflow. Updates ride along
-# automatically on the next consumer workflow run.
-#
-# Same regex set as the runtime's bundled pre-commit hook
-# (molecule-ai-workspace-runtime: molecule_runtime/scripts/pre-commit-checks.sh).
-# Keep the two sides aligned when adding patterns.
+# Pinned to @staging because that is the active default branch on the
+# upstream repo (main lags behind via the staging-promotion workflow).
+# Updates ride along automatically as the upstream regex set evolves.

 on:
  pull_request:
    types: [opened, synchronize, reopened]
  push:
-    branches: [main, staging]
+    branches: [main, staging, master]
+  merge_group:
+    types: [checks_requested]

 jobs:
-  scan:
-    name: Scan diff for credential-shaped strings
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-        with:
-          fetch-depth: 2  # need previous commit to diff against on push events
-
-      # For pull_request events the diff base may be many commits behind
-      # HEAD and absent from the shallow clone. Fetch it explicitly.
-      - name: Fetch PR base SHA (pull_request events only)
-        if: github.event_name == 'pull_request'
-        run: git fetch --depth=1 origin ${{ github.event.pull_request.base.sha }}
-
-      # For merge_group events the queue's pre-merge ref is a commit on
-      # `gh-readonly-queue/...` whose parent is the queue's base_sha.
-      # That parent isn't part of the queue branch's shallow clone, so
-      # we fetch it explicitly. Without this the diff falls through to
-      # "no BASE → scan entire tree" mode and false-positives on legit
-      # test fixtures (e.g. canvas/src/lib/validation/__tests__/secret-formats.test.ts).
-
-      - name: Refuse if credential-shaped strings appear in diff additions
-        env:
-          # Plumb event-specific SHAs through env so the script doesn't
-          # need conditional `${{ ... }}` interpolation per event type.
-          # github.event.before/after only exist on push events;
-          # merge_group has its own base_sha/head_sha; pull_request has
-          # pull_request.base.sha / pull_request.head.sha.
-          PR_BASE_SHA: ${{ github.event.pull_request.base.sha }}
-          PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }}
-          PUSH_BEFORE: ${{ github.event.before }}
-          PUSH_AFTER: ${{ github.event.after }}
-        run: |
-          # Pattern set covers GitHub family (the actual #2090 vector),
-          # Anthropic / OpenAI / Slack / AWS. Anchored on prefixes with low
-          # false-positive rates against agent-generated content. Mirror of
-          # molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh
-          # — keep aligned.
-          SECRET_PATTERNS=(
-            'ghp_[A-Za-z0-9]{36,}'           # GitHub PAT (classic)
-            'ghs_[A-Za-z0-9]{36,}'           # GitHub App installation token
-            'gho_[A-Za-z0-9]{36,}'           # GitHub OAuth user-to-server
-            'ghu_[A-Za-z0-9]{36,}'           # GitHub OAuth user
-            'ghr_[A-Za-z0-9]{36,}'           # GitHub OAuth refresh
-            'github_pat_[A-Za-z0-9_]{82,}'   # GitHub fine-grained PAT
-            'sk-ant-[A-Za-z0-9_-]{40,}'      # Anthropic API key
-            'sk-proj-[A-Za-z0-9_-]{40,}'     # OpenAI project key
-            'sk-svcacct-[A-Za-z0-9_-]{40,}'  # OpenAI service-account key
-            'sk-cp-[A-Za-z0-9_-]{60,}'       # MiniMax API key (F1088 vector — caught only after the fact)
-            'xox[baprs]-[A-Za-z0-9-]{20,}'   # Slack tokens
-            'AKIA[0-9A-Z]{16}'               # AWS access key ID
-            'ASIA[0-9A-Z]{16}'               # AWS STS temp access key ID
-          )
-
-          # Determine the diff base. Each event type stores its SHAs in
-          # a different place — see the env block above.
-          case "${{ github.event_name }}" in
-            pull_request)
-              BASE="$PR_BASE_SHA"
-              HEAD="$PR_HEAD_SHA"
-              ;;
-            *)
-              BASE="$PUSH_BEFORE"
-              HEAD="$PUSH_AFTER"
-              ;;
-          esac
-
-          # On push events with shallow clones, BASE may be present in
-          # the event payload but absent from the local object DB
-          # (fetch-depth=2 doesn't always reach the previous commit
-          # across true merges). Try fetching it on demand. If the
-          # fetch fails — e.g. the SHA was force-overwritten — we fall
-          # through to the empty-BASE branch below, which scans the
-          # entire tree as if every file were new. Correct, just slow.
-          if [ -n "$BASE" ] && ! echo "$BASE" | grep -qE '^0+$'; then
-            if ! git cat-file -e "$BASE" 2>/dev/null; then
-              git fetch --depth=1 origin "$BASE" 2>/dev/null || true
-            fi
-          fi
-
-          # Files added or modified in this change.
-          if [ -z "$BASE" ] || echo "$BASE" | grep -qE '^0+$' || ! git cat-file -e "$BASE" 2>/dev/null; then
-            # New branch / no previous SHA / BASE unreachable — check the
-            # entire tree as added content. Slower, but correct on first
-            # push.
-            CHANGED=$(git ls-tree -r --name-only HEAD)
-            DIFF_RANGE=""
-          else
-            CHANGED=$(git diff --name-only --diff-filter=AM "$BASE" "$HEAD")
-            DIFF_RANGE="$BASE $HEAD"
-          fi
-
-          if [ -z "$CHANGED" ]; then
-            echo "No changed files to inspect."
-            exit 0
-          fi
-
-          # Self-exclude: this workflow file legitimately contains the
-          # pattern strings as regex literals. Without an exclude it would
-          # block its own merge.
-          SELF=".github/workflows/secret-scan.yml"
-
-          OFFENDING=""
-          # `while IFS= read -r` (not `for f in $CHANGED`) so filenames
-          # containing whitespace don't word-split silently — a path
-          # with a space would otherwise produce two iterations on
-          # tokens that aren't real filenames, breaking the
-          # self-exclude + diff lookup.
-          while IFS= read -r f; do
-            [ -z "$f" ] && continue
-            [ "$f" = "$SELF" ] && continue
-            if [ -n "$DIFF_RANGE" ]; then
-              ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true)
-            else
-              # No diff range (new branch first push) — scan the full file
-              # contents as if every line were new.
-              ADDED=$(cat "$f" 2>/dev/null || true)
-            fi
-            [ -z "$ADDED" ] && continue
-            for pattern in "${SECRET_PATTERNS[@]}"; do
-              if echo "$ADDED" | grep -qE "$pattern"; then
-                OFFENDING="${OFFENDING}${f} (matched: ${pattern})\n"
-                break
-              fi
-            done
-          done <<< "$CHANGED"
-
-          if [ -n "$OFFENDING" ]; then
-            echo "::error::Credential-shaped strings detected in diff additions:"
-            # `printf '%b' "$OFFENDING"` interprets backslash escapes
-            # (the literal `\n` we appended above becomes a newline)
-            # WITHOUT treating OFFENDING as a format string. Plain
-            # `printf "$OFFENDING"` is a format-string sink: a filename
-            # containing `%` would be interpreted as a conversion
-            # specifier, corrupting the error message (or printing
-            # `%(missing)` artifacts).
-            printf '%b' "$OFFENDING"
-            echo ""
-            echo "The actual matched values are NOT echoed here, deliberately —"
-            echo "round-tripping a leaked credential into CI logs widens the blast"
-            echo "radius (logs are searchable + retained)."
-            echo ""
-            echo "Recovery:"
-            echo "  1. Remove the secret from the file. Replace with an env var"
-            echo "     reference (e.g. \${{ secrets.GITHUB_TOKEN }} in workflows,"
-            echo "     process.env.X in code)."
-            echo "  2. If the credential was already pushed (this PR's commit"
-            echo "     history reaches a public ref), treat it as compromised —"
-            echo "     ROTATE it immediately, do not just remove it. The token"
-            echo "     remains valid in git history forever and may be in any"
-            echo "     log/cache that consumed this branch."
-            echo "  3. Force-push the cleaned commit (or stack a revert) and"
-            echo "     re-run CI."
-            echo ""
-            echo "If the match is a false positive (test fixture, docs example,"
-            echo "or this workflow's own regex literals): use a clearly-fake"
-            echo "placeholder like ghs_EXAMPLE_DO_NOT_USE that doesn't satisfy"
-            echo "the length suffix, OR add the file path to the SELF exclude"
-            echo "list in this workflow with a short reason."
-            echo ""
-            echo "Mirror of the regex set lives in the runtime's bundled"
-            echo "pre-commit hook (molecule-ai-workspace-runtime:"
-            echo "molecule_runtime/scripts/pre-commit-checks.sh) — keep aligned."
-            exit 1
-          fi
-
-          echo "✓ No credential-shaped strings in this change."
+  secret-scan:
+    uses: Molecule-AI/molecule-core/.github/workflows/secret-scan.yml@staging
@@ -1 +0,0 @@
-0.1.129
@@ -24,7 +24,7 @@ common problems.
 ## Step 1 — Clone the Repository

 ```bash
-git clone https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code.git
+git clone https://github.com/Molecule-AI/molecule-ai-workspace-template-claude-code.git
 cd molecule-ai-workspace-template-claude-code
 ```

@@ -0,0 +1,206 @@
+r"""Tests for entrypoint.sh's log_boot_context() shell function.
+
+The Python-side audit (test_adapter_logging.py) pins what `_audit_auth_env_presence`
+in adapter.py emits. But the shell function fires FIRST — twice, even (once
+pre-gosu as root, once post-gosu as agent). When the adapter never runs at
+all because the SDK import fails, the entrypoint emission is the operator's
+ONLY visibility into the boot env. So this contract needs its own test.
+
+The cross-file gate `test_audit_env_list_matches_entrypoint_sh` proves the
+NAME LIST matches; this file proves the SHELL CODE actually emits the
+right lines for those names. Without this, a typo in the for-loop body
+(e.g. `eval "val=\$$var"` → `val=$var`, which would print the literal
+name not its value) silently breaks the audit.
+
+Strategy: extract the `log_boot_context()` function body from entrypoint.sh
+and run it in a fresh subprocess with controlled env. Asserts on stdout.
+We never source entrypoint.sh wholesale because it would chown /workspace
+and exec molecule-runtime — neither is appropriate in a test sandbox.
+"""
+from __future__ import annotations
+
+import os
+import re
+import subprocess
+from pathlib import Path
+
+import pytest
+
+TEMPLATE_DIR = Path(__file__).resolve().parent.parent
+ENTRYPOINT = TEMPLATE_DIR / "entrypoint.sh"
+
+
+def _extract_function() -> str:
+    """Pull just the log_boot_context() function definition out of entrypoint.sh.
+
+    Returns the literal function definition (`log_boot_context() { ... }`) as
+    a string, suitable for `sh -c "<func>; log_boot_context"`. Bails with a
+    clear message if the function can't be located — that itself is a
+    regression worth a loud test failure.
+    """
+    text = ENTRYPOINT.read_text()
+    # `log_boot_context() {` on its own line, then everything up to the
+    # matching closing `}` at column 0. The function is small and shape-stable;
+    # we don't try to be a full shell parser.
+    match = re.search(r"^log_boot_context\(\)\s*\{.*?^\}\s*$", text, re.DOTALL | re.MULTILINE)
+    if not match:
+        pytest.fail("Could not locate log_boot_context() in entrypoint.sh")
+    return match.group(0)
+
+
+def _run_function(env: dict[str, str]) -> str:
+    """Run log_boot_context() in a fresh /bin/sh with the given env. Returns stdout."""
+    func = _extract_function()
+    script = f"{func}\nlog_boot_context\n"
+    # Minimal base env: keep only PATH so lookups (`id`, `hostname`, `date`, `ls`)
+    # still work, while no inherited auth vars leak into the test.
+    safe_env = {"PATH": os.environ.get("PATH", "/usr/bin:/bin")}
+    safe_env.update(env)
+    result = subprocess.run(
+        ["/bin/sh", "-c", script],
+        env=safe_env,
+        capture_output=True,
+        text=True,
+        timeout=10,
+        check=False,
+    )
+    assert result.returncode == 0, (
+        f"log_boot_context exited rc={result.returncode}\n"
+        f"stdout:\n{result.stdout}\nstderr:\n{result.stderr}"
+    )
+    return result.stdout
+
+
+# Audit names — kept in lockstep with adapter.py's _AUTH_ENV_AUDIT and the
+# entrypoint.sh for-loop. test_audit_env_list_matches_entrypoint_sh and
+# test_loop_var_list_matches_audit (below) gate any drift across the three
+# locations.
+_AUDIT_NAMES = (
+    "CLAUDE_CODE_OAUTH_TOKEN",
+    "ANTHROPIC_API_KEY",
+    "ANTHROPIC_AUTH_TOKEN",
+    "ANTHROPIC_BASE_URL",
+    "MINIMAX_API_KEY",
+    "GLM_API_KEY",
+    "KIMI_API_KEY",
+    "DEEPSEEK_API_KEY",
+)
+
+
+def test_emits_set_for_present_env():
+    """A set var must produce `env NAME=set` — proves the eval-deref works."""
+    out = _run_function({"MINIMAX_API_KEY": "secret-MUST-NOT-LEAK"})
+    assert "env MINIMAX_API_KEY=set" in out
+
+
+def test_emits_unset_for_absent_env():
+    """An unset var must produce `env NAME=unset` — proves the empty-string branch."""
+    out = _run_function({})
+    for name in _AUDIT_NAMES:
+        assert f"env {name}=unset" in out, (
+            f"missing `env {name}=unset` line — for-loop body may be miscoded"
+        )
+
+
+def test_never_leaks_value():
+    """The audit prints NAMES, not VALUES. Regression here = secret leak.
+
+    Same threat model as the Python-side test: an operator-visible boot log
+    that contains the actual key would defeat the whole point of the audit
+    (the audit exists so we can answer 'is the key present' WITHOUT exposing
+    the key). An `eval "val=\\$$var"` typo collapsing to `echo $var` would
+    trip this test.
+    """
+    secret = "sk-FAKE-MUST-NEVER-APPEAR-IN-BOOT-LOG"
+    out = _run_function({
+        "MINIMAX_API_KEY": secret,
+        "CLAUDE_CODE_OAUTH_TOKEN": secret,
+        "ANTHROPIC_BASE_URL": "https://api.example.com",
+    })
+    assert secret not in out, f"boot-context log leaked the env VALUE:\n{out}"
+    # ANTHROPIC_BASE_URL is the most-likely-to-be-logged-by-mistake field
+    # because operators sometimes WANT to see it; pin that it's still
+    # name-only.
+    assert "https://api.example.com" not in out
+
+
+def test_emits_workspace_id_and_platform_url():
+    """WORKSPACE_ID and PLATFORM_URL appear by VALUE — these are not secrets.
+
+    They're the operator-visible identifiers a support engineer needs to
+    correlate logs with platform records. Pinning the field shape so a
+    later refactor doesn't accidentally redact them.
+    """
+    out = _run_function({
+        "WORKSPACE_ID": "ws-test-1234",
+        "PLATFORM_URL": "https://test.example.com",
+    })
+    assert "workspace_id=ws-test-1234" in out
+    assert "platform_url=https://test.example.com" in out
+
+
+def test_emits_unset_marker_when_workspace_id_missing():
+    """Missing WORKSPACE_ID falls back to the literal `<unset>` placeholder.
+
+    A support engineer reading the boot log must be able to tell a missing
+    identifier from a real value. Note that the shell `${VAR:-<unset>}`
+    default prints `<unset>` both when WORKSPACE_ID was never injected by
+    the platform and when it was set to the empty string.
+    """
+    out = _run_function({})
+    assert "workspace_id=<unset>" in out
+    assert "platform_url=<unset>" in out
+
+
+def test_emits_uid_and_gid():
+    """uid/gid line is critical — answers 'did the privilege drop happen?'
+
+    The two-emission pattern (pre-gosu as root, post-gosu as agent) only
+    works as a diagnostic if uid/gid is in every emission. Pin the field
+    shape; we don't pin the literal value because CI runs vary.
+    """
+    out = _run_function({})
+    assert re.search(r"uid=\d+\s+gid=\d+", out), (
+        f"missing or malformed uid/gid line:\n{out}"
+    )
+
+
+def test_emits_boot_marker():
+    """Each emission starts with the dated `entrypoint boot` banner.
+
+    Operators grep for this to count restarts in a crash loop.
+    """
+    out = _run_function({})
+    # Format: "----- entrypoint boot 2026-05-02T12:34:56Z -----"
+    assert re.search(
+        r"-----\s+entrypoint boot \d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\s+-----",
+        out,
+    ), f"missing boot banner:\n{out}"
+
+
+def test_loop_var_list_matches_audit():
+    """The for-loop's literal NAME list must match _AUDIT_NAMES (this file).
+
+    Companion to test_audit_env_list_matches_entrypoint_sh in
+    test_adapter_logging.py: that test cross-checks adapter.py vs
+    entrypoint.sh; this one cross-checks entrypoint.sh vs the test
+    fixture above. If a maintainer adds a vendor to entrypoint.sh
+    without updating the audit name tuple in this file, the existing
+    `test_emits_unset_for_absent_env` would still pass (because all
+    audited names also appear in the loop), but the maintainer would
+    have a false sense of coverage. This test catches that.
+    """
+    text = ENTRYPOINT.read_text()
+    loop_line = next(
+        (line for line in text.splitlines()
+         if "for var in" in line and "CLAUDE_CODE_OAUTH_TOKEN" in line),
+        None,
+    )
+    assert loop_line, "entrypoint.sh missing the auth-env for-loop"
+    names_in_shell = tuple(
+        loop_line.split("for var in", 1)[1].split(";", 1)[0].split()
+    )
+    assert set(names_in_shell) == set(_AUDIT_NAMES), (
+        f"_AUDIT_NAMES in this file ({set(_AUDIT_NAMES)}) and the for-loop "
+        f"in entrypoint.sh ({set(names_in_shell)}) disagree — update the "
+        "test fixture or the shell loop to bring them back in sync."
+    )