harden(runtime): scrub bare sk-ant keys, JSON-quoted token/apiKey, aws_secret_access_key in _sanitize_for_external

Addresses internal#212 PR#1420 dual-review SECURITY finding (infra-sre /
infra-runtime-be): _sanitize_for_external missed three real credential
shapes because the legacy regex requires a `[ :=]+` separator after the
prefix:
- bare `sk-ant-api03-…` keys (real key uses `-`, not `[ :=]`)
- JSON-quoted "token"/"apiKey"/"secret"/"password" values
- `aws_secret_access_key=…`
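
The gap can be reproduced in isolation. The sketch below copies the legacy pattern from `_sanitize_for_external` and runs it over one made-up sample per shape; the `legacy_scrub` helper and the generated sample values are illustrative assumptions, not code from this change.

```python
import re

# Legacy pattern, copied from _sanitize_for_external as it stood before this change.
LEGACY = re.compile(
    r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}"
)


def legacy_scrub(msg: str) -> str:
    """Hypothetical helper: apply only the legacy rule."""
    return LEGACY.sub("[REDACTED]", msg)


# Fake, generated values: none of these is a real credential.
bare_key = "invalid key sk-ant-api03-" + "A1b2" * 12
json_blob = '{"token": "' + "x9" * 20 + '"}'
aws_line = "aws_secret_access_key=" + "Ab/1" * 10
bearer = "Authorization: Bearer " + "Z9" * 15

# Each of the three shapes passes through the legacy rule unchanged,
# because none has a `[ :=]` separator directly after a known prefix.
leaked = [s for s in (bare_key, json_blob, aws_line) if legacy_scrub(s) == s]

# The separator-delimited Bearer form is still caught.
bearer_out = legacy_scrub(bearer)
```

Here `leaked` ends up holding all three samples, while `bearer_out` has its token replaced by `[REDACTED]`.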

Added three narrowly-scoped regexes (length thresholds tuned so curated
short examples like `sk-ant-EXAMPLE-SHORT` / `ghp_SHORT_TOKEN` and all
actionable auth/quota/HTTP guidance still pass through). Extended the unit
test with test_sanitize_agent_error_reason_scrubs_all_secret_formats
asserting redaction for all three new formats plus the original Bearer
regression. Full sanitize suite green; existing passthrough assertions
unchanged.
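
The threshold behaviour can be checked with a small stand-alone sketch. It applies the legacy rule and the three new patterns in the same order as the diff, leaving out the path rule; the `scrub` helper and the generated long values are assumptions made for illustration.

```python
import re


def scrub(msg: str) -> str:
    """Hypothetical stand-in for the credential rules in _sanitize_for_external."""
    # Legacy rule: known prefix, a `[ :=]` separator, then >= 20 key chars.
    msg = re.sub(
        r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}",
        "[REDACTED]",
        msg,
    )
    # Bare `sk-` / `sk-ant-` key with no separator.
    msg = re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
    # JSON-quoted credential value of >= 24 chars; only the value is replaced.
    msg = re.sub(
        r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
        r"\1[REDACTED]\2",
        msg,
    )
    # `aws_secret_access_key=<value>`; the value runs to the next whitespace.
    msg = re.sub(r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+", r"\1[REDACTED]", msg)
    return msg


# Short curated examples are expected to survive untouched.
short_samples = [
    "sk-ant-EXAMPLE-SHORT",
    "ghp_SHORT_TOKEN",
    '{"api_key": "sk-ant-EXAMPLE-SHORT"}',
]
kept = [scrub(s) == s for s in short_samples]

# Long, generated (fake) values are expected to be redacted.
long_key = scrub("invalid key sk-ant-api03-" + "A1b2" * 12)
long_json = scrub('{"token": "' + "x9" * 20 + '"}')
aws = scrub("aws_secret_access_key=" + "Ab/1" * 10 + " retry")

# Edge case: only 20 key chars after `sk-ant-`.
edge = scrub("sk-ant-" + "A1b2" * 5)
```

One detail worth noting: because `ant-` is optional and its characters also belong to the character class, the 24-character count effectively starts right after `sk-`, so `sk-ant-` followed by 20 key characters is redacted as well. The curated `sk-ant-EXAMPLE-SHORT` sample has only 17 characters after `sk-` and stays below that limit.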

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 07:56:16 -07:00
parent 44b78e28c8
commit 7d2eaa3748
2 changed files with 78 additions and 0 deletions
+22
@@ -599,6 +599,28 @@ def _sanitize_for_external(msg: str) -> str:
    import re as _re
    msg = _re.sub(r"(?i)(?:bearer|token|api[_-]?key|sk-)[ :=]+[A-Za-z0-9_/.-]{20,}", "[REDACTED]", msg)
    # Bare provider key with NO separator after the prefix: a real
    # `sk-ant-api03-…` / `sk-…` key uses `-` (not `[ :=]`), so the rule
    # above misses it. Require ≥24 key-ish chars after `sk-` (≥20 after
    # `sk-ant-`, since `ant-` also fits the class) so curated examples like
    # `sk-ant-EXAMPLE-SHORT` (13 chars after `sk-ant-`) pass un-redacted.
    msg = _re.sub(r"(?i)\bsk-(?:ant-)?[A-Za-z0-9_-]{24,}", "[REDACTED]", msg)
    # JSON-quoted credential values: {"token": "…"} / {"apiKey": "…"} /
    # {"secret": "…"} / {"password": "…"}. Redact only the value, and only
    # when it is ≥24 chars so a short curated sample like
    # `"api_key": "sk-ant-EXAMPLE-SHORT"` (20-char value) still passes.
    msg = _re.sub(
        r'(?i)("(?:token|api[_-]?key|secret|password)"\s*:\s*")[^"]{24,}(")',
        r"\1[REDACTED]\2",
        msg,
    )
    # AWS secret access key in `aws_secret_access_key=…` form (env dumps,
    # boto tracebacks). `\S+` redacts the value up to the next whitespace.
    msg = _re.sub(
        r"(?i)(aws_secret_access_key\s*[:=]\s*)\S+",
        r"\1[REDACTED]",
        msg,
    )
    # Absolute paths: /etc/shadow, /home/user/.aws/credentials, etc.
    msg = _re.sub(r"(?:/[^/\s]+){2,}", lambda m: m.group(0) if len(m.group(0)) < 60 else "[REDACTED_PATH]", msg)
    return msg
+56
@@ -829,6 +829,62 @@ def test_sanitize_agent_error_reason_still_scrubs_secrets():
    assert "please re-auth" in out


def test_sanitize_agent_error_reason_scrubs_all_secret_formats():
    """The scrubber must redact every realistic credential shape — not just
    the `Bearer <tok>` form the original test happened to exercise
    (internal#212 review finding: bare `sk-ant-api03-…` keys, JSON-quoted
    "token"/"apiKey" values, and `aws_secret_access_key=` all leaked).
    All curated/actionable guidance must still survive the scrub.
    """
    # 1. Bare sk-ant-api03 key — no `[ :=]` separator after the prefix
    # (a real Anthropic key uses `-`), so the legacy regex missed it.
    bare = (
        "provider HTTP 401 — auth failed — invalid key "
        "sk-ant-api03-AbCdEf0123456789AbCdEf0123456789AbCdEf0123456789xyz "
        "please re-auth"
    )
    out = sanitize_agent_error(reason=bare)
    assert "sk-ant-api03-AbCdEf0123456789AbCdEf0123456789AbCdEf0123456789xyz" not in out
    assert "[REDACTED]" in out
    assert "401" in out  # actionable status survives
    assert "please re-auth" in out  # actionable guidance survives

    # 2. JSON-quoted "token" / "apiKey" values.
    jblob = (
        'provider error — config dump {"token": '
        '"abcDEF0123456789ghIJKL0123456789mnopQRST", "apiKey": '
        '"sk-ant-api03-ZZZZ1111ZZZZ2222ZZZZ3333ZZZZ4444ZZZZ"} — '
        "use an API key instead"
    )
    out = sanitize_agent_error(reason=jblob)
    assert "abcDEF0123456789ghIJKL0123456789mnopQRST" not in out
    assert "sk-ant-api03-ZZZZ1111ZZZZ2222ZZZZ3333ZZZZ4444ZZZZ" not in out
    assert "[REDACTED]" in out
    assert "use an API key instead" in out  # actionable guidance survives

    # 3. aws_secret_access_key=… form.
    awsblob = (
        "provider HTTP 403 — boto credential error "
        "aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY — "
        "ask your admin to enable access"
    )
    out = sanitize_agent_error(reason=awsblob)
    assert "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY" not in out
    assert "[REDACTED]" in out
    assert "403" in out  # actionable status survives
    assert "ask your admin to enable access" in out  # guidance survives

    # 4. Regression: the original Bearer form still redacts.
    bearer = (
        "provider HTTP 401 — Authorization: Bearer "
        "sk-ant-DEADBEEFDEADBEEFDEADBEEF0123456789abcdef re-auth"
    )
    out = sanitize_agent_error(reason=bearer)
    assert "sk-ant-DEADBEEFDEADBEEFDEADBEEF0123456789abcdef" not in out
    assert "[REDACTED]" in out
    assert "re-auth" in out


def test_sanitize_agent_error_reason_wins_over_stderr():
    """When both reason and stderr are passed, the curated reason wins."""
    out = sanitize_agent_error(