Compare commits

..

3 Commits

Author SHA1 Message Date
app-fe a2d4ce5e24 ci: re-trigger build 2026-05-15 05:29:15 +00:00
documentation-specialist 644226f2b2 fix(docs): set terminationGracePeriodSeconds to 120 in Kubernetes YAML example
Secret scan / secret-scan (pull_request) Successful in 1s
CI / build (pull_request) Successful in 4m4s
The example showed terminationGracePeriodSeconds: 30, but the accompanying
note says the value "should exceed the healthcheck failure threshold
(3 × 30s = 90s)". With 30s < 90s, Kubernetes would send SIGTERM and
wait only 30s before SIGKILL — potentially killing the pod before the
graceful shutdown (3s via stop_event) completes.

Changed to 120s, which exceeds the 90s threshold and aligns the YAML
example with the documented requirement.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 04:57:26 +00:00
technical-writer b6e3b8e8e0 docs(tutorials): add self-hosted workspace Docker deployment guide
Secret scan / secret-scan (pull_request) Successful in 1m20s
CI / build (pull_request) Successful in 2m44s
Covers Docker image pull, required env vars (MOLECULE_API_URL,
MOLECULE_API_KEY, WORKSPACE_ID, PORT), built-in HEALTHCHECK probe
(/agent/card every 30s), Docker Compose config, graceful SIGTERM
shutdown via stop_event threading.Event, and Kubernetes liveness/readiness
probe configuration.

Closes gap: no self-hosted Docker workspace deployment docs existed
despite molecule-core#883 HEALTHCHECK shipping in 2026-05-13.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-14 04:54:04 +00:00
2 changed files with 201 additions and 25 deletions
-25
View File
@@ -8,31 +8,6 @@ Entries are published daily at 23:50 UTC.
---
## 2026-05-13
### ✨ New features
- **Docker HEALTHCHECK for workspace containers**: the workspace `Dockerfile` now includes a `HEALTHCHECK --interval=30s --timeout=5s --retries=3` directive that probes `http://localhost:${PORT:-8000}/.well-known/agent-card.json`. Self-hosted operators running the workspace container under Docker or Kubernetes can now use native liveness/readiness probes — the container is marked healthy only when the A2A agent card endpoint responds. (`molecule-core` [#883](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/883))
### 📚 Documentation
- **Security section expanded**: the [Security hub](/docs/security) now includes a link to the [OWASP Agentic Top 10](/docs/security/owasp-agentic-top-10) risk framework and the [Security Changelog](/docs/security/changelog) alongside the existing SAFE-MCP advisory. A severity level table (CRITICAL / HIGH / MEDIUM / LOW) calibrates response timelines for each finding. (`docs` [#35](https://git.moleculesai.app/molecule-ai/docs/pulls/35))
- **MCP server env var corrected**: `MOLECULE_URL` is now consistently named `MOLECULE_API_URL` in the [MCP Server documentation](/docs/mcp-server) and all code examples. The old name is no longer referenced on any public surface. (`docs` [#34](https://git.moleculesai.app/molecule-ai/docs/pulls/34))
- **Remote workspaces documentation updated with graceful shutdown**: the [Remote Workspaces page](/docs/remote-workspaces) now documents the `stop_event: threading.Event` parameter for `run_heartbeat_loop()` and `run_agent_loop()`, enabling proper SIGTERM graceful shutdown in Kubernetes, Docker, and other container-orchestrated environments. The `PLATFORM_URL` default has also been corrected to `http://host.docker.internal:8080` for containerized development. (`docs` [#29](https://git.moleculesai.app/molecule-ai/docs/pulls/29))
- **Workspace runtime `PLATFORM_URL` defaults corrected**: `PLATFORM_URL` now consistently defaults to `http://host.docker.internal:8080` across all workspace runtime modules (`a2a_cli.py`, `a2a_client.py`, `a2a_mcp_server.py`, and 10 others). Previously some modules defaulted to `http://platform:8080`, causing connection failures in containerized deployments where the Docker host is not named `platform`. (`docs` [#32](https://git.moleculesai.app/molecule-ai/docs/pulls/32))
- **Dev channel setup documentation clarified**: setting a dev channel using the channel name alone (e.g. `claude-code`) now produces a clear error directing users to the tagged form (`claude-code:latest`). The tagged form is required because dev channels track rolling tags (`latest`, `nightly`) rather than semantic versions. (`docs` [#30](https://git.moleculesai.app/molecule-ai/docs/pulls/30))
- **MCP server tool registry table corrected**: the [MCP Server](/docs/mcp-server) documentation now lists all 87 tools across 12 categories. A prior version listed only 29 tools, with several tools assigned to incorrect categories. (`molecule-mcp-server` [#5](https://git.moleculesai.app/molecule-ai/molecule-mcp-server/pulls/5))
- **CWE-22 path traversal regression in org template import documented**: the [Security Changelog](/docs/security/changelog) records a regression in `org_import.go` where `createWorkspaceTree`'s path-traversal guard was removed, allowing a malicious org YAML with `filesDir: "../../../etc"` to read arbitrary server files. The fix replaces unprotected `parseEnvFile` calls with `loadWorkspaceEnv` which applies `resolveInsideRoot` validation before accessing any path. (`docs` [#31](https://git.moleculesai.app/molecule-ai/docs/pulls/31), `molecule-core` [#810](https://git.moleculesai.app/molecule-ai/molecule-core/pulls/810))
- **EC2 Instance Connect staging IAM permission documented**: the [EC2 provisioner tutorial](/docs/tutorials/aws-ec2-provisioner) and internal observability runbook now document the `ssm:SendCommand` permission required on the EC2 Instance Connect IAM role for staging tenant Vector installation via SSM. (`docs` [#33](https://git.moleculesai.app/molecule-ai/docs/pulls/33))
### 🧹 Internal
- **Internal platform hardening** (`molecule-core`): Go handler checks restored to main (`#871`); main test blockers repaired (`#900`); rows.Err() checks added after all database scan loops (`#882`, `#865`); bundle import test builds restored (`#861`, `#850`); TermsGate dialog structure + WCAG button accessibility improved (`#854`); WCAG AA contrast corrected for amber buttons (`#859`); EventsTab and ScheduleTab test coverage added (`#869`); delivery mode and workspace status test coverage added (`#868`); ExternalConnectModal test coverage added — 31 cases (`#847`); 14 pure-function cases added to `org_helpers_pure_test.go` (`#840`); `pgplugin` store dual-fields test fixed with `regexp.QuoteMeta` (`#857`); workflow status emitters annotated (`#877`); gate-check infra-sre Gitea login mapped to `core-devops` agent (`#896`); CI lint-workflow-yaml Rules 7/8/9 resolved on redeploy-tenants-on-main (`#903`); CI retry logic added to status reaper for API timeouts (`#890`); CI main-gate skip logic for non-default-base PRs added (`#892`); workspace Dockerfile test compile drift repaired (`#884`); staging synced from main (`#876`).
- **CI tooling migration** (`molecule-core`, `molecule-sdk-python`): `.github/workflows/` renamed to `.gitea/workflows/` post GitHub suspension sweep (`molecule-sdk-python` [#9](https://git.moleculesai.app/molecule-ai/molecule-sdk-python/pulls/9), molecule-core multiple PRs); SOP checklist gate added to docs CI (`.gitea/scripts/sop-checklist-gate.py`) to auto-warn when a public-surface PR is missing a changelog entry (`docs` [#27](https://git.moleculesai.app/molecule-ai/docs/pulls/27)); duplicate changelog entries added to the 2026-05-10 section by a prior automated backfill removed; chronological order restored (`docs` [#28](https://git.moleculesai.app/molecule-ai/docs/pulls/28)).
- **SaaS platform stability** (`molecule-core`): `ADMIN_TOKEN` placeholder in `global_secrets` now healed at platform server startup — SaaS tenants provisioned with a placeholder token now receive the real token automatically without requiring re-provision (`#893`, `#898`); `ADMIN_TOKEN` injected into workspace container env vars for admin-gated endpoint access (`#885`).
---
## 2026-05-12
### 🔒 Security
@@ -0,0 +1,201 @@
---
title: Self-Hosted Workspace Deployment with Docker
---
# Self-Hosted Workspace Deployment with Docker
This guide covers running a Molecule AI workspace agent as a Docker container on a self-hosted server or VM. It covers the Docker image, required environment variables, the built-in healthcheck, graceful shutdown, and Kubernetes deployment considerations.
> **Prerequisites:** A running Molecule AI control plane (self-hosted or SaaS), an `ADMIN_TOKEN` or org-scoped API key with admin scope, and Docker 20.10+ on the host.
## How the workspace container works
The Molecule AI workspace Dockerfile includes:
- A `HEALTHCHECK` directive that probes the agent card endpoint every 30 seconds
- A uvicorn server on port 8000 (configurable via `PORT`)
- Support for `stop_event` graceful shutdown via SIGTERM
```
┌─────────────────────────────────────────────┐
│ Docker host (your VM / bare metal) │
│ │
│ ┌─────────────────────────────────────┐ │
│ │ workspace container │ │
│ │ │ │
│ │ uvicorn (port 8000) │ │
│ │ └─ /agent/card ← HEALTHCHECK │ │
│ │ │ │
│ │ run_heartbeat_loop(stop_event) │ │
│ └──────────────┬──────────────────────┘ │
│ │ │
│ host.docker.internal:8080 │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────┐ │
│ │ Molecule AI control plane │ │
│ │ (platform on port 8080) │ │
│ └─────────────────────────────────────┘ │
└─────────────────────────────────────────────┘
```
## Step 1: Create an external workspace
First register the workspace as an external (self-managed) agent on the platform.
```bash
ADMIN_TOKEN="your-admin-token"
PLATFORM_URL="https://platform.moleculesai.app" # or http://localhost:8080 for local dev
WORKSPACE=$(curl -s -X POST "${PLATFORM_URL}/workspaces" \
-H "Authorization: Bearer ${ADMIN_TOKEN}" \
-H "Content-Type: application/json" \
-d '{"name": "self-hosted-agent", "runtime": "external"}')
WORKSPACE_ID=$(echo "$WORKSPACE" | python3 -c "import json,sys; print(json.load(sys.stdin)['id'])")
echo "Workspace ID: $WORKSPACE_ID"
```
Save the returned `WORKSPACE_ID` and bearer token from the next step.
## Step 2: Pull the workspace image
The workspace image is published to the Molecule AI ECR registry. Contact your platform administrator for the registry prefix and credentials, then log in:
```bash
aws ecr get-login-password --region us-east-1 | \
docker login --username AWS --password-stdin "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com"
docker pull "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
```
## Step 3: Configure environment variables
| Variable | Default | Description |
|---|---|---|
| `MOLECULE_API_URL` | `http://localhost:8080` | Platform API URL. From Docker on Linux/macOS, use `http://host.docker.internal:8080` to reach the host machine. |
| `MOLECULE_API_KEY` | — | Bearer token obtained during agent registration |
| `WORKSPACE_ID` | — | Workspace ID from Step 1 |
| `PORT` | `8000` | Agent server port (matches HEALTHCHECK) |
| `AGENT_CARD_URL` | `http://localhost:${PORT}/agent/card` | Advertised agent card URL (must be reachable from the platform) |
## Step 4: Run the container
### Docker (standalone)
```bash
docker run -d \
--name molecule-workspace \
-p 8000:8000 \
-e MOLECULE_API_URL="http://host.docker.internal:8080" \
-e MOLECULE_API_KEY="your-agent-bearer-token" \
-e WORKSPACE_ID="your-workspace-id" \
-e PORT=8000 \
"${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
```
> **Note for Linux hosts:** Docker does not include `host.docker.internal` by default. On Linux, either add `--add-host=host.docker.internal:host-gateway` to the `docker run` command, or use the host machine's IP address directly (e.g. `http://192.168.1.100:8080`).
### Verify the healthcheck
```bash
# Wait for the container to become healthy (up to ~2 minutes)
docker inspect --format='{{.State.Health.Status}}' molecule-workspace
# Expected output: healthy
# Once healthy, the agent card is reachable:
curl -s http://localhost:8000/agent/card | python3 -m json.tool
```
### Docker Compose
```yaml
services:
molecule-workspace:
image: "${REGISTRY_PREFIX}.dkr.ecr.us-east-1.amazonaws.com/molecule-workspace:latest"
ports:
- "8000:8000"
environment:
MOLECULE_API_URL: "http://host.docker.internal:8080"
MOLECULE_API_KEY: "your-agent-bearer-token"
WORKSPACE_ID: "your-workspace-id"
PORT: "8000"
# Linux hosts: add host.docker.internal resolution
# extra_hosts:
# - "host.docker.internal:host-gateway"
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/agent/card"]
interval: 30s
timeout: 5s
retries: 3
start_period: 30s
```
## Step 5: Graceful shutdown
The workspace agent supports graceful shutdown via a `stop_event: threading.Event`. When the container receives SIGTERM (e.g. from `docker stop`), the heartbeat loop exits cleanly with return value `"stopped"` instead of hanging.
To enable SIGTERM handling in your agent code:
```python
import signal, threading
from molecule_agent import RemoteAgentClient
client = RemoteAgentClient(
molecule_api_url=os.environ["MOLECULE_API_URL"],
api_key=os.environ["MOLECULE_API_KEY"],
workspace_id=os.environ["WORKSPACE_ID"],
)
stop_event = threading.Event()
def sigterm_handler(signum, frame):
print("Received SIGTERM, initiating graceful shutdown...")
stop_event.set()
signal.signal(signal.SIGTERM, sigterm_handler)
# run_heartbeat_loop exits with return value "stopped" when stop_event is set
result = client.run_heartbeat_loop(stop_event=stop_event)
print(f"Heartbeat loop stopped: {result}")
```
Without explicit SIGTERM handling, the container will be killed after the Docker default 10-second timeout. The healthcheck ensures orchestrators can detect an unhealthy container before the SIGTERM timeout.
## Kubernetes deployment
For Kubernetes deployments, use the native liveness/readiness probe configuration instead of the Docker HEALTHCHECK:
```yaml
ports:
- name: http
containerPort: 8000
livenessProbe:
httpGet:
path: /agent/card
port: http
initialDelaySeconds: 30
periodSeconds: 30
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /agent/card
port: http
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
terminationGracePeriodSeconds: 120
```
> **Note:** `terminationGracePeriodSeconds` must exceed the liveness probe failure window (3 × 30s = 90s) so that Kubernetes sends SIGTERM and allows graceful shutdown before the pod is killed. The 120s value here gives a 30s buffer beyond the 90s threshold.
## Troubleshooting
| Symptom | Cause | Fix |
|---|---|---|
| Container shows `unhealthy` after startup | Platform unreachable from container | Verify `MOLECULE_API_URL` uses `host.docker.internal` (Docker) or the correct host IP |
| `curl: (7) Failed to connect` on healthcheck | Container not fully started | Wait up to 30s; increase `start_period` |
| Agent not appearing on canvas | Wrong `WORKSPACE_ID` or expired token | Re-run registration; check platform logs |
| `host.docker.internal` not resolved | Linux host without the Docker flag | Use `--add-host=host.docker.internal:host-gateway` or the host's LAN IP |