Health Checks
The daemon exposes two HTTP health endpoints for container orchestration, load balancers, and monitoring systems.
Endpoints
| Endpoint | Auth | Purpose |
|---|---|---|
| `GET /api/health/ready` | None | Liveness/readiness probe |
| `GET /api/health/status` | Required | Full subsystem status |
Both listen on the daemon’s configured port (default 5199).
Readiness probe
```bash
curl -sf http://127.0.0.1:5199/api/health/ready
```

Returns `200 OK` with the body `healthy` when the daemon is accepting requests. No authentication and no JSON, just a plain string. Use this for Docker HEALTHCHECK, Kubernetes liveness probes, and load balancer health checks.
Any non-200 response (or connection refused) means the daemon isn’t ready.
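For callers that cannot shell out to curl, the same rule (200 means ready, anything else means not ready) can be sketched in Python with only the standard library. The function names `is_ready` and `wait_until_ready` and the 60-second deadline are illustrative assumptions, not part of the daemon or its CLI.

```python
import http.client
import time
import urllib.request

# Default port taken from this page; adjust if the daemon is configured differently.
READY_URL = "http://127.0.0.1:5199/api/health/ready"


def is_ready(url: str = READY_URL, timeout: float = 5.0) -> bool:
    """Return True only if the probe answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, and non-2xx responses
        # (urllib.error.HTTPError is a subclass of OSError).
        return False


def wait_until_ready(url: str = READY_URL, deadline_s: float = 60.0,
                     interval_s: float = 2.0) -> bool:
    """Poll the probe until it succeeds or the (assumed) deadline passes."""
    stop_at = time.monotonic() + deadline_s
    while True:
        if is_ready(url):
            return True
        if time.monotonic() + interval_s > stop_at:
            return False
        time.sleep(interval_s)
```

A deployment script could call `wait_until_ready()` after starting the daemon and abort if it returns `False`.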
Docker HEALTHCHECK
The official container image uses this:

```dockerfile
HEALTHCHECK --interval=15s --timeout=5s --start-period=30s --retries=3 \
  CMD curl -sf http://127.0.0.1:5199/api/health/ready || exit 1
```

The 30-second start period gives the daemon time to initialize providers and connect channels: probe failures during that window do not count toward the retry limit.
Kubernetes
```yaml
livenessProbe:
  httpGet:
    path: /api/health/ready
    port: 5199
  initialDelaySeconds: 30
  periodSeconds: 15
  timeoutSeconds: 5
  failureThreshold: 3
```

Systemd
For systemd-managed daemons, use an ExecStartPost check or a watchdog timer:

```ini
[Service]
ExecStartPost=/bin/sh -c 'until curl -sf http://127.0.0.1:5199/api/health/ready; do sleep 2; done'
```

Status endpoint
```bash
curl -s http://127.0.0.1:5199/api/health/status \
  -H "Authorization: Bearer $(netclaw auth token)"
```

Returns a JSON object with the state of every subsystem. This is the same data that `netclaw status` displays; the CLI just formats it as a table.
Requires authentication (bearer token or loopback origin). Returns 401 without valid credentials.
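As a rough Python equivalent of the curl call, the sketch below builds the authenticated request and decodes the JSON body. The helper names `build_status_request` and `fetch_status` are hypothetical, and the token is assumed to be obtained separately (for example from `netclaw auth token`, as in the curl example).

```python
import json
import urllib.request

# Default port taken from this page.
STATUS_URL = "http://127.0.0.1:5199/api/health/status"


def build_status_request(token: str, url: str = STATUS_URL) -> urllib.request.Request:
    """Build a GET request carrying the bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_status(token: str, url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """Fetch and decode the status document.

    urllib raises urllib.error.HTTPError on a 401, so missing or rejected
    credentials surface as an exception rather than as a payload.
    """
    request = build_status_request(token, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Keeping request construction separate from the network call makes the header logic easy to check without a running daemon.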
Response structure
```json
{
  "overall": "degraded",
  "build": {
    "version": "0.4.2",
    "commitHash": "a1b2c3d",
    "buildTimestamp": "2026-05-01T12:00:00Z"
  },
  "process": {
    "pid": 1234,
    "startedAtUtc": "2026-05-05T08:00:00Z",
    "uptimeSeconds": 3600
  },
  "connectors": [
    { "key": "slack", "displayName": "Slack", "enabled": true, "status": "healthy", "message": null },
    { "key": "discord", "displayName": "Discord", "enabled": true, "status": "disconnected", "message": "Gateway timeout" },
    { "key": "mcp:github", "displayName": "GitHub MCP", "enabled": true, "status": "healthy", "message": null }
  ],
  "model": {
    "modelId": "anthropic/claude-sonnet-4",
    "displayName": "Claude Sonnet 4",
    "provider": "openrouter",
    "inputModalities": ["text", "image"],
    "outputModalities": ["text"],
    "contextWindow": 200000
  },
  "persistence": { "provider": "sqlite" },
  "memory": {
    "provider": "sqlite",
    "status": "healthy",
    "databasePath": "/root/.netclaw/memory/netclaw-memory.db",
    "pendingCheckpoints": 0
  },
  "reminders": { "scheduledCount": 3, "activeExecutions": 0, "failedCount": 0 },
  "telemetry": { "enabled": true, "otlpEndpoint": "http://localhost:4317" },
  "update": {
    "state": "up-to-date",
    "available": false,
    "currentVersion": "0.4.2",
    "latestVersion": "0.4.2"
  }
}
```

In this example `overall` is `degraded` because the enabled Discord connector is disconnected.

Overall status
The `overall` field is computed from connector states:

| Overall | Condition |
|---|---|
| `healthy` | All enabled connectors are healthy |
| `degraded` | Any enabled connector is disconnected, degraded, auth-required, or auth-failed |
The HTTP status code is always 200 — parse the overall field to determine health. This avoids false-positive container restarts when a single channel has a transient disconnect.
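Because the status code carries no health signal, a monitoring script has to inspect the body. The sketch below re-implements the rule from the table above for illustration and maps `overall` to a process exit code. It is an assumption-based reading of this page, not the daemon's own code, and the names `compute_overall` and `exit_code` are hypothetical.

```python
# Connector states that degrade the overall result, per the table above.
DEGRADING_STATES = {"disconnected", "degraded", "auth-required", "auth-failed"}


def compute_overall(connectors: list[dict]) -> str:
    """Return "degraded" if any enabled connector is in a degrading state."""
    for connector in connectors:
        if connector.get("enabled") and connector.get("status") in DEGRADING_STATES:
            return "degraded"
    return "healthy"


def exit_code(status: dict) -> int:
    """Map the payload's overall field to 0 (healthy) or 1 (anything else)."""
    return 0 if status.get("overall") == "healthy" else 1
```

Treating every value other than `healthy` as non-zero keeps the script conservative if new states appear later.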
Connector statuses
Each connector (Slack, Discord, MCP servers) reports one of:

| Status | Meaning |
|---|---|
| `healthy` | Connected and operational |
| `degraded` | Partially functional (e.g., reconnecting) |
| `disconnected` | Connection lost |
| `auth-required` | Needs OAuth flow (MCP servers) |
| `auth-failed` | Credentials rejected |
| `disabled` | Turned off in config |
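To turn these statuses into something actionable, a script can list the enabled connectors that are neither healthy nor disabled, together with their `message`. This is a minimal sketch over the documented `connectors` array; the function name and the output format are illustrative assumptions.

```python
def connectors_needing_attention(status: dict) -> list[str]:
    """Return one summary line per enabled connector that is not healthy."""
    lines = []
    for connector in status.get("connectors", []):
        if not connector.get("enabled"):
            continue
        if connector.get("status") in ("healthy", "disabled"):
            continue
        detail = connector.get("message") or "no message"
        lines.append(f'{connector["key"]}: {connector["status"]} ({detail})')
    return lines
```

An empty list means no enabled connector reported a problem.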
Memory status
| Status | Meaning |
|---|---|
| `healthy` | Database accessible, checkpoint backlog ≤ 25 |
| `degraded` | Checkpoint backlog growing (> 25 pending) |
| `unavailable` | Database unreachable |
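The thresholds in this table can be written as a small decision function, which makes the boundary at 25 pending checkpoints explicit. This is only a reading of the table: the daemon's real check may weigh other signals, and `classify_memory` is a hypothetical name.

```python
BACKLOG_LIMIT = 25  # threshold taken from the table above


def classify_memory(db_reachable: bool, pending_checkpoints: int) -> str:
    """Classify memory health from reachability and checkpoint backlog."""
    if not db_reachable:
        return "unavailable"
    if pending_checkpoints > BACKLOG_LIMIT:
        return "degraded"
    return "healthy"
```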
Monitoring integration
Prometheus / Grafana

Poll `/api/health/status` on an interval and extract metrics:

```bash
# Simple availability check (no auth needed)
curl -sf http://127.0.0.1:5199/api/health/ready && echo 1 || echo 0
```

For richer metrics, use the OpenTelemetry integration, which exports to OTLP directly.
Operational alerts
When a connector transitions to `disconnected` or `auth-failed`, the daemon fires an operational alert to configured webhook targets. You don't need to poll the status endpoint to detect failures, because alerts are pushed to you.
Related pages
- `netclaw status`: CLI that formats this endpoint's response as a table
- `netclaw doctor`: offline diagnostics for configuration issues
- Operational Alerts: push notifications on health state changes
- OpenTelemetry: metrics and traces export
- Docker Deployment: container health check configuration
Resources
- ASP.NET Health Checks: the framework behind the endpoints
- Kubernetes Probes: configuring liveness and readiness probes
- Docker HEALTHCHECK: container health check instruction reference