# OpenTelemetry
Netclaw can push metrics and structured logs to any collector that speaks the OpenTelemetry Protocol (OTLP). Flip it on, point it at your collector, and your existing backend (Grafana, Datadog, Honeycomb, whatever speaks OTLP) gets per-channel message flow metrics, token consumption counters, and full daemon logs.
## Configuration

Merge a `Telemetry` block into your existing `~/.netclaw/config/netclaw.json`:

```json
{
  "Telemetry": {
    "Enabled": true,
    "Otlp": {
      "Endpoint": "http://127.0.0.1:4317"
    }
  }
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `Enabled` | bool | `false` | Turns on the OTLP export pipeline |
| `Otlp:Endpoint` | string | `http://127.0.0.1:4317` | OTLP collector endpoint (gRPC) |
Netclaw uses gRPC OTLP on port 4317, not HTTP/Protobuf (4318). If your collector only accepts HTTP OTLP, you’ll get silent failures.
Environment variable overrides follow the .NET double-underscore convention:
```bash
export NETCLAW_Telemetry__Enabled="true"
export NETCLAW_Telemetry__Otlp__Endpoint="http://127.0.0.1:4317"
```

Telemetry config changes need a daemon restart. Run `netclaw doctor` first to catch config errors before you bounce the daemon:
```bash
netclaw doctor
netclaw daemon stop && netclaw daemon start
```

## Validation

If `Otlp:Endpoint` isn't a valid absolute URI, the daemon refuses to start, even with telemetry disabled:

```
Telemetry:Otlp:Endpoint must be an absolute URI.
```

`netclaw doctor` catches this before you hit it at startup. It validates the endpoint format and warns when telemetry is on but no explicit endpoint is set.
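The scheme is what makes a URI absolute, so as an illustration (the exact parsing is the daemon's, so treat the failing case as an assumption), a bare host:port fails while the full form passes:

```
http://127.0.0.1:4317   # passes: absolute URI
127.0.0.1:4317          # fails: no scheme, daemon refuses to start
```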
## What gets exported

The OTel resource service name is `netclawd` (hardcoded, not configurable).

Every log line the daemon produces goes to your collector, with full formatting and scope data (`IncludeFormattedMessage`, `IncludeScopes`, `ParseStateValues` all enabled). Two meters cover metrics: one for session-level token usage, one for per-channel message flow. Full reference below.
Distributed tracing is off for now. The cross-actor model produces disconnected spans with no meaningful causality chain, so it’s more noise than signal.
## Metrics reference

### Session metrics (`Netclaw.Sessions`)

Token consumption and turn tracking across all sessions.

| Metric | Type | Description |
|---|---|---|
| `netclaw.session.tokens.input` | Counter | Input tokens consumed |
| `netclaw.session.tokens.output` | Counter | Output tokens consumed |
| `netclaw.session.turns.completed` | Counter | Conversation turns completed |
These are aggregate totals across all models and providers, with no per-model or per-provider attribute breakdowns.
### Channel metrics (`Netclaw.Channels`)

Per-channel message pipeline metrics. Each metric name is prefixed with `netclaw.channel.{channel-type}`, where the channel type is one of: `slack`, `tui`, `headless`, `signalr`, `reminder`, `webhook`, `discord`.
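For example, combining that prefix with two of the suffixes from the table below, the Slack channel emits:

```
netclaw.channel.slack.events.received
netclaw.channel.slack.reply.duration.ms
```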
| Metric suffix | Type | Attributes | Description |
|---|---|---|---|
| `.events.received` | Counter | `kind` | Inbound events received |
| `.events.dropped` | Counter | `reason` | Events dropped before processing |
| `.events.filtered` | Counter | `reason` | Events filtered by policy |
| `.events.routed` | Counter | `kind` | Events that reached conversation routing |
| `.messages.enqueued` | Counter | — | Messages accepted into session queue |
| `.replies.posted` | Counter | — | Successful replies sent |
| `.replies.rejected` | Counter | `error_code` | Rejected reply attempts |
| `.replies.failed` | Counter | — | Failed reply attempts |
| `.reply.duration.ms` | Histogram | — | Reply post latency in milliseconds (includes both successful and failed attempts) |
`Netclaw.Webhooks` (the meter behind `netclaw stats`) is not wired into OTLP. The `netclaw.channel.webhook.*` metrics above cover message flow through the webhook channel; per-route delivery stats only show up in `netclaw stats`.
## Diagnostic queries

| Symptom | What to check |
|---|---|
| No replies | `events.received` > 0 but `replies.posted` = 0. Events arrive, nothing comes back. |
| Looping agent | `turns.completed` climbing without corresponding `replies.posted` |
| Policy dropping messages | `events.dropped` with the `reason` attribute showing the cause |
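To build dashboards on these checks: a minimal sketch, assuming the collector forwards metrics to Prometheus with the default OTel-to-Prometheus name translation (dots become underscores, counters gain a `_total` suffix) and using the `slack` channel as a stand-in. Verify the translated names in your own backend before wiring up alerts.

```promql
# No replies: events arriving but nothing going out
sum(rate(netclaw_channel_slack_events_received_total[5m]))
sum(rate(netclaw_channel_slack_replies_posted_total[5m]))

# Looping agent: turns completing without matching replies
sum(rate(netclaw_session_turns_completed_total[5m]))

# Policy drops, broken out by reason
sum by (reason) (rate(netclaw_channel_slack_events_dropped_total[5m]))

# Reply latency p95 from the duration histogram
histogram_quantile(0.95,
  sum by (le) (rate(netclaw_channel_slack_reply_duration_ms_bucket[5m])))
```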
## Collector setup

Point the endpoint at any OpenTelemetry Collector that accepts gRPC OTLP. A minimal local setup with Docker:

```bash
docker run -d --name otel-collector \
  -p 4317:4317 \
  otel/opentelemetry-collector-contrib:latest
```

Route to your backend (Prometheus, Grafana Cloud, Datadog, etc.) via the collector's exporter configuration.
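The stock image boots with its bundled default config. To control routing yourself, mount your own; here's a minimal sketch (the file name and `debug` exporter choice are illustrative, not anything netclaw requires) that accepts gRPC OTLP and prints everything it receives to the collector's own log:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

processors:
  batch: {}

exporters:
  # Dumps received telemetry to the collector's stdout; swap in
  # prometheus, loki, or otlp exporters for a real backend.
  debug:
    verbosity: normal

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```

Add `-v "$PWD/otel-config.yaml":/etc/otelcol-contrib/config.yaml` to the `docker run` above to mount it over the contrib image's default config path.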
For a quick local stack, Grafana OTel-LGTM bundles the collector, Prometheus, Loki, and Grafana in one container. Good for kicking the tires before committing to a production backend.
## Verifying the pipeline

After restarting the daemon with telemetry enabled:

- `netclaw status` should show the telemetry row as enabled with your endpoint
- `netclaw doctor` validates the endpoint URI format
- Send a test message through any channel and look for `netclaw.channel.*` metrics in your collector
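If you mounted the minimal collector config sketched in the setup section, the `debug` exporter turns the container log into a crude end-to-end check:

```bash
# Daemon side: the telemetry row should read enabled, with your endpoint
netclaw status

# Collector side: exported metrics and logs land in the container log
docker logs otel-collector 2>&1 | grep -i netclaw
```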
Once data is flowing, build dashboards around the diagnostic queries above. Operational Alerts covers netclaw’s built-in webhook notifications, which work alongside OTel-based alerting.
## Related pages

- Operational Alerts for outbound webhook notifications
- `netclaw status` shows the telemetry row and OTLP endpoint
- `netclaw doctor` validates OTLP endpoint configuration
- `netclaw stats` has in-process counters, including webhook metrics that aren't in OTLP
## Resources

- OpenTelemetry Collector documentation covers setup, configuration, and deployment
- OTLP specification describes the wire protocol netclaw uses
- Grafana OTel-LGTM Docker image is an all-in-one local observability stack
- .NET environment variable configuration explains the double-underscore nesting convention