Routing OTLP exceptions through the OTel Collector to a Sentry-compatible backend
An OTel SDK calls recordException() in your service. Twelve hops later the exception lands in your error tracker. The OpenTelemetry Collector owns most of those hops. This is the routing recipe — receiver, OTTL transform, exporter — and the three failure modes that drop exceptions before anyone sees them.
20 seconds. An OTel SDK emits exception data over OTLP as either a span event or a log record. The Collector receives both through the standard otlp receiver. A transform processor written in OTTL filters benign exceptions and fills in missing attributes. An exporter sends the result to a Sentry-compatible backend. urgentry accepts OTLP-native ingest, so the exporter is otlphttp pointed at /v1/traces and /v1/logs with no envelope translation in the path.
60 seconds. Direct SDK-to-backend OTLP works for one service and breaks at three. The Collector earns its slot by batching across services, retrying on backend hiccups, scrubbing PII in one place, and providing a single config point when the backend moves. For exception data specifically, the Collector is where you decide what counts as a real exception versus deployment noise — cancellation errors and intentional test panics flood the backend if nothing filters them. The exporter side has two shapes: OTLP-native to a backend that speaks OTLP (urgentry), or an envelope translator to a backend that only speaks the legacy Sentry envelope (Sentry SaaS, older GlitchTip). The native path is fewer moving parts; the envelope path is what you reach for when the backend forces it.
This guide covers the OTLP exception data model, an OTTL transform that filters and decorates, both exporter shapes with a complete config, and the three operational failures that quietly drop exceptions on the way to the backend.
Why the Collector belongs in the path
OTel SDKs ship OTLP exporters and can post directly to a backend. For a single binary on a single VPS, that direct path is fine. For anything with multiple services, the Collector belongs in the middle for five reasons that all show up the first time you skip it.
- Batching across services. Each SDK has its own queue and its own retry timer. The Collector merges N noisy queues into one steady stream with predictable size and frequency. Your backend sees one batch every two seconds, not N independent floods.
- Retry on backend hiccups. SDK queues are bounded in memory. When the backend is briefly down, an SDK queue fills in tens of seconds and starts dropping. The Collector can persist its queue to disk through the file_storage extension and ride out a multi-minute outage without loss.
- One place to scrub PII. An exception message can interpolate query strings, request bodies, or user identifiers. Doing PII redaction in every SDK requires every team to update every service. Doing it once in the Collector is one config change. See our PII scrubbing guide for the patterns that actually catch leaks.
- Backend changes without service redeploys. When you move from Sentry SaaS to urgentry, or split errors and traces across two backends, the change is a Collector restart. No application code touches OTel config.
- Tail-based sampling. Per-trace decisions require seeing every span in the trace. The SDK does not have that view. The Collector does, and the tail_sampling processor is what makes "keep every error trace, sample the rest" possible. Our tail-based sampling guide covers it in detail.
The cost is one more process to run, one more config file to version, and a small latency budget. The benefit is every concern above in one place.
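As a rough illustration of the tail-based sampling point, a policy set along these lines keeps error traces and samples the remainder. This is a sketch rather than a tuned config: the policy names, the 10s decision_wait, and the 10 percent rate are placeholder assumptions, and the tail_sampling processor ships in opentelemetry-collector-contrib rather than the core distribution.

```yaml
processors:
  tail_sampling:
    # How long to buffer a trace's spans before deciding (assumed value).
    decision_wait: 10s
    policies:
      # Keep every trace that contains a span with ERROR status.
      - name: keep-error-traces      # hypothetical policy name
        type: status_code
        status_code:
          status_codes: [ERROR]
      # Keep roughly 10% of everything else (assumed rate).
      - name: sample-the-rest        # hypothetical policy name
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
```

A trace is kept when any policy votes to sample it. Note that the first policy matches on span status, so it only catches exceptions whose span was also marked with an ERROR status; recording an exception does not set the status by itself.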
How exception data arrives at the Collector
OTel SDKs record exceptions through two parallel paths. The split exists because the semantic conventions changed mid-stream and the SDKs kept both paths for backward compatibility.
The classic path is a span event. When your code calls span.RecordException(err) or span.recordException(e), the SDK appends an event named exception to the active span. The event carries four attributes from the OpenTelemetry semantic conventions:
- exception.type: the fully qualified type name, like java.lang.NullPointerException or ValueError.
- exception.message: the exception's message string.
- exception.stacktrace: the stringified stack trace.
- exception.escaped: a boolean indicating whether the exception escaped the span scope.
The newer path is a log record. The OTel logs SIG settled on emitting exceptions as standalone log records with the same attribute schema, plus an event.name of exception, and the originating trace_id and span_id carried as top-level fields on the log record itself (not as resource attributes). The reasoning was that exceptions are signal records, not part of the span lifecycle, and routing them through the logs pipeline gives them their own sampling and retention rules.
Both paths still ship in production SDKs. Java, .NET, and Python emit span events by default and log records when the logs SDK is wired up. Go favors span events. The Collector receives both via the same otlp receiver — span events arrive in the traces pipeline, log records arrive in the logs pipeline, and you wire both into the exporter.
On the wire, the receiver looks the same as any OTel deployment:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
gRPC on 4317, HTTP on 4318. Both ports accept traces, logs, and metrics. Nothing exception-specific lives at the receiver layer.
Filtering and decorating with OTTL
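The transform processor is the OTel Collector's general-purpose mutation tool. It uses OTTL, the OpenTelemetry Transformation Language, to set, rewrite, and delete attributes on spans, span events, log records, and metric data points. Dropping records outright is the job of the companion filter processor, which evaluates OTTL conditions written in the same syntax.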
For exception routing, four OTTL operations cover almost every case.
Drop benign exceptions. Every error tracker drowns in cancellation errors on day one. Go's context.Canceled, gRPC's Code.CANCELLED, HTTP client timeouts on graceful shutdown, and the test framework panics from your CI pipeline are all "exceptions" by the SDK definition and none of them are bugs. Drop them at the Collector:
processors:
  filter/exceptions:
    error_mode: ignore
    traces:
      spanevent:
        - 'name == "exception" and attributes["exception.type"] == "context.Canceled"'
        - 'name == "exception" and attributes["exception.type"] == "io.grpc.StatusRuntimeException" and IsMatch(attributes["exception.message"], "^CANCELLED")'
    logs:
      log_record:
        - 'attributes["exception.type"] == "context.Canceled"'
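One assumption in the filter above is worth checking against real data before relying on it. The Go SDK appears to derive exception.type from the error's reflected Go type, so the context.Canceled sentinel may arrive with a type of *errors.errorString and a message of context canceled, not with a type of context.Canceled. If that is what your services emit, a variant along these lines would match it. The processor name is hypothetical, and the exact strings are assumptions to confirm against your own span events.

```yaml
processors:
  filter/exceptions-go:            # hypothetical processor name
    error_mode: ignore
    traces:
      spanevent:
        # Assumption: Go's context.Canceled is recorded with a generic
        # error type and the fixed message "context canceled".
        - 'name == "exception" and attributes["exception.type"] == "*errors.errorString" and attributes["exception.message"] == "context canceled"'
    logs:
      log_record:
        - 'attributes["exception.type"] == "*errors.errorString" and attributes["exception.message"] == "context canceled"'
```

Matching on both type and message keeps the filter narrow. Wrapped errors carry a different type and a longer message, so they would need their own anchored condition.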
Backfill missing service identifiers. A common failure: a worker process forgets to set OTEL_SERVICE_NAME and ships exceptions with no service.name resource attribute. The backend cannot route the exception to a project. OTTL fills it in from the process name or a static default:
processors:
  transform/backfill:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], "unknown-service") where attributes["service.name"] == nil
    log_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], "unknown-service") where attributes["service.name"] == nil
Strip PII from exception messages. The riskiest fields are exception.message and exception.stacktrace, both of which can interpolate runtime values. An OTTL replace_pattern against an email regex is the simplest scrub:
processors:
  transform/scrub:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - replace_pattern(attributes["exception.message"], "[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+", "[REDACTED_EMAIL]")
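The scrub above covers the log-record path only. Exceptions that arrive as span events carry the same exception.message attribute, so a matching statement in the spanevent context is probably needed as well. A minimal sketch, reusing the same email pattern; the processor name is hypothetical:

```yaml
processors:
  transform/scrub-events:          # hypothetical processor name
    error_mode: ignore
    trace_statements:
      - context: spanevent
        statements:
          # Same email pattern as the log scrub, limited to exception events.
          - replace_pattern(attributes["exception.message"], "[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+", "[REDACTED_EMAIL]") where name == "exception"
```

The sketch leaves exception.stacktrace alone on purpose. Scrubbing it is possible with the same function, but running a regex over a large attribute on every record is the expensive case, so measure before enabling it.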
Normalize deployment context. The release-health line in urgentry depends on deployment.environment being one of a small allowed set: production, staging, development. Coerce stray values to that vocabulary so a typo in one service does not split your environment list:
processors:
  transform/normalize-env:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          - set(attributes["deployment.environment"], "production") where IsMatch(attributes["deployment.environment"], "^(prod|production|live)$")
          - set(attributes["deployment.environment"], "staging") where IsMatch(attributes["deployment.environment"], "^(stage|staging|test)$")
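The statements above run on the traces pipeline and cover two of the three allowed values. A hedged extension for the logs pipeline, with a third rule for development, might look like the following. The processor name and the dev, local aliases are assumptions; substitute whatever stray values your services actually send.

```yaml
processors:
  transform/normalize-env-logs:    # hypothetical processor name
    error_mode: ignore
    log_statements:
      - context: resource
        statements:
          - set(attributes["deployment.environment"], "production") where IsMatch(attributes["deployment.environment"], "^(prod|production|live)$")
          - set(attributes["deployment.environment"], "staging") where IsMatch(attributes["deployment.environment"], "^(stage|staging|test)$")
          # Assumed aliases for the third allowed value.
          - set(attributes["deployment.environment"], "development") where IsMatch(attributes["deployment.environment"], "^(dev|development|local)$")
```

Every pattern is anchored with ^ and $, so a value such as preproduction is left untouched instead of being folded into production.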
OTTL statements are parsed and validated once, at Collector start, so a malformed statement fails at startup instead of per record. The runtime cost is tens of microseconds per record. That cost becomes noticeable only when you run a regex against a large attribute like exception.stacktrace on every record; keep heavy matches scoped to small fields like exception.type.
Two exporter paths
The exporter is where Sentry compatibility actually happens. There are two shapes worth knowing.
OTLP-native. The cleanest path is the same protocol end to end. urgentry accepts OTLP HTTP on the standard signal endpoints: /v1/traces and /v1/logs. The Collector's otlphttp exporter posts to those endpoints directly. No envelope translation in the path, the same retry and queue semantics the Collector uses everywhere else, and one wire format to debug.
exporters:
  otlphttp/urgentry:
    endpoint: https://errors.yourdomain.com
    headers:
      Authorization: "Bearer ${URGENTRY_INGEST_TOKEN}"
    compression: gzip
    sending_queue:
      enabled: true
      num_consumers: 4
      queue_size: 5000
      storage: file_storage/queue
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s
The storage: file_storage/queue line points at the file_storage extension, which persists the export queue to disk. A five-minute backend outage no longer drops data — the queue rides through it. This is the same pattern documented in our silent 200 OK failures guide; it works on the SDK side too, but the Collector is where most teams put it.
Sentry envelope. The other shape applies when the backend speaks only the legacy Sentry envelope format. The sentry exporter in opentelemetry-collector-contrib translates OTLP spans into the envelope format and posts to a DSN-derived endpoint. The trade is one more translation step and a less direct relationship between OTLP semantics and what the backend stores.
exporters:
  sentry:
    dsn: https://PUBLIC_KEY@errors.yourdomain.com/PROJECT_ID
    environment: production
For urgentry, both exporters work, but the OTLP-native path is the supported route. Reach for the envelope path only when a second backend in your fanout requires it — see our side-by-side evaluation guide for the dual-backend pattern.
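For the fanout case, one way to wire it is to list both exporters on the traces pipeline. This is a sketch that assumes the processor and exporter names used elsewhere in this guide, and it assumes the sentry exporter handles traces only, which is worth confirming against the exporter's README for your Collector version.

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, filter/exceptions, transform/backfill, batch]
      # Each exporter receives its own copy of every batch that exits the pipeline.
      exporters: [otlphttp/urgentry, sentry]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, filter/exceptions, transform/backfill, batch]
      # Logs stay on the OTLP-native path only.
      exporters: [otlphttp/urgentry]
```

Each exporter keeps its own queue and retry state, so an outage on one backend does not have to stall delivery to the other.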
A complete config
End-to-end, the Collector config for exception routing to urgentry looks like this:
extensions:
  file_storage/queue:
    directory: /var/lib/otelcol/queue

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
    spike_limit_mib: 128
  filter/exceptions:
    error_mode: ignore
    traces:
      spanevent:
        - 'name == "exception" and attributes["exception.type"] == "context.Canceled"'
    logs:
      log_record:
        - 'attributes["exception.type"] == "context.Canceled"'
  transform/backfill:
    error_mode: ignore
    trace_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], "unknown-service") where attributes["service.name"] == nil
    log_statements:
      - context: resource
        statements:
          - set(attributes["service.name"], "unknown-service") where attributes["service.name"] == nil
  batch:
    timeout: 2s
    send_batch_size: 512

exporters:
  otlphttp/urgentry:
    endpoint: https://errors.yourdomain.com
    headers:
      Authorization: "Bearer ${URGENTRY_INGEST_TOKEN}"
    compression: gzip
    sending_queue:
      enabled: true
      queue_size: 5000
      storage: file_storage/queue
    retry_on_failure:
      enabled: true
      max_elapsed_time: 300s

service:
  extensions: [file_storage/queue]
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, filter/exceptions, transform/backfill, batch]
      exporters: [otlphttp/urgentry]
    logs:
      receivers: [otlp]
      processors: [memory_limiter, filter/exceptions, transform/backfill, batch]
      exporters: [otlphttp/urgentry]
The processor order is load-bearing. memory_limiter goes first so it can reject early under pressure. The exception filter goes next so dropped records do not consume work in later processors. The transform processor goes after the filter so it does not decorate records that are about to be dropped. batch goes last so the batch size reflects the records that actually exit.
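The complete config leaves out the scrub and environment-normalization processors from the OTTL section. If you add them, the same ordering logic suggests placing them after the filter and before batch. A sketch, assuming the transform/scrub (log statements) and transform/normalize-env (trace statements) definitions shown earlier are present under processors:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors:
        - memory_limiter            # reject early under memory pressure
        - filter/exceptions         # drop benign records before doing more work
        - transform/backfill
        - transform/normalize-env   # defined with trace_statements only
        - batch                     # batch what actually exits
      exporters: [otlphttp/urgentry]
    logs:
      receivers: [otlp]
      processors:
        - memory_limiter
        - filter/exceptions
        - transform/backfill
        - transform/scrub           # defined with log_statements only
        - batch
      exporters: [otlphttp/urgentry]
```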
Three failure modes that drop exceptions silently
The Collector is reliable when it is configured for the load it actually sees. Three misconfigurations cause silent exception loss in production, and all three return 200 OK to the upstream SDK while the exception goes nowhere.
Memory limiter rejects under spike load. The memory_limiter processor is correct to drop records when the Collector approaches its memory ceiling — the alternative is the Collector OOMing and losing everything in flight. But the rejection looks like a successful receive from the SDK's side. The fix is to size the Collector for the worst spike, not the average. Our rate limits and quotas guide covers the math for the spike case, and the same logic applies to the Collector's memory ceiling.
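One way to keep the limiter's ceiling tied to the host as it grows is to express it as a percentage instead of a fixed MiB figure. The values below are placeholder assumptions, not recommendations; derive them from your own spike measurements.

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    # Percentages of the memory available to the process, used in place
    # of limit_mib and spike_limit_mib.
    limit_percentage: 80           # assumed value
    spike_limit_percentage: 25     # assumed value
```

With percentages, moving the Collector to a larger container raises the ceiling without a config change, which helps when the fix for spike drops is simply more memory.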
The OTTL filter matches more than you meant. A regex like IsMatch(attributes["exception.type"], "Cancel") will catch context.Canceled, your intended target, and also OrderCancellationException, which is a real bug class in your checkout service. The fix is exact-match equality where possible, anchored regexes (^...$) where not. Verify by running the filter in a staging pipeline with the logging exporter at verbosity: detailed and checking which records no longer come out the other side.
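The export queue fills during a backend outage. The otlphttp exporter has a default queue_size of 1000. A five-minute backend outage at fifty exceptions per second produces 50 × 300 = 15,000 items; a 1000-slot queue holds one-fifteenth of them and refuses the other 14,000. (Depending on Collector version and configuration, a queue slot may hold a batched request instead of a single record, so treat one record per slot as the worst case.) The fix is the file_storage extension pointed at a persistent disk path with a queue size that survives your longest realistic outage. Sized at 100,000 items the queue holds about thirty-three minutes of data at the same rate, enough to ride through a thirty-minute backend window.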
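The sizing arithmetic can be written straight into the exporter config. This sketch assumes the fifty-per-second rate and thirty-minute window from the paragraph above, one record per queue slot, and the file_storage/queue extension defined earlier. It also raises max_elapsed_time, on the assumption that a 300s retry budget would otherwise give up on queued batches long before a thirty-minute outage ends.

```yaml
exporters:
  otlphttp/urgentry:
    endpoint: https://errors.yourdomain.com
    sending_queue:
      enabled: true
      # 50 items/s x 1,800 s (30 min) = 90,000 items; 100,000 leaves headroom.
      queue_size: 100000
      storage: file_storage/queue
    retry_on_failure:
      enabled: true
      # Assumption: keep retrying for as long as the outage you size for.
      max_elapsed_time: 1800s
```

Check the disk behind the storage directory as well: the queue can only persist what the volume has room for.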
All three failures share a pattern: the rejection happens inside the Collector, the SDK sees a successful upstream response, and no metric on the application side reflects the loss. The Collector's own internal telemetry (the otelcol_processor_dropped_* and otelcol_exporter_send_failed_* counters, exposed on the Collector's own metrics endpoint) is where the truth lives. Scrape those into the same backend you ship to. The Collector watching itself is the only thing that catches Collector-side drops.
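A self-scrape pipeline is one way to get those counters out of the Collector. The sketch below assumes the internal metrics are served in Prometheus format on the default port 8888 and that the prometheus receiver from opentelemetry-collector-contrib is available. The receiver, pipeline, and exporter names are hypothetical, and otlphttp/metrics stands in for whichever backend accepts OTLP metrics, since this guide only covers the traces and logs endpoints.

```yaml
receivers:
  prometheus/self:                 # hypothetical receiver name
    config:
      scrape_configs:
        - job_name: otelcol-self
          scrape_interval: 15s
          static_configs:
            # Assumes the Collector's own metrics endpoint on the default port.
            - targets: ["127.0.0.1:8888"]

service:
  pipelines:
    metrics/self:                  # hypothetical pipeline name
      receivers: [prometheus/self]
      processors: [memory_limiter, batch]
      exporters: [otlphttp/metrics]   # hypothetical exporter, defined elsewhere
```

Alert on any non-zero rate of the dropped and send-failed counters; on a healthy Collector they should stay flat.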
Verifying the route
Two checks separate "we routed it correctly" from "the next time we deploy we will find out we did not."
First, the logging exporter at verbosity: detailed prints every record that exits the pipeline (recent Collector releases ship the same exporter under the name debug). Add it to the exporters list temporarily, throw a known exception in a test service, and confirm it lands at the logging output with the expected service.name, exception.type, and exception.message. Then remove it.
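A minimal sketch of that temporary wiring, assuming a recent Collector release where the exporter is named debug (on older releases, substitute logging) and the processor names from the complete config:

```yaml
exporters:
  debug:
    # Print every record with its attributes to the Collector's own log output.
    verbosity: detailed

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, filter/exceptions, transform/backfill, batch]
      # Temporary: remove debug once the route is verified.
      exporters: [otlphttp/urgentry, debug]
```

Because the exporter sits at the end of the pipeline, it shows only the records that survived the filter and transforms. A test exception that never appears here was dropped inside the Collector, not at the backend.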
Second, a curl test takes the SDK out of the loop. Post a synthetic OTLP/JSON payload to the Collector's HTTP receiver:
curl -X POST http://collector:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{
    "resourceSpans": [{
      "resource": {
        "attributes": [{
          "key": "service.name",
          "value": {"stringValue": "verify-test"}
        }]
      },
      "scopeSpans": [{
        "spans": [{
          "traceId": "5b8aa5a2d2c872e8321cf37308d69df2",
          "spanId": "051581bf3cb55c13",
          "name": "verify-route",
          "kind": 1,
          "startTimeUnixNano": "1750000000000000000",
          "endTimeUnixNano": "1750000001000000000",
          "events": [{
            "timeUnixNano": "1750000000500000000",
            "name": "exception",
            "attributes": [
              {"key": "exception.type", "value": {"stringValue": "VerifyError"}},
              {"key": "exception.message", "value": {"stringValue": "this is a route test"}}
            ]
          }]
        }]
      }]
    }]
  }'
A 200 from the Collector and a corresponding event in urgentry within a few seconds confirms the route end to end. If the Collector returns 200 but urgentry has nothing, the failure is in the exporter or between the Collector and the backend — start with the Collector's own metrics endpoint and the export-failure counters.
Frequently asked questions
Do I need the OTel Collector to send exceptions from an OTel SDK to urgentry?
No. The OTel SDKs can post OTLP directly to urgentry's /v1/traces and /v1/logs endpoints. The Collector belongs in the path for the same reasons it belongs in any production OTel deployment: batching across services, retry on backend hiccups, PII scrubbing, and a single place to change the route when the backend moves. For a single-service hobby project, skip it. For anything with more than two services, run it.
What is the difference between an OTel span event and a log record for exceptions?
Both carry the same attribute schema (exception.type, exception.message, exception.stacktrace). The span event attaches the exception to a specific span as part of the trace; the log record is a standalone signal that carries the trace_id and span_id but lives in the logs pipeline. Most current SDKs emit one or the other based on the language and version. The Collector receives both paths through the standard OTLP receiver.
Will the OTTL transform processor slow down my ingest pipeline?
In practice, no. OTTL statements are parsed once at Collector start and run in tens of microseconds per record. The cost shows up only when an OTTL statement triggers a regex against a large attribute like exception.stacktrace on every record. Keep regex matches on small attributes like exception.type and you will not measure the overhead.
What happens if an exception arrives without a service.name?
The Collector accepts it, but the Sentry-compatible backend cannot route it to a project. urgentry uses service.name and deployment.environment to map incoming OTLP data to a project and environment. An exception without service.name lands in an unknown bucket. The OTTL transform in this guide adds a fallback so the route never drops the data.
Can I drop benign exceptions before they hit the backend?
Yes, and you should. Cancellation errors, context.Canceled, EOF on graceful shutdown, and intentional panics in tests are the four classes of benign exception that flood an error tracker on day one. An OTTL filter statement matched against exception.type drops them at the Collector before they consume backend storage or trigger alerts.
Sources
- OpenTelemetry semantic conventions: exceptions on spans. The canonical attribute schema for exception.type, exception.message, exception.stacktrace, and exception.escaped.
- OpenTelemetry semantic conventions: exceptions on logs. The newer log-record path for exceptions, with the same attribute schema plus event.name.
- opentelemetry-collector-contrib: transform processor. The OTTL transform processor documentation, including the trace/log/span event contexts and the full statement reference.
- opentelemetry-collector-contrib: filter processor. The filter processor documentation for dropping records by attribute match.
- opentelemetry-collector-contrib: sentry exporter. The community-maintained Sentry envelope exporter, for the envelope-path option in the two-exporter section.
- opentelemetry-collector-contrib: file_storage extension. The persistent-queue extension referenced in the silent-loss section.
- urgentry compatibility matrix. The source-scanned audit of all 218 Sentry REST API operations and the OTLP ingest endpoints used in this guide.