Running urgentry side-by-side with Sentry during evaluation.
One event landing in urgentry proves the protocol works. It does not prove that grouping is stable, that alert thresholds behave the same way, or that your on-call engineers will find what they need at 2 a.m. A parallel run against real production traffic answers those questions before you pull the cord.
20 seconds. After the switch proof passes, configure your SDK or collector to send the same traffic to both Sentry and urgentry for one week. At the end of the week, compare ingest counts, fingerprint stability, alert parity, and latency. If the numbers hold, you have the evidence you need to commit to a cutover.
60 seconds. Two patterns exist for splitting traffic. The dual-DSN approach uses the SDK transport layer to send to two endpoints simultaneously; it works in most SDKs without a proxy and is the right choice for teams without an OTel collector already running. The proxy-fanout approach puts an OTel collector between your services and both backends; it gives you finer control over sampling and routing but adds infrastructure you need to maintain. Pick the pattern that matches the infrastructure you already have.
The parallel run is not a load test. Its job is to surface silent differences: events that land in one system and not the other, issues that group differently, alerts that fire in Sentry but not urgentry. Go into the week with a written measurement plan. A gut-feel comparison at the end of seven days produces a gut-feel migration decision.
Why a parallel run beats a single proof
The switch proof is a binary test. Either an event lands in urgentry or it does not. Pass or fail, it takes thirty minutes, and it tells you exactly one thing: the protocol contract between your SDK and urgentry’s ingest server holds.
Production traffic is not one event. It is thousands of events per day across many services, with real fingerprinting patterns, real alert thresholds, and real on-call workflows. The differences that matter in a migration do not show up in a single-event proof. They show up in the second day of the parallel run when you notice that a high-frequency error is grouping into three urgentry issues instead of one Sentry issue. Or on day four when you realize the alert rule for payment failures has not fired in urgentry despite six real incidents.
A parallel run closes the gap between “the DSN swap works” and “urgentry is ready to be the system your team trusts during an incident.” Running both systems on the same traffic for a week is the cheapest way to find what sits in that gap before the cutover, not after.
Two architectural patterns
There are two ways to get the same events into both systems. The right one depends on the infrastructure your team already runs.
Pattern 1: Dual-DSN via SDK transport hooks
Most Sentry SDKs expose a transport hook or a transport override that lets you intercept the outbound HTTP call before it is sent. The canonical implementation wraps the default transport to POST the envelope to two endpoints in sequence (or concurrently). The SDK does the work; no new infrastructure runs in your network path.
This is the right pattern for teams that do not already operate an OTel collector. It is simpler, has fewer failure modes, and is reversible by swapping back to a single DSN.
The tradeoff: if urgentry’s ingest is slow to respond, the dual-send adds latency to the SDK’s background flush loop. In practice urgentry’s p99 ingest latency is well under 50ms on modest hardware, so the effect is negligible. Still worth measuring.
Pattern 2: Proxy fanout via OTel collector
If your services already route telemetry through an OpenTelemetry collector, you can configure the collector to export to both urgentry and Sentry (via urgentry’s OTLP ingest and Sentry’s OTLP endpoint, respectively). The collector handles the fan-out; individual services do not change at all.
This is the right pattern for teams with a collector already in the stack. It gives you pipeline-level control: you can sample, filter, and transform before deciding what each backend receives. It is also easier to revert cleanly because the services themselves are not changed.
The tradeoff: the collector becomes a single point of failure for both systems during the evaluation. If the collector drops events, the comparison data is worthless. Monitor the collector’s export metrics before drawing conclusions from your ingest count comparison.
Implementation: dual-DSN with the Sentry SDK
The Sentry SDK transport abstraction is consistent across major SDK versions. The pattern below shows the TypeScript/JavaScript SDK, which has the most complete transport hook surface. Go and Python follow the same shape with slightly different APIs.
JavaScript / TypeScript
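import * as Sentry from "@sentry/node";
import { makeNodeTransport } from "@sentry/node";
import type { Transport, Envelope } from "@sentry/types";

type NodeTransportOptions = Parameters<typeof makeNodeTransport>[0];

// Transport options take the envelope endpoint URL, not the raw DSN, so
// derive it here. Assumes the common DSN shape
// https://<public_key>@<host>/<project_id> with no path prefix.
function envelopeUrl(dsn: string): string {
  const { protocol, host, username, pathname } = new URL(dsn);
  const projectId = pathname.replace(/^\/+/, "");
  return `${protocol}//${host}/api/${projectId}/envelope/?sentry_key=${username}&sentry_version=7`;
}

function makeDualTransport(
  sentryDsn: string,
  urgentryDsn: string
): (options: NodeTransportOptions) => Transport {
  return (options) => {
    const sentryTransport = makeNodeTransport({ ...options, url: envelopeUrl(sentryDsn) });
    const urgentryTransport = makeNodeTransport({ ...options, url: envelopeUrl(urgentryDsn) });
    return {
      send(envelope: Envelope): ReturnType<Transport["send"]> {
        // Fire both; resolve on Sentry (primary) result.
        // Swallow secondary failures so they never become unhandled rejections.
        void Promise.resolve(urgentryTransport.send(envelope)).catch(() => undefined);
        return sentryTransport.send(envelope);
      },
      flush(timeout?: number): ReturnType<Transport["flush"]> {
        return Promise.all([
          sentryTransport.flush(timeout),
          urgentryTransport.flush(timeout),
        ]).then(([a]) => a);
      },
    };
  };
}

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  transport: makeDualTransport(
    process.env.SENTRY_DSN!,
    process.env.URGENTRY_DSN!
  ),
  // All other SDK options remain unchanged.
  tracesSampleRate: 1.0,
});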
The URGENTRY_DSN environment variable is the only new piece of configuration. Set it to the DSN from your urgentry project settings. The SDK uses the primary dsn field for everything except the outbound HTTP URL, which the transport override replaces.
Python
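import os

import sentry_sdk
from sentry_sdk.envelope import Envelope
from sentry_sdk.transport import HttpTransport


class DualTransport(HttpTransport):
    """Sends every envelope to Sentry (primary) and urgentry (secondary)."""

    def __init__(self, options):
        super().__init__(options)
        urgentry_opts = dict(options)
        urgentry_opts["dsn"] = os.environ["URGENTRY_DSN"]
        self._urgentry = HttpTransport(urgentry_opts)

    def capture_envelope(self, envelope: Envelope) -> None:
        # HttpTransport queues the envelope on its own background worker,
        # so the secondary send does not block the primary path.
        try:
            self._urgentry.capture_envelope(envelope)
        except Exception:
            pass  # urgentry problems must never affect delivery to Sentry
        super().capture_envelope(envelope)

    def flush(self, timeout, callback=None):
        # Drain both queues on shutdown so the comparison data is complete.
        self._urgentry.flush(timeout)
        super().flush(timeout, callback)


sentry_sdk.init(
    dsn=os.environ["SENTRY_DSN"],
    # Pass the class, not a factory function: the SDK treats a plain
    # callable as a legacy per-event function transport.
    transport=DualTransport,
    # All other options unchanged.
)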
Go
package main

import (
	"log"
	"os"
	"time"

	"github.com/getsentry/sentry-go"
)

// DualTransport fans out to Sentry and urgentry.
// Newer sentry-go releases may require additional Transport methods;
// forward them to both transports in the same way.
type DualTransport struct {
	primary   sentry.Transport
	secondary sentry.Transport
}

func (d *DualTransport) Flush(timeout time.Duration) bool {
	a := d.primary.Flush(timeout)
	b := d.secondary.Flush(timeout)
	return a && b
}

func (d *DualTransport) Configure(options sentry.ClientOptions) {
	d.primary.Configure(options)
	// Same options, but pointed at the urgentry DSN.
	secondaryOptions := options
	secondaryOptions.Dsn = os.Getenv("URGENTRY_DSN")
	d.secondary.Configure(secondaryOptions)
}

func (d *DualTransport) SendEvent(event *sentry.Event) {
	go d.secondary.SendEvent(event) // non-blocking
	d.primary.SendEvent(event)
}

func main() {
	err := sentry.Init(sentry.ClientOptions{
		Dsn: os.Getenv("SENTRY_DSN"),
		Transport: &DualTransport{
			primary:   sentry.NewHTTPTransport(),
			secondary: sentry.NewHTTPTransport(),
		},
	})
	if err != nil {
		log.Fatalf("sentry.Init: %v", err)
	}
	// Give both transports a chance to drain before the process exits.
	defer sentry.Flush(2 * time.Second)
}
In all three cases the pattern is the same: the secondary transport (urgentry) is fire-and-forget from the perspective of the primary error flow. If urgentry is unreachable during the evaluation window, the SDK continues sending to Sentry without interruption.
Implementation: collector fanout (OTel)
If you already run an OpenTelemetry Collector, the configuration is simpler than the SDK-level work above. Add urgentry as a second exporter alongside your existing Sentry or OTLP exporter and use a fanout pipeline.
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  # Existing Sentry OTLP exporter
  otlphttp/sentry:
    endpoint: https://o0.ingest.sentry.io/api/0/envelope/
    headers:
      X-Sentry-Auth: "Sentry sentry_key=${SENTRY_KEY},sentry_version=7"
  # New urgentry OTLP exporter: same protocol, different host
  otlphttp/urgentry:
    endpoint: http://${URGENTRY_HOST}:4318
    headers:
      X-Sentry-Auth: "Sentry sentry_key=${URGENTRY_KEY},sentry_version=7"
processors:
  batch:
    timeout: 1s
    send_batch_size: 1024
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/sentry, otlphttp/urgentry]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/sentry, otlphttp/urgentry]
The OTel collector’s fanout pipeline sends to both exporters for every pipeline run. The collector batches internally, so the cost is one extra HTTP connection per batch cycle, not one extra connection per event.
One operational note: set retry_on_failure on the urgentry exporter separately from the Sentry exporter. During the evaluation window you do not want urgentry’s retry backoff to hold up Sentry deliveries. Keep the exporters independent:
exporters:
  otlphttp/urgentry:
    endpoint: http://${URGENTRY_HOST}:4318
    retry_on_failure:
      enabled: true
      max_elapsed_time: 30s # don't wait indefinitely during eval
    sending_queue:
      enabled: true
      num_consumers: 4
      queue_size: 500
What to measure during the week
A parallel run without a measurement plan produces anecdote, not evidence. Write down what you will check before day one starts. The five categories below are the minimum.
1. Ingest count parity
Pull the event count from both systems at the same time each day. urgentry exposes a stats API at /api/0/projects/{org}/{project}/stats/ using the same schema as Sentry. The counts will not be identical: SDKs sample, retry, and sometimes drop events under backpressure. A difference of less than 2% is noise. A consistent daily drift above 5% warrants investigation. Check urgentry’s ingest error log first; a parsing failure on a specific SDK version will show up as a per-envelope error.
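The daily comparison can be reduced to a few lines of arithmetic. The sketch below is illustrative only: the function names, the sample counts, and the intermediate "watch" band between 2% and 5% are assumptions, not part of urgentry or Sentry. It assumes you have already pulled one count per system per day.

```python
def count_drift(sentry_count: int, urgentry_count: int) -> float:
    """Relative difference between the two counts, as a percentage of the Sentry count."""
    if sentry_count == 0:
        return 0.0 if urgentry_count == 0 else float("inf")
    return abs(sentry_count - urgentry_count) / sentry_count * 100.0


def classify_drift(drift_pct: float) -> str:
    """Thresholds from the text: under 2% is noise, above 5% needs investigation.

    The "watch" label for the band in between is an assumption of this sketch.
    """
    if drift_pct < 2.0:
        return "noise"
    if drift_pct > 5.0:
        return "investigate"
    return "watch"


# Hypothetical daily counts, for illustration only: (sentry, urgentry).
daily_counts = {"day1": (10_000, 9_920), "day2": (12_000, 11_280)}
for day, (sentry_n, urgentry_n) in daily_counts.items():
    pct = count_drift(sentry_n, urgentry_n)
    print(f"{day}: {pct:.1f}% drift, {classify_drift(pct)}")
```

A single day above 5% is a prompt to look at the ingest error log; the text's concern is drift that repeats day after day.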
2. Fingerprint stability
Take the top ten issues by event count in Sentry. Find the corresponding issue in urgentry. Confirm the fingerprints are producing the same grouping. A mismatch means either urgentry’s default grouping algorithm differs from Sentry’s for a specific stack frame pattern, or a custom fingerprinting rule in Sentry was not migrated. Both are fixable; find them now.
The fingerprint mismatch to worry about is the one that splits a single high-frequency Sentry issue into many small urgentry issues. That pattern inflates urgentry’s issue list and makes alert thresholds meaningless.
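One way to spot that split pattern mechanically is to join the two systems on event ID and see how many urgentry issues each Sentry issue fans out to. This is a minimal sketch under stated assumptions: the two input mappings (event ID to issue ID) are hypothetical exports you would build yourself, and the function name is illustrative.

```python
from collections import defaultdict


def find_split_issues(
    sentry_groups: dict[str, str],
    urgentry_groups: dict[str, str],
) -> dict[str, set[str]]:
    """Return Sentry issues whose events landed in more than one urgentry issue.

    Both arguments map event ID -> issue ID in the respective system.
    Events missing from urgentry are ignored here; they are a separate check.
    """
    fanout: dict[str, set[str]] = defaultdict(set)
    for event_id, sentry_issue in sentry_groups.items():
        urgentry_issue = urgentry_groups.get(event_id)
        if urgentry_issue is not None:
            fanout[sentry_issue].add(urgentry_issue)
    return {issue: targets for issue, targets in fanout.items() if len(targets) > 1}


# Hypothetical sample: one Sentry issue split three ways, one grouped cleanly.
sentry_groups = {"e1": "S-1", "e2": "S-1", "e3": "S-1", "e4": "S-2"}
urgentry_groups = {"e1": "U-1", "e2": "U-2", "e3": "U-3", "e4": "U-9"}
print(find_split_issues(sentry_groups, urgentry_groups))
```

Swapping the two arguments checks the opposite direction: urgentry issues that merge events Sentry keeps apart.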
3. Alert parity
Recreate your on-call-critical alert rules in urgentry before the parallel run starts. Keep a log of every alert that fires in Sentry during the week. At the end of the week, verify each one fired in urgentry as well. Any alert that fired in Sentry but not urgentry is a gap you must close before the cutover.
The most common miss: urgentry’s default alert for “new issue” fires on the first occurrence, then stays silent for subsequent occurrences of the same fingerprint. If your Sentry rule uses an occurrence-count threshold (more than 10 events in 5 minutes), recreate that threshold explicitly in urgentry.
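To check a recreated threshold rule against real data, you can replay the event timestamps for one issue and compute when a "more than 10 events in 5 minutes" rule should have fired. The sketch below is an offline check only; it does not use any urgentry or Sentry API, and the function name and window convention (events exactly five minutes old fall out of the window) are assumptions.

```python
from collections import deque
from datetime import datetime, timedelta


def threshold_breaches(timestamps, limit=10, window=timedelta(minutes=5)):
    """Return the timestamps at which more than `limit` events sat inside the trailing window."""
    recent = deque()
    breaches = []
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop events that have aged out of the trailing window.
        while ts - recent[0] >= window:
            recent.popleft()
        if len(recent) > limit:
            breaches.append(ts)
    return breaches


# Hypothetical burst: 11 events, 10 seconds apart.
start = datetime(2024, 1, 1, 12, 0, 0)
burst = [start + timedelta(seconds=10 * i) for i in range(11)]
print(threshold_breaches(burst))
```

Compare the returned timestamps with the alert history in both systems; a breach with no matching urgentry alert points at a rule that was recreated with the wrong threshold.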
4. Latency at the urgentry host
urgentry runs at about 52 MB resident at 400 events per second on a single core. Measure the actual ingest latency from urgentry’s own stats endpoint during peak traffic hours. If you see p99 ingest latency climb above 200ms, check whether the SQLite WAL is the bottleneck (it usually is at very high write rates) and consider tuning PRAGMA synchronous = NORMAL in the urgentry config.
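If you collect raw latency samples rather than a precomputed percentile, the p99 check is straightforward to do yourself. This is a sketch using the nearest-rank method; how you obtain the samples is left open, and the sample values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical ingest latencies in milliseconds during a peak hour.
latencies_ms = [12, 15, 11, 230, 14, 13, 18, 16, 12, 17]
p99 = percentile(latencies_ms, 99)
print(f"p99 = {p99} ms, over 200 ms budget: {p99 > 200}")
```

With only a handful of samples the p99 is simply the slowest request, so collect enough samples per window before treating a breach as meaningful.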
5. Missing events
At the end of the week, pick five specific error events from Sentry by event ID. Look them up in urgentry. They should all be there. An event that reached Sentry but not urgentry during a dual-DSN run indicates a transport error in the secondary path. Check urgentry’s ingest log for the timestamp and the event’s project DSN.
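The spot check is a set lookup once the IDs are normalized. Sentry event IDs are 32 hexadecimal characters, but tools sometimes display them with dashes or in upper case, so the sketch below normalizes before comparing. The function names and sample IDs are illustrative.

```python
def normalize_event_id(event_id: str) -> str:
    """Strip dashes and whitespace and lower-case, so UUID-style and bare hex forms compare equal."""
    return event_id.strip().replace("-", "").lower()


def missing_from_urgentry(sentry_ids, urgentry_ids):
    """Return the Sentry event IDs that have no counterpart among the urgentry IDs."""
    found = {normalize_event_id(e) for e in urgentry_ids}
    return [e for e in sentry_ids if normalize_event_id(e) not in found]


# Hypothetical IDs for illustration.
picked = ["9ec79c33ec9942ab8353589fcb2e04dc", "FC6D8C0C-43FC-4630-AD85-0A2D2F2B8B1E"]
in_urgentry = ["fc6d8c0c43fc4630ad850a2d2f2b8b1e"]
print(missing_from_urgentry(picked, in_urgentry))
```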
The daily checklist
The parallel run requires a daily check-in. It should take ten minutes. Do not skip days; a two-day gap means you find problems on day seven that started on day two.
- Day 1. Confirm events are arriving in urgentry. Check the first event from each service. Verify the stack trace is readable and the environment tag is correct. Log the event count in both systems at 5 p.m.
- Day 2. Compare event counts from day one. Check fingerprinting on the top five issues. Trigger one of your on-call-critical alert rule conditions manually and confirm urgentry fires it.
- Day 3. Review urgentry’s ingest error log. Look for any per-envelope parsing failures. Check whether any high-frequency issue in Sentry is grouping as multiple smaller issues in urgentry.
- Day 4. Log event counts again. Check whether Sentry is rate-limiting your SDK (look for 429 responses in the SDK debug log). If Sentry is throttling, the count comparison will diverge and the evaluation data becomes unreliable.
- Day 5. Walk the primary triage workflow in urgentry for one real issue that appeared during the week. Assign it, check the stack, look at the breadcrumbs, mark it resolved. Ask one on-call engineer to do the same without guidance.
- Day 6. Review the alert log from the week. List every alert that fired in Sentry. Verify each one is in urgentry’s alert history. Write down any gaps.
- Day 7. Final event count comparison. Spot-check five specific events by event ID across both systems. Write the evaluation summary: ingest parity, fingerprint findings, alert gaps, latency numbers. Present it at the sign-off meeting.
Sign-off criteria and cutover
The parallel run ends when you can write “yes” next to every item below. If an item is “no,” it is either a blocking gap (fix it before the cutover date) or a known gap with a post-cutover owner and a due date.
- Event count delta is below 5% on each of the last three days of the run.
- Top ten issues group consistently across both systems. Fingerprint mismatches, if any, are understood and resolved.
- Every on-call-critical alert rule fired at least once in urgentry during the week. No Sentry alert from the week is missing from urgentry’s history without an explanation.
- urgentry ingest p99 latency stayed below 200ms during peak hours.
- Five specific events, chosen from Sentry by event ID, were found in urgentry.
- At least one on-call engineer walked the triage workflow in urgentry without a guide and found what they needed.
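The criteria above can be written down as a small checklist structure so the sign-off is a recorded result and not a recollection. This is a sketch only: the `RunSummary` record and its field names are hypothetical, and the thresholds are the ones stated in the list.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical record of the week's findings; field names are illustrative."""
    last_three_days_drift_pct: list[float]
    top_ten_grouping_consistent: bool
    critical_rules_all_fired: bool
    unexplained_missing_alerts: int
    peak_p99_ms: float
    spot_check_events_found: int
    unguided_triage_ok: bool


def open_items(s: RunSummary) -> list[str]:
    """Return the sign-off criteria that are still a "no"."""
    checks = {
        "event count delta below 5% on each of the last three days": (
            len(s.last_three_days_drift_pct) == 3
            and all(d < 5.0 for d in s.last_three_days_drift_pct)
        ),
        "top ten issues group consistently": s.top_ten_grouping_consistent,
        "alert parity": s.critical_rules_all_fired and s.unexplained_missing_alerts == 0,
        "p99 ingest latency below 200ms at peak": s.peak_p99_ms < 200.0,
        "five spot-check events found": s.spot_check_events_found >= 5,
        "unguided triage walkthrough succeeded": s.unguided_triage_ok,
    }
    return [name for name, ok in checks.items() if not ok]


summary = RunSummary([1.2, 0.8, 1.9], True, True, 0, 250.0, 5, True)
print(open_items(summary))
```

Each returned item is either a blocking gap or needs an owner and a due date, as described above.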
The cutover step
Once sign-off is written down and reviewed, the cutover is one step: remove the secondary transport wrapper and replace the SENTRY_DSN environment variable value with the urgentry DSN. Restart the services. Verify the first event in urgentry. Disable the Sentry project (do not delete it; historical events may be useful for reference).
There is no data migration step. urgentry starts from the first event it receives as the primary backend. Sentry’s historical data stays in Sentry until you decide to delete it. The cutover itself takes about as long as a deployment.
Common pitfalls
Duplicate alerts from both systems
During the parallel run, both Sentry and urgentry will fire alerts for the same real incidents. If your on-call rotation routes through PagerDuty or Opsgenie, you will receive two pages for each event unless you gate urgentry’s alerts differently. The simplest fix during the evaluation window is to send urgentry alerts to a dedicated Slack channel, not to the on-call integration. This lets you verify that alerts fire without doubling the page volume. Do not wire urgentry into the production on-call pipeline until after sign-off.
Billing impact on the Sentry side
Running a dual-DSN setup does not send events to Sentry twice; Sentry simply keeps receiving your full production volume, exactly as before. Nothing changes on the Sentry billing side: you were already paying for those events, and you are still paying for them during the evaluation. What changes is that you are now also running urgentry, which has its own hosting cost (typically a VPS or your existing infrastructure). The parallel run should not increase your Sentry bill unless you add new services to the dual-DSN configuration that were not previously instrumented.
The risk is rate limiting, not billing. If your Sentry plan has a monthly event cap and you are near it when the parallel run starts, the additional events from any newly-instrumented services could push you over the cap mid-month. Check your Sentry quota dashboard before starting the run and calculate whether the extra volume fits.
Rate limits skewing the count comparison
If Sentry rate-limits your SDK during the evaluation window, Sentry will drop events and urgentry will appear to have higher event counts. This is not urgentry losing events; it is Sentry throttling them. Enable SDK debug logging and watch for HTTP 429 responses to sentry.io. If you see them, the ingest count comparison is invalid for those periods and you need to adjust the measurement window.
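Adjusting the measurement window can be as simple as dropping the throttled periods from both sides before computing drift. The sketch below assumes you have hourly counts from each system and a set of hours in which the SDK debug log showed 429 responses; all names and numbers are hypothetical.

```python
def drift_outside_throttling(sentry_hourly, urgentry_hourly, throttled_hours):
    """Percentage drift over hours present in both systems and not rate-limited by Sentry.

    Returns None when no usable Sentry volume remains to compare against.
    """
    usable = [
        hour for hour in sentry_hourly
        if hour in urgentry_hourly and hour not in throttled_hours
    ]
    sentry_total = sum(sentry_hourly[h] for h in usable)
    urgentry_total = sum(urgentry_hourly[h] for h in usable)
    if sentry_total == 0:
        return None
    return abs(sentry_total - urgentry_total) / sentry_total * 100.0


# Hypothetical counts: Sentry throttled during hour "10".
sentry_hourly = {"09": 1000, "10": 400, "11": 1000}
urgentry_hourly = {"09": 1000, "10": 1000, "11": 990}
print(drift_outside_throttling(sentry_hourly, urgentry_hourly, {"10"}))
```

Including hour "10" in this sample would show a large drift that says nothing about urgentry, which is why the throttled periods are excluded.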
urgentry does not have a hosted plan quota. On your own hardware, the only rate limit is what your server can handle. At the typical mid-size service traffic level, urgentry’s single-binary architecture handles several hundred events per second without saturation.
Fingerprint drift on custom grouping rules
Teams that have maintained a Sentry project for more than a year often have fingerprinting rules they forgot they created. Those rules do not migrate automatically. During the parallel run, look for issues in urgentry that have far more event occurrences than expected, or issues in Sentry that have no counterpart in urgentry. Both patterns indicate a fingerprinting rule gap. Export your Sentry fingerprinting rules from project settings before the run starts and recreate them in urgentry.
FAQ
How long should the parallel run last?
One week of production traffic is the minimum that surfaces meaningful differences. A long-lived service with low error volume may need two weeks before the fingerprint and alert data is statistically interesting. Do not run longer than two weeks; the carry cost of maintaining two active ingest paths accumulates.
Will running both DSNs double my Sentry bill?
Not in the way that matters. Sentry is still receiving the same events it was receiving before the run started. Your bill does not increase unless you add newly-instrumented services to the dual-DSN configuration. The risk is rate limiting if you are near a monthly event cap, not a higher invoice.
Can I use the OTel collector fanout pattern if I am not using OpenTelemetry today?
The OTel collector can accept Sentry SDK traffic if you route it through urgentry’s OTLP ingest. In practice, the dual-DSN pattern is simpler for teams not already running a collector. Only adopt the fanout architecture if you already have a collector in your stack or plan to adopt OTel regardless.
What counts as a “passing” parallel run?
The six sign-off criteria in this guide all need to be green. At minimum: ingest count parity within 5%, no unresolved fingerprint drift on your top ten issues, all on-call-critical alert rules firing in urgentry, and latency within acceptable bounds. If those hold, the run passes.
Does urgentry deduplicate events if I accidentally send the same event twice?
urgentry deduplicates based on the event ID in the envelope header, the same mechanism Sentry uses. SDK-generated event IDs are UUIDs unique per event, so dual-DSN delivery does not create duplicates inside a single urgentry project. Each system receives and stores its own copy independently.
Sources
- Sentry SDK transport documentation: the transport hook surface for JavaScript, including the makeNodeTransport factory used in the dual-DSN implementation above.
- OTel Collector configuration reference: pipeline, exporter, and retry configuration used in the collector fanout pattern.
- urgentry quickstart: install paths, first-run flow, DSN creation, and the OTLP endpoint reference.
- urgentry SDK ingest documentation: envelope and store endpoint definitions, including the stats API used for ingest count comparison.
- Sentry project stats API: the event count endpoint schema that urgentry’s stats API mirrors.
Ready to start the parallel run?
Stand up urgentry alongside your existing Sentry instance in under ten minutes and add the dual-DSN transport wrapper to one service. Run for a week. Compare the numbers. The binary is a single download; the parallel run is reversible at any point before the cutover.