Release health: deploys, sessions, and crash-free users.
Release health answers one question: did this deploy make things worse? Total error count cannot answer that question on its own. Session-based release health can. This guide covers what release health measures, how the formulas work, where session events come from, and how urgentry surfaces the data.
20 seconds. Release health tracks crash-free sessions and crash-free users per release. A session is a single continuous use of the application. It ends healthy, with a handled error, or with a crash. The crash-free rate is 1 - (crashed / total). 99.5% is the common consumer mobile bar. Below it, page someone.
60 seconds. Total error count goes up when traffic goes up, so it cannot tell you whether a deploy regressed quality. Crash-free rate normalizes for traffic. Two releases with different traffic volumes are directly comparable as percentages. The sessions signal comes from SDK auto-instrumentation: mobile SDKs track the app lifecycle (foreground, background, terminate), web SDKs track the page session, and server SDKs track the request session. You tag each session with a release name at SDK initialization. urgentry receives those session payloads via the Sentry-compatible sessions endpoint, computes the crash-free statistics, and shows a side-by-side release comparison in the dashboard.
Where urgentry sits today. urgentry ingests session payloads, computes crash-free-sessions and crash-free-users, and surfaces release-vs-release comparison. The gaps are in the release adoption curve display and the threshold-based alerting on crash-free rate, both of which are on the roadmap. The session endpoint is part of the 218-operation compatibility layer; no SDK changes are needed to start collecting release health data from an existing Sentry SDK integration.
What release health is actually measuring
A production service generates errors all the time. Some come from user mistakes. Some come from network conditions. Some come from bugs in a specific release. Total error count reflects all of them indiscriminately. When traffic doubles, error count doubles. That tells you nothing about whether a deploy changed the underlying quality of the application.
Release health shifts the denominator from time to sessions. A session is one continuous user interaction with the application: the app in the foreground on mobile, the page alive in the browser, the request lifecycle on a server SDK. Each session ends in one of three terminal states: healthy (no error, no crash), errored (a handled error occurred), or crashed (the application terminated unexpectedly). The crash-free rate divides crashed sessions by total sessions and subtracts the result from one.
That rate normalizes for traffic. A release at 100,000 sessions per day with 200 crashes is 99.8% crash-free. A release at 10,000 sessions with 200 crashes is 98% crash-free. Total error count shows the same number for both. Crash-free rate shows they are different situations. That is the core measurement difference.
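The arithmetic in the paragraph above can be checked with a few lines of JavaScript. This is a minimal sketch: the function name crashFreeRate is illustrative and is not part of urgentry or any SDK.

```javascript
// Crash-free rate as a fraction in [0, 1].
// Returns null when there are no sessions, so callers do not divide by zero.
function crashFreeRate(crashed, total) {
  if (total <= 0) return null;
  return 1 - crashed / total;
}

// Same crash count, different traffic (the figures used in the text).
const highTraffic = crashFreeRate(200, 100000);
const lowTraffic = crashFreeRate(200, 10000);

console.log((highTraffic * 100).toFixed(1) + "%"); // 99.8%
console.log((lowTraffic * 100).toFixed(1) + "%"); // 98.0%
```

A raw crash count of 200 is identical in both cases; only the rate separates them.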
Release health also introduces two distinct user-facing metrics that error count cannot produce: crash-free sessions and crash-free users. They diverge in ways that matter, which is why most dashboards show both.
The four signals every release-health system tracks
Every release health system, regardless of vendor, reduces to four counts: sessions started, plus the three terminal states a session can end in. The Sentry sessions API formalizes them explicitly, and urgentry ingests them in the same format:
- Sessions started. The raw count of sessions opened in a release. The denominator in every crash-free formula. A session starts when the SDK detects the application entering the active state: app foregrounded on mobile, page loaded in the browser, first request in server mode.
- Sessions ended healthy. Sessions that closed without any error or crash. The user completed their interaction, the page unloaded, or the request returned a response, and nothing went wrong. These sessions add to the denominator and to neither crashed numerator.
- Sessions ended with a handled error. Sessions where a Sentry-captured exception occurred but the application continued running. The SDK sends status errored for these. The application did not crash. The session did not end unexpectedly. This status matters for error-free session metrics, but it is not a crash.
- Sessions ended with a crash. Sessions where the application terminated unexpectedly. Unhandled exceptions, OOM kills, signal terminations, ANRs on Android. The SDK sends status crashed. This is the numerator in the crash-free formulas.
The distinction between errored and crashed is the boundary that separates error rate from crash rate. Conflating them inflates the apparent severity of a release. A release with 2% errored sessions and 0.1% crashed sessions looks very different on a release health dashboard than it does on an error count graph.
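The four counts can be tallied from a list of session records as sketched below. The record shape and the function name tallySessions are assumptions made for illustration. The status strings errored and crashed come from the text; treating "exited" as the healthy terminal status is an assumption about the Sentry session protocol.

```javascript
// Tally the four release-health counts from session records.
// Assumed record shape: { status: "exited" | "errored" | "crashed" }.
function tallySessions(sessions) {
  const counts = { started: 0, healthy: 0, errored: 0, crashed: 0, other: 0 };
  for (const s of sessions) {
    counts.started += 1;
    if (s.status === "exited") counts.healthy += 1;
    else if (s.status === "errored") counts.errored += 1;
    else if (s.status === "crashed") counts.crashed += 1;
    else counts.other += 1; // unknown or non-terminal status
  }
  return counts;
}

// 1,000 sessions: 1 crashed (0.1%), 20 errored (2%), the rest healthy.
const sample = Array.from({ length: 1000 }, (_, i) => ({
  status: i < 1 ? "crashed" : i < 21 ? "errored" : "exited",
}));

const tally = tallySessions(sample);
const crashFree = 1 - tally.crashed / tally.started;
const errorFree = 1 - (tally.crashed + tally.errored) / tally.started;
console.log(tally, crashFree, errorFree);
```

Keeping errored and crashed in separate buckets is what lets the two rates be reported independently.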
How the math works
Two formulas, both straightforward:
// Crash-free sessions
crashFreeSessions = 1 - (crashedSessions / totalSessions);
// Crash-free users
crashFreeUsers = 1 - (usersWithCrashedSession / totalUsers);
The formulas are identical in structure. The difference is the unit: sessions versus users.
Crash-free sessions counts each session independently. Ten sessions from the same user that all crash contribute ten to the crashed count and ten to the total. A power user who opens the app repeatedly and crashes each time pulls down crash-free sessions substantially.
Crash-free users counts each distinct user identifier independently. Those same ten sessions from one user contribute one to the crashed-users count and one to the total-users count. If that user is your most engaged user, the impact on crash-free users is minimal even if crash-free sessions looks alarming.
The two metrics diverge when crash frequency is not uniformly distributed. When crashes concentrate on specific users (a device model, an OS version, a user cohort with a specific feature flag), crash-free users shows how few people are affected while crash-free sessions magnifies the drop. When crashes spread uniformly across occasional users, both metrics track closely.
A practical read on divergence: if crash-free sessions drops sharply but crash-free users holds steady, a small number of users experienced many crashes. Look for a specific device or configuration. If both drop together, the crash affects users broadly and the release regression is real.
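The floor-vs-ceiling reading: when crashes concentrate on a few heavy users, crash-free sessions acts as the floor (it counts every crash event) and crash-free users acts as the ceiling (it counts each affected user once, no matter how many times they crashed). A regression that looks modest on crash-free users can be severe on crash-free sessions if the affected users are frequent openers. The ordering is not guaranteed: if every user crashes once across ten sessions, crash-free sessions sits at 90% while crash-free users falls to 0%. Read both before drawing a conclusion.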
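The divergence between the two metrics is easy to reproduce. The sketch below computes both from the same session list; the record shape ({ userId, status }) and the function name crashFreeMetrics are illustrative assumptions, not urgentry's internal representation.

```javascript
// Compute crash-free sessions and crash-free users from the same records.
// Assumed record shape: { userId: string, status: string }.
function crashFreeMetrics(sessions) {
  if (sessions.length === 0) return null;
  const users = new Set();
  const crashedUsers = new Set();
  let crashedSessions = 0;
  for (const { userId, status } of sessions) {
    users.add(userId);
    if (status === "crashed") {
      crashedSessions += 1;
      crashedUsers.add(userId); // a user counts once, however often they crash
    }
  }
  return {
    crashFreeSessions: 1 - crashedSessions / sessions.length,
    crashFreeUsers: 1 - crashedUsers.size / users.size,
  };
}

// 99 users with one healthy session each, plus one power user who crashes ten times.
const records = [];
for (let i = 0; i < 99; i++) records.push({ userId: "user-" + i, status: "exited" });
for (let i = 0; i < 10; i++) records.push({ userId: "power-user", status: "crashed" });

const metrics = crashFreeMetrics(records);
console.log(metrics.crashFreeSessions.toFixed(3)); // 0.908
console.log(metrics.crashFreeUsers.toFixed(3)); // 0.990
```

One user out of 100 is affected, yet that user accounts for 10 of 109 sessions, which is why the session metric drops much further.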
Where session events come from
Session events originate from SDK auto-instrumentation. The instrumentation strategy differs by platform, and the differences matter when reading release health data.
Mobile SDKs (iOS, Android). The mobile SDK hooks into the app lifecycle. When the application enters the foreground, the SDK opens a session. When the application goes to the background and does not return within a configurable timeout (typically 30 seconds), the SDK closes the session. If the application crashes while foregrounded, the SDK detects the crash on the next launch and sends a crashed session for the previous session. This means crash sessions on mobile arrive delayed: the crash is reported on the following launch, not at the moment of the crash.
Web SDKs (JavaScript browser). The browser SDK tracks page sessions. A session starts on page load. It ends when the page unloads or the user is inactive beyond the idle timeout. The SDK sends session updates periodically and a terminal update on unload. Browser crash detection is limited: a tab kill or an OOM crash that prevents the unload event from firing may result in a session that never receives a terminal status, which the backend eventually times out as abandoned rather than crashed.
Server SDKs (Python, Go, Java, Ruby, etc.). Server SDKs operate in request session mode by default. Each incoming request opens a session and closes it when the response returns. A 500 that the application catches and handles sends errored. An unhandled exception that terminates the process sends crashed. In request mode, session counts track request volume, not user sessions. The user_id in server mode requires explicit SDK configuration.
The session_mode SDK setting controls this behavior. Set session_mode: "application" for long-lived server processes where you want to track the process lifecycle rather than individual requests. Set session_mode: "request" (the default) for per-request session semantics. Mixing modes within a platform produces session counts that are not comparable across releases.
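The mobile lifecycle rule described above (a background period shorter than the timeout resumes the session, a longer one starts a new session) can be sketched as a small counter. This is an illustration of the rule only, not SDK code; the event shape and the function name countSessions are assumptions, and the 30-second default mirrors the typical value mentioned in the text.

```javascript
// Count sessions from chronologically ordered lifecycle events.
// Assumed event shape: { type: "foreground" | "background", at: milliseconds }.
function countSessions(events, timeoutMs = 30000) {
  let sessions = 0;
  let open = false; // has any session been opened yet?
  let lastBackgroundAt = null; // null while the app is in the foreground
  for (const e of events) {
    if (e.type === "foreground") {
      const resumed =
        open && (lastBackgroundAt === null || e.at - lastBackgroundAt <= timeoutMs);
      if (!resumed) sessions += 1; // timeout elapsed, or first launch
      open = true;
      lastBackgroundAt = null;
    } else if (e.type === "background") {
      lastBackgroundAt = e.at;
    }
  }
  return sessions;
}

const lifecycle = [
  { type: "foreground", at: 0 },
  { type: "background", at: 10000 },
  { type: "foreground", at: 25000 }, // 15 s in background: same session
  { type: "background", at: 40000 },
  { type: "foreground", at: 100000 }, // 60 s in background: new session
];
console.log(countSessions(lifecycle)); // 2
```

Changing the timeout changes the session count for identical user behavior, which is one reason session totals are only comparable across releases that share SDK settings.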
Tagging events with a release
The SDK reads the release identifier at initialization time. Every session and error event carries that identifier in the envelope. Without a release tag, session data lands in urgentry but does not associate with any release, and the dashboard has nothing to compare.
Three common release name conventions, each with trade-offs:
- Semver string (1.4.2, 2.0.0-rc.1). Human-readable. Maps to a release artifact. Works well for mobile apps where the version string is already defined. The risk: if you hotfix in place without bumping the version, two different code states share one release name.
- Git SHA prefix (a3f1b9c, a3f1b9c-prod). Unique per commit. Maps directly to source control. Works well for continuous-deploy web services where a version number is not meaningful. The risk: not human-readable without a lookup.
- Build number (2026052201, a date-based build stamp). Monotonically increasing. Useful for CI environments that do not embed git context easily. The risk: no semantic meaning; two builds from the same commit have different names.
Source the release from CI, not from a hardcoded string in the application. In a GitHub Actions workflow:
- name: Set release env
  run: echo "RELEASE_NAME=$" >> $GITHUB_ENV
- name: Build
  env:
    VITE_SENTRY_RELEASE: $
  run: npm run build
In the SDK initialization:
Sentry.init({
  dsn: "https://key@urgentry.example.com/1",
  release: import.meta.env.VITE_SENTRY_RELEASE,
  environment: "production",
});
The "always-latest" antipattern: pointing release at a string like "latest" or "production" that never changes makes rollback invisible. urgentry cannot distinguish sessions from the pre-rollback and post-rollback code. All crash data merges into one release. The crash-free chart becomes a flat line that tells you nothing about which deploy caused a regression.
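One way to keep a static release string out of production is to fail the build when it appears. The sketch below is a hypothetical guard: the function name resolveRelease and the list of rejected names are assumptions, and RELEASE_NAME is the environment variable used in the workflow above.

```javascript
// Names that never change between deploys; rejected as release identifiers.
// The list is illustrative and should be adapted to the project's conventions.
const STATIC_NAMES = new Set(["latest", "production", "prod", "main", "master"]);

// Read the release name from an env-like object and reject unusable values.
function resolveRelease(env) {
  const release = (env.RELEASE_NAME || "").trim();
  if (release === "") {
    throw new Error("RELEASE_NAME is not set; sessions would land untagged");
  }
  if (STATIC_NAMES.has(release.toLowerCase())) {
    throw new Error('"' + release + '" is static; rollbacks would be invisible');
  }
  return release;
}

// In a build script this would typically receive process.env.
const release = resolveRelease({ RELEASE_NAME: "a3f1b9c" });
console.log(release); // a3f1b9c
```

Failing early in CI is a design choice: an untagged or statically tagged deploy cannot be fixed retroactively, so it is cheaper to block the build.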
Adopting a release
Adoption tracks what fraction of your active session volume is running a given release. urgentry computes this as sessions on release N divided by total active sessions in the same time window.
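The adoption ratio described above can be sketched as follows. The input shape (a map from release name to session count within one time window) and the function name adoptionByRelease are assumptions for illustration; this is not urgentry's query.

```javascript
// Adoption per release: sessions on that release divided by all sessions
// observed in the same time window.
function adoptionByRelease(sessionsByRelease) {
  const total = Object.values(sessionsByRelease).reduce((sum, n) => sum + n, 0);
  const adoption = {};
  for (const [release, count] of Object.entries(sessionsByRelease)) {
    adoption[release] = total === 0 ? 0 : count / total;
  }
  return adoption;
}

// Early in a staged rollout: the new release carries 5% of session volume.
const adoption = adoptionByRelease({ "1.4.1": 9500, "1.4.2": 500 });
console.log(adoption);
```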
A release becomes the active version when a meaningful fraction of sessions run it. "Meaningful" depends on your deploy strategy. A big-bang deploy to all servers flips adoption from 0% to ~100% within minutes of the deploy completing. A staged rollout (10% of servers, then 25%, then 100%) shows adoption climbing over hours or days.
Adoption matters for interpreting crash-free rates. A release at 5% adoption that shows 95% crash-free has low statistical confidence: if it only has 200 sessions, the margin of error on that percentage is large. The same release at 100% adoption with 50,000 sessions is a stable reading. Read crash-free percentages alongside session counts, not in isolation.
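To put a number on "low statistical confidence", the sketch below uses the normal approximation to a binomial proportion. This is one common approximation chosen for illustration, not a method the text attributes to urgentry, and it becomes less accurate for rates very close to 100% or for very small samples.

```javascript
// Approximate 95% margin of error for an observed crash-free rate,
// using the normal approximation to a binomial proportion.
function marginOfError(rate, sessions, z = 1.96) {
  return z * Math.sqrt((rate * (1 - rate)) / sessions);
}

const lowVolume = marginOfError(0.95, 200); // roughly 0.030, about 3 points
const highVolume = marginOfError(0.95, 50000); // roughly 0.0019, about 0.2 points

console.log((lowVolume * 100).toFixed(1), (highVolume * 100).toFixed(2));
```

At 200 sessions the reading of 95% is compatible with anything from about 92% to 98%, which is why the session count has to be read alongside the percentage.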
In a staged rollout, a crash-free rate that degrades as adoption increases is the signal you are looking for. A release that looks healthy at 5% adoption and shows degradation at 25% may be hitting a code path that requires a higher load or a broader device distribution to trigger. The regression is real; it was just not visible at low volume.
Reading the regression chart
The release health chart shows crash-free rate per release over time. Reading it requires three reference points: the current release, the previous release, and the baseline for your application.
The baseline is your expected crash-free rate during a period with no deploy activity. For a stable application, this is a relatively flat line. Measure it over the two to four weeks before a major release and treat it as the floor below which you expect not to drop. A new release that drops below the baseline is a regression candidate.
Period-over-period comparison is the primary regression signal. If the previous release ran at 99.8% crash-free and the new release is at 99.2%, that is a 0.6 percentage point regression. At 100,000 sessions per day, a 0.6-point drop means 600 additional crashed sessions per day. Whether that is page-worthy depends on your threshold.
What counts as a regression worth paging: the answer differs by product type, but the common heuristics are a drop of more than 0.5 percentage points in crash-free sessions, sustained over at least 1,000 sessions (to filter statistical noise), with the drop persisting for more than 15 minutes after the release reached meaningful adoption. A single-session anomaly in a low-adoption window is not a regression. A sustained drop at full adoption is.
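The paging heuristic above translates into a short check. The thresholds are the ones given in the text (more than 0.5 points, at least 1,000 sessions, more than 15 minutes); the function name assessRegression and the input shape are hypothetical.

```javascript
// Evaluate a release against the previous one.
// Rates are fractions (0.998 means 99.8%); sessions is the number of sessions
// observed on the new release in the comparison window.
function assessRegression({ previousRate, currentRate, sessions, minutesSustained }) {
  const dropPoints = (previousRate - currentRate) * 100;
  const extraCrashes = Math.round((previousRate - currentRate) * sessions);
  const page = dropPoints > 0.5 && sessions >= 1000 && minutesSustained > 15;
  return { dropPoints, extraCrashes, page };
}

// The example from the text: 99.8% to 99.2% at 100,000 sessions per day.
const verdict = assessRegression({
  previousRate: 0.998,
  currentRate: 0.992,
  sessions: 100000,
  minutesSustained: 30,
});
console.log(verdict.extraCrashes, verdict.page); // 600 true
```

The same drop measured over 200 sessions, or sustained for five minutes, does not page under this rule.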
The chart is also where you read rollback signals. If a rollback produces a distinct release name and urgentry receives sessions tagged with it, the crash-free rate for the rollback release should return to the previous release's baseline. If it does not, the regression was not caused by the release you rolled back.
Setting a sane crash-free threshold
The threshold is the crash-free rate below which you alert. The right number depends on the application type and the user expectation.
Consumer mobile apps at scale (millions of daily active users, App Store distribution) set the bar at 99.5% or higher. Apple's App Store review rejects builds that crash, and Google Play's Android vitals dashboard tracks crash rate as a store-level quality metric. At 99.5%, one in 200 sessions crashes. At 10 million sessions per day, that is 50,000 daily crashes. For a navigation or payments app, even 99.5% is too permissive. Set a stretch target of 99.9% (one crash per 1,000 sessions) for critical-path consumer products.
Internal tools and enterprise software have a different tolerance curve. Users are on managed devices, on specific OS versions, and in controlled environments. Crash-free rates of 99.0% are often acceptable. The user has a support channel and a workaround. The expectation differs from a consumer context.
Server SDKs in request mode should not use the consumer mobile bar. A server process crash is a different event than a mobile app crash. A 500 rate of 0.5% (99.5% success rate) may be normal for a service with upstream dependencies. A crash rate (actual process termination) of 0.01% is a different signal.
The noise floor is the practical lower bound for meaningful alerting. At very low session volumes (fewer than 500 sessions in a measurement window), statistical variation means the crash-free rate can swing several percentage points without any real change in software quality. Set a minimum session count threshold alongside the percentage threshold: alert only when session count exceeds 1,000 and crash-free rate drops below your target. This prevents alert storms during staged rollouts when adoption is still low.
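Pairing the percentage threshold with a minimum session count looks roughly like this. The targets mirror the figures in this section; the names TARGETS and shouldAlert are illustrative, and since the text lists threshold alerting as a roadmap item, this describes logic you would run yourself rather than an existing urgentry feature.

```javascript
// Crash-free targets by product type, taken from the figures in this section.
const TARGETS = { consumer: 0.995, criticalPath: 0.999, internal: 0.99 };

// Alert only when volume exceeds the noise floor AND the rate is below target.
function shouldAlert({ crashed, total, target, minSessions = 1000 }) {
  if (total <= minSessions) return false; // too few sessions to trust the rate
  return 1 - crashed / total < target;
}

// 5% crash rate, but only 200 sessions: suppressed as noise.
console.log(shouldAlert({ crashed: 10, total: 200, target: TARGETS.consumer })); // false
// 99.4% crash-free over 10,000 sessions: below the consumer bar.
console.log(shouldAlert({ crashed: 60, total: 10000, target: TARGETS.consumer })); // true
// 99.6% crash-free over 10,000 sessions: fine for consumer, not for critical path.
console.log(shouldAlert({ crashed: 40, total: 10000, target: TARGETS.criticalPath })); // true
```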
urgentry’s release-health implementation today
urgentry covers the Sentry sessions endpoint as part of its 218-operation compatibility layer. The SDK sends session envelopes to the same envelope endpoint as errors. urgentry parses the session payloads, tracks the four session signals (sessions started plus the three terminal states), and computes crash-free-sessions and crash-free-users per release. No SDK configuration change is needed if you already use the Sentry SDK pointed at urgentry; the session data flows automatically.
The dashboard exposes a release-vs-release comparison view. Select two releases and urgentry displays crash-free-sessions, crash-free-users, and session counts side by side with the period-over-period delta. The comparison operates on the actual session counts from each release, not on resampled estimates.
The math is transparent: urgentry counts crashed terminal states from session payloads, divides by total sessions for that release, and subtracts from one. Crash-free users applies the same logic at the user_id level, deduplicating crashes per user identifier before computing the rate.
The gaps versus Sentry today: the release adoption curve (the percentage of sessions on each release over time) is present in the data but not yet visualized on a dedicated adoption chart. Threshold-based alerting on crash-free rate (send a notification when crash-free drops below 99.5%) is a roadmap item. The sessions endpoint itself and the release-vs-release comparison are available now.
Tying release health into deploys
The Sentry deploys API lets you mark when a release transitions from built to deployed. urgentry covers this endpoint. A deploy record associates a release name with a deploy timestamp, an environment, and an optional URL. When a deploy marker lands in urgentry, it appears as an annotation on the release health chart at the moment the deploy was recorded.
The marker serves as the before/after boundary for reading the chart. Without it, you are looking at a crash-free time series and guessing where each deploy occurred. With it, the deploy moment is an explicit vertical line on the chart. A drop in crash-free rate that begins at the marker is a deploy regression. A drop that predates the marker is a pre-existing condition.
Record a deploy from your CI pipeline immediately after the deploy completes, before you start watching for regressions:
curl -X POST "https://urgentry.example.com/api/0/organizations/my-org/releases/${RELEASE_NAME}/deploys/" \
  -H "Authorization: Bearer ${URGENTRY_AUTH_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "environment": "production",
    "name": "deploy-'$(date +%Y%m%d%H%M%S)'",
    "url": "https://github.com/my-org/my-repo/commit/'"${RELEASE_NAME}"'"
  }'
In a GitHub Actions workflow, add this as the final step after a successful deploy:
- name: Record deploy in urgentry
  if: success()
  run: |
    curl -X POST \
      "$/api/0/organizations/$/releases/$/deploys/" \
      -H "Authorization: Bearer $" \
      -H "Content-Type: application/json" \
      -d '{"environment":"production","name":"$"}'
The release name in the deploy record must exactly match the release name the SDK sends. If the SDK tags sessions with the git SHA and the deploy record uses a different identifier, the marker does not associate with the right release.
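For pipelines scripted in JavaScript rather than shell, the same request can be assembled as below. This is a sketch: the URL path and JSON fields follow the curl example above, while the function name buildDeployRequest and the deploy name format are assumptions. The function only builds the request so that it can be inspected; sending it is left to the caller.

```javascript
// Build the deploy-record request for a release without sending it.
// The same release string is used in the URL path and should be the exact
// value passed to the SDK's release option.
function buildDeployRequest({ baseUrl, org, release, token, environment }) {
  const url =
    baseUrl +
    "/api/0/organizations/" + encodeURIComponent(org) +
    "/releases/" + encodeURIComponent(release) +
    "/deploys/";
  return {
    url,
    options: {
      method: "POST",
      headers: {
        Authorization: "Bearer " + token,
        "Content-Type": "application/json",
      },
      // "deploy-<release>" is a hypothetical naming scheme for the marker.
      body: JSON.stringify({ environment, name: "deploy-" + release }),
    },
  };
}

const request = buildDeployRequest({
  baseUrl: "https://urgentry.example.com",
  org: "my-org",
  release: "a3f1b9c",
  token: "example-token",
  environment: "production",
});
console.log(request.url);
// To send it: await fetch(request.url, request.options);
```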
The three release-health antipatterns
1. Forgetting to tag releases
The most common gap. The SDK runs. Sessions arrive. The crash-free rate is visible for the "no release" bucket. But no release name means no release comparison. The chart shows historical aggregate data with no way to attribute regressions to a specific deploy. Fixing this after the fact requires a redeploy with the release field set. All historical session data that arrived without a release identifier stays in the untagged bucket.
Set the release field at SDK initialization time, in every environment, from the start. Audit your SDK initialization code before you look at the release health dashboard for the first time. If the release field is absent or hardcoded to a static string, the dashboard data is not useful.
2. Counting handled errors as crashes
Some SDK configurations treat caught exceptions as fatal. The session ends with crashed status instead of errored when an exception is captured, even if the application continued running normally. This inflates the crash count and drags the crash-free rate down: your application looks less stable than it is because handled errors appear as crashes.
Check your SDK configuration for options that affect the session terminal status. On the JavaScript SDK, the autoSessionTracking option controls this. On mobile SDKs, review whether Sentry.captureException() calls in your catch blocks affect the session lifecycle. The intent of crash-free rate is to track unexpected terminations, not handled error paths. Handled errors belong in the error count, not in the crash numerator.
3. Averaging across platforms
A release that ships across iOS, Android, and a web frontend produces session data from three SDKs with three session lifecycle models. Averaging their crash-free rates into a single number destroys the signal. An iOS crash-free rate of 99.2% and a web crash-free rate of 99.9% average to 99.55%, which tells you nothing. The iOS regression is buried.
Read crash-free rates per platform, per environment. urgentry supports filtering the release comparison view by platform. Use it. Mobile crash-free rates and server crash-free rates operate on different denominators (app sessions versus requests) and different crash semantics (process termination versus unhandled exception in a request). They are not the same measurement dressed up as the same percentage.
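The effect of averaging can be seen with a small worked example. The session counts below are invented for illustration; only the 99.2% and 99.9% rates come from the text, and the function name perPlatformRates is hypothetical.

```javascript
// Crash-free rate per platform from per-platform counts.
// Assumed row shape: { platform, crashed, total }.
function perPlatformRates(rows) {
  const rates = {};
  for (const { platform, crashed, total } of rows) {
    rates[platform] = 1 - crashed / total;
  }
  return rates;
}

const rows = [
  { platform: "ios", crashed: 80, total: 10000 }, // 99.2% crash-free
  { platform: "web", crashed: 90, total: 90000 }, // 99.9% crash-free
];

const rates = perPlatformRates(rows);
// Unweighted mean of the two percentages.
const naiveAverage = (rates.ios + rates.web) / 2;
// Pooled rate over all sessions, dominated by the higher-volume platform.
const totalCrashed = rows.reduce((sum, r) => sum + r.crashed, 0);
const totalSessions = rows.reduce((sum, r) => sum + r.total, 0);
const pooled = 1 - totalCrashed / totalSessions;

console.log(rates, naiveAverage, pooled);
```

Neither combined figure (about 99.55% unweighted, about 99.83% pooled) reveals that iOS alone is at 99.2%, which is the argument for filtering by platform first.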
Frequently asked questions
Does urgentry support release health?
Yes. urgentry ingests session payloads via the Sentry-compatible sessions endpoint, computes crash-free-sessions and crash-free-users from them, and shows a release-vs-release comparison view in the dashboard. The formula is 1 - (crashed / total). The gaps versus Sentry are in the release adoption curve visualization and threshold-based alerting on crash-free rate, both on the roadmap.
What counts as a crashed session?
A session whose terminal status is crashed. The Sentry SDK sends this status when the application terminates unexpectedly: an unhandled exception on mobile, an OOM kill, a signal, an ANR on Android. A session that ends with a handled error but does not crash the process sends errored, not crashed. This distinction separates crash-free-sessions from error-free-sessions. Only crashed sessions affect the crash-free formulas.
How do crash-free-sessions and crash-free-users differ?
Crash-free-sessions counts each session independently. Ten sessions from the same user that all crash count as ten crashes. Crash-free-users counts each distinct user_id once: those same ten sessions from one user count as one affected user. When crashes concentrate on power users who open the app frequently, crash-free-sessions drops faster than crash-free-users. Read both; they tell different stories about distribution and severity.
What release name format should I use?
Use a format that is unique per deploy and reproducible from CI: a semver string (1.4.2), a git SHA prefix (a3f1b9c), or a monotonic build number. The most important property is that rollbacks produce a distinct release name. If a rollback re-uses the previous release name, urgentry cannot distinguish pre-rollback from post-rollback session data, and the crash-free chart loses the ability to confirm the rollback fixed anything.
My crash-free rate looks wrong. What do I check first?
Check three things in order: that the SDK's release field is set correctly (look at a raw event in urgentry to confirm the release value); that session_mode matches your intended lifecycle (application vs. request); and that you are not reading a cross-platform aggregate. iOS and Android sessions have different crash semantics than web or server sessions. Filter to one platform before concluding the number is wrong.
Sources
- Sentry release health documentation — covers the session lifecycle, the four terminal states, crash-free formulas, and the release adoption curve. The canonical reference for how the Sentry sessions model was designed.
- Sentry sessions API specification — the envelope format for session payloads, the status field values, and the update-vs-terminal event distinction that SDK authors implement to send session data.
- urgentry compatibility matrix — the full list of 218 covered Sentry REST API operations, including the sessions endpoint and the deploys endpoint, with status and known gaps versus the hosted Sentry product.
- FSL-1.1-Apache-2.0 license text — the source-available license under which urgentry is distributed. Converts to Apache 2.0 after two years. Permits self-hosting, modification, and internal use without restriction.
- OpenTelemetry session semantic conventions — the OTel specification for session attributes, including session.id and the session lifecycle model, which informs how session context propagates alongside OTLP signals.
- Apple crash reporting documentation — how iOS crash reports are generated, when they are delivered to the SDK on next launch, and why mobile crash sessions arrive delayed relative to the crash event itself.
Release health without the SaaS bill.
urgentry ingests session payloads from your existing Sentry SDK, computes crash-free-sessions and crash-free-users, and shows release-vs-release comparison in the dashboard. One Go binary at 52 MB resident, SQLite by default, 218 Sentry API operations covered. No SDK changes required.