Guide Cost ~9 min read Updated April 14, 2026

Sentry cut spans from 10M to 5M in Aug 2025: what to do.

In August 2025, Sentry halved the included span quota on Team and Business plans. If your bill went up and your error volume stayed flat, this is why. This guide covers what changed, which teams got hit hardest, and four concrete levers you can pull today.

TL;DR

20 seconds. Sentry cut the included span quota from 10M to 5M per month on Team and Business plans in August 2025. If you were sending between 5M and 10M spans per month, you went from zero overage to paying for every span above 5M. If you were already above 10M, your billable overage grew by a flat 5M spans per month, because the spans between 5M and 10M are no longer included.

60 seconds. The change hit teams running Sentry Performance Monitoring at scale: distributed tracing across multiple services, auto-instrumented frameworks, and broad span coverage. You have four levers: cut span volume with sampling, upgrade to a higher plan, move traces to a different backend while keeping errors on Sentry, or self-host. None of these is universally correct. The right one depends on your span volume, your ops capacity, and how much of your debugging workflow depends on traces versus errors.

This guide walks through each option with numbers, so you can make the decision based on your specific profile rather than on which option sounds cheapest in the abstract.

What changed in August 2025

Sentry announced in the months before August 2025 that it would reduce the included span quota on Team and Business plans. The change took effect in August 2025. The old included quota was 10 million spans per month. The new included quota is 5 million spans per month.

Sentry framed the change as a pricing normalization. The practical effect on any team already using Performance Monitoring was a direct increase in their bill for the same usage. A team sending 8 million spans per month went from zero span overage to paying for 3 million overage spans. A team sending 15 million spans per month already paid for 5 million overage spans under the old quota; now they pay for 10 million.

The math on the effective per-span cost is straightforward. Before August 2025, a team sending 10M spans paid zero overage; 10M spans cost whatever their base plan cost. After August 2025, a team sending 10M spans pays overage on 5M spans at approximately $0.0000088 per span, which adds roughly $44/month to their bill. That is a $528/year increase for the same span volume, with no change to any code.
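The arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the approximate $0.0000088 per-span rate quoted in this guide; the function and constant names are made up for the example, and the rate should be checked against the current pricing page.

```python
# Sketch of the overage arithmetic. The rate is the approximate figure
# quoted in this guide, not an official price.
OVERAGE_RATE_USD = 0.0000088  # approximate USD per span above the quota
OLD_QUOTA = 10_000_000        # included spans before August 2025
NEW_QUOTA = 5_000_000         # included spans after August 2025


def monthly_span_overage(spans_sent: int, included: int,
                         rate: float = OVERAGE_RATE_USD) -> float:
    """Return the monthly overage charge in USD for a given span volume."""
    billable = max(0, spans_sent - included)
    return billable * rate


# The three volumes used as examples in this section.
for volume in (8_000_000, 10_000_000, 15_000_000):
    before = monthly_span_overage(volume, OLD_QUOTA)
    after = monthly_span_overage(volume, NEW_QUOTA)
    print(f"{volume:>11,} spans: ${before:6.2f} -> ${after:6.2f} per month "
          f"(+${(after - before) * 12:,.0f}/yr)")
```

Swap in your own monthly volume to see where you land before reading further.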

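For teams in the 5M to 10M range, the change introduced span overage where there had been none. For teams above 10M, the increase is a flat 5M extra overage spans (roughly $44/month at the rate above): the same dollar amount at any volume, and a proportionally smaller share of the bill as volume grows.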

Monthly span volume | Overage before Aug 2025 | Overage after Aug 2025 | Bill increase (approx.)
3M | $0 | $0 | $0
6M | $0 | ~$9/mo | +$9/mo (+$108/yr)
10M | $0 | ~$44/mo | +$44/mo (+$528/yr)
20M | ~$88/mo | ~$132/mo | +$44/mo (+$528/yr)
50M | ~$352/mo | ~$396/mo | +$44/mo (+$528/yr)

The teams hit hardest in relative terms are those in the 5M to 30M span range. Below 5M, the change had no effect. At 10M and above, the added cost is the same roughly $44/month, so far above 10M the relative increase is smaller because overage was already the dominant line item.

Who got hit hardest

The change affected teams running Sentry Performance Monitoring at any meaningful scale. Three factors determine whether your team landed in the impact zone.

The first factor is your traces_sample_rate. If you run at 1.0 (all transactions traced), your span volume scales directly with your request rate. A service handling 500 requests per second with an average of 30 spans per transaction generates roughly 38.9 billion spans in a 30-day month. At 0.1, the same service generates roughly 3.9 billion spans per month. Teams that had not tuned this setting were the most exposed.
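To estimate your own exposure, the same multiplication can be written out as follows. This is a rough sketch that assumes a 30-day month and a constant request rate; the function name is illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def monthly_spans(requests_per_second: float,
                  spans_per_transaction: float,
                  sample_rate: float) -> float:
    """Estimate spans sent per month for one service at a steady request rate."""
    return (requests_per_second * spans_per_transaction
            * sample_rate * SECONDS_PER_MONTH)


# The example from the text: 500 req/s, 30 spans per transaction.
for rate in (1.0, 0.1):
    print(f"traces_sample_rate={rate}: "
          f"{monthly_spans(500, 30, rate) / 1e9:.2f} billion spans/month")
```

Real traffic is rarely flat, so treat the result as an order-of-magnitude check against the numbers in your usage page.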

The second factor is auto-instrumentation breadth. The Sentry SDK auto-instruments HTTP clients, database drivers, cache clients, and framework internals. A Django app with the full integration suite active generates more spans per request than one with only the core SDK. A request that hits Postgres, Redis, and three downstream HTTP services might generate 20 spans on a minimally instrumented app and 80 on a fully auto-instrumented one.

The third factor is service count. Distributed tracing captures spans across every service in a request path. A monolith traces one service per transaction. A microservices architecture with ten services in a request path generates ten times the spans for the same number of user-facing transactions, assuming full propagation.

The customer segment most likely to feel the August 2025 change: teams that adopted Performance Monitoring when spans were bundled generously into their plan, instrumented broadly because the cost was invisible, and had no span volume alerts configured. For those teams, the August invoice was the first signal that span volume was a meaningful billing variable.

The four levers you can pull

You have four options, each with a different cost structure and a different set of trade-offs. Most teams should evaluate them in order before deciding.

  1. Reduce span volume via sampling. Cut what you send to Sentry without changing your backend. This is the lowest-effort option and works for any team with headroom to reduce coverage.
  2. Tier up to a higher plan. Higher tiers can carry lower per-unit overage rates and larger included quotas on other line items such as errors, and enterprise contracts can include more spans. This works when the overage the upgrade removes exceeds the plan upgrade cost.
  3. Move spans to OTel and a different backend. Keep errors on Sentry, route OTel traces to a separate backend. This splits your observability stack but removes the per-span cost for traces.
  4. Self-host. Run a Sentry-compatible tracker on your own infrastructure. Span volume becomes an infrastructure cost rather than a per-unit cost, which changes the math for high-volume teams.

Lever 1: sampling

Sampling is the fastest lever to pull and the one with the most variance in impact. The core question is whether to sample at the head (decide at the start of a request) or at the tail (decide after the request completes, based on what happened).

Head-based sampling is what traces_sample_rate controls. Set it to 0.1 and you send 10% of transactions with all their spans. The spans from unsampled transactions are dropped client-side before they reach the quota counter. This approach is simple to implement and has a predictable effect on span volume, but it is blunt: it under-samples slow and error-prone transactions at the same rate as healthy fast ones.
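Head-based sampling does not have to be a single global number. The Sentry Python SDK also accepts a traces_sampler callable that returns a rate per transaction. The sketch below assumes the transaction name is already set to the route when the sampling decision is made (this depends on your framework integration), and the route names and rates are hypothetical. Because the decision happens at the start of the request, it cannot know whether the transaction will later error or run slowly.

```python
# Hypothetical per-route rates; the route names are illustrative only.
ROUTE_RATES = {
    "/healthz": 0.0,    # never trace health checks
    "/checkout": 1.0,   # low-volume, high-value: trace everything
}
DEFAULT_RATE = 0.1      # everything else


def traces_sampler(sampling_context: dict) -> float:
    """Return the head-based sample rate for one transaction."""
    # Follow an upstream service's decision so distributed traces stay whole.
    parent_sampled = sampling_context.get("parent_sampled")
    if parent_sampled is not None:
        return 1.0 if parent_sampled else 0.0

    name = sampling_context.get("transaction_context", {}).get("name", "")
    return ROUTE_RATES.get(name, DEFAULT_RATE)


# Wiring it up (requires the sentry-sdk package and a real DSN):
# import sentry_sdk
# sentry_sdk.init(dsn="<your DSN>", traces_sampler=traces_sampler)
```

Keeping the rates in a plain dictionary makes the sampling policy easy to review alongside the span volume it produces.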

Tail-based sampling lets you decide after the fact which traces to keep. Sentry supports dynamic sampling rules that increase the sample rate for transactions with errors, for specific endpoints, or above a latency threshold. A reasonable configuration: set the global traces_sample_rate to 0.05, then add a dynamic rule that forces a 1.0 rate for any transaction with an error or a duration above your p95 threshold. This retains full coverage on the traces most relevant to debugging while cutting volume on healthy high-frequency transactions.

Traffic shape | Recommended global rate | Dynamic rule | Expected span reduction
High-frequency, low-variance (ingest, queue consumers) | 0.01–0.05 | Force 1.0 on error transactions | 90–99%
Mixed API (user-facing endpoints, varied latency) | 0.1 | Force 1.0 on errors and p95 latency breaches | 80–90%
Low-frequency, high-value (checkout, auth) | 0.5–1.0 | Not required; volume is low | 0–50%
Background jobs and cron | 0.01 | Force 1.0 on failures | 95%+

The signal cost of sampling is real. At a global rate of 0.1, you miss 90% of transactions. Finding a slow query that only appears in 1 in 20 requests requires either a high enough sample rate to catch it or a large enough transaction volume that the absolute number of sampled instances is sufficient to detect the pattern. For a service handling 1,000 requests per second, a 0.1 sample rate still gives you 100 sampled transactions per second, which is enough statistical coverage for most latency investigations. For a service handling 2 requests per second, the same rate gives you one sampled transaction every 5 seconds, which leaves large gaps.
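A quick way to sanity-check a candidate rate is to compute how many sampled examples of a given pattern you would actually see. The sketch below uses the figures from the paragraph above; the function names are illustrative, and it assumes the pattern is spread evenly across requests.

```python
def sampled_per_second(requests_per_second: float, sample_rate: float) -> float:
    """Expected number of sampled transactions per second."""
    return requests_per_second * sample_rate


def sampled_hits_per_hour(requests_per_second: float, sample_rate: float,
                          pattern_fraction: float) -> float:
    """Expected sampled transactions per hour that contain the pattern."""
    return sampled_per_second(requests_per_second, sample_rate) * pattern_fraction * 3600


for rps in (1000, 2):
    per_sec = sampled_per_second(rps, 0.1)
    hits = sampled_hits_per_hour(rps, 0.1, 1 / 20)  # slow query in 1 of 20 requests
    print(f"{rps} req/s at 0.1: one sample every {1 / per_sec:.2f}s, "
          f"about {hits:,.0f} sampled slow-query instances per hour")
```

If the hourly count for the pattern you care about is in the single digits, the rate is probably too low for that service.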

Lever 2: tier up

Upgrading from Team to Business, or from Business to an enterprise plan, increases the included span quota and may reduce the per-unit overage rate. The question is whether the plan upgrade cost is less than the overage cost it eliminates.

The Business plan costs approximately $80/month (as of mid-2026) versus approximately $26/month for Team. The delta is $54/month, or $648/year. Upgrading costs less than staying on Team only if it removes more than $54/month of overage, whether through a lower per-unit rate or through larger included quotas on other line items.

The Business plan's included span quota is 5M as of August 2025 (same floor as Team after the change). The difference at the Business plan is the per-unit overage rate after 5M, which may be lower than the Team plan rate, and the overall included volume across other line items (errors, replays). If you have significant error overage in addition to span overage, the Business plan's higher included error quota may reduce total overage even if the span quota is the same.

When tiering up makes sense: your span volume sits between 5M and 15M per month, your error volume exceeds the Team plan included tier, and the combined overage on both line items exceeds $54/month. In that scenario, the Business plan costs less than staying on Team with the overage combination.
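The comparison reduces to one inequality. The sketch below assumes the approximate base prices quoted in this guide; the overage inputs are hypothetical and should come from your own invoice and from the Business plan rates on the pricing page.

```python
TEAM_BASE_USD = 26.0      # approximate monthly base price quoted in this guide
BUSINESS_BASE_USD = 80.0  # approximate monthly base price quoted in this guide


def upgrade_saves_money(team_overage: float, business_overage: float) -> bool:
    """True if Business (base + remaining overage) costs less than Team."""
    team_total = TEAM_BASE_USD + team_overage
    business_total = BUSINESS_BASE_USD + business_overage
    return business_total < team_total


# Hypothetical: $70/month combined overage on Team, $10/month left on Business.
print(upgrade_saves_money(team_overage=70.0, business_overage=10.0))
# Hypothetical: the upgrade removes no overage at all.
print(upgrade_saves_money(team_overage=70.0, business_overage=70.0))
```

The second input is the one teams tend to skip: estimate the overage that remains after the upgrade, not only the overage you pay today.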

When it does not: your span volume exceeds 20M per month and no plan tier meaningfully covers your volume without a custom contract. At 50M spans per month, you are paying overage on 45M spans regardless of which self-serve plan you are on. The right path at that volume is either aggressive sampling or a custom enterprise negotiation.

Lever 3: split errors and traces

You can keep errors on Sentry and route OTel traces to a different backend. This removes the per-span cost for traces while retaining Sentry's issue tracking, alerting, and workflow for errors.

The architecture uses OpenTelemetry's SDK with a dual-exporter setup. Your application sends error events to Sentry via the Sentry SDK (or via the OTel-to-Sentry bridge), and sends trace data to a separate OTel collector that routes to a trace backend. Common backends for this pattern are Grafana Tempo (cheap storage, good Grafana integration), Jaeger (open source, simple to self-host), and urgentry (OTLP-native ingest alongside Sentry-compatible errors, all in one binary).

The main operational challenge with this approach is cross-tool linking. When an error fires in Sentry, you want to be able to navigate from the error to the trace that contains it. That linking requires a shared trace ID visible in both tools. With Sentry and Grafana Tempo, the linking is manual: you copy the trace ID from the Sentry error and paste it into Tempo. Some teams build a custom URL in their Sentry event context that deep-links to the Tempo trace viewer. Jaeger has a similar story.
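The custom deep link can be as small as a string template. The sketch below is one way to build it, assuming a hypothetical trace viewer URL (replace the template with whatever your Tempo or Jaeger frontend accepts) and a trace ID available either as an integer, as the OpenTelemetry Python SDK exposes it, or as a hex string.

```python
from urllib.parse import quote

# Hypothetical URL template: replace with the URL shape your trace viewer uses.
TRACE_URL_TEMPLATE = "https://traces.example.internal/trace/{trace_id}"

HEX_DIGITS = set("0123456789abcdef")


def normalize_trace_id(trace_id) -> str:
    """Return a trace ID as 32 lowercase hex characters (W3C Trace Context form)."""
    if isinstance(trace_id, int):
        return format(trace_id, "032x")
    cleaned = str(trace_id).replace("-", "").lower()
    if len(cleaned) != 32 or not set(cleaned) <= HEX_DIGITS:
        raise ValueError(f"not a 128-bit hex trace ID: {trace_id!r}")
    return cleaned


def trace_link(trace_id) -> str:
    """Build the deep link to attach to an error event."""
    return TRACE_URL_TEMPLATE.format(trace_id=quote(normalize_trace_id(trace_id)))


# Attaching it to Sentry events (requires the sentry-sdk package):
# import sentry_sdk
# sentry_sdk.set_tag("trace_url", trace_link(current_trace_id))
```

Normalizing the ID first avoids the common mismatch where one tool shows the ID with dashes or uppercase and the other does not.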

urgentry handles this without a separate tool: it ingests errors via the Sentry SDK protocol and ingests traces via OTLP in the same process. Errors link to traces automatically because both share the same database and the same trace ID. If you want a single UI rather than two tools, that matters. If you already run Grafana and Tempo for other services, the marginal cost of adding a Sentry-to-Tempo data path is low and the workflow cost of switching tabs is acceptable.

This lever works best for teams where traces are primarily a latency debugging tool and errors are the primary alerting and issue management surface. It works less well for teams who rely on Sentry's Performance Monitoring product features: transaction summary views, endpoint-level p95 trends, and the performance score dashboard all depend on Sentry receiving span data directly.

Lever 4: self-host

Self-hosting removes per-span billing entirely. You pay a fixed infrastructure cost regardless of span volume, and that cost does not increase as your traffic grows.

The Sentry self-hosted option is the first thing most teams consider. The Sentry self-hosted stack requires 24–32 GB of RAM in practice (ClickHouse alone uses 8–12 GB under load), a 20-container Docker Compose orchestration, and routine operator time for upgrades and disk management. A VM adequate for stable production use costs $55–200/month depending on provider, or $660–2,400/year. Add 4–8 hours per quarter of routine maintenance at a $100/hour engineer rate ($1,600–3,200/year), and the annual ops cost lands between roughly $2,260 and $5,600. That number only beats a Sentry SaaS bill above roughly the same range.

urgentry is a Go single binary that runs on much smaller hardware: a $20–40/month VPS with 4–8 GB RAM handles most small and mid-size teams at steady-state ingest. Spans arrive via OTLP. Errors arrive via the Sentry SDK protocol without code changes. Routine maintenance is closer to 30 minutes per month because there are no JVM processes, no Kafka consumers, and no ClickHouse schema migrations to manage. Upgrades are a binary swap.

The infrastructure-only cost for urgentry annualizes to $240–480/year. With ops time at $100/hour and 30 minutes per month ($600/year), the annual TCO lands between roughly $840 and $1,080, depending on hardware choice and before any one-off upgrade costs. That number beats a Sentry SaaS bill above roughly the same range, and grows more attractive as span volume increases because the self-host cost stays flat while the SaaS overage cost grows.

The gaps relative to full Sentry are real. Session replay exists in urgentry with a partial playback experience, not the full product surface. Deep profiling features are narrower. If either of those is central to your workflow, self-hosting with urgentry may not replace the SaaS fully. Check the compatibility matrix before making a decision based on feature parity.

A worked example: 15M spans per month

Call this team Wavefront: a twelve-person Series A startup. Two backend services in Python, one in Go. Frontend with React. traces_sample_rate=1.0 on every backend service before anyone audited it. Monthly span volume: 15 million. Monthly error volume: 180,000. On the Business plan.

The bill before and after August 2025

Line item | Before Aug 2025 | After Aug 2025 | Change
Base plan (Business) | ~$80 | ~$80 | $0
Error overage (180k vs 200k included) | $0 | $0 | $0
Span overage (15M sent; 10M included before, 5M after) | ~$44/mo | ~$88/mo | +$44/mo
Total monthly | ~$124 | ~$168 | +$44/mo (+$528/yr)

Option 1: sample to 5M spans/month

Cut traces_sample_rate from 1.0 to 0.33 on the high-frequency Go service (which generates most of the volume), keep it at 1.0 on the Python API. Add a dynamic rule that forces 1.0 for error transactions on all services. Projected span volume after: roughly 5.5M, just above the included quota, with ~$4/month in marginal overage. Monthly bill: ~$84. Annual saving versus the post-August baseline: ~$1,008.

Signal cost: the Go service loses visibility into 67% of healthy transactions. The team determines this is acceptable because the Go service is a queue consumer with low variance: every job does the same thing, and seeing 33% of them is sufficient to detect latency regressions.

Option 2: stay on Business and absorb the overage

Do nothing. Pay the $168/month. Annual cost: $2,016. If the team expects traffic to grow 2x over the next year, span volume at 1.0 sample rate reaches 30M and the monthly bill grows to ~$300 ($80 base plus ~$220 of overage on 25M spans). That trajectory matters for the decision.

Option 3: split errors and traces

Keep errors on Sentry at the lowest plan tier that covers error volume. Route OTel traces to Grafana Tempo on a $15/month Hetzner instance. Monthly cost: Sentry base plan (~$26 Team, error-only) + Hetzner VPS ($15) = $41/month. Annual cost: $492. The savings relative to the post-August baseline of $2,016 are $1,524/year. Setup time: 4–8 hours to configure the OTel collector, dual exporter, and Tempo. Operational cost: 30 minutes per month for VPS maintenance.

Trade-off: the team loses Sentry's Performance Monitoring product UI. Trace investigation moves to Grafana. Engineers who use the Sentry Performance dashboard daily will feel this.

Option 4: self-host with urgentry

Replace Sentry SaaS with urgentry on a Hetzner CPX31 ($15/month). Errors arrive via the existing Sentry SDKs with one DSN config change. Traces arrive via OTLP from the same OTel instrumentation. Monthly infrastructure cost: $15. Ops time: 30 minutes/month at $100/hour = $50. Annual TCO: ($65/month x 12) + ($100 upgrade) = $880. Saving versus the post-August baseline of $2,016: roughly $1,136/year.
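The TCO figure above is simple enough to recompute with your own inputs. This sketch uses the Wavefront numbers from this section; the function name is illustrative, and the $100 one-off is the upgrade allowance from the paragraph above.

```python
def annual_tco(vps_monthly: float, ops_hours_per_month: float,
               hourly_rate: float, one_off: float = 0.0) -> float:
    """Annual self-hosting cost: infrastructure plus operator time plus one-offs."""
    monthly = vps_monthly + ops_hours_per_month * hourly_rate
    return 12 * monthly + one_off


self_host = annual_tco(vps_monthly=15, ops_hours_per_month=0.5,
                       hourly_rate=100, one_off=100)
saas_baseline = 168 * 12  # post-August monthly bill from the table above

print(f"Self-host TCO: ${self_host:,.0f}/yr")
print(f"SaaS baseline: ${saas_baseline:,.0f}/yr")
print(f"Saving:        ${saas_baseline - self_host:,.0f}/yr")
```

The result is most sensitive to the ops-time input: doubling it to an hour per month adds $600/year and cuts the saving roughly in half.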

Trade-off: the team now manages their own observability infrastructure. If a deploy breaks urgentry's ingest, no errors are captured until the issue is resolved. The team needs monitoring for the monitoring tool itself.

Recommendation by team profile

Profile | Recommended lever | Rationale
Small team, tight budget, high ops capacity | Self-host (urgentry) | Fixed cost, lowest annual TCO, no per-span billing
Small team, tight budget, low ops capacity | Sample harder (Lever 1) | No operational overhead, immediate bill reduction
Mid team, already on Sentry, uses Performance UI | Sample + dynamic rules, then tier up if needed | Preserve Performance Monitoring value, reduce overage first
Mid team, uses OTel already, traces are secondary | Split errors and traces (Lever 3) | Low migration cost, removes span overage entirely
Large team, span volume above 50M, bill above $500/month | Model full TCO for self-host or negotiate enterprise | The savings at this volume justify the modeling work

The decision tree

Start by pulling your current monthly span volume from Settings > Usage & Stats or the stats_v2 API. That number determines which part of the tree applies to you.

Under 5M spans/month: You are within the included quota. The August 2025 change did not directly increase your bill. Your risk is future growth: if your span volume is growing at 20% per month, you will cross 5M within a few months. Set a usage alert at 80% of 5M (4M spans) so you have time to react before overage starts.
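How long "a few months" is depends on where you start. The sketch below projects compound monthly growth against the 5M quota; it assumes a constant growth rate, which real traffic rarely has, and the function name is illustrative.

```python
def months_until_quota(current_spans: float, monthly_growth: float,
                       quota: float = 5_000_000):
    """Months of compound growth until monthly volume reaches the quota.

    Returns 0 if already at or above the quota, None if volume is not growing.
    """
    if current_spans >= quota:
        return 0
    if monthly_growth <= 0:
        return None
    months = 0
    volume = float(current_spans)
    while volume < quota:
        volume *= 1 + monthly_growth
        months += 1
    return months


for start in (2_000_000, 3_000_000, 4_000_000):
    print(f"{start:,} spans/month at 20% growth: "
          f"{months_until_quota(start, 0.20)} months to reach 5M")
```

Run it against the 4M alert threshold as well: that tells you how much reaction time the alert actually buys you.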

5M to 15M spans/month: You are in the zone where sampling makes the most sense as a first move. A well-tuned dynamic sampling configuration can bring you back within the included quota or close to it. Try Lever 1 before anything else. If you cannot sample below 5M without losing investigations you care about, and if your overage bill exceeds $50/month, evaluate Lever 3 (split errors and traces) as the next option.

15M to 50M spans/month: Sampling can still help, but getting from 30M to 5M requires a very low global sample rate that may be unacceptable for your debugging workflow. Evaluate Lever 3 (split traces) or Lever 4 (self-host) at this volume. The break-even for self-hosting with urgentry arrives when the SaaS overage bill alone exceeds the urgentry annual TCO of roughly $880/year, which happens around 15M spans/month for most plan configurations.
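The break-even volume can be derived directly from the two numbers in that paragraph. This sketch assumes the approximate $0.0000088 per-span rate and the roughly $880/year TCO from the worked example; both are estimates, so treat the output as a ballpark.

```python
def break_even_spans(annual_tco: float, included: int = 5_000_000,
                     rate: float = 0.0000088) -> float:
    """Monthly span volume at which SaaS span overage equals the self-host TCO."""
    monthly_budget = annual_tco / 12
    return included + monthly_budget / rate


volume = break_even_spans(880)
print(f"Overage alone matches an $880/yr TCO at about {volume / 1e6:.1f}M spans/month")
```

Under these assumptions the crossover sits a little over 13M spans/month, in the same neighborhood as the figure above; a different overage rate or ops-time estimate moves it accordingly.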

Above 50M spans/month: The overage cost is substantial regardless of plan tier. Model the full TCO for self-hosting honestly, including ops time. If your team already manages production infrastructure and urgentry adds only about 30 minutes/month of ops work on top, the financial case for self-hosting is clear at this volume. If your team has no ops capacity, negotiate an enterprise Sentry contract where span rates are included in the committed spend.

What to do this week

Pull your last three months of span volume from the Sentry stats_v2 API or from Settings > Usage & Stats. Take the average. Compare it to 5 million.
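A small script keeps this check repeatable. The sketch below builds a stats_v2 request URL and averages monthly totals you have already pulled. The query parameters shown (field, groupBy, statsPeriod, interval) and the sentry.io host are assumptions to verify against the Sentry API reference, the org slug is a placeholder, and the monthly totals are made-up sample values.

```python
from urllib.parse import urlencode

INCLUDED_QUOTA = 5_000_000


def stats_url(org_slug: str, category: str = "span", period: str = "90d") -> str:
    """Build a stats_v2 URL; parameter names are assumptions to verify."""
    query = urlencode({
        "field": "sum(quantity)",
        "groupBy": "outcome",
        "category": category,   # may be "transaction" on older SDKs
        "statsPeriod": period,
        "interval": "1d",
    })
    return f"https://sentry.io/api/0/organizations/{org_slug}/stats_v2/?{query}"


def average_monthly_spans(monthly_totals: list) -> float:
    """Average of the monthly span totals taken from the API or the usage page."""
    return sum(monthly_totals) / len(monthly_totals)


print(stats_url("your-org-slug"))  # placeholder slug

sample_totals = [14_200_000, 15_100_000, 15_700_000]  # made-up sample values
average = average_monthly_spans(sample_totals)
print(f"Average: {average:,.0f} spans/month, "
      f"{max(0, average - INCLUDED_QUOTA):,.0f} above the included quota")
```

The request itself needs an auth token with org read access sent as a bearer header; that part is left out here so the sketch runs without credentials.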

If you are above 5M, look at your current traces_sample_rate setting on each service. If any service runs at 1.0 and handles more than a few requests per second, that service is a candidate for a lower rate with a dynamic error rule. Set the dynamic rule first; then lower the global rate incrementally and observe the span volume in the Usage & Stats UI before each further reduction.

Set a usage alert for spans at 80% of your included quota. In Sentry: Settings > Alerts > New Alert > Metric Alert, category = spans, threshold = 4,000,000. This gives you time to react before overage starts in a billing period.

If sampling does not get you to an acceptable volume without losing signal you depend on, pick the lever that matches your team's ops profile from the decision tree above. The configuration changes for each option are well-documented; the decision about which option to use is the harder part, and the worked example and decision tree above give you the structure to make it.

Frequently asked questions

What exactly changed about Sentry spans in August 2025?

Sentry reduced the included span quota on Team and Business plans from 10 million spans per month to 5 million. The change was announced in advance and framed as a pricing normalization. Teams using Performance Monitoring with span volumes between 5M and 10M per month went from zero overage to paying for every span above 5M, with no code change required to trigger the increase.

How do I find out how many spans my team sends per month?

Three places: Settings > Usage & Stats in the Sentry UI for a monthly visual breakdown; the /api/0/organizations/{org_slug}/stats_v2/ API with category=span (or category=transaction depending on SDK version) for daily granularity; and the billing CSV at Settings > Billing > Usage & Payments for exact overage charges per period. Cross-reference the API data against the billing CSV to verify your numbers.

Does reducing traces_sample_rate actually lower my Sentry bill?

Yes. Spans are sampled at the trace level: a traces_sample_rate of 0.1 sends 10% of transactions and their spans, cutting span volume by approximately 90%. The signal cost is real: you lose visibility into unsampled transactions. Dynamic sampling rules let you keep full coverage on error transactions and slow transactions while cutting the global rate on healthy ones, which preserves more investigation value per span sent.

When is it cheaper to move OTel traces off Sentry entirely?

When your span volume exceeds roughly 15–20M per month and sampling cannot get you within the included quota without losing investigations you depend on. At 20M spans, you pay overage on 15M spans every month at approximately $0.0000088/span, which is roughly $132/month in span overage alone. A self-hosted Tempo instance or urgentry on a $15–40/month VPS costs less than that overage in infrastructure alone; add your ops time before comparing. Run the comparison against your specific overage number.

What is urgentry and how does it relate to Sentry spans?

urgentry is a source-available Sentry-compatible single binary that accepts errors from any Sentry SDK and ingests traces natively via OTLP. Span volume does not generate a per-span cost: the infrastructure cost is fixed at whatever your VPS costs, regardless of span count. It covers 218 of 218 documented Sentry API operations. Session replay and deep profiling surfaces are narrower than the full Sentry product; check the compatibility matrix for the gap detail before making a migration decision.

Sources

  1. Sentry pricing page — current plan tiers, included quotas, and per-unit overage rates. All prices quoted in this guide are approximations; verify against this page before building a budget model.
  2. Sentry changelog, August 2025 — the announcement of the span quota reduction from 10M to 5M on Team and Business plans.
  3. Sentry documentation: Performance at scale — sampling configuration reference, including traces_sample_rate and dynamic sampling rule setup.
  4. OpenTelemetry documentation: Sampling — head-based and tail-based sampling concepts, the Traceparent propagation model, and the OTel Collector sampling processor reference.
  5. FSL-1.1-Apache-2.0 license — the Functional Source License under which urgentry is released.
  6. urgentry compatibility matrix — the 218-operation audit of urgentry’s Sentry API coverage, SDK compatibility, and feature gap detail.

If the span math says it’s time to look at alternatives

urgentry is a source-available Sentry-compatible alternative that runs as a single binary. It accepts errors from any Sentry SDK and ingests traces via OTLP natively. Span volume does not carry a per-unit cost: you pay the infrastructure cost of the VPS, fixed regardless of traffic. It implements 218 of 218 documented Sentry API operations. If the worked example above matches your profile, the compatibility matrix and install are linked below.