Updated June 4, 2026

Where your observability bill actually goes: egress, not events

Most self-hosted observability bills are dominated by network egress, not ingest volume. The four egress line items, what OTel Arrow buys you, and the layout that zeroes them.

TL;DR

20 seconds. Self-hosted observability bills are usually a network bill with a small EC2 footnote, not a storage bill with a small network footnote. Cross-AZ at $0.01/GB each way, NAT-gateway processing at $0.045/GB, and public-internet egress at $0.09/GB stack on every byte of telemetry that crosses a boundary. The collector-to-backend layout decides the bill.

60 seconds. Storage and CPU are cheap; cloud networking is not. The same 18 TB/month of telemetry that costs $0 in transport between two instances in the same AZ costs roughly $2,600/month when the collector forwards through NAT to an external SaaS endpoint. The signal volume is identical. OTel Arrow shrinks the bytes by roughly 2–10×, depending on the signal, and helps when the layout is fixed by something other than cost, but it does not change the architectural question: whether the collector and the backend share a VPC.

This guide covers the four egress line items that appear on every meaningful observability deployment, a worked example at two volumes, what OTel Arrow is and is not, the four collector placement options ranked by cost, and how self-hosted error tracking lands the destination inside your VPC.

The bill you think you have vs the one you actually have

Most teams budget observability by event volume: events per second, GB ingested per day, retention days times daily ingest. That model fits a SaaS line item nicely, because the SaaS vendor charges you on that exact axis.

When you self-host, the axis changes. Storage is cheap. CPU is cheap. What you pay for is the bytes moving from the place they were emitted to the place they get stored. In a cloud-hosted deployment that path crosses one or more billed network boundaries, and each boundary charges per GB. Spans, logs, errors, profiles — every signal hits the same network at roughly the same place, and the network is the thing you pay for.

The result is that the self-hosted observability bill looks like a network bill with a tiny EC2 footnote, not a storage bill with a tiny network footnote. Teams that planned for the second shape are surprised by the first.

The four egress line items that show up on every cloud deployment

Four cloud network charges show up on every observability deployment of meaningful scale. They stack, and the largest is usually the easiest one to fix.

1. Cross-AZ traffic inside a VPC

AWS charges $0.01 per GB in each direction when an instance in availability zone us-east-1a talks to an instance in us-east-1b. Both the sending and the receiving side are billed, so each GB that crosses the AZ boundary costs $0.02 in total (the "round-trip" rate used later in this guide). GCP and Azure are within a cent of the same number. The charge applies even when both instances are in the same VPC and the same subnet group.

For observability this matters because the OTel Collector is typically deployed as a DaemonSet or as a Deployment with replicas spread across AZs for resilience, and the storage backend is usually AZ-redundant too. Spans emitted on a host in AZ-a that hit a collector replica in AZ-b that forwards them to a backend in AZ-c pay two cross-AZ hops on the same byte.
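The double-hop case can be sketched in a few lines of Python. This is a minimal illustration, not a billing tool: the function name `cross_az_cost` and the AZ labels are hypothetical, and the only rate used is the $0.01 per GB per side quoted above.

```python
from __future__ import annotations

# Rate from the text: $0.01 per GB, billed on each side of a cross-AZ transfer.
CROSS_AZ_PER_GB_EACH_SIDE = 0.01


def cross_az_cost(gb: float, az_path: list[str]) -> float:
    """Cost of moving `gb` along a path of AZ labels, e.g. ["a", "b", "c"].

    Hypothetical helper: each element is the AZ of one stop on the path
    (application host, collector replica, backend).
    """
    # Count only the hops where the AZ actually changes.
    crossings = sum(1 for src, dst in zip(az_path, az_path[1:]) if src != dst)
    # Every crossing is billed on both the sending and the receiving side.
    return gb * crossings * 2 * CROSS_AZ_PER_GB_EACH_SIDE


if __name__ == "__main__":
    # Host in AZ-a, collector in AZ-b, backend in AZ-c: two billed crossings.
    print(round(cross_az_cost(1000, ["a", "b", "c"]), 2))  # 40.0
    # Everything in one AZ: nothing billed.
    print(round(cross_az_cost(1000, ["a", "a", "a"]), 2))  # 0.0
```

Under these assumptions, 1 TB of spans that takes both hops costs $40 in cross-AZ charges, and the same terabyte kept in one AZ costs nothing.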

2. NAT-gateway processing

Anything leaving a private subnet through a NAT gateway pays $0.045 per GB processing, on top of $0.045 per gateway-hour just for keeping the NAT up. The processing charge is the bigger surprise: a 1 TB month through NAT is $45 for the bytes alone, before any internet egress charge stacks on top.

This bites teams who put their collector in a private subnet and their telemetry backend on the public internet, whether that backend is a SaaS endpoint or a self-hosted backend in a different account. Every byte of telemetry pays the gateway tax.
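As a rough sketch of how the two NAT charges combine, the snippet below adds the hourly charge to the per-GB processing charge. The 730-hour month and the function name `nat_monthly_cost` are assumptions for illustration; the two $0.045 rates are the ones quoted above.

```python
NAT_HOURLY_RATE = 0.045      # $ per gateway-hour (from the text)
NAT_PROCESSING_RATE = 0.045  # $ per GB processed (from the text)
HOURS_PER_MONTH = 730        # assumption: average month length in hours


def nat_monthly_cost(gb: float, gateways: int = 1) -> dict:
    """Split a month of NAT cost into its hourly and per-GB parts."""
    hourly = gateways * HOURS_PER_MONTH * NAT_HOURLY_RATE
    processing = gb * NAT_PROCESSING_RATE
    return {"hourly": hourly, "processing": processing, "total": hourly + processing}


if __name__ == "__main__":
    # 1 TB (1,000 GB) of telemetry through a single gateway.
    cost = nat_monthly_cost(1000)
    print({name: round(value, 2) for name, value in cost.items()})
```

With one gateway, the fixed hourly part is about $33 a month; the processing part overtakes it once telemetry passes roughly 730 GB a month.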

3. Public-internet egress

Once bytes leave the cloud provider's network entirely, AWS bills $0.09 per GB for the first 10 TB per month, dropping to $0.085 and then $0.07 at higher tiers. GCP and Azure are similar. Cloudflare R2 famously waives egress entirely, which is part of why the comparison has been ugly for the hyperscalers since 2022.

For observability this is the line item that scales with traffic to a third-party SaaS. A team sending 1 TB of telemetry per month to a SaaS endpoint pays AWS $90 for the bytes (or more, if NAT processing is also in the path), regardless of what the SaaS vendor charges them on the receiving end.
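A tiered rate is easier to reason about as code. The sketch below uses the three per-GB rates quoted above; the widths of the second and third tiers (40 TB and 100 TB) are assumptions not stated in this guide, 1 TB is treated as 1,000 GB, and any free allowance is ignored. Check the current AWS pricing page before relying on the numbers.

```python
# (tier width in GB, $ per GB). Rates are from the text; the 40 TB and 100 TB
# tier widths are assumptions for illustration.
EGRESS_TIERS = [(10_000, 0.09), (40_000, 0.085), (100_000, 0.07)]


def internet_egress_cost(gb: float) -> float:
    """Monthly public-internet egress cost under the assumed tier table."""
    remaining = gb
    total = 0.0
    for width, rate in EGRESS_TIERS:
        billed = min(remaining, width)
        total += billed * rate
        remaining -= billed
        if remaining <= 0:
            return total
    # Simplification: volume beyond the listed tiers is priced at the last rate.
    return total + remaining * EGRESS_TIERS[-1][1]


if __name__ == "__main__":
    print(round(internet_egress_cost(1_000), 2))   # 90.0
    print(round(internet_egress_cost(18_000), 2))  # 1580.0
```

Under these assumed tiers, 18 TB comes out slightly below a flat $0.09 per GB, so a flat-rate estimate errs on the high side at that volume.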

4. Cross-region replication

Cross-region traffic on AWS ranges from $0.02 per GB between two US regions to $0.09 per GB for intercontinental hops. Anyone running an active-active observability backend across regions pays this for every replicated span and event, on top of whatever the primary ingest path costs.

Most teams discover the line item the month after they enable cross-region DR for the telemetry backend. The replication path doubles the telemetry-related egress overnight.
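The replication charge is a straight multiplication, sketched below. The route labels and the function name `replication_cost` are hypothetical; the two rates are the ends of the range quoted above, and the model assumes every ingested byte is copied once to each extra region.

```python
# $ per GB, the two ends of the range quoted in the text.
CROSS_REGION_RATES = {"us_to_us": 0.02, "intercontinental": 0.09}


def replication_cost(gb: float, route: str, extra_regions: int = 1) -> float:
    """Monthly cost of copying `gb` of telemetry to each extra region."""
    return gb * CROSS_REGION_RATES[route] * extra_regions


if __name__ == "__main__":
    # 18 TB/month replicated to one additional region.
    print(round(replication_cost(18_000, "us_to_us"), 2))          # 360.0
    print(round(replication_cost(18_000, "intercontinental"), 2))  # 1620.0
```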

A worked example: 5M spans/day across two AZs

Concrete numbers make the shape clearer. Imagine a service emitting 5 million spans per day, average wire size 1.2 KB encoded as OTLP protobuf with gzip compression. That comes out to 6 GB/day of telemetry, or 180 GB/month.
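The volume arithmetic is worth making explicit, because every later figure depends on it. A minimal sketch, assuming a 30-day month and decimal units (1 GB = 1,000,000 KB); the function name `monthly_gb` is hypothetical.

```python
def monthly_gb(spans_per_day: float, avg_kb: float, days: int = 30) -> float:
    """Telemetry volume in GB per month from span rate and average wire size."""
    kb_per_month = spans_per_day * avg_kb * days
    return kb_per_month / 1_000_000  # KB -> GB, decimal units


if __name__ == "__main__":
    print(round(monthly_gb(5_000_000, 1.2), 1))    # 180.0
    print(round(monthly_gb(500_000_000, 1.2), 1))  # 18000.0
```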

Layout one: collector and backend in the same AZ.

  • 180 GB × $0 cross-AZ = $0
  • 180 GB × $0 NAT (no NAT in path) = $0
  • 180 GB × $0 internet egress (intra-VPC) = $0
  • Total: $0/month for transport.

Layout two: collector in AZ-a, backend in AZ-b, both in the same VPC.

  • 180 GB × $0.02 round-trip cross-AZ = $3.60/month.
  • Total: $3.60.

Layout three: collector in AZ-a, forwarding to a SaaS backend reached through a NAT gateway in AZ-b.

  • 180 GB × $0.01 cross-AZ outbound = $1.80
  • 180 GB × $0.045 NAT processing = $8.10
  • 180 GB × $0.09 internet egress = $16.20
  • Total: $26.10.

Now scale the workload by 100×: 500M spans/day, 18 TB/month.

  • Layout one: still $0.
  • Layout two: $360/month.
  • Layout three: $2,610/month — and that is before the SaaS vendor's bill arrives.
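The three layouts above reduce to three per-GB rates, which the following sketch reproduces at both volumes. The layout keys and the function name `transport_cost` are hypothetical labels; the rates are the flat figures used in the worked example, with no pricing tiers applied.

```python
CROSS_AZ_EACH_SIDE = 0.01  # $ per GB, billed on each side of a cross-AZ hop
NAT_PROCESSING = 0.045     # $ per GB through the NAT gateway
INTERNET_EGRESS = 0.09     # $ per GB, flat first-tier rate

# Per-GB transport rate for each layout in the worked example.
LAYOUT_RATES = {
    "same_az": 0.0,
    "cross_az_same_vpc": 2 * CROSS_AZ_EACH_SIDE,
    "nat_to_saas": CROSS_AZ_EACH_SIDE + NAT_PROCESSING + INTERNET_EGRESS,
}


def transport_cost(gb: float, layout: str) -> float:
    """Monthly transport cost in dollars for a given volume and layout."""
    return gb * LAYOUT_RATES[layout]


if __name__ == "__main__":
    for gb in (180, 18_000):
        for layout in LAYOUT_RATES:
            print(f"{gb:>6} GB  {layout:<18} ${transport_cost(gb, layout):,.2f}")
```

Writing the layouts as a rate table makes the point of the example visible: volume is a multiplier, and the layout picks the rate it multiplies.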

At the higher volume, the layout decision moves the transport bill from $0 to more than $2,600 a month. The signal volume does not change. This is why "your observability bill is a network bill" is not hyperbole.

What OTel Arrow actually buys you

The OpenTelemetry Arrow project, originally upstreamed by F5 in 2022 and stabilized in the 2025 release cycle, replaces the OTLP protobuf wire format with Apache Arrow columnar batches. The published measurements show 4–10× compression improvements for metrics and 2–4× for spans, with logs in between depending on cardinality.

For the worked example above, OTel Arrow does not change the layout decisions. It shrinks the bytes. A 4× reduction on layout three drops 18 TB/month to roughly 4.5 TB and the egress bill from $2,610 to about $650 — a real cut, but the layout was still wrong.
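The effect of a compression ratio on the layout-three bill can be sketched as follows. The $0.145 per-GB rate is the sum of the layout-three charges from the worked example; the function name `compressed_bill` is hypothetical, and the ratios shown are the ends of the 2–4× range quoted for spans, not measurements.

```python
# Layout three from the worked example: cross-AZ + NAT processing + egress.
LAYOUT_THREE_PER_GB = 0.01 + 0.045 + 0.09


def compressed_bill(raw_gb: float, ratio: float) -> tuple[float, float]:
    """Return (wire GB, monthly transport $) after a compression ratio."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    wire_gb = raw_gb / ratio
    return wire_gb, wire_gb * LAYOUT_THREE_PER_GB


if __name__ == "__main__":
    for ratio in (1, 2, 4):
        wire_gb, dollars = compressed_bill(18_000, ratio)
        print(f"{ratio}x  {wire_gb:>8,.0f} GB  ${dollars:,.2f}")
```

At the low end of the span range (2×) the bill is about $1,305 rather than $650, so the saving depends heavily on which signals dominate the traffic.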

Where OTel Arrow earns its place is when the layout is dictated by something other than cost. A regulated workload that must keep telemetry in a separate observability VPC, or a team that contractually cannot co-locate with the backend, takes the compression as the only available lever. The protocol switch is a configuration change on both ends: the otelarrow exporter on the sending Collector and the matching otelarrow receiver on the receiving one. No application code changes.

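exporters:
  otelarrow:
    # Address of the receiving Collector (gRPC on port 4317)
    endpoint: collector-aggregator.observability.svc:4317
    # gRPC-level compression applied to the stream
    compression: zstd
    arrow:
      # false keeps Arrow encoding on; true falls back to plain OTLP
      disabled: false
      # number of concurrent Arrow streams opened to the receiver
      num_streams: 4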

The receiver-side cost is a small RAM bump and the dependency on a recent Collector build. The wire savings start on the first batch.

The collector placement decision

Once you accept that egress is the bill, the placement of the OTel Collector is the highest-leverage knob you have. Four placements, in roughly descending cost.

Cross-region SaaS

The default for most teams: applications emit OTLP to a regional collector, which forwards to a SaaS endpoint in a different region or a different cloud. Every byte pays cross-AZ, NAT, and internet egress, and often a cross-region surcharge once it leaves your provider. The bill scales linearly with traffic, and the only lever you have is the SaaS's own retention tiering.

Same-region SaaS

Applications emit to a regional collector, which forwards to a SaaS endpoint that happens to be in the same cloud region. Cross-AZ may still apply depending on where the SaaS's ingest endpoint lives, and NAT processing still applies because the destination is outside your VPC. Better than cross-region, still expensive.

Same-VPC self-hosted

Applications emit to a same-AZ collector, which forwards to a self-hosted backend (urgentry, GlitchTip, Bugsink, SigNoz, Sentry self-hosted) inside the same VPC. Cross-AZ traffic only if the backend's AZ differs from the collector's. No NAT processing, no public-internet egress. The bill is dominated by EC2 hours and storage.

Same-host self-hosted

The collector runs as a DaemonSet or sidecar on every application host, and the backend runs on a host in the same VPC. Application-to-collector traffic is either loopback (sidecar) or pod-to-node-local collector (DaemonSet); both are free. Only the collector-to-backend hop is potentially billed, and even that is free if you keep the backend in the AZ where the bulk of your application traffic lives. This is the layout where the egress line item rounds to zero.

The cost gap between the first and last placement is typically a 50–200× factor at any meaningful scale. That gap is large enough to dominate the architecture decision, not be an afterthought to it.

Self-hosted error tracking changes the egress shape

Error tracking has a friendly property compared to general observability: the events are small, the rate is bounded by the actual failure rate of the system, and the destination is a single backend. A team running urgentry, GlitchTip, or Bugsink inside their VPC pays nothing for the error-ingest path. The SDK or the OTel Collector forwards events to a hostname that resolves inside the VPC, and the bytes never touch a billed boundary.

The contrast with a SaaS error tracker is the full egress stack. The SDK has to reach a hostname owned by the vendor, so the bytes leave the VPC: cross-AZ charges apply at best, and NAT plus internet egress apply in the common case. At 5M events/day and the same 1.2 KB average size as the worked example, that path is about $26/month per service in transport before any product bill. A team running fifty services pays about $1,300/month just to deliver errors to their tracker.
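The fleet figure follows from the same arithmetic as the worked example. The sketch below assumes a 1.2 KB average event size, which matches the $26 figure but is an assumption here because the text gives only the event count. It also assumes a 30-day month and the flat $0.145 per-GB rate of the NAT-to-SaaS layout. The function name `error_transport_cost` is hypothetical.

```python
# Cross-AZ + NAT processing + internet egress, as in the NAT-to-SaaS layout.
SAAS_PATH_PER_GB = 0.01 + 0.045 + 0.09


def error_transport_cost(events_per_day: float, avg_kb: float,
                         services: int = 1, days: int = 30) -> float:
    """Monthly transport cost of delivering error events to an external tracker."""
    gb_per_service = events_per_day * avg_kb * days / 1_000_000  # KB -> GB
    return gb_per_service * SAAS_PATH_PER_GB * services


if __name__ == "__main__":
    print(round(error_transport_cost(5_000_000, 1.2), 2))               # 26.1
    print(round(error_transport_cost(5_000_000, 1.2, services=50), 2))  # 1305.0
```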

Once you write down the egress math for a specific deployment, the architectural choice of "where the collector forwards to" becomes the only thing that matters for cost. Sampling rates, retention windows, signal selection — these all live inside whatever bound your network bill sets, and the network bill is set by where the destination sits.

Frequently asked questions

What is the dominant cost of self-hosted observability?

Network egress, in most deployments. Cross-AZ traffic at $0.01 per GB each way, NAT-gateway processing at $0.045 per GB, and public-internet egress at $0.09 per GB stack up faster than disk or CPU. The collector-to-backend hop, where every span and event goes, is the line item that dominates.

Does OTel Arrow actually reduce egress?

Yes, by roughly 2–10× depending on signal shape. OTel Arrow encodes telemetry as Apache Arrow columnar batches, which compress structured data far better than the OTLP protobuf wire format. The published reductions are largest for metrics (4–10×) and smaller for spans (2–4×), with logs in between depending on cardinality. The switch is a configuration change on both ends of the exporter–receiver pair.

Should I co-locate the OTel Collector with my error-tracking backend?

If you can. Same host eliminates the hop entirely. Same AZ removes cross-AZ charges. Same VPC removes NAT-gateway processing. Cross-region is the only layout that always pays for traffic, and it should be a deliberate choice for resilience, not a default.

How does self-hosted error tracking change the egress picture?

It moves the destination inside your network. Sentry SaaS and Datadog do not bill you for transporting the bytes you send them, but your cloud bills you for the bytes leaving your VPC. Self-hosting puts collector and backend inside the same VPC, often the same host, which zeroes the path that hurt the most.

What is the cheapest collector placement for a single-region deployment?

Run the collector as a DaemonSet or sidecar on the same host as the application, batch aggressively, and forward to a backend in the same VPC and, ideally, the same AZ. That layout pays nothing per byte as long as the collector-to-backend hop stays inside one AZ. The only cost is the collector's RAM and the disk the backend uses.

Sources

  1. OpenTelemetry Arrow project — the OTel Arrow GitHub repo, including published compression benchmarks across signal types and the exporter/receiver components used in the configuration example above.
  2. AWS EC2 on-demand pricing — Data Transfer — the authoritative reference for cross-AZ ($0.01/GB), regional, and public-internet egress charges used in the worked example.
  3. AWS VPC pricing — the NAT-gateway hourly rate and per-GB processing charge that turn private-subnet collectors into a recurring observability line item.
  4. Cloudflare R2 — the egress-fee waiver that anchors the comparison and explains why the hyperscaler economics have been awkward to defend since 2022.
  5. OpenTelemetry Collector deployment guide — the agent/gateway/sidecar deployment patterns referenced in the placement decision section.

Land the destination inside your VPC.

urgentry runs as a single Go binary inside the same VPC as your OTel Collector. No NAT-gateway processing, no public-internet egress, no cross-region replication for the ingest path. Change one DSN and the telemetry stops leaving your network.