Updated June 24, 2026

Self-Hosted Monitoring Stack: How to Combine Errors, Traces, and Logs on One Backend

A self-hosted monitoring stack that handles errors, traces, and logs on one backend needs a single ingest process, correlated storage for all three signal types, and a query layer that connects them during an incident. The minimum viable version costs $80–$120 per year; the production-grade version delivers 2,200 errors per second with a query p95 of 48.82 ms.

TL;DR

20 seconds. On the same hardware as Sentry self-hosted, urgentry in self-hosted mode uses about 21 times less memory (391.8 MB vs 8,191.7 MB), sustains 2.2 times the throughput (2,200 vs 1,000 errors per second, abbreviated eps below), and returns queries about 29 times faster at p95 (48.82 ms vs 1,400.81 ms). The minimum viable one-backend stack for errors, traces, and logs runs on a $5 VPS at $80–$120 per year all-in.

60 seconds. One backend handling all three signal types needs four things: ingest endpoints for OTLP and the Sentry envelope format, a storage engine that handles mixed workloads, correlation IDs threaded through all three signals, and a query layer that connects errors to traces to logs without copying IDs between tools. urgentry ships as a single Go binary with SQLite (Tiny mode: 52.3 MB peak RAM, 400 eps on a large box) or with PostgreSQL and NATS (self-hosted mode: 391.8 MB peak RAM, 2,200 eps). The TCO for Tiny mode is $80–$120 per year. A full Grafana/Prometheus/Loki stack is the right choice when your primary workflow is infrastructure metrics or high-volume log retention rather than exception triage.

This guide covers what each signal catches, why three separate tools add coordination overhead, what a one-backend stack requires, benchmark numbers across Tiny mode, self-hosted mode, and Sentry self-hosted, annualized TCO for four scenarios, OpenTelemetry as the transport layer, when Grafana and Prometheus win instead, the four concrete stack decisions, and anti-patterns to avoid.

What the three signals are and what each one catches

Three distinct data types flow out of a production application. Each one answers a different question, and the distinction matters when you’re choosing what to store and where.

Errors are discrete failure events: unhandled exceptions, panics, assertion failures, and 5xx responses. An error record captures the exception type, the stack trace, breadcrumbs leading up to the failure, and environment metadata (release, server, user). A fingerprinting algorithm groups repeated occurrences into a single “issue” so you’re not re-triaging the same crash a thousand times. Errors answer: what broke, where in the code, and how many users were affected? They are low in volume relative to the other two signals but high in signal quality. A team serving a thousand users daily might generate a few dozen unique error issues per day, a manageable set with clear actionability.
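As a rough illustration of the grouping step, the sketch below hashes the exception type together with the function names in the stack trace and buckets events by that hash. The rule, the field names (`type`, `frames`, `user`), and the sample events are assumptions made for this example; real trackers, urgentry included, use more elaborate fingerprinting.

```python
import hashlib
from collections import defaultdict


def fingerprint(exc_type, frames):
    """Hash the exception type plus the function names in the stack trace.

    Hypothetical rule for illustration: messages and line numbers are left
    out so the same crash keeps grouping together across releases.
    """
    key = exc_type + "|" + "|".join(frames)
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]


def group_events(events):
    """Bucket raw error events into issues keyed by fingerprint."""
    issues = defaultdict(list)
    for event in events:
        issues[fingerprint(event["type"], event["frames"])].append(event)
    return dict(issues)


# Invented sample data: two occurrences of one crash, one of another.
events = [
    {"type": "TimeoutError", "frames": ["handler", "fetch_user", "db.query"], "user": "a"},
    {"type": "TimeoutError", "frames": ["handler", "fetch_user", "db.query"], "user": "b"},
    {"type": "KeyError", "frames": ["handler", "render"], "user": "a"},
]
issues = group_events(events)
for fp, occurrences in issues.items():
    users = {e["user"] for e in occurrences}
    print(fp, occurrences[0]["type"], len(occurrences), "events,", len(users), "users")
```

Three raw events collapse into two issues here, which is the property that keeps triage volume low even when one crash repeats thousands of times.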

Distributed traces model the lifecycle of a request across services. A trace is a directed acyclic graph of spans: one root span (the entry point) with child spans for database calls, downstream service calls, and other timed operations. Trace data scales with throughput rather than with bug count. A distributed trace answers: how long did this request take at each hop, and which service introduced the latency? Traces are the investigation tool after an error surfaces. You find the error first, then reach for the trace to see the context surrounding it.

Structured logs are the timestamped narrative of what a process did. Unlike errors (triggered by exceptions) and traces (scoped to a request lifetime), logs emit continuously. A single web request might produce a dozen log lines: authentication check, cache hit or miss, SQL query, response time. Logs answer: what did the process actually do step by step? They fill the gaps traces don’t capture and let you reconstruct state without a debugger attached.

The relationship between these three signals matters more than the definitions in isolation. In a real incident, the sequence is almost always: an error alert fires, the error record points to a trace ID, the trace reveals a slow downstream span, and the logs on that span show the actual query that timed out. Three signals, one incident, one query path. That workflow is what a one-backend stack is built to support.

Why three separate tools create a different problem

The default architecture for a team scaling past a few engineers is accidental: Sentry for errors, Datadog (or Jaeger) for traces, and CloudWatch or Grafana/Loki for logs. Each tool was a reasonable choice at the time. Together, they accumulate coordination overhead that compounds with team size.

The concrete friction points:

  • Correlating an error to its trace requires copying a trace ID out of the error tracker, pasting it into Jaeger or Datadog APM, and hoping the retention windows happen to overlap.
  • Log lines from the same request live in a third system, often with a different timestamp format and no shared trace context.
  • When a service generates 50,000 events per hour, retaining that data across three separate retention policies adds cost at each tier: vendor A charges for ingestion, vendor B charges for storage, vendor C charges for egress.
  • Three separate auth integrations, three alert pipelines, and three data-governance domains describe fundamentally the same runtime narrative, just fragmented.

At real traffic volumes, the cost compounds quickly. At 5 million errors and 10 million spans per month in Sentry, overages alone run roughly $426/month for errors, $44/month for spans, and $25/month for replay (urgentry.com/guides/cost/self-host-economics/, 2026). Adding Datadog APM or a comparable trace backend puts most teams into four figures per month before accounting for data egress.

Self-hosting breaks even at roughly $100–$150/month in combined SaaS spend. Below that threshold, the ops overhead of running your own backend does not pay for itself. Above it, a single self-hosted backend changes the math substantially.

What one backend actually requires

A single backend that handles all three signal types needs four things done correctly.

Ingest endpoints for each protocol. OTLP (OpenTelemetry Protocol) over HTTP or gRPC carries traces and logs. The Sentry envelope format (/api/{id}/envelope/) carries errors from Sentry SDKs. Nothing prevents the same server from accepting both wire formats simultaneously, and doing so means you can add OTel instrumentation to an existing Sentry-instrumented codebase without migrating away from the Sentry SDK.
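To make the two wire formats concrete, here is a small sketch that derives the envelope URL from a Sentry-style DSN and the OTLP/HTTP paths from a base URL. The host `monitor.example.com`, the key, and the project ID are placeholders, and the helper names are invented for this example.

```python
from urllib.parse import urlsplit


def envelope_url(dsn):
    """Derive the envelope endpoint from a Sentry-style DSN.

    A DSN has the shape scheme://public_key@host[/prefix]/project_id.
    """
    parts = urlsplit(dsn)
    prefix, _, project_id = parts.path.rstrip("/").rpartition("/")
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    return f"{parts.scheme}://{host}{prefix}/api/{project_id}/envelope/"


def otlp_urls(base):
    """OTLP/HTTP uses fixed per-signal paths under the backend's base URL."""
    base = base.rstrip("/")
    return {"traces": f"{base}/v1/traces", "logs": f"{base}/v1/logs"}


# Placeholder DSN: key, host, and project ID are made up.
dsn = "https://abc123@monitor.example.com/7"
print(envelope_url(dsn))
print(otlp_urls("https://monitor.example.com"))
```

Both sets of URLs point at the same host, which is the whole point: the Sentry SDK and the OTel SDK can coexist in one codebase and report to one server.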

A storage engine that handles mixed workloads. Errors require a grouping index (fingerprints mapping to issue IDs). Traces need a wide columnar format for span-level queries across thousands of attributes. Logs need time-series storage with reasonable full-text search. SQLite works for small volumes and is operationally trivial. PostgreSQL with a columnar extension handles production loads. ClickHouse handles high-cardinality trace data at scale, though it adds operational weight.

Correlation IDs threaded through all three signal types. This is the non-negotiable architectural requirement. If an error event doesn’t carry a trace_id, you cannot jump from the error to the trace. OpenTelemetry handles this automatically through context propagation: every span carries a trace ID, and any log emitted within a span’s context inherits the same trace ID. The three signals become one navigable graph rather than three separate datasets.
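On the wire, the trace ID usually travels in the W3C Trace Context `traceparent` header (version, 32-hex trace ID, 16-hex parent span ID, 2-hex flags). The sketch below generates and parses that header by hand so the ID that ties the three signals together is visible. In practice the OTel SDK does this for you; the function names here are made up for illustration.

```python
import re
import secrets
from typing import Optional

# version "00", 16-byte trace ID, 8-byte parent span ID, 1-byte flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Create a traceparent header value for a new root span."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def parse_traceparent(header) -> Optional[dict]:
    """Return the IDs from a traceparent header, or None if it is invalid."""
    match = _TRACEPARENT.match(header.strip().lower())
    if not match:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero IDs are invalid per the W3C Trace Context specification.
    if set(trace_id) == {"0"} or set(span_id) == {"0"}:
        return None
    return {"trace_id": trace_id, "span_id": span_id, "sampled": bool(int(flags, 16) & 1)}


ctx = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
# The same trace_id is what an error event and a log line would both carry.
error_event = {"type": "TimeoutError", "trace_id": ctx["trace_id"]}
log_line = {"body": "query timed out", "trace_id": ctx["trace_id"], "span_id": ctx["span_id"]}
```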

A query layer that works across signal types. A UI that lets you open an error, click through to the parent trace, and filter logs by the same trace ID, without switching applications or manually copying IDs, is what makes the one-backend model operationally worthwhile. Without this, you’ve consolidated storage but not the workflow.
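A minimal sketch of that click-through, using an in-memory SQLite database: one function starts from an error ID and pulls the spans and logs that share its trace ID. The three-table schema and the sample rows are invented for illustration and are not urgentry's actual storage layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: every table carries trace_id so the signals can be joined.
conn.executescript("""
CREATE TABLE errors (id INTEGER PRIMARY KEY, trace_id TEXT, type TEXT, message TEXT);
CREATE TABLE spans  (span_id TEXT PRIMARY KEY, trace_id TEXT, name TEXT, duration_ms REAL);
CREATE TABLE logs   (id INTEGER PRIMARY KEY, trace_id TEXT, span_id TEXT, body TEXT);
CREATE INDEX spans_by_trace ON spans(trace_id);
CREATE INDEX logs_by_trace  ON logs(trace_id);
""")

trace = "4bf92f3577b34da6a3ce929d0e0e4736"
conn.execute(
    "INSERT INTO errors (trace_id, type, message) VALUES (?, ?, ?)",
    (trace, "TimeoutError", "query timed out"),
)
conn.executemany(
    "INSERT INTO spans VALUES (?, ?, ?, ?)",
    [
        ("a1", trace, "GET /checkout", 5120.0),
        ("b2", trace, "SELECT orders", 5004.0),
        ("c3", "unrelated", "GET /health", 2.0),
    ],
)
conn.executemany(
    "INSERT INTO logs (trace_id, span_id, body) VALUES (?, ?, ?)",
    [(trace, "b2", "statement timeout after 5000 ms"), ("unrelated", "c3", "ok")],
)


def incident_view(conn, error_id):
    """Follow one error to its trace's spans (slowest first) and its logs."""
    (trace_id,) = conn.execute(
        "SELECT trace_id FROM errors WHERE id = ?", (error_id,)
    ).fetchone()
    spans = conn.execute(
        "SELECT span_id, name, duration_ms FROM spans "
        "WHERE trace_id = ? ORDER BY duration_ms DESC",
        (trace_id,),
    ).fetchall()
    logs = conn.execute(
        "SELECT span_id, body FROM logs WHERE trace_id = ? ORDER BY id", (trace_id,)
    ).fetchall()
    return {"trace_id": trace_id, "spans": spans, "logs": logs}


view = incident_view(conn, 1)
```

No ID is copied by hand at any step: the error row supplies the trace ID, and the other two lookups reuse it.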

The operational requirement is lower than it sounds. A process that handles all four concerns does not need to be a distributed system. On a single machine at moderate traffic, one well-written Go binary with SQLite covers all four requirements. The complexity comes when you scale past SQLite’s write throughput ceiling, not at the architecture level.

Real resource numbers: Tiny mode vs self-hosted mode

urgentry ships in two deployment modes with published, reproducible benchmark data. All numbers below are from the April 2026 benchmark (published April 13, 2026, last updated June 21, 2026 · /docs/benchmarks/). The large-box reference hardware is 8 vCPU, 30.6 GiB RAM, 226 GB SSD, Ubuntu 24.04. The methodology covers envelope ingest with a 70/30 small-and-medium error mix.

Tiny mode (SQLite, single binary, no external infrastructure dependencies):

Metric | Large-box result | Budget VPS result
Peak RAM | 52.3 MB | 44.8 MB
Stable throughput | 400 eps | 100 eps
Query p95 | 78.66 ms | 79.9 ms
Ingest p95 | 10.08 ms | 6.8 ms

Self-hosted mode (PostgreSQL + NATS + object storage, coordinated by the same binary):

Metric | Large-box result | Budget VPS result
Peak RAM | 391.8 MB | 184.0 MB
Stable throughput | 2,200 eps | 100 eps
Query p95 | 48.82 ms | 55.5 ms
Ingest p95 | 0.71 ms | 5.3 ms

Sentry self-hosted on the same large-box hardware:

Metric | Result
Peak RAM | 8,191.7 MB (~8 GB)
Stable throughput | 1,000 eps
Query p95 | 1,400.81 ms
Small-box result | Installation failed

The summary: on the same hardware, urgentry self-hosted uses approximately 21 times less memory than Sentry self-hosted, handles 2.2 times the throughput, and returns queries about 29 times faster at p95. The benchmark does not cover all Sentry operations (methodology: envelope ingest only), and edge cases may differ.
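The headline ratios follow directly from the large-box figures in the tables above. This short check reproduces them; the dictionary keys are names chosen for this example.

```python
# Large-box results from the April 2026 benchmark tables in this guide.
urgentry = {"peak_ram_mb": 391.8, "eps": 2200, "query_p95_ms": 48.82}
sentry = {"peak_ram_mb": 8191.7, "eps": 1000, "query_p95_ms": 1400.81}

ram_ratio = sentry["peak_ram_mb"] / urgentry["peak_ram_mb"]
throughput_ratio = urgentry["eps"] / sentry["eps"]
latency_ratio = sentry["query_p95_ms"] / urgentry["query_p95_ms"]

print(f"memory: {ram_ratio:.1f}x less")
print(f"throughput: {throughput_ratio:.1f}x more")
print(f"query p95: {latency_ratio:.1f}x faster")
```

The memory and latency ratios land just under 21 and 29 respectively, which is why the prose says "approximately".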

For error tracking on a $5 VPS, Tiny mode is the correct starting point. Teams that need traces and logs correlated with errors at higher volumes should budget for a $20–$40/month VPS running self-hosted mode.

TCO reality: what it actually costs

Annual cost estimates for different configurations in 2026, with sources for every figure.

urgentry Tiny mode on a $5–$6/month VPS:

  • Host: $60–$72/year. Linode Nanode 1 GB at $5/month; Hetzner CPX11 at ~€4.51/month (~$59/year); DigitalOcean 1 GB Droplet at $6/month.
  • Backup storage (Litestream streaming to S3-compatible storage): $12–$60/year depending on event volume and retention.
  • Total: $80–$120/year (/guides/self-hosting/error-monitoring-on-5-dollar-vps/, 2026).

urgentry self-hosted mode on a mid-range VPS:

  • A $20–$40/month box (Hetzner CX32 or DigitalOcean 4 GB Droplet) handles 2,200 eps with room for trace and log volume.
  • Annual host cost: $240–$480/year.
  • Ops time: ~30 minutes per month at the $150/hour all-in engineer rate (/guides/cost/self-host-economics/, 2026) = ~$900/year if you count it. Most teams don’t; it’s worth stating so you can decide.

Sentry self-hosted:

  • Minimum hardware: 16 GB RAM documented; 24–32 GB in practice for production stability (/guides/self-hosting/sentry-self-hosted-ram/, 2026). The current docker-compose.yml defines approximately 70 services running on a single host (confirmed from github.com/getsentry/self-hosted, accessed 2026-06-26).
  • Hetzner 32 GB VM: ~$60–$65/month in EU. AWS equivalent: $170–$280/month.
  • Ops time: estimated 24 hours/year routine maintenance plus 8 hours for major upgrades = 32 hours at $150/hour = $4,800/year in eng time.
  • Worked example total: ~$5,100/year (/guides/cost/self-host-economics/, 2026).

Sentry cloud SaaS:

  • Worked example at real production event volumes: ~$6,900/year including overages.
  • The break-even vs urgentry self-hosted is ~$100–$150/month in Sentry SaaS spend.

A three-tool SaaS split (Sentry + Datadog APM + Grafana Cloud) at modest production traffic runs $2,000–$5,000/year before factoring in integration and ops overhead. Consolidating onto one self-hosted backend changes the math for teams above the break-even threshold.
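The hosting and ops-time arithmetic in the scenarios above can be reproduced with a few lines. This is a sketch that uses only inputs stated in this guide (the $150/hour rate, the monthly host prices, and the hour estimates); the function names are invented, and the worked totals in the linked economics guide may include inputs not shown here.

```python
ENGINEER_RATE_USD = 150  # all-in hourly rate quoted in this guide


def annual_ops_cost(hours_per_year, rate=ENGINEER_RATE_USD):
    """Engineer time converted to dollars per year."""
    return hours_per_year * rate


def annual_host_cost(monthly_low, monthly_high):
    """Annualize a monthly price range."""
    return (monthly_low * 12, monthly_high * 12)


# urgentry self-hosted mode: $20-$40/month box, ~30 minutes of ops per month.
urgentry_ops = annual_ops_cost(0.5 * 12)
urgentry_host = annual_host_cost(20, 40)
urgentry_total = tuple(host + urgentry_ops for host in urgentry_host)

# Sentry self-hosted: 24 hours routine maintenance plus 8 hours for upgrades.
sentry_ops = annual_ops_cost(24 + 8)

print("urgentry ops:", urgentry_ops)
print("urgentry host:", urgentry_host)
print("urgentry host + ops:", urgentry_total)
print("Sentry self-hosted ops:", sentry_ops)
```

Swapping in your own hourly rate and hour estimates is the quickest way to see which side of the break-even threshold your team sits on.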

The one category where Sentry self-hosted “wins” on TCO: teams that already have the ops expertise and the 24–32 GB hardware in-house for other purposes, where incremental ops cost is near zero. For everyone else, the hardware requirement alone makes it the expensive option.

OpenTelemetry as the transport layer

OpenTelemetry (OTel) is the CNCF standard for instrumenting, collecting, and exporting telemetry. It defines OTLP, the wire protocol that carries traces, logs, and metrics over the same HTTP or gRPC connection. OTLP is vendor-neutral: the same Collector configuration can export to Jaeger, Tempo, Loki, urgentry, or any other OTLP-compatible backend. That vendor-neutrality is the reason to invest in OTel instrumentation rather than proprietary agents.

What OTel provides in a one-backend self-hosted stack:

Context propagation. When a request enters your application, the OTel SDK attaches a trace context (trace ID and span ID) to every operation. Any log emitted inside that operation inherits the trace context automatically. This is the correlation mechanism that makes “jump from error to trace to logs” work without manual ID copying. Without context propagation threaded consistently, a one-backend architecture still requires you to grep for trace IDs across separate query views.
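The mechanism can be imitated with the standard library alone, which may help show what the SDK is doing. In this sketch a `contextvars` variable holds the active trace ID and a logging filter stamps it on every record. It is a stand-in for the OTel SDK's context and logging integration, not its API, and the logger name and log format are arbitrary.

```python
import contextvars
import io
import logging
import secrets

# Holds the trace ID for whatever request is currently being handled.
current_trace_id = contextvars.ContextVar("current_trace_id", default="-")


class TraceContextFilter(logging.Filter):
    """Attach the active trace ID to every log record passing through."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


buffer = io.StringIO()  # captured in memory so the sketch is self-contained
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s trace_id=%(trace_id)s %(message)s"))
handler.addFilter(TraceContextFilter())

log = logging.getLogger("checkout")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False


def handle_request():
    """Start a trace, log inside it, then restore the previous context."""
    trace_id = secrets.token_hex(16)
    token = current_trace_id.set(trace_id)
    try:
        log.info("cache miss")
        log.error("query timed out")
    finally:
        current_trace_id.reset(token)
    return trace_id


trace_id = handle_request()
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

The application code never passes the trace ID to the logger explicitly. That is the property that lets a backend filter logs by the same ID it found on the error.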

A unified SDK. One set of instrumentation libraries covers spans, logs, and exception recording. The span.RecordException() call creates an OTLP exception event that surfaces as an error issue with a stack trace on any backend that supports it. The alternative, maintaining separate Sentry SDK instrumentation plus a separate tracing library plus a separate logging library, works but creates three independent instrumentation surfaces to keep synchronized.

The Collector as a routing layer. The OpenTelemetry Collector is a lightweight (~50–100 MB) stateless process that receives telemetry from your services, applies sampling decisions, and routes to one or more backends. A common pattern: all signals go to the Collector; errors and traces route to urgentry; raw Prometheus metrics route to a time-series database; long-retention log archives go to S3 or object storage. This adds one process to run but gives you sampling control without touching application code.

urgentry accepts OTLP/HTTP JSON at /v1/traces and /v1/logs in the same binary that handles Sentry SDK errors. For teams already on the Sentry SDK, adding OTel traces means adding the OTel SDK alongside the existing instrumentation and pointing spans at urgentry’s OTLP endpoint. The Sentry SDK handles errors; OTel handles traces and logs; urgentry correlates all three. The OTLP for error tracking guide covers the SDK configuration and the three transport variants (gRPC, HTTP/protobuf, HTTP/JSON) in detail.
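For a sense of what travels to `/v1/logs`, the sketch below builds a single-record OTLP/HTTP JSON body and the matching POST request without sending it. The host, service name, and scope name are placeholders, and authentication headers are omitted because this guide does not describe them.

```python
import json
import time
import urllib.request


def otlp_log_payload(service, body, trace_id, span_id, severity="ERROR"):
    """Build a one-record OTLP/JSON logs payload.

    In the OTLP JSON encoding, traceId and spanId are hex strings and
    64-bit integers such as timeUnixNano are sent as decimal strings.
    """
    return {
        "resourceLogs": [{
            "resource": {
                "attributes": [
                    {"key": "service.name", "value": {"stringValue": service}}
                ]
            },
            "scopeLogs": [{
                "scope": {"name": "example-scope"},  # placeholder scope name
                "logRecords": [{
                    "timeUnixNano": str(time.time_ns()),
                    "severityText": severity,
                    "body": {"stringValue": body},
                    "traceId": trace_id,
                    "spanId": span_id,
                }],
            }],
        }]
    }


def build_request(base_url, payload):
    """Prepare (but do not send) the POST to the OTLP/HTTP logs endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/logs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = otlp_log_payload(
    "checkout", "query timed out",
    trace_id="4bf92f3577b34da6a3ce929d0e0e4736", span_id="00f067aa0ba902b7",
)
req = build_request("https://monitor.example.com", payload)
# urllib.request.urlopen(req) would send it; left out so the sketch runs offline.
```

In a real service the OTel SDK's exporter produces this payload; building it by hand is mainly useful for smoke-testing an endpoint.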

When a full Grafana/Prometheus/Loki stack wins

urgentry is errors-first. Traces and logs in urgentry exist to support the error-debugging workflow, not as a general-purpose metrics platform or a log-management system. That is a deliberate design choice, and it means there are cases where a different stack is the genuinely correct answer.

Metrics-first teams (infrastructure monitoring, SLO tracking, capacity planning) need Prometheus or a compatible time-series database. Prometheus excels at numeric gauge, counter, and histogram data: CPU utilization over time, request rate, queue depth. urgentry does not store Prometheus-style metrics. If your primary monitoring workflow is “is this service healthy right now?” rather than “what broke in the last deploy?”, Grafana + Prometheus + Alertmanager is the right center of your stack, and urgentry is a complementary error layer rather than the primary tool.

High-volume log pipelines belong in Loki, OpenSearch, or Elasticsearch. urgentry stores logs scoped to errors and traces, not arbitrary application log streams. Teams that need to retain and search 100 GB/day of application logs for compliance or audit purposes need a dedicated log store. Sending that volume through an error tracker raises ingest costs and query latency without improving the debugging workflow.

Teams with platform engineering resources who want the full observability surface (continuous profiling, infrastructure dashboards, synthetic monitoring, custom visualization) are better served by the full Grafana stack or a managed platform like Datadog, even at higher cost. The ops overhead of running Grafana + Prometheus + Loki + Tempo on your own hardware is real, but so is the capability ceiling when you need features that errors-first tools don’t implement.

A concrete check: if your team spends more time answering “how is the system performing?” than “what is broken in production?”, Prometheus/Grafana is the right primary tool. If your team spends more time triaging exceptions and tracing slow requests back to specific code paths, urgentry fits the workflow. Many teams need both, at which point the common pattern is urgentry for errors and traces, Prometheus for metrics, and a minimal Loki setup for log retention.

On feature coverage: urgentry includes session replay and continuous profiling alongside errors and traces. The genuine gaps relative to a full observability platform are infrastructure metrics dashboards, synthetic monitoring, and custom visualization, which is the tradeoff discussed above for teams that need the complete Grafana or Datadog surface.

Putting the stack together: the concrete decisions

A one-backend self-hosted monitoring stack for a small-to-mid team in 2026 involves four decisions, in order.

Decision 1: Binary or Docker Compose? For urgentry Tiny mode, a single binary is the correct choice. No container runtime required, no compose file to maintain, no container networking to debug when something breaks at 2 AM. The single binary vs Docker Compose analysis consistently favors single-binary for teams without dedicated ops. For Sentry self-hosted, Docker Compose is the only supported installation path, which is one reason the minimum hardware requirement is 16–32 GB of RAM across approximately 70 services in its current docker-compose.yml.

Decision 2: Tiny mode or self-hosted mode? Tiny mode (SQLite storage) handles up to ~400 eps sustainably on a large box; ~100 eps on a budget VPS. If your peak error volume plus trace and log ingest regularly exceeds 200 eps, or if you need the faster query latency of self-hosted mode’s columnar storage, move up to self-hosted mode on a box with at least 1–2 GB RAM for urgentry plus headroom for PostgreSQL. The jump from $5–$6/month to $20–$40/month in hosting covers most production workloads.
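Expressed as code, the rule of thumb in Decision 2 might look like the sketch below. The 200 eps switch point comes from the paragraph above; the function and parameter names are invented, and a real decision would also weigh the hardware you run on.

```python
TINY_SWITCH_EPS = 200  # sustained eps above which this guide suggests moving up


def choose_mode(peak_eps, needs_faster_queries=False):
    """Return "tiny" or "self-hosted" using the Decision 2 rule of thumb."""
    if peak_eps > TINY_SWITCH_EPS or needs_faster_queries:
        return "self-hosted"
    return "tiny"


print(choose_mode(50))
print(choose_mode(350))
print(choose_mode(50, needs_faster_queries=True))
```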

Decision 3: Which SDK for instrumentation? If you’re already on the Sentry SDK, changing one DSN string is the entire migration for error events. urgentry covers 218/218 Sentry API operations, verified from source-scanned API specifications. For new instrumentation, the OTel SDK handles all three signals (errors via RecordException, traces via spans, logs via the OTel logging API) with one set of libraries and automatic context propagation between them. The two are not mutually exclusive: a common starting point is Sentry SDK for errors (zero-touch migration) and OTel SDK added alongside it for traces.

Decision 4: OTel Collector or direct ingest? For a single service or a handful of services at low volume, direct ingest (SDK directly to urgentry’s OTLP endpoint) is simpler and one less process to run. For a multi-service architecture at higher throughput, the OTel Collector adds tail-based sampling (which matters for trace cost above a few hundred requests per second) and routing flexibility. The Collector itself is lightweight and stateless; the ops overhead is low once configured.

A working stack for most small teams:

  1. One Hetzner CPX11 (€4.51/month) or Linode Nanode ($5/month)
  2. urgentry in Tiny mode, single binary, SQLite storage
  3. Existing Sentry SDK with DSN pointed at urgentry
  4. OTel SDK added to new services, OTLP endpoint set to urgentry
  5. Litestream for real-time SQLite replication to S3-compatible backup
  6. Total annual cost: $80–$120

That configuration stores all three signal types and provides full trace-error-log correlation. On this class of budget VPS, Tiny mode sustained 100 eps in the benchmark; the 400 eps figure applies to the large reference box. When volume outgrows the box, or exceeds 200 eps sustained, upgrade the hardware and switch to self-hosted mode. The migration between modes does not require re-instrumentation.

Anti-patterns that waste your time

Running Sentry self-hosted because it’s listed as open source. The infrastructure cost on a 32 GB VM ($720–$780/year at Hetzner’s $60–$65/month EU pricing) plus ops time (estimated 32 hours/year at $150/hour = $4,800/year) adds up to more than most teams’ Sentry SaaS bill. The hidden cost of Sentry self-hosted covers the math. “Open source” and “cheap to operate” are different claims.

Three separate best-of-breed tools with no shared trace context. Best-of-breed is the right answer when you genuinely use the advanced features of each tool. Most small teams pay for Datadog APM and use a fraction of the feature surface. The correlation overhead between three tools creates operational bugs (mismatched retention windows, clock skew between systems, trace IDs that don’t correlate because one system uses W3C trace context and another uses B3) that a single-backend architecture avoids structurally.

Sending your entire application log stream to the error tracker. Logs that are not correlated to an error or trace should stay in their own store, or be dropped. Routing your full application log stream to urgentry (or any error tracker) raises ingest costs and query latency without improving the debugging workflow. Log the things that carry trace context; archive the rest separately.

Head-based sampling for traces when errors are a primary signal. If you drop 90% of traces at the entry point (head-based sampling), you also drop the errors recorded inside the discarded traces, because the decision is made before anyone knows whether the request will fail. Tail-based sampling is the better default here: keep every trace that contains an error or exceeds a latency threshold, and sample the clean, fast ones. The OTel Collector’s tail-sampling processor handles this configuration.
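The keep-or-drop rule behind tail-based sampling is simple enough to sketch in a few lines. This is an illustration of the decision logic only, not the Collector's implementation; the span fields, the 1,000 ms threshold, and the 10% baseline rate are assumptions chosen for the example.

```python
import random


def keep_trace(spans, latency_threshold_ms=1000.0, baseline_rate=0.1, rng=random.random):
    """Decide whether to keep a finished trace.

    Runs after all spans have arrived, which is what makes it tail-based:
    the outcome of the request is already known.
    """
    if any(span.get("error") for span in spans):
        return True  # always keep traces that contain an error
    slowest = max((span["duration_ms"] for span in spans), default=0.0)
    if slowest >= latency_threshold_ms:
        return True  # always keep slow traces
    return rng() < baseline_rate  # sample the clean, fast remainder


failed = [{"name": "GET /checkout", "duration_ms": 40.0, "error": True}]
slow = [{"name": "SELECT orders", "duration_ms": 5004.0}]
healthy = [{"name": "GET /health", "duration_ms": 2.0}]

print(keep_trace(failed), keep_trace(slow), keep_trace(healthy))
```

The `rng` parameter exists only so the random branch can be made deterministic in tests.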

Skipping the correlation ID setup and treating it as optional. A one-backend monitoring stack where the three signals aren’t correlated by trace ID is functionally equivalent to three separate tools stored in one database. The architectural benefit comes from the correlation. Threading trace context through all three signals before the first line of production code goes in is the setup step that pays for itself during every incident afterward.

Frequently asked questions

What is a self-hosted monitoring tool?

A self-hosted monitoring tool is software you run on your own infrastructure to track errors, performance, and logs from your applications. You control the data, the retention policy, and the hardware costs. The tradeoff versus SaaS is operational responsibility in exchange for lower cost and full data residency. Self-hosted tools range from a single binary on a $5 VPS to multi-container clusters on dedicated hardware.

What hardware do I need for a self-hosted monitoring stack?

For a small team (under 200 eps): a 1–2 GB RAM VPS running urgentry Tiny mode, which used 44.8–52.3 MB peak RAM in the April 2026 benchmark. For up to 2,200 eps with distributed traces and logs correlated: a box with at least 2 GB RAM for urgentry self-hosted mode (391.8 MB peak RAM in the benchmark, with PostgreSQL alongside). For Sentry self-hosted: a documented minimum of 16 GB RAM; 24–32 GB in practice for production stability.

Is self-hosted monitoring cheaper than SaaS?

At error volumes that put your Sentry cloud bill above $100–$150/month, self-hosting becomes cost-effective. A worked example with real event volumes: Sentry SaaS at ~$6,900/year, Sentry self-hosted at ~$5,100/year (mostly ops time), and urgentry self-hosted at ~$1,370/year (/guides/cost/self-host-economics/, 2026). Below the $100/month Sentry bill, the ops overhead typically exceeds the SaaS cost, and SaaS is the honest recommendation.

What is OpenTelemetry and how does it relate to a monitoring stack?

OpenTelemetry is the CNCF standard for instrumenting applications to emit telemetry: distributed traces, logs, and metrics. It defines OTLP, the wire protocol that carries all three signal types over the same connection. In a self-hosted stack, OTel provides the instrumentation SDK, context propagation (trace IDs threaded from errors through to logs), and optionally a Collector for routing and tail-based sampling. Any OTLP-compatible backend (urgentry, Jaeger, Tempo, Loki) can receive the output.

Can I use my existing Sentry SDK with a self-hosted monitoring tool?

With urgentry, yes: changing one DSN string is the entire migration for error events. urgentry covers 218/218 Sentry API operations, verified from Sentry’s API specifications. Sentry SDKs for Python, JavaScript, Go, Ruby, Java, .NET, PHP, and mobile platforms send error data to whichever endpoint the DSN points to. The SDK has no awareness of which backend processes the events.

What is the difference between a self-hosted monitoring stack and self-hosted Sentry?

Self-hosted Sentry requires approximately 70 services in its current docker-compose.yml (confirmed from github.com/getsentry/self-hosted, accessed 2026-06-26) and 16–32 GB of RAM, with installation reported to fail on hardware below that threshold. A minimal self-hosted monitoring stack using urgentry Tiny mode requires one binary and 44.8–52.3 MB RAM, runs on a $5 VPS, and delivers 100–400 eps depending on hardware. In its full self-hosted mode urgentry covers errors, traces, session replay, and continuous profiling; the remaining gaps are a few Sentry integrations (Codecov, certain source code deep-link features).

What are the best self-hosted monitoring tools in 2026?

The correct answer depends on your primary signal. For errors-first teams that want Sentry SDK compatibility without the 16–32 GB RAM requirement: urgentry. For teams that need a general-purpose observability stack covering metrics and dashboards: Grafana + Prometheus + Loki + Tempo. For teams that want error tracking specifically with a more established history: GlitchTip or Bugsink, both Sentry-compatible and lighter than Sentry self-hosted. For full-stack observability with a native OTel backend: SigNoz. No single tool wins every case. The right question is which signal (errors, metrics, or logs) drives most of your incident response.

Sources

  1. urgentry benchmark: April 2026 — throughput, RAM, and query latency for urgentry Tiny mode, urgentry self-hosted, and Sentry self-hosted on identical hardware; published April 13, 2026; last updated June 21, 2026.
  2. urgentry guide: Self-hosting economics 2026 — $150/hour engineer rate, ops time estimates, worked TCO comparison ($6,900/year SaaS vs $1,370/year urgentry vs $5,100/year Sentry self-hosted), break-even at $100–$150/month Sentry bill.
  3. urgentry guide: $5 VPS resource specs 2026 — Hetzner CPX11 €4.51/month, DigitalOcean 1 GB Droplet $6/month, Linode Nanode $5/month; annual cost $80–$120.
  4. urgentry guide: Sentry RAM requirements 2026 — 16 GB documented minimum; 24–32 GB in practice; ~20 containers; Hetzner 32 GB VM $60–$65/month EU; AWS $170–$280/month.
  5. urgentry guide: OTLP for error tracking — OTLP/HTTP JSON endpoints, three transport variants, urgentry OTLP ingest configuration.
  6. urgentry guide: Consolidating observability 2026 — cost breakdown for split-vendor vs unified self-hosted.
  7. opentelemetry.io — OpenTelemetry project specification: OTLP wire protocol, context propagation, Collector architecture (CNCF standard).
