Under Review | v0.1.0-alpha

Traces

AxCom uses the OpenTelemetry SDK for distributed tracing via the pkg/telemetry package. Traces are exported via OTLP/HTTP to the OTel Collector and forwarded to Tempo for storage and querying in Grafana.

For Go API usage (initialising the SDK, shutdown hooks), see pkg/telemetry.


How Tracing Works

Each incoming HTTP request can carry a trace: a tree of timed spans representing the work done across services. Within a single AxCom process, a trace is designed to capture the following layers (see Current Instrumentation Status for what is wired today):

  • HTTP handler execution (root span)
  • Downstream database queries
  • Cache lookups
  • External HTTP calls

All spans carry resource attributes (service name, version, environment) and are linked by a shared trace_id. When trace.id appears in a log line, it points directly to the corresponding trace in Tempo.


Configuration

All settings are read from environment variables at startup.

| Env Var | Values | Default | Description |
| --- | --- | --- | --- |
| OTEL_ENABLED | true / false | false | Enables or disables the SDK |
| OTEL_SERVICE_NAME | string | ecom-engine | Service name on all spans |
| OTEL_SERVICE_VERSION | string | 1.0.0 | Service version on all spans |
| OTEL_ENVIRONMENT | string | production | Deployment environment |
| OTEL_TRACE_SAMPLE | 0.0 to 1.0 | 0.01 | Fraction of traces to sample |
| OTEL_EXPORTER | otlp, none | none | Trace exporter |
| OTEL_EXPORTER_OTLP_ENDPOINT | URL | (none) | Collector endpoint (e.g. http://otelcol:4318) |
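
As a rough illustration of how these variables and defaults fit together, the sketch below reads them into a struct. The `Config` type, the `loadConfig` helper, and the choice to fall back to the default when a value fails to parse are all assumptions made for this example; they are not the actual pkg/telemetry API.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config mirrors the environment variables in the table above.
// The type and field names are hypothetical, not the pkg/telemetry API.
type Config struct {
	Enabled        bool
	ServiceName    string
	ServiceVersion string
	Environment    string
	SampleRatio    float64
	Exporter       string
	OTLPEndpoint   string
}

// loadConfig reads settings through getenv (os.Getenv in a real process).
// Unset or unparsable values fall back to the documented defaults; that
// fallback behaviour is an assumption of this sketch.
func loadConfig(getenv func(string) string) Config {
	str := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	cfg := Config{
		ServiceName:    str("OTEL_SERVICE_NAME", "ecom-engine"),
		ServiceVersion: str("OTEL_SERVICE_VERSION", "1.0.0"),
		Environment:    str("OTEL_ENVIRONMENT", "production"),
		SampleRatio:    0.01,
		Exporter:       str("OTEL_EXPORTER", "none"),
		OTLPEndpoint:   getenv("OTEL_EXPORTER_OTLP_ENDPOINT"),
	}
	if b, err := strconv.ParseBool(getenv("OTEL_ENABLED")); err == nil {
		cfg.Enabled = b
	}
	if f, err := strconv.ParseFloat(getenv("OTEL_TRACE_SAMPLE"), 64); err == nil {
		cfg.SampleRatio = f
	}
	return cfg
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function in, rather than calling os.Getenv directly, keeps the helper easy to exercise with a plain map in tests.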

Enabling Traces

Add to your app environment (e.g. .env.dev):

OTEL_ENABLED=true
OTEL_EXPORTER=otlp
OTEL_EXPORTER_OTLP_ENDPOINT=http://otelcol:4318
# sample 10% of traces (e.g. in staging)
OTEL_TRACE_SAMPLE=0.1

In production, keep OTEL_TRACE_SAMPLE at 0.01 (1%) unless debugging a specific issue.


Sampling

| OTEL_TRACE_SAMPLE value | Sampler | When to use |
| --- | --- | --- |
| <= 0 | NeverSample (no traces) | Disabled |
| > 0 and < 1 | TraceIDRatioBased (probabilistic) | Production (1%) / Staging (10%) |
| >= 1 | AlwaysSample (every request) | Local development / debugging |

The sampling decision is made once, when the root span is created. If a trace is sampled, all child spans within the same trace are always included.
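
The thresholds in the table can be written as a small decision function. This is only a sketch of the mapping: `samplerFor` is a hypothetical helper that returns the sampler's name, whereas real code would presumably return the corresponding OpenTelemetry Go SDK sampler.

```go
package main

import "fmt"

// samplerFor returns the name of the sampler that the table above associates
// with a given OTEL_TRACE_SAMPLE value. In the OpenTelemetry Go SDK the three
// cases correspond to sdktrace.NeverSample(), sdktrace.TraceIDRatioBased(ratio)
// and sdktrace.AlwaysSample(). The helper itself is illustrative only.
func samplerFor(ratio float64) string {
	switch {
	case ratio <= 0:
		return "NeverSample"
	case ratio >= 1:
		return "AlwaysSample"
	default:
		return "TraceIDRatioBased"
	}
}

func main() {
	for _, r := range []float64{0, 0.01, 0.1, 1} {
		fmt.Printf("%.2f -> %s\n", r, samplerFor(r))
	}
}
```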


Propagation

The SDK registers two standard propagators globally:

| Propagator | Header | Purpose |
| --- | --- | --- |
| W3C TraceContext | traceparent, tracestate | Carry trace/span IDs between services |
| W3C Baggage | baggage | Carry key-value pairs across service boundaries |

When an upstream service (load balancer, API gateway, or another microservice) sends a traceparent header, the SDK automatically continues that trace rather than starting a new one. This enables end-to-end traces that span multiple services.
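
To make the header format concrete, here is a minimal, standard-library-only sketch of parsing a version-00 traceparent value (`version-traceid-parentid-flags`). The SDK's TraceContext propagator does this for you; `TraceParent` and `parseTraceParent` are hypothetical names used only to show what the header carries.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceParent holds the fields of a version-00 W3C traceparent header.
// Illustrative type; the SDK propagator handles this internally.
type TraceParent struct {
	TraceID  string // 32 lowercase hex characters
	ParentID string // 16 lowercase hex characters (the caller's span ID)
	Sampled  bool   // lowest bit of the trace-flags byte
}

// parseTraceParent validates and splits a version-00 traceparent value.
func parseTraceParent(h string) (TraceParent, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return TraceParent{}, errors.New("traceparent: want version 00 with 4 fields")
	}
	if len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return TraceParent{}, errors.New("traceparent: wrong field length")
	}
	if h != strings.ToLower(h) {
		return TraceParent{}, errors.New("traceparent: hex must be lowercase")
	}
	flags, err := hex.DecodeString(parts[3])
	if err != nil {
		return TraceParent{}, errors.New("traceparent: invalid flags")
	}
	for _, id := range parts[1:3] {
		if _, err := hex.DecodeString(id); err != nil {
			return TraceParent{}, errors.New("traceparent: invalid hex ID")
		}
		if strings.Trim(id, "0") == "" {
			return TraceParent{}, errors.New("traceparent: all-zero ID")
		}
	}
	return TraceParent{
		TraceID:  parts[1],
		ParentID: parts[2],
		Sampled:  flags[0]&0x01 == 1,
	}, nil
}

func main() {
	tp, err := parseTraceParent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	fmt.Println(tp, err)
}
```

A service that receives this header continues trace 4bf92f35... and creates its own span as a child of span 00f067aa0ba902b7.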


Resource Attributes

Every span produced by AxCom includes these resource attributes:

| Attribute | Source |
| --- | --- |
| service.name | OTEL_SERVICE_NAME |
| service.version | OTEL_SERVICE_VERSION |
| deployment.environment.name | OTEL_ENVIRONMENT |

These attributes appear in Tempo and can be used to filter traces by environment or version.


Trace-to-Log Correlation

When a request is traced, the trace.id and span.id are automatically injected into every log line produced within that request's context (via logger.*Ctx() methods).

{
  "@timestamp": "2026-06-28T14:32:05.456Z",
  "log.level": "error",
  "message": "checkout failed: payment timeout",
  "service.name": "ecom-engine",
  "trace.id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span.id": "00f067aa0ba902b7"
}

Workflow in Grafana:

  1. Spot an error spike in the Service Health or HTTP Traffic dashboard.
  2. Open the Logs dashboard and filter by the error level.
  3. Find the relevant log line and copy its trace.id.
  4. Open Tempo in Grafana and search by that trace ID.
  5. Inspect the full trace: handler timing, DB query duration, cache hits.
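
Step 3 of the workflow can also be scripted. The sketch below pulls trace.id and span.id out of a JSON log line shaped like the example above; `logLine` and `traceIDFromLog` are hypothetical names, and the assumption that an untraced request simply lacks the trace.id field is not confirmed by this page.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// logLine picks out only the correlation fields from a structured log entry.
// Field names follow the JSON example above; the type is illustrative.
type logLine struct {
	TraceID string `json:"trace.id"`
	SpanID  string `json:"span.id"`
}

// traceIDFromLog returns the trace.id of a JSON log line, or an error if the
// line is not valid JSON or carries no trace.id field.
func traceIDFromLog(line []byte) (string, error) {
	var l logLine
	if err := json.Unmarshal(line, &l); err != nil {
		return "", err
	}
	if l.TraceID == "" {
		return "", errors.New("log line has no trace.id field")
	}
	return l.TraceID, nil
}

func main() {
	line := []byte(`{"log.level":"error","message":"checkout failed: payment timeout",` +
		`"trace.id":"4bf92f3577b34da6a3ce929d0e0e4736","span.id":"00f067aa0ba902b7"}`)
	id, err := traceIDFromLog(line)
	fmt.Println(id, err)
}
```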

OTLP Exporter

When OTEL_EXPORTER=otlp, traces are exported using OTLP/HTTP and posted to the /v1/traces path of the configured endpoint. The endpoint must be an OTel Collector (or compatible backend) that accepts OTLP over HTTP.

In the self-hosted monitoring stack (Scenario 5), the OTel Collector is reachable at http://otelcol:4318 over the ecom-net Docker network.
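
The relationship between the base endpoint and the /v1/traces path can be sketched as follows. `tracesURL` is a hypothetical helper for illustration; in practice the OTLP exporter typically derives this URL itself from OTEL_EXPORTER_OTLP_ENDPOINT, and whether pkg/telemetry does any of this explicitly is not stated here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// tracesURL appends the OTLP/HTTP traces path to a base collector endpoint
// such as http://otelcol:4318. Illustrative helper, not part of pkg/telemetry.
func tracesURL(endpoint string) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return "", fmt.Errorf("endpoint %q: scheme must be http or https", endpoint)
	}
	if u.Host == "" {
		return "", fmt.Errorf("endpoint %q: missing host", endpoint)
	}
	// Avoid a double slash when the base endpoint ends in "/".
	u.Path = strings.TrimSuffix(u.Path, "/") + "/v1/traces"
	return u.String(), nil
}

func main() {
	full, err := tracesURL("http://otelcol:4318")
	fmt.Println(full, err)
}
```

Validating the scheme up front catches a common misconfiguration: an endpoint written as host:port without http://.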

Compatible backends:

  • OpenTelemetry Collector → Tempo
  • Jaeger v2+
  • Honeycomb
  • Datadog Agent

Current Instrumentation Status

| Layer | Instrumented | Notes |
| --- | --- | --- |
| HTTP handler (root span) | Planned | OTel HTTP middleware not yet wired |
| Database queries | Planned | pgxpool OTel plugin not yet wired |
| Cache operations | Planned | |
| External HTTP calls | Planned | |

The pkg/telemetry package and the OTel Collector pipeline are fully set up. The infrastructure is ready — adding instrumentation to individual layers is the next step.

When OTEL_ENABLED=false (the current default), a no-op TracerProvider is registered. All downstream calls to trace.SpanFromContext() return a no-op span and are safe to call without nil checks.