Code-Memo

Observability for APIs

Logging requests/responses

  1. Log method, path, status, latency, tenant/user id (hashed), request id; redact bodies by default.
  2. Use structured JSON logs for easy queries in ELK/Datadog/CloudWatch.
  3. Sample high-volume paths if log volume explodes, but keep errors at 100%.
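
A minimal sketch of the three points above, using only the Python standard library. The function names, the HASH_KEY value, and the choice to treat any status >= 400 as an error are assumptions made for illustration, not part of any particular framework.

```python
import hashlib
import hmac
import json
import random

# Hypothetical secret; in practice it would come from configuration.
HASH_KEY = b"example-log-hash-key"


def hash_id(raw_id: str) -> str:
    """Keyed hash so raw tenant/user ids never appear in logs."""
    digest = hmac.new(HASH_KEY, raw_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def build_log_record(method, path, status, latency_ms, user_id, request_id):
    """Collect metadata fields only; request and response bodies are left out."""
    return {
        "method": method,
        "path": path,
        "status": status,
        "latency_ms": round(latency_ms, 2),
        "user": hash_id(user_id),
        "request_id": request_id,
    }


def should_log(status: int, sample_rate: float) -> bool:
    """Always keep errors (assumed here: status >= 400); sample the rest."""
    if status >= 400:
        return True
    return random.random() < sample_rate


def to_json_line(record: dict) -> str:
    """One JSON object per line, so the log backend can index each field."""
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    record = build_log_record("GET", "/v1/orders/42", 200, 12.345, "user-42", "req-abc")
    if should_log(record["status"], sample_rate=0.1):
        print(to_json_line(record))
```

A keyed hash (HMAC) is used rather than a bare SHA-256 so that ids cannot be recovered by hashing a list of known users without the key.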

Metrics (latency, error rate)

  1. Track RED (rate, errors, duration) per route and per dependency; alert on SLO burn rate.
  2. Break out 4xx vs 5xx so you do not mask client bugs as server incidents.
  3. Export latency histograms so p50/p95/p99 can be derived, rather than averages only.
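
The points above can be sketched with an in-memory recorder. This is an illustration under stated assumptions, not a replacement for a metrics client: RouteStats, BUCKETS_MS, and the bucket boundaries are hypothetical, and the percentile returned is only the upper bound of the bucket that contains it.

```python
import bisect
from collections import defaultdict

# Hypothetical bucket upper bounds in milliseconds; the last one catches everything.
BUCKETS_MS = (5, 10, 25, 50, 100, 250, 500, 1000, float("inf"))


class RouteStats:
    """RED counters for one route: request count, split errors, latency buckets."""

    def __init__(self):
        self.requests = 0
        self.client_errors = 0  # 4xx
        self.server_errors = 0  # 5xx
        self.bucket_counts = [0] * len(BUCKETS_MS)

    def observe(self, status: int, latency_ms: float) -> None:
        self.requests += 1
        if 400 <= status < 500:
            self.client_errors += 1
        elif status >= 500:
            self.server_errors += 1
        # Index of the first bucket whose upper bound is >= latency_ms.
        self.bucket_counts[bisect.bisect_left(BUCKETS_MS, latency_ms)] += 1

    def quantile_upper_bound(self, q: float):
        """Upper bound of the bucket holding the q-th quantile, or None if empty."""
        if self.requests == 0:
            return None
        target = q * self.requests
        cumulative = 0
        for bound, count in zip(BUCKETS_MS, self.bucket_counts):
            cumulative += count
            if cumulative >= target:
                return bound
        return BUCKETS_MS[-1]


# Key by route template (e.g. "/orders/{id}"), not the raw path, to bound cardinality.
stats_by_route = defaultdict(RouteStats)

if __name__ == "__main__":
    for status, latency in [(200, 8), (200, 12), (404, 3), (500, 700)]:
        stats_by_route["GET /orders/{id}"].observe(status, latency)
    s = stats_by_route["GET /orders/{id}"]
    print(s.requests, s.client_errors, s.server_errors, s.quantile_upper_bound(0.99))
```

Keeping 4xx and 5xx in separate counters lets a server-side alert use server_errors / requests alone, so a misbehaving client does not trip it.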

Tracing distributed requests

  1. Propagate W3C Trace Context (traceparent) or vendor headers across services.
  2. Create spans around external HTTP calls, DB queries, and queue operations to find the real bottlenecks.
  3. Link traces to logs via shared trace/span ids.
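
A sketch of handling the traceparent header by hand, to show what propagation involves. It covers only version 00 of the W3C format (version-traceid-parentid-flags). The helper names are hypothetical, and starting a new trace with the sampled flag set is an assumption; in practice a tracing library such as OpenTelemetry would normally do this work.

```python
import re
import secrets
from typing import NamedTuple, Optional

# Version 00 layout: 2 hex version, 32 hex trace id, 16 hex parent id, 2 hex flags.
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceContext(NamedTuple):
    trace_id: str
    span_id: str
    flags: str


def parse_traceparent(header: Optional[str]) -> Optional[TraceContext]:
    """Return the parsed context, or None if the header is missing or invalid."""
    if not header:
        return None
    match = _TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    # All-zero ids are explicitly invalid in the specification.
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None
    return TraceContext(trace_id, span_id, flags)


def outgoing_traceparent(incoming: Optional[str]) -> str:
    """Keep the caller's trace id, mint a new span id for this hop."""
    ctx = parse_traceparent(incoming)
    if ctx is None:
        # Assumption: start a fresh, sampled trace when nothing valid arrived.
        ctx = TraceContext(secrets.token_hex(16), "", "01")
    return f"00-{ctx.trace_id}-{secrets.token_hex(8)}-{ctx.flags}"


def log_fields(header: str) -> dict:
    """Fields to attach to every log line so logs and traces can be joined."""
    ctx = parse_traceparent(header)
    return {} if ctx is None else {"trace_id": ctx.trace_id, "span_id": ctx.span_id}


if __name__ == "__main__":
    incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    outgoing = outgoing_traceparent(incoming)
    print(outgoing, log_fields(outgoing))
```

The trace id stays constant across services while each hop gets its own span id, which is what lets a trace viewer rebuild the call tree.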

Correlation IDs

  1. Accept X-Request-Id from clients or generate one; echo it in responses and error bodies.
  2. Pass the same id through async jobs and webhooks so a request can be followed end to end.
  3. Guard against header injection by validating format and length.