Logging, metrics, and tracing
Logging
- Prefer structured logs for services (log/slog, zap, zerolog).
- Include request IDs / trace IDs, resource identifiers, and error context.
- Keep logs machine-parsable for indexing and alerting.
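
As a rough illustration of the logging points above, the following sketch uses the standard library's log/slog with its JSON handler. The field names (service, request_id, order_id), the errTimeout value, and the helpers renderFailure and field are hypothetical and exist only for this example. zap or zerolog would follow the same idea with their own APIs.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log/slog"
)

// errTimeout is a hypothetical error used only for this example.
var errTimeout = errors.New("db timeout")

// newLogger builds a JSON logger; the "service" field is an assumed static label.
func newLogger(w io.Writer) *slog.Logger {
	return slog.New(slog.NewJSONHandler(w, nil)).With("service", "orders")
}

// logFailure emits one structured line carrying a request ID,
// a resource identifier, and the error text.
func logFailure(l *slog.Logger, requestID, orderID string, err error) {
	l.Error("order lookup failed",
		"request_id", requestID,
		"order_id", orderID,
		"error", err.Error(),
	)
}

// renderFailure captures the log line in memory so it can be inspected.
func renderFailure(requestID, orderID string, err error) string {
	var buf bytes.Buffer
	logFailure(newLogger(&buf), requestID, orderID, err)
	return buf.String()
}

// field parses a JSON log line and returns the string value stored at key.
// An indexer or alert rule would read the line the same way.
func field(line, key string) string {
	var m map[string]any
	if err := json.Unmarshal([]byte(line), &m); err != nil {
		return ""
	}
	s, _ := m[key].(string)
	return s
}

func main() {
	line := renderFailure("req-123", "order-42", errTimeout)
	fmt.Print(line)                         // one JSON object per line
	fmt.Println(field(line, "request_id")) // req-123
}
```

Because every line is a single JSON object, downstream tooling can filter on request_id or order_id without regex parsing.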
Metrics
- Export Prometheus metrics (or vendor format) for:
  - request rates/latency/error rates
  - queue depth / retries
  - Go runtime stats when useful (GC pauses, goroutines, heap)
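
The sketch below shows the shape of such an export using only the standard library: two counters and one runtime gauge written by hand in the Prometheus text exposition format. The metric names (app_requests_total, app_request_errors_total, app_goroutines) and the record helper are assumptions for illustration. A real service would more likely use an official client library, which also provides histograms for latency.

```go
package main

import (
	"fmt"
	"net/http"
	"runtime"
	"strings"
	"sync/atomic"
)

// Hypothetical process-wide counters; names are illustrative only.
var (
	requestsTotal atomic.Int64
	errorsTotal   atomic.Int64
)

// record counts one finished request; 5xx statuses also count as errors.
func record(status int) {
	requestsTotal.Add(1)
	if status >= 500 {
		errorsTotal.Add(1)
	}
}

// renderMetrics writes the counters and the current goroutine count
// in the Prometheus text exposition format.
func renderMetrics() string {
	var b strings.Builder
	fmt.Fprintf(&b, "# TYPE app_requests_total counter\napp_requests_total %d\n",
		requestsTotal.Load())
	fmt.Fprintf(&b, "# TYPE app_request_errors_total counter\napp_request_errors_total %d\n",
		errorsTotal.Load())
	fmt.Fprintf(&b, "# TYPE app_goroutines gauge\napp_goroutines %d\n",
		runtime.NumGoroutine())
	return b.String()
}

// metricsHandler serves the text to a scraper, for example mounted at /metrics.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics())
}

func main() {
	record(200)
	record(503)
	fmt.Print(renderMetrics())
}
```

Error rate is then derived at query time from the two counters, so the service only needs to count events.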
Tracing
- Use OpenTelemetry when tracing helps correlate cross-service latency.
- Add spans around outbound calls (DB, HTTP, queue publish/consume).
Health checks
Expose endpoints or commands:
- liveness: process is alive
- readiness: dependencies available (DB connection, config loaded)
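
One possible shape for these two endpoints, using only net/http, is sketched below. The check type, the configLoaded and dbDown functions, the 2-second timeout, and the statusOf helper are all hypothetical. A real readiness check would ping the actual database or verify real configuration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// check reports whether one dependency is usable; nil means healthy.
type check func(ctx context.Context) error

// livez answers 200 as long as the process can serve HTTP at all.
func livez(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// readyz answers 200 only if every dependency check passes,
// and 503 with the first failure otherwise.
func readyz(checks ...check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
		defer cancel()
		for _, c := range checks {
			if err := c(ctx); err != nil {
				http.Error(w, "not ready: "+err.Error(), http.StatusServiceUnavailable)
				return
			}
		}
		w.WriteHeader(http.StatusOK)
	}
}

// Hypothetical checks standing in for "config loaded" and "DB connection".
func configLoaded(context.Context) error { return nil }
func dbDown(context.Context) error       { return errors.New("db: connection refused") }

// statusOf calls a handler in memory and returns the HTTP status it wrote.
func statusOf(h http.HandlerFunc) int {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusOf(livez))                        // 200
	fmt.Println(statusOf(readyz(configLoaded)))         // 200
	fmt.Println(statusOf(readyz(configLoaded, dbDown))) // 503
}
```

Keeping liveness free of dependency checks means a database outage marks the instance as not ready without causing it to be restarted.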
Correlation
Make it easy to connect:
- a log line ↔ its trace ↔ the matching metrics label(s)
- every signal to stable identifiers (deployment, version, instance)
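
A minimal sketch of one way to get this correlation in logs: a small log/slog handler wrapper copies a trace ID from the request context onto every line, and the stable identifiers are attached once at startup. The names traceKey, withTraceID, traceHandler, emit, and the sample values "1.4.2" and "api-0" are assumptions for illustration. With a tracing SDK in place, the ID would come from the active span instead of a hand-set context value.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"io"
	"log/slog"
)

// traceKey is a private context key for the (hypothetical) trace ID.
type traceKey struct{}

// withTraceID stores a trace ID in the context, e.g. in request middleware.
func withTraceID(ctx context.Context, id string) context.Context {
	return context.WithValue(ctx, traceKey{}, id)
}

// traceHandler wraps another handler and adds trace_id from the context.
type traceHandler struct{ slog.Handler }

func (h traceHandler) Handle(ctx context.Context, r slog.Record) error {
	if id, ok := ctx.Value(traceKey{}).(string); ok {
		r = r.Clone() // avoid mutating state shared with other copies
		r.AddAttrs(slog.String("trace_id", id))
	}
	return h.Handler.Handle(ctx, r)
}

// WithAttrs and WithGroup re-wrap so the trace logic survives logger.With.
func (h traceHandler) WithAttrs(a []slog.Attr) slog.Handler {
	return traceHandler{Handler: h.Handler.WithAttrs(a)}
}

func (h traceHandler) WithGroup(name string) slog.Handler {
	return traceHandler{Handler: h.Handler.WithGroup(name)}
}

// newLogger attaches the stable identifiers once, at startup.
func newLogger(w io.Writer, version, instance string) *slog.Logger {
	base := traceHandler{Handler: slog.NewJSONHandler(w, nil)}
	return slog.New(base).With("version", version, "instance", instance)
}

// emit logs one line under the given trace ID and returns it decoded.
func emit(traceID string) map[string]any {
	var buf bytes.Buffer
	log := newLogger(&buf, "1.4.2", "api-0")
	log.InfoContext(withTraceID(context.Background(), traceID), "request handled")
	var fields map[string]any
	_ = json.Unmarshal(buf.Bytes(), &fields)
	return fields
}

func main() {
	f := emit("trace-abc")
	fmt.Println(f["trace_id"], f["version"], f["instance"]) // trace-abc 1.4.2 api-0
}
```

Using the same version and instance values as metric labels lets a spike on a dashboard be narrowed to one build and then to the log lines and traces that share its identifiers.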