Code-Memo

Performance & Scalability

Caching (HTTP caching headers)

Use Cache-Control, ETag, and Last-Modified for GETs that are safe to cache at CDNs or browsers.
Mark personalized responses private or no-store as appropriate.
Invalidate or shorten TTL when correctness beats freshness (balances, entitlements).

ETags and conditional requests

ETag lets clients send If-None-Match to skip large bodies when nothing changed (304).
Strong vs weak ETags matter for byte-identical semantics; document which you emit.
Combine with Range requests only when you fully understand intermediaries.

Rate limiting strategies

Token bucket for smooth bursts; sliding window for fairness; per-tenant quotas for SaaS.
Return structured quota headers (X-RateLimit-*, RateLimit-* drafts) when helpful.
Coordinate limits with API gateway and service-level budgets.

Bulk operations vs single requests

Bulk reduces round trips but increases payload size, timeouts, and partial failure complexity.
Prefer bounded batch sizes and per-item error reporting in the response.
For huge jobs, use 202 + async processing instead of multi-minute HTTP requests.

Payload optimization

Enable gzip/Brotli at the edge; trim unused fields with sparse fieldsets or GraphQL-like projections if supported.
Avoid N+1 chatty patterns; offer includes or dedicated aggregate reads when needed.
Watch serialization cost on hot paths (large JSON maps, deeply nested objects).