Code-Memo

Designing Facebook

✅ STEP 1: CLARIFY THE PROBLEM

Facebook is a social network that includes many features.

Core Modules:

Who Are the Users?

Product Goals

What does the business care about?

Constraints and Priorities

Constraint / Priority | Notes
Latency | Sub-200ms for most UI interactions (especially feed loading)
Scalability | Billions of users, posts, and likes; horizontally scalable architecture
Consistency | Relaxed for feed, stronger for friends or user settings
Availability | 99.99%+ SLA
Privacy & Security | Users control post visibility; robust auth and permissions
Write frequency | Lower than reads (more people consume than post)
Read frequency | Very high, especially for feed, likes, comments

Key Use Cases

  1. User creates account
  2. User logs in
  3. User sends/accepts friend requests
  4. User makes a post (text/image)
  5. User likes/comments on a post
  6. User sees a ranked feed of friend posts
  7. User receives notifications on interactions

High-Level Metrics to Support

Metric | Notes
Requests per second (RPS) | Feed reads could be 10–100× post writes
Feed latency | Target <200ms P95
Storage growth | TBs to PBs daily
Cache hit ratio | >90% for hot user/feed data
System uptime | 99.99% or better
Eventual consistency lag | Acceptable within seconds for feed and likes/comments
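
To make the 99.99% uptime target concrete, the arithmetic below converts an availability percentage into an allowed downtime budget. The function name `downtime_budget_minutes` is illustrative and not part of the original design.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the minutes of downtime allowed for a given availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


yearly = downtime_budget_minutes(99.99)        # roughly 52.6 minutes per year
monthly = downtime_budget_minutes(99.99, 30)   # roughly 4.3 minutes per 30 days
print(f"{yearly:.1f} min/year, {monthly:.1f} min/30 days")
```

In other words, a 99.99% target leaves less than an hour of total downtime per year, which is why the later steps lean on redundancy and automated recovery.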

Personas

✅ STEP 2: Define Functional Requirements

Functional requirements define what the system should do — features and behaviors that are visible to users and that drive the architecture.

Detailed Functional Requirements (User Stories)

User & Auth

Friend System

Post System

Like System

Comment System

News Feed

Notification System

System APIs (Sketch)

API Endpoint | Method | Description
/signup | POST | Register a new user
/login | POST | Log in and receive a token
/user/profile | GET/PUT | View or update profile
/friends/request/:id | POST | Send friend request
/friends/respond/:id | POST | Accept/reject friend request
/posts | POST | Create a new post
/posts/:id | DELETE | Delete a post
/feed | GET | Get personalized feed
/posts/:id/like | POST | Like/unlike post
/posts/:id/comments | GET/POST | View/add comments
/notifications | GET | View unread/read notifications

✅ STEP 3: Define Non-Functional Requirements (NFRs)

Non-functional requirements describe the system’s quality attributes: performance, scalability, reliability, availability, security, and more.

Scalability

Design Implications:

Availability

Design Implications:

Latency / Performance

Operation | Target Latency (P95)
Feed load | ≤ 200ms
Post submission | ≤ 500ms
Like/comment action | ≤ 150ms
Friend request | ≤ 100ms
Notification delivery | ≤ 1s
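
A P95 target means 95% of requests must finish within the limit. The sketch below shows one way to check measured latencies against the targets above, using the nearest-rank percentile definition. The dictionary keys and sample data are hypothetical.

```python
# Targets copied from the table above, in milliseconds.
P95_TARGETS_MS = {
    "feed_load": 200,
    "post_submission": 500,
    "like_comment": 150,
    "friend_request": 100,
    "notification_delivery": 1000,
}


def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers, 1-based
    return ordered[rank - 1]


def meets_target(operation, samples_ms):
    """True if the measured P95 is within the target for this operation."""
    return p95(samples_ms) <= P95_TARGETS_MS[operation]


# Hypothetical samples: 100 requests, 96 fast and 4 slow.
samples = [120] * 96 + [900] * 4
print(p95(samples), meets_target("feed_load", samples))
```

Note that a few very slow requests do not break a P95 target as long as they stay under 5% of traffic, which is why tail latency is often tracked separately (P99).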

Design Implications:

Consistency

Durability

Design Implications:

Security

Design Implications:

Privacy

Design Implications:

Maintainability & Modularity

Testability

Observability

Design Implications:

Cost Efficiency

Design Implications:

✅ STEP 4: Define System Interfaces / APIs

API Design Principles

Authentication APIs

Endpoint | Method | Description
/api/v1/signup | POST | Create a new account
/api/v1/login | POST | Log in, returns auth token
/api/v1/logout | POST | Invalidate session/token
/api/v1/refresh | POST | Refresh token (if used)

User Profile APIs

Endpoint | Method | Description
/api/v1/users/me | GET | Get current user’s profile
/api/v1/users/me | PUT | Update profile info
/api/v1/users/:id | GET | View another user’s profile

Friendship APIs

Endpoint | Method | Description
/api/v1/friends/requests | GET | View incoming/outgoing requests
/api/v1/friends/request/:user_id | POST | Send friend request
/api/v1/friends/respond/:user_id | POST | Accept/reject friend request
/api/v1/friends/list | GET | Get list of friends
/api/v1/friends/remove/:user_id | DELETE | Unfriend a user

Post APIs

Endpoint | Method | Description
/api/v1/posts | POST | Create a new post
/api/v1/posts/:id | GET | Get a specific post
/api/v1/posts/:id | DELETE | Delete your own post
/api/v1/users/:id/posts | GET | Get all posts by a user

News Feed APIs

Endpoint | Method | Description
/api/v1/feed | GET | Get personalized feed

Query Params:

Likes APIs

Endpoint | Method | Description
/api/v1/posts/:id/like | POST | Like or unlike a post
/api/v1/posts/:id/likes | GET | Get list/count of likes

Comments APIs

Endpoint | Method | Description
/api/v1/posts/:id/comments | GET | View comments on a post
/api/v1/posts/:id/comments | POST | Add a comment to a post
/api/v1/comments/:id | DELETE | Delete a comment

Notification APIs

Endpoint | Method | Description
/api/v1/notifications | GET | Get all notifications
/api/v1/notifications/:id/read | POST | Mark a notification as read

Search API

Endpoint | Method | Description
/api/v1/search/users | GET | Search for users by name/email

Internal Microservice APIs

These are not exposed to the frontend; they support internal service-to-service communication:

Media Upload APIs

✅ STEP 5: High-Level Architecture

High-level means we are not yet choosing specific tools, but rather defining the main components, their responsibilities, and how data and requests flow across the system.

Core Components

                    ┌────────────┐
                    │   Client   │ (Web/iOS/Android)
                    └─────┬──────┘
                          │
                          ▼
                 ┌─────────────────┐
                 │ API Gateway / LB│
                 └────┬───────┬────┘
                      ▼       ▼
     ┌────────────────────┐ ┌────────────────────┐
     │ Authentication Svc │ │  User/Profile Svc  │
     └────────────────────┘ └────────────────────┘
                      ▼
     ┌────────────────────┐ ┌────────────────────┐
     │ Friend Graph Svc   │ │     Post Svc       │
     └────────────────────┘ └────────────────────┘
                      ▼
     ┌────────────────────┐ ┌────────────────────┐
     │ Feed/Timeline Svc  │ │  Like/Comment Svc  │
     └────────────────────┘ └────────────────────┘
                      ▼
               ┌──────────────┐
               │ Notification │
               └─────┬────────┘
                     ▼
           ┌────────────────────┐
           │ Media/CDN Service  │
           └────────────────────┘

        (All backed by databases and caches)

Service Responsibilities

Service | Responsibility
API Gateway | Entry point, auth/token check, rate limit, routing
Auth Service | Login, signup, JWT issuing, token validation
User Service | Profile CRUD, avatar, bio, privacy settings
Friendship Service | Send/accept/remove friend requests, check mutuals
Post Service | Create/delete posts, store metadata and media links
Feed Service | Fan-out on write, ranking, retrieve paginated feed
Like/Comment Service | Handle post likes, unlikes, comments CRUD
Notification Service | Send real-time/email notifications, track read/unread
Media Service | Handles signed URLs, uploads to CDN, proxying

Supporting Infrastructure

Component | Role
Databases | Stores structured data (users, posts, likes, friendships)
Blob Storage | Stores images/videos (S3, GCS, etc.)
Cache Layer | Redis or Memcached: speeds up reads (feeds, posts, profiles)
CDN | Serves static content globally (profile pictures, post images)
Queue System | Kafka, RabbitMQ: async processing (notifications, feed generation)
Monitoring | Prometheus, Grafana, logs/metrics/tracing

Request Flows

News Feed Load (Fan-out model)

Client → API Gateway → Feed Service
         ↓
     Cache (User Feed Cache)
         ↓ (if cache miss)
     DB or Fan-out Store → Return top 20 posts → Client

Post Creation

Client → API Gateway → Post Service
                        ↓
                   Store in DB
                        ↓
            Push to Feed Service (via queue)
                        ↓
         Update friend feed caches or async storage
                        ↓
         Trigger Notifications via Notification Service

Friend Request Flow

Client → API Gateway → Friendship Service
                        ↓
                   Update DB
                        ↓
             Trigger Notification Service

Like Flow

Client → API Gateway → Like Service
                        ↓
                   Update DB
                        ↓
             Update like count in cache
                        ↓
        Notify post owner (optional)

Deployment Architecture

Scaling Strategy

Component | Scaling Strategy
API Gateway | Horizontally scale stateless proxies
Post/Feed DBs | Shard by user_id
Feed Service | Use async queues + precomputed feed cache
Media | Use CDN edge caching
Notification | Decouple via pub/sub queue
Friends | Graph partitioning if needed (for big users)

Caching Strategy

Cached Entity | Layer | TTL / Invalidation Strategy
User profile | Redis | 5 min TTL or on update
News feed | Redis | Precomputed and updated on new posts
Post details | Redis | 10–30 min TTL
Friends list | Redis | Cached and synced periodically
Likes/comments count | Redis | Async updates from DB or write-through
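
The "5 min TTL or on update" row for user profiles combines two invalidation paths: entries expire on their own, and a write deletes the cached copy immediately. A minimal sketch of that pattern follows, with a small in-memory `TTLCache` standing in for Redis and a plain dict standing in for the database. Both, and all function names, are assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for Redis with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def delete(self, key):
        self._data.pop(key, None)


PROFILE_TTL_SECONDS = 5 * 60  # the 5-minute TTL from the table above


def get_profile(cache, db, user_id):
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    key = f"profile:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    profile = db[user_id]
    cache.set(key, profile, PROFILE_TTL_SECONDS)
    return profile


def update_profile(cache, db, user_id, profile):
    """Write to the DB, then invalidate so the next read sees fresh data."""
    db[user_id] = profile
    cache.delete(f"profile:{user_id}")
```

The TTL bounds how stale a profile can get if an invalidation is ever missed, while the explicit delete keeps the common case fresh.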

ML & Ranking

If you want intelligent feeds:

✅ STEP 6: Database Design & Schema (ERD)

Key Entities

The main things we need to store:

Entity Relationship Diagram (ERD)

[User]
 ├─ user_id (PK)
 ├─ name
 ├─ email
 ├─ password_hash
 ├─ bio
 └─ avatar_url

[Friendship]
 ├─ user_id_1 (FK → User)
 ├─ user_id_2 (FK → User)
 ├─ status (pending, accepted)
 └─ created_at

[Post]
 ├─ post_id (PK)
 ├─ user_id (FK → User)
 ├─ text
 ├─ image_url
 ├─ visibility (public, friends)
 ├─ created_at
 └─ updated_at

[Like]
 ├─ like_id (PK)
 ├─ post_id (FK → Post)
 ├─ user_id (FK → User)
 └─ created_at

[Comment]
 ├─ comment_id (PK)
 ├─ post_id (FK → Post)
 ├─ user_id (FK → User)
 ├─ text
 └─ created_at

[Notification]
 ├─ notification_id (PK)
 ├─ user_id (FK → User)
 ├─ type (like, comment, friend_request)
 ├─ source_user_id (FK → User)
 ├─ post_id (optional FK → Post)
 ├─ seen (boolean)
 └─ created_at

[Media]  (optional if offloaded to S3/CDN)
 ├─ media_id (PK)
 ├─ post_id (FK → Post)
 ├─ url
 ├─ type (image, video)
 └─ uploaded_at

Relationship Summary

Relationship | Type
One User → Many Posts | 1:N
One Post → Many Comments | 1:N
One Post → Many Likes | 1:N
User ↔ User (Friendship) | Many-to-Many
One Post → Many Media (optional) | 1:N
One User → Many Notifications | 1:N

Schema Normalization Notes

Indexing Strategy

Table | Indexes
User | email (unique), user_id (PK)
Post | user_id, created_at
Like | post_id, user_id, (post_id, user_id) (unique)
Comment | post_id, created_at
Friendship | (user_id_1, user_id_2) (unique), status
Notification | user_id, seen, created_at
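
As a rough illustration of how a few of these indexes translate into DDL, the sketch below creates part of the schema in an in-memory SQLite database. SQLite is used only because it ships with Python; the table name `post_like` (chosen because `LIKE` is an SQL keyword) and the column types are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE user (
    user_id       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    email         TEXT NOT NULL UNIQUE,          -- unique index on email
    password_hash TEXT NOT NULL
);
CREATE TABLE post (
    post_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES user(user_id),
    text       TEXT,
    created_at TEXT NOT NULL
);
-- Supports "recent posts by this user" lookups.
CREATE INDEX idx_post_user_created ON post (user_id, created_at);
CREATE TABLE post_like (
    like_id    INTEGER PRIMARY KEY,
    post_id    INTEGER NOT NULL REFERENCES post(post_id),
    user_id    INTEGER NOT NULL REFERENCES user(user_id),
    created_at TEXT NOT NULL,
    UNIQUE (post_id, user_id)                    -- one like per user per post
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO user VALUES (1, 'Ada', 'ada@example.com', 'x')")
conn.execute("INSERT INTO post VALUES (10, 1, 'hello', '2025-07-01T12:00:00Z')")
conn.execute("INSERT INTO post_like VALUES (NULL, 10, 1, '2025-07-01T12:01:00Z')")


def second_like_is_rejected():
    """The unique (post_id, user_id) index blocks a duplicate like."""
    try:
        conn.execute("INSERT INTO post_like VALUES (NULL, 10, 1, '2025-07-01T12:02:00Z')")
    except sqlite3.IntegrityError:
        return True
    return False
```

The unique composite index doubles as a correctness guard: duplicate likes fail at the database level even if two requests race past application checks.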

Partitioning / Sharding Strategy

For scale, the following partitioning ideas help:

Table | Shard Key | Notes
Post | user_id | Most posts are read by friends
Comment | post_id | Comments only shown per-post
Like | post_id | Many likes per post
Friendship | user_id_1 | Graph partitioning later optional
Notification | user_id | Each user sees only their own
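
One simple way to apply these shard keys is hash-based routing: hash the shard key value and take it modulo the number of shards. The sketch below assumes 16 shards and uses SHA-256 so results are stable across processes. The shard count and function names are hypothetical.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count

# Shard keys copied from the table above.
SHARD_KEY_BY_TABLE = {
    "post": "user_id",
    "comment": "post_id",
    "like": "post_id",
    "friendship": "user_id_1",
    "notification": "user_id",
}


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key value to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(table: str, row: dict) -> int:
    """Pick the shard for a row based on its table's shard key."""
    return shard_for(str(row[SHARD_KEY_BY_TABLE[table]]))
```

Because comments and likes share `post_id` as their key, everything attached to one post lands on the same shard. A plain modulo scheme moves most keys when the shard count changes, so consistent hashing is a common alternative when resharding is expected.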

Feed Storage Strategy (Precomputed Feed)

You can store feed entries in a separate table:

[FeedEntry]
 ├─ user_id (owner of feed)
 ├─ post_id
 ├─ score (for ranking)
 └─ inserted_at
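
A minimal sketch of how a fan-out worker might populate such a table when a new post arrives. It keeps feeds in memory, uses the post timestamp as the score, and caps each feed at a fixed size. The class names, the cap, and the recency-only score are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedEntry:
    user_id: str       # owner of the feed
    post_id: str
    score: float       # used for ranking
    inserted_at: float


class FeedStore:
    """In-memory stand-in for the FeedEntry table."""

    def __init__(self, max_entries=500):
        self._feeds = defaultdict(list)
        self._max = max_entries

    def insert(self, entry):
        feed = self._feeds[entry.user_id]
        feed.append(entry)
        feed.sort(key=lambda e: e.score, reverse=True)  # best first
        del feed[self._max:]                            # drop the lowest-ranked overflow

    def top(self, user_id, limit=20):
        return [e.post_id for e in self._feeds[user_id][:limit]]


def fan_out(store, friends_of, author_id, post_id, created_at):
    """Write one FeedEntry per friend of the author (fan-out on write)."""
    for friend_id in friends_of.get(author_id, ()):
        store.insert(FeedEntry(friend_id, post_id, score=created_at, inserted_at=created_at))
```

Reads then become a cheap lookup of `store.top(user_id)`, at the cost of one write per friend for every new post.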

User Graph (Social Graph DB)

To support more complex relationships (e.g. mutual friends, friend recommendations):

Media Storage Strategy

✅ STEP 7: Component Design (Low-Level Design)

🎯 Goal:

Define how each core service works internally, including:

7.1 Post Service

Internal Components:

PostService
├── PostController         # Handles HTTP requests
├── PostManager            # Business logic (create, delete)
├── PostRepository         # Interface to Post DB
├── MediaServiceClient     # Upload images/videos
├── FeedPublisher          # Publishes new post event to Feed queue
└── NotificationPublisher  # Sends notification events

Sample Flow: Create Post

1. Controller receives POST /posts
2. Validates payload
3. Stores post in DB via PostRepository
4. Publishes "NewPostEvent" to Kafka → FeedService
5. Publishes a notification event for the author's friends (async)
6. Returns post ID and metadata
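
The flow above can be sketched as a single function with its collaborators passed in. The repository and publisher here are in-memory stand-ins (no real database or Kafka client), and every class and function name is hypothetical.

```python
import itertools


class PostValidationError(ValueError):
    """Raised when a post payload is empty."""


class InMemoryPostRepository:
    """Stand-in for PostRepository; stores rows in a dict."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.rows = {}

    def save(self, user_id, text, image_url):
        post_id = f"p{next(self._ids)}"
        self.rows[post_id] = {"user_id": user_id, "text": text, "image_url": image_url}
        return post_id


class ListPublisher:
    """Stand-in for a queue producer; records events in a list."""

    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)


def create_post(repo, feed_publisher, user_id, text="", image_url=None):
    # Step 2: validate the payload.
    if not text.strip() and not image_url:
        raise PostValidationError("a post needs text or an image")
    # Step 3: store the post.
    post_id = repo.save(user_id, text, image_url)
    # Step 4: publish the event; downstream consumers handle feeds and notifications.
    feed_publisher.publish({"type": "NewPostEvent", "user_id": user_id, "post_id": post_id})
    # Step 6: return the post ID and metadata.
    return {"post_id": post_id, "user_id": user_id}
```

Publishing after the database write means a crash between the two steps can lose the event. A transactional outbox is one common way to close that gap.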

Caching:

7.2 Feed Service

Internal Components:

FeedService
├── FeedController
├── FeedManager
├── FeedStore              # DB or Redis-based precomputed feed
├── RankerModule           # Sorts by recency + popularity
└── FeedFanOutWorker       # Listens to "NewPostEvent"

Sample Flow: Load Feed

1. Controller receives GET /feed?cursor=...
2. Checks Redis cache: feed:{user_id}
3. If cache miss:
   a. Pulls friend list
   b. Pulls N recent posts
   c. Ranks posts
   d. Stores result in cache
4. Returns top posts with pagination
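
A rough sketch of the cache-miss path above, with a dict as the cache and plain dicts for the friend list and post store. The scoring weights and the offset-style cursor are assumptions chosen to keep the example short.

```python
def score(post):
    """Hypothetical ranking: recency plus a popularity bonus per like."""
    return post["created_at"] + 60 * post["likes"]


def load_feed(cache, friends_of, posts_by_user, user_id,
              cursor=0, page_size=20, recent_n=100):
    key = f"feed:{user_id}"
    ranked = cache.get(key)                      # step 2: check the cache
    if ranked is None:                           # step 3: cache miss
        candidates = []
        for friend_id in friends_of.get(user_id, ()):                      # 3a: friend list
            candidates.extend(posts_by_user.get(friend_id, [])[-recent_n:])  # 3b: recent posts
        ranked = sorted(candidates, key=score, reverse=True)               # 3c: rank
        cache[key] = ranked                                                # 3d: store
    page = ranked[cursor:cursor + page_size]     # step 4: paginate
    end = cursor + page_size
    next_cursor = end if end < len(ranked) else None
    return page, next_cursor
```

An offset cursor can skip or repeat items when the feed changes between requests. A cursor built from the last item's score and ID is a more stable alternative.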

Caching:

7.3 Friendship Service

Internal Components:

FriendService
├── FriendController
├── FriendManager
├── FriendshipRepository
└── NotificationPublisher

Flow: Send Friend Request

1. User A sends request to B
2. Store in Friendship table: (A, B, pending)
3. Notify B via NotificationPublisher
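
The request flow implies a small state machine: no row, then pending, then accepted (or removed on rejection). The sketch below models it in memory and keys each row by the unordered pair of users so that (A, B) and (B, A) cannot both exist. Class and method names are hypothetical.

```python
class FriendshipError(Exception):
    """Raised for invalid friendship operations."""


class FriendshipStore:
    """In-memory stand-in for the Friendship table."""

    def __init__(self):
        self._rows = {}  # frozenset({a, b}) -> {"requester": ..., "status": ...}

    def send_request(self, from_id, to_id):
        if from_id == to_id:
            raise FriendshipError("cannot send a request to yourself")
        pair = frozenset((from_id, to_id))
        if pair in self._rows:
            raise FriendshipError("a request or friendship already exists")
        self._rows[pair] = {"requester": from_id, "status": "pending"}

    def respond(self, responder_id, requester_id, accept):
        pair = frozenset((responder_id, requester_id))
        row = self._rows.get(pair)
        if row is None or row["status"] != "pending" or row["requester"] != requester_id:
            raise FriendshipError("no pending request from that user")
        if accept:
            row["status"] = "accepted"
        else:
            del self._rows[pair]  # rejection removes the pending row

    def are_friends(self, a, b):
        row = self._rows.get(frozenset((a, b)))
        return row is not None and row["status"] == "accepted"
```

In a relational table, the same effect is often achieved by always storing the smaller ID in `user_id_1`, which lets the unique (user_id_1, user_id_2) index enforce one row per pair.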

Caching:

7.4 Like & Comment Service

Internal Components:

InteractionService
├── LikeController / CommentController
├── LikeManager / CommentManager
├── LikeRepository / CommentRepository
├── PostStatUpdater
└── NotificationPublisher

Flow: Like a Post

1. POST /posts/:id/like
2. Add entry in Like table
3. Update post's like count (write-through cache)
4. Send notification to post owner (optional)
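
Since the API describes this endpoint as "like or unlike", one way to sketch it is as a toggle that updates the like set and the cached count in the same call. The set plays the role of the unique (post_id, user_id) index, and the `count_cache` dict stands in for the cache. All names are assumptions.

```python
class LikeStore:
    """In-memory stand-in for the Like table plus a write-through count cache."""

    def __init__(self):
        self._likes = set()    # (post_id, user_id) pairs, at most one per user and post
        self.count_cache = {}  # post_id -> like count

    def toggle(self, post_id, user_id):
        """Like the post if not yet liked, otherwise unlike it. Returns the new state."""
        key = (post_id, user_id)
        if key in self._likes:
            self._likes.remove(key)
            liked = False
        else:
            self._likes.add(key)
            liked = True
        # Write-through: the cached count is updated together with the write.
        delta = 1 if liked else -1
        self.count_cache[post_id] = self.count_cache.get(post_id, 0) + delta
        return liked
```

A toggle is convenient for clients but is not idempotent under retries. Separate like and unlike operations are a common alternative when requests may be replayed.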

Caching:

7.5 Notification Service

Internal Components:

NotificationService
├── NotificationController
├── NotificationManager
├── NotificationRepository
└── EventConsumer (Kafka listener)

Flow: Handle Like Event

1. EventConsumer receives "PostLikedEvent"
2. Creates notification row in DB
3. Optionally pushes real-time notification (websocket)
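
A minimal sketch of the handler behind this flow. The event field names (`liker_id`, `post_owner_id`) are assumptions, since the original only names the event type. The notification row mirrors the Notification entity from Step 6, a list stands in for the table, and `push` stands in for the websocket layer.

```python
import time


def handle_event(event, notifications, push=None, now=time.time):
    """Turn a PostLikedEvent into a notification row; ignore other events."""
    if event.get("type") != "PostLikedEvent":
        return None
    if event["liker_id"] == event["post_owner_id"]:
        return None  # assumption: liking your own post does not notify you
    row = {
        "user_id": event["post_owner_id"],
        "type": "like",
        "source_user_id": event["liker_id"],
        "post_id": event["post_id"],
        "seen": False,
        "created_at": now(),
    }
    notifications.append(row)  # step 2: persist the notification
    if push is not None:
        push(row)              # step 3: optional real-time delivery
    return row
```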

Caching:

7.6 Media Service (Optional External)

Flow: Upload Image

  1. GET /upload-url
  2. Returns pre-signed S3 URL
  3. Client uploads image to S3
  4. Client POSTs the image URL with post content
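
To illustrate the idea behind a pre-signed URL, the sketch below signs an object key and expiry time with an HMAC and verifies them later. This is a simplified, generic scheme and not the actual S3 signing algorithm. The secret, host name, and parameter names are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"                    # hypothetical server-side signing key
BASE_URL = "https://uploads.example.com"   # hypothetical upload host


def _sign(object_key, expires_at):
    message = f"PUT\n{object_key}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def make_upload_url(object_key, now, ttl_seconds=300):
    """Steps 1-2: hand the client a URL that is valid for a limited time."""
    expires_at = int(now) + ttl_seconds
    query = urlencode({"expires": expires_at, "signature": _sign(object_key, expires_at)})
    return f"{BASE_URL}/{object_key}?{query}"


def verify_upload_url(url, now):
    """What the storage side would check before accepting the upload."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    object_key = parsed.path.lstrip("/")
    return hmac.compare_digest(signature, _sign(object_key, expires_at))
```

The benefit of this flow is that image bytes go straight from the client to blob storage, so the application servers only handle small metadata requests.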

7.7 User Service

Internal Components:

UserService
├── UserController
├── UserManager
└── UserRepository

Caching:

7.8 Auth Service

Internal Components:

AuthService
├── AuthController
├── TokenManager (JWT or session)
└── UserAuthRepository

7.9 Event & Queue System

Use Kafka or RabbitMQ to send events like:

Each service listens to only the events it needs.

Sample Event Structure

{
  "type": "PostCreated",
  "user_id": "u123",
  "post_id": "p456",
  "created_at": "2025-07-01T12:00:00Z"
}
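
A consumer would typically parse and validate such a payload before acting on it. The sketch below does this for the sample event above. The `PostCreated` dataclass and `parse_event` function are illustrative names.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PostCreated:
    user_id: str
    post_id: str
    created_at: datetime


def parse_event(raw: str) -> PostCreated:
    """Parse a JSON event and reject anything that is not a PostCreated event."""
    data = json.loads(raw)
    if data.get("type") != "PostCreated":
        raise ValueError(f"unsupported event type: {data.get('type')!r}")
    # Timestamps in the sample are UTC with a trailing "Z".
    created_at = datetime.strptime(data["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    return PostCreated(data["user_id"], data["post_id"], created_at.replace(tzinfo=timezone.utc))


sample = '{"type": "PostCreated", "user_id": "u123", "post_id": "p456", "created_at": "2025-07-01T12:00:00Z"}'
event = parse_event(sample)
```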

✅ STEP 8: Scaling & Optimization

8.1 Database Scaling

Challenges:

Strategies:

Sharding Strategies:

Replication:

8.2 Caching

Types:

Techniques:

8.3 Feed Generation Optimization

Fan-out on Write vs Fan-out on Read

Hybrid Approach:

8.4 Rate Limiting & Throttling

Prevent abuse and overload:
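
A token bucket is one common way to enforce per-user or per-IP limits. The sketch below is a single-process version with explicit timestamps so it is easy to test. The capacity and refill rate are arbitrary, and a production limiter would normally keep this state in a shared store.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)  # start full
        self.updated_at = now

    def allow(self, now, cost=1.0):
        """Return True and consume tokens if the request fits in the budget."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical limit: bursts of 2 requests, refilling 1 request per second.
bucket = TokenBucket(capacity=2, refill_per_second=1.0)
```

Requests that return False would typically be rejected with HTTP 429 at the API Gateway.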

8.5 Load Balancing

8.6 Asynchronous Processing

8.7 Hot Key Mitigation

8.8 Data Compression & Storage Optimization

8.9 CDN Usage

8.10 Monitoring & Auto-scaling

8.11 Security Optimizations


✅ STEP 9: Security & Privacy Considerations

Authentication & Authorization

Data Encryption

Input Validation & Injection Protection

Rate Limiting & Abuse Prevention

Privacy Controls & Compliance

Secure Logging & Monitoring

Infrastructure Security

Incident Response & Recovery

✅ STEP 10: Monitoring, Logging & Alerting

Monitoring

Goals: Track system health, performance, and usage metrics in real time.

Logging

Goals: Capture detailed logs for debugging, auditing, and compliance.

Alerting

Goals: Detect issues early and notify responsible teams instantly.

Distributed Tracing

Health Checks & Self-Healing

Monitoring Data Retention & Privacy

✅ STEP 11: Cost & Infrastructure Planning

Infrastructure Components & Cost Drivers

Component | Description | Cost Drivers
Compute | App servers, microservices | Number of instances, CPU, RAM, uptime
Database | SQL/NoSQL clusters | Storage size, IOPS, replication
Cache | Redis/Memcached clusters | Memory size, number of nodes
Storage | Blob storage (images, videos) | Data size, bandwidth, PUT/GET requests
Network | Data transfer (between regions, CDN usage) | Egress bandwidth, number of requests
CDN | Content delivery network | Requests served, bandwidth
Messaging Queues | Kafka, RabbitMQ | Throughput, retention time
Monitoring & Logging | Metrics, logs, tracing | Data volume stored, query volume

Cost Optimization Strategies

Infrastructure Sizing Guidelines (Example)

Service | Initial Setup | Scaling Metric
API Servers | 10 instances (4 vCPU, 16GB RAM) | Requests per second, CPU load
Database Cluster | 3-node primary + 3 replicas | Storage & read/write throughput
Cache Cluster | 5-node Redis cluster (128GB total RAM) | Cache hit ratio, memory usage
Blob Storage | S3 or equivalent | Total GB stored, monthly requests
Message Queue | Kafka cluster with 5 brokers | Messages per second, retention

Budgeting & Forecasting

Multi-Region Deployment Costs

Disaster Recovery & Backup Costs

Team & Operational Costs

✅ STEP 12: Testing & Deployment Strategies

Testing Strategies

Goal: Ensure quality, catch bugs early, and maintain system stability.

A. Unit Testing

B. Integration Testing

C. End-to-End (E2E) Testing

D. Load & Stress Testing

E. Security Testing

F. Chaos Engineering (Advanced)

Continuous Integration (CI)

Continuous Deployment / Delivery (CD)

Deployment Architecture

Rollback & Recovery

Monitoring Post-Deployment

Documentation & Runbooks