Code-Memo

Designing Twitter

✅ STEP 1: CLARIFY THE PROBLEM

Twitter is a microblogging and social networking platform focused on short messages called tweets, which can include text, media (images, videos), and links.

Core Features:

Who Are the Users?

Product Goals / Business Goals

Constraints and Priorities

| Constraint / Priority | Notes |
| --- | --- |
| Latency | Tweets and timeline updates should appear with minimal delay (<200 ms ideal) |
| Scalability | Millions of requests per second at peak, billions of total tweets |
| Consistency | Strong consistency for tweet posting and follow/unfollow; eventual consistency acceptable for timelines |
| Availability | 99.99% SLA or better |
| Rate limiting | Prevent spam and abuse with rate limits and throttling |
| Privacy & Security | Protect user data, allow private accounts, enforce content policies |
| Read/Write Ratio | More balanced than Facebook; tweets are more frequent |
| Real-time delivery | Critical for notifications, mentions, trends |

Key Use Cases

  1. User creates account and logs in
  2. User posts a tweet (with optional media)
  3. User follows/unfollows others
  4. User views their home timeline with recent tweets from followed users
  5. User likes, retweets, or replies to tweets
  6. User searches tweets, users, or hashtags
  7. User views notifications (mentions, likes, retweets)
  8. User views trending topics and hashtags
  9. User sends/receives Direct Messages (optional)
  10. User manages profile, privacy settings

High-Level Metrics to Support

| Metric | Notes |
| --- | --- |
| Requests per second (RPS) | Millions of requests per second at peak |
| Timeline load latency | <200 ms P95 |
| Tweet posting latency | <500 ms |
| Like/retweet latency | <150 ms |
| System uptime | 99.99%+ |
| Fan-out scale | Millions of followers for top users |
| Cache hit ratio | >90% for hot timelines and tweets |
| Data volume growth | TBs per day, petabytes over time |

Personas

| Persona | Behavior and Needs |
| --- | --- |
| Casual user | Reads timeline, tweets occasionally |
| Power user | Tweets often, engages with many tweets |
| Celebrity/influencer | Millions of followers, high fan-out challenge |
| Brand/organization | Scheduled tweets, analytics, high engagement |
| Bot/spammer | Automated tweeting and abuse; rate limiting and detection needed |
| New user | No followers yet, cold-start timeline challenge |

Discussion & Questions to Clarify Further

✅ STEP 2: Define Functional Requirements

1. User & Authentication

2. Follow System

3. Tweet System

4. Engagement System

5. Timeline / Feed System

6. Notifications System

7. Search System

8. Direct Messaging (DM) [Optional for MVP]

9. Media Handling

10. Account Settings & Privacy

11. Moderation & Abuse Prevention

12. Analytics & Insights (Optional)

13. APIs (public/internal)

Sample API Endpoints Sketch

| Endpoint | Method | Description |
| --- | --- | --- |
| /signup | POST | Register new user |
| /login | POST | User login |
| /users/:id | GET/PUT | Get/update user profile |
| /users/:id/follow | POST | Follow a user |
| /users/:id/unfollow | POST | Unfollow a user |
| /tweets | POST | Post a new tweet |
| /tweets/:id | GET/DELETE | Get/delete a tweet |
| /tweets/:id/like | POST | Like/unlike a tweet |
| /tweets/:id/retweet | POST | Retweet or undo retweet |
| /tweets/:id/reply | POST | Reply to a tweet |
| /timeline/home | GET | Get user's home timeline |
| /notifications | GET | Get user's notifications |
| /search | GET | Search tweets/users/hashtags |
| /messages | POST/GET | Send and get DMs |
| /media/upload | POST | Upload media |

✅ STEP 3: Define Non-Functional Requirements (NFRs)

1. Scalability

2. Availability

3. Latency / Performance

| Operation | Target Latency (P95) |
| --- | --- |
| Tweet posting | < 500 ms |
| Timeline loading | < 200 ms |
| Like/retweet actions | < 150 ms |
| Follow/unfollow | < 100 ms |
| Notification delivery | < 1 second (near real-time) |
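
As a rough illustration of how these P95 targets could be checked against measured latencies, the sketch below uses a nearest-rank percentile. The operation keys in `TARGETS_MS` and the helper names `p95` and `meets_target` are hypothetical, chosen only for this example; a real deployment would more likely read percentiles from its metrics system.

```python
# P95 targets from the table above, in milliseconds (key names are illustrative).
TARGETS_MS = {
    "tweet_post": 500,
    "timeline_load": 200,
    "like_retweet": 150,
    "follow_unfollow": 100,
    "notification_delivery": 1000,
}

def p95(samples_ms):
    """Nearest-rank 95th percentile of a non-empty list of latencies."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding surprises
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def meets_target(operation, samples_ms):
    """True if the measured P95 is strictly below the target for the operation."""
    return p95(samples_ms) < TARGETS_MS[operation]

# Usage: one slow request out of 20 does not move the P95, two do.
one_outlier = [120] * 19 + [900]
two_outliers = [120] * 18 + [900, 900]
print(meets_target("timeline_load", one_outlier))
print(meets_target("timeline_load", two_outliers))
```

The point of targeting P95 rather than the mean is visible here: a single outlier in twenty requests is tolerated, while a slow tail of 10% is not.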

4. Consistency

5. Durability

6. Security

7. Privacy

8. Maintainability & Modularity

9. Testability

10. Observability

11. Cost Efficiency

✅ STEP 4: Define System Interfaces / APIs

General Guidelines

1. User & Authentication APIs

| Endpoint | Method | Request Body / Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/signup | POST | { email, username, password, name } | { userId, token, profile } | Register new user |
| /api/v1/login | POST | { email/username, password } | { token, user } | Login, returns JWT token |
| /api/v1/logout | POST | (Auth token in header) | { message: "Logged out" } | Logout user |
| /api/v1/password-reset/request | POST | { email } | { message: "Reset email sent" } | Request password reset email |
| /api/v1/password-reset/confirm | POST | { token, newPassword } | { message: "Password reset successful" } | Confirm password reset |
| /api/v1/users/:id | GET | (Auth token) | { userProfile } | Get user profile |
| /api/v1/users/:id | PUT | { name, bio, location, website, avatarUrl, privacySettings } | { updatedProfile } | Update user profile |
| /api/v1/users/:id/follow | POST | (Auth token) | { message: "Followed user" } | Follow user |
| /api/v1/users/:id/unfollow | POST | (Auth token) | { message: "Unfollowed user" } | Unfollow user |
| /api/v1/users/:id/followers | GET | ?cursor=xxx&limit=20 | { followers: [...], nextCursor } | Get user followers (paginated) |
| /api/v1/users/:id/following | GET | ?cursor=xxx&limit=20 | { following: [...], nextCursor } | Get users this user follows (paginated) |
| /api/v1/users/:id/block | POST | (Auth token) | { message: "User blocked" } | Block user |
| /api/v1/users/:id/unblock | POST | (Auth token) | { message: "User unblocked" } | Unblock user |

2. Tweet APIs

| Endpoint | Method | Request Body / Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/tweets | POST | { text, mediaUrls[], poll?, replyToTweetId?, location? } | { tweetId, createdAt, tweetData } | Create new tweet |
| /api/v1/tweets/:id | GET | (Auth token optional, depends on tweet visibility) | { tweetData } | Get tweet by ID |
| /api/v1/tweets/:id | DELETE | (Auth token) | { message: "Tweet deleted" } | Delete tweet (owner only) |
| /api/v1/tweets/:id/like | POST | (Auth token) | { message: "Tweet liked" } | Like or unlike a tweet (toggle) |
| /api/v1/tweets/:id/retweet | POST | { comment? } | { message: "Retweeted", retweetId } | Retweet or quote retweet |
| /api/v1/tweets/:id/replies | GET | ?cursor=xxx&limit=20 | { replies: [...], nextCursor } | Get replies to a tweet (paginated) |
| /api/v1/tweets/:id/engagements | GET | (none) | { likeCount, retweetCount, replyCount } | Get engagement counts |
| /api/v1/tweets/:id/likers | GET | ?cursor=xxx&limit=20 | { users: [...], nextCursor } | Get users who liked a tweet (paginated) |

3. Timeline / Feed APIs

| Endpoint | Method | Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/timeline/home | GET | `?cursor=xxx&limit=50&filter=media\|replies\|retweets` | { tweets: [...], nextCursor } | Get home timeline (tweets from followed users, ranked or chronological) |
| /api/v1/timeline/user/:userId | GET | ?cursor=xxx&limit=50 | { tweets: [...], nextCursor } | Get tweets by a specific user |
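
The paginated endpoints above all take a `cursor` and return a `nextCursor`. The notes do not say how the cursor is encoded, so the following is only a minimal sketch under stated assumptions: the cursor is an opaque base64 token wrapping the last-seen tweet ID, and tweets are served newest first by descending ID. The helpers `encode_cursor`, `decode_cursor`, and `page_timeline` are hypothetical names.

```python
import base64
import json

def encode_cursor(last_tweet_id):
    """Pack the last-seen tweet ID into an opaque, URL-safe token."""
    raw = json.dumps({"last_id": last_tweet_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_cursor(cursor):
    """Recover the last-seen tweet ID from a token made by encode_cursor."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    return json.loads(raw)["last_id"]

def page_timeline(tweets, cursor=None, limit=50):
    """Return one page plus the cursor for the next page.

    `tweets` is assumed to be sorted by descending tweet ID (newest first).
    """
    if cursor is not None:
        last_id = decode_cursor(cursor)
        tweets = [t for t in tweets if t["id"] < last_id]
    page = tweets[:limit]
    has_more = len(tweets) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return {"tweets": page, "nextCursor": next_cursor}

# Usage: walk five tweets two at a time.
all_tweets = [{"id": i, "text": f"tweet {i}"} for i in range(5, 0, -1)]
first = page_timeline(all_tweets, limit=2)
second = page_timeline(all_tweets, cursor=first["nextCursor"], limit=2)
third = page_timeline(all_tweets, cursor=second["nextCursor"], limit=2)
```

Keying the cursor on an ID rather than an offset keeps pages stable when new tweets arrive at the head of the timeline between requests.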

4. Notification APIs

| Endpoint | Method | Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/notifications | GET | ?cursor=xxx&limit=50 | { notifications: [...], nextCursor } | Get notifications for logged-in user |
| /api/v1/notifications/:id/read | POST | (none) | { message: "Notification marked as read" } | Mark notification as read |

5. Search APIs

| Endpoint | Method | Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/search | GET | `?q=keyword&type=tweets\|users\|hashtags&limit=20&cursor=xxx` | { results: [...], nextCursor } | Search tweets, users, or hashtags |

6. Direct Message APIs (Optional MVP)

| Endpoint | Method | Request Body / Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/messages | POST | { recipientId, messageText, mediaUrl? } | { messageId, timestamp } | Send a DM |
| /api/v1/messages | GET | ?cursor=xxx&limit=50&chatWithUserId=xxx | { messages: [...], nextCursor } | Fetch DM history |
| /api/v1/messages/:id | DELETE | (none) | { message: "Message deleted" } | Delete a message |

7. Media APIs

| Endpoint | Method | Request Body / Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/media/upload | POST | Multipart form data: image/video/gif file | { mediaUrl, mediaId, metadata } | Upload media (used in tweets/DMs) |

8. Account Settings & Privacy APIs

| Endpoint | Method | Request Body / Query Parameters | Response | Description |
| --- | --- | --- | --- | --- |
| /api/v1/account/privacy | GET | (none) | { privacySettings } | Get user privacy settings |
| /api/v1/account/privacy | PUT | { isPrivate, mutedUsers[], blockedUsers[], notificationPrefs } | { updatedSettings } | Update privacy and notification preferences |

API Error Handling Examples

| HTTP Status | Meaning | Response Body Example |
| --- | --- | --- |
| 200 | Success | { "status": "ok", "data": {...} } |
| 400 | Bad Request (validation) | { "error": "Invalid tweet content" } |
| 401 | Unauthorized (no token) | { "error": "Authentication required" } |
| 403 | Forbidden (access denied) | { "error": "Not allowed to delete this tweet" } |
| 404 | Not Found | { "error": "Tweet not found" } |
| 429 | Too Many Requests (rate limit) | { "error": "Rate limit exceeded" } |
| 500 | Server error | { "error": "Internal server error" } |
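
One way to produce these response shapes consistently is to raise a single typed error inside handlers and translate it at the edge. The sketch below is illustrative, not a prescribed design: `ApiError`, `handle`, `delete_tweet`, and the in-memory `TWEETS` store are all hypothetical stand-ins.

```python
class ApiError(Exception):
    """Error carrying the HTTP status and message to return to the client."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status
        self.message = message

def handle(request_fn):
    """Run a handler and convert its outcome into (status, body)."""
    try:
        return 200, {"status": "ok", "data": request_fn()}
    except ApiError as err:
        return err.status, {"error": err.message}
    except Exception:
        # Unexpected failures never leak internals to the client.
        return 500, {"error": "Internal server error"}

# Hypothetical in-memory tweet store for the example.
TWEETS = {1: {"owner": "alice", "text": "hello"}}

def delete_tweet(tweet_id, requester):
    if requester is None:
        raise ApiError(401, "Authentication required")
    tweet = TWEETS.get(tweet_id)
    if tweet is None:
        raise ApiError(404, "Tweet not found")
    if tweet["owner"] != requester:
        raise ApiError(403, "Not allowed to delete this tweet")
    del TWEETS[tweet_id]
    return {"message": "Tweet deleted"}

# Usage: the same handler yields each status from the table.
unauthenticated = handle(lambda: delete_tweet(1, None))
forbidden = handle(lambda: delete_tweet(1, "bob"))
missing = handle(lambda: delete_tweet(2, "alice"))
deleted = handle(lambda: delete_tweet(1, "alice"))
```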

✅ STEP 5: High-Level Architecture

Key Goals of Architecture

1. Major Components Overview

| Component | Responsibility |
| --- | --- |
| Client Apps | Web, iOS, Android apps, third-party clients consuming APIs |
| API Gateway | Entry point for all client requests; handles auth, routing, rate limiting |
| User Service | Manages user profiles, authentication, privacy settings |
| Follow Service | Manages follow/unfollow relationships, blocks, mutes |
| Tweet Service | Creates, reads, updates, and deletes tweets, including media metadata |
| Timeline Service | Generates and serves user timelines (feeds), supports fan-out/fan-in models |
| Engagement Service | Handles likes, retweets, replies, counts, notifications |
| Notification Service | Delivers notifications (push, email, in-app) |
| Search Service | Indexes tweets, users, hashtags for search queries |
| Media Service | Stores, processes, and serves images, videos, GIFs |
| Direct Message Service (optional) | Manages private messaging |
| Admin & Moderation Tools | Content reporting, abuse detection, user banning |
| Analytics Service | Tracks metrics and user behavior for insights |
| Caching Layer | Redis or Memcached clusters for hot data (timelines, tweets) |
| Database Layer | Multiple databases for user data, tweets, relationships, engagements |
| CDN | Content delivery network for media and static content |
| Message Queue | Kafka or RabbitMQ for asynchronous processing (fan-out, notifications) |
| Monitoring & Logging | Observability infrastructure for metrics, tracing, alerts |

2. High-Level Architecture Diagram (Conceptual)

  +--------------------+
  |   Client Apps      | <-- iOS, Android, Web, 3rd party
  +--------------------+
            |
            v
  +--------------------+
  |    API Gateway     | -- Auth, Routing, Rate Limiting
  +--------------------+
     |       |       |       \
     v       v       v        v
+-------+ +-------+ +--------+ +---------+
| User  | |Tweet  | |Follow  | |Timeline |   <-- Microservices
|Service| |Service| |Service | |Service  |
+-------+ +-------+ +--------+ +---------+
     |        |         |          |
     |        |         |          v
     |        |         |   +-------------+
     |        |         |   | Cache (Redis)|
     |        |         |   +-------------+
     |        |         |          |
     |        |         |          v
     |        |         |   +-------------+
     |        |         |   | Databases    | -- Users DB, Tweets DB, Follows DB
     |        |         |   +-------------+
     |        |         |
     |        |         +----------------+
     |        |                          |
     |        |                          v
     |        |                  +---------------+
     |        |                  | Message Queue | <-- Kafka/RabbitMQ
     |        |                  +---------------+
     |        |                          |
     |        |                          v
     |        |                 +--------------------+
     |        |                 | Notification Service|
     |        |                 +--------------------+
     |        |
     |        +----------------------------------+
     |                                       |
     v                                       v
+--------------+                      +----------------+
| Media Service|                      | Search Service |
+--------------+                      +----------------+

3. Data Stores and Technologies (Example Choices)

| Service | Data Store | Notes |
| --- | --- | --- |
| User Service | Relational DB (Postgres) | Strong consistency, complex queries |
| Tweet Service | Distributed NoSQL (Cassandra, DynamoDB) | High write throughput, time-series data |
| Follow Service | Graph DB (Neo4j, RedisGraph) or relational | Efficient follower/following queries |
| Timeline Service | Cache-heavy (Redis), also reads from Tweet DB | Precompute timelines or fan-out on read |
| Engagement Service | NoSQL or key-value store | For like and retweet counts |
| Notification Service | NoSQL or queue system | For event-driven notifications |
| Media Service | Object storage (S3, GCS) | Scalable media storage and CDN integration |
| Search Service | Elasticsearch | Full-text search, hashtag, user search |
| Message Queue | Kafka, RabbitMQ | Asynchronous processing and decoupling |

4. Communication Patterns

5. Fan-out Strategies for Timeline Generation
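
The tables above mention both precomputed timelines and fan-out on read, and flag celebrity accounts as the hard case. One common way to combine the two is a hybrid: push each tweet into followers' cached timelines for ordinary authors, and pull tweets from high-follower authors at read time. The sketch below is a toy, in-memory illustration of that idea, not a description of any production system. `CELEBRITY_THRESHOLD`, `TIMELINE_SIZE`, and all function names are assumptions, and tweet IDs are assumed to increase over time.

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 3   # hypothetical cutoff; a real system would use a far larger value
TIMELINE_SIZE = 100       # hypothetical cap on cached entries per user

followers = defaultdict(set)           # author -> users following them
following = defaultdict(set)           # user -> authors they follow
tweets_by_author = defaultdict(list)   # author -> [(tweet_id, text), ...]
home_cache = defaultdict(lambda: deque(maxlen=TIMELINE_SIZE))  # user -> pushed entries

def follow_author(user, author):
    followers[author].add(user)
    following[user].add(author)

def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD

def post_tweet(author, tweet_id, text):
    """Store the tweet; fan out on write only for non-celebrity authors."""
    tweets_by_author[author].append((tweet_id, text))
    if is_celebrity(author):
        return  # followers pull these tweets at read time instead
    for user in followers[author]:
        home_cache[user].appendleft((tweet_id, text))

def home_timeline(user, limit=50):
    """Merge pushed entries with tweets pulled from followed celebrities."""
    entries = dict(home_cache[user])  # tweet_id -> text, which also de-duplicates
    for author in following[user]:
        if is_celebrity(author):
            entries.update(tweets_by_author[author])
    newest_first = sorted(entries.items(), key=lambda item: item[0], reverse=True)
    return newest_first[:limit]

# Usage: "star" has three followers, so its tweets are pulled rather than pushed.
for fan in ("a", "b", "c"):
    follow_author(fan, "star")
follow_author("a", "bob")
post_tweet("bob", 1, "hi")
post_tweet("star", 2, "big news")
timeline_a = home_timeline("a")
timeline_b = home_timeline("b")
```

The write cost for a celebrity tweet stays constant regardless of follower count, at the price of an extra merge on every read by their followers.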

6. Additional Considerations

✅ STEP 6: Database Design & Schema (ERD)

Key Design Considerations

Core Entities and Relations

1. User

| Field | Type | Notes |
| --- | --- | --- |
| user_id (PK) | UUID / bigint | Primary key |
| username | varchar(15) | Unique, indexed |
| email | varchar | Unique |
| password_hash | varchar | Secure hashed password |
| display_name | varchar(50) | User's full name |
| bio | text | Nullable |
| location | varchar | Nullable |
| website_url | varchar | Nullable |
| avatar_url | varchar | Nullable |
| is_private | boolean | Account privacy setting |
| created_at | timestamp | Account creation timestamp |
| updated_at | timestamp | Profile last update |
| deleted_at | timestamp | Soft delete |

2. Followers / Followings

| Field | Type | Notes |
| --- | --- | --- |
| follower_id (PK) | UUID / bigint | User who follows |
| followed_id (PK) | UUID / bigint | User who is followed |
| is_approved | boolean | For private accounts |
| created_at | timestamp | When follow created |

Composite primary key: (follower_id, followed_id)
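
To make the composite key concrete, here is a small sketch that uses SQLite purely as a stand-in for whichever relational store is chosen. The table name `follows`, the reverse index, and the helpers `add_follow` and `followers_of` are assumptions for illustration; integer IDs are used instead of UUIDs for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE follows (
        follower_id INTEGER NOT NULL,
        followed_id INTEGER NOT NULL,
        is_approved INTEGER NOT NULL DEFAULT 1,
        created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (follower_id, followed_id)
    )
""")
# The primary key serves "whom does X follow"; this index serves "who follows X".
conn.execute("CREATE INDEX idx_follows_followed ON follows (followed_id, follower_id)")

def add_follow(follower_id, followed_id):
    """Idempotent follow: returns True only if a new row was inserted."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO follows (follower_id, followed_id) VALUES (?, ?)",
        (follower_id, followed_id),
    )
    return cur.rowcount == 1

def followers_of(user_id):
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followed_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]

# Usage: a repeated follow is a no-op rather than an error.
first_attempt = add_follow(1, 2)
second_attempt = add_follow(1, 2)
add_follow(3, 2)
```

Because the pair itself is the key, duplicate follows are rejected by the database rather than by application code.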

3. Tweets

| Field | Type | Notes |
| --- | --- | --- |
| tweet_id (PK) | UUID / bigint | Primary key |
| user_id (FK) | UUID / bigint | Author of tweet |
| text | varchar(280) | Tweet content |
| created_at | timestamp | Timestamp |
| in_reply_to_id | UUID / bigint | Nullable; set if the tweet is a reply to another tweet |
| is_deleted | boolean | Soft delete |
| sensitive_content | boolean | Flag for content warning |
| language | varchar(10) | Language code |
| visibility | enum | 'public', 'followers', 'private' |

4. Tweet Media

| Field | Type | Notes |
| --- | --- | --- |
| media_id (PK) | UUID / bigint | Primary key |
| tweet_id (FK) | UUID / bigint | Associated tweet |
| media_url | varchar | URL to media in object storage |
| media_type | enum | 'image', 'video', 'gif' |
| order | int | Display order |

5. Likes

| Field | Type | Notes |
| --- | --- | --- |
| tweet_id (PK) | UUID / bigint | Tweet liked |
| user_id (PK) | UUID / bigint | User who liked |
| created_at | timestamp | When liked |

6. Retweets

| Field | Type | Notes |
| --- | --- | --- |
| retweet_id (PK) | UUID / bigint | Primary key |
| original_tweet_id | UUID / bigint | Tweet being retweeted |
| user_id | UUID / bigint | Who retweeted |
| comment | varchar(280) | Optional quote retweet comment |
| created_at | timestamp | When retweeted |

7. Replies

Replies are stored as tweets whose in_reply_to_id links to the parent tweet. For fast lookup of a tweet's replies, a separate mapping table can be kept:

| Field | Type | Notes |
| --- | --- | --- |
| reply_id (PK) | UUID / bigint | Primary key |
| tweet_id (FK) | UUID / bigint | Parent tweet |
| reply_tweet_id | UUID / bigint | Reply tweet |

8. Notifications

| Field | Type | Notes |
| --- | --- | --- |
| notification_id (PK) | UUID / bigint | Primary key |
| user_id (FK) | UUID / bigint | User to notify |
| type | enum | 'follow', 'like', 'retweet', 'reply', 'mention' |
| source_user_id | UUID / bigint | User who triggered notification |
| tweet_id (nullable) | UUID / bigint | Related tweet if applicable |
| created_at | timestamp | When notification created |
| is_read | boolean | Read/unread flag |

9. Blocks / Mutes

| Field | Type | Notes |
| --- | --- | --- |
| user_id (PK) | UUID / bigint | User performing block/mute |
| blocked_user_id (PK) | UUID / bigint | User being blocked/muted |
| is_block | boolean | true=block, false=mute |
| created_at | timestamp | When action occurred |

10. Direct Messages (Optional)

| Field | Type | Notes |
| --- | --- | --- |
| message_id (PK) | UUID / bigint | Primary key |
| sender_id (FK) | UUID / bigint | User sending |
| recipient_id (FK) | UUID / bigint | User receiving |
| text | text | Message content |
| media_url | varchar | Optional media URL |
| created_at | timestamp | Timestamp |
| is_deleted | boolean | Soft delete |

ERD Diagram (Simplified)

User ---< Followers >--- User
  |
  |---< Tweet >---< TweetMedia
  |            \
  |             ---< Like >--- User
  |             ---< Retweet >--- User
  |             ---< Reply >--- Tweet (self-join)
  |
  |---< Notification >--- User (source_user_id FK)
  |
  |---< BlockMute >--- User (blocked_user_id FK)
  |
  |---< DirectMessage >--- User (sender & recipient)

Indexes & Keys

✅ STEP 8: Scaling & Optimization

Goals for Scaling & Optimization

1. Scaling Strategies Overview

| Aspect | Strategy |
| --- | --- |
| Compute | Horizontal scaling via stateless microservices |
| Data Storage | Partitioning (sharding), replication, caching |
| Network | Load balancing, CDNs for media |
| Messaging | Distributed message queues (Kafka) |
| Caching | Multi-layered caching (Redis, CDN) |
| Data Processing | Asynchronous and batch processing |

2. Compute & Microservices Scaling

3. Database Scaling

4. Caching Optimization
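
Earlier sections name Redis as the cache for hot timelines and tweets and set a hit-ratio goal above 90%. A minimal cache-aside sketch, using an in-memory dictionary as a stand-in for Redis, might look like the following. `TtlCache`, `get_or_load`, and `load_tweet` are hypothetical names, and the 60-second TTL is an arbitrary example value.

```python
import time

class TtlCache:
    """In-memory stand-in for a Redis-style cache with per-key expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}   # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        """Cache-aside read: serve from cache, else load from the backing store and cache it."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = loader(key)
        self.store[key] = (now + self.ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying tweet is deleted."""
        self.store.pop(key, None)

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# Usage: ten reads of one hot tweet reach the "database" only once.
db_reads = []

def load_tweet(tweet_id):
    db_reads.append(tweet_id)  # stands in for a real database query
    return {"id": tweet_id, "text": "hello"}

cache = TtlCache(ttl_seconds=60)
for _ in range(10):
    cache.get_or_load(42, load_tweet)
```

Tracking hits and misses alongside the cache makes the hit-ratio target directly observable.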

5. Fan-out Optimization for Timelines

6. Load Balancing & Traffic Routing

7. Asynchronous Processing & Queues

8. Data Compression & Storage Optimization

9. Network Optimization

10. Monitoring & Autoscaling Feedback Loop

11. Cost Optimization

✅ STEP 9: Security & Privacy Considerations

Security and privacy controls protect user data, maintain trust, and help meet legal requirements.

1. Authentication & Authorization

2. Data Protection

3. Privacy Controls & User Settings

4. Abuse Prevention & Rate Limiting
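
Rate limiting appears in the constraints and in the 429 error response, but no algorithm is specified. A token bucket is one widely used option; the sketch below assumes it purely for illustration. `TokenBucket`, its parameters, and the injected `clock` are hypothetical, and a real deployment would typically keep bucket state in a shared store rather than in process memory.

```python
import time

class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self, cost=1):
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated_at
        # Refill for the time that has passed, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

# Usage with a fake clock so the behavior is deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=3, refill_per_second=1, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]   # the fourth request exceeds the burst size
fake_now[0] += 2.0                           # two seconds pass, two tokens refill
after_wait = bucket.allow()
```

A caller such as the API gateway would translate a `False` result into the 429 "Rate limit exceeded" response.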

5. API Security

6. Infrastructure & Network Security

7. Logging & Monitoring for Security

9. Backup & Disaster Recovery

✅ STEP 10: Monitoring, Logging & Alerting

1. Objectives

2. Monitoring

Types of Metrics:

Tools:

Best Practices:

3. Logging

Types of Logs:

Logging Infrastructure:

4. Distributed Tracing

5. Alerting

6. Incident Management & Postmortems

7. Security Monitoring

8. Capacity Planning & Forecasting

✅ STEP 11: Cost & Infrastructure Planning

1. Infrastructure Components to Budget For

| Component | Description | Cost Factors |
| --- | --- | --- |
| Compute Resources | Microservice servers, API gateways | Number of instances, CPU, memory, uptime |
| Database Storage | Relational DB, NoSQL DB, backups | Storage volume, IOPS, replication |
| Caching Layer | Redis/Memcached clusters | Memory usage, cluster size |
| Messaging Queues | Kafka or RabbitMQ | Throughput, partitions, retention duration |
| CDN & Media Storage | Serving images, videos, static assets | Data transfer, storage volume |
| Network | Load balancers, bandwidth costs | Data egress, regional routing |
| Monitoring & Logging | Metrics storage, log aggregation | Volume of data ingested and stored |
| Security Services | WAF, DDoS protection | Protection tier, traffic volume |

2. Cost Optimization Strategies

3. Cost Estimation Example (Very Rough)

| Resource | Unit Cost Example | Estimated Usage | Monthly Cost Estimate |
| --- | --- | --- | --- |
| EC2 Instances | $0.10/hr (t3.medium) | 50 instances * 24 * 30 hrs | ~$3,600 |
| Database Storage | $0.10/GB-month (SSD) | 10 TB | ~$1,000 |
| Redis Cache | $0.20/GB-month | 1 TB | ~$200 |
| Kafka Cluster | $0.15/hr per broker | 5 brokers | ~$540 |
| CDN (Data Transfer) | $0.08 per GB | 50 TB | ~$4,000 |
| Logging Storage | $0.03 per GB | 10 TB | ~$300 |
| Network Egress | $0.05 per GB | 20 TB | ~$1,000 |
| Security (WAF, DDoS) | Fixed + usage-based | Depends | $500+ |
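
The arithmetic behind the table can be reproduced with a few lines. The sketch assumes a 30-day month (720 hours) for every hourly item, including the Kafka brokers, and decimal terabytes (1 TB = 1,000 GB), which is what the rounded figures imply. The security line is entered at its stated $500 minimum, so the total is a lower bound. Names such as `LINE_ITEMS` and `monthly_costs` are illustrative.

```python
HOURS_PER_MONTH = 24 * 30   # 720, as in the EC2 row
GB_PER_TB = 1000            # decimal terabytes, matching the table's rounded figures

# (resource, unit cost in USD, billed units per month)
LINE_ITEMS = [
    ("EC2 instances", 0.10, 50 * HOURS_PER_MONTH),
    ("Database storage", 0.10, 10 * GB_PER_TB),
    ("Redis cache", 0.20, 1 * GB_PER_TB),
    ("Kafka cluster", 0.15, 5 * HOURS_PER_MONTH),
    ("CDN data transfer", 0.08, 50 * GB_PER_TB),
    ("Logging storage", 0.03, 10 * GB_PER_TB),
    ("Network egress", 0.05, 20 * GB_PER_TB),
]
SECURITY_MINIMUM = 500.0    # the table lists "$500+", so this is a floor

def monthly_costs():
    """Return a dict of resource -> estimated monthly cost in USD."""
    costs = {name: round(unit_cost * units, 2) for name, unit_cost, units in LINE_ITEMS}
    costs["Security (WAF, DDoS)"] = SECURITY_MINIMUM
    return costs

costs = monthly_costs()
total = round(sum(costs.values()), 2)
for name, amount in costs.items():
    print(f"{name:<22} ${amount:>10,.2f}")
print(f"{'Total (lower bound)':<22} ${total:>10,.2f}")
```

Under these assumptions the rows sum to roughly $11,140 per month, with CDN transfer and compute as the two largest items.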

4. Infrastructure Planning

5. Budgeting for Growth

6. Cost Monitoring & Alerting

✅ STEP 12: Testing & Deployment Strategies

1. Testing Strategies

a) Unit Testing

b) Integration Testing

c) End-to-End (E2E) Testing

d) Load and Performance Testing

e) Security Testing

f) Regression Testing

2. Deployment Strategies

a) Continuous Integration / Continuous Deployment (CI/CD)

b) Canary Releases

c) Blue-Green Deployments

d) Rolling Updates

3. Infrastructure as Code (IaC)

4. Feature Flags & Dark Launches

5. Monitoring Post-Deployment

6. Backup & Rollback Plans