Administrator
Published on 2026-02-28 / 1 Visits

LeetCode Study Note | Problem 355: Design Twitter

Core Goal

Implement a simplified Twitter with:

  • postTweet(userId, tweetId)

  • follow(followerId, followeeId)

  • unfollow(followerId, followeeId)

  • getNewsFeed(userId): return 10 most recent tweetIds from userId + followees, ordered by recency.

The entire optimization focus is getNewsFeed, because it dominates runtime as data grows.


Baseline Approach (AC but not scalable)

Data Structures (baseline)

  • followMap_[u] -> unordered_set<int>: followees of user u

  • tweets_[tweetId] -> Tweet(tweetId, ts): global tweet lookup

  • userTweet_[u] -> unordered_set<int>: tweetIds posted by user u

  • ts uses a monotonic counter gTimestamp_++ (preferred over wall clock)

getNewsFeed Baseline

  1. Collect all tweetIds from:

    • each followee of userId

    • the user themselves (userId)

  2. Convert tweetIds to Tweet objects using tweets_

  3. Sort all tweets by timestamp descending

  4. Take first 10
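A minimal sketch of the baseline, assuming the member names from these notes (`Tweet`, `followMap_`, `tweets_`, `userTweet_`, `gTimestamp_`); the exact layout is illustrative, not the only correct one:

```cpp
#include <algorithm>
#include <unordered_map>
#include <unordered_set>
#include <vector>

struct Tweet {
    int tweetId = 0;
    int timestamp = 0;
};

class Twitter {
    std::unordered_map<int, std::unordered_set<int>> followMap_;  // u -> followees
    std::unordered_map<int, Tweet> tweets_;                       // global tweet lookup
    std::unordered_map<int, std::unordered_set<int>> userTweet_;  // u -> posted tweetIds
    int gTimestamp_ = 0;  // monotonic logical clock

public:
    void postTweet(int userId, int tweetId) {
        tweets_[tweetId] = Tweet{tweetId, gTimestamp_++};
        userTweet_[userId].insert(tweetId);
    }
    void follow(int followerId, int followeeId) {
        if (followerId != followeeId) followMap_[followerId].insert(followeeId);
    }
    void unfollow(int followerId, int followeeId) {
        auto it = followMap_.find(followerId);
        if (it != followMap_.end()) it->second.erase(followeeId);
    }
    std::vector<int> getNewsFeed(int userId) {
        // 1. collect candidate tweets from self + followees
        std::vector<Tweet> pool;
        auto collect = [&](int uid) {
            auto it = userTweet_.find(uid);  // find(), not operator[]
            if (it == userTweet_.end()) return;
            for (int tid : it->second) pool.push_back(tweets_.at(tid));
        };
        collect(userId);
        auto fit = followMap_.find(userId);
        if (fit != followMap_.end())
            for (int uid : fit->second) collect(uid);
        // 2-3. full sort by timestamp descending: the O(M log M) step
        std::sort(pool.begin(), pool.end(),
                  [](const Tweet& a, const Tweet& b) { return a.timestamp > b.timestamp; });
        // 4. take the first 10
        std::vector<int> feed;
        for (int i = 0; i < (int)pool.size() && i < 10; ++i)
            feed.push_back(pool[i].tweetId);
        return feed;
    }
};
```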

Complexity

Let M be the number of collected tweets (all followees’ tweets plus the user’s own):

  • Collect: O(M)

  • Sort: O(M log M)

  • Return 10: O(10)

This passes the judge, but the full sort is overkill: we only ever need the top 10.


Optimization 1: Avoid Full Sort — Keep Top 10 with a Heap

Key Observation

We don’t need a full ordering of M tweets. We only need the largest 10 by timestamp.

Idea: Min-Heap of Size 10

While scanning candidate tweets:

  • push Tweet into min-heap ordered by timestamp

  • if heap size > 10, pop the smallest (oldest)

At the end, the heap holds the newest 10 tweets in heap order (not sorted); pop them into the output and reverse to get newest-first.
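The selection step alone can be sketched as a free function over the collected candidates (the collection loop is unchanged from the baseline; `top10Newest` is my own name for illustration):

```cpp
#include <queue>
#include <vector>

struct Tweet {
    int tweetId = 0;
    int timestamp = 0;
};

// Keep only the 10 newest tweets while scanning: a size-capped min-heap
// ordered by timestamp. Whatever survives the evictions is the top 10.
std::vector<int> top10Newest(const std::vector<Tweet>& pool) {
    // greater-than comparator on timestamp -> priority_queue becomes a
    // min-heap, so top() is always the OLDEST surviving tweet
    auto cmp = [](const Tweet& a, const Tweet& b) { return a.timestamp > b.timestamp; };
    std::priority_queue<Tweet, std::vector<Tweet>, decltype(cmp)> heap(cmp);
    for (const Tweet& t : pool) {
        heap.push(t);
        if (heap.size() > 10) heap.pop();  // evict the oldest survivor
    }
    // Pops come out oldest-first; fill the answer back-to-front
    // so feed[0] ends up being the newest tweet.
    std::vector<int> feed(heap.size());
    for (int i = (int)heap.size() - 1; i >= 0; --i) {
        feed[i] = heap.top().tweetId;
        heap.pop();
    }
    return feed;
}
```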

Complexity

  • scan: O(M)

  • heap operations: O(M log 10) ≈ O(M)

  • space: O(10)

This is already a major improvement:

  • from O(M log M) → O(M)

Tradeoff

Still scans all tweets of all followees. If a user follows many accounts with huge histories, scanning remains expensive.


Optimization 2 (Best): K-Way Merge on Per-User Chronological Streams

Key Observation

Tweets for each user are naturally time-ordered at creation.

So instead of storing tweetIds in an unordered container, store tweets in append-only chronological order.

Data Structure Change

Replace:

  • userTweet_[u] -> unordered_set<int>

With:

  • userTweets_[u] -> vector<Tweet> (append-only, time increasing)

Now each user’s tweets form a stream sorted by timestamp.

K-Way Merge Strategy

We need the newest tweets among (F + 1) streams (F followees + self).

This is identical to merging k sorted lists from the end.

Algorithm:

  1. For each stream (self + each followee), push its latest tweet into a max-heap.

    • Heap node tracks: (uid, idx, tweetId, ts)

  2. Repeat up to 10 times:

    • pop heap top (newest tweet)

    • add tweetId to answer

    • move that user’s pointer one step earlier (idx-1) and push the previous tweet (if exists)

Why Node is Required

If the heap only stores Tweet objects, you lose:

  • which user stream it came from

  • where to fetch the “next candidate” from that stream

So heap elements must carry stream state (a cursor), not just the tweet.
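The merge described above can be sketched end-to-end as follows, again borrowing the names from these notes (`Node`, `userTweets_`, `gTimestamp_`); treat it as one reasonable shape, not the canonical one:

```cpp
#include <queue>
#include <unordered_map>
#include <unordered_set>
#include <vector>

struct Tweet {
    int tweetId = 0;
    int timestamp = 0;
};

// Per-stream cursor carried in the heap: which user, where in that
// user's vector, and the cached payload for comparison.
struct Node {
    int uid;
    int idx;
    int tweetId;
    int ts;
};

class Twitter {
    std::unordered_map<int, std::unordered_set<int>> followMap_;
    std::unordered_map<int, std::vector<Tweet>> userTweets_;  // append-only, ts increasing
    int gTimestamp_ = 0;

public:
    void postTweet(int userId, int tweetId) {
        userTweets_[userId].push_back(Tweet{tweetId, gTimestamp_++});
    }
    void follow(int followerId, int followeeId) {
        if (followerId != followeeId) followMap_[followerId].insert(followeeId);
    }
    void unfollow(int followerId, int followeeId) {
        auto it = followMap_.find(followerId);
        if (it != followMap_.end()) it->second.erase(followeeId);
    }
    std::vector<int> getNewsFeed(int userId) {
        // max-heap on timestamp: top() is always the newest candidate
        auto cmp = [](const Node& a, const Node& b) { return a.ts < b.ts; };
        std::priority_queue<Node, std::vector<Node>, decltype(cmp)> heap(cmp);
        // 1. seed the heap with the latest tweet of each stream
        auto seed = [&](int uid) {
            auto it = userTweets_.find(uid);
            if (it == userTweets_.end() || it->second.empty()) return;
            int idx = (int)it->second.size() - 1;
            heap.push(Node{uid, idx, it->second[idx].tweetId, it->second[idx].timestamp});
        };
        seed(userId);
        auto fit = followMap_.find(userId);
        if (fit != followMap_.end())
            for (int uid : fit->second) seed(uid);
        // 2. pop up to 10 times, advancing the winning stream's cursor
        std::vector<int> feed;
        while (!heap.empty() && feed.size() < 10) {
            Node top = heap.top();
            heap.pop();
            feed.push_back(top.tweetId);
            if (top.idx > 0) {
                const Tweet& prev = userTweets_.at(top.uid)[top.idx - 1];
                heap.push(Node{top.uid, top.idx - 1, prev.tweetId, prev.timestamp});
            }
        }
        return feed;  // newest first by construction
    }
};
```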

Complexity

Let F be number of followees:

  • heap init: O((F+1) log(F+1))

  • extract 10: O(10 log(F+1))

  • does NOT depend on total tweet history

This is the “interview-ready” answer:

  • avoids scanning all tweets

  • avoids sorting all tweets

  • only touches as much data as needed to output 10 results


Practical Notes (C++ gotchas)

Avoid unordered_map::operator[] for read-only access

  • operator[] inserts default values when key is missing

  • it requires default-constructible value types

Use find() / at() when reading.
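A small demonstration of the gotcha (`followeeCount` is a hypothetical helper, not from the notes):

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

// Read a user's followee count without mutating the map.
// operator[] on a missing key would default-construct and INSERT an
// empty set; find() does not, and also works on const maps.
std::size_t followeeCount(
    const std::unordered_map<int, std::unordered_set<int>>& followMap, int uid) {
    auto it = followMap.find(uid);
    return it == followMap.end() ? 0 : it->second.size();
}
```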

Prefer logical timestamp over wall clock

  • gTimestamp_++ is monotonic, deterministic, fast

  • avoids clock rollback / NTP adjustments

  • perfect for ordering in this simplified system
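The whole mechanism is a one-liner; wrapping it in a tiny struct (my own framing) makes the ordering guarantee explicit:

```cpp
// Logical clock: each post takes the current counter value, then
// increments it. Stamps are strictly increasing, so "newer" is
// simply "larger stamp", immune to wall-clock jumps or NTP slews.
struct LogicalClock {
    int gTimestamp_ = 0;
    int next() { return gTimestamp_++; }
};
```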


Final Design Recap

  • per-user vector<Tweet>, chronological

  • max-heap (priority_queue) + cursor Node

  • pop/push up to 10 times

This is the strongest optimization story:

  1. brute-force collect + sort

  2. heap top-10 without full sort

  3. k-way merge without scanning full histories


