Baseline System Design — Facebook Newsfeed And “Fanout”
Comparing Resources Like “Grokking The System Design Interview” To The Actual Facebook Blog
I had a few ideas for how to make Baseline System Design interesting — my best one was to look at resources like Grokking to find out how they designed hypothetical systems, then compare this to how systems are actually designed. So far I have had mixed success with Facebook Newsfeed.
For anyone skimming this, here is the short version: a paper published in the International Journal Of Computer Applications cites a Facebook engineering blog post, and goes on to state that Facebook uses a primarily “pull-based architecture” (fanout-on-load) when it generates its newsfeed. Resources like “Grokking The System Design Interview” argue instead for a hybrid approach similar to what Twitter does.
If I were to simply regurgitate things like “Grokking,” I would probably be better off pointing readers to System Design Primer.
This is my first complaint with System Design Primer: it simply lumps “Facebook Newsfeed” in with “Twitter Timeline,” as if the two were designed exactly the same way.
After what I have read about Facebook Newsfeed, I am starting to see why so many writers and bloggers do things this way. Unlike Twitter’s, some of Facebook’s engineering blog posts are rather vague and read like advertisements. Searching Google does not help much either, as looking up “Facebook Newsfeed” mostly yields results that explain how to make your Facebook posts trend.
An Explanation Of Fanout
This is one small part of a hard question you can find in “Grokking The System Design Interview.” Facebook Newsfeed is what selects and orders the posts on Facebook’s homepage, such that your brother’s engagement photo appears before your aunt’s YouTube link of two corgis playing tetherball.
There are a number of really interesting technical questions that “Grokking” glosses over, like how Facebook actually uses machine learning to determine its rankings.
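When a person publishes a post, the post has to reach all of that person’s followers. Delivering it to them is called fanout, and it can happen in one of two ways: pushing the post to followers when it is written, or having followers pull it when they load their feed. The push approach is called fanout-on-write, and the pull approach is called fanout-on-load.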
Alex Xu’s book essentially describes the exact same thing, but in a slightly clearer way.
In fanout-on-write, the newsfeed is precomputed at write time: a new post is delivered to friends’ caches immediately after publication. Fetching is fast, but this approach has the “hotkey problem”: publishing is really slow for celebrities with many friends, because every post triggers one cache write per friend.
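To make the push model concrete, here is a minimal in-memory sketch of fanout-on-write. It is not how Facebook or Twitter actually implement it; the names `followers`, `feed_cache`, `FEED_SIZE`, `publish` and `read_feed` are all hypothetical, and plain Python dictionaries stand in for a real cache tier.

```python
from collections import defaultdict, deque

FEED_SIZE = 3  # hypothetical cap on cached feed entries per user

followers = defaultdict(set)  # author -> users who follow that author
# user -> cached feed of post ids, newest first, trimmed to FEED_SIZE
feed_cache = defaultdict(lambda: deque(maxlen=FEED_SIZE))


def follow(user, author):
    followers[author].add(user)


def publish(author, post_id):
    """Push the post into every follower's cached feed; return the write count."""
    writes = 0
    for user in followers[author]:
        # appendleft keeps the newest post first; a full deque drops its oldest entry
        feed_cache[user].appendleft(post_id)
        writes += 1
    return writes


def read_feed(user):
    """Reading is a single cache lookup, with no per-author work."""
    return list(feed_cache[user])
```

The return value of `publish` is the number of cache writes, which is exactly the number of followers. That is the hotkey problem in miniature: an author with a million followers pays a million writes for every post, while `read_feed` stays cheap for everyone.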
In fanout-on-load, the newsfeed is generated at read time. There is no hotkey problem, but fetching is slow, because the feed has to be assembled from every friend’s posts on each request.
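The pull model can be sketched the same way. Again, this is only an illustration under my own assumptions: `following`, `posts`, `publish` and `read_feed` are hypothetical names, posts are assumed to be published in increasing timestamp order, and sorting by recency stands in for whatever ranking a real system would apply.

```python
import heapq
from collections import defaultdict
from itertools import islice

following = defaultdict(set)  # user -> authors the user follows
posts = defaultdict(list)     # author -> [(timestamp, post_id)], oldest first


def follow(user, author):
    following[user].add(author)


def publish(author, timestamp, post_id):
    """Writing is one append, no matter how many followers the author has."""
    posts[author].append((timestamp, post_id))


def read_feed(user, limit=10):
    """Build the feed at read time by merging every followed author's posts."""
    # Walk each author's list newest-first so heapq.merge can combine them
    # (assumes each list was appended in increasing timestamp order).
    streams = [reversed(posts[author]) for author in following[user]]
    merged = heapq.merge(*streams, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

Here the cost has moved from `publish` to `read_feed`: every feed load touches one list per followed author, which is why fetching is the slow path in this design.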
The hybrid approach is to have followers pull updates from celebrities at read time, while posts from “normies” are still pushed at write time. In other words, the majority of users use fanout-on-write, but celebrities do not.
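A rough sketch of the hybrid idea, combining the two models. Everything here is an assumption for illustration: the `CELEBRITY_THRESHOLD` cutoff is invented, the names are hypothetical, and neither “Grokking” nor Alex Xu’s book specifies the logic at this level of detail.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; a real system would tune this

followers = defaultdict(set)    # author -> users who follow that author
following = defaultdict(set)    # user -> authors the user follows
posts = defaultdict(list)       # author -> [(timestamp, post_id)]
feed_cache = defaultdict(list)  # user -> [(timestamp, post_id)] pushed at write time


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def publish(author, timestamp, post_id):
    """Store the post; push it to followers only if the author is not a celebrity."""
    posts[author].append((timestamp, post_id))
    if is_celebrity(author):
        return 0  # no fanout writes; followers pull celebrity posts on read
    for user in followers[author]:
        feed_cache[user].append((timestamp, post_id))
    return len(followers[author])


def read_feed(user, limit=10):
    """Combine the precomputed feed with celebrity posts pulled at read time."""
    entries = set(feed_cache[user])  # pushed part (ordinary authors)
    for author in following[user]:
        if is_celebrity(author):
            # Pulled part; the set avoids duplicates if an author crossed the
            # threshold after some of their posts had already been pushed.
            entries.update(posts[author])
    newest_first = sorted(entries, reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]
```

The trade-off is that `read_feed` now does a little pull work per celebrity followed, but `publish` never performs more than `CELEBRITY_THRESHOLD - 1` cache writes.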
Down The Rabbit Hole
From the Facebook blog:
We leveraged the disaggregation concept to redesign Multifeed, a distributed backend system that is involved in News Feed. When a person goes to his or her Facebook feed, Multifeed looks up the user’s friends, finds all their recent actions, and decides what should be rendered based on a certain relevance and ranking algorithm. The disaggregation results with respect to infrastructure were impressive across multiple areas tracked.
This, I imagine, is where the researchers who wrote “A Hybrid Approach To Social Network User Feed Generation” got their data to conclude that “Facebook uses a model that is primarily pull based.” The conclusion seems a little bit sketchy to me, but they are published researchers and I am in a long list of random people on Medium who have blogs.
I feel a little bit like I am trying to solve a murder mystery: Where did the creators of DesignGurus get the information that Facebook uses a hybrid approach to fanout? To be fair, they have never claimed to actually be proposing how Facebook is truly designed. But Alex Xu proposed the exact same design, prompting me to wonder if they have more information from sources at Facebook, if the two are collaborating, or if one of them simply took the idea from the other.
My best guess is that they were simply pretending Facebook was designed exactly like Twitter. Twitter’s design is covered in both the Twitter blogs and in this book:
Closing Thoughts
Since this ended up being an entire blog post about one small part of Facebook newsfeed, it may be good to go into another portion of the proposed Facebook newsfeed design, or return to Twitter.