How does Twitter work? Let's take a look at it from the architectural point of view before Elon takes it.
The Life of a Tweet:
1️⃣ A tweet comes in through the Write API.
2️⃣ The Write API routes the request to the Fanout service.
3️⃣ The Fanout service does a lot of processing and stores the results in the Redis cache.
4️⃣ The Timeline service is used to find the Redis server that has the home timeline on it.
5️⃣ A user pulls their home timeline through the Timeline service.
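The write and read paths above can be sketched in a few lines of Python. This is only a toy model, not Twitter's implementation: the follower graph, the `TIMELINE_CACHE` dict that stands in for Redis, the cap of 100 entries, and all function names are assumptions made for illustration.

```python
from collections import defaultdict, deque

# Hypothetical follower graph: author -> list of followers.
FOLLOWERS = {"alice": ["bob", "carol"], "bob": ["carol"]}

# Stand-in for the Redis cache: one bounded home timeline per user.
# The cap of 100 entries is an arbitrary value chosen for this sketch.
TIMELINE_CACHE = defaultdict(lambda: deque(maxlen=100))


def fanout(author, tweet_id):
    """Fanout service role: push the tweet id onto every follower's timeline."""
    for follower in FOLLOWERS.get(author, []):
        TIMELINE_CACHE[follower].appendleft(tweet_id)


def write_api(author, tweet_id):
    """Write API role: accept the tweet and route it to the Fanout service."""
    fanout(author, tweet_id)


def home_timeline(user, count=20):
    """Timeline service role: read the newest entries from the cached timeline."""
    return list(TIMELINE_CACHE[user])[:count]


write_api("alice", 1)
write_api("bob", 2)
print(home_timeline("carol"))  # [2, 1]
print(home_timeline("bob"))    # [1]
```

The point of the sketch is that the expensive work happens at write time, so reading a home timeline is a single cache lookup.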
Search & Discovery
🔹 Ingester: annotates and tokenizes Tweets so the data can be indexed.
🔹 Earlybird: stores search index.
🔹 Blender: creates the search and discovery timelines.
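Here is a rough sketch of how these three roles fit together. It is a toy inverted index, not the real Ingester, Earlybird, or Blender: the tokenizer regex, the `Index` class, the AND-only query semantics, and the "newest id first" ordering are all assumptions for illustration.

```python
import re
from collections import defaultdict


def ingest(text):
    """Ingester role: lowercase and tokenize a tweet into indexable terms."""
    return re.findall(r"[a-z0-9#@_]+", text.lower())


class Index:
    """Earlybird role: a toy inverted index mapping each term to tweet ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, tweet_id, text):
        for term in ingest(text):
            self.postings[term].add(tweet_id)

    def search(self, query):
        """Return the ids of tweets that contain every term in the query."""
        terms = ingest(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


def blend(shards, query):
    """Blender role: merge hits from several index shards, newest id first."""
    hits = set()
    for shard in shards:
        hits |= shard.search(query)
    return sorted(hits, reverse=True)


shard_a, shard_b = Index(), Index()
shard_a.add(1, "Redis makes timelines fast")
shard_b.add(2, "Fast search with Redis")
shard_b.add(3, "Hello world")
print(blend([shard_a, shard_b], "redis fast"))  # [2, 1]
```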
Disclaimer: This article is based on the tech talk given by Twitter in 2013 (bit.ly/3vNfjRp). Even though many years have passed, it's still quite relevant. I redrew the diagram because the original is difficult to read.
Over to you:
What are some of the biggest differences between Twitter and Facebook that might shape their system architectures?
If you found this thread helpful, follow me @alexxubyte for more.
Retweet the first tweet to help more people to learn system design.
Popular interview question: What is the difference between Process and Thread?
To better understand this question, let's first take a look at what a Program is.
A Program is an executable file that contains a set of instructions and is stored passively on disk. One program can have multiple processes. For example, the Chrome browser creates a different process for every single tab.
A Process is a program in execution. When a program is loaded into memory and becomes active, it becomes a process. A process requires some essential resources such as registers, a program counter, and a stack.
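To make "one program, multiple processes" concrete, here is a small sketch that launches the same executable twice. It assumes a Python interpreter is available at `sys.executable`; the variable names are mine.

```python
import subprocess
import sys

# The "program" is the Python interpreter executable sitting on disk.
program = sys.executable
code = "import os; print(os.getpid())"

# Start two processes from the same program. Both Popen objects stay
# alive until we read from them, so the two process ids cannot collide.
procs = [
    subprocess.Popen([program, "-c", code], stdout=subprocess.PIPE, text=True)
    for _ in range(2)
]

# Each child reports its own process id on stdout.
pids = [int(proc.communicate()[0]) for proc in procs]

print(len(set(pids)))  # 2 distinct processes from one program
```

Each child gets its own process id and its own memory, even though both were loaded from the same file on disk.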
Interesting read: Software Architecture and Design InfoQ Trends Report - April 2022 by @InfoQ
Key takeaways:
"Data plus architecture" is the idea that software architecture is increasingly adapting to account for data. This includes data quality, data pipelines, and traceability to understand how data influenced decisions and AI models.
Innovative software architecture is facilitating data quality the way we've improved code quality. Catching bad data early is as important as catching bugs early.
The practice of software architecture does not belong solely to people with the job title of architect.
One picture is worth more than a thousand words. In this post, we will take a look at how to design Google Docs.
1️⃣ Clients send document editing operations to the WebSocket Server.
2️⃣ The real-time communication is handled by the WebSocket Server.
3️⃣ Document operations are persisted in the Message Queue.
4️⃣ The File Operation Server consumes operations produced by clients and generates transformed operations using collaboration algorithms.
5️⃣ Three types of data are stored: file metadata, file content, and operations.
One of the biggest challenges is real-time conflict resolution. Common algorithms include:
🔹 Operational transformation (OT)
🔹 Differential Synchronization (DS)
🔹 Conflict-free replicated data type (CRDT)
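To give a feel for the first algorithm in the list, here is a minimal operational transformation sketch. It handles only concurrent inserts into a plain string, and the `Insert` type, the `site` tie-breaker, and the function names are assumptions for illustration. Real editors also handle deletes, formatting, and server-side ordering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # index in the document where the text goes
    text: str  # text to insert
    site: int  # hypothetical client id, used only to break ties


def apply(doc, op):
    """Apply an insert operation to a document string."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Rewrite `op` so it can be applied after `other` has already been applied."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` landed at or before our position, so shift right by its length.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


doc = "Hello world"
a = Insert(5, ",", site=1)   # client 1 adds a comma
b = Insert(11, "!", site=2)  # client 2 adds an exclamation mark concurrently

# Each replica applies its own operation first, then the transformed remote one.
replica1 = apply(apply(doc, a), transform(b, a))
replica2 = apply(apply(doc, b), transform(a, b))
print(replica1)              # Hello, world!
print(replica1 == replica2)  # True
```

Both replicas end up with the same text even though they applied the two edits in a different order, which is the property the conflict resolution step has to guarantee.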
Deploying or upgrading services is risky. In this post, we explore risk mitigation strategies.
The diagram below illustrates the common ones.
Multi-Service Deployment
We deploy new changes to multiple services simultaneously. This approach is easy to implement. But since all the services are upgraded at the same time, it is hard to manage and test dependencies. It's also hard to roll back safely.
Blue-Green Deployment
With blue-green deployment, we have two identical environments: one is staging (blue) and the other is production (green). The staging environment is one version ahead of production.
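A rough sketch of the two-environment setup is below. The `Environment` and `Router` classes are hypothetical, and the sketch assumes the usual blue-green workflow in which traffic is switched to the staging environment once it has been tested, with the old production kept around as the rollback target.

```python
class Environment:
    """One of the two identical environments, running a given version."""

    def __init__(self, name, version):
        self.name = name
        self.version = version

    def handle(self, request):
        return f"{self.name} v{self.version} handled {request}"


class Router:
    """Hypothetical router that sends all traffic to the current production."""

    def __init__(self, production, staging):
        self.production = production
        self.staging = staging

    def route(self, request):
        return self.production.handle(request)

    def switch(self):
        """Promote staging to production; the old production becomes staging."""
        self.production, self.staging = self.staging, self.production


green = Environment("green", 1)  # production
blue = Environment("blue", 2)    # staging, one version ahead
router = Router(production=green, staging=blue)

print(router.route("GET /"))  # green v1 handled GET /
router.switch()               # cut over after testing (assumed step)
print(router.route("GET /"))  # blue v2 handled GET /
router.switch()               # rolling back is the same switch in reverse
print(router.route("GET /"))  # green v1 handled GET /
```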
I quit my job at Twitter 3 years ago and put a huge burden of the mortgage on my wife. Bay area houses are expensive...
Now I make a living by writing a series of system design interview books with my co-author @sahnlam
Our books have 1000+ reviews on Amazon and are consistently ranked in the top 10 of Computer & Technology books.
I started to post on LinkedIn regularly and it passed 100,000 followers last Friday.
I've met a lot of like-minded individuals and love what I do.
I'll be sharing my learning, thinking, or interview tips every week. If you think my posts will be useful, please like or share so they can reach a wider audience. Thanks for reading my story.
Token Based
Step 1 - the user enters their password into the client, and the client sends the password to the Authentication Server.
Step 2 - the Authentication Server authenticates the credentials and generates a token with an expiry time.
Steps 3 and 4 - now the client can send requests to access server resources with the token in the HTTP header. This access is valid until the token expires.
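The four steps can be sketched as follows. This is a simplified in-memory model, not a production design: the `USERS` store, the one-hour lifetime, the `Bearer` header format, and the function names are assumptions, and a real Authentication Server would store password hashes rather than plain passwords.

```python
import secrets
import time

USERS = {"alice": "s3cret"}  # hypothetical credential store (plain text only for the demo)
TOKENS = {}                  # token -> (username, expiry timestamp)
TOKEN_TTL_SECONDS = 3600     # assumed token lifetime


def login(username, password, now=None):
    """Steps 1 and 2: check the credentials and issue a token with an expiry time."""
    now = time.time() if now is None else now
    stored = USERS.get(username)
    if stored is None or not secrets.compare_digest(stored.encode(), password.encode()):
        return None
    token = secrets.token_urlsafe(32)
    TOKENS[token] = (username, now + TOKEN_TTL_SECONDS)
    return token


def authorize(headers, now=None):
    """Steps 3 and 4: accept a request only if its token exists and has not expired."""
    now = time.time() if now is None else now
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    entry = TOKENS.get(token)
    if scheme != "Bearer" or entry is None:
        return None
    username, expires_at = entry
    return username if now < expires_at else None


token = login("alice", "s3cret", now=0)
headers = {"Authorization": f"Bearer {token}"}
print(authorize(headers, now=10))    # alice
print(authorize(headers, now=3600))  # None, the token has expired
print(login("alice", "wrong"))       # None, bad credentials
```

The password is sent only once, at login. Every later request carries just the token, so the server can reject it as soon as the expiry time passes.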