One picture is worth more than a thousand words. In this post, we will take a look at what happens when Alice sends an email to Bob.
1. Alice logs in to her Outlook client, composes an email, and presses “send”. The email is sent to the Outlook mail server. The communication protocol between the Outlook client and mail server is SMTP.
2. The Outlook mail server queries the DNS (not shown in the diagram) to find the address of the recipient’s SMTP server. In this case, it is Gmail’s SMTP server. Next, it transfers the email to the Gmail mail server. The communication protocol between the mail servers is SMTP.
3. The Gmail server stores the email and makes it available to Bob, the recipient.
4. The Gmail client fetches new emails through the IMAP/POP server when Bob logs in to Gmail.
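Step 1 above (the client handing a message to its own mail server over SMTP) can be sketched with Python's standard library. This is a minimal illustration, not how Outlook is actually implemented: the host name, port 587, and the helper names build_message, recipient_domain, and submit are assumptions made for the example. The DNS lookup in step 2 is only hinted at by extracting the recipient's domain, since an MX query needs a resolver library outside the standard library.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Compose the email Alice writes in her client."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def recipient_domain(address: str) -> str:
    """Domain the sending server would look up in DNS (step 2)."""
    return address.rsplit("@", 1)[1].lower()


def submit(msg: EmailMessage, host: str, port: int = 587,
           user: str = "", password: str = "") -> None:
    """Hand the message to the sender's mail server over SMTP (step 1).

    `host` is a placeholder such as "smtp.example.com"; real providers
    publish their own submission host and authentication requirements.
    """
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()              # upgrade the connection to TLS
        if user:
            smtp.login(user, password)
        smtp.send_message(msg)
```

For example, build_message("alice@outlook.com", "bob@gmail.com", "Hello", "Hi Bob") produces the message, and recipient_domain("bob@gmail.com") returns the domain whose mail server the sending side needs to find.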
Caching is awesome but it doesn’t come without a cost, just like many things in life.
One of the issues is the cache miss attack. Correct me if this is not the right term. It refers to the scenario where the data to fetch doesn't exist in the database and isn’t cached either.
So every request hits the database eventually, defeating the purpose of using a cache. If a malicious user initiates lots of queries with such keys, the database can easily be overloaded.
The diagram below illustrates the process.
Two approaches are commonly used to solve this problem:
🔹Cache keys with null value. Set a short TTL (Time to Live) for keys with null value.
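Here is a minimal sketch of caching null values with a short TTL, assuming an in-memory dictionary as the cache. The class name NullCachingStore, the db_lookup callback, and the TTL values are all made up for illustration; a real setup would more likely use something like Redis.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class NullCachingStore:
    """Read-through cache that also remembers 'not found' results."""

    def __init__(self, db_lookup: Callable[[str], Optional[Any]],
                 ttl: float = 300.0, null_ttl: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._db_lookup = db_lookup      # returns None when the key is absent
        self._ttl = ttl                  # TTL for real values
        self._null_ttl = null_ttl        # shorter TTL for null values
        self._clock = clock
        self._cache: Dict[str, Tuple[Optional[Any], float]] = {}

    def get(self, key: str) -> Optional[Any]:
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            # Cache hit, possibly a cached None: the database is not touched.
            return entry[0]
        value = self._db_lookup(key)
        ttl = self._null_ttl if value is None else self._ttl
        self._cache[key] = (value, now + ttl)
        return value
```

The short TTL on null entries keeps the cache from holding a stale "not found" for long if the key is created later.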
My new book System Design Interview - An Insider’s Guide (Volume 2) will be available on Amazon soon! It is a continuation of the system design interview book series.
Some stats about the book:
🔹 13 NEW real system design interviews with detailed solutions.
🔹 300+ diagrams to explain how different systems work.
🔹 400+ pages.
🔹 took 1.5 years to make.
My co-author @sahnlam and I have spent countless nights and weekends on the book. Our goal is to make complex systems easy to understand.
Popular interview question: how to diagnose a mysterious process that’s taking too much CPU, memory, IO, etc?
The diagram below illustrates helpful tools in a Linux system.
🔹‘vmstat’ - reports information about processes, memory, paging, block IO, traps, and CPU activity.
🔹‘iostat’ - reports CPU and input/output statistics of the system.
🔹‘netstat’ - displays statistical data related to IP, TCP, UDP, and ICMP protocols.
🔹‘lsof’ - lists open files of the current system.
🔹‘pidstat’ - monitors the utilization of system resources by all or specified processes, including CPU, memory, device IO, task switching, threads, etc.
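As a rough sketch, the tools above can be collected into one small Python helper. The function names, the chosen flags, and the 1-second interval with 5 samples are assumptions picked for the example; iostat and pidstat come from the sysstat package and may not be installed, so the helper skips anything it cannot find.

```python
import shutil
import subprocess
from typing import Dict, List


def diagnostic_commands(pid: int) -> Dict[str, List[str]]:
    """Command lines for a first look at a suspicious process."""
    return {
        "vmstat": ["vmstat", "1", "5"],            # 5 samples, 1 second apart
        "iostat": ["iostat", "-x", "1", "5"],      # extended device statistics
        "netstat": ["netstat", "-s"],              # per-protocol statistics
        "lsof": ["lsof", "-p", str(pid)],          # files opened by this process
        "pidstat": ["pidstat", "-u", "-r", "-d",   # CPU, memory, disk IO
                    "-p", str(pid), "1", "5"],
    }


def run_available(pid: int) -> Dict[str, str]:
    """Run each tool that is installed and return its output by name."""
    results: Dict[str, str] = {}
    for name, cmd in diagnostic_commands(pid).items():
        if shutil.which(cmd[0]) is None:
            results[name] = "not installed"
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[name] = proc.stdout
    return results
```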
Design a stock exchange. Let’s trace the life of an order through various components in the diagram to see how the pieces fit together.
First, we follow the order through the trading flow. This is the critical path with strict latency requirements. Everything has to happen fast in the flow:
Step 1: A client places an order via the broker’s web or mobile app.
Step 2: The broker sends the order to the exchange.
Step 3: The order enters the exchange through the client gateway. The client gateway performs basic gatekeeping functions such as input validation, rate limiting, authentication, normalization, etc. The client gateway then forwards the order to the order manager.
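The gatekeeping in step 3 can be sketched as below. This is a toy model under loud assumptions: the Order fields, the ClientGateway class, the set of known client IDs standing in for authentication, and the one-second sliding window for rate limiting are all invented for the example. A real exchange gateway is built for much lower latency than this.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Order:
    client_id: str
    symbol: str
    side: str
    quantity: int
    price: float


class ClientGateway:
    def __init__(self, known_clients: Set[str], forward: Callable[[Order], None],
                 max_per_second: int = 5,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._known_clients = known_clients
        self._forward = forward                  # hands the order to the order manager
        self._max_per_second = max_per_second
        self._clock = clock
        self._recent: Dict[str, List[float]] = {}

    def submit(self, raw: dict) -> Order:
        # Authentication (stand-in): only known client IDs are accepted.
        client_id = str(raw.get("client_id", ""))
        if client_id not in self._known_clients:
            raise PermissionError("unknown client")

        # Rate limiting: count requests in the last second per client.
        now = self._clock()
        window = [t for t in self._recent.get(client_id, []) if now - t < 1.0]
        if len(window) >= self._max_per_second:
            raise RuntimeError("rate limit exceeded")
        window.append(now)
        self._recent[client_id] = window

        # Normalization: trim and case-fold the text fields.
        symbol = str(raw.get("symbol", "")).strip().upper()
        side = str(raw.get("side", "")).strip().lower()
        quantity = int(raw.get("quantity", 0))
        price = float(raw.get("price", 0))

        # Input validation on the normalized values.
        if not symbol or side not in {"buy", "sell"} or quantity <= 0 or price <= 0:
            raise ValueError("invalid order")

        order = Order(client_id, symbol, side, quantity, price)
        self._forward(order)
        return order
```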
You have probably heard of 𝐒𝐖𝐈𝐅𝐓. What is SWIFT? What role does it play in cross-border payments? Let's take a look.
The Society for Worldwide Interbank Financial Telecommunication (SWIFT) is the main secure 𝐦𝐞𝐬𝐬𝐚𝐠𝐢𝐧𝐠 𝐬𝐲𝐬𝐭𝐞𝐦 that links the world’s banks.
The Belgium-based system is run by its member banks and handles millions of payment messages per day. The diagram below illustrates how payment messages are transmitted from Bank A (in New York) to Bank B (in London).
Step 1: Bank A sends a message with transfer details to Regional Processor A in New York. The destination is Bank B.
In modern architecture, systems are broken up into small and independent building blocks with well-defined interfaces between them. Message queues provide communication and coordination for those building blocks. Today, let’s discuss at-most once, at-least once, and exactly once.
𝐀𝐭-𝐦𝐨𝐬𝐭 𝐨𝐧𝐜𝐞
As the name suggests, at-most once means a message will be delivered not more than once. Messages may be lost but are not redelivered. This is how at-most once delivery works at a high level.
Use cases: It is suitable for use cases like monitoring metrics, where a small amount of data loss is acceptable.
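A tiny sketch of at-most once from the sender's side, assuming a hypothetical send callback that returns False when the message is dropped. The function name and the callback are made up for illustration; the point is only that nothing is ever retried.

```python
from typing import Callable, List


def deliver_at_most_once(messages: List[str],
                         send: Callable[[str], bool]) -> List[str]:
    """Send each message a single time and never retry.

    Returns the messages that got through; anything `send` drops is lost.
    """
    delivered = []
    for message in messages:
        if send(message):        # one attempt only, no redelivery on failure
            delivered.append(message)
    return delivered
```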
𝐀𝐭-𝐥𝐞𝐚𝐬𝐭 𝐨𝐧𝐜𝐞
With this data delivery semantic, it’s acceptable to deliver a message more than once, but no message should be lost.
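A minimal sketch of at-least once, assuming a hypothetical send callback that returns True only when an acknowledgment comes back. The function name, the callback, and the max_attempts limit are assumptions for the example. Because the sender retries whenever the acknowledgment is missing, the receiver may see the same message more than once.

```python
from typing import Callable, List


def deliver_at_least_once(messages: List[str],
                          send: Callable[[str], bool],
                          max_attempts: int = 5) -> None:
    """Resend each message until it is acknowledged.

    A lost acknowledgment looks the same as a lost message, so the
    receiver can end up with duplicates.
    """
    for message in messages:
        for _ in range(max_attempts):
            if send(message):    # acknowledged, move on to the next message
                break
        else:
            raise RuntimeError(f"no acknowledgment for {message!r}")
```

Consumers usually cope with the duplicates by making processing idempotent or by deduplicating on a message ID.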