How does Disney Hotstar capture 5 Billion Emojis during a tournament?
Dedeepya Bonthu [1] wrote an excellent engineering blog that captures this nicely. Here is my understanding of how the system works.
1. Clients send emojis through standard HTTP requests. You can think of the Golang service as a typical web server. Golang is chosen because it supports concurrency well: its goroutines are lightweight threads managed by the Go runtime.
2. Since the write volume is very high, Kafka (message queue) is used as a buffer.
3. Emoji data is aggregated by a stream processing service called Spark. It aggregates data every 2 seconds, and the interval is configurable. There is a trade-off to be made based on the interval.
A shorter interval means emojis are delivered to other clients faster but it also means more computing resources are needed.
4. Aggregated data is written to another Kafka queue.
5. The PubSub consumers pull aggregated emoji data from Kafka.
6. Emojis are delivered to other clients in real-time through the PubSub infrastructure.
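The aggregation in step 3 can be illustrated with a minimal sketch. This is not Hotstar's Spark job; it is plain Python that groups events into fixed 2-second windows and counts each emoji per window. The function name `aggregate`, the `(timestamp, emoji)` event shape, and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 2  # the 2-second interval from step 3; configurable


def aggregate(events, window_seconds=WINDOW_SECONDS):
    """Group (timestamp, emoji) events into fixed windows and count each emoji.

    Returns a dict mapping window start time -> Counter of emojis.
    """
    windows = defaultdict(Counter)
    for timestamp, emoji in events:
        # Floor the timestamp to the start of the window it falls in.
        window_start = int(timestamp // window_seconds) * window_seconds
        windows[window_start][emoji] += 1
    return dict(windows)


# Hypothetical sample: six emoji events spread over four seconds.
events = [(0.1, "👍"), (0.7, "👍"), (1.9, "😂"), (2.3, "👍"), (3.5, "😂"), (3.9, "😂")]
result = aggregate(events)
# result[0] counts the first window, result[2] the second.
```

With a shorter window, the same events produce more, smaller batches: downstream clients see updates sooner, but more messages have to be processed and delivered.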
The PubSub infrastructure is interesting. Hotstar considered the following protocols: Socket.IO, NATS, MQTT, and gRPC, and settled on MQTT. For those who are interested in the trade-off discussion, see [2].
A similar design is adopted by LinkedIn [3].
Over to you: What are some of the off-the-shelf Pub-Sub services available?
How do we design a system for internationalization?
The diagram below shows how we can internationalize a simple e-commerce website.
Different countries have differing cultures, values, and habits. When we design an application for international markets, we need to localize the application in several ways:
🔹 Language
1. Extract and maintain all texts in a separate system. For example:
- We shouldn’t put any prompts in the source code.
- We should avoid string concatenation in the code.
- We should remove text from graphics.
2. Use complete sentences and avoid dynamic text elements.
3. Display business data such as currencies in different languages.
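The language guidelines above can be sketched in a few lines of Python. The catalog contents, the `translate` and `format_price` helpers, and the two locales are hypothetical; a real application would typically load catalogs from a separate translation system and rely on a localization library for number formatting and plural rules, both of which this sketch simplifies.

```python
# Hypothetical message catalogs. In practice these live outside the source code.
CATALOGS = {
    "en": {"cart_summary": "You have {count} items in your cart, totaling {total}."},
    "de": {"cart_summary": "Sie haben {count} Artikel im Warenkorb, Gesamtsumme: {total}."},
}


def translate(locale, key, **params):
    """Look up a complete sentence template and fill in its named placeholders."""
    catalog = CATALOGS.get(locale, CATALOGS["en"])  # fall back to English
    template = catalog.get(key, CATALOGS["en"][key])
    return template.format(**params)


def format_price(amount, locale):
    """Format a price for two illustrative locales (no currency conversion)."""
    text = f"{amount:,.2f}"  # e.g. 1,234.50
    if locale == "de":
        # German swaps the separators and puts the symbol last: 1.234,50 €
        swapped = text.replace(",", "_").replace(".", ",").replace("_", ".")
        return f"{swapped} €"
    return f"${text}"


english = translate("en", "cart_summary", count=3, total=format_price(1234.5, "en"))
german = translate("de", "cart_summary", count=3, total=format_price(1234.5, "de"))
```

Because each template is a complete sentence with named placeholders, translators can reorder words freely, which string concatenation in code would not allow.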
1. Parse HTML and generate Document Object Model (DOM) tree.
When the browser receives the HTML data from the server, it immediately parses it and converts it into a DOM tree.
2. Parse CSS and generate CSSOM tree.
The styles (CSS files) are loaded and parsed to the CSSOM (CSS Object Model).
3. Combine the DOM tree and the CSSOM tree to construct the Render Tree. The render tree maps all DOM structures except invisible elements (such as <head> or elements styled with display: none). In other words, the render tree is a visual representation of the DOM.
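Step 3 can be illustrated with a toy model. This is not how a browser engine is implemented; it is a minimal sketch that assumes styles are keyed by tag name only (real CSSOM matching uses selectors and the cascade). The `Node` class, `build_render_tree`, `tags`, and the sample page are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)


# Tags that never produce visible output (a small, illustrative subset).
NON_VISUAL_TAGS = {"head", "script", "meta", "title"}


def build_render_tree(node, styles):
    """Return a styled copy of the tree without invisible nodes.

    `styles` stands in for the CSSOM: a dict mapping tag name -> style dict.
    Returns None when the node itself is invisible.
    """
    style = styles.get(node.tag, {})
    if node.tag in NON_VISUAL_TAGS or style.get("display") == "none":
        return None  # the whole subtree is skipped
    children = [build_render_tree(child, styles) for child in node.children]
    return {
        "tag": node.tag,
        "style": style,
        "children": [child for child in children if child is not None],
    }


def tags(render_node):
    """Flatten a render tree into a list of tag names, depth first."""
    found = [render_node["tag"]]
    for child in render_node["children"]:
        found.extend(tags(child))
    return found


dom = Node("html", [
    Node("head", [Node("title")]),
    Node("body", [Node("h1"), Node("aside", [Node("p")]), Node("p")]),
])
cssom = {"aside": {"display": "none"}, "h1": {"color": "red"}}

render_tree = build_render_tree(dom, cssom)
visible_tags = tags(render_tree)  # ['html', 'body', 'h1', 'p']
```

The <head> subtree and the hidden <aside> (including its child <p>) are absent from the result, while the remaining nodes carry their matched styles.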
How do we properly deal with HTTP errors on the browser side? And how do you handle them correctly on the server side when the client side is at fault?
From the browser's point of view, the easiest thing to do is to try again and hope the error just goes away. This is a good idea in a distributed network, but we also have to be very careful not to make things worse. Here are two general rules:
1. For 4XX HTTP error codes, do not retry.
2. For 5XX HTTP error codes, try again carefully.
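The two rules can be sketched as a small retry helper. The text does not define what "carefully" means; this sketch assumes it means a capped number of attempts with exponential backoff between them. The names `should_retry` and `fetch_with_retry`, the `send` callable (which returns an HTTP status code), and the default values are hypothetical.

```python
import time


def should_retry(status_code):
    """Retry only on 5XX responses; 4XX and everything else is returned as-is."""
    return 500 <= status_code <= 599


def fetch_with_retry(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `send()` until it returns a non-5XX status or attempts run out.

    Waits base_delay, then 2 * base_delay, and so on between attempts.
    """
    for attempt in range(1, max_attempts + 1):
        status = send()
        if not should_retry(status) or attempt == max_attempts:
            return status
        sleep(base_delay * 2 ** (attempt - 1))


# Simulated server: two 503 responses, then success. Delays are recorded, not slept.
responses = iter([503, 503, 200])
delays = []
final_status = fetch_with_retry(lambda: next(responses), sleep=delays.append)

# Simulated client error: returned immediately without a retry.
not_found_calls = []
not_found_status = fetch_with_retry(lambda: not_found_calls.append(1) or 404, sleep=delays.append)
```

Capping the attempts and spacing them out keeps a struggling server from being hit harder by its own clients, which is the "not to make things worse" concern above.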
How do companies typically ship code to production?
@GergelyOrosz wrote an excellent article about this topic and he has kindly agreed to share excerpts with Twitter readers.
1. 𝐒𝐭𝐚𝐫𝐭𝐮𝐩𝐬: Typically do fewer quality checks than other companies.
Startups tend to prioritize moving fast and iterating quickly. As the company attracts users, these teams need to start to find ways to not cause regressions or ship bugs. They then have the choice of going down one of two paths: hire QAs or invest in automation.
To understand Linux file permissions, we need to understand Ownership and Permission.
𝐎𝐰𝐧𝐞𝐫𝐬𝐡𝐢𝐩
🔹Owner: the owner is the user who owns the file or directory, by default the user who created it
🔹Group: a group can have multiple users. All users in the group have the same permissions to access the file or directory
🔹Other: other means those who are not owners or members of the group
𝐏𝐞𝐫𝐦𝐢𝐬𝐬𝐢𝐨𝐧
There are three types of permissions.
🔹Read (r): the read permission allows the user to read a file.
🔹Write (w): the write permission allows the user to change the content of the file.
🔹Execute (x): the execute permission allows a file to be executed.
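Ownership and permission combine into nine permission bits: read, write, and execute for each of owner, group, and other. As a minimal sketch, Python's standard `stat` module can decode them. The `describe` helper and the example mode 754 are chosen here for illustration.

```python
import stat


def describe(mode):
    """Split a file mode into owner, group, and other permission strings."""
    symbolic = stat.filemode(mode)  # e.g. '-rwxr-xr--'
    return {
        "owner": symbolic[1:4],
        "group": symbolic[4:7],
        "other": symbolic[7:10],
    }


# A regular file with octal permissions 754.
mode = stat.S_IFREG | 0o754
permissions = describe(mode)
# {'owner': 'rwx', 'group': 'r-x', 'other': 'r--'}

# Individual bits can also be tested directly.
owner_can_write = bool(mode & stat.S_IWUSR)    # True
other_can_execute = bool(mode & stat.S_IXOTH)  # False
```

Each octal digit is the sum of read (4), write (2), and execute (1), so 7 is rwx, 5 is r-x, and 4 is r--.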
What is ELK Stack and why is it so popular for log management?
The ELK Stack is composed of three open-source products. ELK stands for Elasticsearch, Logstash, and Kibana.
🔹 Elasticsearch is a full-text search and analysis engine, leveraging Apache Lucene search engine as its core component.
🔹 Logstash collects data from all kinds of edge collectors, then transforms that data and sends it to various destinations for further processing or visualization.