A deep technical dive into all things Redis, covering various Redis topologies, data persistence, and process forking.
Redis (redis.io), short for “REmote DIctionary Server”, is an open-source key-value database server.
The most accurate description of Redis is that it's a data structure server. This specific nature of Redis has led to much of its popularity and adoption amongst developers.
Primarily, Redis is an in-memory database used as a cache in front of another "real" database like MySQL or PostgreSQL to help improve application performance. It leverages the speed of memory and takes load off the central application database.
There are several ways to deploy Redis; which one you go with depends heavily on scale and use case. For simple deployments, a single node is all you need. For more complicated, mission-critical workloads, you might want Redis Sentinel.
Many have thought about what happens when you can't store all your data in memory on one machine. Currently, the maximum RAM available in a single server is 24 TiB, presently listed online at AWS. Granted, that's a lot, but for some systems, that isn't enough. Thus Redis Cluster.
If we are going to use Redis to store any kind of data for safekeeping, it's important to understand how Redis does it. There are many use cases where losing the data Redis is storing is not the end of the world.
The coolest part of Redis, in my opinion, is how it leverages forking and copy-on-write to make data persistence performant. When you fork a process, the parent and child share memory, with pages copied only when one of them writes to them, and in that child process Redis begins the snapshotting (RDB) process.
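To make the fork idea concrete, here is a minimal sketch in Python. It is not how Redis itself is implemented (Redis is written in C and writes RDB files); the function names, the JSON file format, and the `visits` key are all assumptions made up for illustration. It only shows the core property: the child sees memory as it was at the moment of the fork, even if the parent keeps writing. It relies on `os.fork`, so it assumes a POSIX system.

```python
import json
import os
import tempfile


def snapshot_in_child(dataset, path):
    """Fork and let the child write `dataset` to `path`. Returns the child's pid."""
    pid = os.fork()
    if pid == 0:
        # Child: its view of memory is frozen at the moment of the fork.
        status = 1
        try:
            with open(path, "w") as f:
                json.dump(dataset, f)
            status = 0
        finally:
            # Exit right away so the child never runs the parent's code path.
            os._exit(status)
    return pid


def demo():
    data = {"visits": 1}  # hypothetical in-memory dataset
    fd, path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        pid = snapshot_in_child(data, path)
        data["visits"] = 2  # the parent keeps accepting writes after the fork
        os.waitpid(pid, 0)  # wait for the child to finish the snapshot
        with open(path) as f:
            snapshot = json.load(f)
    finally:
        os.remove(path)
    return data, snapshot


if __name__ == "__main__":
    live, snap = demo()
    print("live data:", live)
    print("snapshot: ", snap)
```

The parent's write after the fork never shows up in the snapshot, which is the point-in-time behavior that makes this approach attractive for persistence.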
It is often surprising how little is known about how databases operate at a surface level, considering they store almost all of the state in our applications. Things You Should Know About Databases.
An index is a data structure that helps decrease the look-up time for requested data. Indexes achieve this at the additional cost of storage, memory, and upkeep (slower writes), and in exchange they let us skip the tedious task of checking every table row.
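As a rough illustration of that trade-off, the sketch below compares a full scan with a lookup through a sorted index. The sorted list of `(value, row position)` pairs is a simplified stand-in for an ordered index, not a real database structure, and the `rows` table and function names are hypothetical.

```python
import bisect

# Hypothetical table: a list of rows in insertion order.
rows = [
    {"id": 1, "email": "carol@example.com"},
    {"id": 2, "email": "alice@example.com"},
    {"id": 3, "email": "bob@example.com"},
]


def full_scan(rows, email):
    """Without an index: check every row until one matches."""
    for row in rows:
        if row["email"] == email:
            return row
    return None


def build_index(rows, column):
    """Extra storage: sorted (value, row position) pairs, kept beside the table."""
    return sorted((row[column], pos) for pos, row in enumerate(rows))


def index_lookup(rows, index, value):
    """With an index: binary search, then jump straight to the row."""
    i = bisect.bisect_left(index, (value,))
    if i < len(index) and index[i][0] == value:
        return rows[index[i][1]]
    return None


email_index = build_index(rows, "email")
print(index_lookup(rows, email_index, "bob@example.com"))
```

The index has to be rebuilt or updated on every insert, which is where the slower writes come from.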
So here is where most developers go – I have seen this problem before; we need some dictionary (hash map) and a way to get to the specific row we are looking for. These are called index leaf nodes.
Your app is getting better. It has more features, more active users, and every day it collects more data. Your database is now causing the rest of your application to slow down.
Engineers often get caught up in doing things the most involved way, but keeping things simple early on makes the challenging things much easier later, as your application evolves. So if your problem goes away by getting machines with more resources, 9 times out of 10, that's the correct answer.
Sharding is an example of horizontal scaling, while vertical scaling means getting larger and larger machines to support the new workload.
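One common way to decide which shard a row lives on is to hash its key. The sketch below assumes a fixed number of shards and uses plain dictionaries as stand-ins for database servers; `shard_for`, `put`, and `get` are hypothetical helper names.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Four in-memory dicts stand in for four separate database machines.
NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]


def put(key: str, value) -> None:
    shards[shard_for(key, NUM_SHARDS)][key] = value


def get(key: str):
    return shards[shard_for(key, NUM_SHARDS)].get(key)


put("user:42", {"name": "Ada"})
print(get("user:42"))
```

A simple modulo scheme like this moves most keys when the shard count changes, which is one reason sharding is the more involved option compared with buying a bigger machine.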
With the breach of LastPass and everyone wondering if their vaults will be impervious to attacks, I thought it might be a good time to refresh our understanding of two types of encryption. Once we do that, let's talk about encryption in password managers.
There are two classes of encryption, but until 1976 symmetric key encryption was the only show in town. It involves a shared key used to encrypt and decrypt messages.
First up we have symmetric key encryption. This requires successfully sharing a secret key between both parties. If someone gets hold of this key, it allows them to decrypt any message they intercept and also to encrypt their own messages.
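The shared-key property can be shown with a toy one-time pad: the same key both encrypts and decrypts. This is only a sketch to illustrate the idea, assuming a key as long as the message that is used once; it is not what password managers use, and real systems should rely on a vetted cipher from an audited library. The helper names are hypothetical.

```python
import secrets


def generate_key(length: int) -> bytes:
    """Create a random shared key of the given length."""
    return secrets.token_bytes(length)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with the key. Applying it twice with the same key undoes it."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
key = generate_key(len(message))        # both parties must hold this key
ciphertext = xor_bytes(message, key)    # sender encrypts
recovered = xor_bytes(ciphertext, key)  # receiver decrypts with the same key
print(recovered)
```

Anyone who obtains `key` can run the same function to read or forge messages, which is exactly the key-sharing problem described above.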
Simon Willison (@simonw), the creator of Datasette, writes about SQLite and the challenges of building a server-side web application that also works in Electron and WebAssembly. It's a great read — I hope you enjoy it and hopefully learn something.
Datasette (datasette.io) is a tool for exploring and publishing data. The original goal of the project was to make it easy and inexpensive to publish structured datasets online.
The Python Global Interpreter Lock, or GIL, in simple words, is a mutex (or a lock) that allows only one thread at a time to hold control of the Python interpreter.
Python uses reference counting for memory management, and those reference counts would be exposed to race conditions if multiple threads could modify them at the same time.
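To see the reference count that needs protecting, the small sketch below watches it change as references are added and removed. It assumes CPython, where `sys.getrefcount` is available; the absolute numbers vary between versions, so the example only looks at the difference. The function name `refcount_delta` is hypothetical.

```python
import sys


def refcount_delta():
    """Return how the refcount of an object changes as references come and go."""
    obj = []      # a fresh object whose count we will watch
    holders = []  # a container used to hold extra references

    before = sys.getrefcount(obj)
    holders.append(obj)  # one more reference: the count goes up
    after_add = sys.getrefcount(obj)
    holders.pop()        # reference dropped: the count goes back down
    after_remove = sys.getrefcount(obj)

    return after_add - before, after_remove - before


print(refcount_delta())
```

Each of those increments and decrements is a read-modify-write on a shared counter. If two threads did that at once without a lock, an update could be lost and an object could be freed while still in use, or never freed at all.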
There is often a level of focus on the bigger picture when it comes to system design, but we often don't think about the underlying components in these systems. So let's chat about different levels of memory.
Over the years, memory has increased in capacity and in speed. As you can see in the chart below, it has been following a trajectory called Moore's law.
There are several types of RAM; the two main types are SRAM and DRAM. SRAM is more closely associated with CPU caches and provides lower latency but is more expensive. Meanwhile, DRAM is slower but much cheaper and can be packed densely, which makes it ideal for main memory.