I knew the company was heavy on SQL, so I spent 3 days refreshing Normalization concepts.
But the questions were about Denormalization.
In Normalization, you reduce data duplication. In Denormalization, you deliberately add redundancy to make reads cheaper.
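To make the contrast concrete, here's a minimal sketch using Python's built-in sqlite3 module. The customers/orders schema is hypothetical, invented for illustration (and reused in the sketches below):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: customer details live in exactly one place.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    total       REAL NOT NULL
);

-- Denormalized: the customer name is copied onto every order,
-- trading redundancy for join-free reads.
CREATE TABLE orders_flat (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL,
    customer_name TEXT NOT NULL,  -- redundant copy
    total         REAL NOT NULL
);
""")
```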
When do you need it?
- When you have reporting or analytical requirements where complex queries are common.
- In read-heavy applications, where updates and inserts are infrequent.
- When you want to cache precomputed or derived values alongside the data they describe.
- When you have hierarchical data that would otherwise require many self-joins to read.
How to denormalize?
1️⃣ Identify the Tables to Denormalize:
Focus on tables that are frequently queried for reporting or analytical purposes.
2️⃣ Choose the Columns to Denormalize:
These are often the columns used in joins or in frequent aggregations. Sometimes you should also copy in related data to simplify queries.
3️⃣ Create New Tables or Add Columns:
You can create new tables to store the denormalized data or add new columns to existing tables.
These new columns will contain the redundant data.
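Continuing the hypothetical schema from the first sketch, both options could look like this:

```python
conn.executescript("""
-- Option A: widen the existing table with a redundant column.
ALTER TABLE orders ADD COLUMN customer_name TEXT;

-- Option B: a separate reporting table that flattens the join.
CREATE TABLE order_report (
    order_id      INTEGER PRIMARY KEY,
    customer_name TEXT,
    total         REAL
);
""")
```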
4️⃣ Populate Denormalized Data:
Populate the denormalized tables or columns with data from the original normalized tables. This can be a one-time or ongoing data migration.
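For example, a one-time backfill of the redundant column added above (the sample rows are made up):

```python
conn.executescript("""
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders (order_id, customer_id, total)
VALUES (10, 1, 99.0), (11, 2, 25.0);

-- Copy each customer's name onto their orders.
UPDATE orders
SET customer_name = (
    SELECT name FROM customers
    WHERE customers.customer_id = orders.customer_id
);
""")
```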
5️⃣ Maintain Data Consistency:
Make sure the denormalized data stays in sync with the original normalized data.
To keep it updated, you may need one of the following (a trigger sketch follows this list):
- Triggers
- Stored procedures
- Batch processes
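For example, a trigger in the running sqlite3 sketch: when a customer is renamed, every redundant copy follows along.

```python
conn.executescript("""
CREATE TRIGGER sync_customer_name
AFTER UPDATE OF name ON customers
BEGIN
    -- Propagate the new name to every denormalized copy.
    UPDATE orders
    SET customer_name = NEW.name
    WHERE customer_id = NEW.customer_id;
END;
""")
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE customer_id = 1")
```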
6️⃣ Optimize Queries:
Rewrite your queries to take advantage of the denormalized structure.
Simplify complex joins and aggregations, making your queries faster and more efficient.
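Continuing the sketch, the read that used to need a join becomes a single-table scan:

```python
# Before: every read pays for a join.
with_join = """
    SELECT o.order_id, c.name, o.total
    FROM orders o
    JOIN customers c ON c.customer_id = o.customer_id
"""
# After: the denormalized column makes the read join-free.
without_join = "SELECT order_id, customer_name, total FROM orders"

print(conn.execute(with_join).fetchall())
print(conn.execute(without_join).fetchall())  # same rows, no join
```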
𝗣𝗿𝗼𝘀:
- Denormalization can significantly improve the speed of data retrieval by removing joins from frequent reads.
- It simplifies complex queries, making them easier to write and understand, which is good for reporting and analysis.
- You can precompute and store aggregated data, such as sums or averages, to avoid expensive calculations at query time (see the sketch after this list).
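A minimal sketch of that last point, still on the hypothetical schema: a summary table stores the aggregates once instead of recomputing them on every query.

```python
conn.executescript("""
-- Precomputed aggregates: one row per customer.
CREATE TABLE customer_totals AS
SELECT customer_id,
       COUNT(*)   AS order_count,
       SUM(total) AS lifetime_total
FROM orders
GROUP BY customer_id;
""")
print(conn.execute("SELECT * FROM customer_totals").fetchall())
```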
𝗖𝗼𝗻𝘀:
- Adding redundancy can lead to data inconsistencies. You must carefully manage updates and inserts to maintain data integrity.
- Storing more data consumes more disk space.
Does your Database need Denormalization?
Great question!
Views allow for dynamic denormalization.
You can define views that join multiple tables and present denormalized data without actually storing it redundantly.
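In the running sketch, a view gives you the denormalized shape with none of the redundant storage:

```python
conn.executescript("""
-- Computed at query time; nothing is stored twice.
CREATE VIEW order_details AS
SELECT o.order_id, c.name AS customer_name, o.total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id;
""")
print(conn.execute("SELECT * FROM order_details").fetchall())
```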
If you need to prioritize query performance and can accept the trade-offs associated with data redundancy and synchronization, denormalizing with tables is a viable option.
On the other hand, if data integrity and reduced maintenance complexity are critical, views are a better choice for presenting denormalized data dynamically without storing it redundantly.
In many cases, a combination of both approaches is the winning formula.
• • •
Talking to a Junior developer, I was stunned at how confused he was about 3 primary concepts.
But I also remembered that nobody explained these concepts back in my college days.
I hope this helps some young engineers out there.
1. Hashing: this is a one-way process used for data integrity verification.
It converts input data (plaintext) into a fixed-length string (hash value) using a hash function.
Hashing helps ensure data hasn't been tampered with, but it's irreversible!
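A quick sketch with Python's standard hashlib module:

```python
import hashlib

data = b"hello world"
print(hashlib.sha256(data).hexdigest())  # fixed-length digest

# Changing a single byte changes the digest completely, which is
# how tampering gets detected. The digest can't be reversed.
print(hashlib.sha256(b"hello world!").hexdigest())
```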
2. Encryption: this is the real deal when it comes to data security.
It transforms data into an unreadable format (ciphertext) using an algorithm and a secret key. The ciphertext can be decrypted back to the original data with the right key.
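A short sketch using the third-party cryptography package (pip install cryptography); Fernet is just one convenient symmetric scheme:

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # the secret key
f = Fernet(key)

ciphertext = f.encrypt(b"sensitive data")  # unreadable without the key
plaintext = f.decrypt(ciphertext)          # reversible with the key
assert plaintext == b"sensitive data"
```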
Over the last 6 years, I learned something I’ll never forget:
Frameworks and languages come and go, but the fundamentals always stay with you.
Here are 5 ideas you need to master:
1. Problem Decomposition
Breaking down complex problems into manageable parts is the first step.
Techniques like Divide and Conquer have been used since the earliest days of computing.
This technique reduces complexity and makes problem-solving more approachable.
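Merge sort is the classic illustration of divide and conquer; here's a minimal Python version:

```python
def merge_sort(items):
    # Divide: split the problem into two smaller subproblems.
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])

    # Combine: merge the two sorted halves back together.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
```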
2. Algorithm Analysis
Understanding algorithms and their efficiency is essential. Knowing their time and space complexity lets you choose the most appropriate one for a problem.
Proficiency in algorithm analysis allows you to optimize code and create efficient solutions.
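A tiny, illustrative benchmark: the same membership test is O(n) on a list but O(1) on average for a set, and the gap grows with input size:

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)

# Worst case for the list: the element we look up sits at the end.
print(timeit.timeit(lambda: n - 1 in as_list, number=100))  # linear scan
print(timeit.timeit(lambda: n - 1 in as_set, number=100))   # hash lookup
```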