Bad data can have a big impact on a company's bottom line.
Poor-quality data is frequently blamed for operational blunders, incorrect analytics, and poorly thought-out company initiatives.
What should be reviewed?
Organizations can detect data mistakes that need to be fixed and determine whether the data in their IT systems is suitable for the intended use by measuring data quality levels.
1. Check for completeness and uniqueness. Is any data missing? Are any entries duplicated?
2. Check for accuracy and consistency. Are the formulas correct and consistent? Is the entered data accurate?
3. Check for conformance and validity. Do the data meet required specifications?
4. Check for timeliness. Is the data up to date? Is it readily available?
5. Review the provenance. What is the source of this data? Can it be relied upon?
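The first check in the list above (completeness and uniqueness) is easy to automate. Here is a minimal sketch in plain Python. The sample records, the field names and the helper names `missing_counts` and `duplicate_keys` are made up for illustration; in practice you would run the same idea in whatever tool your firm uses.

```python
# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "email": "a@example.com", "amount": 120.0},
    {"id": 2, "email": None, "amount": 75.5},
    {"id": 2, "email": "b@example.com", "amount": 75.5},
    {"id": 3, "email": "c@example.com", "amount": None},
]

def missing_counts(rows, fields):
    """Completeness: count rows where each field is None or an empty string."""
    return {f: sum(1 for r in rows if r.get(f) in (None, "")) for f in fields}

def duplicate_keys(rows, key):
    """Uniqueness: return the key values that appear more than once."""
    seen, dupes = set(), set()
    for r in rows:
        value = r[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)

print(missing_counts(records, ["id", "email", "amount"]))
print(duplicate_keys(records, "id"))
```

A non-zero count or a non-empty duplicate list does not say what the right value is. It only tells you where to look.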
Thankfully, data quality issues can be caught at every stage of the analysis work. The stages are:
1. Fix quality issues before or as the data is captured
2. Detect & fix issues inside the source system
3. Detect & fix issues in ETL: Audit, Balance & Control; standardize during load & ensure referential integrity.
4. Detect & fix issues inside the database
5. Detect & fix issues in report or analysis
Avoid damaging the business with incorrect analysis.
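To make the ETL stage concrete, here is a rough sketch of a balance check (row counts and control totals between source and target) and a referential integrity check (child keys with no matching parent). The tables, field names and function names are hypothetical, and a real pipeline would usually run these checks in SQL or in the ETL tool itself.

```python
# Hypothetical source extract, loaded target rows and parent table.
source_orders = [
    {"order_id": 1, "customer_id": 10, "amount": 100.0},
    {"order_id": 2, "customer_id": 11, "amount": 250.0},
    {"order_id": 3, "customer_id": 99, "amount": 40.0},
]
target_orders = source_orders[:2]  # simulate one row lost during the load
customers = [{"customer_id": 10}, {"customer_id": 11}]

def balance_check(source, target, amount_field):
    """Compare row counts and control totals between source and target."""
    source_total = sum(r[amount_field] for r in source)
    target_total = sum(r[amount_field] for r in target)
    return {
        "row_count_diff": len(source) - len(target),
        "amount_diff": round(source_total - target_total, 2),
    }

def orphan_keys(child_rows, parent_rows, key):
    """Referential integrity: child key values with no matching parent row."""
    parent_keys = {p[key] for p in parent_rows}
    return sorted({c[key] for c in child_rows} - parent_keys)

print(balance_check(source_orders, target_orders, "amount"))
print(orphan_keys(source_orders, customers, "customer_id"))
```

Both differences should be zero and the orphan list should be empty after a clean load. Anything else is worth investigating before the data reaches a report.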
The best tool for data analysis is your brain. Develop it.
Different firms use different tools. What matters is having the data analysis skills, so you can apply them to any BI tool.
The tools listed here are examples, not endorsements.
1. Database Systems: Used to create, extract & maintain data in databases.
E.g. Microsoft Access, MySQL, PostgreSQL, Microsoft SQL Server, IBM DB2, Oracle, Teradata etc.
2. Standard Reporting: Used to manipulate or show data in a consistent, repeated manner.
E.g. Microsoft Excel, Microsoft SQL Server, Oracle OBIEE, Cognos, MicroStrategy etc.
I know how difficult it can be to ask the right questions that will generate the key insights you need to solve a problem or make a data-driven decision.
Here are 20 key questions to help you understand and solve problems with data.
1. Why was I asked to review this? – Problem Statement/Purpose
2. What does the product/process do? – Domain knowledge
3. Where are we now? – Actual Performance
4. Where should we have been? – Plan/budget
5. Did we achieve what we planned to achieve? – Variance Analysis
6. Why did we not achieve what we planned to achieve? – Root cause analysis
7. Where are the gaps in our process/strategy? – Gap Analysis
8. What did we not consider before? – Gap Analysis
9. What has changed within this period? – Trend Analysis
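Questions 3 to 5 above (actual performance, plan, variance) boil down to simple arithmetic. Here is a small sketch, assuming made-up monthly figures and a hypothetical `variance` helper:

```python
def variance(actual, plan):
    """Return (absolute variance, percentage variance) of actual against plan."""
    diff = actual - plan
    pct = diff * 100 / plan if plan else None  # percentage is undefined for a zero plan
    return diff, pct

# Hypothetical monthly figures.
plan = {"Jan": 1000, "Feb": 1200, "Mar": 1100}
actual = {"Jan": 950, "Feb": 1260, "Mar": 1100}

for month in plan:
    diff, pct = variance(actual[month], plan[month])
    pct_text = f"{pct:+.1f}%" if pct is not None else "n/a"
    print(f"{month}: variance {diff:+} ({pct_text})")
```

The number only answers "did we achieve the plan?". The "why" in question 6 still needs root cause analysis.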
Let's discuss data models & the benefits of having a solid one.
One of a data analyst's key responsibilities is building a strong data model.
"Data modeling" refers to the process of arranging data into tables based on relationships & groups, to minimize duplication and maximize efficiency.
Doing this well makes your data easier for people to understand, which in turn makes it easier for both you and them to create useful reports and dashboards.
It is challenging to provide a set of guidelines for what constitutes a good data model because every piece of data is unique and is used in different ways.
That said, a smaller data model is generally preferable because it will run faster and be easier to use.
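As an illustration of "arranging data into tables to minimize duplication", here is a minimal sketch that splits a flat export into a customer table and a slimmer sales table linked by a key. The data, the column names and the `split_customers` function are hypothetical; your BI tool or database would normally do this with its own modeling features.

```python
# Hypothetical flat export: customer details repeat on every sale.
flat_sales = [
    {"sale_id": 1, "customer": "Acme", "city": "Lagos", "amount": 500},
    {"sale_id": 2, "customer": "Acme", "city": "Lagos", "amount": 300},
    {"sale_id": 3, "customer": "Zenith", "city": "Abuja", "amount": 700},
]

def split_customers(rows):
    """Split flat rows into a customer table and a sales table that references it."""
    customer_ids = {}  # (name, city) -> generated customer_id
    customers, sales = [], []
    for r in rows:
        key = (r["customer"], r["city"])
        if key not in customer_ids:
            customer_ids[key] = len(customer_ids) + 1
            customers.append({"customer_id": customer_ids[key],
                              "customer": r["customer"],
                              "city": r["city"]})
        sales.append({"sale_id": r["sale_id"],
                      "customer_id": customer_ids[key],
                      "amount": r["amount"]})
    return customers, sales

customers, sales = split_customers(flat_sales)
print(customers)
print(sales)
```

Each customer's name and city are now stored once, and every sale carries only the `customer_id` that points to them.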
You know how frustrating changing requirements from stakeholders can be.
Besides following a clear change management process, developing mental agility will boost your efficiency and overall productivity.
What is mental agility?
Mental agility is the ability to think and apply insights quickly from one context to another.
In simpler terms, it is how quickly your mind can adjust to new conditions or ideas.
Thankfully, your mental agility can be improved.
Use these 5 simple ways to get started.
1. Be curious. Ask “Why” & “What If”.
2. Read, observe & listen. Read widely & learn to listen to understand not to reply.
3. Be less defensive. Have an open mind.
4. Schedule time to meditate & think.
5. Gain domain knowledge. This helps to understand other possible use cases.