Let's break down each question 1. Logic: Show the classes, where the number of student is more than 5. The word "At least" tells us something, we have to include 5 as part of our filtering condition, hence >=5 2. This looks pretty straight forward. For each user, we want to....
..return the number followers, hence the need to use the Group BY clause. This clause allows us to aggregate data against the unique categorical/qualitative values in a field. 3. The given SQL query finds the maximum value (num) from the "mynumbers" table that appears only once.
It first identifies the numbers that occur exactly once by grouping and filtering, and then selects the maximum value from those numbers.
Approach
The query starts with the inner subquery (select num from mynumbers group by num having count(num) = 1) as n.
The subquery groups the numbers (num) from the "mynumbers" table and filters only those numbers that have a count of 1, meaning they appear only once.
The outer query selects the maximum value from the numbers obtained in the subquery using the MAX function and aliases it as num
The query retrieves the result, which is the maximum value (num) that appears only once. 4. This question requires the use of the Rank window function. Tbh, I'm always confused on how to differentiate Rank from Dense Rank, until I had to use the Olympics..
Award/Podium ceremony for athletes. So, here's how I'll always remember. Dense rank takes the idea of the Olympics award/podium rankings. For instance, this occurs when 2 athletes gets tied on the silver medals. The next person gets the bronze medal or 3rd. I.e no position was...
Skipped. The Rank on the other hand is the opposite. The rank skips 2nd position/silver to the 4th position.
Back to our question. We were told to rank the scores. And if there is a tie between 2 scores, the next ranking number should be the next consecutive int. Value...
..indicating that we should not skip any number/position.
With our Olympic analogy, we should be confident to know we have to use the Dense rank function.
Thank you so much for staying till the end. 💛Merci.,
• • •
Missing some Tweet in this thread? You can try to
force a refresh
Let's break them down: 1. Intuition
The query aims to retrieve information about managers in the Employees table, including their IDs, names, the number of employees reporting to them, and the average age of their reports.
The solution involves using a self join. Since we were told that a manager is also an employee, we will join the table to itself, where the employee_id field is equal reports_to field. On this premise, we aggregate the number of employees who report directly to the manager,
A few days back, 2 Data Analyst newbies showed me mistakes they encountered during their analysis process. It dawned on me that lots of times, we don't diligently follow through the data analysis process, especially the cleaning phase.
In this short🧵, I break down this process.
PHASE 1: ASK QUESTIONS
The first step in the analysis process is to formulate a problem to solve or a question to answer, such as: are customers aged 21 to 30 more likely to churn than customers aged 41 to 50? As analysts, we must be very curious, we must ASK QUESTIONS!
PHASE 2: Collect & Store Data
Next, we need to collect and store the necessary data, which could require the use of a database or a spreadsheet. For large datasets, we store them in a database. For datasets shy of a million records, spreadsheet software like Excel can be used.
Let's break down each code 1. The ending part to the question is where the major conditional logic lies. We were told to return the needed columns, and price for the first year of every product sold. Since we do not have distinct column showing min years for each product,
We have to create that with a subquery in the Where clause. We are concerned about the first year of every product sold, so we aggregate the min years for each product. After doing that, we tell SQL to return the prices of products, the year, and qty when the product was first..