Question for SQL experts: #STOIC has a much richer typing system than @duckdb. Therefore, when we create a column like "revenue", we want to type it (Number) and give it a full name (Total Revenue). What would be the best way to add that information directly in the SQL code?
To be clear, we need a solution that would be compatible with @duckdb's SQL parser and would be as idiomatic as possible. Otherwise, we'll have to invent our own syntax and do some pre-processing. That's always an option, but I'd like to avoid it if possible.
And for the record, here is the full list of our datatypes:
Yesterday, I described a version of our upcoming SQL query designer that focused on making it easier to compose SQL queries, while preserving SQL's hierarchical structure. Today, I want to explore an alternative path.
Instead of taking a SQL-oriented approach, I want to take a task-oriented approach. And I want to free myself from SQL's hierarchical structure, while still producing a well-formed SQL query in the end.
The idea is similar to what Excel's pivot table does: give the user a relatively simple graphical user interface to pivot a table across multiple dimensions, and produce a query from it (be it a SQL or a DAX query).
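To make the pivot-table analogy concrete, here is a minimal sketch of how a task-oriented specification could be turned into a well-formed SQL query. Everything in it is an assumption made for illustration: the `PivotSpec` structure, the `pivot_to_sql` function, the list of allowed aggregates, and the table and column names are hypothetical and do not describe the actual designer.

```python
from dataclasses import dataclass

# Hypothetical whitelist of aggregate functions the UI would expose.
ALLOWED_AGGREGATES = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


@dataclass
class PivotSpec:
    """What the user picked in the UI (hypothetical structure)."""
    table: str
    group_by: list[str]                   # dimensions to pivot across
    measures: dict[str, tuple[str, str]]  # alias -> (aggregate, column)


def quote_ident(name: str) -> str:
    """Quote an identifier the standard SQL way, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def pivot_to_sql(spec: PivotSpec) -> str:
    """Build a SELECT ... GROUP BY query from a flat pivot specification."""
    dims = [quote_ident(c) for c in spec.group_by]
    aggs = []
    for alias, (func, column) in spec.measures.items():
        func = func.upper()
        if func not in ALLOWED_AGGREGATES:
            raise ValueError(f"unsupported aggregate: {func}")
        aggs.append(f"{func}({quote_ident(column)}) AS {quote_ident(alias)}")
    if not dims and not aggs:
        raise ValueError("the pivot needs at least one dimension or measure")
    sql = f"SELECT {', '.join(dims + aggs)} FROM {quote_ident(spec.table)}"
    if dims:
        sql += " GROUP BY " + ", ".join(dims)
    return sql


spec = PivotSpec(
    table="sales",
    group_by=["region"],
    measures={"Total Revenue": ("sum", "revenue")},
)
print(pivot_to_sql(spec))
```

The user never sees SQL's nested structure here: they fill in a flat specification, and the query is assembled from it at the end.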
One of #STOIC's most useful features is its signature Summary Charts: the bar charts displayed at the top of every column in a table. They work really well, unless your table has very few rows. Here is how we'll improve them for some interesting corner cases.
Relative Baseline in Bar Plots
When a table has fewer than 50 rows, we replace the Histogram displayed for a numerical column with a Bar Plot visualizing discrete values, but we keep 0 as the baseline. We should use MIN or MAX instead, as we do for cell summaries below.
Bar Plot with All Equal Values
When a Bar Plot is produced for a set of values that are all equal, we want that to be obvious at a glance. To do so, the length of the bars will be reduced by 50%, while keeping the plot horizontally centered.
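The two corner cases above can be sketched in a few lines. This is only an illustration under stated assumptions: the `bar_extents` function is hypothetical, it uses MIN as the relative baseline (MAX would work symmetrically), and it reads "horizontally centered" as meaning that each half-length bar sits in the middle of the plot width.

```python
def bar_extents(values, full_width=1.0):
    """Return an (offset, length) pair per bar, in units of the plot width.

    Assumptions (not confirmed by the source): MIN is the baseline, and
    the half-length bars for all-equal values are centered in the plot.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values equal: draw every bar at 50% length, centered,
        # so the situation is visible at a glance.
        length = 0.5 * full_width
        offset = (full_width - length) / 2
        return [(offset, length) for _ in values]
    # Relative baseline: MIN maps to a zero-length bar, MAX to a full bar.
    span = hi - lo
    return [(0.0, (v - lo) / span * full_width) for v in values]


print(bar_extents([10, 20, 30]))
print(bar_extents([7, 7, 7]))
```

Note that with a MIN baseline, a set of equal values would otherwise collapse to zero-length bars, which is one reason the all-equal case needs its own rule.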
Things that I love about @berkeleygfx's design philosophy:
Emergent over prescribed aesthetics.
Expose state and inner workings.
Dense, not sparse.
Performance is design.
Beauty emerges automatically without deliberation.
Do not infantilize users.
"Emergent over prescribed aesthetics."
Our core UI design was totally emergent. We did not follow any trends. We just tried to render as much qualitative and quantitative information as possible, in a well-organized fashion. Aesthetics were a mere by-product.
"Expose state and inner workings."
That's exactly what we do when we expose the "Runtime Boundary" on the left panel: Lambda functions upstream, EC2 monostore downstream.
One of the main factors holding back the design of good data tools for less technical users is our lack of trust in their ability to understand complex mathematical concepts like logarithmic scales or percentiles.
While it is true that the vast majority of such users are not yet familiar or comfortable with these concepts today, they could learn about them over time, and we need to design user interfaces that can facilitate this learning.
This is why I am such a strong proponent of what I call progressive user interfaces. These are user interfaces that use advanced concepts surreptitiously. These concepts are there right in front of the user, yet might not be noticed at first, and don't get in the way.
This thread introduced our upcoming SQL query designer and its two modes: graphical and textual. This other thread will cover the graphical mode in more detail, reviewing each clause of SQL's SELECT statement.
This thread introduced the SQL vs. Tabula discussion. This other thread will cover our SQL query designer. This tool will focus on @duckdb's SQL dialect, but it will be designed to work with virtually any SQL cloud database (@SnowflakeDB, @databricks, etc).
There are many ways to design a SQL query designer, and to design a good one, two critical questions must be answered first: who is the audience, and what is their working environment?
For #STOIC, we must serve two very different user groups:
1. Users who are new to SQL
2. Users who are SQL experts