Here's a fun challenge: given an array of datetimes, what's the best way to plot those on a frequency graph over time?

They might all be on the same day, or they might be spread out over several years - so the challenge is automatically picking the most interesting bucket size.
[Image: an example of the kind of chart I want to produce]
Looks like d3.bin().thresholds() is the answer I'm looking for: observablehq.com/@d3/histogram
Yup, this works perfectly: observablehq.com/@simonw/my-twe…
[Image: histogram of my tweets over time, generated with the followi…]
Turns out the default implementation of d3.bin().thresholds() without any arguments is to use Sturges' formula, which is covered on the Wikipedia Histogram page - there are a whole load of smart mathematical solutions to my original question en.wikipedia.org/wiki/Histogram…
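To make Sturges' formula concrete, here is a minimal Python sketch. It picks ceil(log2(n)) + 1 buckets for n values and then counts datetimes into equal-width buckets. The function names `sturges_bin_count` and `bin_datetimes` are made up for illustration, and the equal-width bucketing is a simplification: d3 additionally rounds its thresholds to "nice" values, which this sketch does not attempt.

```python
import math
from datetime import datetime, timedelta


def sturges_bin_count(n):
    """Sturges' formula: ceil(log2(n)) + 1 bins for n observations."""
    return max(1, math.ceil(math.log2(n)) + 1)


def bin_datetimes(datetimes, bin_count):
    """Count datetimes into bin_count equal-width buckets (hypothetical helper)."""
    start, end = min(datetimes), max(datetimes)
    # Fall back to a one-second bucket if every datetime is identical
    width = (end - start) / bin_count or timedelta(seconds=1)
    counts = [0] * bin_count
    for dt in datetimes:
        # The latest value lands exactly on the upper edge, so clamp it
        index = min(int((dt - start) / width), bin_count - 1)
        counts[index] += 1
    return start, width, counts


if __name__ == "__main__":
    # Eight made-up datetimes, one per day
    sample = [datetime(2021, 1, 1) + timedelta(days=i) for i in range(8)]
    start, width, counts = bin_datetimes(sample, sturges_bin_count(len(sample)))
    for i, count in enumerate(counts):
        print(start + i * width, count)
```

Because the formula depends only on the number of values, it gives the same bucket count whether the datetimes span a day or several years; only the bucket width changes.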
I added a tooltip to the chart showing which time period each line in the histogram represents, and wrote up the whole thing in a TIL: til.simonwillison.net/observable-plo…
d3 may use Sturges' formula, but it turns out Observable Plot, which I used for my final chart, actually uses Scott’s rule instead!
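For comparison, here is a rough Python sketch of Scott's rule, which sets the bucket width from the spread of the data: h = 3.49 × standard deviation × n^(-1/3). The function name `scott_bin_count` is hypothetical, and using the sample standard deviation over seconds since the earliest value is my assumption; this is not a port of Observable Plot's code.

```python
import math
import statistics
from datetime import datetime, timedelta


def scott_bin_count(datetimes):
    """Number of buckets suggested by Scott's rule (illustrative sketch)."""
    # Work in seconds since the earliest value so we can do arithmetic
    start = min(datetimes)
    seconds = [(dt - start).total_seconds() for dt in datetimes]
    n = len(seconds)
    # Sample standard deviation; undefined for a single value, so treat as 0
    spread = statistics.stdev(seconds) if n > 1 else 0.0
    if spread == 0:
        return 1  # all values identical: a single bucket
    width = 3.49 * spread * n ** (-1 / 3)
    # max(seconds) is the full range, because the minimum is 0 by construction
    return max(1, math.ceil(max(seconds) / width))


if __name__ == "__main__":
    # Eight made-up datetimes, one per day
    sample = [datetime(2021, 1, 1) + timedelta(days=i) for i in range(8)]
    print(scott_bin_count(sample))
```

Unlike Sturges' formula, the result here depends on how the values are distributed and not just on how many there are.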


More from @simonw

9 Aug
Is there a reliable way to tell search engine crawlers that a site hasn't been updated in X days so they don't need to re-crawl it?

Do they tend to believe the <lastmod> element in sitemap.xml? And can I set that to apply to the whole site, not just an individual page?
Asking because tailing logs shows a vast amount of crawler traffic to Datasette instances that haven't seen any data changes in over a year - I may have to robots.txt block crawlers from them to save on costs, but I'd rather tell them "no point in crawling, nothing has changed"
Datasette currently has a plugin for configuring robots.txt, but I'm beginning to think it should be part of core and crawlers should be blocked by default - having people explicitly opt-in to having their sites crawled and indexed feels a lot safer datasette.io/plugins/datase…
28 Jul
Finally published my article describing the Baked Data architectural pattern, which I define as "bundling a read-only copy of your data alongside the code for your application, as part of the same deployment"
simonwillison.net/2021/Jul/28/ba…
I've been exploring this pattern for a few years now. It lets you publish sites to serverless hosts such as @vercel or @googlecloud Cloud Run that serve content from a read-only database (usually SQLite) - so they scale horizontally and can reboot if something breaks
It effectively gives you many of the benefits of static site publishing - cheap to host, hard to break, easy to scale - while still supporting server-side features such as search engines, generated Atom feeds and suchlike
26 Jul
Anyone know of examples of SaaS apps that deploy stuff to your AWS account (lambdas, S3 buckets etc) using IAM credentials that you grant to the SaaS app? Is this a pattern anywhere?
I've seen examples of apps that will write your data to an S3 bucket that you own - various logging tools do this
Here's Fastly's documentation on setting that up: docs.fastly.com/en/guides/log-…
11 Jun
A silly thing that puts me off using Docker Compose a bit: I frequently have other projects running on various 8000+ ports, and I don't like having to shut those down before running "docker-compose up"

Is there a Docker Compose pattern for making the ports runtime-configurable?
I'd love to be able to run something like this:

cd someproject
export WEB_PORT=3003
docker-compose up

And have the project's server run on localhost:3003 without any risk of clashing with various other Docker Compose AND non-Docker-Compose projects I might be running
This looks like exactly what I want!
[Image: The “.env” file. You can set default values for any envir…]
10 May
Announcing Django SQL Dashboard, now out of alpha and ready for people to try out on their own Django+PostgreSQL projects: simonwillison.net/2021/May/10/dj…
The key idea here is to bring some of the most valuable features of Datasette to any Django+PostgreSQL project

You can execute read-only SQL queries interactively, bookmark and share the results, and write queries that produce bar charts, progress bars and even word clouds too
I recorded a three minute video demo which shows the tool in action
10 May
Out of interest: if you have a blob of JSON on your clipboard and you want to see a pretty-printed version of it, what's your fastest way to do that?
I hit Shift+Command+N in VSCode to get a new window, paste it in there, then hit Shift+Command+P to get the command palette, type JS and select the JSON pretty print option - which I think I installed as an extension at some point
Other times I'll use "pbpaste | jq", occasionally I'll use ipython like so:

s = """<paste JSON here>"""
import json
print(json.dumps(json.loads(s), indent=2))
