I realized this can be coded up in < 5 minutes in the browser via @UsePenrose, and gave it a try. Pretty darn close! 🧵
[2/n] To be clear, this 🧵 isn't about finding a better packing—or even finding it faster. Wizards like @KangarooPhysics surely have better tricks up their sleeves 🪄
Instead, it's about an awesome *tool* for quickly whipping up constraint-based graphics: penrose.cs.cmu.edu
[3/n] The "17 squares" problem provides a great demonstration of how Penrose works.
In fact, if you want to use this thread as a mini-tutorial, you can try it out at penrose.cs.cmu.edu/try/
[4/n] The first step is to just say what type of objects we have in our domain of interest.
In this case, there's just one type of object: squares!
So, we write "type Square" in the domain tab:
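That single line is, in fact, the entire domain program:

    -- domain: squares are our only type of object
    type Square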
[5/n] Next, we need to say how squares should get "styled," i.e., how they should get arranged and drawn. So, let's open up the style tab:
[6/n] The first thing is to just say how large our drawing canvas should be.
Since the drawing will be forced to stay on the canvas, we can use the canvas itself as the outer bounding square!
The smaller we make the width/height, the harder it will be to pack our 17 squares…
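Concretely, that's just the canvas block at the top of the style program; for the 100 x 100 canvas we'll start with, it looks like this:

    -- style: the canvas doubles as our outer bounding square
    canvas {
       width = 100
       height = 100
    }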
[7/n] Next up: how should we draw each Square?
Well, we can make a rule for that: for all squares, draw a polygon with four corners, centered around the origin.
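A sketch of that rule (the property names follow Penrose's built-in Polygon shape; the corners here are hard-coded to a unit square):

    -- draw every Square as a four-cornered polygon centered at the origin
    forall Square S {
       S.icon = Polygon {
          points : [ (-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5) ]
       }
    }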
[8/n] So far we've said that we want to draw squares, and have specified how to draw a square.
But to see something actually pop up on the screen, we have to actually make a square!
So, let's head over to the "substance" tab and just type "Square S".
[9/n] Hit "compile" and you should now see a lonely square.
…not that exciting!
But it's gonna get way more interesting really fast.
[10/n] Let's hop back over to the style tab, where all the action happens.
First, we'll give our square a variable translation (x,y) and rotation θ, using some standard 2D vector math.
The ? mark means Penrose will fill in the blanks for us (e.g., by picking random values).
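Roughly, the updated rule might look like this (a sketch: it assumes cos, sin, and basic vector arithmetic are available in the style function library, and uses ? for every value Penrose should solve for):

    forall Square S {
       -- unknown translation (x, y) and rotation θ, chosen by the optimizer
       S.x = ?
       S.y = ?
       S.theta = ?
       S.c = (S.x, S.y)

       -- half-edge vectors of the rotated unit square
       S.u = 0.5 * (cos(S.theta), sin(S.theta))
       S.v = 0.5 * (-sin(S.theta), cos(S.theta))

       -- the four rotated, translated corners
       S.icon = Polygon {
          points : [ S.c - S.u - S.v,
                     S.c + S.u - S.v,
                     S.c + S.u + S.v,
                     S.c - S.u + S.v ]
       }
    }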
[11/n] If we hit "compile▸" again, we now get a random square fitting this description.
Hit "resample", and we get a different random square.
Notice that the square always stays within the canvas—which will be essential for our packing problem!
[12/n] Finally—and this is really where the magic happens—we just tell Penrose that we don't want any squares to overlap.
In particular, for all pairs of distinct squares S and T, we ask that their "icons" (the shapes we defined in the previous rule) should be disjoint.
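In the style program, that's one more rule with a single ensure statement (disjoint is one of Penrose's built-in constraints):

    -- no two distinct squares may overlap
    forall Square S; Square T {
       ensure disjoint(S.icon, T.icon)
    }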
[13/n] Of course, since we only asked for one square, this rule is gonna be really easy to satisfy!
So, let's go back to the "substance" tab and this time ask for all seventeen:
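That's just seventeen declarations (the names are arbitrary):

    -- substance: seventeen squares to pack
    Square S1, S2, S3, S4, S5, S6, S7, S8, S9
    Square S10, S11, S12, S13, S14, S15, S16, S17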
[14/n] If we now hit "compile", we should see something that looks like confetti—because our bounding box is super huge (100 x 100):
[15/n] But what if we try a smaller box, say, 10 x 10 (by changing the canvas size at the top of the "style" program)?
More interesting…
[16/n] Maybe to see what's going on we should draw a background for the canvas:
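One way to do that (a sketch; it assumes a namespace block can hold a shape, and that the canvas size is accessible as canvas.width and canvas.height):

    Global {
       background = Rectangle {
          center : (0, 0)
          width : canvas.width
          height : canvas.height
          fillColor : rgba(0.95, 0.95, 0.95, 1)
       }
    }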
[17/n] Now we can make the bounding box 5 x 5…
[18/n] 4.8 x 4.8…
[19/n] And keep on going—how low can we go?
Warning: if you make it too small, the optimization may go on forever and freeze your tab! Sorry about that.
You can also slow down the optimization to a step size of 1 to see how it works:
From here of course you can pack a different number of squares, or different shapes altogether.
But really this is just a (very) small example of what you can do with Penrose—check out the gallery and documentation for more.
And have fun exploring!
[20/20]
P.S. Here's the full example if you don't feel like typing it in yourself!
The sum of the squares of several positive values can never be bigger than the square of their sum.
This picture helps make sense of how ℓ₁ and ℓ₂ norms sparsify and regularize solutions (resp.). [1/n]
These pictures are often batting around in my brain when I think about optimization/learning problems, but can take some time to communicate to students, etc. So, I thought I'd make some visualizations. [2/n]
Suppose we minimize the squared length of a vector x, equal to the sum of squares of its components.
To avoid the trivial solution x=0, we'll also require that the (nonnegative) components sum to a fixed value, say 1.
Equivalently: minimize the ℓ₂ norm ‖x‖₂, subject to ‖x‖₁=1. [3/n]
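A quick sanity check with n components: the spread-out vector x = (1/n, …, 1/n) has ‖x‖₁ = 1 but ‖x‖₂² = 1/n, while the sparse vector x = (1, 0, …, 0) has ‖x‖₁ = 1 and ‖x‖₂² = 1. So, among vectors with ‖x‖₁ = 1, minimizing the ℓ₂ norm pushes toward spreading the mass out evenly; flipping the roles of the two norms instead pushes toward sparse solutions.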
We often use discretization to approximate continuous laws of physics, but it also goes the other way:
You can use continuous equations to approximate the behavior of discrete systems!
Here we'll see how electrical circuits can be modeled using the Laplace equation Δφ=0. [1/n]
The Laplacian Δ is central to numerous (continuous) physical equations like the heat equation, the wave equation, and so on.
I have a whole video about it here: [2/n]
The discrete or graph Laplacian L is typically viewed as a numerical approximation of Δ, giving the difference between the value uᵢ at a node i of a graph, and a weighted average of the values uⱼ at all neighbors j:
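In symbols, with edge weights wᵢⱼ and weighted degree dᵢ = Σⱼ wᵢⱼ (one common convention, the random-walk normalization): (Lu)ᵢ = uᵢ − (1/dᵢ) Σⱼ wᵢⱼ uⱼ.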
Entropy is one of those formulas that many of us learn, swallow whole, and even use regularly without really understanding.
(E.g., where does that “log” come from? Are there other possible formulas?)
Yet there's an intuitive & almost inevitable way to arrive at this expression.
When I first heard about entropy, there was a lot of stuff about "free states" and "disorder." Or about the number of bits needed to communicate a message.
These are ultimately important connections—but it's not clear it's the best starting point for the formula itself.
A better starting point is the idea of "surprise."
In particular, suppose an event occurs with probability p. E.g., if the bus shows up on time about 15% of the time, p = 0.15.
How *surprising*, then, is an event with probability p? Let's call this quantity S(p).
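For reference, the expression this line of reasoning ends up at is the familiar one: surprise S(p) = log(1/p), and entropy is just the expected surprise, H = Σᵢ pᵢ log(1/pᵢ) = −Σᵢ pᵢ log pᵢ.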
We often think of an "equilibrium" as something standing still, like a scale in perfect balance.
But many equilibria are dynamic, like a flowing river which is never changing—yet never standing still.
These dynamic equilibria are nicely described by so-called "detailed balance".
In simple terms, detailed balance says that if you have less "stuff" at point x, and more "stuff" at point y, then to maintain a dynamic equilibrium, the fraction of stuff that moves from x to y needs to be bigger than the fraction that moves from y to x.
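In symbols: if π(x) is the amount of stuff at x and P(x→y) is the fraction of it that moves to y per step, detailed balance says π(x) P(x→y) = π(y) P(y→x); so a smaller π(x) must be compensated by a larger P(x→y).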
Detailed balance is also the starting point for algorithms that efficiently generate samples of a given distribution, called "Markov chain Monte Carlo" algorithms.
The idea is to "design" a random procedure that has the target distribution as the equilibrium distribution.
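A classic construction is the Metropolis rule: from the current state x, propose a symmetric random move to y, then accept it with probability min(1, π(y)/π(x)). That acceptance probability is chosen precisely so that detailed balance holds for the target distribution π.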
Even though Newton's laws are deterministic, the behavior of many interacting bodies is so chaotic that it looks essentially "random."
Statistical mechanics effectively says: why bother with all those complex trajectories? Just go ahead and replace them with truly random motion.
A good way to think about the difference is to imagine actually simulating the particles.
With Newtonian simulation, you track both positions & velocities. Velocities increase or decrease according to forces (like attraction/repulsion); positions are then updated by velocities.
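Concretely, one simple time-stepping scheme (symplectic Euler, with time step τ, mass m, and force f = −∇U): first v ← v + (τ/m) f(x), then x ← x + τ v.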
With (overdamped) Langevin simulation, you track just the positions. Positions follow the same forces (i.e., the potential gradient), plus some random "noise" that models all that chaotic motion.
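The corresponding overdamped Langevin update (Euler–Maruyama, with unit friction, inverse temperature β, and ξ a fresh standard Gaussian each step): x ← x − τ ∇U(x) + √(2τ/β) ξ.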
Near equilibrium, these two simulators yield motions with very similar statistics.