davidad 🎇 Profile picture
13 Jan, 14 tweets, 3 min read
Hot take: the FP schism between Haskell and Lisp/Scheme is not *really* about static vs dynamic types, or purity vs effects; it's about extensional vs syntactic compositionality. 1/🧵
Extensionality means (to me, at least) that your expressions are to be seen as mere pointers into a Platonic "semantic domain" where all equivalent objects are truly the same (formally, a set quotiented by equivalence; even more formally, a 0-truncation of a higher groupoid) 2/🧵
The syntactic approach is to take expressions exactly as they are. While in the extensional philosophy, an expression in code REFERS to data, in the syntactic philosophy, the code IS the data. Two pieces of code are equal if and only if they have the same syntax tree. 3/🧵
Now, if you want syntax trees to be the primary objects you work with, and you want to be able to compose them and manipulate them elegantly, you want a very small grammar where there is really only one kind of thing to handle. In Lisp this thing is called a "cons cell". 4/🧵
In the extensional philosophy, the purpose of the syntax is just to *refer* to some Platonic ideal in the semantic domain, so you want as many tools as you can get into the grammar to be able to refer elegantly. Only the compiler implementation teams need to pay the price. 5/🧵
(Incidentally, this is why there is basically only one Haskell implementation, and basically infinitely many Lisp implementations.) 6/🧵
In the extensional philosophy, as soon as code is written, the code itself should be stripped away—its "references" followed—to reveal the best equivalent computable representations for the semantics it expresses. To know what representations to use, we need static types. 7/🧵
Furthermore, computable representations of objects that also can depend on runtime input, or produce arbitrary side effects, get very constrained. They basically have to take the shape of code! Then equivalence becomes undecidable, case analysis is intractable… Ergo purity. 8/🧵
The advantage of the Lisp/syntactic approach is that all affordances available to the programmer are also available to every function (at runtime). Functions can perform effects, redefine themselves, and inspect the internal structures of themselves & their inputs. 9/🧵
The type theory community has made great strides understanding the extensional semantics of more Lisp-like values.
— Gradual type theory, for values with non-static types arxiv.org/pdf/1811.02440…
— Effect systems, for values with impure side effects dl.acm.org/doi/pdf/10.114…
10/🧵
However, neither of those help with the Lisp affordance to inspect the internal syntactic structure of functions at runtime. This thread was motivated by thinking of a possible direction for that... 11/🧵
From an extensional perspective, what we're anxious about with direct syntactic inspection is violations of the principle that equivalent inputs yield equivalent outputs. But Higher Inductive Types (see dl.acm.org/doi/pdf/10.114…) give a way of checking that kind of principle. 12/🧵
Unfortunately Higher Inductive Types are more like data types than function types; maybe we need "Higher Coinductive Types" for behavioral equivalence. But I look forward to when higher-order functions can use syntactic metaprogramming iff it respects semantic equivalences. 13/🧵
In conclusion, I'm long type-theory in the end: I think higher-categorical semantics has a lot to offer, is unlimited in what it can theoretically accommodate, & a lot of good theory is ready today. But some of what Lisp can do is still open research for extensionalists. 14/14

• • •

Missing some Tweet in this thread? You can try to force a refresh
 

Keep Current with davidad 🎇

davidad 🎇 Profile picture

Stay in touch and get notified when new unrolls are available from this author!

Read all threads

This Thread may be Removed Anytime!

PDF

Twitter may remove this content at anytime! Save it as PDF for later use!

Try unrolling a thread yourself!

how to unroll video
  1. Follow @ThreadReaderApp to mention us!

  2. From a Twitter thread mention us with a keyword "unroll"
@threadreaderapp unroll

Practice here first or read more on our help page!

More from @davidad

24 Sep 20
1. The difference between discovery and invention is a matter of degree, not a difference in kind.
2. The degree in question is what @DavidDeutschOxf calls “being hard-to-vary”, not “having been there already”.
For example, when Wadler says “Most programming languages were invented, not discovered—*and you can tell*”, I think he means that the set of core features in, say, C++ is not hard-to-vary (at least not compared to the set of core features in simply-typed λ-calculus).
When Harper says that when you see the same structure in categories, type theories, and logics, that's a "legitimate scientific discovery," he means that demanding a triple coincidence makes details harder to vary.
Read 9 tweets
13 Aug 20
In the words of Donald Knuth (on this very topic), "But no, no, ten thousand times no!" (source: math.uni-bielefeld.de/~sillke/NEWS/0…)

Let's talk about how primitive functions on ℝ even get defined. I'm using Terry Tao's Analysis I as a reference. 1/
ℝ is constructed step-by-step:
1. Obtain the naturals ℕ somehow. Popular choices include (a) taking Peano axioms as fundamental, (b) using Zermelo-Frankel axioms to construct finite ordinals, (c) defining an inductive type, or (d) the classic, accepting ℕ from God. (I kid.) 2/
2. Construct ℤ as a quotient of ℕ×ℕ: specifically, pairs (a,b) where a∈ℕ and b∈ℕ, interpreted as purely formal expressions "a – b", with "a – b" and "c – d" considered equal iff a+d = c+b (where the + here is Peano addition, of type ℕ×ℕ→ℕ). 3/
Read 17 tweets
9 Aug 20
Comparing the Cont & Dist monads: My operational interpretation of a continuation-passing-style value ∀R.(X→R)→R is “for any result type R, if you tell me how to compute one from an X, I’ll give you one,” and the only way to return an R is to obtain one from the continuation…
…because there are no other operations that return this arbitrary type R. Meanwhile Dist(X) is more like ∀(R:Convex).(X→R)→R. Now I can give the continuation a bunch of values in X and weighted-average the return values. So a Dist(X) is a bunch of X’s associated with weights.
This algebraic approach, where I equip R with a “convex combination” operator, only works for finitely supported distributions. To get to fully general measures, we swap our synthetic “R:Convex” for the analytic reals ℝ, but impose severe conditions on the function (X→ℝ)→ℝ…
Read 5 tweets
7 Aug 20
When continuous-valued probability distributions are taught, typically the density function (pdf) is introduced first, with some handwaving, and then the cdf is defined as an integral of the pdf.

This is backwards! A pdf ONLY EXISTS as the DERIVATIVE of the cdf.
I'm not saying that the Radon-Nikodym theorem or Lebesgue measure should be explicitly introduced before we can talk about Gaussians, but I think people comfortable with calculus would rather see d/dx P(X<x) than the usual handwaving about densities
A prime example of the kind of handwaving I'm whinging about, from Wikipedia. What is this nonsense, Image
Read 4 tweets
27 Dec 19
In ML, the term "manifold" generally refers to the image of a differentiable map from ℝᵏ→ℝⁿ. Sometimes assumed (explicitly or implicitly) to have a diff'ble inverse back to ℝᵏ (i.e. to be a diff'ble embedding). Either way, not the same idea of "manifold" used in math—(1/n)
With the invertibility assumption, this *is* a manifold, but an extremely special type of manifold: while manifold theory offers a lot of complexity around gluing together local Euclidean "charts", typical representation-learning learns just one *global* Euclidean chart! (2/n)
To boot, an ML "manifold" is an isometrically embedded submanifold of ℝⁿ. By Nash embedding theorem, isometric embedding into some ℝⁿ is possible for any Riemannian manifold, but it's not usually so trivial. (3/n)
Read 5 tweets
13 Dec 19
Suppose I have a functor H : C^op × D → V (a profunctor; fine to assume V=Set), and endofunctors L : C→C and R : D→D, with

H(L(c),d) ≅ H(c,R(d))
natural in c,d.

Looks like an adjunction, but isn't directly—not even a relative adjunction. Does this situation have a name?
One characterization: endomorphisms in Chu(Cat,V). Explicitly, (L,R) : Endo_{Chu(Cat,V)}(C,D,H).
(Er, by C there I meant C^op, and I guess technically also L^op.)
Read 4 tweets

Did Thread Reader help you today?

Support us! We are indie developers!


This site is made by just two indie developers on a laptop doing marketing, support and development! Read more about the story.

Become a Premium Member ($3/month or $30/year) and get exclusive features!

Become Premium

Too expensive? Make a small donation by buying us coffee ($5) or help with server cost ($10)

Donate via Paypal Become our Patreon

Thank you for your support!

Follow Us on Twitter!