Tai-Danae Bradley
Jun 29, 2021 · 9 tweets
Today on the blog I’ve started a new mini-series called “Language, Statistics, & Category Theory” to describe some ideas my collaborators and I share in a recent paper on mathematical structure in language. Part 1 is now live! math3ma.com/blog/language-…
We open with the idea that language is algebraic: you can “multiply” words together (concatenation) to get new expressions:

red × firetruck = red firetruck

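In code, this “multiplication” is just string concatenation, which forms a monoid: it’s associative, and the empty string acts as the identity. A minimal sketch (the helper name `mult` is my own, for illustration):

```python
def mult(x, y):
    """'Multiply' two expressions by concatenating them with a space."""
    return (x + " " + y).strip()

print(mult("red", "firetruck"))  # red firetruck

# Associativity: (big · red) · firetruck == big · (red · firetruck)
assert mult(mult("big", "red"), "firetruck") == mult("big", mult("red", "firetruck"))
# The empty string is a two-sided identity:
assert mult("", "red") == mult("red", "") == "red"
```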
I’ve mentioned this idea previously in a "promo video" I made for my PhD thesis last year:
Now, thinking algebraically, consider this famous quote by linguist John Firth: “You shall know a word by the company it keeps.” If you’re an algebraist, you may try to formalize this by identifying the meaning of a word, like “red,” with the principal ideal associated to it:
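Concretely, the (two-sided) principal ideal of “red” is the set of all expressions of the form a · red · b, i.e. everything in the language that contains “red” as a contiguous factor. A toy sketch, with a made-up mini-corpus standing in for the language:

```python
# Hypothetical mini-corpus of expressions, for illustration only.
corpus = ["red firetruck", "big red firetruck", "red idea", "blue sky", "the red door"]

def contains(word, expr):
    """True if `word` occurs as a contiguous subexpression of `expr`."""
    ws, es = word.split(), expr.split()
    return any(es[i:i + len(ws)] == ws for i in range(len(es) - len(ws) + 1))

def principal_ideal(word, expressions):
    """All expressions a·word·b in the corpus: the 'company the word keeps'."""
    return [e for e in expressions if contains(word, e)]

print(principal_ideal("red", corpus))
# ['red firetruck', 'big red firetruck', 'red idea', 'the red door']
```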
Algebra is nice, but it isn’t the full story. There’s also statistics! “Red firetruck” occurs more frequently than “red idea,” and this contributes to the meaning of the word “red.” So, how can we marry these structures? It turns out category theory is a nice setting for this.
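The statistical side is easy to picture: from corpus counts you get empirical probabilities of continuations. A sketch with invented counts (the numbers below are made up, not from any real corpus):

```python
from collections import Counter

# Hypothetical counts of expressions beginning with "red" (made-up numbers).
counts = Counter({("red", "firetruck"): 20, ("red", "car"): 35, ("red", "idea"): 1})

total = sum(counts.values())
# Empirical probability of each continuation, conditioned on seeing "red":
probs = {pair: n / total for pair, n in counts.items()}

# "red firetruck" is far more likely than "red idea", and that asymmetry
# is part of the meaning of "red" that the ideal alone doesn't capture.
assert probs[("red", "firetruck")] > probs[("red", "idea")]
```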
To start, language is a category! Objects are strings of words, and arrows indicate when one string is contained in another. This category is a bit like syntax: it tells us “what goes with what.”
What can we do with this category? Well, we might wonder what perspective category theory’s most important theorem—the Yoneda lemma—brings. Informally, this theorem states that a mathematical object is uniquely determined by its networks of relationships. math3ma.com/blog/the-yoned…
So in the spirit of the Yoneda lemma, the meaning of a word like “red” is contained in the network of ways that "red" fits into all other expressions in English. Sounds a bit like John Firth's quote, no?
The passage from “red” to “the network of ways 'red' fits into language” is described formally in category theory as a functor. It takes us from a syntax category of language to the category of “copresheaves” on it, where semantics may lie. Sounds fancy, but the idea is simple!
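As a down-to-earth sketch of that assignment: send “red” to the function taking each expression to the set of ways “red” sits inside it, i.e. the pairs (left context, right context). This is only an illustrative toy, not the paper’s construction in full:

```python
def contexts(word, expr):
    """All (left, right) pairs with expr = left · word · right:
    the ways `word` fits into `expr`."""
    ws, es = word.split(), expr.split()
    out = []
    for i in range(len(es) - len(ws) + 1):
        if es[i:i + len(ws)] == ws:
            out.append((" ".join(es[:i]), " ".join(es[i + len(ws):])))
    return out

# The Yoneda-style perspective: the "meaning" of red is the assignment
# expr ↦ contexts("red", expr), ranging over all expressions in the language.
print(contexts("red", "big red firetruck"))  # [('big', 'firetruck')]
print(contexts("red", "blue sky"))           # []
```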
Okay, why do all this work? It turns out there are advantages to thinking category theoretically, rather than merely algebraically, including a principled way to incorporate statistics. I’ll explain more in Part 2. Stay tuned! (Or, read the preprint! arxiv.org/abs/2106.07890)
