1/hella

this 🧵 by @daniela_witten is a masterclass in both the #SVD and in technical communication on Twitter.

i want to hop on this to expand on the "magic" of this decomposition and show folks where the rabbit goes, because i just gave a talk on it this week!

🧙‍♂️🐇💨😱
tl;dr: the basic idea of the SVD works for _any_ function.

it's a three step decomposition:

- throw away the useless bits ⤵
- rename what remains 🔀
- insert yourself into the right context ⤴

also, if you're more of a "YouTube talk" than a "tweet wall" kinda person, check out the video version, given as part of the @weights_biases Deep Learning Salon webinar series

first, let's orient ourselves.

we're talking about the _singular value decomposition_ of a matrix M

M = U Σ V^T

where M is m × n with rank r, U is tall (m × r), V^T is wide (r × n), and Σ is diagonal (r × r), like in the pic below

instead of thinking of these matrices as "blocks of numbers", though, i find it more illuminating to think of them as "functions that take in arrays and spit out arrays"

so what do these matrices do?
- V^T removes any part of the input that lies in the kernel of M, i.e. any part that goes to 0, and lowers the dimension (r <= n)
- Σ scales each entry of its input
- U takes low-dimensional inputs and "pads" them so they look like higher-dimensional things (r <= m)
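
a minimal NumPy sketch of those three jobs, under some assumptions: the rank-1 matrix M, the test vectors, and the 1e-10 rank tolerance are all made up for illustration, and the trimming step is just one way to get the compact (rank-r) form from `np.linalg.svd`.

```python
import numpy as np

# hypothetical 3x2 matrix of rank 1 (m=3, n=2, r=1), chosen for illustration
M = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# thin SVD, then keep only the r nonzero singular values (compact form)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = int(np.sum(s > 1e-10))          # numerical rank; the tolerance is an assumption
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

k = np.array([2.0, -1.0])           # M @ k == 0, so k lies in the kernel of M
x = np.array([1.0, 1.0])

step1 = Vt @ x                      # n-dim -> r-dim, kernel part thrown away
step2 = s * step1                   # Σ is diagonal, so it scales each entry
step3 = U @ step2                   # r-dim -> m-dim, "padding" back up

print(np.allclose(Vt @ k, 0))       # True: the kernel goes to 0
print(np.allclose(step3, M @ x))    # True: the three steps reproduce M
```

adding k to x changes nothing after the first step, which is the sense in which V^T "removes" the kernel.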
from the perspective of "matrices are functions", it makes more sense to write the SVD with this diagram

how is this read? starting at a node (array), we can "travel" to a different node (new array) by following an arrow (applying a matrix).

this diagram "commutes", meaning that if any two paths start + end at the same nodes, the matrices for each path are equal

#followyourarrow
a path with a single arrow corresponds to the matrix labeling that arrow.

but what if i follow a path w more than one arrow, e.g. Σ then U? what's the matrix?

it's the matrix product, UΣ! (products are written right to left: Σ acts first, then U)

in fact, this is my preferred way to define the matrix product! much clearer motivation
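
a small sketch of "following arrows = multiplying matrices", assuming made-up shapes (r=2, m=4) and a hypothetical helper called `follow`. note that the product is written right to left, so the path "Σ then U" is `U @ Sigma`.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))          # arrow from 2-dim arrays to 4-dim arrays
Sigma = np.diag([3.0, 0.5])          # arrow from 2-dim arrays to 2-dim arrays

def follow(*arrows):
    """Return the function that walks the given arrows in order (hypothetical helper)."""
    def path(x):
        for A in arrows:
            x = A @ x                # apply one matrix per arrow
        return x
    return path

x = rng.normal(size=2)
via_path = follow(Sigma, U)(x)       # Σ first, then U
via_product = (U @ Sigma) @ x        # a single arrow labeled by the product

print(np.allclose(via_path, via_product))  # True
```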
this approach is motivated by the idea that linear algebra is _not_ like algebra, where "algebra" here means "method for manipulating equations"

instead, it's more like programming, where we manipulate and compose functions

see this talk for more!

so, how does this help us understand the SVD?

well, the idea of breaking down one function into three functions is a very general one.

in its broadest definition, it can be applied to _any_ function, as pictured below
the three steps:

- an "onto" function, aka surjection/epimorphism
- a "reversible" function, aka bijection/isomorphism
- a "one-to-one" function, aka injection/monomorphism

the Latin names (-jection) are more common, the Greek names (-morphism) more general. use them to taste!

i like using emojis: ⤵, 🔀, and ⤴
here's an example, applied to a Python function that might show up IRL: is_odd takes in an Int-eger and returns a String that identifies whether the input is even or odd
the first step picks out two "representatives" -- one for each possible output of is_odd. many onto one ⤵

the second step assigns each representative to the right output _value_. a perfect pairing 🔀

the final step ensures we have the right output _type_. just "inserting" ⤴
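
here is one way the three steps could look in code. this is a sketch under assumptions: the thread only describes `is_odd` by its signature and behavior, so the body, the output strings "even"/"odd", the choice of 0 and 1 as representatives, and the helper names are all hypothetical.

```python
def is_odd(n: int) -> str:
    # assumed body; only the Int -> String behavior comes from the thread
    return "odd" if n % 2 else "even"

def pick_representative(n: int) -> int:
    # ⤵ onto: every integer collapses to one of two representatives, 0 or 1
    return n % 2

def rename(rep: int) -> str:
    # 🔀 bijection: a perfect pairing between {0, 1} and {"even", "odd"}
    return {0: "even", 1: "odd"}[rep]

def insert(label: str) -> str:
    # ⤴ one-to-one: place the two labels inside the type of all strings
    return label

def is_odd_decomposed(n: int) -> str:
    return insert(rename(pick_representative(n)))

print(all(is_odd(n) == is_odd_decomposed(n) for n in range(-10, 11)))  # True
```

the last step does nothing at runtime. its only job is to change what the value counts as: one of two labels before, an arbitrary `str` after.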
let's look back at the SVD:

- V^T sends the whole kernel of M to 0, and inputs that differ only by a kernel vector land on the same output. it's picking out representatives! ⤵ ✅
- Σ just scales things by nonzero singular values. that's reversible! 🔀✅
- and U takes r-dim arrays and makes them m-dim. that's type-matching! ⤴✅
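
these three checkmarks can be tested numerically. a hedged sketch, assuming a random rank-2 matrix built for the occasion and a relative tolerance of 1e-10 for deciding which singular values count as zero:

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical 4x3 matrix of rank 2 (m=4, n=3, r=2)
M = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 3))

U, s, Vt = np.linalg.svd(M, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))    # numerical rank; the tolerance is an assumption
U, s, Vt = U[:, :r], s[:r], Vt[:r, :]

# ⤵ onto: the r rows of V^T are independent, so every r-dim array gets hit
onto = np.linalg.matrix_rank(Vt) == Vt.shape[0]
# 🔀 reversible: every diagonal entry is nonzero, so Σ has an inverse
reversible = bool(np.all(s > 0))
# ⤴ one-to-one: the r columns of U are independent, so no two inputs collide
one_to_one = np.linalg.matrix_rank(U) == U.shape[1]

print(onto, reversible, one_to_one)
```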
this doesn't explain the centrality of the SVD in applications or why it's so efficiently computable. for that, see the original thread from @WomenInStat

but it does explain why the SVD exists for all matrices -- matrices are functions, and this decomp exists for all functions!
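
to make "exists for all functions" concrete, here's a generic sketch. it assumes a finite domain and hashable outputs (so it is a toy, not the general theorem), and `decompose` and its inner names are hypothetical.

```python
def decompose(f, domain):
    """Factor f, restricted to a finite domain, into onto / bijection / one-to-one."""
    rep_of = {}                      # output value -> first input seen with that output
    for x in domain:
        rep_of.setdefault(f(x), x)
    value_of = {x: y for y, x in rep_of.items()}   # representative -> its output

    def onto(x):                     # ⤵ send x to the representative sharing its output
        return rep_of[f(x)]

    def rename(rep):                 # 🔀 pair each representative with one output value
        return value_of[rep]

    def include(y):                  # ⤴ the outputs of f sit inside the full output type
        return y

    return onto, rename, include

words = ["a", "bb", "cc", "ddd", "e"]
onto, rename, include = decompose(len, words)

print(all(include(rename(onto(w))) == len(w) for w in words))  # True
```

nothing about `len` or strings is special here. any function with hashable outputs on a finite list of inputs would do.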
this decomposition, of which SVD is just the coolest e.g., is known as the "1st Isomorphism Theorem" in its full generality

everywhere it appears, it sparks insight, connects multiple fundamental ideas, and relates seemingly distant concepts

you might even call it ... magic! 😈
just one example: in group theory, the 1st Isomorphism Theorem leads to kernels, quotient groups, and more, as explained by @math3ma: math3ma.com/blog/the-first…

check out her blog for more really great math explainers. eagerly awaiting her book on topology!

hella/hella, end of 🧵

Keep Current with Charles 🎉 Frye
