Damek Davis
Optimization and Machine Learning // @Wharton Statistics and Data Science // "I just love math"
Dec 11, 2023
Next formalization is the "gradient inequality" for smooth convex functions.

The proof (next tweet) is a few sentences of English.

My lean4 proof (below) is roughly 350 lines of code.

1/n

The proof in English is short. It is a consequence of the following simple convex analysis fact:

If f is a convex function such that f(0) = 0 and f(y) \geq o(\|y\|) as y \to 0, then f(y) \geq 0 globally.
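For reference, here is a sketch of the reduction (not the formalized Lean proof itself): the gradient inequality says that for a differentiable convex f,

f(y) \geq f(x) + \langle \nabla f(x), y - x \rangle \quad \text{for all } x, y.

Applying the fact above to g(u) := f(x + u) - f(x) - \langle \nabla f(x), u \rangle does the job: g is convex, g(0) = 0, and differentiability of f at x gives g(u) = o(\|u\|), so g \geq 0 everywhere, which is exactly the inequality. The fact itself follows from convexity: f(ty) \leq t f(y) for t \in (0, 1], so f(y) \geq f(ty)/t \geq o(t\|y\|)/t \to 0 as t \to 0^+.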
Dec 2, 2023
I proved a super simple helper lemma.
In English the proof is 1 line.
In lean4, it took me around 50 lines of code.
1/n

The proof took me around 5 hours (lol).
Any help shortening would be appreciated.

2/n
Aug 29, 2023
Cool technique. Reminds me of the Cramér-Chernoff method, which applies a Markov-type inequality and optimizes to find the "best" parameter...
Carrying out the optimization bounds the tail probability by the exponential of the negative of the Fenchel conjugate of the log moment generating function of the r.v.
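A sketch of the calculation (standard, not reproduced from the tweet's image): for a real random variable X with log moment generating function \psi(\lambda) = \log E[e^{\lambda X}], Markov's inequality applied to e^{\lambda X} gives, for every \lambda > 0,

P(X \geq t) \leq e^{-\lambda t} \, E[e^{\lambda X}] = \exp(\psi(\lambda) - \lambda t),

and optimizing over \lambda yields

P(X \geq t) \leq \exp\bigl(-\sup_{\lambda > 0} (\lambda t - \psi(\lambda))\bigr) = \exp(-\psi^*(t)),

where \psi^* is the Fenchel conjugate of \psi (for t \geq E[X], the supremum over \lambda > 0 agrees with the supremum over all \lambda \in \mathbb{R}).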