📊 Answers and comments about yesterday's online quiz on learning.

Not a quiz on online learning. Though there was an (online learning) component on that online (learning quiz), which maybe led to some learning online, so... I am confused now.



1/16
Let's start with PAC learning. All I'm saying here applies to classification, i.e., learning functions over some space 𝒳 taking boolean values: f: 𝒳 → {0,1}.

Q1 asked what quantity, if any, captured PAC-learnability. 71.3% got it right...



2/16
... it's the VC dimension, a combinatorial notion named after Vapnik and Chervonenkis: en.wikipedia.org/wiki/Vapnik–Ch…

That's, arguably, *the* cornerstone of statistical learning/computational learning theory. There are many great books+lecture notes about this, e.g., this one...

3/16
... and it's a deeply influential notion. Even today, understanding the exact complexity of PAC learning under various constraints is a very active area, by the way. Check #COLT2020 for some papers about it even this year—including the best paper! colt2020.org/virtual/papers…

4/16
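For concreteness, here is one standard way to state both notions, in my own rough phrasing (see the books and lecture notes mentioned above for the careful versions):

```latex
% PAC learnability (realizable case), stated informally -- my paraphrase.
% A class C of functions f: X -> {0,1} is PAC-learnable if there is a learner A
% and a sample bound m(eps, delta) such that, for every target f in C and every
% distribution D over X, given m(eps, delta) i.i.d. labeled examples (x, f(x))
% with x ~ D, the learner outputs a hypothesis h satisfying
\[
  \Pr\bigl[\operatorname{err}_D(h) \le \varepsilon\bigr] \ge 1 - \delta,
  \qquad \text{where } \operatorname{err}_D(h) = \Pr_{x \sim D}\bigl[h(x) \ne f(x)\bigr].
\]
% VC dimension: the size of the largest set of points "shattered" by C,
% i.e., on which C realizes every possible labeling.
\[
  \mathrm{VC}(\mathcal{C}) = \max\Bigl\{\, d :
    \exists\, x_1,\dots,x_d \in \mathcal{X},\;
    \bigl|\{(f(x_1),\dots,f(x_d)) : f \in \mathcal{C}\}\bigr| = 2^d \,\Bigr\}.
\]
% "Fundamental theorem of statistical learning": C is PAC-learnable iff VC(C) is
% finite, with sample complexity polynomial in VC(C), 1/eps, and log(1/delta).
```

(E.g., thresholds on ℝ have VC dimension 1, and halfplanes in ℝ² have VC dimension 3.)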
Which brings us to Q2: PAC-learning is fine, but PAC-learning efficiently is better! Can we?

I am sorry to say that as far as I know, it's the 19.7% going with unicorns who got it right... 🦄 We have, even now, very little understanding of those...



5/16
... computational issues (please, correct me if I'm wrong!). E.g., we know that efficient *proper* PAC learning (where the output hypothesis needs to be in the same class 𝓒, so you can't for instance approximate a linear function with a degree-17 polynomial) is...

6/16
... impossible unless P=NP. The general "improper" learning algorithms we have are not efficient themselves either, though efficient algorithms can exist for some 𝓒 on a case-by-case basis. So, basically: 🤷

7/16
So let's change the learning model and go fully online. No underlying statistical model, just an adversarial setting where someone wants to make you fail*: Online Learning, Mistake Bound.

* So, basically, a learning model equivalent to my middle school Latin teacher.

8/16
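To make the model concrete, here is a toy sketch of the prediction-with-feedback protocol (mine, not from the quiz), using the simple *halving* algorithm on a finite class: on every mistake, at least half of the still-consistent hypotheses get eliminated, so it makes at most log₂|𝓒| mistakes. The canonical algorithm mentioned in the next tweet does better, achieving the Littlestone dimension instead. The threshold class and the stream below are made-up toy data.

```python
# Toy mistake-bound protocol with the halving algorithm (illustration only).
from collections import Counter

def halving_learner(concept_class, adversarial_stream):
    """concept_class: list of hypotheses (callables x -> 0/1), assumed to contain the target.
    adversarial_stream: iterable of (x, true_label) pairs chosen by an adversary."""
    version_space = list(concept_class)  # hypotheses still consistent with all feedback so far
    mistakes = 0
    for x, true_label in adversarial_stream:
        # Predict by majority vote over the current version space.
        prediction = Counter(h(x) for h in version_space).most_common(1)[0][0]
        if prediction != true_label:
            mistakes += 1  # a mistake removes at least half of the version space below
        # Keep only hypotheses consistent with the revealed label.
        version_space = [h for h in version_space if h(x) == true_label]
    return mistakes

# Made-up example: thresholds over {0,...,9}; the adversary reveals labels of a hidden threshold.
thresholds = [lambda x, t=t: int(x >= t) for t in range(10)]
target = thresholds[7]
stream = [(x, target(x)) for x in [9, 0, 5, 7, 6, 8]]
print(halving_learner(thresholds, stream))  # at most log2(10) ≈ 3.3 mistakes
```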
The answer to Q3 is, I think, quite surprising (though apparently not to 43% of you): in this model, the Littlestone dimension fully characterizes what is learnable. Better, it comes with a canonical algorithm that always does the job!

What is...



9/16
... the Littlestone dimension, you ask? It's kind of tricky, but here it is—stealing the def from lecture notes by Akshay Krishnamurthy:
people.cs.umass.edu/~akshay/course…

Anyways. It's combinatorial, depends only on the class 𝓒 to learn, and it captures exactly the right thing!

10/16
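Roughly, and in my own words (go read the linked notes for the precise version):

```latex
% Littlestone dimension, paraphrased.
% A "mistake tree" of depth d is a complete binary tree whose internal nodes are
% labeled by points of X; branching left/right at a node labeled x means
% "the label of x is 0/1". The tree is shattered by C if every root-to-leaf path
% (x_1, b_1), ..., (x_d, b_d) is realized by some f in C, i.e., f(x_i) = b_i for all i.
\[
  \mathrm{Ldim}(\mathcal{C})
  = \max\bigl\{\, d : \text{some depth-}d\text{ mistake tree is shattered by } \mathcal{C} \,\bigr\}.
\]
% Always VC(C) <= Ldim(C), and Ldim(C) <= log2|C| for finite C. Littlestone's
% "Standard Optimal Algorithm" makes at most Ldim(C) mistakes, and no deterministic
% learner can do better against an adversary.
```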
Last question now. What if we want to PAC-learn classifiers from a class 𝓒, but in a differentially #private (DP) manner?
By this I mean (approximate) (ε,δ)-DP: for small ε and δ, the output reveals very little about any single data point from the training set.

11/16
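For reference, the standard definition, in my wording:

```latex
% (eps, delta)-differential privacy for a learning algorithm A.
% For every pair of training sets S, S' differing in a single example, and
% every (measurable) set T of possible outputs,
\[
  \Pr\bigl[A(S) \in T\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[A(S') \in T\bigr] + \delta .
\]
% "Pure" DP is the case delta = 0; "approximate" DP allows a small delta > 0.
```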
Here comes the magic (25.9% got it right): it's PAC learning, sure, but somehow the right quantity is... the Littlestone dimension. Not the VC dimension. Not the number of possible functions |𝓒|.

Why? How? I wish I knew...



12/16
... but I know someone who does! SomeoneS, actually. This has been recently proven in an amazing paper by Bun (@markmbun), Livni, and Moran, the culmination of a line of research hinting at this bizarre connection between privacy and online learning. (Oh, hi, @SethVNeel.)

13/16
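The statement, in the qualitative form I remember it (quantitative bounds are in the paper; the ⇐ direction is the new contribution, the ⇒ direction came from earlier work in that same line of research):

```latex
% Characterization of (approximate-)DP PAC learning, qualitatively.
\[
  \mathcal{C} \text{ is PAC-learnable by an } (\varepsilon,\delta)\text{-DP algorithm}
  \quad\Longleftrightarrow\quad
  \mathrm{Ldim}(\mathcal{C}) < \infty .
\]
```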
I encourage you to watch the talk Mark gave about this on @TCS_plus:
🎥 sites.google.com/site/plustcs/p…
or watch yesterday's #COLT2020 keynote by Salil Vadhan, which touches upon this: colt2020.org/virtual/speake… (the 🎥 recording will be online at some point, I believe)

14/16
To conclude: many, many things were not mentioned here. Feel free to add refs, discuss, comment below! ↯

Also, I want to mention that for a different notion of privacy, *local* privacy, a recent work of Edmonds, Nikolov (@thesasho), and Ullman provides some analogue of...

15/16
... the above characterization, at least in the noninteractive setting:

What about other notions: pan-privacy, shuffle privacy? What about interactivity?

So many questions!

16/end
Note: @Aaroth and @thesasho rightly pointed out to me that, as far as query complexity goes,* Statistical Query (SQ) learning and (interactive) locally private (LDP) PAC-learning are equivalent, up to polynomial factors.

*The same is not known for *sample* complexity...
... quite a subtle point, to me at least. I believe this equivalence SQ↭LDP was shown in the original KLNRS'08 paper, "What Can We Learn Privately?": arxiv.org/abs/0803.0924

Also, for "pure" (ε,0)-DP PAC learning (no δ), a ≠ characterization was obtained in jmlr.org/papers/v20/18-…