Andreas Kirsch 🇮🇱🇺🇦
Past: 🧑‍🎓 DPhil @AIMS_oxford @ExeterCollegeOx @UniofOxford (4.5yr) 🧙‍♂️ RE @DeepMind (1yr) 📺 SWE @Google (3yrs) 🎓 @TU_Muenchen 👤 Fellow @nwspk
Aug 14, 2024
Excited to publish a Python package that turns @karpathy's "A Recipe for Training Neural Networks" into easy-to-use diagnostics code! 🔧

No more randomly poking around in your custom @PyTorch DNN to debug it.

Get simple diagnostics for your neural nets 🫶

#PyTorch

1/ I took a good look at the checklist that @karpathy wrote up in his blog post and wrote assert_* methods to implement as many checks as possible.

You can simply `pip install neural_net_checklist` and use the torch_diagnostic package to check for simple bugs.

2/ karpathy.github.io/2019/04/25/rec…
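To give a taste of what such diagnostics look like, here is a minimal standalone sketch of one item from Karpathy's recipe — "verify loss @ init" — in plain Python. This is my illustration, not the package's actual code (the package's assert_* functions operate on real @PyTorch modules): at initialization, an unbiased classifier should produce near-uniform class probabilities, so the cross-entropy loss should be close to log(num_classes).

```python
import math

def softmax(logits):
    # Numerically stable softmax.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def assert_loss_at_init(logits, num_classes, tol=1e-2):
    """Karpathy's 'verify loss @ init' check: with an unbiased random init,
    logits should be near-uniform, so cross-entropy ~ log(num_classes)."""
    probs = softmax(logits)
    # Cross-entropy against any true class is -log p[class]; with uniform
    # probabilities this equals log(num_classes) for every class.
    loss = -math.log(probs[0])
    expected = math.log(num_classes)
    assert abs(loss - expected) < tol, (
        f"loss at init {loss:.3f} != expected {expected:.3f}; "
        "check your init and bias terms")

# A freshly initialized net with zero biases gives (near-)uniform logits:
assert_loss_at_init([0.0] * 10, num_classes=10)
```

If the assertion fires, the usual culprits are a badly scaled last layer or non-zero output biases — exactly the kind of bug the checklist is meant to catch early.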
Aug 11, 2024
A small info-theory thread (or at least food for thought):

Why is the Bayesian Model Average the best choice? Really, why?

I'll go through a naive argument (does anyone have better references?), simple lower bounds and decompositions, and pitch a "reverse mutual information".

1/15 The full thread below is also available as a regular blog post btw:

2/15 blog.blackhc.net/2024/08/BMA-op…
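One way to make the naive argument precise (my notation, a sketch of the standard cross-entropy decomposition, not necessarily the one in the post):

```latex
% For any candidate predictive q(y|x), the expected log loss under the
% posterior predictive p(y|x,D) decomposes as entropy + KL divergence:
\mathbb{E}_{p(y \mid x, \mathcal{D})}\left[-\log q(y \mid x)\right]
  = \mathrm{H}\!\left[p(y \mid x, \mathcal{D})\right]
  + \mathrm{KL}\!\left[p(y \mid x, \mathcal{D}) \,\Vert\, q(y \mid x)\right]
  \ge \mathrm{H}\!\left[p(y \mid x, \mathcal{D})\right],
% with equality iff q(y|x) equals the posterior predictive, i.e. the BMA:
p(y \mid x, \mathcal{D})
  = \mathbb{E}_{p(\theta \mid \mathcal{D})}\left[p(y \mid x, \theta)\right].
```

So under log loss, no single predictive distribution can beat the Bayesian Model Average in expectation — the naive argument in one line.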
Dec 31, 2022
How can different active learning and active sampling methods be connected? And what about Bayesian active learning?

Are these notions of "informativeness" the same?

We cover these questions in a new paper in @TmlrOrg 🥳

@yaringal @OATML_Oxford

1/11

openreview.net/forum?id=UVDAK…

We use Fisher information to connect recent approaches (BADGE, BAIT, LogDet objectives from SIMILAR and PRISM) with approximations of information-theoretic quantities (EIG, EPIG, ...) based on literature going back to Lindley (1956) and MacKay (1992).

2/11 Our paper is "Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities".
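Schematically, the kind of link the paper draws looks like this (my notation, a standard sketch rather than the paper's exact statement):

```latex
% Expected information gain (EIG) of labelling a candidate x, given data D:
\mathrm{EIG}(x) = \mathrm{I}[\Theta; Y \mid x, \mathcal{D}]
  = \mathrm{H}[Y \mid x, \mathcal{D}]
  - \mathbb{E}_{p(\theta \mid \mathcal{D})}\,\mathrm{H}[Y \mid x, \theta].
% Under a Gaussian/Laplace approximation of the posterior with covariance
% \Sigma and per-candidate Fisher information F(x), this reduces to a
% log-determinant objective of the kind BADGE/BAIT/SIMILAR optimize:
\mathrm{EIG}(x) \approx \tfrac{1}{2} \log \det\left(\mathrm{I} + \Sigma\, F(x)\right).
```

The log-det form is exact for linear-Gaussian models (this goes back to Lindley 1956); for deep networks it is an approximation, which is precisely where the different acquisition objectives diverge.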
Dec 4, 2021
Finally, an information-theoretic derivation of Stirling's approximation for Binomial Coefficients 🥳🎉

We present a very different take on how to derive it. We only use basic probability theory and intuitions from information theory 🔥

blackhc.net/blog/2021/bino…

We show that the approximation is actually an upper bound and characterize the approximation error.
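The upper bound in question is the classic entropy bound C(n, k) ≤ exp(n·H(k/n)), with H the binary entropy in nats. A quick numerical check in plain Python (my sketch, not the blog post's code):

```python
import math

def binary_entropy(p):
    # Binary entropy H(p) in nats; H(0) = H(1) = 0 by continuity.
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def check_entropy_bound(n, k):
    """Returns (log C(n, k), n * H(k/n)); the bound says lhs <= rhs."""
    lhs = math.log(math.comb(n, k))
    rhs = n * binary_entropy(k / n)
    return lhs, rhs

# The bound holds for every (n, k); spot-check a few:
for n, k in [(10, 3), (100, 50), (1000, 10)]:
    lhs, rhs = check_entropy_bound(n, k)
    assert lhs <= rhs + 1e-9
```

The gap between the two sides is the approximation error the post characterizes; it grows only logarithmically in n, which is why the entropy term dominates.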
Sep 30, 2020
Given that @TheSun published a piece with statements and new samples from the authors, in a way that embodies everything we in the ML community worry about and want to avoid and prevent, I was wrong.

This sort of scientific communication is indefensible and appalling. I'm speechless given the quotes of the authors in the article and how they could think that it is in any way helpful to the sensitive debates that are happening.

It's like a nightmare come true, and the way the findings are presented in the article is harmful.
Jun 20, 2020
Has anyone called out @UniofOxford and its colleges for how they bully students, prevent them from returning to their accommodation, and give atrocious advice? @KelloggOx @StJohnsOx @LinacreCollege @ChCh_Oxford come to mind 🙂

Students living in university or college accommodation at Oxford don't have tenancy agreements but "license" agreements, which give colleges lots of leeway and students practically no rights.
Apr 20, 2020
🎉🎉Happy & proud to share some research into Information Bottlenecks from @yaringal, @clarelyle and me at @OATML_Oxford 🎉🎉

We provide intuition and practical IB objectives for modern DNN architectures, like ResNets.

Check it out on arXiv
👉 arxiv.org/abs/2003.12537

Our paper "Unpacking Information Bottlenecks: Unifying Information-Theoretic Objectives in Deep Learning" shows that well-known dropout regularization with standard cross-entropy loss and simple regularizers optimizes IB objectives in modern DNN architectures.
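For context, the Information Bottleneck objective in its usual Lagrangian form (standard notation, not necessarily the paper's) trades off compressing the input X into a representation Z against preserving information about the label Y:

```latex
% Information Bottleneck Lagrangian for a stochastic representation Z of X:
\min_{p(z \mid x)} \; \mathrm{I}[X; Z] - \beta\, \mathrm{I}[Z; Y]
% I[X;Z] penalizes information kept about the input (compression);
% I[Z;Y] rewards information kept about the label (prediction);
% \beta controls the trade-off between the two.
```

The paper's point is that common training practice — dropout plus cross-entropy plus simple regularizers — already optimizes objectives of this form in architectures like ResNets.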
Apr 9, 2020
🔥 Has your PyTorch code ever crashed because it ran out-of-memory in CUDA, and you had to fiddle with batch sizes repeatedly? 🔥

What if we could just write code that adapted to the available memory instead of resorting to brittle hand-tuning? 🤯

👉 github.com/BlackHC/toma 🤗

toma (torch memory-adaptive algorithms) helps you write algorithms in PyTorch that adapt to the available (CUDA) memory.

It's hands-off and does not make assumptions about your code.

If your code fails with an out-of-memory error, toma halves the batch size and retries.
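The core retry strategy can be sketched in a few lines of plain Python — a generic illustration of the idea, not toma's actual API (toma catches CUDA out-of-memory errors; here a plain MemoryError stands in):

```python
def run_with_adaptive_batchsize(fn, initial_batchsize, min_batchsize=1):
    """Call fn(batchsize); on an out-of-memory error, halve the batch
    size and retry until it fits (or we hit min_batchsize)."""
    batchsize = initial_batchsize
    while True:
        try:
            return fn(batchsize)
        except MemoryError:  # toma catches CUDA OOM errors instead
            if batchsize <= min_batchsize:
                raise
            batchsize //= 2

# Simulate a training step that only fits into memory at batch size <= 64:
def train_step(batchsize):
    if batchsize > 64:
        raise MemoryError("pretend CUDA OOM")
    return batchsize

assert run_with_adaptive_batchsize(train_step, 512) == 64
```

The point of the hands-off design is that your training loop never needs to know the final batch size in advance — it just receives whatever fits.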
Mar 11, 2020
Please read this very detailed yet easy-to-understand analysis:

I'll wear my conspiracy hat for a second: given all we know, and all that must have been known by stakeholders earlier, is the current inaction towards stricter containment gross negligence due to stupidity and recklessness by our leaders, or are some condoning the consequences?
Jan 12, 2020
Interesting post that conflates two separate issues: expressive, flexible interfaces and code de-duplication. The bit about communication with team members is very important. More details 👇

The example replaces repetitive code with a neat (👈 literally) deduplicated abstraction. The problem is that this also ties the interface down to the chosen abstraction.
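A minimal illustration of that tension (my toy example, not the one from the post):

```python
# Two near-duplicate functions...
def export_csv(rows):
    return "\n".join(",".join(map(str, r)) for r in rows)

def export_tsv(rows):
    return "\n".join("\t".join(map(str, r)) for r in rows)

# ...deduplicated behind a single abstraction:
def export(rows, sep):
    return "\n".join(sep.join(map(str, r)) for r in rows)

# The duplication is gone, but every caller is now coupled to the
# "separator" abstraction. A format that isn't separator-shaped
# (say, JSON) no longer fits the interface, so the abstraction has
# to be broken up or bent out of shape.
assert export([[1, 2]], ",") == export_csv([[1, 2]])
```

De-duplication shrinks the code, but the abstraction it introduces is a bet on which axis of variation matters — which is why it is worth discussing with your team first.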
Dec 31, 2019
Is attacking @realDonaldTrump for how he tweets and expresses his opinion tone-policing? 🤗 A lot of the reactions to his tweets are usually about tone. That makes me wonder about the boundaries of tone-policing 🤔 Are angry white men just misunderstood?

Tone-policing might say: I cannot listen to you when you talk like that. Saying someone is tone-policing might mean: get over your own feelings and listen to me/the person. That requires empathy and the work to get over one's own feelings 🎉
Dec 27, 2019
All hail the supremacy of System 1 over System 2 🙄

Part of these debates seems to be a question of association and feelings vs facts and boundaries 🤔

Anima's blog post seems toxic and judgmental in that it attacks people's characters in absolute terms. Compared to Scott's post and Pinker's email, the contrast is stark 😶
Oct 3, 2019
@hirevue what is wrong with you? 🤦

"In doctors, you might expect a good one to use more technical language"... Really?

That's the level of intelligence your product offers? And you're okay with that?

You're replacing actual interview decisions with Clever Hans. WTF

You probably don't even know what your model does.

NLP models are still mostly cheating and looking for cues, and you're basing hiring decisions on that.

How can you not be sued for discrimination?
Or is your model actually just random?
Apr 4, 2019
@MetomicGeorgia Oh btw, I'm referring to arxiv.org/abs/1904.02095. (from the article I linked, not the one you linked.)

I totally agree that advertisers shouldn't be allowed to target for these criteria (gender, race, age, etc.), and also geolocation when it gets too narrow.

@MetomicGeorgia And when it comes to political targeting, targeting should be disabled entirely in my opinion, to allow for less manipulation. (One could target political ads based on "safe" interests, but those let you deduce what people are emotional about, which makes it easier to manipulate them.)
Jan 15, 2019
As part of my resolutions for 2019, I want to think more critically about the papers I'm reading. I'll start with "When Will AI Exceed Human Performance? Evidence from AI Experts" (arxiv.org/abs/1705.08807).

#AI #predictions #paperreview

The paper "When Will AI Exceed Human Performance? Evidence from AI Experts" analyses predictions about progress in AI from researchers who published at the 2015 NIPS and ICML conferences. Only a quick survey was used to gather the data.

arxiv.org/abs/1705.08807
May 29, 2018
eventbrite.co.uk/e/bbc-newshack…

BBC #newsHACK: Reaching Young Audiences

I.e., @BBC wants to figure out how to get young people addicted to checking the news (and maybe reinvent #clickbait?)
1/ "How can trusted news providers reach young audiences?

Participants will be asked to prototype a new format that addresses one or more news needs identified by our research."

2/
Apr 1, 2018
After two months as a fellow at Newspeak, I've realized that, while AI and ML are very interesting, they are not real game changers. The real game changers are cryptocurrencies and blockchain!

AI and automation are still far away from transforming our society and will only further concentrate wealth in the hands of the few, whereas blockchain is happening *now*.
Feb 13, 2018
And the future is already here, it seems:
lyrebird.ai already allows you to create a digital voice based on 1 minute of training data

2/ dropbox.com/s/0r6fdm1kqdc9… — does this sound like me?