Woke up to 3 new hate emails from PyTorch fanboys, a new record (was this coordinated?).
I got my very first hate message in 2017, from a PyTorch fan, and it has been a recurring thing since. The degree of toxicity of that community is insane.
It's exhausting.
If you are a Keras user, I beg you: never behave like these people. Don't be a jerk. Don't harass anyone. Not that I've had reports of that ever happening.
Just be cool. Build great things. Help others.
If you are a PyTorch fan: please leave me alone. My work is not related to you. You don't need to feel threatened by my work. I'm just doing the best I can. I've been at it for 5 years, it has been a lot of work.
So please leave me alone, and leave the Keras & TF team alone.
Also remember that I am not Keras. Keras is backed by a team of awesome folks, and by a community of 400,000 users & contributors. Keras is a collective project, built by many, serving the needs of many. I merely started it, a long time ago. Attacking me won't help you.
I've never witnessed any issue with any other community. Not once. MXNet, Caffe, sklearn, you name it. Zero. But the PyTorch community is something special.
I'm partnering with @mikeknoop to launch ARC Prize: a $1,000,000 competition to create an AI that can adapt to novelty and solve simple reasoning problems.
I published the ARC benchmark over 4 years ago. It was intended to be a measure of how close we are to creating AI that can reason on its own – not just apply memorized patterns.
ARC tasks are easy for humans. They aren't complex. They don't require specialized knowledge – a child can solve them. But modern AI struggles with them.
Because they have one very important property: they're designed to be resistant to memorization.
It's amazing to me that the year is 2024 and some people still equate task-specific skill and intelligence. There is *no* specific task that cannot be solved *without* intelligence -- all you need is a sufficiently complete description of the task (removing all test-time novelty and uncertainty), and you can achieve arbitrary levels of skill while entirely by-passing the problem of intelligence. In the limit, even a simple hashtable can be superhuman at anything.
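A minimal Python sketch of the hashtable point, offered as a toy illustration only: the addition task, the `build_lookup_solver` helper, and the 0-9 input range are all hypothetical choices made for this example. It shows a pure lookup table scoring perfectly on every input it was given, and having no answer at all for an input it has never seen.

```python
def build_lookup_solver(examples):
    """Memorize every (input -> output) pair. No reasoning is involved."""
    table = dict(examples)

    def solve(task_input):
        # Returns None when the input was never memorized.
        return table.get(task_input)

    return solve


# A "sufficiently complete description" of a toy task:
# every sum of two integers in the range 0-9 (assumed range).
train = {(a, b): a + b for a in range(10) for b in range(10)}
solver = build_lookup_solver(train)

# Perfect skill on everything covered by the description...
seen_accuracy = sum(solver(k) == v for k, v in train.items()) / len(train)

# ...and nothing to offer once there is any test-time novelty.
novel_answer = solver((10, 3))

print(seen_accuracy)  # 1.0
print(novel_answer)   # None
```

The table's skill is bounded only by how much of the task was enumerated ahead of time, which is why skill on a fixed task says little about the ability to handle novelty.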
The "AI" of today still has near-zero (though not exactly zero) intelligence, despite achieving superhuman skill at many tasks.
Here's one thing that AI won't be able to do within five years (if you extrapolate from the excruciatingly slow progress of the past 15 years): acquiring new skills as efficiently as humans, using the same data. The ARC benchmark is an attempt at measuring roughly that.
The point of general intelligence is to make it possible to deal with novelty and uncertainty, which is what our lives are made of. Intelligence is the ability to improvise and adapt in the face of situations you weren't prepared for (either by your evolutionary history or by your past experience) -- to efficiently acquire skills at novel tasks, on the fly.
Many of the people who are concerned with falling birthrates aren't willing to consider the set of policies that would address the problem -- aggressive tax breaks for families, free daycare, free education, free healthcare, and building more/denser housing to slash the price of homes.
Most people want children, but can't afford them.
I always found it striking how very rich couples (50M+ net worth) all tend to have over 3 children (and often many more). And how young women always say they want children -- yet in practice they delay family building because they are forced to focus on financial stability and therefore career. When money is not an object, families have 3+ children.
For middle incomes (below 1M/year) fertility goes down as income goes up, because *the cost of raising children increases with income* due to *opportunity cost*. If you make $150k and stand to eventually grow to $300k, you are losing a lot of money by quitting your job to raise children (on top of the prohibitive cost of raising children -- which also goes up as your income and thus your standards go up). You are thus *more* likely to postpone having children.
Starting at 1M/year, fertility rates rise again. And couples that make 5+M/year get to have the number of children they actually want -- which is almost always more than 3, and quite often 5+.
Memorization (which ML has solely focused on) is not intelligence. And because any task that does not involve significant novelty and uncertainty can be solved via memorization, *skill* is never a sign of intelligence, no matter the task.
Intelligence is found in the ability to pick up new skills quickly & efficiently -- at tasks you weren't prepared for. To improvise, adapt and learn.
Here's a paper you can read about it.
It introduced a formal definition of intelligence, as well as a benchmark to capture that definition in practical terms. Although it was developed before the rise of LLMs, current state-of-the-art LLMs such as Gemini Ultra, Claude 3, or GPT-4 are not able to score higher than a few percent on that benchmark.
arxiv.org/abs/1911.01547
Finding 1: the fastest backend for a given model typically alternates between XLA-compiled JAX and XLA-compiled TF. Plus, you might want to debug/prototype in PT before training/inferencing with JAX or TF.
The ability to write framework-agnostic models and pick your backend later is a game-changer.
Finding 2: Keras 3 with the best-performing backend outperforms reference native PT implementations (compiled) for all models we tried.
Notably, 5 out of 10 tasks demonstrate speedups exceeding 100%, with a maximum speedup of 340%.
If you're not leveraging this advantage for any large model training run, you're wasting GPU time -- and thus throwing away money.
It doesn't take a whole lot of pondering to figure out that the thesis "humans only seem smart because they're 'trained' on huge amounts of 'data' via their visual system (almost like LLMs!)" doesn't hold any water.
For instance -- congenitally blind people are not less intelligent. Vision isn't fundamental to what makes us human. A rich learning environment is still a rich learning environment when apprehended through restricted sensorimotor modalities.
Humans span an incredibly wide range of sensorimotor affordances. Some are blind, some are deaf, some don't have hands. They might grow up in radically different environments -- some with just three other humans around them, some with thousands. Some with libraries of books, some without any writing.
In the end, though, it doesn't make a huge difference -- all of them become fully-fledged, intelligent humans. Because no matter what, they're all extracting information from the world at a roughly constant rate: the intrinsic rate at which the brain processes information. Which is an infinitesimal fraction of the bandwidth of the human sensorimotor feed.
If your senses are missing something, you'll just redirect your fixed-rate attention to something else, and won't be much poorer for it.
That's also why the influence of genes on fluid intelligence is overwhelmingly greater than that of the environment. If "training data" were so important, you'd expect environment and education to be critical to intelligence. They aren't. Twins raised in vastly different situations end up about as smart.