When smart people are presented with something new, they tend to ask, "how does it work?": how is it structured, how was it made? But the more important & difficult question is *why does it work*: what is the functional kernel that makes it effective, what guided its evolution?
In the case of deep learning, "how does it work?" will make you explain backpropagation and matrix multiplication. But "why does it work?" leads you to the structure of perceptual space.
In the case of a piece of music, "how does it work?" will make you look for the key, the different voices, the rules. That's the easy part. "Why" leads you to ask what exactly about the piece makes you feel the way you feel. It will require you to understand your own mind.
A classic example is neuroscience: it is very much in the business of asking, "how does the brain work?", and it has no power to answer the actually important question, "why does the brain work?"...
The "how" stay at the level of superficial observations. The "why" gets to the heart of the system. It requires a full understanding not only of the system itself, but of the context in which it lives. It requires you to follow the thread of purpose that drove its emergence.

• • •

More from @fchollet

20 Mar
Deep learning excels at unlocking impressive early demos of new applications using very few development resources.

The part where it struggles is reaching the level of consistent usefulness and reliability required by production usage.
Autonomous driving is the ultimate example. You could use deep learning to create an impressive self-driving car prototype in 2015 on a shoestring budget (Comma did exactly that, using Keras). Five years and billions of $ later, the best DL-centric driving systems are still L2+.
Every app demo based on GPT-3 follows this pattern. You can build the demo in a weekend, but if you invest $20M and 3 years fleshing out the app, it's unlikely it will still be using GPT-3 at all, and it may never meet customer requirements.
13 Mar
Quick tweetorial: using KerasTuner to find good model configs.

Define your model as usual -- but put your code in a function that takes a `hp` (hyperparameters) argument.

Then, instead of using values like "embedding_dim = 512", use ranges: `hp.Int(...)`
Then, instantiate a tuner and pass it your model-building function. It will need an `objective` to optimize -- this can be the name of any metric found in the model logs. For built-in Keras metrics, the tuner will automatically pick whether to maximize or minimize the metric.
`max_trials` is the maximum number of model configurations to try. The ominous-sounding `executions_per_trial` is the number of model training runs to average for each model config: a higher value reduces results variance.
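A minimal sketch of that workflow (illustrative, not the thread's own code: the model, the hyperparameter names, and the search settings are placeholders; assumes the `keras_tuner` package):

```python
import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers

def build_model(hp):
    # The tuner calls this function once per trial, sampling a value
    # from each declared hyperparameter range.
    embedding_dim = hp.Int("embedding_dim", min_value=64, max_value=512, step=64)
    hidden_units = hp.Int("hidden_units", min_value=32, max_value=256, step=32)
    model = keras.Sequential([
        layers.Embedding(input_dim=20000, output_dim=embedding_dim),
        layers.GlobalAveragePooling1D(),
        layers.Dense(hidden_units, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(
    build_model,
    objective="val_accuracy",   # any metric name that appears in the training logs
    max_trials=20,              # number of model configurations to try
    executions_per_trial=2,     # training runs averaged per configuration
)
# tuner.search(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
# best_model = tuner.get_best_models(num_models=1)[0]
```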
7 Mar
Fun fact: if you wanted to keep an open-air swimming pool on the surface of Mars, you'd have to keep it heated at a temperature exactly between 0°C and 0.5°C (about 32°F). Because the atmospheric pressure on Mars is so low, water would boil if its temperature got any higher.
And any lower than that, it would freeze (which would be the default given that the surrounding atmosphere would be at around -60°C / -80°F)
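A rough back-of-the-envelope check (not from the thread; it assumes the Magnus approximation for water's saturation vapor pressure and a nominal ~600-700 Pa Martian surface pressure):

```latex
% Liquid water is only stable where the ambient pressure exceeds the saturation
% vapor pressure e_s(T). Magnus approximation, with T in degrees Celsius:
\[
  e_s(T) \approx 611.2\,\mathrm{Pa}\cdot\exp\!\left(\frac{17.62\,T}{243.12 + T}\right)
\]
% e_s(0 C)   ~ 611 Pa
% e_s(0.5 C) ~ 634 Pa
% With a typical Martian surface pressure of roughly 600-700 Pa, the window
% between freezing (0 C) and boiling is only a fraction of a degree wide.
```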
Now, fun medical puzzle: if you took off your spacesuit on the surface of Mars, what would immediately happen to you? Would you...
3 Mar
New code walkthrough on keras.io: speech recognition with Transformer. Very readable and concise demonstration of how to build and train a speech recognition model on the LJSpeech dataset.
keras.io/examples/audio…
This example was implemented by @NandanApoorv. Let's take a look at the model architecture.

It starts by defining two embedding layers: a positional embedding for text tokens, and an embedding for speech features that uses 1D convolutions with strides for downsampling.
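A minimal Keras sketch of those two layers (illustrative only; layer names, kernel sizes, and dimensions are placeholders, not the exact keras.io code):

```python
import tensorflow as tf
from tensorflow.keras import layers

class SpeechFeatureEmbedding(layers.Layer):
    """Embeds spectrogram frames with strided 1D convolutions (downsamples time 8x)."""

    def __init__(self, num_hid=64, **kwargs):
        super().__init__(**kwargs)
        self.convs = [
            layers.Conv1D(num_hid, 11, strides=2, padding="same", activation="relu")
            for _ in range(3)
        ]

    def call(self, x):
        # x: (batch, time, freq) -> (batch, time / 8, num_hid)
        for conv in self.convs:
            x = conv(x)
        return x

class TokenAndPositionEmbedding(layers.Layer):
    """Embeds target-text tokens and adds a learned positional embedding."""

    def __init__(self, maxlen, vocab_size, num_hid=64, **kwargs):
        super().__init__(**kwargs)
        self.token_emb = layers.Embedding(vocab_size, num_hid)
        self.pos_emb = layers.Embedding(maxlen, num_hid)

    def call(self, x):
        positions = tf.range(start=0, limit=tf.shape(x)[-1], delta=1)
        return self.token_emb(x) + self.pos_emb(positions)
```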
Then it defines a Transformer encoder, which is your usual Transformer block, as well as a Transformer decoder, which is also your usual Transformer block, but with causal attention to prevent later timesteps from influencing the decoding of earlier timesteps.
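For the decoder side, here is a sketch of what causal masking looks like with Keras's `MultiHeadAttention` (the mask shape and sizes are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

batch, seq_len, num_hid = 4, 20, 64
x = tf.random.normal((batch, seq_len, num_hid))  # stand-in for decoder inputs

# Lower-triangular (1, seq_len, seq_len) mask: position i may only attend to j <= i.
causal_mask = tf.cast(
    tf.linalg.band_part(tf.ones((1, seq_len, seq_len)), -1, 0), tf.bool
)

attn = layers.MultiHeadAttention(num_heads=2, key_dim=num_hid)
out = attn(query=x, value=x, attention_mask=causal_mask)  # (batch, seq_len, num_hid)
```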
22 Feb
Seeing lots of takes about nuclear power and its opponents. Yes, nuclear power could be an important element of a climate solution. Yes, the world needs to build more nuclear power plants. But it's absurd to blame environmental activists for the fact that it hasn't happened yet.
The primary reason why countries with large CO2 emissions haven't gone nuclear is economic: the upfront cost of a nuclear plant is a large multiple of that of a coal plant. That's why coal is king in India, for instance. Nothing to do with activists.
Or consider China, the largest emitter of CO2 today. You think environmental activism is why China hasn't built more nuclear plants? Lol. Economically, coal has been "good enough" -- assuming we ignore its health costs and long-term environmental costs.
19 Feb
Interesting analysis by @mhmazur. Human work is driven by clear goals and is informed by task-specific context. A model that is optimized for generating plausible-sounding text, ignoring goals and context, virtually never produces any useful answer (unless by random chance).
Reminder: language serves a variety of purposes -- transmit information, act on the world to achieve specific goals, serve as a social lubricant, etc. Language cannot be modeled as a statistical distribution independent of these purposes.
This is akin to modeling the appearance of animals as a statistical distribution while ignoring the environment in which they live. You could use such a model to generate plausible-looking animals, but don't expect them to be able to survive in the wild (environmental fitness)
