Are you a deep learning researcher? Wondering if all this TensorFlow 2.0 stuff you've heard about is relevant to you?

This thread is a crash course on everything you need to know to use TensorFlow 2.0 + Keras for deep learning research. Read on!
1) The first class you need to know is `Layer`. A Layer encapsulates a state (weights) and some computation (defined in the `call` method).
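For instance, here's a minimal sketch of such a layer: a `Linear` layer holding a kernel `w` and a bias `b` (the name and the `units`/`input_dim` arguments are just illustrative):

import tensorflow as tf

class Linear(tf.keras.layers.Layer):
    """Linear layer: y = w.x + b"""

    def __init__(self, units=32, input_dim=32):
        super(Linear, self).__init__()
        w_init = tf.random_normal_initializer()
        self.w = tf.Variable(
            initial_value=w_init(shape=(input_dim, units), dtype='float32'),
            trainable=True)
        b_init = tf.zeros_initializer()
        self.b = tf.Variable(
            initial_value=b_init(shape=(units,), dtype='float32'),
            trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# Instantiate the layer and call it like a function on some data.
linear_layer = Linear(units=4, input_dim=2)
y = linear_layer(tf.ones((2, 2)))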
2) The `add_weight` method gives you a shortcut for creating weights.
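Something like this (the same `Linear` layer as above, rewritten with `add_weight`):

class Linear(tf.keras.layers.Layer):
    """y = w.x + b, with weights created via `add_weight`."""

    def __init__(self, units=32, input_dim=32):
        super(Linear, self).__init__()
        self.w = self.add_weight(shape=(input_dim, units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b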

3) It’s good practice to create weights in a separate `build` method, called lazily with the shape of the first inputs seen by your layer. Here, this pattern prevents us from having to specify `input_dim`:
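A sketch of the same layer with lazy weight creation:

class Linear(tf.keras.layers.Layer):
    """y = w.x + b, with lazily-built weights."""

    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        # Called once, with the shape of the first inputs the layer sees.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# No `input_dim` needed; weights are created on the first call.
linear_layer = Linear(units=4)
y = linear_layer(tf.ones((2, 2)))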
4) You can automatically retrieve the gradients of the weights of a layer by calling it inside a GradientTape. Using these gradients, you can update the weights of the layer, either manually, or using an optimizer object. Of course, you can modify the gradients before using them.
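Here's a sketch of such a training loop, assuming the lazily-built `Linear` layer above and using MNIST as example data:

# Prepare a dataset.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
dataset = tf.data.Dataset.from_tensor_slices(
    (x_train.reshape(60000, 784).astype('float32') / 255, y_train))
dataset = dataset.shuffle(buffer_size=1024).batch(64)

linear_layer = Linear(units=10)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)

for step, (x, y) in enumerate(dataset):
    with tf.GradientTape() as tape:
        logits = linear_layer(x)
        loss = loss_fn(y, logits)
    # Gradients of the loss w.r.t. the layer's trainable weights.
    gradients = tape.gradient(loss, linear_layer.trainable_weights)
    optimizer.apply_gradients(zip(gradients, linear_layer.trainable_weights))
    if step % 100 == 0:
        print('Step:', step, 'Loss:', float(loss))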
5) Weights created by layers can be either trainable or non-trainable. They're exposed in the layer properties `trainable_weights` and `non_trainable_weights`. Here's a layer with a non-trainable weight:
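For example, a layer that keeps a running sum of its inputs (the name is illustrative):

class ComputeSum(tf.keras.layers.Layer):
    """Tracks the sum of all inputs seen so far in a non-trainable weight."""

    def __init__(self, input_dim):
        super(ComputeSum, self).__init__()
        self.total = self.add_weight(shape=(input_dim,),
                                     initializer='zeros',
                                     trainable=False)

    def call(self, inputs):
        self.total.assign_add(tf.reduce_sum(inputs, axis=0))
        return self.total

my_sum = ComputeSum(2)
x = tf.ones((2, 2))
print(my_sum(x).numpy())          # [2. 2.]
print(my_sum(x).numpy())          # [4. 4.]
print(my_sum.trainable_weights)   # []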
6) Layers can be recursively nested to create bigger computation blocks. Each layer will track the weights of its sublayers (both trainable and non-trainable).
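A sketch, reusing the `Linear` layer from above in a small MLP:

class MLP(tf.keras.layers.Layer):
    """A simple stack of Linear layers."""

    def __init__(self):
        super(MLP, self).__init__()
        self.linear_1 = Linear(32)
        self.linear_2 = Linear(10)

    def call(self, inputs):
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        return self.linear_2(x)

mlp = MLP()
y = mlp(tf.ones(shape=(3, 64)))
# The parent layer tracks the weights of its sublayers:
assert len(mlp.weights) == 4  # w and b for each Linear sublayer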
7) Layers can create losses during the forward pass. This is especially useful for regularization losses. The losses created by sublayers are recursively tracked by the parent layers.
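For instance, an activity regularization layer, and a parent layer that uses it (a sketch; the rate value is arbitrary):

class ActivityRegularization(tf.keras.layers.Layer):
    """Creates a regularization loss during the forward pass."""

    def __init__(self, rate=1e-2):
        super(ActivityRegularization, self).__init__()
        self.rate = rate

    def call(self, inputs):
        # Penalize large activations.
        self.add_loss(self.rate * tf.reduce_sum(inputs))
        return inputs

class SparseMLP(tf.keras.layers.Layer):
    """MLP whose hidden activations are regularized."""

    def __init__(self):
        super(SparseMLP, self).__init__()
        self.linear_1 = Linear(32)
        self.regularization = ActivityRegularization(1e-2)
        self.linear_2 = Linear(10)

    def call(self, inputs):
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        x = self.regularization(x)
        return self.linear_2(x)

mlp = SparseMLP()
y = mlp(tf.ones((10, 10)))
print(mlp.losses)  # A list containing one scalar tensor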
8) These losses are cleared by the top-level layer at the start of each forward pass -- they don't accumulate. `layer.losses` always contains only the losses created during the *last* forward pass. You would typically use these losses by summing them when writing a training loop, as in the sketch below.
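Like this, reusing the `SparseMLP`, dataset, loss, and optimizer from the snippets above:

mlp = SparseMLP()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)

for step, (x, y) in enumerate(dataset):
    with tf.GradientTape() as tape:
        logits = mlp(x)  # The forward pass also populates `mlp.losses`.
        loss = loss_fn(y, logits)
        loss += sum(mlp.losses)  # Add the regularization losses.
    gradients = tape.gradient(loss, mlp.trainable_weights)
    optimizer.apply_gradients(zip(gradients, mlp.trainable_weights))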
9) You know that TF 2.0 is eager by default. Running eagerly is great for debugging, but you will get better performance by compiling your computation into static graphs. Static graphs are a researcher's best friends! You can compile any function by wrapping it in `tf.function`:
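For instance, the training step above, compiled (same `mlp`, `loss_fn`, and `optimizer` as before):

@tf.function  # Traces the Python function into a static graph.
def train_step(x, y):
    with tf.GradientTape() as tape:
        logits = mlp(x)
        loss = loss_fn(y, logits) + sum(mlp.losses)
    gradients = tape.gradient(loss, mlp.trainable_weights)
    optimizer.apply_gradients(zip(gradients, mlp.trainable_weights))
    return loss

for step, (x, y) in enumerate(dataset):
    loss = train_step(x, y)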
10) Some layers, in particular the `BatchNormalization` layer and the `Dropout` layer, have different behaviors during training and inference. For such layers, it is standard practice to expose a `training` (boolean) argument in the `call` method.
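A sketch of a dropout layer following this pattern, and an MLP that passes the `training` flag down to it:

class Dropout(tf.keras.layers.Layer):
    """Applies dropout only when `training` is True."""

    def __init__(self, rate):
        super(Dropout, self).__init__()
        self.rate = rate

    def call(self, inputs, training=None):
        if training:
            return tf.nn.dropout(inputs, rate=self.rate)
        return inputs

class MLPWithDropout(tf.keras.layers.Layer):

    def __init__(self):
        super(MLPWithDropout, self).__init__()
        self.linear_1 = Linear(32)
        self.dropout = Dropout(0.5)
        self.linear_2 = Linear(10)

    def call(self, inputs, training=None):
        x = self.linear_1(inputs)
        x = tf.nn.relu(x)
        x = self.dropout(x, training=training)
        return self.linear_2(x)

mlp = MLPWithDropout()
y_with_dropout = mlp(tf.ones((2, 2)), training=True)   # Dropout active
y_inference = mlp(tf.ones((2, 2)), training=False)     # Dropout skipped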
11) You have many built-in layers available, from Dense to Conv2D to LSTM to fancier ones like Conv2DTranspose or ConvLSTM2D. Be smart about reusing built-in functionality.
12) To build deep learning models, you don't have to use object-oriented programming all the time. All layers we've seen so far can also be composed functionally, like this (we call it the "Functional API"):
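Something like this, reusing the `Linear` and `Dropout` layers defined earlier:

# `Input` describes the shape and dtype of the data the model will process.
inputs = tf.keras.Input(shape=(16,), dtype='float32')

# Calling layers on these symbolic inputs returns new symbolic outputs.
x = Linear(32)(inputs)
x = Dropout(0.5)(x)
outputs = Linear(10)(x)

# A functional Model is defined by its inputs and outputs...
model = tf.keras.Model(inputs, outputs)

# ...and can be called on real data right away.
y = model(tf.ones((2, 16)))
assert y.shape == (2, 10)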
The Functional API tends to be more concise than subclassing, and provides a few other advantages (generally the same advantages that functional, typed languages provide over untyped OO development).

Learn more about the Functional API: tensorflow.org/alpha/guide/ke…
However, note that the Functional API can only be used to define DAGs of layers -- recursive networks should be defined as `Layer` subclasses instead.

In your research workflows, you may often find yourself mix-and-matching OO models and Functional models.
That's all you need to get started with reimplementing most deep learning research papers in TensorFlow 2.0 and Keras!

Now let's check out a really quick example: hypernetworks.
A hypernetwork is a deep neural network whose weights are generated by another network (usually smaller).

Let's implement a really trivial hypernetwork: we'll take the `Linear` layer we defined earlier, and we'll use it to generate the weights of... another `Linear` layer.
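Here's a sketch of that idea, reusing the lazily-built `Linear` layer and the MNIST arrays from the snippets above (variable names are illustrative, and we use a batch size of 1 so each sample gets its own generated weights):

input_dim = 784
classes = 10

# The main (outer) model: its weights are generated, never trained directly.
outer_model = Linear(classes)
outer_model.built = True  # Prevent it from creating its own weights.

# The hypernetwork: generates all of the outer model's coefficients.
inner_model = Linear(input_dim * classes + classes)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

hyper_dataset = tf.data.Dataset.from_tensor_slices(
    (x_train.reshape(60000, 784).astype('float32') / 255, y_train)).batch(1)

for x, y in hyper_dataset.take(1000):
    with tf.GradientTape() as tape:
        # Predict the weights of the outer model from the input.
        weights_pred = inner_model(x)
        # Reshape them into a kernel `w` and a bias `b`.
        w_pred = tf.reshape(weights_pred[:, :input_dim * classes],
                            (input_dim, classes))
        b_pred = weights_pred[:, input_dim * classes:]
        outer_model.w = w_pred
        outer_model.b = b_pred
        # Run the outer model with the generated weights.
        preds = outer_model(x)
        loss = loss_fn(y, preds)
    # Only the hypernetwork's weights get trained.
    grads = tape.gradient(loss, inner_model.trainable_weights)
    optimizer.apply_gradients(zip(grads, inner_model.trainable_weights))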
Another quick example: implementing a VAE in either style, subclassing or the Functional API. I've posted this before. Find what works best for you!
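Here's a compact sketch of the Functional version, assuming MNIST-sized inputs (dimensions and names are illustrative); the subclassed version would wrap the same logic in `call` methods:

class Sampling(tf.keras.layers.Layer):
    """Samples z via the reparameterization trick."""

    def call(self, inputs):
        z_mean, z_log_var = inputs
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

original_dim = 784
latent_dim = 32

# Encoder: maps inputs to a sampled latent vector z.
inputs = tf.keras.Input(shape=(original_dim,))
h = tf.keras.layers.Dense(64, activation='relu')(inputs)
z_mean = tf.keras.layers.Dense(latent_dim)(h)
z_log_var = tf.keras.layers.Dense(latent_dim)(h)
z = Sampling()((z_mean, z_log_var))

# Decoder: maps z back to a reconstruction of the input.
h_dec = tf.keras.layers.Dense(64, activation='relu')(z)
outputs = tf.keras.layers.Dense(original_dim, activation='sigmoid')(h_dec)

vae = tf.keras.Model(inputs, outputs)

# The KL divergence term is added as a model-level loss (see point 7).
kl_loss = -0.5 * tf.reduce_mean(
    z_log_var - tf.square(z_mean) - tf.exp(z_log_var) + 1)
vae.add_loss(kl_loss)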
This is the end of this thread. Play with these code examples in this Colab notebook: colab.research.google.com/drive/17u-pRZJ… 🦄🚀