@amaarora explained #convolutions really well in the #fastbook week 12 session, which can be viewed here



I wasn't able to write a blog post explaining my learnings from the stream, so I'll therefore write a 🧵
2/n

After going through the 1st part of the #convolutions chapter, I have cleared up one concept and was introduced to two new ones.

1. How convolutions work across the depth dimension (3/n)
2. Dilated convolutions (7/n)
3. Alternate interpretation of #stride (9/n)
3/n

When we have an n-channel input and an m-channel output, we need to convolve not only over the two spatial dimensions (W x H) but also across the depth D.

An RGB image, for example, has 3 channels.

Let us say we want to derive 10 feature maps from this input.
4/n

We shall then have 10 kernels, each 3-D, with different weights across the depth dimension.

If we look at each kernel one at a time, we observe that its shape is in_channels x k x k

#DL
5/n
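
To make those shapes concrete, here's a minimal PyTorch sketch (the layer sizes just mirror the hypothetical 3-channel input / 10-feature-map example above):

```python
import torch.nn as nn

# 3-channel input, 10 output feature maps (matching the example above)
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)

# One 3-D kernel per output feature map, each of shape in_channels x k x k
print(conv.weight.shape)  # torch.Size([10, 3, 3, 3])
```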

The way a kernel gets applied is that each input channel is convolved with the corresponding kernel channel.

The results of these individual computations are then plainly added, and we get the final (1 x h x w) resultant feature map, or (1 x a fraction of h x a fraction of w) when stride or padding shrink the output.

#DL
6/n
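
A minimal sketch of that channel-wise sum, using torch.nn.functional.conv2d with made-up tensor sizes:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 8, 8)   # toy 3-channel input (sizes are illustrative)
w = torch.randn(1, 3, 3, 3)   # one 3-D kernel -> one output feature map

# Standard multi-channel convolution (no bias)
full = F.conv2d(x, w)

# Same computation done channel by channel: convolve each input channel with
# the matching kernel channel, then plainly add the three 1-channel results
per_channel = sum(F.conv2d(x[:, c:c+1], w[:, c:c+1]) for c in range(3))

print(torch.allclose(full, per_channel, atol=1e-6))  # True
```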

The following visual from this paper beautifully explains the same.

arxiv.org/pdf/1603.07285…
7/n

This paper also talks about dilated convolutions, which are basically, IMHO, convolutions with holes.

What this gives a neural network is a larger receptive field, so even the shallower layers are looking at a larger portion of the image, plus we get dimensionality reduction
8/n
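
A quick sketch of the "holes" idea using PyTorch's dilation argument (the input size here is made up):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 32, 32)  # toy single-channel input

# With dilation=2, the same 9 weights of a 3x3 kernel are spread over a
# 5x5 patch, growing the receptive field with no extra parameters
plain   = nn.Conv2d(1, 1, kernel_size=3, dilation=1)
dilated = nn.Conv2d(1, 1, kernel_size=3, dilation=2)

print(plain(x).shape)    # torch.Size([1, 1, 30, 30]) -> effective kernel 3x3
print(dilated(x).shape)  # torch.Size([1, 1, 28, 28]) -> effective kernel 5x5
```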

This means there is no need for pooling or larger strides. It was indeed a new way of thinking about convolutions, and I hope to try it out in the near future.

#CV #DL
9/n

Another beautiful perspective on high-stride convolutions is presented in this paper.

A convolution with stride > 1 is just a convolution with stride 1 where we retain only some of the elements in the output feature map.

#ComputerVision
10/n
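
A minimal sketch verifying that view (toy tensors, no padding):

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8)
w = torch.randn(1, 1, 3, 3)

# Convolution applied directly with stride 2
strided = F.conv2d(x, w, stride=2)

# Convolution with stride 1, then keeping every 2nd element of the output
dense = F.conv2d(x, w, stride=1)
subsampled = dense[:, :, ::2, ::2]

print(torch.allclose(strided, subsampled))  # True
```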

The above really helps in understanding the trade-offs when comparing pooling vs. a higher stride for reducing the feature map size deeper in the network.
