1/ I'm very happy to give a little thread today on our paper accepted at ICLR 2021!
🎉🎉🎉
In this paper, we show how to build ANNs that respect Dale's law and which can still be trained well with gradient descent. I will expand in this thread...
2/ Practically speaking, this means that each neuron is either excitatory or inhibitory: all of its outgoing connections share the same sign. It's not 100% true, nothing is in biology, but it's roughly true.
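To make the constraint concrete, here is a minimal PyTorch sketch of a sign-constrained linear layer (my illustration, not code from the paper): each presynaptic unit is assigned a fixed sign, and all of its outgoing weights are forced to share it.

```python
import torch
import torch.nn as nn


class SignConstrainedLinear(nn.Module):
    """Toy linear layer obeying Dale's law: each presynaptic (input)
    unit has a fixed sign, and all of its outgoing weights share it."""

    def __init__(self, n_in, n_out, frac_excitatory=0.8):
        super().__init__()
        self.raw_weight = nn.Parameter(0.1 * torch.randn(n_out, n_in))
        # Fixed +1/-1 sign per presynaptic unit (one sign per column).
        n_exc = int(frac_excitatory * n_in)
        signs = torch.cat([torch.ones(n_exc), -torch.ones(n_in - n_exc)])
        self.register_buffer("signs", signs)

    def forward(self, x):
        # Weight magnitudes are learned freely; the signs stay fixed.
        w = torch.abs(self.raw_weight) * self.signs
        return x @ w.t()
```

This is roughly what the "ColumnEI" baseline later in the thread refers to: each column of the weight matrix is forced to be all positive or all negative.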
3/ You may have wondered, "Why don't more people use ANNs that respect Dale's law?"
The rarely discussed reason is this:
When you try to train an ANN that respects Dale's law with gradient descent, it usually doesn't work as well -- worse than an ANN that ignores Dale's law.
4/ In this paper, we try to rectify this problem by developing corrections to standard gradient descent to make it work with Dale's law.
Our hope is that this will allow people in the Neuro-AI field to use ANNs that respect Dale's law more often, ideally as a default.
5/ To begin with, we assume we are dealing with feed-forward inhibitory interneurons, so that inhibitory units receive inputs from the layer below and project within the layer. Note, though, that all of our results can generalise to recurrent inhibition using an unrolled RNN.
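For intuition, here is a simplified sketch of a layer with feed-forward inhibition under Dale's law (again my illustration, not the exact DANN layer from the paper): excitatory units and inhibitory interneurons both read from the layer below, and the interneurons then subtract from the excitatory units' drive.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeedForwardEILayer(nn.Module):
    """Toy layer with feed-forward inhibition: excitatory units and
    inhibitory interneurons both receive the layer below's activity;
    the interneurons then inhibit the excitatory units subtractively."""

    def __init__(self, n_in, n_exc, n_inh):
        super().__init__()
        # Weights are stored unconstrained and rectified in forward(),
        # so each connection's sign is fixed by cell type while its
        # magnitude is learned.
        self.W_ex = nn.Parameter(0.1 * torch.randn(n_exc, n_in))   # input -> excitatory units
        self.W_ix = nn.Parameter(0.1 * torch.randn(n_inh, n_in))   # input -> interneurons
        self.W_ei = nn.Parameter(0.1 * torch.randn(n_exc, n_inh))  # interneurons -> excitatory (subtractive)

    def forward(self, x):
        exc_drive = F.linear(x, torch.abs(self.W_ex))
        inh_activity = F.linear(x, torch.abs(self.W_ix))
        inh_drive = F.linear(inh_activity, torch.abs(self.W_ei))
        return F.relu(exc_drive - inh_drive)
```

Only the excitatory units' activity is passed on to the next layer; the interneurons act purely within the layer.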
6/ To make gradient descent work well in these networks (which we call "Dale's ANNs", or "DANNs") we identify two important strategies.
7/ First, we develop methods for initialising the network so that it starts in a regime that is roughly equivalent to layer norm.
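As a hedged illustration of what that could look like (my guess at a simple version, not necessarily the paper's initialisation scheme), the feed-forward E/I sketch above can be initialised so that the interneuron subtracts the mean excitatory drive, i.e. the layer starts out mean-centring its input like the centring step of layer norm:

```python
# Illustrative only: initialise the FeedForwardEILayer sketch from above
# so that a single interneuron computes the average excitatory drive and
# subtracts it equally from every excitatory unit. At initialisation the
# excitatory drive is then mean-centred across units, which is the
# centring (though not the variance-normalising) step of layer norm.
import torch

n_in, n_exc, n_inh = 100, 50, 1
layer = FeedForwardEILayer(n_in, n_exc, n_inh)

with torch.no_grad():
    layer.W_ix.copy_(torch.abs(layer.W_ex).mean(dim=0, keepdim=True))
    layer.W_ei.fill_(1.0)
```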
8/ Second, inspired by natural gradients, we use the Fisher Information matrix to show that the interneurons have a disproportionate effect on the output distribution, which requires a correction term to their weight-updates to prevent instability.
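In very rough terms (and heavily simplified relative to the paper's actual correction), the idea is that a unit-sized change to the interneuron parameters moves the network's output distribution much more than the same-sized change elsewhere, so their raw gradient updates get rescaled. A sketch, with a made-up scaling hyperparameter:

```python
import torch


def scaled_sgd_step(layer, lr=0.1, inhib_scale=0.1):
    """One manual SGD step on the FeedForwardEILayer sketch above,
    updating the interneuron-related weights (W_ix, W_ei) more
    conservatively than the excitatory weights. Assumes .backward()
    has already been called so .grad is populated."""
    with torch.no_grad():
        layer.W_ex.sub_(lr * layer.W_ex.grad)
        layer.W_ix.sub_(lr * inhib_scale * layer.W_ix.grad)
        layer.W_ei.sub_(lr * inhib_scale * layer.W_ei.grad)
        for p in (layer.W_ex, layer.W_ix, layer.W_ei):
            p.grad = None
```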
9/ When we implement these changes, we can match a standard ANN that doesn't obey Dale's law!
Not so for ANNs that obey Dale's law but don't use these tactics!
(Here are MNIST results; ColumnEI is a simple ANN with each column of the weight matrices forced to be all + or all -.)
10/ This can work in RNNs (as noted above) and also in convnets, as shown here:
11/ There are many next steps! But, an immediate one will be developing PyTorch and TensorFlow modules with these corrections built-in to make it easy for other researchers to build and train DANNs.
12/ This work also raises interesting Qs about inhibitory plasticity. It can sometimes be tough to induce synaptic plasticity in interneurons. Is that because the brain has its own built-in mechanisms that correct for an outsized impact of interneurons on output behaviour?
13/ Another Q: why does the brain use Dale's law? We were only able to *match* standard ANNs. No one has shown *better* results with Dale's law. So, is Dale's law just an evolutionary constraint, a local minimum in the phylogenetic landscape? Or does it help in some other way?
14/ Recognition where due: this work was led by Jonathan Cornford (left), with important support from Damjan Kalajdzievski (right) and several others.
From now on, I will direct anyone who emails me to this form, which eliminates the "hidden curriculum" of how to write to a PI. Hopefully, this limits potential implicit bias on my part.
3/ This application form also allows people to identify as being from an under-represented group, if they wish, which will help the lab to consider diversity more concretely when we're making recruitment decisions.
1/n) A small thread on argument structure... I've been thinking about this bc @TheFrontalLobe_ recently posted about Tim Van Gelder's famous paper, and I turned into a complete jerk in response (sorry again André). I was asking myself, why does that paper make me so irritable?
2/n) I realized why: it's the structure of the argument in the paper. I also realized that many other papers that get under my skin share the same structure. I need to learn to be less of a jerk on Twitter, yes, but I also want to highlight why this structure bothers me so.
3/n) Here's the structure I find so irritating:
1. Note that concept A is central to current theories of the brain/mind.
2. Define A as implying X, Y, and Z; claim concept B does not.
3. Argue that X, Y, and Z are surely not how brains/minds work.
4. Conclude that we should abandon A in favour of B.
1/ For #ShutDownSTEM today, our lab put aside research and crafted some concrete ideas for what we can do to help reduce anti-Black/Indigenous racism and, more broadly, increase diversity in STEM. Our focus was on local, specific actions within the lab. I wanted to share what we came up with.
2/ (Item 1) We decided to alter the way in which members of the lab are selected, in order to reduce the influence of unconscious biases and barriers that could keep BIPOC from getting into the lab.
3/ Currently, the process is informal: ppl email me, and if I am impressed, I have a Zoom meeting and/or they give a talk and meet the lab. I then ask the lab's opinions, and make a final decision.
We think our results are quite exciting, so let's go!
2/ Here, we are concerned with the credit assignment problem. How can feedback from higher-order areas inform plasticity in lower-order areas in order to ensure efficient and effective learning?
3/ Based on the LTP/LTD literature (e.g. jneurosci.org/content/26/41/…), we propose a "burst-dependent synaptic plasticity" rule (BDSP). It says: if there is a presynaptic eligibility trace, then a postsynaptic burst potentiates the synapse, while a postsynaptic single spike depresses it.
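A minimal sketch of a rule of that flavour (my simplification for illustration; the paper's exact formulation differs):

```python
def bdsp_update(w, pre_eligibility, post_event, post_burst, lr=0.01):
    """Toy burst-dependent update for a single synaptic weight w.

    pre_eligibility: presynaptic eligibility trace (>= 0)
    post_event:      1.0 if the postsynaptic cell fired at all, else 0.0
    post_burst:      1.0 if that firing was a burst, else 0.0
    Bursts potentiate, single (non-burst) events depress, and both are
    gated by the presynaptic eligibility trace.
    """
    single = post_event * (1.0 - post_burst)
    dw = lr * pre_eligibility * (post_burst - single)
    return w + dw
```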
2/ I'm tempted to ignore it, but I think that would actually be a shame, because in many ways, it's a good article. Yet, it is also a confused article, and I worry about it confusing both scientists and the public more broadly. So, I'll just quickly address the confusion.
3/ The mistake is a classic mistake. @matthewcobb is not the first to make it, and I know he will not be the last. It's this: to think that Von Neumann machines (like our laptops) are the only type of computer, and that their properties define computation. That is false.
@GaryMarcus @r_chavarriaga @KordingLab @DeepMindAI You don't actually keep up with the neuroscience literature, do you? That has been evident in these conversations... Here, lemme give you a few examples:
@GaryMarcus @r_chavarriaga @KordingLab @DeepMindAI 1) ANNs optimised on relevant tasks match the representations in human (and primate) cortical areas better than other models developed to date: