Jesse Engel
Guitarist, Researcher at Google Brain. Opinions are my own.
Oct 22, 2022 4 tweets 1 min read
No reason, just feel the need to say it again: there is no such thing as "an AI". It'd be so much clearer if we actually described what the tech is (Algorithms) and what it does (Automate). Automation Algorithms, or Augmentation/Assistance Algorithms depending on how we use them.

Just try replacing "AI" with "Automation Algorithms" when reading headlines, it makes you feel sane again, or at least grounded back in reality.
May 26, 2022 4 tweets 2 min read
One takeaway for me from (#dalle2, #imagen, #flamingo) is there's no one "golden algorithm" to unlock these new transfer learning capabilities. Contrastive, AR, Freezing, Priors, they all can work. You almost can't stop these models from exhibiting these new types of behavior...

...It reminds me a lot of early DL days, when people used to think you needed sparsity regularization to learn nice Gabor filters in NNs, but then it turned out that almost any model with convolution and enough natural data would learn them on its own...
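The Gabor-filter point is easy to check for yourself. Below is a minimal sketch (my addition, not from the thread) that plots the first-layer convolution filters of an off-the-shelf pretrained ResNet; it assumes torchvision ≥ 0.13 and matplotlib are available. The model was trained with no sparsity regularization, yet oriented, Gabor-like edge and color detectors show up anyway.

```python
# Sketch: visualize the first-layer convolution filters of a pretrained
# vision model. No sparsity penalty was used in training, yet the
# filters come out as oriented, Gabor-like edge and color detectors.
import matplotlib.pyplot as plt
import numpy as np
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")
filters = model.conv1.weight.detach().numpy()      # shape: (64, 3, 7, 7)

# Rescale to [0, 1] for display.
filters = (filters - filters.min()) / (filters.max() - filters.min())

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    ax.imshow(np.transpose(f, (1, 2, 0)))          # (H, W, C) for imshow
    ax.axis("off")
plt.show()
```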
Nov 10, 2021 10 tweets 6 min read
Check out our latest blog post on using Transformers for Music Transcription: g.co/magenta/mt3

Authors: @jpgard, @ethanmanilow, @iansimon, @fjord41, @rigeljs, @jesseengel

Rather than training domain-specific models for each dataset, we show that a seq2seq approach can jointly train on many different datasets with arbitrary combinations of instruments. This is an important step towards general-purpose music transcription.
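For a concrete feel of how one seq2seq model can cover arbitrary instrument combinations, here is a hedged sketch of a MIDI-like event tokenization: notes from any mix of instruments flatten into one token stream a single decoder can emit. The token kinds and `tokenize` helper are illustrative stand-ins, not MT3's actual vocabulary or code.

```python
# Sketch of a MIDI-like event vocabulary for seq2seq transcription,
# in the spirit of MT3. The exact token set here is illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str   # "time", "program", "note_on", or "note_off"
    value: int

def tokenize(notes, time_step=0.01):
    """Flatten (onset, offset, pitch, program) notes into an event list.

    Interleaving program (instrument) tokens with note events is what
    lets a single decoder handle arbitrary combinations of instruments.
    """
    boundaries = []  # (time, is_offset, pitch, program)
    for onset, offset, pitch, program in notes:
        boundaries.append((onset, 0, pitch, program))
        boundaries.append((offset, 1, pitch, program))
    events = []
    for time, is_offset, pitch, program in sorted(boundaries):
        events.append(Event("time", round(time / time_step)))
        events.append(Event("program", program))
        events.append(Event("note_off" if is_offset else "note_on", pitch))
    return events

# A major third played jointly by piano (program 0) and flute (program 73).
notes = [(0.00, 0.50, 60, 0), (0.00, 0.50, 64, 73)]
for e in tokenize(notes):
    print(e)
```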
Sep 18, 2021 20 tweets 6 min read
It’s well known that neural networks model correlation and not causation.

Recently, I’ve found it helpful to think about NN blocks as literal correlations of correlations of correlations …

(incl. dense, norm, nonlin, conv, softmax, transformer, LMs, GANs, …)

🧵 1/20

This is probably obvious to a lot of people, but I found it interesting, so I thought I'd share. Corrections welcome 😀

2/
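To make the "correlations of correlations" framing concrete, here is a small sketch (my illustration, not from the thread): each output of a dense layer is a dot product between the input and one weight row, i.e. an unnormalized correlation; with mean-centering and length normalization it becomes exactly a Pearson correlation coefficient.

```python
# Sketch: a dense layer's outputs are dot products between the input
# and the weight rows (learned "templates"), i.e. unnormalized
# correlations. Normalizing recovers the Pearson correlation.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=128)          # input activations
W = rng.normal(size=(8, 128))     # 8 weight rows

dense_out = W @ x                 # unnormalized correlations, shape (8,)

def pearson(a, b):
    a = a - a.mean()
    b = b - b.mean()
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

corrs = np.array([pearson(w, x) for w in W])
print(dense_out)   # raw dot products
print(corrs)       # same comparisons, normalized to [-1, 1]
```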
Apr 27, 2021 4 tweets 1 min read
1/4 Sorry for another AI rant, I'm just reminded on a daily basis of how harmful the term really is. Almost all technologies could be much better described by saying what they actually do, where the "A" is "automation" and/or "augmentation", since very little about them is actually artificial.

Examples: automated decision making, automated policing, automated hiring, augmented writing, augmented creative tools, etc...

It gives a much clearer picture of what a technology does, how it changes power dynamics of society, and who's responsible for its creation and use.
May 1, 2020 17 tweets 4 min read
A lot of folks have been asking me my thoughts about the recent Jukebox work by @OpenAI, so I thought a thread might help. I feel like I have separate reactions from three different parts of my identity:

1) ML researcher
2) ML researcher of music
3) Musician

Long thread :)
1/17

1) As an ML researcher, I think the results are really impressive! The model builds directly off of the VQ-VAE2 work of @avdnoord, hierarchically modeling discrete codes with transformer priors, and off the autoregressive audio approaches of @sedielem.
2/17
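For readers unfamiliar with the VQ-VAE machinery mentioned above, here is a minimal sketch of the quantization step at its core, with illustrative names and shapes rather than Jukebox's actual code:

```python
# Sketch of vector quantization as used in VQ-VAE-style models like
# Jukebox: each encoder output gets snapped to its nearest codebook
# vector, yielding the discrete token sequence that a transformer
# prior then models autoregressively. (In training, a straight-through
# estimator, z + stop_gradient(z_q - z), passes gradients to the
# encoder; this NumPy version shows only the lookup.)
import numpy as np

def quantize(z, codebook):
    """z: (n, d) encoder outputs; codebook: (k, d) learned code vectors."""
    # Squared distance from every latent to every code: shape (n, k).
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    ids = d2.argmin(axis=1)        # discrete code indices
    return codebook[ids], ids

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 16))            # 4 latent vectors, 16 dims each
codebook = rng.normal(size=(512, 16))   # 512-entry codebook
z_q, ids = quantize(z, codebook)
print(ids)   # the token IDs a transformer prior would be fit to
```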
Jan 15, 2020 16 tweets 9 min read
Differentiable Digital Signal Processing (DDSP)! Fusing classic interpretable DSP with neural networks.

⌨️ Blog: magenta.tensorflow.org/ddsp
🎵 Examples: g.co/magenta/ddsp-e…
⏯ Colab: g.co/magenta/ddsp-d…
💻 Code: github.com/magenta/ddsp
📝 Paper: g.co/magenta/ddsp-p…

1/

2/ tl;dr: We've made a library of differentiable DSP components (oscillators, filters, etc.) and show that it enables combining strong inductive priors with expressive neural networks, resulting in high-quality audio synthesis with less data, less compute, and fewer parameters.
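To give a flavor of what "differentiable DSP components" means, here is a toy harmonic oscillator in NumPy. It shows only the math; the actual ddsp library implements these as differentiable TensorFlow ops with its own API. Because the synthesizer is just a smooth function of its controls (f0, amplitude envelope), a network can predict those controls and be trained end to end.

```python
# Sketch: a harmonic oscillator as a plain function of its controls.
import numpy as np

def harmonic_synth(f0_hz, amps, n_harmonics=20, sr=16000):
    """f0_hz, amps: per-sample fundamental frequency and amplitude envelope."""
    harmonics = np.arange(1, n_harmonics + 1)          # 1, 2, ..., N
    # Instantaneous phase = cumulative sum (integral) of frequency.
    phase = 2 * np.pi * np.cumsum(f0_hz) / sr          # (n_samples,)
    # Sum sinusoids at integer multiples of f0, with a 1/k rolloff.
    audio = (np.sin(phase[:, None] * harmonics) / harmonics).sum(axis=1)
    return amps * audio / np.abs(audio).max()          # rough normalization

sr = 16000
t = np.arange(sr) / sr
f0 = 220.0 * np.ones(sr)          # one second of a steady A3
amps = np.exp(-3.0 * t)           # simple decaying envelope
audio = harmonic_synth(f0, amps, sr=sr)
print(audio.shape, audio.dtype)   # (16000,) float64
```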
Feb 28, 2019 10 tweets 5 min read
Make music with GANs!
GANSynth is a new method for fast generation of high-fidelity audio.

🎵 Examples: goo.gl/magenta/gansyn…
⏯ Colab: goo.gl/magenta/gansyn…
📝 Paper: goo.gl/magenta/gansyn…
💻 Code: goo.gl/magenta/gansyn…
⌨️ Blog: magenta.tensorflow.org/gansynth

1/

2/ tl;dr: We show that for musical instruments, we can generate audio ~50,000x faster than a standard WaveNet, with higher quality (in both quantitative metrics and listener tests), and with independent control of pitch and timbre, enabling smooth interpolation between instruments.
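As an illustration of the pitch/timbre disentanglement, here is a hedged sketch of latent interpolation with pitch held fixed: the generator conditions on a latent vector (timbre) plus a one-hot pitch label, so sweeping the latent while freezing the pitch morphs between instruments. The `generator` call, 256-dim latent, and 61-way pitch one-hot are assumptions for illustration, not GANSynth's exact interface.

```python
# Sketch: interpolate between two instrument latents at a fixed pitch.
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation, which stays near the latent shell."""
    cos_omega = z0 @ z1 / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    return (np.sin((1 - t) * omega) * z0 +
            np.sin(t * omega) * z1) / np.sin(omega)

rng = np.random.default_rng(0)
z_violin, z_flute = rng.normal(size=(2, 256))   # latents for two timbres
pitch = np.eye(61)[24]                          # one-hot pitch, held constant

for t in np.linspace(0.0, 1.0, 5):
    z = slerp(z_violin, z_flute, t)
    # audio = generator(z, pitch)   # hypothetical call to a trained model
    print(t, z[:3])
```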