Google DeepMind
Sep 9, 2021
Introducing the '21 DeepMind x @ai_ucl Reinforcement Learning Lecture Series, a comprehensive introduction to modern RL.

Follow along with our researchers as they explore Markov Decision Processes, sample-based learning algorithms & much more: dpmd.ai/2021RLseries
Also find the full series via the DeepMind @YouTube channel: dpmd.ai/DeepMindxUCL21
In the first lecture of the series, Research Scientist Hado introduces the course and explores the fascinating connection between reinforcement learning and artificial intelligence: dpmd.ai/RLseries1

#DeepMindxUCL @ai_ucl
In lecture two, Research Scientist Hado explains why it's important for learning agents to balance exploring with exploiting the knowledge they have already acquired: dpmd.ai/RLseries2

#DeepMindxUCL @ai_ucl
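
As a rough illustration of the exploration/exploitation trade-off, here is a minimal epsilon-greedy sketch on a multi-armed bandit. It is not taken from the lecture: the function name, the Gaussian reward model and the parameter values are all assumptions chosen for the example.

```python
import random


def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Illustrative epsilon-greedy agent on a Gaussian bandit (hypothetical setup)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample mean.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy_bandit([0.0, 0.5, 2.0])
    print("estimates:", estimates)
    print("pull counts:", counts)
```

With `epsilon=0` the agent can lock onto a poor arm, and with `epsilon=1` it never uses what it has learned; the small non-zero value is one simple way to do a bit of both.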
In the third lecture, Research Scientist Diana shows us how to solve MDPs with dynamic programming to extract accurate predictions and good control policies: dpmd.ai/RLseries3

#DeepMindxUCL @ai_ucl
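
To make "solving an MDP with dynamic programming" concrete, here is a small value-iteration sketch. It is not from the lecture: the two-state MDP, the `transitions` dictionary layout and all names are assumptions made for illustration, and the sweep updates values in place.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Value iteration on a tabular MDP.

    transitions[state][action] is a list of (probability, next_state, reward)
    tuples. This layout is an assumption of this sketch.
    """
    values = {s: 0.0 for s in transitions}

    def backup(s, a):
        # Expected one-step return of taking action a in state s.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in transitions[s][a])

    while True:
        delta = 0.0
        for s in transitions:
            best = max(backup(s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(transitions[s], key=lambda a: backup(s, a)) for s in transitions}
    return values, policy


# Hypothetical two-state MDP: state 1 pays 2.0 per step forever.
toy_mdp = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)]},
}

if __name__ == "__main__":
    values, policy = value_iteration(toy_mdp)
    print(values, policy)
```

For this toy problem the values can be checked by hand: state 1 is worth 2 / (1 - 0.9) = 20, and going from state 0 is worth 1 + 0.9 × 20 = 19.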
In lecture four, Diana covers dynamic programming algorithms as contraction mappings, looking at when and how they converge to the right solutions: dpmd.ai/RLseries4

#DeepMindxUCL @ai_ucl
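
For readers who want the contraction idea in symbols, the standard textbook statement is sketched below. The notation (Bellman optimality operator T*, transition probabilities p, reward r, discount factor γ < 1) is assumed here and may differ from the lecture's.

```latex
(T^{*}v)(s) = \max_{a} \sum_{s'} p(s' \mid s, a)\,\bigl[r(s, a, s') + \gamma\, v(s')\bigr],
\qquad
\lVert T^{*}v - T^{*}u \rVert_{\infty} \le \gamma\, \lVert v - u \rVert_{\infty}.
```

Because T* shrinks distances by a factor of γ, repeated application converges to its unique fixed point v*, with error bounded by ‖v_k − v*‖∞ ≤ γ^k ‖v_0 − v*‖∞.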
In this lecture, Hado explores model-free prediction and its relation to Monte Carlo and temporal difference algorithms: dpmd.ai/RLseries5

#DeepMindxUCL @ai_ucl
In part two of the model-free lecture, Hado explains how to use prediction algorithms for policy improvement, leading to algorithms - like Q-learning - that can learn good behaviour policies from sampled experience: dpmd.ai/RLseries6

#DeepMindxUCL @ai_ucl
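
A minimal tabular Q-learning sketch shows how a good policy can be learned from sampled experience alone. It is not the lecture's code: the five-state chain environment, the `step` helper and every parameter value are assumptions for illustration.

```python
import random

LEFT, RIGHT = 0, 1


def step(state, action, n_states):
    """Hypothetical deterministic chain: reward 1.0 only on reaching the last state."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy with random tie-breaking.
            if rng.random() < epsilon or q[state][LEFT] == q[state][RIGHT]:
                action = rng.choice((LEFT, RIGHT))
            else:
                action = LEFT if q[state][LEFT] > q[state][RIGHT] else RIGHT
            next_state, reward, done = step(state, action, n_states)
            # Q-learning target: bootstrap from the best action in the next state.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    q = q_learning()
    for s, (q_left, q_right) in enumerate(q[:-1]):
        print(s, round(q_left, 3), round(q_right, 3))
```

The update bootstraps from the maximum over next-state action values rather than from the action the agent actually takes next, which is what lets it learn about the greedy policy while still behaving exploratorily.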
In this lecture, Hado explains how deep learning and reinforcement learning combine into deep reinforcement learning. He looks at the properties and difficulties that arise when combining function approximation with RL algorithms: dpmd.ai/RLseries7

#DeepMindxUCL @ai_ucl
In this lecture, Research Engineer Matteo explains how to learn and use models, including algorithms like Dyna and Monte-Carlo tree search (MCTS): dpmd.ai/RLseries8

#DeepMindxUCL @ai_ucl

