Nobody:

Me: I will give you my opinion about AlphaStar being a Grandmaster. A (long) thread ⤵️
The Good:

1. First, we can congratulate @DeepMindAI: AlphaStar's results released yesterday are impressive from both a technical and a scientific point of view, given how hard it is to get one (or here, three) AI(s) to play such a complex game at this level.
2. More in their blog post (deepmind.com/blog/article/A…) than in their scientific paper (nature.com/articles/s4158…), DeepMind is honest about the possible, concrete applications of this work: so far, there are none. But that does not matter:
One should see this kind of work as an "excellent training ground to advance [complex, real-world domains]", as written in their blog post.
3. Even if DeepMind chooses not to release AlphaStar's source code (I will come back to that in the next tweet), they do release details about their neural network architecture and its hyper-parameters. However, we can regret that they do not release their network weights (139 million of them!!)
The Bad:

4. We can only regret the choice of not releasing the source code, as was also the case for AlphaGo and AlphaZero. We can also question this strategy, since they won't sell their Alpha agents.
5. As I explained in detail in my AlphaStar thread last January (), having a bot without APM limits and without a camera-limited view is not a problem for me. However, it seems important for DeepMind to show they apply those limitations, so let's dig into it.
AlphaStar plays with limited APM. Or rather, with limited "agent actions", a homemade DeepMind metric. AlphaStar is limited to 22 "agent actions" within 5 seconds. An "agent action" is described as potentially 3 clicks (unit, action and target selections).
This gives us nearly 800 classic APM! DeepMind explains it is not trivial to convert "agent actions" into APM, so why not limit APM directly?
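To make that ~800 figure concrete, here is my own back-of-the-envelope conversion (a rough sketch assuming the 3-clicks-per-agent-action upper bound mentioned above, not an official DeepMind formula):

```python
# Rough conversion from AlphaStar's "agent actions" to classic APM.
# Assumption (mine): each agent action expands into up to 3 clicks
# (unit selection, action selection, target selection).
agent_actions = 22            # agent actions allowed...
window_seconds = 5            # ...within this time window
clicks_per_agent_action = 3   # assumed upper bound per agent action

clicks_per_second = agent_actions * clicks_per_agent_action / window_seconds
apm = clicks_per_second * 60
print(apm)  # 792.0 -> "nearly 800 classic APM"
```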
6. In addition, DeepMind explains in their paper that AlphaStar performs worse with stricter APM limits (OK, that makes sense) but also with more relaxed APM limits, and even with no APM limits at all (what?!). This shows two things:
a. high APM does not lead to victory, contrary to what many people think when it comes to bots, and b. as written in the paper, AlphaStar struggles to focus on learning strategy rather than micro-management if one does not set clear boundaries.
It is also clear to me that they set their ~800 APM limit experimentally.
7. DeepMind writes in their blog that AlphaStar plays under a camera-limited view only, but in the paper, only enemy units outside the camera view are unreachable. All of AlphaStar's own units remain selectable, even outside the camera view. "What's the problem?", you wonder, since humans can make
control groups. Well, it's as if AlphaStar had a control group for EACH single unit. A human cannot do that.
8. As always, DeepMind likes to distinguish machine learning from "hard-coded methods". As @DaveChurchill recalled 3 weeks ago at the last @AIIDEconference, DeepMind hard-coded features of their NN, as well as statistics from human data given as inputs.
9. They stressed that "the interface and restrictions were approved by a professional player", i.e., TLO. But TLO has been quite close to the AlphaStar team since the show matches of December 2018.
DeepMind paid for several of his trips to London, and I would not be surprised if he was paid as a consultant. One can have some doubts about the objectivity of this (single) approval.
10. AlphaStar cannot really find new strategies by itself: the system needs guidance based on carefully hand-selected human data. However, DeepMind clearly states this limitation in the paper.
11. As always with neural networks, learning is deadly slow: 150 million StarCraft II games were played. DeepMind does not say what the average game length was, but assuming an average SC2 game lasts about 12 minutes, this gives roughly 3,400 years of play!!!
Condensed into 44 days, you can guess the monstrous infrastructure used for this (as well as the carbon cost).
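Here is the back-of-the-envelope calculation behind those numbers (the 12-minute average game length is my own assumption, not a figure from the paper):

```python
# Rough estimate of the total play time behind AlphaStar's training.
games = 150_000_000       # games played during training (from the paper)
avg_game_minutes = 12     # assumed average SC2 game length (my assumption)

total_minutes = games * avg_game_minutes
total_years = total_minutes / (60 * 24 * 365)
print(round(total_years))             # ~3425 years of play

wall_clock_days = 44                  # training duration reported by DeepMind
parallel_factor = total_years * 365 / wall_clock_days
print(round(parallel_factor))         # ~28,400 games running in parallel, on average
```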
12. Details about delays are unclear in the paper: they write that the delay due to latency, processing and computing is 110 ms, but a figure also states a 200 ms "requested delay" (?). The average human reaction time is roughly 190 to 250 ms.
The Ugly:

13. The January 2019 version of AlphaStar was a set of 5 AIs playing Protoss. Now it is 3 AIs, each playing one race, so it is still not a single AI able to play all 3 races. And with the current system, it does not seem possible for AlphaStar to play Random.
14. Nothing indicates that AlphaStar is able to adapt its strategy during a game. I bet it is doing the same thing we saw last January during the show matches: picking one strategy at the beginning of the game and sticking with it, whatever the situation.
15. Last April, I criticized the fact that games on Battle.net were blind, i.e., players did not know they were playing against AlphaStar. Here again in this paper, the games were blind, so human players could not even try to look for flaws.
Notice also that 30 games with each race against GM players seems too few to state that AlphaStar is a Grandmaster: it may be enough to reach the Grandmaster level, but is it enough to stay in that category?
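To illustrate how little 30 games tell us, here is my own rough sketch (with a made-up score, not data from the paper): the uncertainty on a win rate estimated from only 30 games is large.

```python
# Wilson score interval (95%) on a win rate estimated from few games.
# The 25-wins-out-of-30 score below is hypothetical, purely for illustration.
import math

def wilson_interval(wins, games, z=1.96):
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    margin = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return center - margin, center + margin

low, high = wilson_interval(25, 30)   # an 83% observed win rate
print(f"95% CI on the true win rate: {low:.0%} to {high:.0%}")  # roughly 66% to 93%
```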