labml.ai

22 Jun, 8 tweets, 5 min read

Learning to play Kuhn Poker with Monte Carlo Counterfactual Regret Minimization (MC-CFR) in #python

📝 Code/Tutorial: nn.labml.ai/cfr/index.html

This isn't deep learning, but it should be interesting if you work on machine learning or incomplete-information games, or if you play #poker.

🧵👇

2/8) Kuhn Poker is a simple 2-player betting game with three cards (A, K, Q). A single card is dealt to each player. Players take turns betting chips, and at showdown the player with the higher card wins the chips. If a player folds, the other player wins the chips.
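
The payoff rules above can be sketched in a few lines of Python. This is only an illustration, not the code from the tutorial: the thread does not give chip amounts, so the sketch assumes the common convention of a 1-chip ante and a 1-chip bet, and the names `RANK`, `TERMINAL` and `payoff` are hypothetical.

```python
# Assumed conventions (not stated in the thread): each player antes 1 chip,
# a bet costs 1 chip, 'p' = pass/check/fold and 'b' = bet/call.
RANK = {"Q": 0, "K": 1, "A": 2}
TERMINAL = {"pp", "bp", "bb", "pbp", "pbb"}


def payoff(cards, history):
    """Chips won by player 0 (first to act); player 1 wins the negative."""
    if history not in TERMINAL:
        raise ValueError(f"{history!r} is not a terminal history")
    if history == "bp":
        return 1   # player 1 folds to player 0's bet
    if history == "pbp":
        return -1  # player 0 folds to player 1's bet
    # Showdown: 2 chips each if the bet was called, otherwise just the ante.
    stake = 2 if history.endswith("bb") else 1
    return stake if RANK[cards[0]] > RANK[cards[1]] else -stake


print(payoff(("A", "K"), "pp"))   # showdown for the antes
print(payoff(("Q", "K"), "bb"))   # called bet, player 1 holds the higher card
```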

👇

3/8) CFR finds the Nash equilibrium with self-play. In each iteration, it calculates the regret of following the current strategy instead of playing each action. Then it updates the strategy with regret matching:

strategy(action) = positive regret of action / total positive regret of all actions
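
A minimal sketch of regret matching, assuming the usual formulation in which only positive cumulative regrets count and the strategy falls back to uniform when no action has positive regret. The function name and the dictionary layout are illustrative choices, not taken from the linked implementation.

```python
from typing import Dict


def regret_matching(cumulative_regret: Dict[str, float]) -> Dict[str, float]:
    """Turn cumulative regrets into a probability distribution over actions."""
    # Negative regrets are clipped to zero: we never "regret" not taking
    # an action that would have done worse.
    positive = {a: max(r, 0.0) for a, r in cumulative_regret.items()}
    total = sum(positive.values())
    if total > 0:
        return {a: r / total for a, r in positive.items()}
    # Assumed fallback: play uniformly when no action has positive regret.
    return {a: 1.0 / len(positive) for a in positive}


print(regret_matching({"p": 3.0, "b": 1.0}))
print(regret_matching({"p": -2.0, "b": -1.0}))
```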

👇

4/8) The average of the strategies throughout the iterations gets close to the Nash equilibrium as we iterate.
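
One way to track that average is to keep a running sum of the strategies and normalize it on demand. The sketch below is an assumption-laden illustration: the class name is made up, and the optional `weight` stands in for the reach-probability weighting that CFR typically uses, which the thread does not spell out.

```python
class AverageStrategy:
    """Running (optionally weighted) average of per-iteration strategies."""

    def __init__(self, actions):
        self.strategy_sum = {a: 0.0 for a in actions}

    def add(self, strategy, weight=1.0):
        # `weight` is assumed to be the acting player's reach probability
        # in CFR; 1.0 gives a plain average.
        for action, prob in strategy.items():
            self.strategy_sum[action] += weight * prob

    def average(self):
        total = sum(self.strategy_sum.values())
        if total > 0:
            return {a: s / total for a, s in self.strategy_sum.items()}
        # Nothing accumulated yet: fall back to uniform.
        n = len(self.strategy_sum)
        return {a: 1.0 / n for a in self.strategy_sum}


avg = AverageStrategy(["p", "b"])
avg.add({"p": 1.0, "b": 0.0})
avg.add({"p": 0.0, "b": 1.0})
avg.add({"p": 0.5, "b": 0.5})
print(avg.average())
```

The strategy from any single iteration can swing a lot; it is this average that is reported as the approximate equilibrium.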

A Nash equilibrium is a state where no player can increase their expected payoff by changing only their own strategy.

👇

5/8) The strategy is a function of "information set" and gives a probability distribution across actions. An "information set" is the state of the game that’s visible to the player.
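
For Kuhn Poker, an information set can be identified by the acting player's own card plus the betting history so far. The sketch below assumes that encoding ('p' = pass, 'b' = bet); `info_set_key` and the example `strategy` table are hypothetical names and values used only for illustration.

```python
def info_set_key(cards, history):
    """Key for what the player to act can see: own card + public history."""
    player = len(history) % 2  # players alternate, player 0 acts first
    return cards[player] + history


# A strategy maps each information set to a distribution over actions.
# The probabilities here are placeholders, not a solved strategy.
strategy = {
    "K": {"p": 0.5, "b": 0.5},
    "Kpb": {"p": 0.5, "b": 0.5},
}

# The opponent's card is hidden, so both deals give player 0 the same key.
print(info_set_key(("K", "A"), "pb"))
print(info_set_key(("K", "Q"), "pb"))
```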

👇

6/8) Our implementation is accompanied by a lengthy introduction to CFR and MCCFR. The MCCFR implementation is decoupled from the Kuhn Poker game logic, and we will add a Leduc Poker implementation soon.

👇

7/8) Here’s the Kuhn Poker experiment: nn.labml.ai/cfr/kuhn/index…

Colab Notebook with some visualizations:
colab.research.google.com/github/lab-ml/…

Results in @weights_biases:
wandb.ai/vpj/kuhn_poker…

👇

8/8) This implementation is based on a tutorial @vpj wrote about a year ago:
github.com/vpj/poker/blob…

If you find this useful, we will add implementations for Leduc Poker and for more efficient CFR variants such as public chance sampling (PCS).
