labml.ai
Jun 22, 2021 • 8 tweets • 5 min read
Learning to play Kuhn Poker with Monte Carlo Counterfactual Regret Minimization (MC-CFR) in #python

📝 Code/Tutorial: nn.labml.ai/cfr/index.html

This isn't deep learning, but it should be interesting if you do machine learning, like incomplete-information games, or play #poker.

🧵👇
2/8) Kuhn Poker is a simple 2-player betting game with three cards (A, K, Q). A single card is dealt to each player. Players take turns betting chips, and the player with the higher card wins the chips. If a player folds, the other player wins the chips.
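
A minimal sketch of how these rules turn into payoffs. The thread does not spell out the stakes, so this assumes the standard Kuhn Poker setup (each player antes 1 chip and there is at most one 1-chip bet). The function name `payoff` and the history encoding (`p` for pass/fold, `b` for bet/call) are hypothetical choices for illustration, not necessarily what the linked implementation uses.

```python
from typing import Optional

# Card strength for the three cards in the thread; A is highest.
RANK = {"Q": 0, "K": 1, "A": 2}

# Assumed encoding: 'p' = pass/check/fold, 'b' = bet/call.
TERMINAL_HISTORIES = ("pp", "bb", "bp", "pbp", "pbb")


def payoff(cards: str, history: str) -> Optional[int]:
    """Chips won by player 1 at a terminal history, or None if play continues.

    cards: two letters, player 1's card then player 2's card, e.g. "KQ".
    Assumes a 1-chip ante per player and a single 1-chip bet.
    """
    if history not in TERMINAL_HISTORIES:
        return None
    if history == "bp":   # player 1 bets, player 2 folds
        return 1
    if history == "pbp":  # player 1 checks, player 2 bets, player 1 folds
        return -1
    # Showdown: only antes at stake after "pp", ante plus bet otherwise.
    stake = 1 if history == "pp" else 2
    return stake if RANK[cards[0]] > RANK[cards[1]] else -stake
```

For example, `payoff("QA", "bp")` is `1`: player 1 wins the ante with the worst card because player 2 folded.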

👇
3/8) CFR finds a Nash equilibrium through self-play. In each iteration, it calculates the regret of having followed the current strategy instead of always playing each individual action, and accumulates it. Then it updates the strategy with regret matching:

strategy(action) = positive cumulative regret of action / total positive cumulative regret of all actions
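
A minimal sketch of regret matching for a single information set. Only positive cumulative regrets count, and falling back to a uniform strategy when no action has positive regret is the usual convention, assumed here. The name `regret_matching` and the dictionary layout are hypothetical.

```python
from typing import Dict


def regret_matching(cumulative_regret: Dict[str, float]) -> Dict[str, float]:
    """Turn cumulative regrets into a probability distribution over actions."""
    # Negative regrets are clipped to zero: those actions get no weight.
    positive = {a: max(r, 0.0) for a, r in cumulative_regret.items()}
    total = sum(positive.values())
    if total > 0:
        return {a: r / total for a, r in positive.items()}
    # No action has positive regret: play uniformly at random (assumed convention).
    n = len(positive)
    return {a: 1.0 / n for a in positive}
```

With regrets `{"p": 3.0, "b": 1.0}` the next strategy passes with probability 0.75 and bets with probability 0.25.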

👇
4/8) The average of the strategies across iterations gets closer to a Nash equilibrium as we iterate.

A Nash equilibrium is a state where no player can increase their expected payoff by unilaterally changing their strategy.
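
A minimal sketch of the averaging step for one information set. In CFR the average is usually weighted by the acting player's probability of reaching the information set in each iteration; that weighting is assumed here, and `average_strategy` and its input format are hypothetical.

```python
from typing import Dict, List, Tuple


def average_strategy(
    iterations: List[Tuple[float, Dict[str, float]]]
) -> Dict[str, float]:
    """Average per-iteration strategies, weighted by reach probability.

    iterations: one (reach probability, strategy) pair per CFR iteration.
    """
    totals: Dict[str, float] = {}
    for reach, strategy in iterations:
        for action, prob in strategy.items():
            totals[action] = totals.get(action, 0.0) + reach * prob
    if not totals:
        return {}
    norm = sum(totals.values())
    if norm > 0:
        return {a: v / norm for a, v in totals.items()}
    # Information set never reached: fall back to uniform (assumed convention).
    return {a: 1.0 / len(totals) for a in totals}
```

It is this averaged strategy, not the strategy of the last iteration, that is reported as the approximate equilibrium.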

👇
5/8) The strategy is a function of the "information set" and gives a probability distribution over actions. An "information set" is the part of the game state that's visible to the player: in Kuhn Poker, their own card and the betting so far, but not the opponent's card.
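
A minimal sketch of an information set key for Kuhn Poker, assuming a hypothetical encoding where the betting history is a string of `p` (pass) and `b` (bet) and the players alternate turns. `info_set_key` is an illustrative name, not the linked implementation's API.

```python
def info_set_key(cards: str, history: str) -> str:
    """Key for what the player to act can see.

    cards: player 1's card then player 2's card, e.g. "KQ".
    history: actions so far; players alternate, player 1 acts first.
    """
    player = len(history) % 2  # 0 = player 1, 1 = player 2
    # Only the acting player's own card plus the public betting history.
    return cards[player] + history


# A strategy maps each key to a distribution over actions, for example:
example_strategy = {"K": {"p": 1.0, "b": 0.0}}
```

Deals that differ only in the hidden card map to the same key, so `info_set_key("KQ", "")` and `info_set_key("KA", "")` both give `"K"` and the player must use the same strategy in both.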

👇
6/8) Our implementation is accompanied by a lengthy introduction to CFR and MCCFR. The MCCFR implementation is kept separate from the Kuhn Poker game code, and we will add a Leduc Poker implementation soon.

👇
7/8) Here's the Kuhn Poker experiment: nn.labml.ai/cfr/kuhn/index…

Colab Notebook with some visualizations:
colab.research.google.com/github/lab-ml/…

Results in @weights_biases:
wandb.ai/vpj/kuhn_poker…

👇
8/8) This implementation is based on a tutorial @vpj wrote about a year ago:
github.com/vpj/poker/blob…

If you find this useful, we will add implementations for Leduc Poker and more efficient variants of CFR such as public chance sampling (PCS).
