Matteo Capucci
Jun 27, 2022 · 23 tweets
✨📜 New paper out!

'Diegetic representation of feedback in open games' arxiv.org/abs/2206.12338, accepted for proceedings track at ACT'22

TL;DR: 'reverse-mode diff' for games clarifies game-theoretic structure, fixes conceptual issues and unifies them with learners

🧵 0/n
At least two of the original inventors of open games, @_julesh_ and @philipp_m_zahn, have been quite enthusiastic about the ideas in the paper (and afaiu also @Anarchia45 liked them :) 1/n
First, what the heck does 'diegetic' mean?
It's a loanword from narratology and means 'internal to the story being narrated'. You might be familiar with 'diegetic narrators', i.e. narrators who also feature in the story they recount, vs 'extradiegetic narrators', who don't 2/n
I picture game specifications (extensive form and open games, specifically) as telling a story about how a game is played, featuring players as characters. The climax of the story is when players react to the outcomes of the game by deviating from their strategy 3/n
So 'diegetic representation of feedback' means to explicitly represent this part of the story in the 'game system'. Until now, this part was handled *extradiegetically*, using decoration by equilibrium predicates, best response and selection functions. 4/n
The problem? The most important part of the system, the strategic core of the game, which ends up computing Nash equilibria, had to be handled ad-hoc and wasn't really recognized as part of the system 5/n
Most embarrassingly, when last year we outlined some of the core ideas regarding the representation of agency in cybercats (matteocapucci.wordpress.com/2021/06/21/ope…), we missed that 'open games with agency' were, according to our criterion, devoid of agency!😱 6/n
In fact the selection function of an open game is a decoration on the top boundary object and not a morphism, which would represent a system of agents. Remember: morphisms are systems, objects are just boundaries. 7/n
Compare this to learners (depicted), in which over a controlled system L lies a well-distinguished controller system, the GD (gradient descent) lens, which diegetically embodies a cybernetic feedback action-reaction cycle. 8/n
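To make the action-reaction cycle concrete, here is a minimal sketch of a gradient descent controller written as a forward/backward pair. The toy loss, the learning rate and the function names are all assumptions made for illustration; they are not taken from the paper.

```python
def loss_gradient(p: float) -> float:
    """Gradient of a toy loss (p - 3)^2. The loss is an illustrative assumption."""
    return 2.0 * (p - 3.0)


def gd_forward(param: float) -> float:
    """Action: the controller exposes its current parameter to the controlled system."""
    return param


def gd_backward(param: float, grad: float, lr: float = 0.1) -> float:
    """Reaction: the controller receives a gradient as feedback and updates the parameter."""
    return param - lr * grad


# One cybernetic cycle per iteration: act, receive feedback, react.
param = 0.0
for _ in range(100):
    exposed = gd_forward(param)
    grad = loss_gradient(exposed)
    param = gd_backward(param, grad)
```

The point of the sketch is only that the update rule lives in the backward half of the controller, so the feedback handling is part of the system rather than an annotation on it.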
The main problem in doing so for open games lies in the extreme counterfactuality of players' feedback processing. In fact in open games (and in all game theory before them) players are pictured receiving a payoff as their only feedback 9/n
Usually players in games are pictured as utility-maximizing agents which receive a payoff they strive to maximize. However, this picture is (partially) wrong! This took me a long time to realize, but then I was enlightened 10/n
Players do not receive just their payoff as feedback, but *an entire payoff function*. Then they select their strategy in order to maximize what they can get from that. Without such an informative feedback from the game arena, they wouldn't be able to compute best responses 11/n
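A small sketch of why the whole payoff function is needed: computing a best response means comparing the payoffs of all available strategies, which a single number cannot provide. The strategy names and payoff values below are hypothetical.

```python
from typing import Callable, Iterable, List, TypeVar

S = TypeVar("S")


def best_responses(strategies: Iterable[S], payoff: Callable[[S], float]) -> List[S]:
    """Return every strategy maximizing the payoff function received as feedback."""
    options = list(strategies)
    best = max(payoff(s) for s in options)
    return [s for s in options if payoff(s) == best]


# Hypothetical feedback: a payoff for each of three moves.
payoff_table = {"left": 1.0, "middle": 3.0, "right": 3.0}
responses = best_responses(payoff_table, payoff_table.__getitem__)
```

Had the player only been told "you got 1.0" after playing "left", there would be nothing to maximize over.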
Players that only receive a payoff from the system are reinforcement learning agents, and balance such low-bandwidth feedback with richer internal dynamics. Incidentally, you can read about this in @RiuSakamoto & @_julesh_'s latest paper arxiv.org/abs/2206.04547 12/n
So the main idea in my paper is to take this seriously and rethink games as feedback systems where feedback is given by a sort of backpropagation of payoff functions. The results are striking! 13/n
In the paper I show you can associate a feedback dynamics functorially to a given play dynamics. That is, given the way players interact and suitable payoff types, we can automatically build the so-called 'coplay' of a game, i.e. the backward pass of an open game 14/n
Fixing payoffs P, this is achieved by first defining a functor P^* : Set -> Lens(Set) which does a very simple thing: given f:X->Y, sends it to the lens (X,P^X) -> (Y,P^Y) given by (f, P^f), where P^f : u ↦ f;u 15/n
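Here is a rough sketch of P^* in code, assuming lenses are modelled as plain pairs of functions. The `Lens` class and the name `p_star` are illustrative choices, not notation from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    forward: Callable[[Any], Any]        # X -> Y
    backward: Callable[[Any, Any], Any]  # (X, feedback on Y) -> feedback on X

    def then(self, other: "Lens") -> "Lens":
        """Sequential lens composition: run self, then other."""
        return Lens(
            forward=lambda x: other.forward(self.forward(x)),
            backward=lambda x, fb: self.backward(
                x, other.backward(self.forward(x), fb)
            ),
        )


def p_star(f: Callable[[Any], Any]) -> Lens:
    """Send f : X -> Y to the lens (f, P^f), where P^f(u) = f;u for u : Y -> P."""
    # The backward part ignores the state: it only precomposes the payoff function with f.
    return Lens(forward=f, backward=lambda _x, u: (lambda x: u(f(x))))


# Functoriality check on hypothetical maps: composing lenses matches composing functions.
f = lambda x: x + 1
g = lambda y: 2 * y
u = lambda z: -z  # a payoff function on the final codomain

composite = p_star(f).then(p_star(g))
direct = p_star(lambda x: g(f(x)))
```

Both `composite` and `direct` push 3 forward to 8 and pull `u` back to the function x ↦ u(g(f(x))).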
The absolutely outrageous fact about this functor is, it's basically the functor P^(-), i.e. the contravariant hom-functor Set(-, P), but crucially landing in lenses allows us to define a monoidal structure on it 16/n
That's probably the most important finding in this work. The Nashator is the fundamental bit of math making games act weird and interesting. It's an extremely forgetful thing: it projects lines from a payoff matrix, thus losing most of the information in it. 17/n
Magically, the Nashator is the spice that makes the counterfactual analysis of players in a game even possible. Together with lens composition, it handles all the complexity of payoff backpropagation 18/n
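A sketch of what 'projecting lines from a payoff matrix' can look like for two players, assuming the backward part takes the actual play (x, y) and a joint payoff function u, and hands each player the slice of u obtained by fixing the other player's move. The function name, the two-player restriction and the prisoner's dilemma numbers are assumptions for illustration.

```python
from typing import Any, Callable, Tuple

# Hypothetical prisoner's dilemma: (move1, move2) -> (payoff1, payoff2).
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}


def joint_payoff(a: str, b: str) -> Tuple[int, int]:
    return PAYOFFS[(a, b)]


def nashator_backward(
    play: Tuple[Any, Any], u: Callable[[Any, Any], Any]
) -> Tuple[Callable[[Any], Any], Callable[[Any], Any]]:
    """Split a joint payoff function into one unilateral slice per player."""
    x, y = play
    return (
        lambda x_alt: u(x_alt, y),  # player 1 varies, player 2 stays at y
        lambda y_alt: u(x, y_alt),  # player 2 varies, player 1 stays at x
    )


for_player_1, for_player_2 = nashator_backward(("C", "D"), joint_payoff)
```

Out of the four entries of the matrix, each player only sees the two on their own line through the current play, which is the forgetfulness mentioned above.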
We can now represent players' counterfactual analysis because the game shoots up to players their entire payoff function, crucially conditioned on other players' choices. So now we have the right data to even be able to consider a system of players above the arena 19/n
Essentially, these systems are given by selection functions (here, argmaxens) which are now given as *lenses*, hence morphisms, and not as objects, thereby fixing the conceptual awkwardness of open games with agency (here 𝒫 is powerset) 20/n
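As a rough illustration, here is a powerset-valued argmax applied to each player's payoff function, conditioned on the other player's current move. A play is then a Nash equilibrium exactly when every player's move lies in their own argmax set. The game, the helper names and the two-player setup are assumptions for the sketch.

```python
from itertools import product
from typing import Callable, Iterable, Set, Tuple

MOVES = ("C", "D")

# Hypothetical prisoner's dilemma: (move1, move2) -> (payoff1, payoff2).
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-3, 0),
    ("D", "C"): (0, -3),
    ("D", "D"): (-2, -2),
}


def unilateral_payoffs(play: Tuple[str, str]):
    """Each player's own payoff as a function of their move, the other move held fixed."""
    x, y = play
    return (lambda a: PAYOFFS[(a, y)][0], lambda b: PAYOFFS[(x, b)][1])


def argmax_set(options: Iterable[str], payoff: Callable[[str], float]) -> Set[str]:
    """Powerset-valued selection: all maximizers, not just one."""
    opts = list(options)
    best = max(payoff(o) for o in opts)
    return {o for o in opts if payoff(o) == best}


def is_nash(play: Tuple[str, str]) -> bool:
    """A play is an equilibrium when no player can improve by deviating alone."""
    slices = unilateral_payoffs(play)
    return all(move in argmax_set(MOVES, s) for move, s in zip(play, slices))


equilibria = [p for p in product(MOVES, MOVES) if is_nash(p)]
```

For this matrix the only play that survives is mutual defection.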
Now compare the resulting pictures we get of a gradient-based learner (left) and a game (right). Suspiciously similar, aren't they?
Next time I'll tell you about how diegetic feedback unifies the kinds of backpropagation these systems do, opening a vast horizon beyond RDCs! 21/21
Little spoiler: 22/
