Matteo Capucci
Jun 27 • 23 tweets • 8 min read
✨📜 New paper out!

'Diegetic representation of feedback in open games' arxiv.org/abs/2206.12338, accepted for proceedings track at ACT'22

TL;DR: 'reverse-mode diff' for games clarifies game-theoretic structure, fixes conceptual issues and unifies them with learners

🧵 0/n
At least two of the original inventors of open games, @_julesh_ and @philipp_m_zahn, have been quite enthusiastic about the ideas in the paper (and afaiu also @Anarchia45 liked them :) 1/n
First, what the heck does 'diegetic' mean?
It's a loanword from narratology and means 'internal to the story being narrated'. You might be familiar with 'diegetic narrators', who also feature in the story they recount, vs 'extradiegetic narrators', who don't 2/n
I picture game specifications (extensive form and open games, specifically) as telling a story about how a game is played, featuring players as characters. The climax of the story is when players react to the outcomes of the game by deviating from their strategy 3/n
So 'diegetic representation of feedback' means to explicitly represent this part of the story in the 'game system'. Until now, this part was handled *extradiegetically*, using decoration by equilibrium predicates, best response and selection functions. 4/n
The problem? The most important part of the system, the strategic core of the game, which ends up computing Nash equilibria, had to be handled ad-hoc and wasn't really recognized as part of the system 5/n
Most embarrassingly, when last year we outlined some of the core ideas regarding the representation of agency in cybercats (matteocapucci.wordpress.com/2021/06/21/ope…), we missed that 'open games with agency' were, according to our criterion, devoid of agency! 😱 6/n
In fact the selection function of an open game is a decoration on the top boundary object and not a morphism, which would represent a system of agents. Remember: morphisms are systems, objects are just boundaries. 7/n
Compare this to learners (depicted), in which over a controlled system L lies a well-distinguished controller system, the GD (gradient descent) lens, which diegetically embodies a cybernetic feedback action-reaction cycle. 8/n
The main problem in doing so for open games lies in the extreme counterfactuality of players' feedback processing. In fact in open games (and in all game theory before them) players are pictured receiving a payoff as their only feedback 9/n
Usually players in games are pictured as utility-maximizing agents which receive a payoff they strive to maximize. However, this picture is (partially) wrong! This took me a long time to realize, but then I was enlightened 10/n
Players do not receive just their payoff as feedback, but *an entire payoff function*. Then they select their strategy in order to maximize what they can get from that. Without such informative feedback from the game arena, they wouldn't be able to compute best responses 11/n
Players that only receive a payoff from the system are reinforcement learning agents, and balance such low-bandwidth feedback with richer internal dynamics. Incidentally, you can read about this in @RiuSakamoto & @_julesh_'s latest paper arxiv.org/abs/2206.04547 12/n
So the main idea in my paper is to take this seriously and rethink games as feedback systems where feedback is given by a sort of backpropagation of payoff functions. The results are striking! 13/n
In the paper I show you can associate a feedback dynamics functorially to a given play dynamics. That is, given the way players interact and suitable payoff types, we can automatically build the so-called 'coplay' of a game, i.e. the backward pass of an open game 14/n
Fixing payoffs P, this is achieved by first defining a functor P^* : Set -> Lens(Set) which does a very simple thing: given f:X->Y, sends it to the lens (X,P^X) -> (Y,P^Y) given by (f, P^f), where P^f : u ↦ f;u 15/n
The absolutely outrageous fact about this functor is, it's basically the functor P^(-), i.e. the contravariant hom-functor Set(-, P), but crucially landing in lenses allows us to define a monoidal structure on it 16/n
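A minimal Python sketch of the functor from 15/n, assuming a lens is nothing more than a pair of functions (a forward map and a backward map). The names `Lens`, `compose` and `p_star`, and the toy functions `f`, `g`, `u`, are made up for this illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    """A lens (X, S) -> (Y, R): forward X -> Y, backward X x R -> S."""
    forward: Callable[[Any], Any]
    backward: Callable[[Any, Any], Any]


def compose(first: Lens, second: Lens) -> Lens:
    """Sequential lens composition: run `first`, then `second`."""
    return Lens(
        forward=lambda x: second.forward(first.forward(x)),
        backward=lambda x, r: first.backward(
            x, second.backward(first.forward(x), r)
        ),
    )


def p_star(f: Callable[[Any], Any]) -> Lens:
    """Send f : X -> Y to the lens (f, P^f), where P^f : u |-> f;u."""
    return Lens(
        forward=f,
        # The backward pass ignores the state x and precomposes the
        # payoff function u : Y -> P with f, giving a function X -> P.
        backward=lambda x, u: (lambda x2: u(f(x2))),
    )


# Hypothetical example data, chosen only to exercise the definitions.
def f(x):            # f : X -> Y
    return x + 1


def g(y):            # g : Y -> Z
    return 2 * y


def u(z):            # u : Z -> P, a payoff that peaks at z = 10
    return -abs(z - 10)


# Functoriality: P*(f;g) and P*(f);P*(g) pull u back to the same map on X.
lhs = p_star(lambda x: g(f(x)))
rhs = compose(p_star(f), p_star(g))
pulled_lhs = lhs.backward(0, u)
pulled_rhs = rhs.backward(0, u)
```

Here `pulled_lhs` and `pulled_rhs` are both the function x ↦ u(g(f(x))): the payoff function on Z has been 'backpropagated' to a payoff function on X.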
That's probably the most important finding in this work. The Nashator is the fundamental bit of math making games act weird and interesting. It's an extremely forgetful thing: it projects lines from a payoff matrix, thus losing most of the information in it. 17/n
Magically, the Nashator is the spice that makes the counterfactual analysis of players in a game even possible. Together with lens composition, it handles all the complexity of payoff backpropagation 18/n
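To make 'projecting lines from a payoff matrix' concrete, here is a rough Python sketch of what the backward part of the Nashator could look like for two players. The typing is my reading of the tweets, not a quote from the paper: I assume the joint payoff is a function of the whole profile, and that each player gets back the payoff as a function of their own move only, with the other player's move frozen. The game table and the name `nashator_backward` are hypothetical.

```python
from typing import Callable, Tuple

# Hypothetical payoff table for a two-player game (a prisoner's dilemma):
# PAYOFF[(x, y)] is the pair (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def u(x: str, y: str) -> Tuple[int, int]:
    """Joint payoff function on profiles (x, y)."""
    return PAYOFF[(x, y)]


def nashator_backward(
    profile: Tuple[str, str],
    payoff: Callable[[str, str], Tuple[int, int]],
):
    """Backward part of the (assumed) Nashator lens.

    Given the profile (x_bar, y_bar) actually played and the joint payoff
    function, return one 'line' of the payoff matrix per player: the payoff
    as a function of that player's own move, the other move held fixed.
    """
    x_bar, y_bar = profile

    def line_for_1(x: str) -> Tuple[int, int]:
        return payoff(x, y_bar)   # vary player 1, freeze y_bar

    def line_for_2(y: str) -> Tuple[int, int]:
        return payoff(x_bar, y)   # vary player 2, freeze x_bar

    return line_for_1, line_for_2


line1, line2 = nashator_backward(("C", "D"), u)
```

Out of the four entries of the matrix, each player only ever sees the two that lie on their own line through the played profile, which is the sense in which the map is forgetful.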
We can now represent players' counterfactual analysis because the game shoots up to players their entire payoff function, crucially conditioned on other players' choices. So now we have the right data to even be able to consider a system of players above the arena 19/n
Essentially, these systems are given by selection functions (here, argmaxens) which are now given as *lenses*, hence morphisms, and not as objects, thereby fixing the conceptual awkwardness of open games with agency (here 𝒫 is powerset) 20/n
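As a small illustration of a set-valued argmax acting on the feedback a player receives, here is a hedged Python sketch. It only models the backward-pass idea (a payoff function on one's own moves goes in, a set of best moves comes out, hence the powerset) and does not reproduce the paper's lens typing. The stag-hunt table and the names `argmax_set` and `is_equilibrium` are assumptions made for the example.

```python
from itertools import product

MOVES = ("S", "H")  # hypothetical moves: hunt Stag or hunt Hare

# Hypothetical stag-hunt payoffs: PAYOFF[(x, y)] = (player 1, player 2).
PAYOFF = {
    ("S", "S"): (4, 4),
    ("S", "H"): (0, 3),
    ("H", "S"): (3, 0),
    ("H", "H"): (3, 3),
}


def argmax_set(options, payoff):
    """Return the set of options that maximize `payoff` (an element of
    the powerset of options)."""
    best = max(payoff(o) for o in options)
    return {o for o in options if payoff(o) == best}


def is_equilibrium(profile):
    """A profile is a Nash equilibrium when each player's move lies in the
    argmax of their own payoff, with the other player's move held fixed."""
    x_bar, y_bar = profile
    best_for_1 = argmax_set(MOVES, lambda x: PAYOFF[(x, y_bar)][0])
    best_for_2 = argmax_set(MOVES, lambda y: PAYOFF[(x_bar, y)][1])
    return x_bar in best_for_1 and y_bar in best_for_2


equilibria = [p for p in product(MOVES, MOVES) if is_equilibrium(p)]
```

Note that `argmax_set` needs a whole payoff function as input, not a single number, which is exactly the point made in 11/n.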
Now compare the resulting pictures we get of a gradient-based learner (left) and a game (right). Suspiciously similar, aren't they?
Next time I'll tell you about how diegetic feedback unifies the kinds of backpropagation these systems do, opening a vast horizon beyond RDCs! 21/21
Little spoiler: 22/


