'Diegetic representation of feedback in open games' arxiv.org/abs/2206.12338, accepted for proceedings track at ACT'22
TL;DR: 'reverse-mode diff' for games clarifies game-theoretic structure, fixes conceptual issues, and unifies games with learners
🧵 0/n
At least two of the original inventors of open games, @_julesh_ and @philipp_m_zahn, have been quite enthusiastic about the ideas in the paper (and afaiu @Anarchia45 also liked them :) 1/n
First, what the heck does 'diegetic' mean?
It's a loanword from narratology and means 'internal to the story being narrated'. You might be familiar with 'diegetic narrators', being narrators who also feature in the story they recount, vs 'extradiegetic narrators' which don't 2/n
I picture game specifications (extensive form and open games, specifically) as telling a story about how a game is played, featuring players as characters. The climax of the story is when players react to the outcomes of the game by deviating from their strategies 3/n
So 'diegetic representation of feedback' means to explicitly represent this part of the story in the 'game system'. Until now, this part was handled *extradiegetically*, using decoration by equilibrium predicates, best response and selection functions. 4/n
The problem? The most important part of the system, the strategic core of the game, which ends up computing Nash equilibria, had to be handled ad-hoc and wasn't really recognized as part of the system 5/n
Most embarrassingly, when last year we outlined some of the core ideas regarding the representation of agency in cybercats (matteocapucci.wordpress.com/2021/06/21/ope…), we missed that 'open games with agency' were, according to our criterion, devoid of agency! 😱 6/n
In fact the selection function of an open game is a decoration on the top boundary object and not a morphism, which would represent a system of agents. Remember: morphisms are systems, objects are just boundaries. 7/n
Compare this to learners (depicted), in which over a controlled system L lies a well-distinguished controller system, the GD (gradient descent) lens, which diegetically embodies a cybernetic feedback action-reaction cycle. 8/n
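For intuition, here's a minimal code sketch (mine, not from the paper) of such a GD lens: the forward pass exposes the current parameter, and the backward pass reacts to an incoming gradient by taking a descent step. The learning rate and the toy loss are assumptions for illustration.

```python
# A minimal sketch of the GD (gradient descent) lens: forward exposes
# the controlled parameter, backward reacts to gradient feedback.

def gd_lens(lr):
    get = lambda p: p                    # forward: expose the parameter
    put = lambda p, grad: p - lr * grad  # backward: descend along feedback
    return get, put

get, put = gd_lens(0.1)
p = 2.0
for _ in range(50):
    p = put(p, 2 * get(p))  # gradient of the toy loss p**2 is 2*p
# p has been driven toward the minimiser 0
```

The action-reaction cycle is exactly the loop: the arena reports a gradient, the controller deviates accordingly.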
The main problem in doing so for open games lies in the extreme counterfactuality of players' feedback processing. In fact, in open games (and in all game theory before them) players are pictured as receiving a payoff as their only feedback 9/n
Usually players in games are pictured as utility-maximizing agents which receive a payoff they strive to maximize. However, this picture is (partially) wrong! This took me a long time to realize, but then I was enlightened 10/n
Players do not receive just their payoff as feedback, but *an entire payoff function*. They then select their strategy to maximize what they can get from it. Without such informative feedback from the game arena, they wouldn't be able to compute best responses 11/n
Players that only receive a payoff from the system are reinforcement learning agents, and balance such low-bandwidth feedback with a richer internal dynamics. Incidentally, you can read about this in @RiuSakamoto & @_julesh_'s latest paper arxiv.org/abs/2206.04547 12/n
So the main idea in my paper is to take this seriously and rethink games as feedback systems where feedback is given by a sort of backpropagation of payoff functions. The results are striking! 13/n
In the paper I show you can associate a feedback dynamics functorially to a given play dynamics. That is, given the way players interact and suitable payoff types, we can automatically build the so-called 'coplay' of a game, i.e. the backward pass of an open game 14/n
Fixing payoffs P, this is achieved by first defining a functor P^* : Set -> Lens(Set) which does a very simple thing: given f : X -> Y, it sends it to the lens (X, P^X) -> (Y, P^Y) given by (f, P^f), where P^f : u ↦ f;u 15/n
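In code, this functor might look like the following sketch (Python stands in for Set, and the (forward, backward) encoding of lenses is my own simplification):

```python
# Sketch of P^*: a function f : X -> Y becomes the lens (f, P^f),
# whose backward pass precomposes a payoff function u : Y -> P
# with f, yielding f;u = u . f : X -> P.

def p_star(f):
    def forward(x):
        return f(x)
    def backward(x, u):
        # P^f : u |-> f;u. Note the forward input x is unused here.
        return lambda x_: u(f(x_))
    return forward, backward

# Example: f doubles its input, payoffs on Y are squares
fwd, bwd = p_star(lambda x: 2 * x)
u = lambda y: y * y           # payoff function on Y
pulled = bwd(3, u)            # pulled-back payoff function on X
# pulled(3) == u(2 * 3) == 36
```

Functoriality then says that composing such lenses agrees with composing the underlying functions.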
The absolutely outrageous fact about this functor is that it's basically the functor P^(-), i.e. the contravariant hom-functor Set(-, P), but crucially landing in lenses allows us to define a monoidal structure on it 16/n
That's probably the most important finding in this work. The Nashator is the fundamental bit of math making games act weird and interesting. It's an extremely forgetful thing: it projects out the rows and columns ('lines') of a payoff matrix, thus losing most of the information in it. 17/n
Magically, the Nashator is the spice that makes the counterfactual analysis of players in a game even possible. Together with lens composition, it handles all the complexity of payoff backpropagation 18/n
We can now represent players' counterfactual analysis because the game sends players their entire payoff function, crucially conditioned on the other players' choices. So now we have the right data to even be able to consider a system of players above the arena 19/n
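A simplified sketch (mine, not the paper's lens formulation) of what this conditioning does for two players: from a joint payoff function and the actual profile, each player receives their own unilateral payoff function, with the other player's choice held fixed.

```python
# Backward conditioning of payoffs, Nashator-style: from
# u : X x Y -> P and the played profile (x, y), extract each
# player's 'line' of the payoff matrix.

def nashator_backward(profile, u):
    x, y = profile
    return (lambda x_: u(x_, y),   # player 1's unilateral payoff function
            lambda y_: u(x, y_))   # player 2's unilateral payoff function

# A common-payoff coordination game: 1 if the players match, else 0
u = lambda x, y: 1 if x == y else 0
k1, k2 = nashator_backward(("H", "T"), u)
# k1("T") == 1: player 1 sees that deviating to T would have paid off
```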
Essentially, these systems are given by selection functions (here, argmaxens), now packaged as *lenses*, hence morphisms rather than mere decorations on objects, thereby fixing the conceptual awkwardness of open games with agency (here 𝒫 is the powerset) 20/n
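A hedged sketch of such a selection function, written here as a plain function rather than a lens for brevity, with 𝒫 showing up as the returned *set* of best responses (the encoding is mine):

```python
# An argmax selection function: given a payoff function k received
# from the arena, return the set of strategies attaining its maximum.

def argmax(strategies, k):
    best = max(k(s) for s in strategies)
    return {s for s in strategies if k(s) == best}

# A player who received the payoff function 'match the other player,
# who played T' best-responds with T:
k = lambda s: 1 if s == "T" else 0
best_responses = argmax({"H", "T"}, k)   # {"T"}
```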
Now compare the resulting pictures we get of a gradient-based learner (left) and a game (right). Suspiciously similar, aren't they?
Next time I'll tell you about how diegetic feedback unifies the kinds of backpropagation these systems do, opening a vast horizon beyond RDCs! 21/21
Little spoiler: 22/
👇🏼 Fibred categories are like woven fabric and doing a Grothendieck construction is like weaving on a loom: a... thread about textile intuitions for fibrations 🧵👇🏼
The threads of woven fabric, when you look up close, are intertwined in a distinctive pattern: some of them run vertically (that's the *warp*), and some run horizontally (that's the *weft* or *woof*). The *bias* is the 'diagonal' direction, along which fabric is easy to stretch
Likewise, when a category E is fibred there is a factorization system that tells you how to decompose each morphism into a 'warp' part and a 'weft' part.
If we forget about the weft, we can project down our fabric on the selvedge. This projection is the fibration!
👇🏼 So, what makes games and learners so similar and yet so different from each other?
A 🧵 on an abstract yet very simple (if you're categorically minded) explanation of what 'backpropagation' actually is. 0/n
Let's start from gradient-based learners, i.e. machine learning models trained by gradient descent, like NNs.
Their categorical framing has been studied extensively now, starting from arxiv.org/abs/1711.10455 and culminating in arxiv.org/abs/2103.01931 1/n
The two main observations have been that (1) backpropagation has something very lensy about it and (2) parametric lenses describe very well how learners work *and* get trained.
Ultimately, a learner is drawn/specified like this:
2/n
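In lieu of the picture, here's a minimal code sketch of such a parametric lens for a one-parameter linear model (the model, loss, and learning rate are my assumptions, not from the papers): forward computes a prediction from parameters and input, backward propagates the output gradient back to both.

```python
# Parametric lens sketch for the model y = p * x.

def forward(p, x):
    return p * x

def backward(p, x, dy):
    # The output gradient dy splits into a parameter part and an
    # input part: (dL/dp, dL/dx).
    return dy * x, dy * p

p, x, target = 1.0, 2.0, 6.0
y = forward(p, x)
dy = 2 * (y - target)        # gradient of the squared loss at the output
dp, dx = backward(p, x, dy)
p = p - 0.1 * dp             # the GD lens consumes dp on the parameter port
# p moves from 1.0 to 2.6; the new prediction 5.2 is closer to the target 6.0
```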
I'm not going to run a long thread on this because the paper is very detailed and quite enjoyable to read, but here are some of the things I loved 2/n
The central idea of the paper is beautifully laid out at the beginning: use an open/closed pair of modalities on types to treat *intension* as structure on *extension*. So cool! 3/n
Today I read Apolito's essay I announced yesterday.
TL;DR: anarcho-communism should develop a competing alternative to markets for large-scale economic organization, and 'integrated information' is an appealing theoretical framework to look at
First, a reminder that Aurora Apolito is a pseudonym (I don't know if her real name is public knowledge so I'm not gonna dox). But the name itself roughly means 'dawn of the stateless [society]', which is really cool 2/n
The paper starts with a question: 'Is anarchism [...] a system destined to only work in the scale of small local communities?' 3/n
Today Dylan Braithwaite, @bgavran3, @_julesh_, @AyeGill and myself published the extended abstract of a work that has been cooking for quite a bit (at least a year!)
The problem we're trying to solve is to 'complete this square'. Lenses are modular data accessors for records (i.e. pairs), dep. lenses are modular data accessors for records *with dependency* (i.e. dependent pairs), whereas optics extend lenses to obscene levels of generality 2/n
In particular, optics provide an abstract framework for defining 'lens-like' accessors for structures with much more complexity than records (e.g. trees) and for more exotic accessing patterns (e.g. with monadic effects)
See arxiv.org/abs/1703.10857 and arxiv.org/abs/2001.07488 3/n
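For concreteness, a sketch in code of the plain-lens corner of that square (encoding mine): a lens focusing on the first component of a pair.

```python
# A lens on pairs: get reads the focused field, put writes it back
# while leaving the rest of the record untouched.

def fst_lens():
    get = lambda pair: pair[0]
    put = lambda pair, new: (new, pair[1])
    return get, put

get, put = fst_lens()
assert get((1, "a")) == 1
assert put((1, "a"), 99) == (99, "a")
# The lens laws (get-put, put-get, put-put) all hold for this lens
```

Optics generalize exactly this (get, put) interface beyond pairs, to trees, effectful accesses, and so on.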