Thread by @trickyreds14 on Thread Reader App

A 🧵 on @karun1710's talk at the @StatsBomb conference about some really interesting data analytics work happening at @Arsenal

Starting with the types of data to encode a sequence in a game, we have 3 types (images attached in order)

1. Event tracking data

2. Full tracking data

3. Broadcast tracking data

This data can be described by its ease of availability (Coverage) and the level of detail it provides (Granularity)

1. Event data (High coverage, Low granularity, Limited)

2. Full tracking (Low coverage, High granularity)

3. Broadcast tracking (High coverage, Medium-high granularity)

How to use this data at scale?

We want to answer the question "How dangerous is the situation?" using tracking data

This question can be viewed through different lenses

Attacking runs, Phase of play, Structure, Counter threat etc

Some "models" built on tracking data each try to answer part of the question. They are listed below, but even together they don't give a complete answer

Expected Threat
Passing options
Attacking runs
Phase of play
Team structure
Counter threat

ISSUES with this :

1. Models are SEGREGATED and ISOLATED + have no common way to communicate with each other

2. High maintenance cost for all these models

So can we build a Unified model? How to go about it? Let's see the 2 approaches

Approach 1

A model requires features and here we extract features from the tracking data

Eg : Player level, team level and situation level features

These features are handcrafted and we are limited by our own heuristics + biases on how these questions should be solved

They also have a high maintenance cost

They make sense with full tracking but don't make sense with broadcast tracking as the data can be noisy and incomplete
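To make Approach 1 concrete, here is a minimal Python sketch of handcrafted features computed from a single tracking frame. Everything in it is an assumption for illustration and not from the talk: the frame layout, the 105 x 68 m pitch, and the three features chosen (distance to goal, team width/depth, defenders goal side of the ball).

```python
import math

# Hypothetical frame layout: (x, y) positions in metres on a 105 x 68 pitch,
# with the attacking team shooting towards x = 105.
GOAL = (105.0, 34.0)

def player_features(pos):
    """Player level: distance from one player to the centre of the goal."""
    return {"dist_to_goal": math.dist(pos, GOAL)}

def team_features(positions):
    """Team level: width and depth of the team's shape."""
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    return {"width": max(ys) - min(ys), "depth": max(xs) - min(xs)}

def situation_features(ball, defenders):
    """Situation level: number of defenders closer to the goal line than the ball."""
    return {"defenders_goal_side": sum(1 for d in defenders if d[0] > ball[0])}

frame = {
    "ball": (80.0, 30.0),
    "attackers": [(80.0, 30.0), (85.0, 50.0), (70.0, 10.0)],
    "defenders": [(90.0, 34.0), (88.0, 20.0), (75.0, 40.0)],
}
features = {
    **player_features(frame["attackers"][0]),
    **team_features(frame["attackers"]),
    **situation_features(frame["ball"], frame["defenders"]),
}
```

Every rule here (what "goal side" means, which player counts as the carrier) is a heuristic someone has to write and maintain, and a count like `defenders_goal_side` goes wrong as soon as broadcast tracking misses off-camera players.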

So what to do now? Can we make the model "learn" features by itself?

Approach 2:

Let the Unified Model derive answers from first principles instead of us feeding handcrafted features to it

The focus now is on designing the system by *designing the questions* rather than designing the way it should arrive at our answers

This is the difference!

Remember ChatGPT? We use the 'T' in it to solve our problem (Transformers)

Why are Transformers relevant?

Tracking data is a sequence of frames, and Transformers are good at modelling sequences

So you can feed individual video frames to the model as inputs and the different "models" (xT, Passing options etc) we talked about as outputs and let the model "learn" features by itself!
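A rough sketch of that idea in plain Python: one shared self-attention encoder over the players in a frame, with a small output head per question. This is not Arsenal's model. The weights are random and untrained, the inputs are player (x, y) positions rather than video frames, and the sizes and head names are hypothetical; it only shows how one encoder can feed several outputs.

```python
import math
import random

random.seed(0)
D = 4  # embedding size per player token (hypothetical)

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product attention over the players in a frame."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(D) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Untrained random weights: this shows the data flow only, not a useful model.
W_IN = rand_matrix(2, D)  # embeds (x, y) into D dimensions
WQ, WK, WV = rand_matrix(D, D), rand_matrix(D, D), rand_matrix(D, D)
HEADS = {"expected_threat": rand_matrix(D, 1), "counter_threat": rand_matrix(D, 1)}

def predict(frame):
    """One shared encoder, one small sigmoid head per question."""
    tokens = matmul(frame, W_IN)
    encoded, weights = self_attention(tokens, WQ, WK, WV)
    pooled = [[sum(col) / len(encoded) for col in zip(*encoded)]]  # mean over players
    outputs = {}
    for name, head in HEADS.items():
        logit = matmul(pooled, head)[0][0]
        outputs[name] = 1 / (1 + math.exp(-logit))
    return outputs, weights

frame = [[0.76, 0.44], [0.81, 0.74], [0.67, 0.15]]  # normalised (x, y) per player
outputs, weights = predict(frame)
```

Adding a new question means adding one more entry to `HEADS` and training it on the shared encoding, rather than building and maintaining another isolated model. A real system would presumably also attend across time, which this sketch leaves out.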

Now onto the use cases!

1. Tactics board interface

One can move players like on a tactics board but here the interface will UPDATE the possession value as you move the player and you can see what the team could have done better in every single SITUATION!

Eg: Brentford 👇

2. Situation Search

Now this is INCREDIBLE!

You can search across "All Man U situations"

Eg : 1) You can look at all Luke Shaw overlapping situations across their history!

2) You can also look at similar situations where Arsenal conceded from other teams' attacks! 🤯

AMAZING!!
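The thread doesn't say how Situation Search works under the hood. One plausible way, sketched here purely as an assumption, is to store an embedding vector for every situation and rank them by cosine similarity to a query situation. The ids and vectors below are made up; in practice the embeddings would come from the trained model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query, library, top_k=2):
    """Return the top_k situation ids ranked by similarity to the query."""
    ranked = sorted(library, key=lambda sid: cosine(query, library[sid]), reverse=True)
    return ranked[:top_k]

# Hypothetical situation embeddings keyed by a made-up situation id.
library = {
    "overlap_left_01": [0.9, 0.1, 0.0],
    "overlap_left_02": [0.8, 0.2, 0.1],
    "low_block_01": [0.0, 0.1, 0.9],
    "counter_01": [0.1, 0.9, 0.2],
}
query = [1.0, 0.1, 0.0]  # embedding of a reference overlapping run
results = search(query, library)
```

With these toy vectors the two overlap situations rank first. A brute-force sort like this is fine for a demo; searching a club's whole history would need an approximate nearest-neighbour index.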

3. Live match dashboards

i. Plotting Live Momentum and Game dominance chart using expected threat

ii. Showing data and metrics for different phases, such as low block, mid block and high press, live in game

Something @m8arteta could look at during the game to make substitutions
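For the momentum chart in (i), a simple way to get a plottable series is a rolling sum of expected threat, home minus away, over the last few minutes. The talk's exact definition isn't given in the thread, so the 5-minute window, the event format, and the numbers below are all assumptions.

```python
from collections import deque

def momentum(events, window=5):
    """Rolling net expected threat (home minus away) over the last `window` minutes."""
    recent = deque()
    series = []
    for minute, team, xt in events:
        recent.append((minute, xt if team == "home" else -xt))
        # Drop events that have fallen out of the rolling window.
        while recent and recent[0][0] <= minute - window:
            recent.popleft()
        series.append((minute, round(sum(v for _, v in recent), 3)))
    return series

# Hypothetical (minute, team, xT added) events, in chronological order.
events = [(1, "home", 0.05), (2, "away", 0.02), (4, "home", 0.10),
          (7, "away", 0.08), (8, "away", 0.04)]
series = momentum(events)
```

Positive values mean the home side is generating more threat in the window, negative values mean the away side is on top, so the sign flip late in this toy series is the kind of swing a live dashboard would surface.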

Credit to @karun1710 and @StatsBomb for the images and the wonderful talk

If you found it interesting and learnt something new then drop a like or a RT!

End of thread!

@GiantGooner @scottjwillis @adamvoge @veeyahborna @A1ZH4RY @GeorgeV_AFC @TMftbl @RjArsenalBlog @watmanAFC @TacticsJournal would like to know your thoughts on it
