Visual analytics for team-based invasion sports with significant events and Markov reward process

Kun Zhao; Takayuki Osogami; Tetsuro Morimura

arXiv: 1907.01221 · v1 · pith:N2IZXSDNnew · submitted 2019-07-02 · cs.AI · cs.HC · cs.LG

Pith reviewed 2026-05-25 11:29 UTC · model grok-4.3

classification: cs.AI · cs.HC · cs.LG

keywords: visual analytics, Markov reward process, invasion sports, significant events, continuous parameter space, soccer analytics, fitted-value iteration, event value estimation

The pith

A match is modeled as a Markov chain of significant events extracted from player distributions so that a reward process solved by fitted-value iteration yields a regression model for event values at any continuous location, time, or score.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to evaluate the value of arbitrary events in invasion sports without forcing discretization of space or restriction to narrow event types like shots. It extracts significant events from the time-varying spatial distribution of players to stand in for an entire match, then treats those events as states in a Markov chain equipped with continuous parameters. Solving the resulting Markov reward process via a customized fitted-value iteration produces a regression model whose outputs can be visualized anywhere on the field under any chosen conditions. A reader would care because raw tracking data could then support fine-grained, condition-specific performance maps instead of coarse aggregates or local features alone.

Core claim

A whole match can be represented as a Markov chain of significant events derived from the time-varying distribution of players; the associated Markov reward process is solved by a customized fitted-value iteration algorithm that trains a regression model, thereby predicting the value of any event whose parameters (time, location, score, and others) lie in a continuous space and enabling visual inspection of those values over the entire playing field under arbitrary conditions.

What carries the argument

Markov reward process on states defined by significant events extracted from time-varying player distributions, solved via customized fitted-value iteration to train a regression model over continuous parameters.
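
A minimal sketch of a plain fitted-value iteration loop may help make this machinery concrete. It is not the paper's customized algorithm, which the abstract does not describe. The one-dimensional state, the linear regressor, the `Transition` tuple, and all function names are illustrative assumptions.

```python
from typing import Callable, List, Tuple

# One observed step between consecutive significant events:
# (state, reward, next_state, is_terminal). The state is a single number here
# (for example a normalised position); a real state would be a vector.
Transition = Tuple[float, float, float, bool]


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y ~ intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def fitted_value_iteration(
    transitions: List[Transition], gamma: float = 0.9, n_iters: int = 50
) -> Callable[[float], float]:
    """Repeatedly regress the value model onto one-step Bellman targets."""
    intercept, slope = 0.0, 0.0  # start from V(s) = 0 everywhere
    xs = [s for s, _, _, _ in transitions]
    for _ in range(n_iters):
        # Bellman backup: reward plus discounted value of the next state,
        # with no bootstrapping past a terminal step.
        targets = [
            r + (0.0 if done else gamma * (intercept + slope * s_next))
            for _, r, s_next, done in transitions
        ]
        intercept, slope = fit_line(xs, targets)
    return lambda s: intercept + slope * s
```

Because the returned function is defined for any real-valued state, it can be queried at points that never occurred in the data, which is the property the pith relies on for continuous parameters.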

If this is right

Event values become estimable for any event type without subdividing the field or limiting analysis to specific actions.
Values can be rendered as continuous surfaces over the full playing area for any chosen combination of time, score, and other parameters.
The fitted regression model supplies the numerical predictions that make such surfaces computable from the solved reward process.
Real soccer data can be used to produce the visualizations that demonstrate the method.
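
The "continuous surface" consequence can be sketched as evaluating a trained value model on a grid of field locations while time and score are held at chosen values. Everything named below is an assumption for illustration: the `value_model(x, y, minute, score_diff)` signature, the 105 m by 68 m pitch, and the placeholder model standing in for the fitted regressor.

```python
from typing import Callable, List, Tuple

ValueModel = Callable[[float, float, float, int], float]


def value_surface(
    value_model: ValueModel,
    minute: float,
    score_diff: int,
    nx: int = 21,
    ny: int = 14,
    length: float = 105.0,  # assumed pitch length in metres
    width: float = 68.0,    # assumed pitch width in metres
) -> Tuple[List[float], List[float], List[List[float]]]:
    """Evaluate the model on an nx-by-ny grid; grid[j][i] is the value at (xs[i], ys[j])."""
    xs = [length * i / (nx - 1) for i in range(nx)]
    ys = [width * j / (ny - 1) for j in range(ny)]
    grid = [[value_model(x, y, minute, score_diff) for x in xs] for y in ys]
    return xs, ys, grid


def placeholder_model(x: float, y: float, minute: float, score_diff: int) -> float:
    """Stand-in for a fitted regressor: higher near x = 105 and the central axis."""
    return x / 105.0 - abs(y - 34.0) / 68.0


xs, ys, grid = value_surface(placeholder_model, minute=60.0, score_diff=0)
```

The grid resolution only affects rendering; the model itself stays continuous, so a finer grid needs no retraining.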

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same extraction and chaining steps could be applied to tracking data from basketball, which the abstract names alongside soccer as an invasion sport, to produce comparable continuous-value maps.
If the regression step generalizes across matches, the resulting model could support queries that vary one parameter while holding others fixed.
The approach leaves open whether the extracted events must be augmented with additional low-level features to capture defensive or set-piece situations.
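
The "vary one parameter while holding others fixed" query mentioned in this list could look like the sketch below. The keyword-argument model interface, the `sweep` helper, and the placeholder model with its coefficients are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple


def sweep(
    value_model: Callable[..., float],
    fixed: Dict[str, Any],
    name: str,
    values: Sequence[Any],
) -> List[Tuple[Any, float]]:
    """Return (parameter value, predicted event value) pairs for one varied parameter."""
    results = []
    for v in values:
        params = dict(fixed)  # copy so the caller's dictionary is not modified
        params[name] = v
        results.append((v, value_model(**params)))
    return results


def toy_value(x: float, y: float, minute: float, score_diff: int) -> float:
    """Placeholder for a fitted regressor; the coefficients are arbitrary."""
    return x / 105.0 + 0.001 * minute - 0.05 * score_diff


# Value at the centre spot as the match clock advances, score level.
curve = sweep(toy_value, {"x": 52.5, "y": 34.0, "score_diff": 0}, "minute", [0, 45, 90])
```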

Load-bearing premise

Significant events extracted solely from the time-varying distribution of players are sufficient to represent an entire match as a Markov chain whose reward process yields meaningful continuous-parameter event values.

What would settle it

The claim would be undermined if, on held-out soccer tracking data, the regression model produced event-value maps whose ordering or magnitudes showed no systematic agreement with independent measures, such as goal-scoring frequency or expert ratings of the same events under the same continuous conditions.
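
One way to quantify "systematic agreement in ordering" is a rank correlation between predicted event values and an independent measure for the same events. The sketch below implements Spearman's coefficient in plain Python; the choice of Spearman and the sample numbers are assumptions, not something the paper reports.

```python
from math import sqrt
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # convention chosen here for constant inputs
    return cov / sqrt(var_a * var_b)


def spearman(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


# Made-up numbers: predicted values and an independent measure for four events.
predicted = [0.1, 0.4, 0.2, 0.9]
independent = [1.0, 5.0, 2.0, 30.0]
agreement = spearman(predicted, independent)
```

A coefficient near zero on held-out data would be the kind of evidence this section describes; a coefficient near one would indicate that the orderings agree.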

Original abstract

In team-based invasion sports such as soccer and basketball, analytics is important for teams to understand their performance and for audiences to understand matches better. The present work focuses on performing visual analytics to evaluate the value of any kind of event occurring in a sports match with a continuous parameter space. Here, the continuous parameter space involves the time, location, score, and other parameters. Because the spatiotemporal data used in such analytics is a low-level representation and has a very large size, however, traditional analytics may need to discretize the continuous parameter space (e.g., subdivide the playing area) or use a local feature to limit the analysis to specific events (e.g., only shots). These approaches make evaluation impossible for any kind of event with a continuous parameter space. To solve this problem, we consider a whole match as a Markov chain of significant events, so that event values can be estimated with a continuous parameter space by solving the Markov chain with a machine learning model. The significant events are first extracted by considering the time-varying distribution of players to represent the whole match. Then, the extracted events are redefined as different states with the continuous parameter space and built as a Markov chain so that a Markov reward process can be applied. Finally, the Markov reward process is solved by a customized fitted-value iteration algorithm so that the event values with the continuous parameter space can be predicted by a regression model. As a result, the event values can be visually inspected over the whole playing field under arbitrary given conditions. Experimental results with real soccer data show the effectiveness of the proposed system.
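
The abstract gives no equations. For reference, a Markov reward process is usually characterized by the Bellman expectation equation below. The notation is standard textbook notation assumed here, not the paper's own: $s$ is a state (here, an event with its time, location, score, and other parameters), $r$ a reward, $\gamma$ a discount factor, and $P$ the transition kernel over next states.

```latex
V(s) \;=\; \mathbb{E}\left[ R_{t+1} + \gamma\, V(S_{t+1}) \mid S_t = s \right]
     \;=\; \int P(\mathrm{d}s' \mid s)\, \bigl( r(s, s') + \gamma\, V(s') \bigr)
```

With a continuous state space the integral cannot be tabulated state by state, which is why an approximate solver with a regression model is needed.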

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a pipeline to value events in continuous space for soccer by extracting states from player distributions and regressing a Markov reward process, but the first-order Markov assumption is the clear weak point.

The main thing here is a method that pulls significant events from time-varying player position distributions, treats them as states with continuous parameters like time and location, builds a Markov reward process, and solves it with fitted-value iteration followed by regression so the values can be visualized across the whole field under different conditions. That pipeline is the concrete new element. Prior work has used MRPs in sports and distribution features separately, but judging from the abstract, this specific combination for avoiding discretization is not already standard.

The abstract does a solid job of explaining why binning space or analyzing only shots is restrictive, and why a regression-based solver could let analysts inspect arbitrary conditions without those restrictions. The approach is practical for generating field-wide value maps from real match data.

The soft spot is the Markov property. States defined only by current player distributions at event instants omit ball trajectory history, possession sequences, and tactical context, so the probability of the next event is unlikely to depend only on the current state. This is where a stress test of the method lands. The abstract also supplies no equations, no error metrics, no baseline comparisons, and no validation of the regression step, which leaves the central claim undemonstrated.

The paper is aimed at sports analytics researchers who work on visual tools or event valuation and already have spatiotemporal data. A reader building similar systems could borrow the distribution-to-MRP step even if they adjust the state definition. It deserves a serious referee because the integration is specific and the application is grounded, even though the Markov assumption and the missing validation need direct attention in review.

Referee Report

3 major / 1 minor

Summary. The paper claims to enable visual analytics of event values in invasion sports (e.g., soccer) over a continuous parameter space (time, location, score, etc.) by extracting significant events from time-varying player position distributions, modeling them as states in a Markov chain, solving the resulting Markov reward process via customized fitted-value iteration, and using the output to train a regression model that predicts values for visualization across the field under arbitrary conditions. Experiments on real soccer data are asserted to demonstrate effectiveness.

Significance. If the pipeline were shown to produce meaningful, non-circular values with proper validation, it would offer a route to continuous-parameter event evaluation without forced discretization or restriction to local event types, which could extend sports analytics beyond current discrete or feature-limited methods. The combination of MRP with regression for continuous states is conceptually novel, but the absence of any reported equations, metrics, baselines, or tests for the Markov property in the manuscript description substantially reduces the assessed significance.

major comments (3)

[Abstract] Abstract: the claim that event values 'can be predicted by a regression model' is load-bearing for the central contribution, yet the described procedure obtains values by solving the MRP via fitted-value iteration on the extracted events and then applies regression to those same solved values; this makes the regression an approximation of the iteration output rather than an independent predictor for arbitrary conditions.
[Abstract] Abstract: no equations for the MRP, the customized fitted-value iteration, the regression step, or any validation metrics, error analysis, or baseline comparisons are supplied, so the assertion that the soccer experiments show effectiveness cannot be evaluated and leaves the continuous-parameter claim undemonstrated.
[Abstract] Abstract: the weakest assumption—that events extracted solely from time-varying player position distributions suffice to define states for a Markov chain whose reward process yields meaningful values—is not tested; player distributions omit ball-trajectory history, possession sequences, and tactical context, so the first-order Markov property P(next event | current state) is unlikely to hold and any downstream regression/visualization inherits the inconsistency.
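
The third comment asks for a test of the first-order Markov property. One simple diagnostic, sketched here under the assumption that each event can be given a categorical label for this purpose, compares the second-order estimate P(next | previous, current) with the first-order estimate P(next | current) through a likelihood-ratio (G) statistic. The function name and the label sequences are hypothetical; the paper reports no such test.

```python
from collections import Counter
from math import log
from typing import Hashable, Sequence


def markov_g_statistic(labels: Sequence[Hashable]) -> float:
    """G statistic for 'next depends only on current' versus 'also on previous'.

    A value near zero means that knowing the previous label adds nothing;
    larger values indicate a departure from the first-order Markov property.
    """
    triples = Counter(zip(labels, labels[1:], labels[2:]))
    prev_cur = Counter()  # counts of (previous, current)
    cur_next = Counter()  # counts of (current, next)
    cur = Counter()       # counts of the middle label
    for (a, b, c), n in triples.items():
        prev_cur[(a, b)] += n
        cur_next[(b, c)] += n
        cur[b] += n

    g = 0.0
    for (a, b, c), n in triples.items():
        p_second_order = n / prev_cur[(a, b)]
        p_first_order = cur_next[(b, c)] / cur[b]
        g += 2.0 * n * log(p_second_order / p_first_order)
    return g


first_order = list("AB" * 20)     # next label is fixed by the current one
second_order = list("AABB" * 20)  # next label depends on the previous one too
g_low = markov_g_statistic(first_order)
g_high = markov_g_statistic(second_order)
```

Turning the statistic into a p-value would require a chi-square reference distribution with the appropriate degrees of freedom, which is omitted here.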

minor comments (1)

[Abstract] Abstract: the phrase 'customized fitted-value iteration algorithm' is used without indicating what customization is performed; this detail belongs in the methods section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

Thank you for the constructive referee report. We address each major comment point by point below, with proposed revisions to the manuscript where appropriate.

Referee: [Abstract] Abstract: the claim that event values 'can be predicted by a regression model' is load-bearing for the central contribution, yet the described procedure obtains values by solving the MRP via fitted-value iteration on the extracted events and then applies regression to those same solved values; this makes the regression an approximation of the iteration output rather than an independent predictor for arbitrary conditions.

Authors: The regression is trained on values from the fitted-value iteration precisely to generalize those values to arbitrary points in the continuous parameter space for visualization. This is the intended mechanism for handling conditions not present among the extracted events. We will revise the abstract to state explicitly that the regression approximates the MRP solution for continuous parameters rather than serving as an independent predictor.
revision: yes

Referee: [Abstract] Abstract: no equations for the MRP, the customized fitted-value iteration, the regression step, or any validation metrics, error analysis, or baseline comparisons are supplied, so the assertion that the soccer experiments show effectiveness cannot be evaluated and leaves the continuous-parameter claim undemonstrated.

Authors: The full manuscript contains the MRP formulation, the customized fitted-value iteration procedure, the regression model, and experimental results on soccer data. To address the concern that these details are not visible in the abstract, we will add a concise statement of the key equations and validation approach to the revised abstract.
revision: yes

Referee: [Abstract] Abstract: the weakest assumption—that events extracted solely from time-varying player position distributions suffice to define states for a Markov chain whose reward process yields meaningful values—is not tested; player distributions omit ball-trajectory history, possession sequences, and tactical context, so the first-order Markov property P(next event | current state) is unlikely to hold and any downstream regression/visualization inherits the inconsistency.

Authors: The model treats player-position distributions as the basis for state extraction, following the standard MRP assumption that the chosen state representation is sufficient. We did not perform an explicit statistical test of the first-order Markov property. We will add a limitations discussion of this modeling choice and note that extensions incorporating ball trajectory or possession context could be explored in future work.
revision: partial

Circularity Check

1 step flagged

Event values 'predicted' by regression model that is fitted as part of solving the MRP

specific steps

fitted input called prediction [Abstract]
"Finally, the Markov reward process is solved by a customized fitted-value iteration algorithm so that the event values with the continuous parameter space can be predicted by a regression model."

Fitted-value iteration iteratively fits the regression model to Bellman backups to approximate the value function; therefore the final 'predicted' event values are definitionally the output of that same fitted regression rather than an independent forecast of pre-computed values.
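
The point can be written out with a generic fitted-value iteration update. The symbols are assumptions for illustration: $V_\theta$ is the regression model, $(s_i, r_i, s'_i)$ are observed transitions between events, and $K$ is the number of iterations. The paper's customized variant may differ.

```latex
\theta_{k+1} \;=\; \arg\min_{\theta} \sum_{i=1}^{N}
  \Bigl( V_{\theta}(s_i) - \bigl( r_i + \gamma\, V_{\theta_k}(s'_i) \bigr) \Bigr)^{2},
\qquad
\hat{V}(s) \;=\; V_{\theta_K}(s)
```

The reported event value $\hat{V}(s)$ is the final regression model evaluated at $s$, so no separate set of solved values exists against which the prediction could be checked.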

full rationale

The paper extracts events, builds an MRP over continuous-parameter states, then solves it via customized fitted-value iteration whose output is explicitly a regression model. The abstract directly equates the solved values to what the regression 'predicts,' making the claimed prediction identical to the fitted approximator by construction. This matches the fitted-input-called-prediction pattern with a specific quote and reduction; no other patterns (self-citation, self-definition, etc.) are exhibited in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on two domain assumptions: that player time-varying distributions yield representative significant events, and that a Markov reward process on those events can be solved by regression to produce continuous-parameter values. No free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

domain assumption: A match can be represented as a Markov chain whose states are significant events extracted from the time-varying distribution of players.
Stated directly in the abstract as the modeling step that enables the subsequent MRP.
domain assumption: The Markov reward process on these states admits a regression-based solution that yields valid values for any point in the continuous parameter space.
Implicit in the claim that fitted-value iteration plus regression predicts event values under arbitrary conditions.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "we consider a whole match as a Markov chain of significant events... Markov reward process... fitted-value iteration algorithm so that the event values with the continuous parameter space can be predicted by a regression model"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "extract significant events according to the playing intensity... multivariate distribution of the players... covariance matrix Σ_t... eigenvalues (a_t, b_t)... area S(t) = π a_t b_t"
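
The quoted fragment about the covariance matrix, its eigenvalues, and the area S(t) can be illustrated with a short sketch. Several things are assumed because the quote is elided: positions are 2-D points, the covariance is the population covariance, and a_t and b_t are used exactly as quoted (the eigenvalues themselves). The `as_lengths` flag, which instead takes their square roots as axis lengths, is a hypothetical alternative reading.

```python
from math import pi, sqrt
from typing import Sequence, Tuple


def spread_area(positions: Sequence[Tuple[float, float]], as_lengths: bool = False) -> float:
    """Area pi * a * b from the eigenvalues of the 2x2 covariance of player positions."""
    n = len(positions)
    mean_x = sum(x for x, _ in positions) / n
    mean_y = sum(y for _, y in positions) / n
    # Population covariance (divide by n); a sample covariance would divide by n - 1.
    sxx = sum((x - mean_x) ** 2 for x, _ in positions) / n
    syy = sum((y - mean_y) ** 2 for _, y in positions) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in positions) / n

    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2.0
    det = sxx * syy - sxy * sxy
    root = sqrt(max(half_trace * half_trace - det, 0.0))
    a = half_trace + root
    b = max(half_trace - root, 0.0)  # clamp tiny negative rounding errors

    if as_lengths:
        a, b = sqrt(a), sqrt(b)
    return pi * a * b


# Four players at equal distance from the centre of the group.
players = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
area = spread_area(players)
```

Tracking this area over time gives one scalar signal of how compact or stretched the players are, which is the kind of quantity the fragment appears to use for detecting significant events.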

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.