FightTracker: Real-time predictive analytics for Mixed Martial Arts bouts

Vincent Berthet

arXiv: 2312.11067 · v2 · submitted 2023-12-18 · 📊 stat.AP

Pith reviewed 2026-05-24 05:06 UTC · model grok-4.3

classification 📊 stat.AP

keywords: MMA, UFC, predictive modeling, real-time analytics, in-round statistics, live betting, regression analysis, fight outcome prediction

The pith

Two regression models predict UFC fight outcomes at 80 percent accuracy from live in-round statistics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds two models that take in-round fight statistics as inputs. One forecasts the judges' majority score for each round. The other forecasts whether the red fighter wins a three-round bout that goes beyond the second round. Both are reported to reach 80 percent accuracy. An R Shiny app delivers these predictions in real time from ESPN live data, and a live betting strategy based on the models returned 90.17 percent ROI over eight weeks against the bookmaker Unibet.

Core claim

The author reports that regression models trained on in-round fight statistics can forecast both per-round scoring and the winner of three-round UFC bouts that go beyond the second round at 80 percent accuracy, and that a live betting strategy built on these forecasts produced 90.17 percent ROI over an eight-week test period.

What carries the argument

Two regression models that use in-round fight statistics as explanatory variables to output either the judges' majority score or the red fighter's win probability.
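
As a rough illustration of what a model of this kind could look like, the sketch below fits a logistic regression by gradient descent on synthetic data. The logistic form, the three predictors (red-minus-blue differentials in strikes, takedowns, and control), the coefficients used to simulate outcomes, and all function names are assumptions made for illustration. The summary above only states that the models are regressions with in-round statistics as explanatory variables.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit an intercept and weights by batch gradient descent on the mean log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, x):
    """Predicted probability that the red fighter wins (hypothetical target)."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Synthetic data only: standardized red-minus-blue differentials (assumed features).
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    strike_diff = rng.gauss(0, 1)
    takedown_diff = rng.gauss(0, 1)
    control_diff = rng.gauss(0, 1)
    # Made-up coefficients used purely to simulate outcomes.
    true_logit = 1.5 * strike_diff + 0.8 * takedown_diff + 0.5 * control_diff
    y.append(1 if rng.random() < sigmoid(true_logit) else 0)
    X.append([strike_diff, takedown_diff, control_diff])

weights = fit_logistic(X, y)
p_red = predict_proba(weights, [1.0, 0.5, 0.0])
```

In a live setting, `predict_proba` would be called with the statistics available once a round has finished; how the paper actually encodes its predictors is not stated in this summary.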

If this is right

Bettors receive real-time signals that can be acted on during a live bout.
Coaches and athletes obtain round-by-round score forecasts while the fight is still underway.
The same live-data pipeline can be extended to additional fight formats once more data are collected.
Bookmakers face a measurable edge when bettors adopt the models at scale.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the models remain accurate without frequent retraining, they could be adapted to other combat sports that publish round-level statistics.
The approach separates the prediction task into two stages (score by round, then winner), which may reduce error propagation compared with a single end-to-end model.
Profitability depends on the bookmaker's odds reacting slowly to the same in-round information; faster markets would shrink the observed edge.
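
The ROI figure and its dependence on slow-moving odds can be made concrete with a small calculation. The sketch below assumes ROI means net profit divided by total stake and that odds are quoted in decimal format; the paper's exact definition is not given in this summary, and the bets listed are hypothetical, not the paper's.

```python
def roi(bets):
    """Return net profit divided by total stake.

    Each bet is (stake, decimal_odds, won). A winning bet returns
    stake * decimal_odds (stake included); a losing bet returns nothing.
    """
    total_staked = sum(stake for stake, _, _ in bets)
    if total_staked == 0:
        raise ValueError("no money staked")
    total_returned = sum(stake * odds for stake, odds, won in bets if won)
    return (total_returned - total_staked) / total_staked


def expected_value(p_win, decimal_odds):
    """Expected profit per unit staked, given a model win probability."""
    return p_win * decimal_odds - 1.0


# Hypothetical bets for illustration only.
hypothetical_bets = [
    (10.0, 2.50, True),
    (10.0, 1.80, False),
    (10.0, 3.00, True),
    (10.0, 2.10, False),
]
example_roi = roi(hypothetical_bets)  # 15 profit on 40 staked
```

Under these assumptions a bet is only worth placing when `expected_value` is positive, that is, when the model's probability exceeds the probability implied by the odds. If the bookmaker's live odds already reflect the in-round statistics, that gap closes and the edge disappears.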

Load-bearing premise

The statistical relationships between in-round metrics and outcomes seen in the training fights will remain stable for new fights without retraining or hidden data overlap.

What would settle it

A fresh sample of UFC fights where the models' out-of-sample accuracy falls below 70 percent or the derived betting rule set loses money over a comparable number of bouts.

Figures

Figures reproduced from arXiv: 2312.11067 by Vincent Berthet.

**Figure 1.** Workflow of FightTracker. FightTracker provides in real-time: (1) the prediction of the judges’ majority score as soon as a round has been completed, (2) the prediction that the red fighter will win the fight or not in 3-round fights that go beyond the second round (53% of all UFC fights). While a straightforward way to make this information available to the public would be to deploy the Shiny app, we chos…

**Figure 2.** Test of the regression model of round scores: Model calibration (N=2264 rounds).

**Figure 3.** Test of the regression model of 3-round fights based on round 1 data: Model calibration (left panel) and confusion matrix (right panel) (N=1216).

**Figure 4.** Test of the regression model of 3-round fights based on rounds 1 and 2 data: Model calibration (left panel) and confusion matrix (right panel) (N=961). The overall accuracy of the model was 80.64% (cutoff value = 0.50), which is significantly above the baseline (57.48%). Its sensitivity and specificity were estimated at 84.78% and 74.75%, respectively. Once again, the model was more prone to false positive…
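
For readers unfamiliar with the metrics quoted in the Figure 4 caption, the sketch below shows how accuracy, sensitivity, and specificity follow from a 2x2 confusion matrix. The counts are hypothetical and are not the paper's. Treating "red fighter wins" as the positive class and the baseline as the majority-class rate are both assumptions, since the caption does not define either.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute summary metrics from confusion-matrix counts.

    tp/fn: actual positives predicted positive/negative.
    tn/fp: actual negatives predicted negative/positive.
    """
    total = tp + fp + tn + fn
    positives = tp + fn
    negatives = tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / positives,  # true positive rate
        "specificity": tn / negatives,  # true negative rate
        # Assumed baseline: always predict the more frequent class.
        "baseline": max(positives, negatives) / total,
    }


# Hypothetical counts for illustration only.
m = classification_metrics(tp=80, fp=30, tn=70, fn=20)
```

A specificity below the sensitivity, as in the caption, means a larger share of actual negatives is misclassified than of actual positives, which is what a tendency toward false positives looks like in these terms.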

Original abstract

Mixed martial arts (MMA) has been one of the fastest-growing sports in recent years and has become a mainstream sport on the global stage. The growth of MMA has been driven by the Ultimate Fighting Championship (UFC), which is currently the largest MMA promotion organization in the world. However, data collection and statistical modeling in MMA are still in their infancy. We developed FightTracker, a data-driven solution that delivers real-time predictions for UFC fights. We first conducted regression analyses on the data provided by the UFC and MMA Decisions and built two predictive models of UFC fight outcomes. One model predicts the judges' majority score by round while the other predicts whether the red fighter will win the fight or not in 3-round fights that go beyond the second round (53% of all UFC fights). Both models use in-round fight statistics as explanatory variables and achieve 80% accuracy. We then designed an R shiny app that delivers these two predictions in real-time based on the ESPN live data. This information is valuable for fans, coaches, athletes, and especially bettors. Indeed, a live betting strategy based on FightTracker proved to generate large profits over an 8-week period against the bookmaker Unibet (90.17% ROI).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces FightTracker, consisting of two regression models fitted to UFC and MMA Decisions data that use in-round statistics to predict (i) judges' majority scores per round and (ii) the red fighter's win probability in 3-round fights that reach the third round. Both models are reported to achieve 80% accuracy; an R Shiny app delivers real-time predictions from ESPN data, and a live betting strategy based on the models is claimed to have produced a 90.17% ROI over an eight-week period against Unibet.

Significance. If the reported accuracies and ROI were obtained on strictly held-out, temporally subsequent fights with documented train-test separation and without post-hoc selection, the work would provide a concrete demonstration of real-time, in-round predictive modeling for MMA with direct economic implications. The absence of such validation details, however, prevents the central claims from being evaluated as prospective performance.

major comments (3)

[Abstract] The 80% accuracy figures for both regression models are stated without any description of the train-test split, cross-validation procedure, baseline comparison, or the total number of fights used, rendering it impossible to determine whether the reported performance reflects generalization or in-sample fit.
[Abstract] Betting results: the 90.17% ROI over the eight-week live test is presented without stating how many bets were placed, whether the eight-week window lies entirely after the data cutoff used to estimate the regression coefficients, or any temporal validation protocol; this directly undermines the claim that the strategy constitutes out-of-sample, prospective performance.
[Methods/Results] Regression specification: the models are ordinary regressions whose coefficients are estimated from the same UFC fight database later used both to quote accuracy and to generate betting signals; without an explicit statement that the evaluation fights were excluded from coefficient estimation, the reported metrics reduce by construction to in-sample performance.

minor comments (2)

[Abstract] The abstract does not define the exact response variables (e.g., how 'judges' majority score' is coded) or the precise set of in-round statistics employed as predictors.
[Methods] No mention is made of the number of observations, the software or package used for the regressions, or any regularization or variable-selection steps.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We agree that the manuscript requires additional explicit documentation of the train-test procedures, temporal separation, and betting validation to allow proper evaluation of the reported performance metrics. We address each major comment below and will revise the manuscript to incorporate the requested details.

Point-by-point responses

Referee: [Abstract] The 80% accuracy figures for both regression models are stated without any description of the train-test split, cross-validation procedure, baseline comparison, or the total number of fights used, rendering it impossible to determine whether the reported performance reflects generalization or in-sample fit.

Authors: We agree that these details are absent from the abstract. In the revised manuscript we will add a concise description of the temporal train-test split (coefficients estimated on fights prior to a fixed cutoff date and evaluated on subsequent fights), the cross-validation approach used in model development, a baseline comparison, and the total number of fights in the dataset.

Revision: yes

Referee: [Abstract] Betting results: the 90.17% ROI over the eight-week live test is presented without stating how many bets were placed, whether the eight-week window lies entirely after the data cutoff used to estimate the regression coefficients, or any temporal validation protocol; this directly undermines the claim that the strategy constitutes out-of-sample, prospective performance.

Authors: We agree that the abstract lacks these specifics. We will revise the abstract to report the number of bets placed, confirm that the eight-week window occurred entirely after the model estimation cutoff, and outline the temporal validation protocol used for the live betting evaluation.

Revision: yes

Referee: [Methods/Results] Regression specification: the models are ordinary regressions whose coefficients are estimated from the same UFC fight database later used both to quote accuracy and to generate betting signals; without an explicit statement that the evaluation fights were excluded from coefficient estimation, the reported metrics reduce by construction to in-sample performance.

Authors: We agree that an explicit statement is required. We will add language in the Methods section clarifying that a strict temporal cutoff was applied so that regression coefficients were estimated exclusively on earlier fights and all accuracy and betting metrics were computed on later, unseen fights.

Revision: yes
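
The temporal cutoff described in the simulated responses can be sketched as follows. This is a minimal illustration of the general technique, not the author's procedure: the record layout, the field names `fight_id` and `date`, and the cutoff date are all hypothetical.

```python
from datetime import date


def temporal_split(fights, cutoff):
    """Split fight records into (train, test) around a cutoff date.

    Fights strictly before the cutoff are used for estimation; fights on
    or after it are held out for evaluation.
    """
    train = [f for f in fights if f["date"] < cutoff]
    test = [f for f in fights if f["date"] >= cutoff]
    return train, test


# Hypothetical records for illustration only.
fights = [
    {"fight_id": "A", "date": date(2021, 3, 6)},
    {"fight_id": "B", "date": date(2021, 11, 13)},
    {"fight_id": "C", "date": date(2022, 2, 12)},
    {"fight_id": "D", "date": date(2022, 7, 30)},
]
train, test = temporal_split(fights, cutoff=date(2022, 1, 1))
train_ids = {f["fight_id"] for f in train}
test_ids = {f["fight_id"] for f in test}
```

Splitting at the fight level rather than the round level keeps all rounds of one bout on the same side of the cutoff, which matters for a round-score model because rounds from the same fight are unlikely to be independent.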

Circularity Check

0 steps flagged

No circularity: models and ROI presented as empirical results without reduction to input by construction

The abstract describes conducting regression analyses on UFC data to build predictive models that achieve 80% accuracy and then testing a betting strategy yielding 90.17% ROI. No equations, coefficient tables, or explicit statements are provided showing that the reported accuracy or ROI values are computed on the identical observations used to estimate the regression coefficients. The derivation chain therefore does not reduce by construction to a fitted parameter renamed as a prediction; any concern about train/test overlap is a methodological gap rather than a self-definitional or fitted-input equivalence. The paper remains self-contained against external benchmarks because its central claims rest on data-driven regression whose out-of-sample status is asserted but not internally contradicted.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claims rest on two fitted regression models whose coefficients are free parameters estimated from UFC data. Standard statistical assumptions (linearity, independence of rounds, no omitted variable bias) are invoked without explicit statement. No new physical entities are postulated.

free parameters (2)

regression coefficients for strike, takedown, and control variables: Estimated from historical UFC and MMA Decisions data to produce the 80% accuracy figures.
thresholds or cutoffs used to convert model probabilities into betting signals: Chosen to generate the reported 90.17% ROI over the eight-week window.

axioms (2)

domain assumption: The relationship between in-round statistics and final outcomes remains stable across future fights. Required for both the accuracy claim and the live-betting ROI to generalize beyond the observed sample.
standard math: Rounds within a fight are statistically independent conditional on the covariates. Implicit in treating each round's statistics as separate observations for the round-score model.
