Velocity and stroke rate reconstruction of canoe sprint team boats based on panned and zoomed video recordings
Pith reviewed 2026-05-21 11:40 UTC · model grok-4.3
The pith
Computer vision reconstructs canoe sprint velocity and stroke rate from panned and zoomed videos, with about 1 percent error relative to GPS.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework estimates boat positions by computing homographies from detected buoys on the known buoy grid and calibrates the boat tip location using a boat-specific athlete offset learned with a U-Net. It tracks multi-athlete boats with optical flow and extracts stroke rate from pose estimates or bounding box movements. When tested on elite competition videos, this yields a velocity MAPE of 1.1 percent (reported interval 0.8 to 1.4 percent) and a stroke rate MAPE of 0.9 percent (reported interval 0.6 to 1.3 percent) against GPS ground truth.
What carries the argument
Homography estimation from YOLOv8-detected buoys combined with U-net boat tip calibration for position reconstruction in panned and zoomed videos.
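The homography step can be illustrated with a minimal sketch. It assumes exactly four buoy correspondences, fixes the last homography entry to 1, and solves the resulting 8x8 linear system in plain Python. The pixel coordinates, the 10 m by 9 m grid cell, and all function names are illustrative assumptions, not values or code from the paper, which may use more buoys and a robust estimator.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate buoy configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_homography(image_pts, world_pts):
    """Direct linear transform from exactly four pairs, with h33 fixed to 1."""
    a, b = [], []
    for (u, v), (x, y) in zip(image_pts, world_pts):
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def to_world(h, pt):
    """Map an image point on the water plane to world coordinates."""
    u, v = pt
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)

def speed(p0, p1, dt):
    """Speed in m/s between two world positions taken dt seconds apart."""
    return math.hypot(p1[0] - p0[0], p1[1] - p0[1]) / dt

# Hypothetical buoy detections (pixels) and their known grid positions (metres).
image_buoys = [(100, 400), (500, 410), (420, 200), (180, 195)]
world_buoys = [(0, 0), (10, 0), (10, 9), (0, 9)]
H = fit_homography(image_buoys, world_buoys)

# Hypothetical boat tip pixel, mapped onto the water plane.
tip_world = to_world(H, (300, 300))
```

In a panned and zoomed recording the matrix has to be re-estimated for every frame, so velocity follows from `speed` applied to boat tip positions that were each mapped with their own frame's homography.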
Load-bearing premise
The racecourse buoy positions must be known in advance and the buoys must be accurately detected in every video frame despite camera movements.
What would settle it
A direct comparison of the reconstructed velocity profiles against simultaneous GPS measurements, on additional races where buoy detection occasionally fails, would show whether errors exceed the reported velocity MAPE of 1.1 percent.
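Such a comparison comes down to two numbers per race: MAPE and Spearman's rho between the video-derived and GPS-derived series. The sketch below shows how both could be computed. The sample values are made up, and whether the paper aggregates per sample or per race is not stated on this page, so treat the aggregation as an assumption.

```python
def mape(estimates, references):
    """Mean absolute percentage error, returned as a fraction (0.011 = 1.1 %)."""
    return sum(abs(e - r) / abs(r)
               for e, r in zip(estimates, references)) / len(references)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Made-up velocities in m/s, purely for illustration.
gps = [5.0, 4.0, 4.5, 4.8]
video = [5.05, 3.96, 4.5, 4.848]

velocity_mape = mape(video, gps)
velocity_rho = spearman(video, gps)
```

A settling experiment would report these two values separately for frames with and without buoy detection failures, rather than pooled over the whole race.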
Original abstract
Pacing strategies, defined by velocity and stroke rate profiles, are essential for peak performance in canoe sprint. While GPS is the gold standard for analysis, its limited availability necessitates automated video-based solutions. This paper presents an extended framework for reconstructing performance metrics from panned and zoomed video recordings across all sprint disciplines (K1-K4, C1-C2) and distances (200m-500m). Our method utilizes YOLOv8 for buoy and athlete detection, leveraging the known buoy grid to estimate homographies. We generalized the estimation of the boat position by means of learning a boat-specific athlete offset using a U-net based boat tip calibration. Further, we implement a robust tracking scheme using optical flow to adapt to multi-athlete boat types. Finally, we introduce methods to extract stroke rate information from either pose estimations or the athlete bounding boxes themselves. Evaluation against GPS data from elite competitions yields a velocity MAPE of 0.011 [0.008 0.014] (Spearman rho=0.974) and a stroke rate MAPE of 0.009 [0.006 0.013] (Spearman rho = 0.975). The methods provide coaches with highly accurate, automated feedback with minimal manual initialization work required, and without requiring sensors.
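The bounding-box route to stroke rate can be pictured as finding the dominant period of a per-frame signal. The sketch below uses a synthetic bounding-box width and picks the first positive local maximum of the autocorrelation. The choice of signal, the frequency limits, and the function names are assumptions for illustration; the page does not say which signal or estimator the paper uses, nor how signal cycles map to counted strokes in each discipline.

```python
import math

def autocorr_scores(signal, min_lag, max_lag):
    """Mean-removed autocorrelation for each lag in [min_lag, max_lag]."""
    mean = sum(signal) / len(signal)
    x = [s - mean for s in signal]
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        n = len(x) - lag
        scores[lag] = sum(x[i] * x[i + lag] for i in range(n)) / n
    return scores

def cycles_per_minute(signal, fps, min_hz=0.5, max_hz=3.0):
    """Dominant repetition rate of the signal, in cycles per minute."""
    min_lag = max(1, int(fps / max_hz))
    max_lag = min(len(signal) - 2, int(fps / min_hz))
    scores = autocorr_scores(signal, min_lag, max_lag)
    # The first positive local maximum marks the shortest repeating period.
    for lag in range(min_lag + 1, max_lag):
        if (scores[lag] > 0 and scores[lag] > scores[lag - 1]
                and scores[lag] >= scores[lag + 1]):
            return 60.0 * fps / lag
    # Fall back to the global maximum if no interior peak is found.
    best = max(scores, key=scores.get)
    return 60.0 * fps / best

# Synthetic bounding-box width in pixels: 2 cycles per second at 50 frames
# per second for 5 seconds. These numbers are illustrative, not from the paper.
fps = 50
widths = [80 + 10 * math.sin(2 * math.pi * 2.0 * i / fps) for i in range(250)]
rate = cycles_per_minute(widths, fps)
```

On real footage the signal would be noisy and affected by zoom changes, so some normalisation and smoothing would be needed before an estimator like this is usable.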
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an automated computer vision pipeline for reconstructing velocity and stroke rate in canoe sprint team boats (K1-K4, C1-C2) from panned and zoomed video recordings. The approach detects buoys and athletes with YOLOv8, computes homographies from a known buoy grid for position estimation, uses a U-Net to learn boat-specific athlete offsets for boat tip calibration, applies optical flow for robust tracking in multi-athlete scenarios, and extracts stroke rates from pose estimation or bounding boxes. End-to-end evaluation against GPS data from elite competitions reports velocity MAPE of 0.011 with Spearman rho=0.974 and stroke rate MAPE of 0.009 with rho=0.975.
Significance. If the results hold, this provides a practical sensor-free method for obtaining detailed performance metrics in canoe sprint, which is significant given GPS limitations in competitions. The generalization across boat types and distances, combined with high correlations to independent GPS measurements, supports utility for coaches. The end-to-end validation against real elite data and use of reproducible CV components (YOLOv8, U-Net, optical flow) are strengths that enhance the work's applicability in sports analytics.
major comments (1)
- [§3.2] §3.2 (Homography and Position Estimation): The velocity reconstruction depends on per-frame homographies computed from YOLOv8 buoy detections on the known grid. No quantitative metrics are reported for buoy detection precision/recall, homography reprojection error, or failure rates across zoom/pan conditions. This is load-bearing for the central claim because the reported velocity MAPE of 0.011 is end-to-end against GPS; without these intermediate statistics it remains unclear whether the low error persists when detections are imperfect (common when buoys occupy few pixels in zoomed footage), even with optical-flow tracking and U-Net offset.
minor comments (2)
- [Abstract] The abstract states generalization to all disciplines and distances but provides no breakdown of the number of boats, athletes, or video sequences used in the GPS evaluation; adding this would strengthen the cross-discipline claims.
- [§4] Consider including a supplementary table or figure showing per-boat-type MAPE and rho values (e.g., K1 vs. K4) to support the claim of applicability to team boats.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and the constructive major comment. We address it point by point below and commit to revisions that strengthen the manuscript without altering its core claims.
- Referee: [§3.2] §3.2 (Homography and Position Estimation): The velocity reconstruction depends on per-frame homographies computed from YOLOv8 buoy detections on the known grid. No quantitative metrics are reported for buoy detection precision/recall, homography reprojection error, or failure rates across zoom/pan conditions. This is load-bearing for the central claim because the reported velocity MAPE of 0.011 is end-to-end against GPS; without these intermediate statistics it remains unclear whether the low error persists when detections are imperfect (common when buoys occupy few pixels in zoomed footage), even with optical-flow tracking and U-Net offset.
  Authors: We agree that reporting these intermediate metrics would improve transparency and allow readers to evaluate robustness under realistic detection challenges. Although our primary contribution and validation focus on end-to-end accuracy against GPS (the metric most relevant to coaches), we acknowledge the referee's point that dissecting the homography stage is valuable. In the revised manuscript we will add to §3.2: (i) precision and recall for YOLOv8 buoy detections on our annotated validation frames, (ii) mean and standard deviation of homography reprojection error both in image pixels and in world coordinates, and (iii) the fraction of frames in which homography estimation failed or fell back to optical-flow tracking, stratified by zoom level and pan speed. These statistics are computable from our existing dataset and will be presented in a new table. We expect the numbers to show that the combination of optical flow and U-Net offset calibration keeps velocity error low even when individual buoy detections are imperfect.
  Revision: yes
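Item (i) of the promised additions, buoy detection precision and recall, could be computed along the following lines. This is a sketch under stated assumptions: detections and annotations are reduced to buoy centre points, matching is greedy nearest-neighbour within a pixel threshold, and all coordinates and the 5-pixel threshold are made up. The authors' actual matching rule (for example IoU on boxes) is not given.

```python
import math

def match_detections(detections, annotations, max_dist):
    """Greedily match each detection to its nearest unused annotation."""
    unmatched = list(annotations)
    tp = 0
    for d in detections:
        best = min(unmatched, key=lambda a: math.dist(d, a), default=None)
        if best is not None and math.dist(d, best) <= max_dist:
            unmatched.remove(best)
            tp += 1
    fp = len(detections) - tp   # detections with no annotation nearby
    fn = len(unmatched)         # annotated buoys that were missed
    return tp, fp, fn

def precision_recall(tp, fp, fn):
    """Precision and recall, defined as 0.0 when the denominator is empty."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical buoy centres in pixels for one frame.
annotated = [(100, 200), (300, 210), (500, 220), (700, 230)]
detected = [(102, 201), (298, 209), (650, 400)]

tp, fp, fn = match_detections(detected, annotated, max_dist=5.0)
precision, recall = precision_recall(tp, fp, fn)
```

Summing the three counts over all annotated frames before dividing, and doing so per zoom level, would give the stratified figures the referee asks for.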
Circularity Check
No significant circularity; external GPS validation anchors results
The paper reconstructs velocity and stroke rate via YOLOv8 buoy/athlete detection, known-grid homographies, U-Net boat-tip offset calibration, optical-flow tracking, and pose/bounding-box stroke extraction. All performance claims (velocity MAPE 0.011, stroke-rate MAPE 0.009, Spearman correlations) are computed against independent GPS ground truth from elite competitions, not against the method's own fitted parameters or internal assumptions. No self-definitional equations, fitted-input-as-prediction steps, or load-bearing self-citations appear in the derivation chain. The external benchmark keeps the evaluation non-circular.
Axiom & Free-Parameter Ledger
free parameters (1)
- boat-specific athlete offset
axioms (2)
- domain assumption: Buoy grid positions are known and fixed
- domain assumption: YOLOv8 detections are sufficiently accurate for buoy and athlete localization
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Our method utilizes YOLOv8 for buoy and athlete detection, leveraging the known buoy grid to estimate homographies... U-Net model... optical flow... ViTPose or the YOLO bounding boxes"
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "estimate a homography Hi for each frame fi... mapping of image points on the water surface plane to two-dimensional real-world coordinates"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.