Beyond Chamfer Distance: Granular Order-aware Evaluation Metric For Online Mapping
Pith reviewed 2026-05-22 06:57 UTC · model grok-4.3
The pith
The PLD and SOSPA metrics rank online mapping methods and, unlike mAP, expose detection as the main performance bottleneck.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that SOSPA provides an order-aware, axiom-satisfying replacement for Chamfer distance on individual map elements, while PLD replaces hard-threshold mAP with a soft joint measure of detection and geometry; together they yield rankings and error breakdowns on nuScenes that expose detection capability as the dominant current limitation.
What carries the argument
SOSPA (sequence optimal sub-pattern assignment) for order-preserving single-geometry comparison, and PLD (polyline localisation and detection) for soft multi-instance scoring that avoids binary thresholds.
Load-bearing premise
The entire argument assumes that order-aware optimal assignment and soft matching capture geometric and detection quality more faithfully than hard-thresholded Chamfer mAP.
What would settle it
If PLD and traditional mAP produce identical method rankings and error trends when both are run on the same nuScenes predictions from MapTRv2, StreamMapNet and MapTracker, the claimed advantage in granularity would not hold.
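One way to run this check is to turn each metric's scores into a ranking and measure how far the two rankings agree. The sketch below uses Kendall's tau for that purpose. The method names and numbers are placeholders invented for illustration, not results from the paper, and it assumes that mAP is higher-is-better while the second metric is a lower-is-better error.

```python
from itertools import combinations


def ranking(scores, higher_is_better):
    """Return method names ordered from best to worst."""
    return sorted(scores, key=scores.get, reverse=higher_is_better)


def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two tie-free rankings of the same items."""
    pos_a = {name: i for i, name in enumerate(rank_a)}
    pos_b = {name: i for i, name in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Placeholder scores (hypothetical, for illustration only).
map_scores = {"method_a": 0.60, "method_b": 0.65, "method_c": 0.70}
error_scores = {"method_a": 2.0, "method_b": 2.4, "method_c": 1.5}

map_rank = ranking(map_scores, higher_is_better=True)
err_rank = ranking(error_scores, higher_is_better=False)
tau = kendall_tau(map_rank, err_rank)
print(map_rank, err_rank, round(tau, 3))
```

A tau of 1.0 would mean the two metrics order the methods identically, which is the outcome that would undercut the claimed advantage in granularity. With only three methods there are just three pairs, so a single swap already moves tau a long way.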
Original abstract
Online map estimation is a crucial component of autonomous driving systems that reduces the reliance on costly high-definition maps. State-of-the-art (SOTA) methods commonly predict map elements as ordered sequences of points that form polylines and polygons. The evaluation of these methods relies predominantly on mean average precision (mAP) based on thresholded Chamfer distance (CD). This framework lacks sensitivity to point ordering and provides limited granularity in assessing geometric quality, making it difficult to distinguish which methods truly excel over others. In this work, we address these limitations on two fronts. For the single-instance similarity measure, we introduce sequence optimal sub-pattern assignment (SOSPA), an order-aware metric that enables fine-grained evaluation of individual geometries while satisfying all metric axioms. For the multi-instance evaluation framework, we propose polyline localisation and detection (PLD), a soft metric that jointly captures detection quality and geometric accuracy, replacing the hard thresholding of mAP with a principled soft assignment. Through evaluations on nuScenes, we demonstrate that PLD effectively ranks SOTA online mapping methods (MapTRv2, StreamMapNet, MapTracker) while providing a decomposed error analysis. This analysis identifies detection capability as the dominant bottleneck in current methods, revealing a performance trend that mAP fails to capture. Code for evaluation using our metrics will be released.
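The order-insensitivity of Chamfer distance noted in the abstract can be seen on a toy example. The sketch below is illustrative only: `chamfer_distance` follows one common convention (the average of the two directed mean nearest-neighbour distances), and `index_aligned_distance` is a hypothetical order-aware stand-in, not the paper's SOSPA.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point lists (one common convention)."""
    def one_way(src, dst):
        # Mean distance from each source point to its nearest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))


def index_aligned_distance(a, b):
    """Hypothetical order-aware baseline: mean distance between same-index points."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


# Same four points, but the second sequence visits them in a different order.
straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
scrambled = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0)]

cd = chamfer_distance(straight, scrambled)
ordered = index_aligned_distance(straight, scrambled)
print(cd, ordered)  # 0.0 0.5
```

Chamfer distance treats both sequences as the same point set and returns zero, even though the scrambled sequence traces a back-and-forth path. Any order-aware comparison, here the simplest possible one, assigns it a non-zero cost.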
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to introduce SOSPA, a sequence optimal sub-pattern assignment metric for order-aware single-instance similarity of polylines/polygons that satisfies all metric axioms, and PLD, a soft-assignment multi-instance metric that jointly evaluates detection quality and geometric accuracy without hard thresholding. On nuScenes, PLD ranks SOTA methods (MapTRv2, StreamMapNet, MapTracker) and decomposes errors to identify detection capability as the dominant bottleneck, a trend not captured by mAP-based evaluation.
Significance. If the metrics are rigorously validated and the decomposition is shown to be robust, the work could improve evaluation practices in online mapping for autonomous driving by providing more granular, order-sensitive, and decomposable alternatives to thresholded Chamfer distance and mAP.
major comments (1)
- [PLD definition and decomposed error analysis] The headline result that PLD identifies detection as the dominant bottleneck (and reveals trends invisible to mAP) depends on a decomposed error analysis derived from the joint soft assignment. Because the assignment simultaneously scores localisation and existence, any split into separate 'detection' and 'geometry' terms requires an explicit partitioning of the cost. Please provide the precise formula or procedure used for this decomposition (in the PLD definition section or associated equation) and demonstrate invariance to relative scaling of the geometric versus existence cost terms. Without this, the dominance conclusion risks being an artifact of the chosen weights rather than an intrinsic property of the evaluated methods.
minor comments (1)
- [Abstract] The abstract states that SOSPA satisfies all metric axioms but provides no sketch or reference to the proof; a brief indication of which axioms are verified and where the verification appears would aid readability.
Simulated Author's Rebuttal
We are grateful to the referee for the constructive feedback. Below we respond to the major comment regarding the PLD decomposed error analysis.
Referee: The headline result that PLD identifies detection as the dominant bottleneck (and reveals trends invisible to mAP) depends on a decomposed error analysis derived from the joint soft assignment. Because the assignment simultaneously scores localisation and existence, any split into separate 'detection' and 'geometry' terms requires an explicit partitioning of the cost. Please provide the precise formula or procedure used for this decomposition (in the PLD definition section or associated equation) and demonstrate invariance to relative scaling of the geometric versus existence cost terms. Without this, the dominance conclusion risks being an artifact of the chosen weights rather than an intrinsic property of the evaluated methods.
Authors: We agree that an explicit description of the decomposition is essential to substantiate the claim. The PLD metric uses a joint soft assignment based on a cost that combines existence probability and geometric similarity via SOSPA. The decomposition separates the total error into detection (existence-related assignment costs for unmatched or falsely matched elements) and geometry (SOSPA costs for correctly assigned elements). We will provide the precise mathematical formulation of this partitioning in the PLD definition section. Additionally, to address the scaling invariance, we will include an analysis showing that the relative dominance of detection error persists under different weightings of the geometric and existence terms (e.g., varying the balance parameter by factors of 0.5x to 2x). This will be added to the experiments section of the revised manuscript. revision: yes
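The referee's concern about weighting can be illustrated with a GOSPA-style cost, which is an assumption here since the page does not give the PLD formula. In the sketch below, matched pairs pay their pairwise distance (the "geometry" term) and every unmatched ground-truth or predicted element pays `c / 2` (the "detection" term), with exponent p = 1. The function `decompose`, the cost matrix, and the use of the cutoff `c` as the balance parameter are all hypothetical.

```python
def decompose(cost, c):
    """Brute-force the optimal partial assignment and split its cost.

    cost[i][j] is the distance between ground-truth element i and prediction j.
    Assumes a non-empty rectangular matrix. Returns (geometry, detection).
    """
    n_gt, n_pred = len(cost), len(cost[0])
    best = None

    def search(i, used, geometry, n_matched):
        nonlocal best
        if i == n_gt:
            unmatched = (n_gt - n_matched) + (n_pred - n_matched)
            detection = 0.5 * c * unmatched
            if best is None or geometry + detection < best[0] + best[1]:
                best = (geometry, detection)
            return
        # Option 1: leave ground-truth element i unmatched.
        search(i + 1, used, geometry, n_matched)
        # Option 2: match it to any prediction that is still free.
        for j in range(n_pred):
            if j not in used:
                search(i + 1, used | {j}, geometry + cost[i][j], n_matched + 1)

    search(0, frozenset(), 0.0, 0)
    return best


# Toy case: three ground-truth elements, two predictions, one missed element.
cost = [[0.2, 3.0],
        [3.0, 0.4],
        [3.0, 3.0]]

shares = {}
for c in (0.75, 1.5, 3.0):  # 0.5x, 1x and 2x of a nominal cutoff of 1.5
    geometry, detection = decompose(cost, c)
    shares[c] = detection / (geometry + detection)
    print(c, round(geometry, 3), round(detection, 3), round(shares[c], 3))
```

In this toy case the optimal assignment is the same for all three cutoffs, yet the detection share of the total moves from below one half to well above it. That is the kind of weight dependence a scaling analysis would need to rule out before "detection dominates" can be read as a property of the methods rather than of the chosen parameters.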
Circularity Check
New metrics defined from first principles with independent empirical evaluation
The paper proposes SOSPA as an order-aware single-instance metric satisfying metric axioms and PLD as a soft-assignment multi-instance framework that jointly scores detection and geometry while replacing mAP hard thresholds. These are introduced as novel constructions rather than derived from prior fitted parameters or self-cited equations. The decomposed error analysis and claim that detection is the dominant bottleneck are outputs of applying the metrics to nuScenes evaluations of MapTRv2, StreamMapNet, and MapTracker; they do not reduce by construction to the metric definitions themselves or to any self-citation chain. No self-definitional, fitted-input, or uniqueness-imported steps appear in the derivation. The work is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: SOSPA satisfies all metric axioms.
- Domain assumption: Soft assignment in PLD is a principled replacement for hard thresholding.
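A claim that a distance satisfies the metric axioms can be spot-checked numerically on a finite sample, although such a check can only falsify the claim and never prove it. The sketch below is a generic checker applied to two illustrative distances, neither of which is SOSPA: a symmetric Chamfer distance and a hypothetical index-aligned mean distance. All names are assumptions made for this example.

```python
import math
from itertools import product


def chamfer(a, b):
    """Symmetric Chamfer distance (average of the two directed means)."""
    def one_way(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)
    return 0.5 * (one_way(a, b) + one_way(b, a))


def index_aligned(a, b):
    """Mean distance between same-index points of equal-length sequences."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def violated_axioms(dist, samples, tol=1e-12):
    """Return the names of metric axioms that fail on the given samples."""
    violated = set()
    for a, b in product(samples, repeat=2):
        d_ab = dist(a, b)
        if d_ab < -tol:
            violated.add("non-negativity")
        # Identity: zero distance if and only if the sequences are equal.
        if (a == b) != (abs(d_ab) <= tol):
            violated.add("identity")
        if abs(d_ab - dist(b, a)) > tol:
            violated.add("symmetry")
    for a, b, c in product(samples, repeat=3):
        if dist(a, c) > dist(a, b) + dist(b, c) + tol:
            violated.add("triangle")
    return violated


# Sequences are tuples so that equality compares point order as well.
straight = ((0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0))
scrambled = ((0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (3.0, 0.0))
shifted = ((0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0))
samples = [straight, scrambled, shifted]

chamfer_failures = violated_axioms(chamfer, samples)
aligned_failures = violated_axioms(index_aligned, samples)
print(chamfer_failures, aligned_failures)
```

On these samples Chamfer distance fails the identity axiom when point sequences are compared, because two differently ordered sequences are at distance zero. The index-aligned distance passes all four checks here, which says nothing about larger or unequal-length inputs.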