pith. machine review for the scientific record.

arxiv: 2604.03463 · v1 · submitted 2026-04-03 · 💻 cs.LG · cs.RO

Recognition: 2 theorem links · Lean Theorem

Super Agents and Confounders: Influence of surrounding agents on vehicle trajectory prediction

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 19:39 UTC · model grok-4.3

classification 💻 cs.LG cs.RO
keywords trajectory prediction · surrounding agents · conditional information bottleneck · Shapley attribution · autonomous driving · robustness · non-causal learning · vehicle prediction

The pith

Many surrounding agents degrade vehicle trajectory prediction accuracy because models learn unstable non-causal rules from them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that in interactive driving scenes, information from surrounding cars and pedestrians often lowers the accuracy of trajectory predictions rather than raising it. State-of-the-art models develop decision schemes that vary sharply across training runs and fail to track true causal influences, as shown by Shapley attribution. The authors add a Conditional Information Bottleneck that compresses agent features and drops those that do not help the prediction task, without needing extra labels. Experiments across datasets and architectures show gains in accuracy and robustness to perturbations when this filtering is applied. The result matters because real driving involves dense traffic where unfiltered context can inject misleading signals that harm reliability.

Core claim

Our main contribution is a comprehensive analysis of state-of-the-art trajectory predictors, which reveals a surprising and critical flaw: many surrounding agents degrade prediction accuracy rather than improve it. Using Shapley-based attribution, we rigorously demonstrate that models learn unstable and non-causal decision-making schemes that vary significantly across training runs. Building on these insights, we propose to integrate a Conditional Information Bottleneck (CIB), which does not require additional supervision and is trained to effectively compress agent features as well as ignore those that are not beneficial for the prediction task.
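The Shapley machinery behind this claim can be sketched with permutation sampling (in the spirit of the sampling approximation of Castro et al. that the paper cites): each surrounding agent's value is its average marginal effect on prediction error when added to a random subset of visible agents. The `predict_error` toy below is a hypothetical stand-in for a real predictor plus metric, not the paper's implementation.

```python
import random

import numpy as np

def shapley_attribution(predict_error, n_agents, n_samples=200, seed=0):
    """Monte-Carlo Shapley values over surrounding agents.

    predict_error(visible) -> scalar error when only the agent indices
    in `visible` are shown to the model. A positive value means the
    agent *increases* error, i.e. behaves like a confounder."""
    rng = random.Random(seed)
    phi = np.zeros(n_agents)
    for _ in range(n_samples):
        order = list(range(n_agents))
        rng.shuffle(order)
        visible = []
        prev = predict_error(visible)
        for i in order:
            visible.append(i)
            cur = predict_error(visible)
            phi[i] += cur - prev  # marginal contribution of agent i
            prev = cur
    return phi / n_samples

# Toy error model: agent 0 carries causal signal, agent 1 is spurious.
def predict_error(visible):
    err = 1.0
    if 0 in visible:
        err -= 0.4  # helpful context
    if 1 in visible:
        err += 0.2  # misleading context
    return err

phi = shapley_attribution(predict_error, n_agents=2, n_samples=50)
```

Because the toy error model is additive, the sampled values converge to -0.4 and +0.2; for a real predictor the estimate carries Monte-Carlo noise on top of the across-run variability the paper interrogates.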

What carries the argument

The Conditional Information Bottleneck (CIB) applied to surrounding agent features, which learns to compress them while discarding information that does not aid trajectory prediction.
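A rough sketch of how such a bottleneck penalizes retained agent information follows: a generic variational IB term, not the paper's exact conditional formulation, with `mu_w` and `logvar_w` as hypothetical stand-ins for learned projections (random here).

```python
import numpy as np

rng = np.random.default_rng(0)

def vib_compress(agent_feats, mu_w, logvar_w, beta=1e-3):
    """Stochastic bottleneck over per-agent features (sketch).

    agent_feats: (n_agents, d_in); mu_w, logvar_w: (d_in, d_z).
    Returns compressed codes z and the KL penalty that discourages
    keeping information about the agents."""
    mu = agent_feats @ mu_w
    logvar = np.clip(agent_feats @ logvar_w, -10.0, 10.0)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterized sample
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over code dims
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)
    return z, beta * float(kl.mean())

feats = rng.standard_normal((4, 8))  # 4 agents, 8-dim features
z, penalty = vib_compress(feats, rng.standard_normal((8, 2)),
                          rng.standard_normal((8, 2)) * 0.1)
```

Training would add this penalty to the prediction loss, so the optimizer keeps an agent's features only when they buy more accuracy than they cost in KL.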

If this is right

  • Trajectory prediction accuracy rises in many settings once non-beneficial agent information is filtered out.
  • Models gain robustness to input perturbations when only useful contextual features are retained.
  • Attribution metrics can flag non-robust behavior learned by predictors without extra supervision.
  • Selective use of surrounding-agent data reduces the effect of spurious signals in crowded driving scenes.
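The robustness point can be made concrete with a simple perturbation probe; `predict` below is a hypothetical stand-in for model-plus-metric, and Gaussian position noise is only one of many possible perturbations.

```python
import numpy as np

def robustness_gap(predict, history, sigma=0.1, n_trials=20, seed=0):
    """Mean increase in prediction error when agent history positions
    are jittered with Gaussian noise of std `sigma` (meters).

    predict(history) -> scalar error; history: (n_agents, T, 2)."""
    rng = np.random.default_rng(seed)
    base = predict(history)
    gaps = [predict(history + rng.normal(0.0, sigma, history.shape)) - base
            for _ in range(n_trials)]
    return float(np.mean(gaps))

# Stand-in error that grows with any deviation from the clean input.
history = np.zeros((3, 10, 2))  # 3 agents, 10 timesteps, xy positions
gap = robustness_gap(lambda h: float(np.abs(h).mean()), history)
```

A model whose error barely moves under such jitter (small gap) exhibits the behavior the CIB filtering is claimed to encourage.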

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same compression step could help other multi-agent forecasting tasks where extra context risks adding confounders.
  • The training-run instability suggests that safety-critical predictors may need explicit regularization for causal consistency.
  • Testing the CIB on real-time streams with fluctuating agent counts would show whether the gains hold when the scene changes dynamically.

Load-bearing premise

That Shapley-based attribution reliably identifies non-causal and unstable decision schemes, and that the CIB compression preserves all causally relevant information without introducing new biases.

What would settle it

A controlled test on a dataset where surrounding agents are known to carry clear causal signals, showing whether the CIB still lowers prediction error or whether removing it restores the original performance gap.

Figures

Figures reproduced from arXiv: 2604.03463 by Daniel Jost, Jörg Wagner, Joschka Bödecker, Luca Paparusso, Martin Stoll, Raghu Rajan.

Figure 1: (a) Removing a ”Confounding Agent” can sig
Figure 2: Overview of the proposed prediction architecture.
Figure 3: Insertion test performed on the train and validation
Figure 4: Consistency of improving agent influence for the
Figure 5: Consistency of improving agent influence for the MTR model and the MTR+IB model compared to human labels.
Original abstract

In highly interactive driving scenes, trajectory prediction is conditioned on information from surrounding traffic participants such as cars and pedestrians. Our main contribution is a comprehensive analysis of state-of-the-art trajectory predictors, which reveals a surprising and critical flaw: many surrounding agents degrade prediction accuracy rather than improve it. Using Shapley-based attribution, we rigorously demonstrate that models learn unstable and non-causal decision-making schemes that vary significantly across training runs. Building on these insights, we propose to integrate a Conditional Information Bottleneck (CIB), which does not require additional supervision and is trained to effectively compress agent features as well as ignore those that are not beneficial for the prediction task. Comprehensive experiments using multiple datasets and model architectures demonstrate that this simple yet effective approach not only improves overall trajectory prediction performance in many cases but also increases robustness to different perturbations. Our results highlight the importance of selectively integrating contextual information, which can often contain spurious or misleading signals, in trajectory prediction. Moreover, we provide interpretable metrics for identifying non-robust behavior and present a promising avenue towards a solution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that surrounding agents in trajectory prediction often degrade accuracy because models learn unstable non-causal schemes (shown via Shapley attribution varying across runs), and proposes a Conditional Information Bottleneck (CIB) to compress agent features without extra supervision, yielding better performance and robustness across datasets and architectures.

Significance. If the central empirical findings hold after addressing evidence gaps, the work would highlight risks of spurious context in prediction models and offer a practical, supervision-free compression method, with potential impact on robust autonomous driving systems.

major comments (3)
  1. [Section 4] Shapley attribution section: the leap from high Shapley values to 'non-causal' and 'unstable' decision schemes is not demonstrated; Shapley quantifies marginal contribution to loss but does not establish causality (no interventional or counterfactual tests) or instability (no variance across seeds or statistical tests reported).
  2. [Section 5] Experiments and results: claims of CIB improving robustness lack detailed error bars, multi-seed statistics, and explicit ablation comparisons against standard feature selection or dropout baselines to confirm selective compression of non-beneficial agents.
  3. [Section 3.2] CIB formulation: it is unclear whether the conditional mutual information estimation preserves all causally relevant information or introduces new biases, as no analysis of information retention versus prediction metric is provided.
minor comments (2)
  1. [Abstract] Abstract: the statement 'many surrounding agents degrade prediction accuracy' would benefit from a quantitative definition or threshold for when degradation occurs.
  2. [Section 3] Notation: the definition of the CIB objective could include an explicit equation for the compression term to improve clarity.
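The multi-seed instability check requested in major comment 1 reduces to a variance test over attribution runs; the relative threshold below is an illustrative choice, not one the paper specifies.

```python
import numpy as np

def flag_unstable_agents(attributions, rel_threshold=1.0):
    """attributions: (n_seeds, n_agents) Shapley values, one row per
    training run. Flags agents whose across-seed std exceeds
    rel_threshold * |mean|, i.e. whose influence is not reproducible."""
    a = np.asarray(attributions, dtype=float)
    mean = a.mean(axis=0)
    std = a.std(axis=0, ddof=1)
    return std > rel_threshold * np.abs(mean)

# 3 seeds x 2 agents: agent 0 is consistently helpful, agent 1 flips sign.
runs = np.array([[-0.40, 0.30],
                 [-0.38, -0.25],
                 [-0.42, 0.05]])
flags = flag_unstable_agents(runs)
```

An agent flagged here is one whose attributed influence changes sign or magnitude across otherwise identical training runs, which is the "unstable decision scheme" the report asks the authors to quantify.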

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback, which helps clarify and strengthen the empirical foundations of our work. We address each major comment below and outline the corresponding revisions.

Point-by-point responses
  1. Referee: [Section 4] Shapley attribution section: the leap from high Shapley values to 'non-causal' and 'unstable' decision schemes is not demonstrated; Shapley quantifies marginal contribution to loss but does not establish causality (no interventional or counterfactual tests) or instability (no variance across seeds or statistical tests reported).

    Authors: We agree that Shapley values quantify marginal contribution to the loss rather than establishing causality via interventions or counterfactuals. Our claim of instability is based on observed variation in attributions across training runs, but we acknowledge the absence of explicit multi-seed statistics and significance tests in the current draft. In the revision we will add experiments over at least five random seeds, reporting standard deviations of Shapley values per agent together with statistical tests for variance. We will also revise the language to state that the high variability suggests reliance on non-robust, spurious correlations, while explicitly noting the lack of interventional evidence as a limitation. revision: partial

  2. Referee: [Section 5] Experiments and results: claims of CIB improving robustness lack detailed error bars, multi-seed statistics, and explicit ablation comparisons against standard feature selection or dropout baselines to confirm selective compression of non-beneficial agents.

    Authors: We accept that additional statistical detail and targeted ablations are required. The revised manuscript will report all main results and robustness experiments with error bars computed over multiple random seeds (minimum five). We will further include new ablation tables comparing CIB against dropout applied to agent features and against simple feature-selection baselines (e.g., attention-score thresholding and mutual-information pruning). These comparisons will directly test whether CIB achieves more selective compression of non-beneficial agents. revision: yes

  3. Referee: [Section 3.2] CIB formulation: it is unclear whether the conditional mutual information estimation preserves all causally relevant information or introduces new biases, as no analysis of information retention versus prediction metric is provided.

    Authors: The CIB objective is designed to retain only information relevant to the prediction target by minimizing conditional mutual information while maximizing predictive mutual information. We agree that an explicit analysis linking retained information to downstream metrics is missing. In the revision we will add plots and tables that show estimated conditional mutual information values against ADE/FDE for different values of the trade-off parameter, thereby illustrating the information-retention versus accuracy trade-off and helping to surface any systematic biases. revision: yes
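The ADE/FDE metrics that the promised trade-off plots would be drawn against are standard; a minimal single-trajectory version, for orientation:

```python
import numpy as np

def ade_fde(pred, gt):
    """Average / Final Displacement Error for one trajectory.
    pred, gt: (T, 2) arrays of xy positions in meters."""
    disp = np.linalg.norm(pred - gt, axis=-1)  # per-timestep L2 error
    return float(disp.mean()), float(disp[-1])

gt = np.stack([np.arange(5.0), np.zeros(5)], axis=1)  # straight-line motion
pred = gt + np.array([0.0, 1.0])                      # constant 1 m offset
ade, fde = ade_fde(pred, gt)
```

In multi-modal settings these are usually reported as minADE/minFDE over the k predicted modes, which the two-line version above does not cover.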

Circularity Check

0 steps flagged

No circularity: empirical analysis with independent attribution and external datasets

full rationale

The paper performs post-hoc Shapley attribution on trained trajectory predictors using standard external datasets and multiple architectures. The CIB objective is introduced as an independent compression term whose training loss is defined separately from the final prediction metric and does not reduce to any fitted parameter by construction. No self-citations are invoked as load-bearing uniqueness theorems, no ansatz is smuggled, and no prediction is statistically forced from a subset fit. The derivation from analysis to proposed method remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claims rest on standard supervised learning assumptions for trajectory models and the validity of Shapley attribution for feature importance; no new free parameters or invented entities are introduced beyond the CIB formulation.

pith-pipeline@v0.9.0 · 5503 in / 987 out tokens · 27928 ms · 2026-05-13T19:39:53.291411+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · 1 internal anchor

  1. [1]

    Q-EANet: Implicit social modeling for trajectory prediction via experience-anchored queries,

    J. Chen, Z. Wang, J. Wang, and B. Cai, “Q-EANet: Implicit social modeling for trajectory prediction via experience-anchored queries,” IET Intelligent Transport Systems, vol. 18, no. 6, pp. 1004–1015, 2024. [Online]. Available: https://articlelibrary.wiley.com/doi/abs/10.1049/itr2.12477

  2. [2]

    LAformer: Trajectory Prediction for Autonomous Driving with Lane-Aware Scene Constraints,

    M. Liu, H. Cheng, L. Chen, H. Broszio, J. Li, R. Zhao, M. Sester, and M. Y. Yang, “LAformer: Trajectory Prediction for Autonomous Driving with Lane-Aware Scene Constraints,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2023. [Online]. Available: http://arxiv.org/abs/2302.13933

  3. [3]

    MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying,

    S. Shi, L. Jiang, D. Dai, and B. Schiele, “MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. [Online]. Available: http://arxiv.org/abs/2306.17770

  4. [4]

    EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction,

    L. Lin, X. Lin, T. Lin, L. Huang, R. Xiong, and Y. Wang, “EDA: Evolving and Distinct Anchors for Multimodal Motion Prediction,” Proceedings of the AAAI Conference on Artificial Intelligence, 2023. [Online]. Available: http://arxiv.org/abs/2312.09501

  5. [5]

    CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships,

    R. Roelofs, L. Sun, B. Caine, K. S. Refaat, B. Sapp, S. Ettinger, and W. Chai, “CausalAgents: A Robustness Benchmark for Motion Forecasting using Causal Relationships,” Oct. 2022

  6. [6]

    On Adversarial Robustness of Trajectory Prediction for Autonomous Vehicles,

    Q. Zhang, S. Hu, J. Sun, Q. A. Chen, and Z. M. Mao, “On Adversarial Robustness of Trajectory Prediction for Autonomous Vehicles,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 15159–15168

  7. [7]

    Are socially-aware trajectory prediction models really socially-aware?

    S. Saadatnejad, M. Bahari, P. Khorsandi, M. Saneian, S.-M. Moosavi-Dezfooli, and A. Alahi, “Are socially-aware trajectory prediction models really socially-aware?” Feb. 2022

  8. [8]

    AdvDO: Realistic Adversarial Attacks for Trajectory Prediction,

    Y. Cao, C. Xiao, A. Anandkumar, D. Xu, and M. Pavone, “AdvDO: Realistic Adversarial Attacks for Trajectory Prediction,” Sep. 2022

  9. [9]

    A Value for n-Person Games,

    L. Shapley, “7. A Value for n-Person Games. Contributions to the Theory of Games II (1953) 307–317,” in Classics in Game Theory, H. W. Kuhn, Ed. Princeton University Press, Nov. 2020, pp. 69–79

  10. [10]

    You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction,

    O. Makansi, T. Brox, and B. Schölkopf, “You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction,” in International Conference on Learning Representations (ICLR), 2022

  11. [11]

    Deep Variational Information Bottleneck,

    A. A. Alemi, I. Fischer, J. V. Dillon, and K. Murphy, “Deep Variational Information Bottleneck,” International Conference on Learning Representations (ICLR), 2019. [Online]. Available: http://arxiv.org/abs/1612.00410

  12. [12]

    Conditional Graph Information Bottleneck for Molecular Relational Learning,

    N. Lee, D. Hyun, G. S. Na, S. Kim, J. Lee, and C. Park, “Conditional Graph Information Bottleneck for Molecular Relational Learning,” Jul. 2023. [Online]. Available: http://arxiv.org/abs/2305.01520

  13. [13]

    Motion Transformer with global intention localization and local movement refinement,

    S. Shi, L. Jiang, D. Dai, and B. Schiele, “Motion Transformer with global intention localization and local movement refinement,” in Advances in Neural Information Processing Systems (NeurIPS), 2022

  14. [14]

    nuScenes: A multimodal dataset for autonomous driving,

    H. Caesar, V. Bankiti, A. H. Lang, S. Vora, V. E. Liong, Q. Xu, A. Krishnan, Y. Pan, G. Baldan, and O. Beijbom, “nuScenes: A multimodal dataset for autonomous driving,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [Online]. Available: http://arxiv.org/abs/1903.11027

  15. [15]

    Large scale interactive motion forecasting for autonomous driving: The Waymo Open Motion Dataset,

    S. Ettinger, S. Cheng, B. Caine, C. Liu, H. Zhao, S. Pradhan, Y. Chai, B. Sapp, C. Qi, Y. Zhou, Z. Yang, A. Chouard, P. Sun, J. Ngiam, V. Vasudevan, A. McCauley, J. Shlens, and D. Anguelov, “Large scale interactive motion forecasting for autonomous driving: The Waymo Open Motion Dataset,” in Proceedings of the IEEE/CVF International Conference on Comput...

  16. [16]

    Query-Centric Trajectory Prediction,

    Z. Zhou, J. Wang, Y. Li, and Y. Huang, “Query-Centric Trajectory Prediction,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2023, pp. 17863–17873. [Online]. Available: https://ieeexplore.ieee.org/document/10203873/

  17. [17]

    QCNeXt: A Next-Generation Framework For Joint Multi-Agent Trajectory Prediction,

    Z. Zhou, Z. Wen, J. Wang, Y.-H. Li, and Y.-K. Huang, “QCNeXt: A Next-Generation Framework For Joint Multi-Agent Trajectory Prediction,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [Online]. Available: http://arxiv.org/abs/2306.10508

  18. [18]

    A joint prediction method of multi-agent to reduce collision rate,

    M. Wang, H. Zou, Y. Liu, Y. Wang, and G. Li, “A joint prediction method of multi-agent to reduce collision rate,” 2024. [Online]. Available: http://arxiv.org/abs/2411.07612

  19. [19]

    Reasoning multi-agent behavioral topology for interactive autonomous driving,

    H. Liu, L. Chen, Y. Qiao, C. Lv, and H. Li, “Reasoning multi-agent behavioral topology for interactive autonomous driving,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 15918–15928

  20. [20]

    Attention is not Explanation

    S. Jain and B. C. Wallace, “Attention is not Explanation,” Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019. [Online]. Available: http://arxiv.org/abs/1902.10186

  21. [21]

    Sanity Checks for Saliency Maps,

    J. Adebayo, J. Gilmer, M. Muelly, I. Goodfellow, M. Hardt, and B. Kim, “Sanity Checks for Saliency Maps,” in Advances in Neural Information Processing Systems, vol. 31. Curran Associates, Inc., 2018

  22. [22]

    Curb Your Attention: Causal Attention Gating for Robust Trajectory Prediction in Autonomous Driving,

    E. Ahmadi, R. Mercurius, S. Alizadeh, K. Rezaee, and A. Rasouli, “Curb Your Attention: Causal Attention Gating for Robust Trajectory Prediction in Autonomous Driving,” Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025. [Online]. Available: http://arxiv.org/abs/2410.07191

  23. [23]

    CaDeT: A Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving,

    M. Pourkeshavarz, J. Zhang, and A. Rasouli, “CaDeT: A Causal Disentanglement Approach for Robust Trajectory Prediction in Autonomous Driving,” in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2024, pp. 14874–14884. [Online]. Available: https://ieeexplore.ieee.org/document/10657124/

  24. [24]

    The information bottleneck method

    N. Tishby, F. C. Pereira, and W. Bialek, “The information bottleneck method,” 2000. [Online]. Available: http://arxiv.org/abs/physics/0004057

  25. [25]

    Polynomial calculation of the Shapley value based on sampling,

    J. Castro, D. Gómez, and J. Tejada, “Polynomial calculation of the Shapley value based on sampling,” Computers & Operations Research, vol. 36, no. 5, pp. 1726–1730, May 2009

  26. [26]

    RISE: Randomized Input Sampling for Explanation of Black-box Models,

    V. Petsiuk, A. Das, and K. Saenko, “RISE: Randomized Input Sampling for Explanation of Black-box Models,” Sep. 2018

  27. [27]

    Deletion and insertion tests in regression models,

    N. Hama, M. Mase, and A. B. Owen, “Deletion and insertion tests in regression models,” Journal of Machine Learning Research, vol. 23, no. 1, 2022

  28. [28]

    Improving Subgraph Recognition with Variational Graph Information Bottleneck,

    J. Yu, J. Cao, and R. He, “Improving Subgraph Recognition with Variational Graph Information Bottleneck,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, LA, USA: IEEE, Jun. 2022, pp. 19374–19383. [Online]. Available: https://ieeexplore.ieee.org/document/9880086/

  29. [29]

    Learning Robust Representations via Multi-View Information Bottleneck,

    M. Federici, A. Dutta, P. Forré, N. Kushman, and Z. Akata, “Learning Robust Representations via Multi-View Information Bottleneck,” Feb. 2020. [Online]. Available: http://arxiv.org/abs/2002.07017