Interpretable Modelling of Driving Behaviors in Interactive Driving Scenarios based on Cumulative Prospect Theory
Pith reviewed 2026-05-24 18:58 UTC · model grok-4.3
The pith
Cumulative prospect theory models human driving behavior in interactive scenarios, outperforming time-to-collision and matching neural networks with less data and better interpretability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors cast driver decisions as the maximization of a cumulative-prospect value that combines a nonlinear value function with a nonlinear decision-weighting function. Parameters recovered from real merging data yield lower prediction error than a time-to-collision baseline and error comparable to that of a neural-network model, while requiring far fewer training examples and exposing the recovered utility and weighting curves for inspection.
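For reference, a cumulative-prospect value of this kind is commonly written in the form given by Tversky and Kahneman (1992). The block below uses that standard formulation as an assumption; this page does not state which parametric forms the authors actually fit, and the symbols $\alpha, \beta, \lambda, \gamma, \delta$ are the conventional ones rather than the paper's notation. Outcomes $x_i$ are measured relative to a reference point and ordered so that $x_{-m} \le \dots \le x_{-1} < 0 \le x_1 \le \dots \le x_n$, with probabilities $p_i$.

```latex
\begin{align}
V &= \sum_{i=-m}^{-1} \pi_i^{-}\, v(x_i) \;+\; \sum_{i=1}^{n} \pi_i^{+}\, v(x_i), \\
\pi_i^{+} &= w^{+}\!\Big(\sum_{j=i}^{n} p_j\Big) - w^{+}\!\Big(\sum_{j=i+1}^{n} p_j\Big), \\
\pi_i^{-} &= w^{-}\!\Big(\sum_{j=-m}^{i} p_j\Big) - w^{-}\!\Big(\sum_{j=-m}^{i-1} p_j\Big), \\
v(x) &= \begin{cases} x^{\alpha}, & x \ge 0, \\ -\lambda\,(-x)^{\beta}, & x < 0, \end{cases} \\
w^{+}(p) &= \frac{p^{\gamma}}{\big(p^{\gamma} + (1-p)^{\gamma}\big)^{1/\gamma}}, \qquad
w^{-}(p) = \frac{p^{\delta}}{\big(p^{\delta} + (1-p)^{\delta}\big)^{1/\delta}}.
\end{align}
```

Here $\lambda > 1$ encodes loss aversion, and $\gamma, \delta < 1$ give the inverse-S weighting that overweights small probabilities. Expected utility is the special case with $\lambda = 1$ and $w^{\pm}(p) = p$.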
What carries the argument
The CPT-driven decision-making model together with its hierarchical learning algorithm that jointly recovers the utility function, value function, and decision-weighting function from observed trajectories.
If this is right
- Interactive driving decisions can be generated by maximizing a prospect-theory value that incorporates loss aversion and probability distortion.
- Accurate prediction of human maneuvers is possible with substantially smaller training sets than those needed by neural networks.
- The explicit value and weighting functions allow direct examination of which behavioral biases explain observed choices.
- The same CPT structure applies to any two-agent interaction once the functions have been fitted to representative data.
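The first and third points can be illustrated with a small sketch. It assumes the Tversky and Kahneman (1992) functional forms and their published median parameters (0.88, 2.25, 0.61, 0.69); the paper's fitted functions are not given on this page, and the two maneuvers with their outcome and probability values are purely hypothetical.

```python
def value(x, alpha=0.88, beta=0.88, lam=2.25):
    """Value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** beta


def weight(p, g):
    """Inverse-S probability weighting function with curvature g."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return p ** g / (p ** g + (1 - p) ** g) ** (1 / g)


def cpt_value(prospect, gamma=0.61, delta=0.69):
    """CPT value of a list of (outcome, probability) pairs.

    Outcomes are relative to the reference point (0 = status quo).
    """
    # Gains are ranked best-first and losses worst-first, so each decision
    # weight is a difference of weighted cumulative probabilities.
    gains = sorted((t for t in prospect if t[0] >= 0), key=lambda t: -t[0])
    losses = sorted((t for t in prospect if t[0] < 0), key=lambda t: t[0])
    total = 0.0
    for ranked, g in ((gains, gamma), (losses, delta)):
        cum = 0.0
        for x, p in ranked:
            total += (weight(cum + p, g) - weight(cum, g)) * value(x)
            cum += p
    return total


# Hypothetical maneuvers: merging usually saves time but carries a small
# chance of a severe loss; yielding costs a small, certain delay.
maneuvers = {
    "merge": [(10.0, 0.95), (-100.0, 0.05)],
    "yield": [(-2.0, 1.0)],
}
by_cpt = max(maneuvers, key=lambda m: cpt_value(maneuvers[m]))
by_ev = max(maneuvers, key=lambda m: sum(x * p for x, p in maneuvers[m]))
print("CPT choice:", by_cpt, "| expected-value choice:", by_ev)
```

With these made-up numbers the expected-value rule prefers merging, while the CPT value prefers yielding, because the rare severe loss is both overweighted and amplified by loss aversion. Reading the choice off `value` and `weight` in this way is the kind of inspection the third point refers to.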
Where Pith is reading between the lines
- Parameters recovered from one scenario class may transfer to other interactive maneuvers if the underlying biases prove stable across contexts.
- Vehicle planners could embed the recovered CPT value function to generate trajectories that explicitly hedge against probable human misjudgments rather than assuming perfect rationality.
- Population-level differences in the fitted weighting function could support demographic or personalized driver models.
Load-bearing premise
The hierarchical learning algorithm recovers CPT parameters that remain valid outside the particular roundabout-merging dataset used for fitting.
What would settle it
The claim would be undermined if, on an independent set of interactive driving trajectories, the CPT model produced higher prediction error than the neural-network baseline or showed no improvement over the time-to-collision rule.
Original abstract
Understanding human driving behavior is important for autonomous vehicles. In this paper, we propose an interpretable human behavior model in interactive driving scenarios based on the cumulative prospect theory (CPT). As a non-expected utility theory, CPT can well explain some systematically biased or "irrational" behavior/decisions of humans that cannot be explained by the expected utility theory. Hence, the goal of this work is to formulate the human drivers' behavior generation model with CPT so that some "irrational" behavior or decisions of humans can be better captured and predicted. Towards such a goal, we first develop a CPT-driven decision-making model focusing on driving scenarios with two interacting agents. A hierarchical learning algorithm is proposed afterward to learn the utility function, the value function, and the decision weighting function in the CPT model. A case study for roundabout merging is also provided as verification. With real driving data, the prediction performances of three different models are compared: a predefined model based on time-to-collision (TTC), a learning-based model based on neural networks, and the proposed CPT-based model. The results show that the proposed model outperforms the TTC model and achieves performance similar to that of the learning-based model with much less training data and better interpretability.
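The abstract does not spell out the predefined TTC rule. As a rough illustration only, one hypothetical variant compares the constant-speed arrival times of the two vehicles at a shared conflict point and predicts a merge when the margin exceeds a threshold. The function names, the conflict-point framing, and the 3-second threshold are all assumptions, not the paper's baseline.

```python
import math


def time_to_conflict_point(distance_m, speed_mps):
    """Constant-speed time to reach the conflict point; inf if not moving."""
    if speed_mps <= 0.0:
        return math.inf
    return distance_m / speed_mps


def ttc_predicts_merge(ego_dist, ego_speed, other_dist, other_speed,
                       threshold_s=3.0):
    """Predict that the merging (ego) vehicle goes first.

    The rule fires when the circulating vehicle would arrive at the
    conflict point at least threshold_s seconds after the ego vehicle.
    """
    t_ego = time_to_conflict_point(ego_dist, ego_speed)
    if math.isinf(t_ego):
        return False  # a stationary ego vehicle is treated as yielding
    t_other = time_to_conflict_point(other_dist, other_speed)
    return (t_other - t_ego) >= threshold_s


print(ttc_predicts_merge(10.0, 5.0, 60.0, 10.0))  # margin of 4 s
print(ttc_predicts_merge(10.0, 5.0, 30.0, 10.0))  # margin of 1 s
```

A rule of this shape has a single tunable threshold and no notion of risk attitude, which is the contrast the CPT model is meant to address.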
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a Cumulative Prospect Theory (CPT)-based model for human driving behavior in two-agent interactive scenarios. It introduces a hierarchical learning algorithm to recover the value function, utility function, and probability weighting function parameters, then evaluates the model on real roundabout merging trajectories, claiming it outperforms a time-to-collision (TTC) baseline and matches a neural-network model while requiring less training data and providing greater interpretability.
Significance. If the hierarchical procedure reliably recovers generalizable CPT parameters that explain irrational driving choices, the work would supply a useful interpretable alternative to black-box predictors for human-AV interaction. The use of real data and explicit comparison to TTC and NN baselines is a constructive step; however, the absence of identifiability checks or cross-scenario validation limits the strength of the interpretability and generalization claims.
major comments (3)
- [Methodology / hierarchical learning algorithm] The hierarchical learning algorithm (described in the methodology section following the CPT model formulation) is presented without a synthetic-data recovery experiment or identifiability analysis. Because the central empirical claim—that performance gains reflect CPT structure rather than dataset-specific fitting—depends on correct recovery of the value, utility, and weighting parameters from the same roundabout trajectories used for evaluation, this omission is load-bearing.
- [Case study / experimental results] In the case-study verification (roundabout merging experiments), the performance comparison to TTC and NN reports no cross-validation procedure, no statistical significance tests on the reported accuracy differences, and no description of how the CPT parameters were regularized or selected. Without these, it is impossible to determine whether the claimed parity with the NN model (while using less data) is robust or an artifact of the single-dataset fit.
- [Discussion / interpretability claims] The interpretability advantage asserted for the CPT model is not supported by any quantitative metric or concrete example of how the recovered parameters explain specific irrational behaviors observed in the data. This weakens the claim that CPT provides better insight than the NN baseline.
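The synthetic-data recovery experiment requested in the first major comment can be sketched in miniature. The example below is not the authors' hierarchical algorithm: it recovers a single loss-aversion parameter by grid-search maximum likelihood from simulated accept/reject choices over 50/50 gambles, with a logit choice rule, no probability weighting, and a fixed curvature of 0.88. Every name and number in it is an assumption chosen for illustration.

```python
import math
import random

TRUE_LAMBDA = 2.25  # hypothetical ground-truth loss aversion


def value(x, lam, alpha=0.88):
    """Power value function with loss-aversion coefficient lam."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha


def p_accept(gain, loss, lam):
    """Logit probability of accepting a 50/50 gamble over the status quo.

    Probability weighting is omitted: both outcomes get weight 0.5.
    """
    u = 0.5 * value(gain, lam) + 0.5 * value(-loss, lam)
    return 1.0 / (1.0 + math.exp(-u))


def simulate(n, lam, rng):
    """Draw n (gain, loss, accepted) triples from the model with known lam."""
    data = []
    for _ in range(n):
        gain, loss = rng.uniform(1, 10), rng.uniform(1, 10)
        data.append((gain, loss, rng.random() < p_accept(gain, loss, lam)))
    return data


def log_lik(lam, data):
    """Bernoulli log-likelihood of the observed choices under lam."""
    total = 0.0
    for gain, loss, accepted in data:
        p = p_accept(gain, loss, lam)
        total += math.log(p if accepted else 1.0 - p)
    return total


rng = random.Random(0)  # fixed seed so the run is repeatable
data = simulate(2000, TRUE_LAMBDA, rng)
grid = [1.0 + 0.05 * k for k in range(61)]  # candidate values 1.0 .. 4.0
lam_hat = max(grid, key=lambda lam: log_lik(lam, data))
print(f"true lambda = {TRUE_LAMBDA}, recovered = {lam_hat:.2f}")
```

A full check in the spirit of the comment would repeat this over several ground-truth settings and over all value, utility, and weighting parameters at once, reporting how recovery error shrinks with sample size. Parameters that trade off against each other would show up as flat ridges in the likelihood.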
minor comments (2)
- [CPT model formulation] The notation for the reference point and the exact form of the value function should be stated explicitly with an equation number so that readers can reproduce the CPT decision rule without ambiguity.
- [Figures and tables] Figure captions in the experimental section do not indicate the number of trajectories or the train/test split sizes, making it difficult to assess the 'much less training data' claim quantitatively.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the methodology, experiments, and interpretability discussion.
- Referee: [Methodology / hierarchical learning algorithm] The hierarchical learning algorithm (described in the methodology section following the CPT model formulation) is presented without a synthetic-data recovery experiment or identifiability analysis. Because the central empirical claim, that performance gains reflect CPT structure rather than dataset-specific fitting, depends on correct recovery of the value, utility, and weighting parameters from the same roundabout trajectories used for evaluation, this omission is load-bearing.
  Authors: We agree that a synthetic-data recovery experiment would provide stronger validation of the hierarchical algorithm. We will add such an experiment in the revised manuscript, using simulated trajectories generated from known CPT parameters to demonstrate accurate recovery of the value, utility, and weighting functions.
  Revision: yes
- Referee: [Case study / experimental results] In the case-study verification (roundabout merging experiments), the performance comparison to TTC and NN reports no cross-validation procedure, no statistical significance tests on the reported accuracy differences, and no description of how the CPT parameters were regularized or selected. Without these, it is impossible to determine whether the claimed parity with the NN model (while using less data) is robust or an artifact of the single-dataset fit.
  Authors: We acknowledge these gaps in the experimental reporting. The revised case study will include a cross-validation procedure, statistical significance tests on accuracy differences, and explicit details on regularization and parameter selection for the CPT model.
  Revision: yes
- Referee: [Discussion / interpretability claims] The interpretability advantage asserted for the CPT model is not supported by any quantitative metric or concrete example of how the recovered parameters explain specific irrational behaviors observed in the data. This weakens the claim that CPT provides better insight than the NN baseline.
  Authors: The CPT formulation allows direct inspection of parameters to explain behaviors, but we agree concrete support is needed. We will add specific examples from the data in the discussion showing how recovered parameters (e.g., the weighting function) account for observed irrational merging decisions, and include a basic quantitative comparison of model complexity.
  Revision: partial
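The cross-validation and significance testing promised in the second response could take many forms. The sketch below shows one common pairing, assuming per-sample correctness flags are available for two models on the same test cases: contiguous k-fold index splits and an exact two-sided McNemar test on the discordant pairs. The function names are illustrative and nothing here is taken from the paper.

```python
import math


def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous, near-equal test folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def mcnemar_exact(correct_a, correct_b):
    """Two-sided exact McNemar p-value from paired correctness flags.

    Only discordant pairs (one model right, the other wrong) carry
    information; under the null they split 50/50.
    """
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if o and not a)
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree
    tail = sum(math.comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical flags: model A is right on ten cases where model B is wrong.
print(kfold_indices(10, 3))
print(mcnemar_exact([True] * 10, [False] * 10))
```

For trajectory data, folds would normally be formed per interaction rather than per time step so that frames from one encounter never appear in both training and test sets. A large p-value would be consistent with the claimed parity between the CPT and NN models but would not by itself establish it.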
Circularity Check
No significant circularity in the derivation chain
The paper grounds its approach in Cumulative Prospect Theory (an external decision theory) to build a CPT-driven decision model for two-agent interactions, then applies a hierarchical learning procedure to estimate the value function, utility, and weighting parameters from roundabout merging trajectories. Model predictions are compared against an independent predefined TTC baseline and a separate neural-network learner on the same real-world dataset. No equation or step reduces a claimed prediction to its own fitted inputs by construction, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work. The chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- parameters of value function, utility function, and decision weighting function
axioms (1)
- Domain assumption: cumulative prospect theory provides a better description of human decision making under risk than expected utility theory in driving contexts.