
arxiv: 1907.08707 · v1 · pith:A2NSH6ZQnew · submitted 2019-07-19 · 💻 cs.AI · cs.RO · cs.SY · eess.SY

Interpretable Modelling of Driving Behaviors in Interactive Driving Scenarios based on Cumulative Prospect Theory

Pith reviewed 2026-05-24 18:58 UTC · model grok-4.3

classification 💻 cs.AI · cs.RO · cs.SY · eess.SY
keywords cumulative prospect theory · human driving behavior · interactive driving scenarios · interpretable modeling · roundabout merging · decision weighting function · autonomous vehicle prediction

The pith

A cumulative prospect theory model of human driving behavior in interactive scenarios outperforms a time-to-collision baseline and matches a neural-network predictor with less training data and better interpretability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decision-making model for two-agent driving interactions that replaces expected-utility calculations with cumulative prospect theory to account for documented human biases such as loss aversion and distorted probability weighting. A hierarchical learning procedure fits the utility function, value function, and weighting function directly to trajectory data. The resulting model is tested on real roundabout-merging recordings and compared against a simple time-to-collision rule and a neural-network predictor. If the CPT formulation holds, autonomous-vehicle planners gain an explicit, data-efficient way to anticipate the non-rational choices drivers actually make.

Core claim

The authors cast driver decisions as the maximization of a cumulative-prospect value that combines a nonlinear value function and a nonlinear decision-weighting function. Parameters recovered from real merging data yield lower prediction error than a time-to-collision baseline and error comparable to a neural-network model, while requiring far fewer training examples and exposing the recovered utility and weighting curves for inspection.
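As an illustration of how a cumulative-prospect value is assembled, the following minimal Python sketch scores a discrete prospect. It assumes the parametric forms and median parameter estimates reported by Tversky and Kahneman (1992), which are not necessarily the forms or values fitted in the paper. The function names `value`, `weight`, and `cpt_value` are hypothetical.

```python
from typing import Sequence


def value(u: float, u0: float = 0.0, alpha: float = 0.88,
          beta: float = 0.88, lam: float = 2.25) -> float:
    """Power value function around reference point u0 (assumed form).

    lam > 1 makes losses weigh more than equally sized gains.
    """
    x = u - u0
    return x ** alpha if x >= 0 else -lam * (-x) ** beta


def weight(p: float, gamma: float) -> float:
    """Probability weighting with w(0) = 0 and w(1) = 1 (assumed form)."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def cpt_value(outcomes: Sequence[float], probs: Sequence[float],
              u0: float = 0.0, gamma_gain: float = 0.61,
              gamma_loss: float = 0.69) -> float:
    """CPT value of a discrete prospect; probs are assumed to sum to 1."""
    pairs = sorted(zip(outcomes, probs))  # ascending by outcome
    total = 0.0

    # Losses: accumulate probability starting from the worst outcome.
    cum = 0.0
    for u, p in pairs:
        if u >= u0:
            break
        pi = weight(cum + p, gamma_loss) - weight(cum, gamma_loss)
        total += pi * value(u, u0)
        cum += p

    # Gains: accumulate probability starting from the best outcome.
    cum = 0.0
    for u, p in reversed(pairs):
        if u < u0:
            break
        pi = weight(cum + p, gamma_gain) - weight(cum, gamma_gain)
        total += pi * value(u, u0)
        cum += p

    return total
```

Under these assumed parameters, `cpt_value([10, -10], [0.5, 0.5])` is negative even though the expected outcome is zero, which is the loss-aversion effect the model is meant to capture.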

What carries the argument

The CPT-driven decision-making model together with its hierarchical learning algorithm that jointly recovers the utility function, value function, and decision-weighting function from observed trajectories.

If this is right

  • Interactive driving decisions can be generated by maximizing a prospect-theory value that incorporates loss aversion and probability distortion.
  • Accurate prediction of human maneuvers is possible with substantially smaller training sets than those needed by neural networks.
  • The explicit value and weighting functions allow direct examination of which behavioral biases explain observed choices.
  • The same CPT structure applies to any two-agent interaction once the functions have been fitted to representative data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Parameters recovered from one scenario class may transfer to other interactive maneuvers if the underlying biases prove stable across contexts.
  • Vehicle planners could embed the recovered CPT value function to generate trajectories that explicitly hedge against probable human misjudgments rather than assuming perfect rationality.
  • Population-level differences in the fitted weighting function could support demographic or personalized driver models.

Load-bearing premise

The hierarchical learning algorithm recovers CPT parameters that remain valid outside the particular roundabout-merging dataset used for fitting.

What would settle it

On an independent set of interactive driving trajectories the CPT model produces higher prediction error than the neural-network baseline or no improvement over the time-to-collision rule.

Figures

Figures reproduced from arXiv: 1907.08707 by Liting Sun, Masayoshi Tomizuka, Wei Zhan, Yeping Hu.

Figure 1: Examples of the value function and weighting function, where w±: [0, 1] → [0, 1] are both strictly increasing functions with w+(0) = w−(0) = 0 and w+(1) = w−(1) = 1. Typically, the value function v(u) is convex when u ≥ u0 (gains) and concave when u < u0 (losses), and it is steeper for losses than for gains.
Figure 3: The reference paths (a) and trajectories in Frenet frame (b). The crossing points on the pair of reference paths define the common reference zero points for two interactive cars. B. Comparison models: We compared the decision prediction performance among three different models: 1) a predefined TTC rule-based model, 2) a learning-based neural network model, and 3) the proposed CPT model. A brief introductio…
Figure 2: (a), the roundabout has 6 branches and each branch has two directions (both in and out). We selected the interactive motions of two cars at the left-most branch (…
Figure 4: (b) shows the optimal yielding trajectory of the target vehicle (cyan) and the ground truth trajectory of the interacting vehicle (blue).
Figure 5: The learned decision weighting function (red curve) … theory (CPT). To learn the model parameters from real driving data, a hierarchical learning algorithm was also developed, in which inverse reinforcement learning and nonlinear logistic regression were combined. Comparison studies were conducted among three different models: a predefined TTC model, a neural network (NN) based learning model, and the propo…
Original abstract

Understanding human driving behavior is important for autonomous vehicles. In this paper, we propose an interpretable human behavior model in interactive driving scenarios based on the cumulative prospect theory (CPT). As a non-expected utility theory, CPT can well explain some systematically biased or "irrational" behavior/decisions of humans that cannot be explained by the expected utility theory. Hence, the goal of this work is to formulate the human drivers' behavior generation model with CPT so that some "irrational" behavior or decisions of humans can be better captured and predicted. Towards such a goal, we first develop a CPT-driven decision-making model focusing on driving scenarios with two interacting agents. A hierarchical learning algorithm is proposed afterward to learn the utility function, the value function, and the decision weighting function in the CPT model. A case study for roundabout merging is also provided as verification. With real driving data, the prediction performances of three different models are compared: a predefined model based on time-to-collision (TTC), a learning-based model based on neural networks, and the proposed CPT-based model. The results show that the proposed model outperforms the TTC model and achieves performance similar to the learning-based model with much less training data and better interpretability.
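The abstract names a predefined TTC baseline without giving its rule on this page. As a rough sketch of what such a rule could look like, the Python below compares the times at which two cars would reach a shared conflict point and lets the merging car pass only if it arrives sufficiently earlier. The function names, the constant-speed assumption, and the 3-second threshold are all hypothetical and are not taken from the paper.

```python
import math


def time_to_point(distance_m: float, speed_mps: float) -> float:
    """Time to reach the conflict point at constant speed (assumption)."""
    if speed_mps <= 0.0:
        return math.inf  # a stopped car never arrives
    return distance_m / speed_mps


def ttc_decision(d_ego: float, v_ego: float,
                 d_other: float, v_other: float,
                 threshold_s: float = 3.0) -> str:
    """Return "pass" or "yield" for the merging (ego) car.

    threshold_s is a hypothetical safety margin, not a value from the paper.
    """
    t_ego = time_to_point(d_ego, v_ego)
    t_other = time_to_point(d_other, v_other)
    # Pass only when ego clears the conflict point well before the other car.
    # If both cars are stopped the difference is NaN, which also yields.
    return "pass" if t_other - t_ego >= threshold_s else "yield"
```

A fixed threshold of this kind has no free parameters to learn, which is what makes it a useful lower-bound comparison for the learned models.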

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a Cumulative Prospect Theory (CPT)-based model for human driving behavior in two-agent interactive scenarios. It introduces a hierarchical learning algorithm to recover the value function, utility function, and probability weighting function parameters, then evaluates the model on real roundabout merging trajectories, claiming it outperforms a time-to-collision (TTC) baseline and matches a neural-network model while requiring less training data and providing greater interpretability.

Significance. If the hierarchical procedure reliably recovers generalizable CPT parameters that explain irrational driving choices, the work would supply a useful interpretable alternative to black-box predictors for human-AV interaction. The use of real data and explicit comparison to TTC and NN baselines is a constructive step; however, the absence of identifiability checks or cross-scenario validation limits the strength of the interpretability and generalization claims.

major comments (3)
  1. [Methodology / hierarchical learning algorithm] The hierarchical learning algorithm (described in the methodology section following the CPT model formulation) is presented without a synthetic-data recovery experiment or identifiability analysis. Because the central empirical claim—that performance gains reflect CPT structure rather than dataset-specific fitting—depends on correct recovery of the value, utility, and weighting parameters from the same roundabout trajectories used for evaluation, this omission is load-bearing.
  2. [Case study / experimental results] In the case-study verification (roundabout merging experiments), the performance comparison to TTC and NN reports no cross-validation procedure, no statistical significance tests on the reported accuracy differences, and no description of how the CPT parameters were regularized or selected. Without these, it is impossible to determine whether the claimed parity with the NN model (while using less data) is robust or an artifact of the single-dataset fit.
  3. [Discussion / interpretability claims] The interpretability advantage asserted for the CPT model is not supported by any quantitative metric or concrete example of how the recovered parameters explain specific irrational behaviors observed in the data. This weakens the claim that CPT provides better insight than the NN baseline.
minor comments (2)
  1. [CPT model formulation] The notation for the reference point and the exact form of the value function should be stated explicitly with an equation number so that readers can reproduce the CPT decision rule without ambiguity.
  2. [Figures and tables] Figure captions in the experimental section do not indicate the number of trajectories or the train/test split sizes, making it difficult to assess the 'much less training data' claim quantitatively.
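The cross-validation and significance testing requested in major comment 2 could be set up along the following lines. This is a minimal standard-library sketch, assuming binary yield/pass labels and paired predictions from two models on the same test cases. It uses an exact McNemar test as one reasonable choice for paired classifiers; the function names are hypothetical and nothing here reflects the authors' actual protocol.

```python
import math
import random
from typing import List, Sequence, Tuple


def kfold_indices(n: int, k: int, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle indices 0..n-1 and return k (train, test) index splits."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(fold)), sorted(fold)) for fold in folds]


def mcnemar_exact(y_true: Sequence[int], pred_a: Sequence[int],
                  pred_b: Sequence[int]) -> float:
    """Two-sided exact McNemar p-value for two models' paired predictions."""
    # b: model A right, model B wrong; c: model A wrong, model B right.
    b = sum(a == t and m != t for t, a, m in zip(y_true, pred_a, pred_b))
    c = sum(a != t and m == t for t, a, m in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        return 1.0  # the models never disagree on correctness
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

In use, each model would be refit on every training split, its predictions on the held-out folds pooled, and `mcnemar_exact` applied to the pooled CPT and NN predictions. A large p-value would support the parity claim only weakly, since failing to reject is not evidence of equivalence.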

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript accordingly to strengthen the methodology, experiments, and interpretability discussion.

  1. Referee: [Methodology / hierarchical learning algorithm] The hierarchical learning algorithm (described in the methodology section following the CPT model formulation) is presented without a synthetic-data recovery experiment or identifiability analysis. Because the central empirical claim—that performance gains reflect CPT structure rather than dataset-specific fitting—depends on correct recovery of the value, utility, and weighting parameters from the same roundabout trajectories used for evaluation, this omission is load-bearing.

    Authors: We agree that a synthetic-data recovery experiment would provide stronger validation of the hierarchical algorithm. We will add such an experiment in the revised manuscript, using simulated trajectories generated from known CPT parameters to demonstrate accurate recovery of the value, utility, and weighting functions. revision: yes

  2. Referee: [Case study / experimental results] In the case-study verification (roundabout merging experiments), the performance comparison to TTC and NN reports no cross-validation procedure, no statistical significance tests on the reported accuracy differences, and no description of how the CPT parameters were regularized or selected. Without these, it is impossible to determine whether the claimed parity with the NN model (while using less data) is robust or an artifact of the single-dataset fit.

    Authors: We acknowledge these gaps in the experimental reporting. The revised case study will include a cross-validation procedure, statistical significance tests on accuracy differences, and explicit details on regularization and parameter selection for the CPT model. revision: yes

  3. Referee: [Discussion / interpretability claims] The interpretability advantage asserted for the CPT model is not supported by any quantitative metric or concrete example of how the recovered parameters explain specific irrational behaviors observed in the data. This weakens the claim that CPT provides better insight than the NN baseline.

    Authors: The CPT formulation allows direct inspection of parameters to explain behaviors, but we agree concrete support is needed. We will add specific examples from the data in the discussion showing how recovered parameters (e.g., the weighting function) account for observed irrational merging decisions, and include a basic quantitative comparison of model complexity. revision: partial
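The synthetic-data recovery experiment promised in the first response can be illustrated in miniature. The sketch below is deliberately simplified and is not the paper's hierarchical algorithm: it assumes a linear value function, undistorted 50/50 probabilities, a logistic choice rule, and a single unknown loss-aversion parameter `lam`, which it recovers by grid-search maximum likelihood from simulated accept/reject choices. All names and numeric ranges are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Trial = Tuple[float, float, bool]  # (gain, loss, accepted)


def prospect_score(gain: float, loss: float, lam: float) -> float:
    """Score of a 50/50 gamble versus the status quo (simplified model)."""
    return 0.5 * gain - 0.5 * lam * loss


def simulate(n: int, lam_true: float, seed: int = 0) -> List[Trial]:
    """Generate choices from a logistic rule with a known lam_true."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        gain, loss = rng.uniform(1, 10), rng.uniform(1, 10)
        p_accept = 1.0 / (1.0 + math.exp(-prospect_score(gain, loss, lam_true)))
        data.append((gain, loss, rng.random() < p_accept))
    return data


def neg_log_lik(data: Sequence[Trial], lam: float) -> float:
    """Negative log-likelihood of the observed choices under lam."""
    nll = 0.0
    for gain, loss, accepted in data:
        v = prospect_score(gain, loss, lam)
        # -log sigmoid(v) if accepted, -log(1 - sigmoid(v)) otherwise.
        nll += math.log1p(math.exp(-v if accepted else v))
    return nll


def recover(data: Sequence[Trial], grid: Sequence[float]) -> float:
    """Grid-search maximum-likelihood estimate of lam."""
    return min(grid, key=lambda lam: neg_log_lik(data, lam))


GRID = [0.5 + 0.05 * i for i in range(71)]  # candidate lam values, 0.5 to 4.0
```

Calling `recover(simulate(2000, 2.0), GRID)` should return a value close to 2.0. A full version of the check would repeat this for the value, utility, and weighting parameters jointly and report how recovery error changes with sample size, which also speaks to identifiability.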

Circularity Check

0 steps flagged

No significant circularity in the derivation chain


The paper grounds its approach in Cumulative Prospect Theory (an external decision theory) to build a CPT-driven decision model for two-agent interactions, then applies a hierarchical learning procedure to estimate the value function, utility, and weighting parameters from roundabout merging trajectories. Model predictions are compared against an independent predefined TTC baseline and a separate neural-network learner on the same real-world dataset. No equation or step reduces a claimed prediction to its own fitted inputs by construction, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled via prior work. The chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the premise that CPT parameters can be learned to represent human driving decisions and that these parameters generalize; the paper introduces no new physical entities but fits several functional forms to data.

free parameters (1)
  • parameters of value function, utility function, and decision weighting function
    These are learned via the hierarchical algorithm from the roundabout merging dataset and directly determine the model's predictions.
axioms (1)
  • domain assumption: Cumulative prospect theory provides a better description of human decision making under risk than expected utility theory in driving contexts
    Invoked in the motivation and model formulation sections to justify replacing standard utility with CPT.
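
For orientation, one widely used parameterization of the value and weighting functions is the one given by Tversky and Kahneman (1992). This page does not state which functional forms the paper fits, so the symbols α, β, λ, γ, and δ below are assumptions used only to show what kind of quantities the "free parameters" entry refers to.

```latex
\begin{align*}
v(u) &=
  \begin{cases}
    (u - u_0)^{\alpha}, & u \ge u_0 \\
    -\lambda\,(u_0 - u)^{\beta}, & u < u_0
  \end{cases} \\[4pt]
w^{+}(p) &= \frac{p^{\gamma}}{\bigl(p^{\gamma} + (1-p)^{\gamma}\bigr)^{1/\gamma}},
\qquad
w^{-}(p) = \frac{p^{\delta}}{\bigl(p^{\delta} + (1-p)^{\delta}\bigr)^{1/\delta}}
\end{align*}
```

Under this parameterization the ledger's single entry would expand to five scalars (α, β, λ, γ, δ) plus whatever weights define the utility function u, so the count of fitted quantities is larger than "1" suggests.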


