Stochastic trajectory prediction with social graph network

Lidan Zhang; Ping Guo; Qi She

Review · 2 major objections · 1 minor · cited by 2

A directed social graph built from positions and velocities plus sequential stochastic sampling yields better pedestrian trajectory predictions in crowds.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.

T0 review · grok-4.3

2026-05-24 17:13 UTC pith:AT4MHS7S

Load-bearing objection: A directed graph for non-symmetric interactions and sequential uncertainty modeling are the concrete design choices, but the abstract supplies zero numbers, so the effectiveness claim can't be checked.

arXiv:1907.10233v1 · pith:AT4MHS7S · submitted 2019-07-24 · cs.CV

classification cs.CV

keywords pedestrian trajectory prediction, social graph, directed graph, stochastic modeling, hierarchical LSTM, crowd behavior, motion forecasting, social interactions

verification ladder T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that existing fully connected models for pedestrian interactions overlook non-symmetric relationships, so a dynamically built directed social graph based on current locations and speed directions can better capture relevant social effects. It further claims that modeling uncertainty sequentially through a learned prior at each time step, rather than all at once, allows more accurate progressive decoding of future paths using hierarchical LSTMs. This combination produces representations that are both destination-oriented and aware of social context. The authors test the resulting network on two public datasets and report gains that are most pronounced in very crowded scenes. A sympathetic reader would care because reliable short-term motion forecasts matter for applications like autonomous navigation where human crowds create complex, asymmetric influences.

Core claim

The paper claims that a directed social graph, constructed dynamically from each pedestrian's current location and speed direction, captures non-symmetric pairwise social relationships, so that a network can aggregate social effects with individual features into destination-oriented representations. Combined with a temporal stochastic method that learns a prior model of uncertainty step by step during interactions and decodes via hierarchical LSTMs, this is claimed to produce trajectory predictions that are more effective than prior approaches, with particular gains on crowded scenes in two public datasets.

What carries the argument

The directed social graph dynamically constructed on timely location and speed direction, which supplies the structure for collecting and accumulating social effects into individual representations before stochastic sequential decoding.
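To make the asymmetry concrete, here is a minimal sketch of one way such a graph could be built. It assumes a simple rule that is not taken from the paper: pedestrian i gets an edge to pedestrian j when j is within a fixed radius and inside a cone around i's direction of motion. The function name `directed_adjacency` and the `radius` and `half_fov_deg` thresholds are hypothetical. The entry convention follows the Figure 2 caption quoted further down (an edge from n_i to n_j exists when the matrix element is 1).

```python
import math


def directed_adjacency(positions, velocities, radius=2.0, half_fov_deg=60.0):
    """Return A with A[i][j] = 1 when j is near i and inside i's view cone.

    The radius and field-of-view thresholds are hypothetical placeholders;
    the paper's actual edge rule is not given in this review.
    """
    n = len(positions)
    cos_limit = math.cos(math.radians(half_fov_deg))
    adjacency = [[0] * n for _ in range(n)]
    for i in range(n):
        vx, vy = velocities[i]
        speed = math.hypot(vx, vy)
        if speed == 0.0:
            continue  # a standing pedestrian has no heading, so no outgoing edges
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist = math.hypot(dx, dy)
            if dist == 0.0 or dist > radius:
                continue
            # Cosine of the angle between i's heading and the direction to j.
            cos_angle = (vx * dx + vy * dy) / (speed * dist)
            if cos_angle >= cos_limit:
                adjacency[i][j] = 1
    return adjacency


# Both pedestrians walk to the right. Pedestrian 0 has pedestrian 1 ahead,
# while pedestrian 1 has pedestrian 0 behind it, outside its view cone.
positions = [(0.0, 0.0), (1.0, 0.0)]
velocities = [(1.0, 0.0), (1.0, 0.0)]
A = directed_adjacency(positions, velocities)
print(A)  # [[0, 1], [0, 0]]
```

The resulting matrix is not symmetric, which is the property a fully connected topology cannot express. Whether a rule of this kind keeps every influential neighbor in a dense scene is exactly what the referee's second major comment asks the authors to show.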

Load-bearing premise

The directed social graph built from locations and speed directions accurately identifies the relevant non-symmetric interactions without missing important ones or adding spurious connections.

What would settle it

An experiment on the same crowded-scene datasets where the proposed method shows no statistically significant improvement over strong fully-connected or non-stochastic baselines would falsify the central effectiveness claim.
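One way to operationalize "no statistically significant improvement" is a paired test on per-trajectory errors. The sketch below uses a two-sided sign-flip permutation test, which is a choice made here for illustration and not something the paper or the review prescribes. The function name, the number of permutations, and the toy error values are all hypothetical.

```python
import random


def paired_permutation_pvalue(errors_baseline, errors_method, n_perm=2000, seed=0):
    """Two-sided sign-flip permutation test on paired per-trajectory errors."""
    if not errors_baseline or len(errors_baseline) != len(errors_method):
        raise ValueError("need two non-empty error lists of equal length")
    diffs = [b - m for b, m in zip(errors_baseline, errors_method)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(flipped) >= observed:
            hits += 1
    # The add-one correction keeps the estimate away from an exact zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic errors, for illustration only (not results from the paper).
baseline_errors = [1.0] * 20
method_errors = [0.5] * 20
p_improved = paired_permutation_pvalue(baseline_errors, method_errors)
p_same = paired_permutation_pvalue(baseline_errors, baseline_errors)
print(p_improved, p_same)
```

A small p-value on real per-trajectory errors would count against the falsifier; a large one on the crowded-scene subsets would support it.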

If this is right

Predictions become more accurate when pairwise influences are treated as directed rather than symmetric or fully connected.
Sequential sampling from a learned prior allows uncertainty to be resolved progressively instead of modeled globally.
Social effects can be accumulated with individual motion features to produce representations that respect both destination intent and crowd context.
The approach shows its largest gains precisely when scene density increases and non-symmetric interactions matter most.
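The second implication, progressive resolution of uncertainty, can be illustrated with a toy rollout loop. This is a sketch under loud assumptions: `toy_prior` and `toy_decoder` are hand-written stand-ins for the learned prior and the hierarchical LSTM decoder, and none of the names or constants come from the paper. The point is only the control flow: a latent is sampled at every step from a prior conditioned on the state so far, and that sample shapes the next step's prior.

```python
import random


def toy_prior(hidden):
    """Stand-in for a learned prior: (mean, std) of a 2-D latent given the state."""
    # Hand-written rule, NOT the paper's network: the mean tracks the last motion.
    return (hidden["vx"], hidden["vy"]), 0.1


def toy_decoder(position, latent):
    """Stand-in for the LSTM decoder: reads the latent as a displacement."""
    return (position[0] + latent[0], position[1] + latent[1])


def rollout(start, velocity, steps, rng):
    """Sample one future path, drawing a fresh latent at every time step."""
    hidden = {"vx": velocity[0], "vy": velocity[1]}
    position = start
    path = []
    for _ in range(steps):
        (mean_x, mean_y), std = toy_prior(hidden)  # prior conditioned on current state
        latent = (rng.gauss(mean_x, std), rng.gauss(mean_y, std))  # sample z_t
        position = toy_decoder(position, latent)  # decode the next position
        hidden = {"vx": latent[0], "vy": latent[1]}  # the sample feeds the next prior
        path.append(position)
    return path


path_a = rollout((0.0, 0.0), (1.0, 0.0), steps=12, rng=random.Random(1))
path_b = rollout((0.0, 0.0), (1.0, 0.0), steps=12, rng=random.Random(2))
print(path_a[-1], path_b[-1])
```

Different seeds give different but plausible paths from the same start. A model that treats uncertainty "all at once" would instead draw a single latent before the loop and reuse it at every step.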

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same graph-construction principle could be tested on multi-agent systems beyond pedestrians, such as vehicle or animal groups, to check whether directionality remains useful.
If the graph misses interactions in certain cultural or environmental settings, retraining the edge-construction rule on domain-specific data would be a direct next experiment.
The method's step-by-step uncertainty handling suggests a possible link to online planning systems that must replan as new observations arrive.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

Directed graph for non-symmetric interactions and sequential uncertainty modeling are the concrete design choices, but the abstract supplies zero numbers so the effectiveness claim can't be checked.

The two main ideas here are a directed social graph built from location and speed to capture non-symmetric interactions, and a temporal stochastic method that learns uncertainty sequentially rather than all at once. Those choices address real limitations in earlier work on pedestrian prediction. The directed graph part is a sensible attempt to move beyond fully connected topologies. Using timely position and direction to decide edges is a practical way to try to model who actually influences whom. The sequential sampling with hierarchical LSTMs for uncertainty also has a clear logic behind it.

The paper claims this works well on two public datasets, particularly in crowded scenes. But the abstract gives no quantitative results, no error bars, no baseline numbers, and no training details. That leaves the central claim without support. The stress-test point about the graph possibly adding spurious edges or missing real ones in dense crowds is a fair concern. The construction rule depends on proximity and direction, which can be noisy, and nothing in the description shows how it avoids that problem.

This paper is for specialists in trajectory prediction who are already thinking about graph-based social models. A reader might get some ideas from the design choices, but without the experiments it's not something to rely on or build from. I would not send it for peer review until the results are added and the evaluation is solid.

Referee Report

2 major / 1 minor

Summary. The paper proposes a pedestrian trajectory prediction approach that constructs a directed social graph dynamically from pedestrian locations and velocity directions to capture non-symmetric interactions, accumulates social effects into individual representations via a dedicated network, models future uncertainty via a temporal stochastic prior learned sequentially during interactions, and decodes predictions using hierarchical LSTMs. It asserts that results on two public datasets demonstrate effectiveness, especially in crowded scenes.

Significance. If the empirical claims are substantiated and the graph construction is shown to avoid systematic errors in interaction topology, the method could contribute to more accurate modeling of asymmetric social influences and sequential uncertainty in dense pedestrian scenarios, with potential relevance to autonomous navigation and surveillance systems.

major comments (2)

[Abstract] Abstract: the central claim that 'experimental results on two public datasets show the effectiveness of our method, especially when predicting trajectories in very crowded scenes' is unsupported by any quantitative metrics, error bars, baseline comparisons, data-split details, or training-procedure information, leaving the primary empirical assertion without visible evidence.
[Method description] Method (directed social graph construction): the edge-selection rule based on proximity and speed direction is asserted to capture non-symmetric pairwise relationships, yet no analysis, sensitivity study, or ablation is supplied to show that the rule neither omits pedestrians with real influence nor adds negligible edges; in dense scenes this topology directly determines the social-effect accumulation step and is therefore load-bearing for the crowded-scene claim.

minor comments (1)

[Abstract] The abstract introduces terms such as 'temporal stochastic method' and 'hierarchical LSTMs' without a concise one-sentence definition or pointer to the relevant equations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below and indicate the revisions we will make.

Referee: [Abstract] Abstract: the central claim that 'experimental results on two public datasets show the effectiveness of our method, especially when predicting trajectories in very crowded scenes' is unsupported by any quantitative metrics, error bars, baseline comparisons, data-split details, or training-procedure information, leaving the primary empirical assertion without visible evidence.

Authors: The abstract is a high-level summary; the quantitative metrics (ADE/FDE), error bars, baseline comparisons, data splits, and training details are provided in the Experiments section. To address the concern, we will revise the abstract to include specific numerical results and key comparisons that substantiate the effectiveness claim, especially for crowded scenes. revision: partial

Referee: [Method description] Method (directed social graph construction): the edge-selection rule based on proximity and speed direction is asserted to capture non-symmetric pairwise relationships, yet no analysis, sensitivity study, or ablation is supplied to show that the rule neither omits pedestrians with real influence nor adds negligible edges; in dense scenes this topology directly determines the social-effect accumulation step and is therefore load-bearing for the crowded-scene claim.

Authors: We agree that explicit validation of the edge-selection rule is needed. In the revision we will add an ablation study and sensitivity analysis on the proximity and direction thresholds, reporting their effect on prediction error in dense scenes to demonstrate that influential pedestrians are retained while negligible edges are avoided. revision: yes
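Both responses lean on displacement-error metrics: the first names ADE/FDE, and the second promises to report prediction error across thresholds. As a minimal sketch, assuming the standard definitions (ADE is the mean Euclidean distance between predicted and true positions over all predicted steps, FDE is that distance at the final step), the two can be computed as below. The function name and the toy trajectories are illustrative and are not data from the paper.

```python
import math


def ade_fde(predicted, ground_truth):
    """Return (ADE, FDE) for one trajectory given matching lists of (x, y) points."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("need two non-empty trajectories of equal length")
    distances = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    average_error = sum(distances) / len(distances)
    final_error = distances[-1]
    return average_error, final_error


# Toy trajectories: the prediction drifts 0, 1, and 2 units away from the truth.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = ade_fde(predicted, ground_truth)
print(ade, fde)  # 1.0 2.0
```

A sensitivity study of the kind promised would repeat this computation for each setting of the proximity and direction thresholds and compare the resulting errors on dense scenes.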

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes a directed social graph dynamically built from location/speed, a social-effect accumulation network, and a temporal stochastic prior decoded via hierarchical LSTMs. Claims rest on experimental results on public datasets rather than any derivation that reduces by the paper's own equations to fitted quantities or self-citation chains. No load-bearing step matches the enumerated circularity patterns; the architecture and evaluation are independent of the target predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract states no explicit free parameters, axioms, or invented entities; it describes the social graph and stochastic prior as architectural choices, without numerical fitting details or stated background assumptions.

pith-pipeline@v0.9.0 · 5685 in / 1194 out tokens · 24800 ms · 2026-05-24T17:13:51.129597+00:00 · methodology

Original abstract

Pedestrian trajectory prediction is a challenging task because of the complexity of real-world human social behaviors and uncertainty of the future motion. For the first issue, existing methods adopt fully connected topology for modeling the social behaviors, while ignoring non-symmetric pairwise relationships. To effectively capture social behaviors of relevant pedestrians, we utilize a directed social graph which is dynamically constructed on timely location and speed direction. Based on the social graph, we further propose a network to collect social effects and accumulate with individual representation, in order to generate destination-oriented and social-aware representations. For the second issue, instead of modeling the uncertainty of the entire future as a whole, we utilize a temporal stochastic method for sequentially learning a prior model of uncertainty during social interactions. The prediction on the next step is then generated by sampling on the prior model and progressively decoding with a hierarchical LSTMs. Experimental results on two public datasets show the effectiveness of our method, especially when predicting trajectories in very crowded scenes.

Figures

Figures reproduced from arXiv: 1907.10233 by Lidan Zhang, Ping Guo, Qi She.

**Figure 2.** An example of social graph. Text extracted alongside the caption: E_t represents the set of directed edges determined by adjacency matrix A_t. An edge from node n_i to node n_j exists when the element in the adjacency matrix (a_ij,t) equals 1. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png]

**Figure 3.** Illustration of the prediction trajectories with [PITH_FULL_IMAGE:figures/full_fig_p007_3.png]

**Figure 4.** Example of diverse predictions from our model. (a) [PITH_FULL_IMAGE:figures/full_fig_p007_4.png]

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Dynamic Scene Interaction Reasoning Framework for Scene-level Lane-Change Intention and Trajectory Prediction of Multiple Interacting Vehicles
cs.AI · 2026-07 · conditional · novelty 5.5

A dynamic graph-attention model jointly predicts every nearby vehicle's lane-change intention and trajectory, cutting trajectory error by up to ~53% and improving scene coherence on NGSIM and highD.

Exploitation of Hidden Context in Dynamic Movement Forecasting: A Neural Network Journey from Recurrent to Graph Neural Networks and General Purpose Transformers
cs.LG · 2026-05 · unverdicted · novelty 2.0

An empirical comparison of LSTM, GNN, and Transformer architectures for NBA trajectory forecasting finds that a hybrid LSTM with contextual information yields the lowest FDE, 1.51 m, over horizons up to 2 s.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · cited by 2 Pith papers · 2 internal anchors

[1] Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, and Silvio Savarese. Social LSTM: Human trajectory prediction in crowded spaces. In CVPR, June 2016.

[2] Mohammad Babaeizadeh, Chelsea Finn, Dumitru Erhan, Roy H. Campbell, and Sergey Levine. Stochastic variational video prediction. In ICLR, 2018.

[3] Stefan Becker, Ronny Hug, Wolfgang Hübner, and Michael Arens. An evaluation of trajectory prediction approaches and notes on the TrajNet benchmark. arXiv:1805.07663, 2018.

[4] Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C. Courville, and Yoshua Bengio. A recurrent latent variable model for sequential data. In NIPS, pages 2980–2988, 2015.

[5] Emily Denton and Rob Fergus. Stochastic video generation with a learned prior. In ICML, pages 1174–1183, 2018.

[6] Panna Felsen, Patrick Lucey, and Sujoy Ganguly. Where will they go? Predicting fine-grained adversarial multi-agent motion using conditional variational autoencoders. In ECCV, pages 761–776, 2018.

[7] Marco Fraccaro, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. Sequential neural models with stochastic layers. In NIPS, pages 2207–2215, 2016.

[8] Anirudh Goyal, Alessandro Sordoni, Marc-Alexandre Côté, Nan Rosemary Ke, and Yoshua Bengio. Z-forcing: Training stochastic recurrent networks. In NIPS, pages 6716–6726, 2017.

[9] Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. Social GAN: Socially acceptable trajectories with generative adversarial networks. In CVPR, June 2018.

[10] Dirk Helbing and Peter Molnar. Social force model for pedestrian dynamics. In Physical Review E, volume 51, pages 4282–4286, 1995.

[11] Namhoon Lee, Wongun Choi, Paul Vernaza, Christopher B. Choy, Philip H. S. Torr, and Manmohan Chandraker. DESIRE: Distant future prediction in dynamic scenes with interacting agents. In CVPR, July 2017.

[12] Alon Lerner, Yiorgos Chrysanthou, and Dani Lischinski. Crowds by example. Computer Graphics Forum, 26(3):655–664, 2007.

[13] Yaguang Li, Rose Yu, Cyrus Shahabi, and Yan Liu. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In ICLR, 2018.

[14] Yuexin Ma, Xinge Zhu, Sibo Zhang, Ruigang Yang, Wenping Wang, and Dinesh Manocha. TrafficPredict: Trajectory prediction for heterogeneous traffic-agents. In AAAI, 2019.

[15] Hyun Soo Park, Jyh-Jing Hwang, Yedong Niu, and Jianbo Shi. Egocentric future localization. In CVPR, 2016.

[16] Stefano Pellegrini, Andreas Ess, Konrad Schindler, and Luc J. Van Gool. You'll never walk alone: Modeling social behavior for multi-target tracking. In ICCV, pages 261–268, 2009.

[17] Stefano Pellegrini, Andreas Ess, Marko Tanaskovic, and Luc Van Gool. Wrong turn - no dead end: A stochastic pedestrian motion model. In International Workshop on Socially Intelligent Surveillance and Monitoring, pages 15–22, 2010.

[18] Stefano Pellegrini, Andreas Ess, and Luc Van Gool. Improving data association by joint modeling of pedestrian trajectories and groupings. In ECCV, pages 452–465, 2010.

[19] Nick Rhinehart, Paul Vernaza, and Kris Kitani. R2P2: A reparameterized pushforward policy for diverse, precise generative path forecasting. In ECCV, pages 794–811, 2018.

[20] Amir Sadeghian, Vineet Kosaraju, Ali Sadeghian, Noriaki Hirose, and Silvio Savarese. SoPhie: An attentive GAN for predicting paths compliant to social and physical constraints. arXiv:1806.01482, 2018.

[21] Kihyuk Sohn, Honglak Lee, and Xinchen Yan. Learning structured output representation using deep conditional generative models. In NIPS, 2015.

[22] Hang Su, Jun Zhu, Yinpeng Dong, and Bo Zhang. Forecast the plausible paths in crowd scenes. In IJCAI, pages 2772–2778, 2017.

[23] Chen Sun, Per Karlsson, Jiajun Wu, Joshua B. Tenenbaum, and Kevin Murphy. Stochastic prediction of multi-agent interactions from partial observations. In ICLR, 2019.

[24] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.

[25] Anirudh Vemula, Katharina Muelling, and Jean Oh. Social attention: Modeling attention in human crowds. In ICRA, May 2018.

[26] Jacob Walker, Carl Doersch, Abhinav Gupta, and Martial Hebert. An uncertain future: Forecasting from static images using variational autoencoders. In ECCV, 2016.

[27] Yanyu Xu, Zhixin Piao, and Shenghua Gao. Encoding crowd interaction with deep neural network for pedestrian trajectory prediction. In CVPR, June 2018.

[28] Takuma Yagi, Karttikeya Mangalam, Ryo Yonetani, and Yoichi Sato. Future person localization in first-person videos. In CVPR, 2018.

[29] Pu Zhang, Wanli Ouyang, Pengfei Zhang, Jianru Xue, and Nanning Zheng. SR-LSTM state refinement for pedestrian trajectory prediction. In CVPR, 2019.

[30] Haosheng Zou, Hang Su, Shihong Song, and Jun Zhu. Understanding human behaviors in crowds by imitating the decision-making process. In AAAI, pages 7648–7656, 2018.
