Seahorse: A Unified Benchmarking Framework for Spatiotemporal Event Modeling

Gerrit Großmann; Sebastian Vollmer; Yahya Aalaila

arXiv: 2607.01022 · v1 · pith:PWBBKTQXnew · submitted 2026-07-01 · cs.LG

Pith reviewed 2026-07-02 15:58 UTC · model grok-4.3

classification: cs.LG

keywords: spatiotemporal point processes · neural STPP · benchmarking framework · encode-evolve-decode · HawkesNest · inductive bias · raw-coordinate likelihood

The pith

SEAHORSE unifies neural spatiotemporal point process models under a shared encode-evolve-decode interface and single benchmark protocol.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SEAHORSE as a framework that structures neural models for spatiotemporal point processes around a common encode-evolve-decode interface. This structure allows every model family to be trained, tuned, and evaluated with identical preprocessing, splits, and raw-coordinate likelihood reporting. The resulting protocol supports direct comparisons and controlled experiments that vary event pattern complexity. The authors supply a companion synthetic suite called HawkesNest to perform those experiments and demonstrate how complexity reveals differing inductive biases across model types. Consistent evaluation protocols matter for applications such as mobility tracking and epidemiology where model reliability depends on knowing which inductive assumptions hold.

Core claim

SEAHORSE formalizes neural STPPs through a common encode-evolve-decode interface and trains, tunes, and evaluates every model family under a single executable benchmark protocol with raw-coordinate likelihood reporting. This enables fair comparisons but, more importantly, controlled diagnostic studies. We pair SEAHORSE with HawkesNest, a synthetic stress-test suite, and show that increasing event-pattern complexity exposes each family's inductive bias, degrading some models sharply and leaving others stable.

What carries the argument

An encode-evolve-decode interface that standardizes the representation of intensity models, conditional density models, latent dynamics, flow decoders, and score-based generators, so that all of them can be trained and evaluated uniformly.
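To make the shape of such a contract concrete, here is a minimal Python sketch. It is not SEAHORSE's actual API: the class names `EncodeEvolveDecode` and `ToyModel`, the method signatures, and the toy exponential-plus-Gaussian model are all assumptions chosen for illustration.

```python
import math
from abc import ABC, abstractmethod
from typing import List, Sequence, Tuple

Event = Tuple[float, float, float]  # (t, x, y) in raw coordinates


class EncodeEvolveDecode(ABC):
    """Hypothetical three-stage contract (illustrative, not the real API)."""

    @abstractmethod
    def encode(self, history: Sequence[Event]) -> List[float]:
        """Summarize past events into a state vector."""

    @abstractmethod
    def evolve(self, state: List[float], dt: float) -> List[float]:
        """Advance the state across the gap to the next event."""

    @abstractmethod
    def decode(self, state: List[float], event: Event) -> float:
        """Return the log-density of the next event given the state."""

    def log_likelihood_next(self, history: Sequence[Event], event: Event) -> float:
        # Shared evaluation path: every model family goes through the same steps.
        state = self.encode(history)
        last_t = history[-1][0] if history else 0.0
        state = self.evolve(state, event[0] - last_t)
        return self.decode(state, event)


class ToyModel(EncodeEvolveDecode):
    """Exponential waiting time, isotropic Gaussian around the mean past location."""

    def __init__(self, rate: float = 1.0, sigma: float = 1.0):
        self.rate = rate
        self.sigma = sigma

    def encode(self, history):
        if not history:
            return [0.0, 0.0]
        n = len(history)
        return [sum(e[1] for e in history) / n, sum(e[2] for e in history) / n]

    def evolve(self, state, dt):
        # Carry the location summary forward and record the elapsed time.
        return [state[0], state[1], dt]

    def decode(self, state, event):
        mx, my, dt = state
        _, x, y = event
        log_time = math.log(self.rate) - self.rate * dt
        sq_dist = (x - mx) ** 2 + (y - my) ** 2
        log_space = -sq_dist / (2 * self.sigma ** 2) - math.log(2 * math.pi * self.sigma ** 2)
        return log_time + log_space


model = ToyModel()
history = [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
ll = model.log_likelihood_next(history, (2.0, 1.0, 0.0))
```

The point of the sketch is only that the benchmark code calls `log_likelihood_next` and never needs to know which family sits behind the three methods. Whether every published family fits such a contract without changes is exactly the premise the review questions.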

If this is right

All recent neural STPP families become directly comparable under identical training and evaluation conditions.
Diagnostic experiments can isolate which model families remain stable as synthetic event patterns grow more complex.
Raw-coordinate likelihood reporting removes hidden differences introduced by coordinate normalization choices.
HawkesNest provides a reproducible way to measure how inductive biases manifest under controlled stress.
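The raw-coordinate point can be illustrated with the standard change-of-variables rule: if a model is fitted on normalized coordinates z = (x - shift) / scale, its log-density in raw coordinates is the normalized log-density minus the sum of log scale over the normalized coordinates. The sketch below checks this on a one-dimensional Gaussian. The function names and the numbers are assumptions for illustration and are not taken from the paper or its code.

```python
import math


def gaussian_logpdf(value: float, mean: float, std: float) -> float:
    """Log-density of a univariate Gaussian."""
    return -0.5 * ((value - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def normalized_to_raw_logpdf(logp_z: float, scales) -> float:
    """Convert a log-density over z = (x - shift) / scale back to raw x.

    Each normalized coordinate contributes a -log(scale) Jacobian term.
    """
    return logp_z - sum(math.log(s) for s in scales)


# Hypothetical raw coordinate: x ~ N(100, 20^2), so z = (x - 100) / 20 ~ N(0, 1).
shift, scale = 100.0, 20.0
x = 130.0
z = (x - shift) / scale

logp_z = gaussian_logpdf(z, 0.0, 1.0)              # what a normalized model reports
logp_raw_direct = gaussian_logpdf(x, shift, scale)  # density evaluated in raw space
logp_raw_converted = normalized_to_raw_logpdf(logp_z, [scale])

# The gap between the two reporting conventions is log(scale) per coordinate.
gap = logp_z - logp_raw_direct
```

Two papers that report `logp_z` under different scales would differ by this constant even with identical models, which is the kind of hidden offset that a raw-coordinate convention avoids.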

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practitioners could use the framework outputs to select models for domain-specific tasks such as disease spread forecasting.
The shared interface may make it easier to combine components from different model families into new hybrids.
Wider adoption of the protocol could reduce the number of incomparable results appearing in follow-on papers.

Load-bearing premise

The encode-evolve-decode interface can represent every recent neural STPP family without forcing architectural compromises that alter their original behavior.

What would settle it

A demonstration that at least one published neural STPP family cannot be expressed through the encode-evolve-decode interface without changing its likelihood computation or performance relative to its original implementation.

Figures

Figures reproduced from arXiv: 2607.01022 by Gerrit Großmann, Sebastian Vollmer, Yahya Aalaila.

**Figure 1.** Overview of SEAHORSE. The framework takes fixed event datasets, model presets, and benchmark configuration as inputs, runs heterogeneous STPP models under a common contract, and returns comparable performance metrics, selected configurations, and reproducible artifacts.

**Figure 2.** Unified model decomposition for neural STPPs. Every method in our benchmark instantiates …

**Figure 3.** (no caption extracted)

**Figure 4.** Learning dynamics on the HawkesNest entanglement suite. Each panel reports test NLL as …

**Figure 5.** Post-NLL diagnostics on the HawkesNest entanglement suite. Panel (a) reports temporal CRPS across entanglement levels. Panel (b) reports ground-truth intensity correlation for models with well-defined surface estimates. Shaded bands denote seed variability.

**Figure 6.** SEAHORSE software architecture. The CLI resolves schema-validated configuration objects, dataset adapters expose raw event splits, preset registries construct UnifiedSTPP models, and runner/evaluation layers write structured artifacts. The architecture separates configuration, data resolution, model construction, execution, evaluation, and artifact recording.

**Figure 7.** Additional learning-budget curves on the HawkesNest entanglement suite. Each panel …

**Figure 8.** Autoregressive rollout coherence on the HawkesNest entanglement suite. Curves show …

**Figure 9.** Median successful-run wall-clock training time on real datasets and HawkesNest Suite 3.

Original abstract

Spatiotemporal point processes (STPPs) model event data in continuous time and space, with applications in mobility, epidemiology, and public safety. Recent neural STPPs span expressive intensity models, conditional density models, continuous-time latent dynamics, normalizing-flow spatial decoders, and score-based generative mechanisms. Yet comparison remains fragile because implementations differ in preprocessing, coordinate normalization, splits, likelihood conventions, and evaluation protocols. We present SEAHORSE, a unified framework for reproducible STPP experimentation. SEAHORSE formalizes neural STPPs through a common encode-evolve-decode interface and trains, tunes, and evaluates every model family under a single executable benchmark protocol with raw-coordinate likelihood reporting. This enables fair comparisons but, more importantly, controlled diagnostic studies. We pair SEAHORSE with HawkesNest, a synthetic stress-test suite, and show that increasing event-pattern complexity exposes each family's inductive bias, degrading some models sharply and leaving others stable. Code: https://github.com/YahyaAalaila/seahorse.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SEAHORSE gives neural STPP work a shared encode-evolve-decode wrapper and a synthetic stress suite, but the unification step still needs direct checks that it leaves original model behavior untouched.

The paper's core move is to put intensity models, conditional density models, latent dynamics, flow decoders, and score-based generators behind one interface, then run them all through the same training, tuning, and raw-coordinate likelihood protocol. It also ships HawkesNest, a set of synthetic event patterns whose complexity can be dialed up to see which families hold up and which degrade.

That standardization is the useful part. Different codebases currently differ in normalization, splits, and likelihood conventions, so direct comparisons have been noisy. A single executable protocol plus controlled synthetic data should let people run the diagnostic studies the abstract promises.

The soft spot is exactly the one the stress-test note flags. The claim that the shared interface preserves each family's inductive bias rests on the encode-evolve-decode abstraction being expressive enough for score-based and continuous-flow models without reparameterization that changes their effective class. The abstract asserts this works, but supplies no concrete verification that likelihood semantics or gradient behavior survive the wrapper. If the unification step alters any of those families, the HawkesNest results will reflect the wrapper rather than the original architectures.

This is a methods paper aimed at the neural STPP subfield. Readers who already work on these models and want reproducible baselines or diagnostic tools will find it directly usable. It is not aimed at broader ML or at deriving new theoretical results.

The work is coherent on its own terms and shows clear thinking about reproducibility. It deserves peer review so the interface claim can be examined with the actual code and experiments in hand.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces SEAHORSE, a unified benchmarking framework for neural spatiotemporal point processes that standardizes models from intensity-based, conditional-density, latent-dynamics, normalizing-flow, and score-based families through a shared encode-evolve-decode interface. All models are trained, tuned, and evaluated under one executable protocol that reports raw-coordinate likelihoods; the framework is paired with the HawkesNest synthetic stress-test suite, which increases event-pattern complexity to expose differential inductive biases across families.

Significance. If the encode-evolve-decode interface can embed the listed families while preserving their original inductive biases and likelihood semantics, SEAHORSE would provide a valuable contribution by enabling reproducible, apples-to-apples comparisons and controlled diagnostic experiments on STPP architectures. The provision of executable code and a single benchmark protocol is a concrete strength that directly addresses the reproducibility issues noted in the abstract.

major comments (2)

[§3] (encode-evolve-decode interface definition): The central claim that the interface accommodates score-based generators and normalizing-flow decoders without forcing architectural compromises or altering original likelihood semantics is asserted but not supported by any equivalence verification (e.g., no side-by-side likelihood computation or reparameterization ablation is reported for these families). This is load-bearing for the diagnostic results on HawkesNest.
[§5] (HawkesNest experiments): The reported degradation patterns are presented as exposing each family's inductive bias, yet the manuscript does not demonstrate that the synthetic data generation and coordinate handling in HawkesNest are identical to the raw-coordinate likelihood protocol used for the real benchmarks; any mismatch would make the diagnostics reflect the wrapper rather than the original model classes.

minor comments (2)

[Abstract] The sentence claiming the interface 'enables ... controlled diagnostic studies' would benefit from one concrete example of a diagnostic that becomes possible only under the unified protocol.
Notation: The description of the evolve stage does not explicitly state whether the time and space components remain decoupled for all model families or whether certain families require joint evolution; a short clarifying sentence would improve readability.
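For readers unfamiliar with the decoupling question raised in the notation comment, the textbook factorization of an STPP may help. This is standard point-process notation and is not copied from the manuscript; the symbols $\lambda^*$, $f^*$, and the observation window $[0, T]$ are assumptions. The joint conditional intensity splits into a temporal ground intensity and a conditional spatial density, and the log-likelihood of events $(t_i, \mathbf{s}_i)$, $i = 1, \dots, n$, splits accordingly:

```latex
\lambda^*(t, \mathbf{s}) = \lambda^*(t)\, f^*(\mathbf{s} \mid t),
\qquad
\log L = \sum_{i=1}^{n} \log \lambda^*(t_i)
       + \sum_{i=1}^{n} \log f^*(\mathbf{s}_i \mid t_i)
       - \int_0^T \lambda^*(t)\, \mathrm{d}t .
```

The factorization itself always holds. The open question for a given family is whether the temporal and spatial terms are parameterized by separate components or share a jointly evolved state.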

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments identify important points where additional verification would strengthen the central claims regarding the encode-evolve-decode interface and the diagnostic validity of HawkesNest. We address each below and commit to revisions that directly incorporate the requested evidence.

Referee: [§3] (encode-evolve-decode interface definition): The central claim that the interface accommodates score-based generators and normalizing-flow decoders without forcing architectural compromises or altering original likelihood semantics is asserted but not supported by any equivalence verification (e.g., no side-by-side likelihood computation or reparameterization ablation is reported for these families). This is load-bearing for the diagnostic results on HawkesNest.

Authors: We agree that explicit verification is needed to substantiate the claim. The interface delegates likelihood evaluation to each model's native mechanism (score matching for score-based models; change-of-variables in raw coordinates for normalizing-flow decoders) without reparameterization of the density itself. To address the referee's concern, we will add an appendix containing side-by-side likelihood computations on a controlled synthetic dataset, comparing the wrapped implementations against the original standalone code for both families. This will confirm numerical equivalence within floating-point tolerance and will be referenced from §3. Revision: yes.

Referee: [§5] (HawkesNest experiments): The reported degradation patterns are presented as exposing each family's inductive bias, yet the manuscript does not demonstrate that the synthetic data generation and coordinate handling in HawkesNest are identical to the raw-coordinate likelihood protocol used for the real benchmarks; any mismatch would make the diagnostics reflect the wrapper rather than the original model classes.

Authors: The HawkesNest generator produces events directly in the same raw spatiotemporal coordinate system used by the real-data loaders, and SEAHORSE applies an identical preprocessing and likelihood-evaluation pipeline to both synthetic and real datasets. Nevertheless, we acknowledge that this identity was stated rather than demonstrated. We will revise §5 to include an explicit subsection and accompanying code reference that verifies (i) identical coordinate ranges and units, (ii) the same raw-coordinate likelihood computation path, and (iii) the absence of any additional normalization steps unique to the synthetic suite. This will be supported by a small verification script released with the repository. Revision: yes.
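The side-by-side check promised in the first response could take roughly the following form. This is a sketch under stated assumptions, not the authors' verification script: the two functions stand in for "original standalone code" and "wrapped implementation", both evaluate a homogeneous Poisson process in time only, and every name here is hypothetical.

```python
import math


def standalone_nll(times, rate, horizon):
    """Reference: closed-form NLL of a homogeneous Poisson process on [0, horizon]."""
    return -(len(times) * math.log(rate) - rate * horizon)


def wrapped_nll(times, rate, horizon):
    """Same model evaluated event by event, as a generic wrapper might do."""
    total = 0.0
    prev = 0.0
    for t in times:
        # Log-density of the waiting time since the previous event.
        total += math.log(rate) - rate * (t - prev)
        prev = t
    # Log-probability of seeing no further event up to the horizon.
    total += -rate * (horizon - prev)
    return -total


def find_mismatches(reference, wrapped, cases, rel_tol=1e-9):
    """Return the cases where the two implementations disagree beyond rel_tol."""
    bad = []
    for case in cases:
        ref_value = reference(*case)
        wrapped_value = wrapped(*case)
        if not math.isclose(ref_value, wrapped_value, rel_tol=rel_tol):
            bad.append((case, ref_value, wrapped_value))
    return bad


cases = [
    ([0.5, 1.2, 3.0], 0.8, 4.0),
    ([], 2.0, 1.0),
]
mismatches = find_mismatches(standalone_nll, wrapped_nll, cases)
```

An empty `mismatches` list on a fixed set of sequences is the kind of evidence the referee asks for. For score-based or flow-based families the two functions would be the original repository's likelihood routine and its wrapped counterpart, and the tolerance would need to account for any stochastic estimators involved.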

Circularity Check

0 steps flagged

No circularity: software framework and benchmark protocol with no derivation chain

The paper presents SEAHORSE as a unified benchmarking framework that formalizes neural STPPs via an encode-evolve-decode interface and provides a shared training/evaluation protocol. No mathematical derivations, parameter fits, predictions, or uniqueness theorems are claimed. The interface is an engineering design choice for reproducibility rather than a result derived from prior equations or self-citations. No steps reduce by construction to inputs, and the contribution is self-contained as executable code and protocol rather than any fitted or renamed result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper contributes an engineering artifact rather than a derivation; it rests on the domain assumption that a single interface can cover existing neural STPP families and that synthetic stress tests can expose meaningful differences in inductive bias.

axioms (1)

Domain assumption: Neural STPPs from different families can be represented without loss of fidelity inside a shared encode-evolve-decode structure.
Invoked when the abstract states that SEAHORSE formalizes every model family through this common interface.

pith-pipeline@v0.9.1-grok · 5717 in / 1433 out tokens · 28804 ms · 2026-07-02T15:58:47.445188+00:00 · methodology

Reference graph

Works this paper leans on

36 extracted references · 6 canonical work pages · 2 internal anchors

[1]

Seahorse: Unified benchmarking for spatio-temporal point processes

Yahya Aalaila, Gerrit Großmann, and Sebastian V ollmer. Seahorse: Unified benchmarking for spatio-temporal point processes. https://github.com/YahyaAalaila/seahorse, 2026. Software, Apache-2.0. Archived athttps://doi.org/10.5281/zenodo.21078077

work page doi:10.5281/zenodo.21078077 2026
[2]

Springer Science & Business Media, 2007

Daryl J Daley and David Vere-Jones.An Introduction to the Theory of Point Processes: Volume II: General Theory and Structure. Springer Science & Business Media, 2007

2007
[3]

A review of self-exciting spatio-temporal point processes and their applications

Alex Reinhart. A review of self-exciting spatio-temporal point processes and their applications. Statistical Science, 33(3):299–318, 2018

2018
[4]

Recurrent marked temporal point processes: Embedding event history to vector

Nan Du, Hanjun Dai, Rakshit Trivedi, Utkarsh Upadhyay, Manuel Gomez-Rodriguez, and Le Song. Recurrent marked temporal point processes: Embedding event history to vector. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1555–1564, 2016

2016
[5]

The neural hawkes process: A neurally self-modulating multivariate point process.Advances in neural information processing systems, 30, 2017

Hongyuan Mei and Jason M Eisner. The neural hawkes process: A neurally self-modulating multivariate point process.Advances in neural information processing systems, 30, 2017. 10

2017
[6]

Transformer Hawkes process

Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, and Hongyuan Zha. Transformer Hawkes process. InInternational Conference on Machine Learning, pages 11692–11702. PMLR, 2020

2020
[7]

Ricky T. Q. Chen, Brandon Amos, and Maximilian Nickel. Neural spatio-temporal point processes. InInternational Conference on Learning Representations, 2021

2021
[8]

Neural point process for learning spatiotemporal event dynamics

Zihao Zhou, Xingyi Yang, Ryan Rossi, Handong Zhao, and Rose Yu. Neural point process for learning spatiotemporal event dynamics. InLearning for Dynamics and Control Conference, pages 777–789. PMLR, 2022

2022
[9]

Automatic integration for spatiotemporal neural point processes

Zihao Zhou and Rose Yu. Automatic integration for spatiotemporal neural point processes. Advances in Neural Information Processing Systems, 36, 2024

2024
[10]

Neural spatiotemporal point processes: Trends and challenges.Transactions on Machine Learning Research, 2025

Sumantrak Mukherjee, Mouad Elhamdi, George Mohler, David Antony Selby, Yao Xie, Sebas- tian Josef V ollmer, and Gerrit Großmann. Neural spatiotemporal point processes: Trends and challenges.Transactions on Machine Learning Research, 2025. Survey Certification

2025
[11]

Deep spatiotemporal point processes: Advances and new directions.Annual Review of Statistics and Its Application, 13, 2025

Xiuyuan Cheng, Zheng Dong, and Yao Xie. Deep spatiotemporal point processes: Advances and new directions.Annual Review of Statistics and Its Application, 13, 2025

2025
[12]

Imitation learning of neural spatio- temporal point processes.IEEE Transactions on Knowledge and Data Engineering, 34(11):5391– 5402, 2021

Shixiang Zhu, Shuang Li, Zhigang Peng, and Yao Xie. Imitation learning of neural spatio- temporal point processes.IEEE Transactions on Knowledge and Data Engineering, 34(11):5391– 5402, 2021

2021
[13]

Integration-free training for spatio-temporal multimodal covariate deep kernel point processes.Advances in Neural Information Processing Systems, 36:25031–25049, 2023

Yixuan Zhang, Quyu Kong, and Feng Zhou. Integration-free training for spatio-temporal multimodal covariate deep kernel point processes.Advances in Neural Information Processing Systems, 36:25031–25049, 2023

2023
[14]

HawkesNest: A multi-axis synthetic benchmark for spatiotemporal pattern complexity, 2026

Yahya Aalaila, Sumantrak Mukherjee, Gerrit Großmann, and Sebastian V ollmer. HawkesNest: A multi-axis synthetic benchmark for spatiotemporal pattern complexity, 2026

2026
[15]

CRC Press, 2003

Jesper Moller and Rasmus Plenge Waagepetersen.Statistical inference and simulation for spatial point processes. CRC Press, 2003

2003
[16]

Lecture Notes: Temporal Point Processes and the Conditional Intensity Function

Jakob Gulddahl Rasmussen. Lecture notes: Temporal point processes and the conditional intensity function.arXiv preprint arXiv:1806.00221, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[17]

Self-attentive Hawkes process

Qiang Zhang, Aldo Lipani, Omer Kirnap, and Emine Yilmaz. Self-attentive Hawkes process. InInternational Conference on Machine Learning, pages 11183–11193. PMLR, 2020

2020
[18]

Spatio-temporal diffusion point processes

Yuan Yuan, Jingtao Ding, Chenyang Shao, Depeng Jin, and Yong Li. Spatio-temporal diffusion point processes. InProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3173–3184, 2023

2023
[19]

Neural spectral marked point processes

Shixiang Zhu, Haoyun Wang, Xiuyuan Cheng, and Yao Xie. Neural spectral marked point processes. InInternational Conference on Learning Representations, 2022

2022
[20]

Beyond point prediction: Score matching-based pseudolikelihood estimation of neural marked spatio-temporal point process

Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, and Hongyuan Zha. Beyond point prediction: Score matching-based pseudolikelihood estimation of neural marked spatio-temporal point process. InForty-first International Conference on Machine Learning, 2024

2024
[21]

Neural jump stochastic differential equations.Advances in Neural Information Processing Systems, 32, 2019

Junteng Jia and Austin R Benson. Neural jump stochastic differential equations.Advances in Neural Information Processing Systems, 32, 2019

2019
[22]

Citi bike system data

Citi Bike NYC (Lyft, Inc.). Citi bike system data. https://citibikenyc.com/ system-data. Accessed YYYY-MM-DD
[23]

Geological Survey, Earthquake Hazards Program

U.S. Geological Survey, Earthquake Hazards Program. Advanced national seismic system (ANSS) comprehensive catalog of earthquake events and products (ComCat). https:// earthquake.usgs.gov/data/comcat/, 2017

2017
[24]

Coronavirus (COVID-19) data in the united states

The New York Times. Coronavirus (COVID-19) data in the united states. https://github. com/nytimes/covid-19-data, 2021. County-level case data. Accessed YYYY-MM-DD. 11

2021
[25]

Zhang, Qingsong Wen, Jun Zhou, and Hongyuan Mei

Siqiao Xue, Xiaoming Shi, Zhixuan Chu, Yan Wang, Hongyan Hao, Fan Zhou, Caigao Jiang, Chen Pan, James Y . Zhang, Qingsong Wen, Jun Zhou, and Hongyuan Mei. EasyTPP: Towards open benchmarking temporal point processes. InThe Twelfth International Conference on Learning Representations, 2024

2024
[26]

Hotpp benchmark: Are we good at the long horizon events forecasting?arXiv preprint arXiv:2406.14341, 2024

Ivan Karpukhin, Foma Shipilov, and Andrey Savchenko. Hotpp benchmark: Are we good at the long horizon events forecasting?arXiv preprint arXiv:2406.14341, 2024

work page arXiv 2024
[27]

Uber tlc foil response

FiveThirtyEight. Uber tlc foil response. https://github.com/fivethirtyeight/ uber-tlc-foil-response , 2015. Data obtained from the New York City Taxi and Limousine Commission through a Freedom of Information Law request. Accessed: 2026-06-26

2015
[28]

A Countrywide Traffic Accident Dataset

Sobhan Moosavi, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ram- nath. A countrywide traffic accident dataset.arXiv preprint arXiv:1906.05409, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1906
[29]

Accident risk prediction based on heterogeneous sparse data: New dataset and insights

Sobhan Moosavi, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. Accident risk prediction based on heterogeneous sparse data: New dataset and insights. InProceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, 2019

2019
[30]

Crimes — 2001 to present

City of Chicago. Crimes — 2001 to present. https://data.cityofchicago.org/ Public-Safety/Crimes-2001-to-Present/ijzp-q8t2 , 2026. Chicago Data Portal. Ac- cessed: 2026-06-26

2001
[31]

Crime data from 2020 to 2024

City of Los Angeles. Crime data from 2020 to 2024. https://data.lacity.org/ Public-Safety/Crime-Data-from-2020-to-2024/2nrs-mtv8 , 2026. Los Angeles Open Data Portal. Accessed: 2026-06-26

2020
[32]

Global terrorism database (gtd)

National Consortium for the Study of Terrorism and Responses to Terrorism (START). Global terrorism database (gtd). https://www.start.umd.edu/data-tools/GTD, 2022. Univer- sity of Maryland. Accessed: 2026-06-26

2022
[33]

Austin 311 public data

City of Austin. Austin 311 public data. https://data.austintexas.gov/ Utilities-and-City-Services/Austin-311-Public-Data/xwdj-i9he , 2026. City of Austin Open Data Portal. Accessed: 2026-06-26

2026
[34]

Karen C. Short. Spatial wildfire occurrence data for the united states, 1992–2015 [fpa_fod_- 20170508] (4th edition).https://doi.org/10.2737/RDS-2013-0009.4, 2017

work page doi:10.2737/rds-2013-0009.4 1992
[35]

Myers, and Jure Leskovec

Eunjoon Cho, Seth A. Myers, and Jure Leskovec. Friendship and mobility: User movement in location-based social networks. InProceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’11, pages 1082–1090, New York, NY , USA, 2011. Association for Computing Machinery

2011
[36]

history_state

Nadine Chang, John A. Pyles, Austin Marcus, Abhinav Gupta, Michael J. Tarr, and Elissa M. Aminoff. BOLD5000, a public fMRI dataset while viewing 5000 visual images.Scientific Data, 6(1):49, 2019. A Software Interface and Reproducibility A.1 Architecture Overview Figure 6 summarizes the internal software architecture of SEAHORSE. The framework separates co...

work page arXiv 2019

[1] [1]

Seahorse: Unified benchmarking for spatio-temporal point processes

Yahya Aalaila, Gerrit Großmann, and Sebastian V ollmer. Seahorse: Unified benchmarking for spatio-temporal point processes. https://github.com/YahyaAalaila/seahorse, 2026. Software, Apache-2.0. Archived athttps://doi.org/10.5281/zenodo.21078077

work page doi:10.5281/zenodo.21078077 2026

[2] [2]

Springer Science & Business Media, 2007

Daryl J Daley and David Vere-Jones.An Introduction to the Theory of Point Processes: Volume II: General Theory and Structure. Springer Science & Business Media, 2007

2007

[3] [3]

A review of self-exciting spatio-temporal point processes and their applications

Alex Reinhart. A review of self-exciting spatio-temporal point processes and their applications. Statistical Science, 33(3):299–318, 2018

2018

[4] [4]

Recurrent marked temporal point processes: Embedding event history to vector

Nan Du, Hanjun Dai, Rakshit Trivedi, Utkarsh Upadhyay, Manuel Gomez-Rodriguez, and Le Song. Recurrent marked temporal point processes: Embedding event history to vector. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1555–1564, 2016

2016

[5] [5]

The neural hawkes process: A neurally self-modulating multivariate point process.Advances in neural information processing systems, 30, 2017

Hongyuan Mei and Jason M Eisner. The neural hawkes process: A neurally self-modulating multivariate point process.Advances in neural information processing systems, 30, 2017. 10

2017

[6] [6]

Transformer Hawkes process

Simiao Zuo, Haoming Jiang, Zichong Li, Tuo Zhao, and Hongyuan Zha. Transformer Hawkes process. InInternational Conference on Machine Learning, pages 11692–11702. PMLR, 2020

2020

[7] [7]

Ricky T. Q. Chen, Brandon Amos, and Maximilian Nickel. Neural spatio-temporal point processes. InInternational Conference on Learning Representations, 2021

2021

[8] [8]

Neural point process for learning spatiotemporal event dynamics

Zihao Zhou, Xingyi Yang, Ryan Rossi, Handong Zhao, and Rose Yu. Neural point process for learning spatiotemporal event dynamics. InLearning for Dynamics and Control Conference, pages 777–789. PMLR, 2022

2022

[9] [9]

Automatic integration for spatiotemporal neural point processes

Zihao Zhou and Rose Yu. Automatic integration for spatiotemporal neural point processes. Advances in Neural Information Processing Systems, 36, 2024

2024

[10] [10]

Neural spatiotemporal point processes: Trends and challenges.Transactions on Machine Learning Research, 2025

Sumantrak Mukherjee, Mouad Elhamdi, George Mohler, David Antony Selby, Yao Xie, Sebas- tian Josef V ollmer, and Gerrit Großmann. Neural spatiotemporal point processes: Trends and challenges.Transactions on Machine Learning Research, 2025. Survey Certification

2025

[11] [11]

Deep spatiotemporal point processes: Advances and new directions.Annual Review of Statistics and Its Application, 13, 2025

Xiuyuan Cheng, Zheng Dong, and Yao Xie. Deep spatiotemporal point processes: Advances and new directions.Annual Review of Statistics and Its Application, 13, 2025

2025

[12] [12]

Imitation learning of neural spatio- temporal point processes.IEEE Transactions on Knowledge and Data Engineering, 34(11):5391– 5402, 2021

Shixiang Zhu, Shuang Li, Zhigang Peng, and Yao Xie. Imitation learning of neural spatio- temporal point processes.IEEE Transactions on Knowledge and Data Engineering, 34(11):5391– 5402, 2021

2021

[13] [13]

Integration-free training for spatio-temporal multimodal covariate deep kernel point processes.Advances in Neural Information Processing Systems, 36:25031–25049, 2023

Yixuan Zhang, Quyu Kong, and Feng Zhou. Integration-free training for spatio-temporal multimodal covariate deep kernel point processes.Advances in Neural Information Processing Systems, 36:25031–25049, 2023

2023

[14] [14]

HawkesNest: A multi-axis synthetic benchmark for spatiotemporal pattern complexity, 2026

Yahya Aalaila, Sumantrak Mukherjee, Gerrit Großmann, and Sebastian V ollmer. HawkesNest: A multi-axis synthetic benchmark for spatiotemporal pattern complexity, 2026

2026

[15] [15]

CRC Press, 2003

Jesper Moller and Rasmus Plenge Waagepetersen.Statistical inference and simulation for spatial point processes. CRC Press, 2003

2003

[16] [16]

Lecture Notes: Temporal Point Processes and the Conditional Intensity Function

Jakob Gulddahl Rasmussen. Lecture notes: Temporal point processes and the conditional intensity function.arXiv preprint arXiv:1806.00221, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[17] [17]

Self-attentive Hawkes process

Qiang Zhang, Aldo Lipani, Omer Kirnap, and Emine Yilmaz. Self-attentive Hawkes process. InInternational Conference on Machine Learning, pages 11183–11193. PMLR, 2020

2020

[18] [18]

Spatio-temporal diffusion point processes

Yuan Yuan, Jingtao Ding, Chenyang Shao, Depeng Jin, and Yong Li. Spatio-temporal diffusion point processes. InProceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3173–3184, 2023

2023

[19] [19]

Neural spectral marked point processes

Shixiang Zhu, Haoyun Wang, Xiuyuan Cheng, and Yao Xie. Neural spectral marked point processes. InInternational Conference on Learning Representations, 2022

2022

[20] Zichong Li, Qunzhi Xu, Zhenghao Xu, Yajun Mei, Tuo Zhao, and Hongyuan Zha. Beyond point prediction: Score matching-based pseudolikelihood estimation of neural marked spatio-temporal point process. In Forty-first International Conference on Machine Learning, 2024

[21] Junteng Jia and Austin R Benson. Neural jump stochastic differential equations. Advances in Neural Information Processing Systems, 32, 2019

[22] Citi Bike NYC (Lyft, Inc.). Citi bike system data. https://citibikenyc.com/system-data. Accessed YYYY-MM-DD

[23] U.S. Geological Survey, Earthquake Hazards Program. Advanced national seismic system (ANSS) comprehensive catalog of earthquake events and products (ComCat). https://earthquake.usgs.gov/data/comcat/, 2017

[24] The New York Times. Coronavirus (COVID-19) data in the United States. https://github.com/nytimes/covid-19-data, 2021. County-level case data. Accessed YYYY-MM-DD

[25] Siqiao Xue, Xiaoming Shi, Zhixuan Chu, Yan Wang, Hongyan Hao, Fan Zhou, Caigao Jiang, Chen Pan, James Y. Zhang, Qingsong Wen, Jun Zhou, and Hongyuan Mei. EasyTPP: Towards open benchmarking temporal point processes. In The Twelfth International Conference on Learning Representations, 2024

[26] Ivan Karpukhin, Foma Shipilov, and Andrey Savchenko. Hotpp benchmark: Are we good at the long horizon events forecasting? arXiv preprint arXiv:2406.14341, 2024

[27] FiveThirtyEight. Uber TLC FOIL response. https://github.com/fivethirtyeight/uber-tlc-foil-response, 2015. Data obtained from the New York City Taxi and Limousine Commission through a Freedom of Information Law request. Accessed: 2026-06-26

[28] Sobhan Moosavi, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, and Rajiv Ramnath. A countrywide traffic accident dataset. arXiv preprint arXiv:1906.05409, 2019

[29] Sobhan Moosavi, Mohammad Hossein Samavatian, Srinivasan Parthasarathy, Radu Teodorescu, and Rajiv Ramnath. Accident risk prediction based on heterogeneous sparse data: New dataset and insights. In Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. Association for Computing Machinery, 2019

[30] City of Chicago. Crimes - 2001 to present. https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2, 2026. Chicago Data Portal. Accessed: 2026-06-26

[31] City of Los Angeles. Crime data from 2020 to 2024. https://data.lacity.org/Public-Safety/Crime-Data-from-2020-to-2024/2nrs-mtv8, 2026. Los Angeles Open Data Portal. Accessed: 2026-06-26

[32] National Consortium for the Study of Terrorism and Responses to Terrorism (START). Global terrorism database (GTD). https://www.start.umd.edu/data-tools/GTD, 2022. University of Maryland. Accessed: 2026-06-26

[33] City of Austin. Austin 311 public data. https://data.austintexas.gov/Utilities-and-City-Services/Austin-311-Public-Data/xwdj-i9he, 2026. City of Austin Open Data Portal. Accessed: 2026-06-26

[34] Karen C. Short. Spatial wildfire occurrence data for the United States, 1992–2015 [fpa_fod_20170508] (4th edition). https://doi.org/10.2737/RDS-2013-0009.4, 2017

[35] Eunjoon Cho, Seth A. Myers, and Jure Leskovec. Friendship and mobility: User movement in location-based social networks. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’11, pages 1082–1090, New York, NY, USA, 2011. Association for Computing Machinery

[36] Nadine Chang, John A. Pyles, Austin Marcus, Abhinav Gupta, Michael J. Tarr, and Elissa M. Aminoff. BOLD5000, a public fMRI dataset while viewing 5000 visual images. Scientific Data, 6(1):49, 2019

A Software Interface and Reproducibility

A.1 Architecture Overview

Figure 6 summarizes the internal software architecture of SEAHORSE. The framework separates co...