EventFlow: Forecasting Temporal Point Processes with Flow Matching
Pith reviewed 2026-05-23 19:02 UTC · model grok-4.3
The pith
EventFlow uses flow matching to model temporal point processes non-autoregressively and reports 20-53 percent lower forecast error than the nearest baseline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EventFlow is a non-autoregressive generative model for temporal point processes. The model builds on the flow matching framework in order to directly learn joint distributions over event times, side-stepping the autoregressive process. It achieves a 20%-53% lower forecast error than the nearest baseline on standard TPP benchmarks while simultaneously using fewer model calls at sampling time.
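To make the "fewer model calls" part of the claim concrete, here is a minimal sketch of fixed-step Euler sampling from a learned velocity field, with an explicit call counter. The sampler, the step count, and the toy velocity field are all assumptions for illustration; this page does not state which integrator or how many steps the paper uses.

```python
import numpy as np

def euler_sample(velocity_fn, x0, n_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps.

    Returns the final sample and the number of velocity_fn evaluations.
    """
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    calls = 0
    for k in range(n_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t)  # one model call per step
        calls += 1
    return x, calls

# Hypothetical stand-in for a trained model: a field that transports any
# starting point to a fixed set of event times along a straight line.
target = np.array([0.1, 0.35, 0.6, 0.8, 0.95])

def toy_velocity(x, t):
    return (target - x) / (1.0 - t)

rng = np.random.default_rng(0)
noise = rng.normal(size=target.shape)
sample, calls = euler_sample(toy_velocity, noise, n_steps=8)
print(calls, sample)
```

In this sketch the cost is `n_steps` calls regardless of how many events are generated, because all event times move together. An autoregressive sampler typically needs at least one call per forecast event, which is the comparison the claim appears to rest on.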
What carries the argument
Flow matching objective applied to the joint distribution of event times and marks across sequences of varying lengths.
Load-bearing premise
Flow matching can faithfully capture the joint distribution of event times and marks without autoregressive conditioning to prevent cascading errors.
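For readers unfamiliar with the objective behind this premise, a minimal sketch of a conditional flow matching loss on a straight-line path follows. It is the generic form of the technique, not the paper's confirmed formulation: the Gaussian reference noise, the linear path, and the fixed number of events per sequence are assumptions, and marks and history conditioning are left out.

```python
import numpy as np

def cfm_loss(velocity_fn, x0, x1, t):
    """Conditional flow matching loss for a linear interpolation path.

    x0: reference (noise) samples, shape (batch, n_events)
    x1: data samples, here event times, same shape as x0
    t:  flow times in [0, 1], shape (batch,)
    """
    t = t[:, None]
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path at time t
    target = x1 - x0                # constant velocity of that path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))

rng = np.random.default_rng(0)
# Toy data: 4 sequences of 5 sorted event times on [0, 1] (made up).
x1 = np.sort(rng.uniform(0.0, 1.0, size=(4, 5)), axis=1)
x0 = rng.normal(size=(4, 5))
t = rng.uniform(size=4)

# An untrained placeholder model that always predicts zero velocity.
zero_field = lambda x_t, t: np.zeros_like(x_t)
loss_zero = cfm_loss(zero_field, x0, x1, t)
print(loss_zero)
```

The loss regresses the whole vector of event times at once, which is what lets the model avoid step-by-step conditioning. Nothing in this loss by itself keeps the generated times ordered or distinct.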
What would settle it
A benchmark result in which EventFlow forecast error on long sequences equals or exceeds autoregressive baselines because the joint distribution modeling fails to capture dependencies.
Original abstract
Continuous-time event sequences, in which events occur at irregular intervals, are ubiquitous across a wide range of industrial and scientific domains. The contemporary modeling paradigm is to treat such data as realizations of a temporal point process, and in machine learning it is common to model temporal point processes in an autoregressive fashion using a neural network. While autoregressive models are successful in predicting the time of a single subsequent event, their performance can degrade when forecasting longer horizons due to cascading errors and myopic predictions. We propose EventFlow, a non-autoregressive generative model for temporal point processes. The model builds on the flow matching framework in order to directly learn joint distributions over event times, side-stepping the autoregressive process. EventFlow is simple to implement and achieves a 20%-53% lower forecast error than the nearest baseline on standard TPP benchmarks while simultaneously using fewer model calls at sampling time.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes EventFlow, a non-autoregressive generative model for temporal point processes based on flow matching. It directly learns joint distributions over event times and marks to avoid the cascading errors and myopic predictions of autoregressive neural TPP models, claiming 20%-53% lower forecast error than the nearest baseline on standard benchmarks while requiring fewer model calls at sampling time.
Significance. If the central empirical claims hold after verification that the flow-matching vector field respects point-process constraints (no simultaneous events, correct ordering) for variable-length sequences, the work would offer a substantive alternative modeling paradigm for TPP forecasting. The non-autoregressive formulation and reported sampling efficiency would be notable strengths if supported by reproducible experiments.
major comments (2)
- [Methods (flow-matching adaptation)] The non-autoregressive premise requires explicit verification that the fixed-dimensional flow-matching objective, after any padding/masking/length-conditioning, preserves the point-process measure (no simultaneous events, strict ordering) across the empirical distribution of sequence lengths N; without such checks the reported gains could be artifacts of benchmark length statistics rather than a solution to joint modeling.
- [Experiments] The central performance claim (20-53% lower forecast error) is load-bearing; the experiments section must supply baseline definitions, dataset statistics, error-bar information, and the exact forecast horizon/metrics used, as the abstract alone supplies none of these details.
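The validity check requested in the first major comment can be sketched as a short diagnostic over generated sequences. The function name, the report keys, and the optional forecast window `horizon` are hypothetical; the sample sequences are made up for illustration.

```python
import numpy as np

def validity_diagnostics(sequences, horizon=None):
    """Fractions of sequences that break basic point-process constraints.

    sequences: list of 1-D sequences of event times (lengths may differ)
    horizon:   optional window end; times outside [0, horizon] are flagged
    """
    n = len(sequences)
    if n == 0:
        raise ValueError("need at least one sequence")
    ties = order = out = 0
    for seq in sequences:
        times = np.asarray(seq, dtype=float)
        gaps = np.diff(times)
        ties += bool(np.any(gaps == 0.0))   # simultaneous events
        order += bool(np.any(gaps < 0.0))   # times that go backwards
        if horizon is not None:
            out += bool(np.any((times < 0.0) | (times > horizon)))
    return {
        "tie_frac": ties / n,
        "order_violation_frac": order / n,
        "out_of_window_frac": out / n,
    }

samples = [[0.1, 0.4, 0.9], [0.2, 0.2, 0.5], [0.3, 0.1], []]
report = validity_diagnostics(samples, horizon=1.0)
print(report)
```

Reporting these fractions separately for each sequence length would speak to the concern that gains depend on benchmark length statistics.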
minor comments (1)
- [Introduction] Notation for marks m_{1:N} and history conditioning should be clarified when first introduced to avoid ambiguity with standard TPP intensity notation.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments point-by-point below and will revise the manuscript to incorporate the requested clarifications and additions.
- Referee: [Methods (flow-matching adaptation)] The non-autoregressive premise requires explicit verification that the fixed-dimensional flow-matching objective, after any padding/masking/length-conditioning, preserves the point-process measure (no simultaneous events, strict ordering) across the empirical distribution of sequence lengths N; without such checks the reported gains could be artifacts of benchmark length statistics rather than a solution to joint modeling.
  Authors: We agree that explicit verification strengthens the non-autoregressive claim. In the revision we will add a dedicated subsection describing how the flow-matching vector field, together with the padding/masking and length-conditioning mechanisms, enforces no simultaneous events and strict ordering. We will also report empirical diagnostics (e.g., fraction of invalid sequences and ordering violations) computed on held-out length distributions from each benchmark.
  Revision: yes
- Referee: [Experiments] The central performance claim (20-53% lower forecast error) is load-bearing; the experiments section must supply baseline definitions, dataset statistics, error-bar information, and the exact forecast horizon/metrics used, as the abstract alone supplies none of these details.
  Authors: We accept that the experiments section should be self-contained. The revised manuscript will include: (i) precise definitions and hyper-parameter settings for all baselines, (ii) full dataset statistics (number of sequences, mean/variance of lengths and marks), (iii) error bars from at least five independent runs, and (iv) explicit statements of the forecast horizons and metrics (e.g., mean absolute error on inter-event times, log-likelihood on marks) used to obtain the reported 20-53% improvements.
  Revision: yes
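The reporting promised in the second response can be illustrated with a short sketch: mean absolute error on inter-event times, a mean and sample standard deviation across runs, and the relative reduction behind a figure such as "20-53% lower". The function names are hypothetical and the numbers are made up. The sketch assumes forecast and ground truth contain the same number of events; the page does not say how mismatched counts are handled.

```python
import numpy as np

def inter_event_mae(pred_times, true_times):
    """Mean absolute error between predicted and true inter-event gaps."""
    pred_gaps = np.diff(np.asarray(pred_times, dtype=float))
    true_gaps = np.diff(np.asarray(true_times, dtype=float))
    if pred_gaps.shape != true_gaps.shape:
        raise ValueError("sequences must contain the same number of events")
    return float(np.mean(np.abs(pred_gaps - true_gaps)))

def summarize_runs(values):
    """Mean and sample standard deviation over independent runs."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1))

def relative_reduction(baseline_error, model_error):
    """Fraction by which model_error is lower than baseline_error."""
    return (baseline_error - model_error) / baseline_error

mae = inter_event_mae([0.0, 1.0, 3.0], [0.0, 1.5, 3.0])
mean_err, std_err = summarize_runs([1.0, 2.0, 3.0])
reduction = relative_reduction(2.0, 1.0)
print(mae, mean_err, std_err, reduction)
```

Stating the baseline used in the denominator matters here, since "lower than the nearest baseline" changes meaning if the nearest baseline differs by dataset.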
Circularity Check
No significant circularity in derivation chain
The paper presents EventFlow as a new non-autoregressive flow-matching model for joint distributions over event times and marks in temporal point processes, explicitly positioned as an alternative to autoregressive conditioning to avoid cascading errors. No equations, parameter-fitting steps, or self-citations in the abstract or described approach reduce the reported 20-53% error reductions to quantities fitted from the evaluation data by construction. The central claim rests on applying an external flow-matching framework to TPP data with benchmark comparisons, which constitutes independent content rather than self-referential reduction.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We propose EventFlow, a non-autoregressive generative model for temporal point processes. The model builds on the flow matching framework in order to directly learn joint distributions over event times"
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, LogicNat recovery theorem (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "balanced coupling ... Πb(μ,ν) ... event count distributions ... μ(n)=ν(n)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.