Adversarial Self-Paced Learning for Mixture Models of Hawkes Processes

Dixin Luo; Hongteng Xu; Lawrence Carin

arXiv: 1906.08397 · v1 · pith:7Y43VGYXnew · submitted 2019-06-20 · stat.ML · cs.LG

Pith reviewed 2026-05-25 19:47 UTC · model grok-4.3

keywords: Hawkes processes, mixture models, self-paced learning, adversarial learning, event sequences, maximum likelihood estimation, data augmentation, point processes

The pith

A new adversarial self-paced strategy learns mixture models of Hawkes processes by iteratively generating and selecting easy augmented sequences during maximum likelihood estimation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes learning mixture models of Hawkes processes not by direct fitting but through an iterative process that augments observed event sequences and applies an adversarial self-paced mechanism. In each round, sequences are generated from the originals, then maximum likelihood estimation proceeds under the rule that sequences easy for the current target model act as adversarial examples against misspecified alternatives. The model updates using only the sequences that obey it, and those sequences feed the next iteration. Experiments demonstrate consistent outperformance over traditional approaches on the task of recovering mixtures from heterogeneous event data.

Core claim

The central claim is that an adversarial self-paced learning strategy, built on Hawkes process data augmentation inside the self-paced framework, enables iterative refinement of mixture models. Starting from observed sequences, the method generates augmented versions, performs maximum likelihood estimation with an adversarial self-paced filter grounded in the easy-sample-equals-adversarial-sample property, updates the target model, and retains only the sequences consistent with that model for the subsequent round. This procedure yields mixture models that outperform conventional direct estimation on mixed Hawkes data.
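The loop described above (augment, select easy samples, refit, retain) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: `self_paced_loop`, `augment_fn`, `nll_fn`, `fit_fn`, and the fixed `threshold` are hypothetical names, the selection rule is a plain hard threshold on the loss, and the toy usage replaces event sequences with scalars so the control flow is easy to follow.

```python
def self_paced_loop(observed, augment_fn, nll_fn, fit_fn, params, threshold, n_rounds):
    """Generate, select easy samples, refit, and carry the survivors forward.

    augment_fn, nll_fn, and fit_fn are hypothetical callables supplied by the
    caller; the paper's actual augmentation and adversarial criterion are not
    reproduced here.
    """
    pool = list(observed)
    for _ in range(n_rounds):
        # Step 1: augment the current pool (the pool can double each round).
        candidates = pool + [augment_fn(s) for s in pool]
        # Step 2: keep only samples the current model finds easy (low loss).
        easy = [s for s in candidates if nll_fn(s, params) < threshold]
        if not easy:
            break  # nothing passes the filter; keep the last parameters
        # Step 3: refit on the easy samples only.
        params = fit_fn(easy)
        # Step 4: the retained samples seed the next round.
        pool = easy
    return params, pool


# Toy usage: scalars stand in for event sequences, squared error for the loss.
observed = [1.0, 1.1, 0.9, 5.0]  # 5.0 plays the role of an off-model sequence
params, pool = self_paced_loop(
    observed,
    augment_fn=lambda x: x + 0.05,
    nll_fn=lambda x, m: (x - m) ** 2,
    fit_fn=lambda xs: sum(xs) / len(xs),
    params=1.0,
    threshold=0.5,
    n_rounds=3,
)
print(params, len(pool))
```

In this toy run the off-model value never passes the filter, so it is excluded from every refit. A real implementation would substitute a Hawkes negative log-likelihood and a mixture-aware fitting step for the placeholders.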

What carries the argument

The adversarial self-paced mechanism inside maximum likelihood estimation, which exploits the property that an easy sample for the target model serves as an adversarial sample for a misspecified model.
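For orientation, the quantities involved can be written in standard textbook form. The block below is a sketch under assumptions: it uses a univariate Hawkes process with an exponential kernel and a common hard-threshold form of self-paced learning. The symbols $\mu$, $\alpha$, $\beta$, $\pi_k$, and the threshold $\gamma$ are notation chosen here, not taken from the paper, and the paper's adversarial variant may differ.

```latex
% Intensity of a univariate Hawkes process with exponential kernel
\lambda(t) = \mu + \alpha\beta \sum_{t_i < t} e^{-\beta (t - t_i)}

% Log-likelihood of a sequence s = \{t_1, \dots, t_n\} observed on [0, T]
\log p(s \mid \theta) = \sum_{i=1}^{n} \log \lambda(t_i) - \mu T
  - \alpha \sum_{i=1}^{n} \bigl(1 - e^{-\beta (T - t_i)}\bigr)

% Mixture of K Hawkes processes with weights \pi_k
p(s) = \sum_{k=1}^{K} \pi_k \, p(s \mid \theta_k), \qquad \sum_{k=1}^{K} \pi_k = 1

% Hard-threshold self-paced selection over N sequences with losses \ell_i
\min_{\theta,\; v \in \{0,1\}^N} \; \sum_{i=1}^{N} v_i \, \ell_i(\theta) - \gamma \sum_{i=1}^{N} v_i,
\qquad v_i^{\star} = \mathbb{1}\bigl[\ell_i(\theta) < \gamma\bigr]
```

With $\ell_i(\theta) = -\log p(s_i \mid \theta)$, the indicator keeps exactly the sequences whose negative log-likelihood falls below $\gamma$, which is the sense of "easy" used in this summary.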

If this is right

Mixture parameters for Hawkes processes can be recovered more reliably when training focuses only on sequences aligned with the current target.
Data augmentation of Hawkes processes supplies the varied samples needed to drive the self-paced selection.
The iterative loop progressively excludes sequences that would fit better under alternative component assignments.
The resulting models achieve higher likelihood on held-out mixed sequences than models trained without the adversarial filter.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same easy-versus-adversarial selection rule could be tested on other temporal point process families that admit similar augmentation schemes.
Because the mechanism operates inside each iteration of maximum likelihood, it may integrate naturally with online or streaming updates of mixture models.
If the property linking easy and adversarial samples holds only for certain parameter regimes, the method's gains would be limited to those regimes.

Load-bearing premise

An easy sample of the target model can be treated as an adversarial sample of a misspecified model.

What would settle it

Running the method on synthetic mixtures with known ground-truth parameters and observing no improvement in parameter recovery or log-likelihood over direct maximum likelihood estimation would falsify the central effectiveness claim.
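A synthetic benchmark of this kind needs sequences drawn from a mixture with known parameters. The sketch below shows one way to generate them, assuming univariate exponential-kernel Hawkes components simulated with Ogata's thinning algorithm. The function names, the `(mu, alpha, beta)` parametrization, and the example values are illustrative assumptions, not the paper's experimental setup.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, T, rng):
    """Ogata thinning for lambda(t) = mu + alpha*beta*sum_i exp(-beta*(t - t_i)).

    alpha is the branching ratio (keep alpha < 1 for a stable process);
    mu must be positive.
    """
    events = []
    t = 0.0
    excite = 0.0  # current value of the self-exciting term
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for the thinning step.
        lam_bar = mu + excite
        wait = rng.expovariate(lam_bar)
        if t + wait >= T:
            break
        t += wait
        excite *= math.exp(-beta * wait)
        # Accept the candidate with probability lambda(t) / lam_bar.
        if rng.random() * lam_bar <= mu + excite:
            events.append(t)
            excite += alpha * beta  # each accepted event raises the intensity
    return events


def simulate_mixture(components, weights, n_sequences, T, seed):
    """Draw labelled sequences; each label indexes into `components`."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_sequences):
        k = rng.choices(range(len(components)), weights=weights)[0]
        data.append((k, simulate_hawkes(*components[k], T, rng)))
    return data


# Illustrative ground truth: two components given as (mu, alpha, beta).
components = [(0.5, 0.2, 1.0), (1.5, 0.6, 2.0)]
data = simulate_mixture(components, [0.5, 0.5], n_sequences=20, T=10.0, seed=0)
print(len(data), [len(events) for _, events in data][:5])
```

Because the labels and parameters are known, any fitted mixture can be scored on both parameter recovery and held-out log-likelihood against a direct maximum likelihood baseline.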

Figures

Figures reproduced from arXiv: 1906.08397 by Dixin Luo, Hongteng Xu, Lawrence Carin.

**Figure 1.** The schemes of various learning methods. The sequences of different Hawkes processes are labeled in different colors.

Original abstract

We propose a novel adversarial learning strategy for mixture models of Hawkes processes, leveraging data augmentation techniques of Hawkes process in the framework of self-paced learning. Instead of learning a mixture model directly from a set of event sequences drawn from different Hawkes processes, the proposed method learns the target model iteratively, which generates "easy" sequences and uses them in an adversarial and self-paced manner. In each iteration, we first generate a set of augmented sequences from original observed sequences. Based on the fact that an easy sample of the target model can be an adversarial sample of a misspecified model, we apply a maximum likelihood estimation with an adversarial self-paced mechanism. In this manner the target model is updated, and the augmented sequences that obey it are employed for the next learning iteration. Experimental results show that the proposed method outperforms traditional methods consistently.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper combines adversarial and self-paced learning with Hawkes data augmentation into an iterative mixture training loop, but the whole thing rests on an unverified claim that easy target samples are adversarial for misspecified models.

The main thing to know is that the authors have packaged adversarial learning, self-paced learning, and Hawkes-specific sequence augmentation into one iterative procedure for fitting mixture models. They generate augmented sequences from the observed data, then run maximum likelihood under an adversarial self-paced rule that selects sequences the current model finds easy, and feed those back for the next round. That exact recipe has not been tried before on Hawkes mixtures, so the combination itself counts as new work in this corner of statistical machine learning. The abstract also states that the method beats standard approaches on the experiments they ran, which would matter for anyone fitting point-process mixtures to real event data.

The soft spot is exactly the one the stress test flags. The method is justified by the statement that an easy sample under the target model is automatically an adversarial sample under a misspecified model, yet the abstract gives no derivation, no citation, and no quick empirical check for the Hawkes case. If that property does not hold when the misspecification is a wrong component count or wrong branching ratio, the self-paced selection step loses its claimed advantage over plain mixture EM. Without the full equations or experimental protocol it is impossible to see how the adversarial term is actually implemented or what the quantitative gains really are.

This paper is for people already working on temporal point processes or robust mixture estimation. A reader in that subfield could extract the training idea and test the key assumption on their own data. Outside that niche the payoff is limited. The authors show clear engagement with the problem of training stability, so the work deserves a serious referee who can check whether the central assumption survives contact with actual Hawkes data and whether the reported gains remain once the details are examined.

Referee Report

2 major / 2 minor

Summary. The paper proposes an adversarial self-paced learning strategy for mixture models of Hawkes processes. It augments observed event sequences, then iteratively performs MLE under an adversarial self-paced mechanism that selects 'easy' augmented sequences, justified by the claim that easy samples under the target model act as adversarial samples for misspecified models. The target model is updated each iteration and the process repeats; experiments are reported to show consistent outperformance over traditional mixture methods.

Significance. If the central assumption holds and the experimental protocol is sound, the approach could offer a principled way to mitigate misspecification effects during learning of Hawkes mixtures, which arise in many point-process applications. The paper does not, however, supply the required derivation, citation, or targeted empirical check for the key fact, so the claimed advantage over standard EM remains unsupported.

major comments (2)

[Abstract / §3] Abstract and §3 (method description): the iterative procedure is motivated by the statement 'Based on the fact that an easy sample of the target model can be an adversarial sample of a misspecified model' without derivation, citation, or any empirical verification specific to Hawkes processes (e.g., checking whether high-likelihood sequences under the fitted mixture exhibit systematically lower likelihood under plausible misspecifications such as wrong number of components or incorrect branching ratios). This assumption is load-bearing for the claimed superiority over standard mixture EM.
[Experiments] Experimental section: the reported outperformance is presented without ablation that isolates the contribution of the adversarial self-paced selection versus plain data augmentation or standard self-paced learning; therefore it is impossible to attribute gains to the unverified fact rather than to other implementation choices.
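The empirical check requested in the first major comment amounts to comparing a sequence's log-likelihood under the target parameters with its log-likelihood under a deliberately misspecified set. A minimal sketch follows, assuming a univariate exponential-kernel Hawkes process; `hawkes_loglik`, `likelihood_gap`, and all numeric values are hypothetical illustrations rather than anything reported in the paper.

```python
import math


def hawkes_loglik(times, T, mu, alpha, beta):
    """Log-likelihood on [0, T] for the intensity
    lambda(t) = mu + alpha*beta*sum_{t_i < t} exp(-beta*(t - t_i)).

    `times` must be sorted; alpha is the branching ratio.
    """
    loglik = -mu * T  # baseline part of the compensator
    decay_sum = 0.0   # sum_{j < i} exp(-beta*(t_i - t_j)), updated recursively
    prev = None
    for t in times:
        if prev is not None:
            decay_sum = math.exp(-beta * (t - prev)) * (1.0 + decay_sum)
        loglik += math.log(mu + alpha * beta * decay_sum)
        # Each event's kernel contributes its integral over (t, T].
        loglik -= alpha * (1.0 - math.exp(-beta * (T - t)))
        prev = t
    return loglik


def likelihood_gap(times, T, target, misspecified):
    """Positive values mean the sequence fits `target` better."""
    return hawkes_loglik(times, T, *target) - hawkes_loglik(times, T, *misspecified)


# Illustrative check: a bursty sequence scored under two parameter sets,
# each given as (mu, alpha, beta). The second set has a wrong branching ratio.
sequence = [0.5, 0.6, 0.7, 3.0, 3.1, 3.2, 7.5]
gap = likelihood_gap(sequence, 10.0, (0.3, 0.6, 5.0), (0.3, 0.1, 5.0))
print(gap)
```

Running this over many sequences, and plotting the gap against the log-likelihood under the target, would show whether the sequences the target finds easiest are also the ones the misspecified model fits worst.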

minor comments (2)

[§3] Notation for the augmented sequences and the adversarial loss should be introduced with explicit definitions before the iterative algorithm is described.
[Experiments] The abstract claims 'consistent' outperformance; the experimental tables should report per-dataset standard deviations or statistical significance tests to support that wording.
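The reporting asked for in the last minor comment can be produced with the standard library alone. The sketch below computes a mean and sample standard deviation per method and an exact paired sign-flip permutation test; the choice of test, the function names, and the placeholder scores are assumptions made for illustration and are not results from the paper.

```python
import itertools
import statistics


def summarize(scores):
    """Mean and sample standard deviation across repeated runs."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_sign_flip_pvalue(a, b):
    """Exact two-sided sign-flip test on paired differences a[i] - b[i].

    Enumerates all 2**n sign patterns, so it suits a small number of runs.
    """
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(sum(diffs))
    hits = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float round-off
            hits += 1
    return hits / total


# Placeholder scores for two methods over five paired runs (made-up numbers).
method_a = [0.81, 0.79, 0.84, 0.80, 0.83]
method_b = [0.78, 0.77, 0.80, 0.79, 0.80]
print(summarize(method_a), summarize(method_b))
print(paired_sign_flip_pvalue(method_a, method_b))
```

With five paired runs the smallest attainable two-sided p-value is 2/32, so a claim of consistent outperformance at conventional significance levels needs more repetitions per dataset.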

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address the two major points below and will revise the manuscript to strengthen the presentation.

Referee: [Abstract / §3] Abstract and §3 (method description): the iterative procedure is motivated by the statement 'Based on the fact that an easy sample of the target model can be an adversarial sample of a misspecified model' without derivation, citation, or any empirical verification specific to Hawkes processes (e.g., checking whether high-likelihood sequences under the fitted mixture exhibit systematically lower likelihood under plausible misspecifications such as wrong number of components or incorrect branching ratios). This assumption is load-bearing for the claimed superiority over standard mixture EM.

Authors: We agree that the motivating statement would benefit from additional justification. The claim draws from general principles in adversarial and self-paced learning, but the original manuscript presents it without a dedicated derivation, citation, or Hawkes-specific verification. In the revision we will expand §3 with a short discussion of the intuition, relevant citations from the adversarial learning literature, and a targeted synthetic experiment that checks likelihoods of high-likelihood sequences under controlled misspecifications (wrong component count, altered branching ratios). revision: yes
Referee: [Experiments] Experimental section: the reported outperformance is presented without ablation that isolates the contribution of the adversarial self-paced selection versus plain data augmentation or standard self-paced learning; therefore it is impossible to attribute gains to the unverified fact rather than to other implementation choices.

Authors: We concur that the current experiments do not isolate the adversarial self-paced component. The revised manuscript will add ablation studies that compare the full method against (i) data augmentation alone and (ii) standard self-paced learning without the adversarial selection step. These results will be reported alongside the existing comparisons to clarify the source of the observed gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper proposes an adversarial self-paced learning method for Hawkes process mixtures, invoking the assumption that 'an easy sample of the target model can be an adversarial sample of a misspecified model' to justify the mechanism. This assumption is stated without derivation from the paper's own equations or reduction to fitted inputs, and the central claim of consistent outperformance rests on experimental results rather than any self-definitional equivalence, fitted prediction renamed as result, or self-citation load-bearing chain. No steps match the enumerated circularity patterns; the derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract only; the single domain assumption below is extracted directly from the text. No free parameters or invented entities are mentioned.

axioms (1)

domain assumption: An easy sample of the target model can be an adversarial sample of a misspecified model.
Invoked explicitly to enable the adversarial self-paced maximum-likelihood update step.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages

[1] Barreno, M., Nelson, B., Sears, R., Joseph, A. D., and Tygar, J. D. Can machine learning be secure? In Proceedings of the 2006 ACM Symposium on Information, computer and communications security, pp. 16–25. ACM, 2006.

[2] Bengio, Y., Louradour, J., Collobert, R., and Weston, J. Curriculum learning. In Proceedings of the 26th annual international conference on machine learning, pp. 41–48. ACM, 2009.

[3] Hawkes, A. G. Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1): 83–90, 1971.

[4] Huang, L., Joseph, A. D., Nelson, B., Rubinstein, B. I., and Tygar, J. Adversarial machine learning. In Proceedings of the 4th ACM workshop on Security and artificial intelligence, pp. 43–58. ACM, 2011.

[5] Johnson, A. E., Pollard, T. J., Shen, L., Li-wei, H. L., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., and Mark, R. G. MIMIC-III, a freely accessible critical care database. Scientific data, 3: 160035, 2016.

[6] Kumar, M. P., Packer, B., and Koller, D. Self-paced learning for latent variable models. In Advances in Neural Information Processing Systems, pp. 1189–1197, 2010.

[7] Liu, W. and Chawla, S. A game theoretical model for adversarial learning. In 2009 IEEE International Conference on Data Mining Workshops, pp. 25–30. IEEE, 2009.

[8] Lowd, D. and Meek, C. Adversarial learning. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, pp. 641–647. ACM, 2005.

[9] Luo, D., Xu, H., Zha, H., Du, J., Xie, R., Yang, X., and Zhang, W. You are what you watch and when you watch: Inferring household structures from IPTV viewing data. IEEE Transactions on Broadcasting, 60(1): 61–72, 2014.

[10] Luo, D., Xu, H., Zhen, Y., Ning, X., Zha, H., Yang, X., and Zhang, W. Multi-task multi-dimensional Hawkes processes for modeling event sequences. In Proceedings of the 24th International Conference on Artificial Intelligence, pp. 3685–3691. AAAI Press, 2015.

[11] Xu, H. and Zha, H. A Dirichlet mixture model of Hawkes processes for event sequence clustering. In Advances in Neural Information Processing Systems, pp. 1354–1363, 2017.

[12] Xu, H., Luo, D., and Zha, H. Learning Hawkes processes from short doubly-censored event sequences. In International Conference on Machine Learning, pp. 3831–3840, 2017.

[13] Xu, H., Carin, L., and Zha, H. Learning registered point processes from idiosyncratic observations. In International Conference on Machine Learning, 2018a.

[14] Xu, H., Luo, D., Chen, X., and Carin, L. Benefits from superposed Hawkes processes. In International Conference on Artificial Intelligence and Statistics, pp. 623–631, 2018b.

[15] Yang, S.-H. and Zha, H. Mixture of mutually exciting processes for viral diffusion. In International Conference on Machine Learning, pp. 1–9, 2013.
