AMAD: Adversarial Multiscale Anomaly Detection on High-Dimensional and Time-Evolving Categorical Data

Chi Ma; Hang Xiang; Hongsong Li; Kai Sun; Lin Guo; Xiao Ma; Xiaoqiang Zhu; Xiaozhong Liu; Zheng Gao

arxiv: 1907.06582 · v1 · pith:GI424IC7new · submitted 2019-07-12 · 💻 cs.LG · stat.ML


Pith reviewed 2026-05-24 22:48 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords anomaly detection · adversarial autoencoder · recurrent neural network · attention mechanism · multiscale representations · time-evolving categorical data · two-resolution detector · unlabeled data


The pith

A hybrid model combines adversarial autoencoders with recurrent networks and attention to detect anomalies at multiple scales in unlabeled time-evolving categorical data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that an end-to-end architecture merging adversarial autoencoders and recurrent neural networks can learn representations across scales using attention mechanisms. This enables a two-resolution detector that flags anomalies in both single instances and data blocks. The method targets domains like cyber security and online recommendations where data is high-dimensional, categorical, and changes over time without available labels. If the approach holds, it would support monitoring irregular patterns at varying resolutions on such data streams.

Core claim

The paper claims that a unified end-to-end model combining the advantages of Adversarial Autoencoder and Recurrent Neural Network learns data representations across different scales with attention mechanisms, on which an enhanced two-resolution anomaly detector is developed for both instances and data blocks.

What carries the argument

Adversarial autoencoder combined with recurrent neural network and attention mechanisms that learns multiscale representations from unlabeled time-evolving categorical data.
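To make the embedding-plus-attention idea concrete, here is a minimal sketch of attention pooling over categorical feature embeddings. It is an illustration under assumptions, not the paper's implementation: the names `embed`, `attention_pool`, and `query`, the embedding size, and the dot-product scoring rule are all hypothetical, and the vectors are random stand-ins for parameters that a real model would learn.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(vectors, query):
    """Weight each vector by softmax(dot(vector, query)) and sum them."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(query)
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

# Hypothetical setup: random vectors stand in for learned parameters.
rng = random.Random(0)
EMBED_DIM = 4
embedding_table = {}  # categorical id -> dense vector

def embed(cat_id):
    """Look up (or lazily initialise) the dense vector for a categorical id."""
    if cat_id not in embedding_table:
        embedding_table[cat_id] = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return embedding_table[cat_id]

query = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]  # stand-in for a learned query
feature_vectors = [embed(c) for c in ("0", "10", "20")]  # three categorical ids
attribute_vector, attn_weights = attention_pool(feature_vectors, query)
```

The same pooling step could in principle be reused at a coarser scale (pooling instance vectors into a block vector), which is one plausible reading of "across different scales".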

If this is right

The model handles high-dimensional categorical features without any labeled samples.
It identifies irregular patterns simultaneously at instance level and block level.
Superior performance holds over state-of-the-art methods across three dataset types.
The approach applies directly to time-evolving data challenges in cyber security and online recommendation.
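A small sketch may help show what detection at two resolutions could look like in practice. It assumes each instance already has an anomaly score (for example a reconstruction error), and it aggregates scores over fixed-size, non-overlapping blocks by their mean. The thresholds, the block size, and the mean aggregation are illustrative assumptions, not the paper's detector.

```python
from statistics import mean

def instance_flags(scores, threshold):
    """Flag every instance whose score exceeds the instance-level threshold."""
    return [s > threshold for s in scores]

def block_scores(scores, block_size):
    """Mean score of each consecutive, non-overlapping block of instances."""
    return [mean(scores[i:i + block_size]) for i in range(0, len(scores), block_size)]

# Hypothetical per-instance anomaly scores for 12 time-ordered instances.
scores = [0.1, 0.2, 0.1, 0.9, 0.6, 0.7, 0.65, 0.6, 0.1, 0.15, 0.1, 0.2]

inst = instance_flags(scores, threshold=0.8)           # instance resolution
blocks = block_scores(scores, block_size=4)            # block resolution
block_alerts = [b > 0.5 for b in blocks]
```

In this toy series the instance detector fires only on the single spike at index 3, while the block detector fires only on the second block, where every score is moderately elevated but none crosses the instance threshold. That is the kind of collective pattern a block-level view is meant to surface.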

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The two-resolution design could support hierarchical alerting systems that escalate from single events to group patterns.
End-to-end training might reduce the need for separate feature engineering steps in streaming categorical data pipelines.
Attention across scales could be tested on other sequential data formats such as text or sensor logs.

Load-bearing premise

The adversarial autoencoder plus recurrent neural network architecture with attention mechanisms can learn effective multiscale representations from unlabeled time-evolving categorical data.
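For orientation, the generic adversarial autoencoder objective (Makhzani et al., 2015) can be written as below. This is the standard formulation, not necessarily AMAD's exact loss, which the abstract does not spell out. The symbols $E$ (encoder), $D$ (decoder), $C$ (latent-space discriminator), $p(z)$ (chosen prior), and $\ell$ (a reconstruction loss) are notation assumed here.

```latex
\begin{aligned}
\mathcal{L}_{\mathrm{rec}}(E, D) &= \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\ell\big(x, D(E(x))\big)\big], \\
\min_{E}\,\max_{C}\;\mathcal{L}_{\mathrm{adv}}(E, C) &= \mathbb{E}_{z \sim p(z)}\big[\log C(z)\big] + \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log\big(1 - C(E(x))\big)\big].
\end{aligned}
```

The first term trains the encoder and decoder to reconstruct the input; the second pushes the distribution of encoded points toward the prior, so that points the model encodes or reconstructs poorly can be treated as anomaly candidates.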

What would settle it

A new dataset from cyber security or recommendation domains on which the proposed model fails to outperform standard baselines in detecting anomalies at instance or block level would falsify the central claim.
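Such a comparison is usually made with a threshold-free ranking metric. The sketch below computes AUROC from scratch for two scorers on the same labeled evaluation set. The labels and scores are made-up toy values, and `model_scores` and `baseline_scores` are hypothetical names; nothing here reproduces the paper's results.

```python
def auroc(labels, scores):
    """Probability that a random anomaly outranks a random normal point.

    Ties between an anomaly and a normal point count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal point")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy evaluation set: 1 marks an anomaly, 0 a normal instance.
labels          = [0,   0,   1,   0,   1,   0,   0,   1]
model_scores    = [0.1, 0.3, 0.8, 0.2, 0.7, 0.4, 0.1, 0.9]
baseline_scores = [0.2, 0.6, 0.5, 0.1, 0.7, 0.4, 0.3, 0.3]

model_auc = auroc(labels, model_scores)
baseline_auc = auroc(labels, baseline_scores)
model_wins = model_auc > baseline_auc
```

The same function applies at block level if `labels` and `scores` describe blocks rather than instances, so both resolutions can be checked with one metric.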

Figures

Figures reproduced from arXiv: 1907.06582 by Chi Ma, Hang Xiang, Hongsong Li, Kai Sun, Lin Guo, Xiao Ma, Xiaoqiang Zhu, Xiaozhong Liu, Zheng Gao.

**Figure 2.** The overall architecture of AMAD.

3.2.1 Feature and Attribute Representation. For the input layer, sparse embedding is implemented to embed each categorical feature to a fixed-size dense vector $v^F$, which is automatically learned during the training process. For each attribute, its representation vector $v^A$ is extracted from all the embedding vectors of its input feature collection $\{v^F_1, \ldots, v^F_{N_A}\}$…

**Figure 3.** The full model's performance on block-level detection.

Original abstract

Anomaly detection faces emerging challenges in many important industry domains, such as cyber security and online recommendation and advertising. The recent trend in these areas calls for anomaly detection on time-evolving data with high-dimensional categorical features and without labeled samples. There is also an increasing demand for identifying and monitoring irregular patterns at multiple resolutions. In this work, we propose a unified end-to-end approach to these challenges that combines the advantages of the Adversarial Autoencoder and the Recurrent Neural Network. The model learns data representations across different scales with attention mechanisms, on which an enhanced two-resolution anomaly detector is developed for both instances and data blocks. Extensive experiments are performed over three types of datasets to demonstrate the efficacy of our method and its superiority over state-of-the-art approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proposes AMAD, a unified end-to-end model combining Adversarial Autoencoder (AAE) and Recurrent Neural Network (RNN) with attention mechanisms to learn multiscale representations from high-dimensional time-evolving categorical data. It introduces an enhanced two-resolution anomaly detector operating on both instances and data blocks, and claims to demonstrate superiority over state-of-the-art methods via extensive experiments on three types of datasets from domains such as cyber security and online recommendation.

Significance. If the empirical results are robust, the work provides a practical framework for unsupervised multiscale anomaly detection on unlabeled categorical time series, addressing real industry needs. The architecture builds on established components (AAE + RNN + attention), so novelty lies in the specific multiscale integration and detector design rather than foundational innovation. No machine-checked proofs, reproducible code releases, or parameter-free derivations are described; the contribution is empirical.

minor comments (3)

[Abstract] The claims of 'extensive experiments' and 'superiority over the state-of-the-art' are stated without any quantitative metrics, baseline names, dataset identifiers, or error bars. This weakens the abstract's ability to convey the central empirical claim.
[Method] The manuscript should clarify the precise definitions of the two resolutions in the anomaly detector (instance-level vs. block-level) and how attention weights are aggregated across scales; the current high-level description leaves implementation details too ambiguous for reproduction.
[Experiments] The experimental section should include statistical significance tests or confidence intervals when reporting superiority over baselines, to support the cross-dataset claims.
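One way to act on the request for confidence intervals is a paired bootstrap over the evaluation set. The sketch below resamples instances with replacement and reports a percentile interval for the AUROC difference between two scorers. The data are toy values, the function names are hypothetical, and the percentile bootstrap is only one of several reasonable choices.

```python
import random

def auroc(labels, scores):
    """Fraction of (anomaly, normal) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc_diff(labels, scores_a, scores_b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC(a) - AUROC(b) on paired scores."""
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample needs both classes for AUROC to be defined
        a = auroc(ys, [scores_a[i] for i in idx])
        b = auroc(ys, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy evaluation set: 1 marks an anomaly, 0 a normal instance.
labels          = [0,   0,   1,   0,   1,   0,   0,   1]
model_scores    = [0.1, 0.3, 0.8, 0.2, 0.7, 0.4, 0.1, 0.9]
baseline_scores = [0.2, 0.6, 0.5, 0.1, 0.7, 0.4, 0.3, 0.3]

ci_low, ci_high = bootstrap_auroc_diff(labels, model_scores, baseline_scores)
```

An interval that excludes zero would support a claim of superiority on that dataset. With only eight points, as here, the interval is far too wide to be meaningful, which is why a real evaluation would need much larger test sets.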

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary of our work and the recommendation of minor revision. The referee accurately describes the AMAD model, its components, and the empirical focus. No specific major comments appear in the provided report, so we have no individual points to rebut or revise at this time. We remain available to address any minor issues or clarifications requested by the editor.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper proposes an empirical architecture (AAE + RNN with attention) for multiscale anomaly detection on categorical time series and validates it via experiments on three dataset types. No equations, derivations, or parameter-fitting steps are described in the abstract or high-level claims that reduce a claimed prediction or result to the inputs by construction. The central claim is an empirical demonstration of effectiveness rather than a self-referential mathematical derivation, so the argument structure contains no load-bearing circular steps of the enumerated kinds.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no equations, so no free parameters, axioms, or invented entities can be identified; the model itself is the proposed contribution but its internal assumptions remain unstated.


Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 12 internal anchors

[1] Samet Akcay, Amir Atapour-Abarghouei, and Toby P Breckon. 2018. GANomaly: Semi-Supervised Anomaly Detection via Adversarial Training. arXiv preprint arXiv:1805.06725 (2018)

[2] Jerone TA Andrews, Edward J Morton, and Lewis D Griffin. 2016. Detecting anomalous data using auto-encoders. International Journal of Machine Learning and Computing 6, 1 (2016), 21

[3] Raghavendra Chalapathy, Aditya Krishna Menon, and Sanjay Chawla. 2017. Robust, deep and inductive anomaly detection. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 36–51

[4] Raghavendra Chalapathy, Aditya Krishna Menon, and Sanjay Chawla. 2018. Anomaly Detection using One-Class Neural Networks. arXiv preprint arXiv:1802.06360 (2018)

[5] Chih-Chung Chang and Chih-Jen Lin. 2011. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST) 2, 3 (2011), 27

[6] Jinghui Chen, Saket Sathe, Charu Aggarwal, and Deepak Turaga. 2017. Outlier detection with autoencoder ensembles. In Proceedings of the 2017 SIAM International Conference on Data Mining. SIAM, 90–98

[7] Antonia Creswell, Kai Arulkumaran, and Anil A Bharath. 2017. On denoising autoencoders trained to minimise binary cross-entropy. arXiv preprint arXiv:1708.08487 (2017)

[8] Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, and Anil A Bharath. 2018. Generative adversarial networks: An overview. IEEE Signal Processing Magazine 35, 1 (2018), 53–65

[9] Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. 2016. Adversarial feature learning. arXiv preprint arXiv:1605.09782 (2016)

[10] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, and Aaron Courville. 2016. Adversarially learned inference. arXiv preprint arXiv:1606.00704 (2016)

[11] Zheng Gao, Gang Fu, Chunping Ouyang, Satoshi Tsutsui, Xiaozhong Liu, and Ying Ding. 2018. edge2vec: Learning Node Representation Using Edge Semantics. arXiv preprint arXiv:1809.02269 (2018)

[12] Zizhe Gao, Zheng Gao, Heng Huang, Zhuoren Jiang, and Yuliang Yan. 2018. An End-to-end Model of Predicting Diverse Ranking On Heterogeneous Feeds. (2018)

[13] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in Neural Information Processing Systems. 2672–2680

[14] Farrokh Habibzadeh, Parham Habibzadeh, and Mahboobeh Yadollahie. 2016. On determining the most appropriate test cut-off value: the case of tests with continuous results. Biochemia medica: Biochemia medica 26, 3 (2016), 297–307

[15] Mark Kliger and Shachar Fleishman. 2018. Novelty Detection with GAN. arXiv preprint arXiv:1802.10560 (2018)

[16] Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In International Conference on Machine Learning. 1188–1196

[17] Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio. 2017. A structured self-attentive sentence embedding. arXiv preprint arXiv:1703.03130 (2017)

[18] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. 2012. Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data (TKDD) 6, 1 (2012), 3

[19] Xiao Ma, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, and Kun Gai. 2018. Entire space multi-task model: An effective approach for estimating post-click conversion rate. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. ACM, 1137–1140

[20] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. 2015. Adversarial autoencoders. arXiv preprint arXiv:1511.05644 (2015)

[21] Mehdi Mirza and Simon Osindero. 2014. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014)

[22] Lukas Ruff, Nico Görnitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Robert Vandermeulen, Alexander Binder, Emmanuel Müller, and Marius Kloft. 2018. Deep one-class classification. In International Conference on Machine Learning. 4390–4399

[23] Mohammad Sabokrou, Mohammad Khalooei, Mahmood Fathy, and Ehsan Adeli. 2018. Adversarially Learned One-Class Classifier for Novelty Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3379–3388

[24] Doyen Sahoo, Quang Pham, Jing Lu, and Steven CH Hoi. 2017. Online deep learning: Learning deep neural networks on the fly. arXiv preprint arXiv:1711.03705 (2017)

[25] Mayu Sakurada and Takehisa Yairi. 2014. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In Proceedings of the MLSDA 2014 2nd Workshop on Machine Learning for Sensory Data Analysis. ACM, 4

[26] Thomas Schlegl, Philipp Seeböck, Sebastian M Waldstein, Ursula Schmidt-Erfurth, and Georg Langs. 2017. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In International Conference on Information Processing in Medical Imaging. Springer, 146–157

[27] Bernhard Schölkopf, John C. Platt, John C. Shawe-Taylor, Alex J. Smola, and Robert C. Williamson. 2001. Estimating the Support of a High-Dimensional Distribution. Neural Comput. 13, 7 (July 2001), 1443–1471. DOI: http://dx.doi.org/10.1162/089976601750264965

[28] David MJ Tax and Robert PW Duin. 2004. Support vector data description. Machine Learning 54, 1 (2004), 45–66

[29] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems. 5998–6008

[30] Yongzhen Wang, Xiaozhong Liu, and Zheng Gao. 2019. Neural Related Work Summarization with a Joint Context-driven Attention Mechanism. arXiv preprint arXiv:1901.09492 (2019)

[31] Houssam Zenati, Manon Romain, Chuan-Sheng Foo, Bruno Lecouat, and Vijay Chandrasekhar. 2018. Adversarially Learned Anomaly Detection. In 2018 IEEE International Conference on Data Mining (ICDM). IEEE, 727–736

[32] Chong Zhou and Randy C Paffenroth. 2017. Anomaly detection with robust deep autoencoders. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 665–674

[33] Guorui Zhou, Na Mou, Ying Fan, Qi Pi, Weijie Bian, Chang Zhou, Xiaoqiang Zhu, and Kun Gai. 2018. Deep Interest Evolution Network for Click-Through Rate Prediction. arXiv preprint arXiv:1809.03672 (2018)

DLP-KDD'19, August 5, 2019, Anchorage, AK, USA. Zheng Gao et al.

7 APPENDIX

7.1 Details of Datasets

Synthetic dataset: We initialize the first instance with three categorical ids '0,10,20', and then generate the follo...
