pith. machine review for the scientific record.

arxiv: 2605.06726 · v1 · submitted 2026-05-07 · 💻 cs.LG

Recognition: no theorem link

Transformer-Based Wildlife Species Classification from Daily Movement Trajectories

Pith reviewed 2026-05-11 00:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords wildlife species classification · movement trajectories · transformer models · GPS data · sequence models · animal behavior · species identification · balanced accuracy

The pith

Transformers classify wildlife species from daily movement trajectories more accurately than LSTM or CNN models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests whether sequence models can tell apart seven wildlife species using only their GPS-recorded daily paths. The evaluation holds out entire studies or regions to check if patterns generalize beyond the training data. Transformers deliver consistent improvements of 8 to 22 percentage points in balanced accuracy over LSTM, CNN, and temporal convolutional networks. Adding descriptors such as speed, direction, and turning angles boosts results especially for species with limited records. One-hour sampling proves more reliable across datasets than thirty-minute intervals because it minimizes missing values while still capturing key behaviors.

Core claim

Transformer-based sequence models applied to wildlife GPS trajectories achieve higher balanced accuracy than LSTM, CNN, and TCN models when entire telemetry studies or regions are held out from training. In an elephant binary classification task with 1-hour resolution, the Transformer reaches 0.83 balanced accuracy and 0.92 AUC. Performance gains are clearest with augmented movement features for underrepresented species like lions and zebras. A unified 1-hour resolution yields better cross-study results than finer 30-minute sampling.
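
For readers less familiar with the two metrics quoted above, the following is a minimal sketch in plain Python. It is not the paper's evaluation code: the function names and the toy labels are assumptions made here for illustration, and the numbers it prints have no connection to the reported 0.83 and 0.92.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def roc_auc(y_true, scores):
    """Binary AUC: probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = elephant, 0 = any other species.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.1, 0.2, 0.3, 0.35, 0.15, 0.05, 0.25, 0.6]

print(balanced_accuracy(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Balanced accuracy averages recall per class, so a majority class cannot dominate the score; that is presumably why it is the headline metric when some species have few records.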

What carries the argument

Transformer sequence models processing time series of position and derived movement features such as displacement, speed, and turning to predict species labels.
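
The abstract names displacement, speed, direction, and turning as the derived features but gives no formulas. The sketch below shows one common way to derive them from consecutive GPS fixes. Everything here is an assumption for illustration: the haversine distance, the initial-bearing heading, the zero turning angle assigned to the first step, and all function and key names are choices made here, not the authors' definitions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_rad(lat1, lon1, lat2, lon2):
    """Initial bearing from point 1 to point 2 in radians (0 = north, pi/2 = east)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.atan2(y, x)


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def movement_features(fixes):
    """fixes: time-ordered list of (t_seconds, lat_deg, lon_deg).

    Returns one dict per step with step length, speed, heading, and turning angle.
    The first step has no previous heading, so its turning angle is set to 0.0.
    """
    feats = []
    prev_heading = None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        step = haversine_m(la0, lo0, la1, lo1)
        dt = t1 - t0
        heading = bearing_rad(la0, lo0, la1, lo1)
        turn = 0.0 if prev_heading is None else wrap_angle(heading - prev_heading)
        feats.append({
            "step_m": step,
            "speed_mps": step / dt if dt > 0 else 0.0,
            "heading_rad": heading,
            "turn_rad": turn,
        })
        prev_heading = heading
    return feats


# Hypothetical hourly fixes: 0.01 degrees north, then 0.01 degrees east.
fixes = [(0, 0.0, 0.0), (3600, 0.01, 0.0), (7200, 0.01, 0.01)]
feats = movement_features(fixes)
for f in feats:
    print(f)
```

In this toy track the second step turns roughly a quarter circle to the right, which shows up as a turning angle close to pi/2.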

If this is right

  • Species identity can be inferred from movement trajectories alone under realistic hold-out conditions.
  • Feature augmentation with speed, direction, and turning behavior improves classification for sparsely sampled species.
  • One-hour temporal resolution provides more consistent performance across multiple studies by reducing data gaps.
  • Binary tasks such as elephant detection reach AUC of 0.92, suggesting practical utility for common species.
  • Attention mechanisms in Transformers better capture long-range dependencies in daily trajectories than recurrent alternatives.
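
The resolution point in the list above can be made concrete with a toy example. The sketch below snaps irregular fixes onto a regular time grid and reports the fraction of empty slots. It is an illustration only: the nearest-fix rule, the tolerance parameter, and the function name are assumptions made here, and the paper's actual resampling procedure is not described in the abstract.

```python
def resample_to_grid(fixes, step_s, tolerance_s):
    """Snap time-ordered (t_seconds, lat, lon) fixes onto a grid of spacing step_s.

    Each grid slot takes the nearest fix if it lies within tolerance_s, else None.
    Keep tolerance_s below step_s / 2 so that no fix fills two slots.
    Returns (grid, missing_fraction).
    """
    if not fixes:
        return [], 0.0
    start, end = fixes[0][0], fixes[-1][0]
    grid = []
    i = 0
    t = start
    while t <= end:
        # Advance the pointer while the next fix is at least as close to t.
        while i + 1 < len(fixes) and abs(fixes[i + 1][0] - t) <= abs(fixes[i][0] - t):
            i += 1
        grid.append(fixes[i] if abs(fixes[i][0] - t) <= tolerance_s else None)
        t += step_s
    missing = sum(1 for g in grid if g is None) / len(grid)
    return grid, missing


# Hypothetical collar that logs hourly and drops the fix at t = 10800 s.
fixes = [(0, -1.0, 35.0), (3600, -1.01, 35.0), (7200, -1.02, 35.01),
         (14400, -1.03, 35.02), (18000, -1.04, 35.02)]

grid_1h, miss_1h = resample_to_grid(fixes, step_s=3600, tolerance_s=300)
grid_30, miss_30 = resample_to_grid(fixes, step_s=1800, tolerance_s=300)
print(len(grid_1h), miss_1h)
print(len(grid_30), miss_30)
```

For a collar that records roughly once per hour, forcing a 30-minute grid leaves more than half of the slots empty in this toy case, which is consistent with the abstract's argument that a unified 1-hour grid reduces missing data.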

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Ecologists might use these models to monitor species presence in areas without direct observation infrastructure.
  • Similar trajectory classifiers could be tested on other taxa such as birds or fish to see if movement signatures are broadly taxonomic.
  • The approach opens a path toward using legacy collar datasets for retrospective species distribution modeling.

Load-bearing premise

Movement trajectories contain enough species-specific patterns that generalize across different telemetry studies and regions when entire studies or areas are held out from training.
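
To make the premise testable, every trajectory from a held-out study must be absent from training. A minimal sketch of such a split follows, assuming each trajectory carries a study identifier. The function name and the toy identifiers are hypothetical, and the paper's exact fold construction (including how regions are defined) is not given in the abstract.

```python
from collections import defaultdict


def leave_one_study_out(study_ids):
    """Yield (held_out_study, train_idx, test_idx), holding out one whole study per fold."""
    by_study = defaultdict(list)
    for i, s in enumerate(study_ids):
        by_study[s].append(i)
    for held_out in sorted(by_study):
        test_idx = by_study[held_out]
        train_idx = [i for s, idx in by_study.items() if s != held_out for i in idx]
        yield held_out, train_idx, test_idx


# Hypothetical study labels, one per daily trajectory.
study_ids = ["A", "A", "B", "B", "B", "C"]
splits = list(leave_one_study_out(study_ids))
for held_out, train_idx, test_idx in splits:
    print(held_out, train_idx, test_idx)
```

Splitting by study rather than by trajectory prevents the same animal or collar configuration from appearing on both sides of the split. It does not, on its own, separate species signal from regional covariates, which is the concern raised in the referee report.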

What would settle it

Applying the trained Transformer to trajectories from a new species or a held-out continent and finding accuracy no better than the baselines or chance level would falsify the generalization result.

Figures

Figures reproduced from arXiv: 2605.06726 by Obed Irakoze, Prasenjit Mitra.

Figure 1: Confusion matrices for the best-performing Trans…

Original abstract

Inferring the identity of wildlife species from daily movement data alone is a challenging task. We train sequence models on large-scale, 7-species GPS trajectories from the Movebank platform. Trajectory models are evaluated using a protocol in which entire telemetry studies or regions are held out during testing. We compare Transformer-based sequence models to LSTM, CNN, and Temporal Convolutional Networks, and find that Transformers consistently achieve higher balanced accuracy with gains of approximately 8 to 22 percentage points, depending on the species and experimental setting. In an elephant binary classification task with 1-hour resolution, the Transformer achieves a balanced accuracy of 0.83 and an AUC of 0.92, substantially outperforming all baseline models. We examine, under data-limited conditions, feature representations by analyzing the differences between a basic displacement-based encoding and an expanded range of movement descriptors that include speed, direction, and turning behavior. With feature augmentation, we see clear performance gains, especially for underrepresented and sparsely represented species, such as large carnivores, lions, and zebras. Finally, experiments comparing 1-hour and 30-minute temporal resolutions show that while finer sampling can capture short-term movement patterns for some species, a unified 1-hour resolution yields more promising performance across studies by reducing missing data and ensuring consistent temporal coverage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript trains sequence models (Transformers, LSTMs, CNNs, TCNs) on 7-species GPS trajectories from Movebank and evaluates them under an entire-study or entire-region hold-out protocol. It reports that Transformers achieve 8–22 percentage point gains in balanced accuracy over baselines, with a specific elephant binary task reaching 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution. Additional experiments compare basic displacement encodings against augmented features (speed, direction, turning) and contrast 1-hour versus 30-minute temporal resolutions.

Significance. If the hold-out results survive controls for study-specific effects, the work would demonstrate a practical advance in automated species identification from telemetry data, with potential utility for conservation monitoring on large external datasets. The use of real-world Movebank trajectories and explicit hold-out protocols is a methodological strength.

major comments (2)
  1. [Abstract] Abstract and experimental protocol section: the headline claim of consistent 8–22 pp balanced-accuracy gains under entire-study/region hold-out is load-bearing for the generalization argument, yet the manuscript provides no counts of studies per species, no per-hold-out performance variance, and no ablation that isolates species movement signatures from regional covariates (habitat, sampling bias, missingness patterns).
  2. [Results] Results section (elephant binary task): the reported 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution are presented without statistical significance tests or confidence intervals across multiple hold-out folds, making it impossible to assess whether the Transformer advantage is robust or could be explained by study-level confounders.
minor comments (2)
  1. [Abstract] Abstract: '30-minutetemporal' is a typographical error and should read '30-minute temporal'.
  2. [Methods] Feature-augmentation paragraph: the description of the expanded movement descriptors (speed, direction, turning) would benefit from explicit formulas or pseudocode for how these are computed from raw GPS points.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which underscore the need for greater transparency in our dataset statistics and statistical rigor in the reported results. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core claims.

  1. Referee: [Abstract] Abstract and experimental protocol section: the headline claim of consistent 8–22 pp balanced-accuracy gains under entire-study/region hold-out is load-bearing for the generalization argument, yet the manuscript provides no counts of studies per species, no per-hold-out performance variance, and no ablation that isolates species movement signatures from regional covariates (habitat, sampling bias, missingness patterns).

    Authors: We agree that explicit counts of studies per species and variance across hold-outs would improve interpretability of the generalization results. In the revised manuscript we will add a table in the experimental protocol section reporting the number of studies and trajectories per species, together with mean and standard deviation of balanced accuracy across all entire-study and region hold-out folds. For isolating movement signatures from regional covariates, the region hold-out protocol already removes entire geographic areas; however, a full ablation controlling for habitat and sampling bias would require additional metadata not uniformly available in Movebank. We will therefore add a targeted discussion of these potential confounders and, where metadata permits, a limited post-hoc analysis comparing performance on subsets with similar habitat characteristics. revision: partial

  2. Referee: [Results] Results section (elephant binary task): the reported 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution are presented without statistical significance tests or confidence intervals across multiple hold-out folds, making it impossible to assess whether the Transformer advantage is robust or could be explained by study-level confounders.

    Authors: We acknowledge the absence of statistical tests and confidence intervals for the elephant binary task. In the revision we will recompute the 1-hour elephant results across the available hold-out folds, report 95% bootstrap confidence intervals for balanced accuracy and AUC, and include paired statistical tests (McNemar’s test on predictions and a t-test on fold-wise balanced accuracies) to quantify whether the Transformer’s advantage over baselines is statistically significant. These additions will directly address concerns about study-level confounders. revision: yes
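
The statistics promised in the second response could look roughly like the sketch below: a percentile bootstrap interval for balanced accuracy and an exact two-sided McNemar test on paired predictions. This is a hedged illustration, not the authors' planned analysis. The function names, the toy predictions, and the choice of the exact binomial form of McNemar's test are assumptions made here. Note also that resampling individual trajectories, as done here, ignores study-level clustering; resampling whole studies or folds would be the more conservative choice.

```python
import math
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(1 for i in idx if y_pred[i] == c) / len(idx))
    return sum(recalls) / len(recalls)


def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy (sample-level resampling)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(balanced_accuracy([y_true[i] for i in idx],
                                       [y_pred[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar p-value computed from the discordant pairs."""
    only_a = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    only_b = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b == t)
    n = only_a + only_b
    if n == 0:
        return 1.0  # the two models never disagree in correctness
    k = min(only_a, only_b)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical toy predictions: model A is always right, model B misses every positive.
y_true = [1] * 5 + [0] * 5
pred_a = list(y_true)
pred_b = [0] * 10

ci_a = bootstrap_ci(y_true, pred_a)
p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(ci_a, p_value)
```

With only five discordant pairs, even a model that wins every one of them does not reach the conventional 0.05 threshold, which illustrates why the number of held-out trajectories and folds matters for the referee's robustness concern.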

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation on external data

The paper reports standard ML training and hold-out evaluation of sequence classifiers on Movebank GPS trajectories. No mathematical derivations, self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear. All reported metrics (balanced accuracy, AUC) are computed on independent study/region hold-outs, so results do not reduce to the authors' own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on empirical ML performance rather than new theoretical derivations; standard neural network training assumptions apply but no custom free parameters, axioms, or invented entities are introduced beyond typical deep learning practice.


