arxiv: 2605.06726 · v1 · submitted 2026-05-07 · 💻 cs.LG

Transformer-Based Wildlife Species Classification from Daily Movement Trajectories

Obed Irakoze, Prasenjit Mitra

Pith reviewed 2026-05-11 00:52 UTC · model grok-4.3

classification 💻 cs.LG

keywords wildlife species classification · movement trajectories · transformer models · GPS data · sequence models · animal behavior · species identification · balanced accuracy

The pith

Transformers classify wildlife species from daily movement trajectories more accurately than LSTM or CNN models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tests whether sequence models can tell apart seven wildlife species using only their GPS-recorded daily paths. The evaluation holds out entire studies or regions to check if patterns generalize beyond the training data. Transformers deliver consistent improvements of 8 to 22 percentage points in balanced accuracy over LSTM, CNN, and temporal convolutional networks. Adding descriptors such as speed, direction, and turning angles boosts results especially for species with limited records. One-hour sampling proves more reliable across datasets than thirty-minute intervals because it minimizes missing values while still capturing key behaviors.

Core claim

Transformer-based sequence models applied to wildlife GPS trajectories achieve higher balanced accuracy than LSTM, CNN, and TCN models when entire telemetry studies or regions are held out from training. In an elephant binary classification task with 1-hour resolution, the Transformer reaches 0.83 balanced accuracy and 0.92 AUC. Performance gains are clearest with augmented movement features for underrepresented species like lions and zebras. A unified 1-hour resolution yields better cross-study results than finer 30-minute sampling.
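The two metrics quoted above can be made concrete with a small sketch. It assumes the standard definitions: balanced accuracy as the mean of per-class recall, and AUC as the probability that a randomly chosen positive is scored above a randomly chosen negative. The function names and the toy labels are illustrative and do not come from the paper.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so a rare class weighs as much as a common one."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)

def roc_auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = elephant, 0 = any other species.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.6]

print(balanced_accuracy(y_true, y_pred))  # 0.625
print(roc_auc(y_true, scores))            # 0.875
```

On this toy set plain accuracy would be 4/6, while balanced accuracy is 0.625 because the minority class is only half recalled. That gap is the reason balanced accuracy suits imbalanced species data.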

What carries the argument

Transformer sequence models processing time series of position and derived movement features such as displacement, speed, and turning to predict species labels.
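A minimal sketch of how per-step movement descriptors of this kind might be derived from raw GPS fixes. The abstract does not give the authors' formulas, so everything here is an assumption: great-circle (haversine) step length, initial compass bearing, and turning angle as the signed change in bearing between consecutive steps. The function names and the `(t_seconds, lat, lon)` tuple layout are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in [0, 360); not meaningful for a zero-length step."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0

def movement_features(fixes):
    """fixes: time-sorted list of (t_seconds, lat, lon). Returns one dict per step."""
    feats = []
    prev_bearing = None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        step = haversine_m(la0, lo0, la1, lo1)
        dt = t1 - t0
        speed = step / dt if dt > 0 else float("nan")
        brg = bearing_deg(la0, lo0, la1, lo1)
        # Signed heading change wrapped to [-180, 180); undefined for the first step.
        turn = None if prev_bearing is None else (brg - prev_bearing + 180.0) % 360.0 - 180.0
        feats.append({"step_m": step, "speed_mps": speed, "bearing_deg": brg, "turn_deg": turn})
        prev_bearing = brg
    return feats

# Hypothetical hourly fixes: one step due east, then one step due north.
fixes = [(0, 0.0, 0.0), (3600, 0.0, 0.01), (7200, 0.01, 0.01)]
feats = movement_features(fixes)
```

In this example the first step has a bearing of 90 degrees and the second a turning angle of -90 degrees (a left turn). A sequence model would consume these dictionaries as one feature vector per time step.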

If this is right

Species identity can be inferred from movement trajectories alone under realistic hold-out conditions.
Feature augmentation with speed, direction, and turning behavior improves classification for sparsely sampled species.
One-hour temporal resolution provides more consistent performance across multiple studies by reducing data gaps.
Binary tasks such as elephant detection reach AUC of 0.92, suggesting practical utility for common species.
Attention mechanisms in Transformers better capture long-range dependencies in daily trajectories than recurrent alternatives.
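The point about temporal resolution and data gaps can be illustrated with a small sketch. It assumes that a unified resolution is obtained by snapping irregular fixes onto a regular time grid and marking slots with no nearby fix as missing. The abstract does not describe the actual preprocessing, so `resample_to_grid`, its tolerance parameter, and the toy track are hypothetical.

```python
def resample_to_grid(fixes, step_s, tolerance_s):
    """Snap time-sorted (t_seconds, lat, lon) fixes onto a regular grid.

    A slot keeps its nearest fix if that fix lies within tolerance_s of the
    slot time, otherwise the slot is None. Keep tolerance_s below step_s / 2
    so that one fix cannot fill two slots.
    """
    if not fixes:
        return []
    grid = []
    i = 0
    t = fixes[0][0]
    end = fixes[-1][0]
    while t <= end:
        # Advance to the fix nearest to the current slot time.
        while i + 1 < len(fixes) and abs(fixes[i + 1][0] - t) <= abs(fixes[i][0] - t):
            i += 1
        nearest = fixes[i]
        grid.append(nearest if abs(nearest[0] - t) <= tolerance_s else None)
        t += step_s
    return grid

def missing_fraction(grid):
    """Share of grid slots without a usable fix."""
    return sum(slot is None for slot in grid) / len(grid)

# Hypothetical collar that reports hourly and drops one fix (at t = 10800 s).
track = [(0, -1.00, 35.00), (3600, -1.01, 35.01), (7200, -1.02, 35.02), (14400, -1.04, 35.04)]

hourly = resample_to_grid(track, step_s=3600, tolerance_s=900)
half_hourly = resample_to_grid(track, step_s=1800, tolerance_s=450)

print(missing_fraction(hourly))       # 0.2
print(missing_fraction(half_hourly))  # 5/9, about 0.56
```

For a study that records roughly once per hour, the 30-minute grid leaves more than half of its slots empty, while the hourly grid is missing only the dropped fix. This is one plausible mechanism behind the paper's observation, not a reproduction of it.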

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Ecologists might use these models to monitor species presence in areas without direct observation infrastructure.
Similar trajectory classifiers could be tested on other taxa such as birds or fish to see if movement signatures are broadly taxonomic.
The approach opens a path toward using legacy collar datasets for retrospective species distribution modeling.

Load-bearing premise

Movement trajectories contain enough species-specific patterns that generalize across different telemetry studies and regions when entire studies or areas are held out from training.
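A sketch of what a leave-one-study-out protocol could look like, assuming each trajectory record carries a study identifier and a species label. The field names `study_id` and `species`, the helper names, and the toy records are hypothetical, and the paper's actual fold construction may differ.

```python
def leave_one_study_out(records):
    """Yield (held_out_study, train, test) with whole studies kept out of training."""
    studies = sorted({r["study_id"] for r in records})
    for held_out in studies:
        train = [r for r in records if r["study_id"] != held_out]
        test = [r for r in records if r["study_id"] == held_out]
        yield held_out, train, test

def unseen_species(train, test):
    """Species that appear in the test fold but never in the training fold."""
    return {r["species"] for r in test} - {r["species"] for r in train}

# Hypothetical records: two elephant studies, two lion studies, one zebra study.
records = [
    {"study_id": "S1", "species": "elephant"},
    {"study_id": "S2", "species": "elephant"},
    {"study_id": "S3", "species": "lion"},
    {"study_id": "S4", "species": "lion"},
    {"study_id": "S5", "species": "zebra"},
]

folds = list(leave_one_study_out(records))
for held_out, train, test in folds:
    print(held_out, len(train), len(test), sorted(unseen_species(train, test)))
```

Holding out S5 leaves no zebra trajectories for training, so that fold cannot test zebra recognition at all. A species needs at least two studies before a study-level hold-out says anything about it.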

What would settle it

Applying the trained Transformer to trajectories from a new species or a held-out continent and finding accuracy no better than the baselines or chance level would falsify the generalization result.

Figures

Figures reproduced from arXiv: 2605.06726 by Obed Irakoze, Prasenjit Mitra.

Original abstract

Inferring the identity of wildlife species from daily movement data alone is a challenging task. We train sequence models on large-scale, 7-species GPS trajectories from the Movebank platform. Trajectories models are evaluated using a protocol in which entire telemetry studies or regions are heldout during testing. We compare Transformer-based sequence models to LSTM, CNN, and Temporal Convolutional Networks, and find that Transformers consistently achieve higher balanced accuracy with gains of approximately 8 to 22 percentage points, depending on the species and experimental setting. In an elephant binary classification task with 1-hour resolution, the Transformer achieves a balanced accuracy of 0.83 and an AUC of 0.92, substantially outperforming all baseline models. We examine, under data-limited conditions, feature representations by analyzing the differences between a basic displacement-based encoding and an expanded range of movement descriptors that include speed, direction, and turning behavior. With feature augmentation, we see clear performance gains, especially for underrepresented and sparsely represented species, such as large carnivores, lions, and Zebras. Finally, experiments comparing 1-hour and 30-minutetemporal resolutions show that while finer sampling can capture short-term movement patterns for some species, a unified 1-hour resolution yields more promising performance across studies by reducing missing data and ensuring consistent temporal coverage.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Transformers beat the baselines by 8-22 points on species ID from GPS tracks under study hold-outs, but the results could still be picking up study artifacts rather than pure species signals.

The main thing to know is that this paper trains sequence models on Movebank GPS trajectories for seven wildlife species and reports that Transformers consistently outperform LSTM, CNN, and TCN baselines by 8 to 22 percentage points in balanced accuracy, with an elephant binary task reaching 0.83 balanced accuracy and 0.92 AUC at one-hour resolution. They also show that adding speed, direction, and turning features helps the rarer species and that one-hour sampling gives more stable results across studies than thirty-minute data because it cuts down on missing values. The hold-out protocol that keeps entire studies or regions out of training is a reasonable step toward testing real generalization. Those are the concrete contributions.

The empirical comparisons are laid out with specific numbers instead of hand-waving, and the feature-augmentation experiments are a practical addition for handling class imbalance in telemetry data. The work is grounded in an existing public dataset rather than synthetic or self-generated examples.

The soft spot is the risk that study-level differences are doing some of the heavy lifting. Different telemetry projects often vary in habitat, collar type, sampling frequency, and missing-data patterns; if those factors correlate with species, a model can score well on hold-out accuracy without having learned species-specific movement rules. The abstract states the hold-out design but gives no per-species study counts, no performance variance across individual hold-outs, and no ablation that removes regional covariates. Without those checks the reported gains are harder to interpret cleanly.

This paper is useful for ecologists who already work with Movebank-style tracking data and want to try modern sequence models, or for ML people looking for a non-image, non-text sequence task with real constraints. It is not a foundational methods paper, but the experimental framing is honest enough that it merits referee time.
I would send it to peer review and ask specifically for the hold-out statistics and any confounding checks.

Referee Report

2 major / 2 minor

Summary. The manuscript trains sequence models (Transformers, LSTMs, CNNs, TCNs) on 7-species GPS trajectories from Movebank and evaluates them under an entire-study or entire-region hold-out protocol. It reports that Transformers achieve 8–22 percentage point gains in balanced accuracy over baselines, with a specific elephant binary task reaching 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution. Additional experiments compare basic displacement encodings against augmented features (speed, direction, turning) and contrast 1-hour versus 30-minute temporal resolutions.

Significance. If the hold-out results hold after controlling for study-specific effects, the work would demonstrate a practical advance in automated species identification from telemetry data, with potential utility for conservation monitoring on large external datasets. The use of real-world Movebank trajectories and explicit hold-out protocols is a methodological strength.

major comments (2)

[Abstract] Abstract and experimental protocol section: the headline claim of consistent 8–22 pp balanced-accuracy gains under entire-study/region hold-out is load-bearing for the generalization argument, yet the manuscript provides no counts of studies per species, no per-hold-out performance variance, and no ablation that isolates species movement signatures from regional covariates (habitat, sampling bias, missingness patterns).
[Results] Results section (elephant binary task): the reported 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution are presented without statistical significance tests or confidence intervals across multiple hold-out folds, making it impossible to assess whether the Transformer advantage is robust or could be explained by study-level confounders.

minor comments (2)

[Abstract] Abstract: '30-minutetemporal' is a typographical error and should read '30-minute temporal'.
[Methods] Feature-augmentation paragraph: the description of the expanded movement descriptors (speed, direction, turning) would benefit from explicit formulas or pseudocode for how these are computed from raw GPS points.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which underscore the need for greater transparency in our dataset statistics and statistical rigor in the reported results. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core claims.

Referee: [Abstract] Abstract and experimental protocol section: the headline claim of consistent 8–22 pp balanced-accuracy gains under entire-study/region hold-out is load-bearing for the generalization argument, yet the manuscript provides no counts of studies per species, no per-hold-out performance variance, and no ablation that isolates species movement signatures from regional covariates (habitat, sampling bias, missingness patterns).

Authors: We agree that explicit counts of studies per species and variance across hold-outs would improve interpretability of the generalization results. In the revised manuscript we will add a table in the experimental protocol section reporting the number of studies and trajectories per species, together with mean and standard deviation of balanced accuracy across all entire-study and region hold-out folds. For isolating movement signatures from regional covariates, the region-holdout protocol already removes entire geographic areas; however, a full ablation controlling for habitat and sampling bias would require additional metadata not uniformly available in Movebank. We will therefore add a targeted discussion of these potential confounders and, where metadata permits, a limited post-hoc analysis comparing performance on subsets with similar habitat characteristics.
revision: partial
Referee: [Results] Results section (elephant binary task): the reported 0.83 balanced accuracy and 0.92 AUC at 1-hour resolution are presented without statistical significance tests or confidence intervals across multiple hold-out folds, making it impossible to assess whether the Transformer advantage is robust or could be explained by study-level confounders.

Authors: We acknowledge the absence of statistical tests and confidence intervals for the elephant binary task. In the revision we will recompute the 1-hour elephant results across the available hold-out folds, report 95% bootstrap confidence intervals for balanced accuracy and AUC, and include paired statistical tests (McNemar’s test on predictions and a t-test on fold-wise balanced accuracies) to quantify whether the Transformer’s advantage over baselines is statistically significant. These additions will directly address concerns about study-level confounders.
revision: yes
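The checks promised in this response could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: a percentile bootstrap over individual test trajectories for the balanced-accuracy interval, and an exact (binomial) form of McNemar's test on the discordant predictions of two models. All function names and the toy predictions are hypothetical.

```python
import math
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy (resamples test items)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(balanced_accuracy([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    return stats[int((alpha / 2) * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar test; returns (b, c, p_value).

    b counts items model A gets right and model B gets wrong, c the reverse.
    """
    b = sum(a == t and p != t for t, a, p in zip(y_true, pred_a, pred_b))
    c = sum(a != t and p == t for t, a, p in zip(y_true, pred_a, pred_b))
    n = b + c
    if n == 0:
        return b, c, 1.0
    tail = sum(math.comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Hypothetical predictions for 10 elephant (1) and 10 non-elephant (0) trajectories.
y_true = [1] * 10 + [0] * 10
pred_a = [1] * 9 + [0] * 1 + [0] * 8 + [1] * 2   # stand-in for the Transformer
pred_b = [1] * 6 + [0] * 4 + [0] * 6 + [1] * 4   # stand-in for a baseline

lo, hi = bootstrap_ci(y_true, pred_a)
b, c, p_value = mcnemar_exact(y_true, pred_a, pred_b)
print(balanced_accuracy(y_true, pred_a), (lo, hi))
print(b, c, p_value)  # 5 0 0.0625
```

Resampling individual trajectories treats them as independent, which is optimistic when trajectories cluster within studies; resampling whole studies would be the more conservative variant. Either version quantifies sampling variability only and does not by itself rule out study-level confounding.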

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation on external data

The paper reports standard ML training and hold-out evaluation of sequence classifiers on Movebank GPS trajectories. No mathematical derivations, self-definitional equations, fitted parameters renamed as predictions, or load-bearing self-citations appear. All reported metrics (balanced accuracy, AUC) are computed on independent study/region hold-outs, so results do not reduce to the authors' own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on empirical ML performance rather than new theoretical derivations; standard neural network training assumptions apply but no custom free parameters, axioms, or invented entities are introduced beyond typical deep learning practice.
