Deep Joint Distribution Optimal Transport for Universal Domain Adaptation on Time Series
Pith reviewed 2026-05-23 00:38 UTC · model grok-4.3
The pith
UniJDOT modifies optimal transport to include unknown target samples in the cost for universal domain adaptation on time series.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UniJDOT is an optimal-transport-based method for universal domain adaptation on time series. It accounts for unknown target samples in the transport cost, introduces a joint decision space to improve the discriminability of the detection module, and applies an auto-thresholding algorithm to reduce dependence on fixed or tuned thresholds. It also incorporates a Fourier transform-based layer, inspired by the Fourier Neural Operator, for better time-series representation. The paper reports state-of-the-art performance on time-series benchmarks.
What carries the argument
Joint distribution optimal transport extended to include unknown target samples in the cost, paired with a joint decision space for detection.
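As a rough illustration of the machinery named above, the sketch below builds a joint cost in the style of joint distribution optimal transport (a feature distance plus a label loss between source labels and target predictions) and solves an entropy-regularized transport problem with Sinkhorn iterations. It is a minimal NumPy sketch under assumptions: the names `joint_cost` and `sinkhorn`, the weights `alpha` and `beta`, and the regularization `reg` are illustrative choices, and the paper's actual unknown-aware cost term is not reproduced here.

```python
import numpy as np

def joint_cost(feat_s, feat_t, y_s_onehot, prob_t, alpha=1.0, beta=1.0):
    """Joint cost for every (source i, target j) pair (hypothetical helper).

    feat_s: (n_s, d) source features, feat_t: (n_t, d) target features,
    y_s_onehot: (n_s, k) source labels, prob_t: (n_t, k) target class probabilities.
    """
    # Squared Euclidean distance between source and target features.
    d_feat = ((feat_s[:, None, :] - feat_t[None, :, :]) ** 2).sum(axis=-1)
    # Cross-entropy between each source label and each target prediction.
    d_label = -(y_s_onehot[:, None, :] * np.log(prob_t[None, :, :] + 1e-12)).sum(axis=-1)
    return alpha * d_feat + beta * d_label

def sinkhorn(a, b, cost, reg=0.1, n_iter=500):
    """Entropy-regularized OT plan between histograms a and b (assumed settings)."""
    # Rescale the cost so that exp(-cost / reg) does not underflow.
    cost = cost / max(cost.max(), 1e-12)
    kernel = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (kernel.T @ u)   # match column marginals
        u = a / (kernel @ v)     # match row marginals
    return u[:, None] * kernel * v[None, :]
```

An unknown-aware variant would change how `joint_cost` treats target samples that are unlikely to belong to any source class; where exactly that term enters is left open here because this page does not spell it out.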
If this is right
- Detection of unknown emerging classes becomes more reliable because unknowns participate directly in the alignment cost.
- Auto-thresholding reduces the need to choose or tune a fixed discriminability threshold for each new dataset.
- The Fourier layer supplies improved representations that support both alignment and detection on time-series inputs.
- The overall pipeline demonstrates robustness across varying degrees of class overlap between source and target.
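The Fourier layer mentioned in the list above can be pictured as a spectral convolution: transform the series to the frequency domain, mix channels on the lowest few modes with learned complex weights, and transform back. The following NumPy sketch shows only that spectral step, in the spirit of the Fourier Neural Operator; the function name `spectral_conv1d`, the tensor layout, and the number of retained modes are assumptions, and a full layer would normally add a pointwise linear path and a nonlinearity.

```python
import numpy as np

def spectral_conv1d(x, weights):
    """Spectral convolution over the time axis (illustrative sketch).

    x: real array of shape (batch, c_in, length).
    weights: complex array of shape (c_in, c_out, modes), modes <= length // 2 + 1.
    """
    batch, _, length = x.shape
    _, c_out, modes = weights.shape
    # Real FFT along time: shape (batch, c_in, length // 2 + 1).
    x_ft = np.fft.rfft(x, axis=-1)
    # Keep only the lowest `modes` frequencies; higher ones are zeroed out.
    out_ft = np.zeros((batch, c_out, x_ft.shape[-1]), dtype=complex)
    out_ft[:, :, :modes] = np.einsum("bik,iok->bok", x_ft[:, :, :modes], weights)
    # Back to the time domain with the original length.
    return np.fft.irfft(out_ft, n=length, axis=-1)
```

Truncating to a small number of modes acts as a learned low-pass filter, which is one plausible reason such a layer suits smooth or periodic sensor signals.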
Where Pith is reading between the lines
- The same unknown-aware transport term could be tested on non-time-series data to check whether the gains are specific to sequential structure.
- Explicit modeling of unknowns inside the transport plan may reduce negative transfer in other domain-adaptation settings where new classes appear over time.
- If the auto-thresholding step generalizes, it could replace manual threshold selection in other open-set or outlier-detection pipelines that currently rely on validation-set tuning.
Load-bearing premise
Incorporating unknown samples into the transport cost and using a joint decision space will improve detection reliability without creating new failure modes or needing dataset-specific fixes.
What would settle it
On a standard time-series UniDA benchmark, UniJDOT produces lower overall accuracy or worse unknown-class detection rates than a baseline joint-distribution optimal transport method without the unknown-aware cost term.
Original abstract
Universal Domain Adaptation (UniDA) aims to transfer knowledge from a labeled source domain to an unlabeled target domain, even when their classes are not fully shared. Few dedicated UniDA methods exist for Time Series (TS), which remains a challenging case. In general, UniDA approaches align common class samples and detect unknown target samples from emerging classes. Such detection often results from thresholding a discriminability metric. The threshold value is typically either a fine-tuned hyperparameter or a fixed value, which limits the ability of the model to adapt to new data. Furthermore, discriminability metrics exhibit overconfidence for unknown samples, leading to misclassifications. This paper introduces UniJDOT, an optimal-transport-based method that accounts for the unknown target samples in the transport cost. Our method also proposes a joint decision space to improve the discriminability of the detection module. In addition, we use an auto-thresholding algorithm to reduce the dependence on fixed or fine-tuned thresholds. Finally, we rely on a Fourier transform-based layer inspired by the Fourier Neural Operator for better TS representation. Experiments on TS benchmarks demonstrate the discriminability, robustness, and state-of-the-art performance of UniJDOT.
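One way to picture the auto-thresholding idea in the abstract is to derive the cutoff from the distribution of the discriminability scores rather than fixing it in advance. The sketch below uses Otsu's method, a classic histogram-based rule, purely as a familiar stand-in: the abstract does not say which algorithm UniJDOT uses, and the function name `otsu_threshold`, the bin count, and the idea that scores lie on a single axis are assumptions.

```python
import numpy as np

def otsu_threshold(scores, n_bins=64):
    """Pick the cutoff that maximizes between-class variance (Otsu's rule)."""
    hist, edges = np.histogram(scores, bins=n_bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                  # mass at or below each bin
    w1 = 1.0 - w0                      # mass above each bin
    cum_mean = np.cumsum(p * centers)
    total_mean = cum_mean[-1]
    # Skip splits that leave one side (numerically) empty.
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    between = np.zeros(n_bins)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (w0[valid] * w1[valid])
    k = int(np.argmax(between))
    # Upper edge of bin k separates bins 0..k from the rest.
    return edges[k + 1]
```

With a rule like this, samples scoring above the returned value would be treated as unknown, and the cutoff adapts whenever the score distribution shifts.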
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces UniJDOT, an optimal-transport-based method for universal domain adaptation (UniDA) on time series. It modifies the OT cost to account for unknown target samples, defines a joint decision space by concatenating source and target logits to improve discriminability of the detection module, introduces a percentile-based auto-thresholding rule on the OT plan to avoid fixed or tuned thresholds, and adds a 1D Fourier layer adapted from the Fourier Neural Operator for TS representation. Experiments on standard TS UniDA benchmarks report state-of-the-art accuracy, with ablations isolating each component.
Significance. If the reported gains hold under the explicit loss terms and ablation protocol described, the work supplies a concrete, threshold-light OT formulation for TS UniDA that directly addresses overconfidence and hyperparameter sensitivity in unknown-class detection. The explicit definitions of the joint space and auto-thresholding rule, together with component-wise ablations showing gains without new failure modes, constitute reproducible strengths.
minor comments (3)
- [§3.2] The precise definition of the joint decision space (concatenation of logits) should be written as an equation rather than prose to facilitate reproduction.
- [Table 2] The caption does not state whether the reported means are over 3 or 5 random seeds; add this detail.
- [§4.3] The percentile value used for auto-thresholding is stated as 0.95, but the sensitivity analysis is only qualitative; a small table or plot of performance vs. percentile would strengthen the robustness claim.
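The sensitivity check requested in the last comment could be as simple as sweeping the percentile and recording the outcome at each value. The sketch below does this on synthetic scores; `flag_unknown`, the score distributions, and the candidate percentiles are all hypothetical, and the report does not specify which quantity the paper actually thresholds.

```python
import numpy as np

def flag_unknown(scores, q=0.95):
    """Flag samples whose score exceeds the q-th quantile of the batch."""
    threshold = np.quantile(scores, q)
    return scores > threshold, threshold

# Synthetic scores: 900 "known" samples and 100 "unknown" samples (assumed data).
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(0.0, 1.0, 900), rng.normal(4.0, 1.0, 100)])
is_unknown = np.concatenate([np.zeros(900, dtype=bool), np.ones(100, dtype=bool)])

# Sweep the percentile and report detection quality at each setting.
for q in (0.80, 0.90, 0.95):
    flagged, thr = flag_unknown(scores, q)
    hits = (flagged & is_unknown).sum()
    recall = hits / is_unknown.sum()
    precision = hits / max(flagged.sum(), 1)
    print(f"q={q:.2f} threshold={thr:.2f} recall={recall:.2f} precision={precision:.2f}")
```

A quantile rule flags a fixed fraction of the batch by construction, so a sweep like this shows directly how much the result depends on that fraction matching the true share of unknown samples.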
Simulated Author's Rebuttal
We thank the referee for the positive assessment of UniJDOT, the recognition of its contributions to threshold-light OT for time-series UniDA, and the recommendation for minor revision. No major comments were raised that require point-by-point rebuttal.
Circularity Check
No significant circularity
The derivation chain is self-contained. The UniJDOT method is defined via explicit loss terms that incorporate unknown samples into the OT cost, a joint decision space constructed by concatenating source/target logits, a percentile-based auto-thresholding rule applied to the OT plan, and a direct 1D adaptation of the Fourier Neural Operator layer; none of these reduce to self-definitions, fitted inputs renamed as predictions, or load-bearing self-citations. Ablations isolate each component and demonstrate gains on the reported benchmarks without the central claims collapsing to the inputs by construction. No uniqueness theorems or ansatzes are smuggled in via author-overlapping citations.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Passage: UniJDOT ... rewriting the alignment cost to take into account the unknown target samples ... binary auto-thresholding ... joint decision space ... Fourier transform-based layer inspired by the Fourier Neural Operator
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
  Tag: unclear (relation between the paper passage and the cited Recognition theorem).
  Passage: OT(a,b,C) = min ... UOT ... ℒ = λℒ_ce + (1-λ)ℒ_UOT(C̄)
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.