Unsupervised feature selection using Bayesian Tucker decomposition
Pith reviewed 2026-05-10 10:00 UTC · model grok-4.3
The pith
Bayesian Tucker decomposition with Gaussian residuals enables unsupervised feature selection across synthetic and real datasets.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In this paper, we proposed Bayesian Tucker decomposition (BTuD) in which residual is supposed to obey Gaussian distribution analogous to linear regression. Although we have proposed an algorithm to perform the proposed BTuD, the conventional higher-order orthogonal iteration can generate Tucker decomposition consistent with the present implementation. Using the proposed BTuD, we can perform unsupervised feature selection successfully applied to various synthetic datasets, global coupled maps with randomized coupling strength, and gene expression profiles. Thus we can conclude that our newly proposed unsupervised feature selection method is promising. In addition to this, BTuD based unsupervs
What carries the argument
Bayesian Tucker decomposition (BTuD) with Gaussian residual assumption that supports unsupervised feature selection through tensor factorization.
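The link between a Gaussian residual and an ordinary Tucker fit can be written out for a three-way tensor. The notation below (core tensor $G$, factor matrices $U^{(1)}, U^{(2)}, U^{(3)}$, residual variance $\sigma^2$, tensor size $I_1 \times I_2 \times I_3$) is assumed for illustration and is not taken from the paper; it shows only the standard Gaussian-likelihood identity, not the authors' derivation or any prior they may use.

```latex
\begin{align}
x_{ijk} &= \hat{x}_{ijk} + \varepsilon_{ijk},
\qquad
\hat{x}_{ijk} = \sum_{\ell_1,\ell_2,\ell_3} G_{\ell_1 \ell_2 \ell_3}\,
  u^{(1)}_{i \ell_1}\, u^{(2)}_{j \ell_2}\, u^{(3)}_{k \ell_3},
\qquad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2), \\
\log p\bigl(\mathcal{X} \mid G, U^{(1)}, U^{(2)}, U^{(3)}, \sigma^2\bigr)
 &= -\frac{N}{2} \log\bigl(2\pi\sigma^2\bigr)
    - \frac{1}{2\sigma^2} \sum_{i,j,k} \bigl(x_{ijk} - \hat{x}_{ijk}\bigr)^2,
\qquad N = I_1 I_2 I_3 .
\end{align}
```

For any fixed $\sigma^2$, maximizing this log-likelihood over $G$ and the factor matrices is the same as minimizing the squared reconstruction error, just as in linear regression. That is the least-squares objective a conventional Tucker fit targets, which is consistent with the paper's statement that HOOI produces a matching decomposition.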
If this is right
- The method performs unsupervised feature selection on various synthetic datasets.
- It succeeds on global coupled maps with randomized coupling strength.
- It applies successfully to gene expression profiles.
- The BTuD-based approach is expected to coincide with prior Tucker decomposition based unsupervised feature extraction.
- The procedure is promising for a wide range of problems previously addressed by TD methods.
Where Pith is reading between the lines
- The Gaussian residual framing could be tested for robustness gains in other noisy tensor data settings beyond the reported cases.
- Because the method aligns with earlier TD results, it may serve as a probabilistic bridge to unify deterministic and Bayesian tensor feature selection pipelines.
- Extensions to additional high-dimensional domains such as imaging or time-series tensors would be a direct next test of the approach.
Load-bearing premise
Modeling the residual as Gaussian produces a meaningfully different or improved unsupervised feature selection procedure compared with conventional higher-order orthogonal iteration Tucker decomposition.
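For reference, the conventional baseline named in this premise can be sketched in a few lines. This is a minimal NumPy version of higher-order orthogonal iteration for a dense tensor; the helper names (`unfold`, `mode_dot`, `hooi`, `reconstruct`), the HOSVD initialisation, and the fixed iteration count are illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: rows index the chosen mode, columns all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Multiply `tensor` along `mode` by `matrix` of shape (new_dim, old_dim)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def hooi(tensor, ranks, n_iter=20):
    """Tucker decomposition by higher-order orthogonal iteration (sketch)."""
    # HOSVD initialisation: leading left singular vectors of each unfolding.
    factors = [
        np.linalg.svd(unfold(tensor, m), full_matrices=False)[0][:, :r]
        for m, r in enumerate(ranks)
    ]
    for _ in range(n_iter):
        for m in range(tensor.ndim):
            # Project onto the current subspaces of every mode except m.
            projected = tensor
            for k in range(tensor.ndim):
                if k != m:
                    projected = mode_dot(projected, factors[k].T, k)
            u = np.linalg.svd(unfold(projected, m), full_matrices=False)[0]
            factors[m] = u[:, : ranks[m]]
    # Core tensor: the data projected onto all factor subspaces.
    core = tensor
    for k in range(tensor.ndim):
        core = mode_dot(core, factors[k].T, k)
    return core, factors


def reconstruct(core, factors):
    """Rebuild the tensor approximation from the core and factor matrices."""
    out = core
    for k, u in enumerate(factors):
        out = mode_dot(out, u, k)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((6, 5, 4))  # synthetic stand-in, not paper data
    core, factors = hooi(data, ranks=(2, 2, 2))
    residual = data - reconstruct(core, factors)
    print("relative error:", np.linalg.norm(residual) / np.linalg.norm(data))
```

Any BTuD implementation could be checked against a baseline of this kind by comparing the factor matrices, up to sign and rotation within each subspace, or more simply the reconstructed tensors.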
What would settle it
Direct head-to-head comparison of selected features or downstream task performance on the same gene expression profiles using BTuD versus standard higher-order orthogonal iteration Tucker decomposition; large divergence or inferior results would disprove the success and coincidence claims.
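One way to score such a comparison is the overlap between the two selected feature sets. The sketch below assumes each method yields one score per feature (for example, the entries of a factor-matrix column) and that the top `k` features by absolute score are selected; `top_k_features`, `jaccard`, and the toy score vectors are hypothetical and do not reproduce the paper's selection rule.

```python
import numpy as np


def top_k_features(scores, k):
    """Indices of the k features with the largest absolute score."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-np.abs(scores))
    return set(order[:k].tolist())


def jaccard(selected_a, selected_b):
    """Jaccard index of two feature sets: 1.0 means identical selections."""
    a, b = set(selected_a), set(selected_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Toy score vectors standing in for the two methods' outputs.
    scores_btud = [0.1, -3.0, 2.0, 0.5, 0.0]
    scores_hooi = [0.2, -2.9, 0.1, 2.5, 0.0]
    overlap = jaccard(top_k_features(scores_btud, 2), top_k_features(scores_hooi, 2))
    print("Jaccard overlap of top-2 features:", overlap)
```

A Jaccard index near 1 on the same gene expression profiles would support the coincidence claim; a low value would count against it.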
Original abstract
In this paper, we proposed Bayesian Tucker decomposition (BTuD) in which residual is supposed to obey Gaussian distribution analogous to linear regression. Although we have proposed an algorithm to perform the proposed BTuD, the conventional higher-order orthogonal iteration can generate Tucker decomposition consistent with the present implementation. Using the proposed BTuD, we can perform unsupervised feature selection successfully applied to various synthetic datasets, global coupled maps with randomized coupling strength, and gene expression profiles. Thus we can conclude that our newly proposed unsupervised feature selection method is promising. In addition to this, BTuD based unsupervised FE is expected to coincide with TD based unsupervised FE that were previously proposed and successfully applied to a wide range of problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Bayesian Tucker decomposition (BTuD) in which residuals are modeled as Gaussian, analogous to linear regression. An algorithm is presented for performing BTuD, but the text states that this decomposition is consistent with the conventional higher-order orthogonal iteration (HOOI) procedure. The BTuD is then used for unsupervised feature selection and applied to synthetic datasets, global coupled maps with randomized coupling, and gene expression profiles. The paper concludes that the method is promising and explicitly notes that BTuD-based feature selection is expected to coincide with previously proposed tensor-decomposition-based unsupervised feature selection.
Significance. If the reported applications hold and the equivalence to prior work is properly contextualized, the manuscript could offer a modest unifying perspective by framing Tucker decomposition in Bayesian terms with Gaussian residuals. However, because the paper itself states that the conventional HOOI already produces consistent results and that the feature-selection outcomes coincide with earlier non-Bayesian TD methods, the Bayesian component does not appear to introduce new algorithmic behavior or improved predictions. No machine-checked proofs, reproducible code, or falsifiable predictions beyond the equivalence are highlighted. The significance is therefore limited to a probabilistic reinterpretation rather than a substantive advance in unsupervised feature selection.
major comments (3)
- [Abstract] The central claim that BTuD constitutes a 'newly proposed' unsupervised feature selection method is directly contradicted by the statements that 'the conventional higher-order orthogonal iteration can generate Tucker decomposition consistent with the present implementation' and that 'BTuD based unsupervised FE is expected to coincide with TD based unsupervised FE that were previously proposed.' This equivalence means reported successes on synthetic data, coupled maps, and gene expression cannot be attributed to the Bayesian residual model.
- [Abstract and §3 (method)] The Gaussian residual assumption is presented as analogous to linear regression, yet no derivation or independent test is supplied showing that this assumption produces a distinct decomposition or feature-selection ranking compared with standard HOOI. The claim that BTuD enables successful feature selection therefore rests on a procedure the manuscript acknowledges is not meaningfully different from prior work.
- [Abstract] The assertion of successful application to 'various synthetic datasets, global coupled maps with randomized coupling strength, and gene expression profiles' supplies no quantitative metrics, error bars, baseline comparisons against standard HOOI or other feature-selection methods, or ablation of the Bayesian component, leaving the empirical support for any added value unverified.
minor comments (2)
- [Introduction] The manuscript should clarify in the introduction whether the Bayesian formulation is intended as a new algorithm or solely as a probabilistic interpretation of existing HOOI results.
- [§2] Notation for the Tucker core tensor, factor matrices, and residual variance should be introduced with explicit equations before the algorithm description to improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below, acknowledging where the abstract phrasing may overstate novelty given the stated equivalence to HOOI, and outlining revisions to improve clarity and empirical support.
Point-by-point responses
- Referee: [Abstract] The central claim that BTuD constitutes a 'newly proposed' unsupervised feature selection method is directly contradicted by the statements that 'the conventional higher-order orthogonal iteration can generate Tucker decomposition consistent with the present implementation' and that 'BTuD based unsupervised FE is expected to coincide with TD based unsupervised FE that were previously proposed.' This equivalence means reported successes on synthetic data, coupled maps, and gene expression cannot be attributed to the Bayesian residual model.
  Authors: We agree that the abstract's use of 'newly proposed' for the unsupervised feature selection method is imprecise in light of the explicit statements on consistency with HOOI and expected coincidence with prior TD-based feature selection. The intended contribution is the Bayesian formulation with Gaussian residuals, which provides a probabilistic reinterpretation analogous to linear regression. However, this does not yield distinct point estimates or rankings. We will revise the abstract to emphasize the Bayesian perspective as a unifying framework rather than a new algorithmic method, while retaining the applications as demonstrations of the decomposition's utility.
  Revision: yes
- Referee: [Abstract and §3 (method)] The Gaussian residual assumption is presented as analogous to linear regression, yet no derivation or independent test is supplied showing that this assumption produces a distinct decomposition or feature-selection ranking compared with standard HOOI. The claim that BTuD enables successful feature selection therefore rests on a procedure the manuscript acknowledges is not meaningfully different from prior work.
  Authors: The Gaussian residual model is motivated directly by the linear regression analogy to justify the objective function for the decomposition. As noted in the manuscript, this leads to an optimization that is solved consistently by the conventional HOOI procedure, so we do not claim or demonstrate a distinct decomposition or altered feature rankings. The Bayesian framing may support future extensions (e.g., uncertainty quantification via posteriors), but no such tests are provided here. We will add a clarifying sentence in §3 to explicitly state that the current implementation produces results equivalent to HOOI and that any added value would require further development beyond the present work.
  Revision: partial
- Referee: [Abstract] The assertion of successful application to 'various synthetic datasets, global coupled maps with randomized coupling strength, and gene expression profiles' supplies no quantitative metrics, error bars, baseline comparisons against standard HOOI or other feature-selection methods, or ablation of the Bayesian component, leaving the empirical support for any added value unverified.
  Authors: The abstract provides a high-level summary of the applications; the full manuscript contains the detailed experimental results on these datasets. Given the acknowledged equivalence to HOOI, we recognize that quantitative metrics, error bars, direct baseline comparisons, and ablations of the Bayesian component are necessary to substantiate any incremental benefit. We will expand the results section with tables reporting performance metrics, comparisons to standard HOOI and other feature-selection baselines, and an explicit statement that the Bayesian component does not alter the observed feature selections in the current implementation.
  Revision: yes
Circularity Check
BTuD is explicitly stated to coincide with conventional HOOI Tucker decomposition, so the Gaussian residual model yields no distinct feature selection procedure.
specific steps
- Renaming known result [Abstract]: "Although we have proposed an algorithm to perform the proposed BTuD, the conventional higher-order orthogonal iteration can generate Tucker decomposition consistent with the present implementation. ... BTuD based unsupervised FE is expected to coincide with TD based unsupervised FE that were previously proposed and successfully applied to a wide range of problems."
The paper claims BTuD enables successful unsupervised feature selection on various datasets as a new method with Gaussian residuals, but states that its Tucker decomposition is consistent with conventional HOOI and that the feature selection coincides with prior TD methods. The claimed successes therefore reduce to the outputs of the previously proposed non-Bayesian procedure by the paper's own admission, with no independent grounding or distinct predictions from the Bayesian component.
full rationale
The paper's central claim of successful unsupervised feature selection via the newly proposed BTuD reduces directly to the performance of prior non-Bayesian TD methods. The abstract acknowledges that the implementation is consistent with conventional HOOI and that BTuD-based FE is expected to coincide with previously proposed TD-based FE. This equivalence means the reported successes on synthetic data, coupled maps, and gene expression profiles are not attributable to the Bayesian Gaussian residual model (presented as analogous to linear regression) but are the same as earlier results. The Bayesian framing adds no distinct algorithmic behavior or new predictions, rendering the method a renaming of known results rather than an independent advance.
Axiom & Free-Parameter Ledger
free parameters (1)
- Gaussian residual variance
axioms (1)
- Domain assumption: residuals after Tucker decomposition obey a Gaussian distribution
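Under this assumption the single free parameter has a closed-form estimate once a decomposition is fixed. A minimal sketch, assuming independent zero-mean Gaussian residuals with one shared variance; the function names and the toy arrays are hypothetical, and the paper may estimate the variance differently.

```python
import math

import numpy as np


def residual_variance_mle(tensor, approximation):
    """Maximum-likelihood variance of i.i.d. zero-mean Gaussian residuals."""
    residual = np.asarray(tensor, dtype=float) - np.asarray(approximation, dtype=float)
    return float(np.mean(residual ** 2))


def gaussian_log_likelihood(tensor, approximation, sigma2):
    """Log-likelihood of the data given the approximation and variance sigma2."""
    residual = np.asarray(tensor, dtype=float) - np.asarray(approximation, dtype=float)
    n = residual.size
    return float(
        -0.5 * n * math.log(2.0 * math.pi * sigma2)
        - 0.5 * np.sum(residual ** 2) / sigma2
    )


if __name__ == "__main__":
    # Toy data: `approx` stands in for a Tucker reconstruction.
    data = np.array([[1.0, 3.0], [5.0, 7.0]])
    approx = np.array([[0.0, 4.0], [4.0, 8.0]])
    sigma2 = residual_variance_mle(data, approx)
    print("variance estimate:", sigma2)
    print("log-likelihood:", gaussian_log_likelihood(data, approx, sigma2))
```

Because the variance estimate depends on the decomposition only through the squared residuals, it does not change which core tensor and factor matrices maximize the likelihood.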