Rectified Fisher-Bingham Model for Compositional Data with Zeros

Eugene Han; Hannah D. Holscher; Marahi Perez-Tamayo; Ruoqing Zhu

arxiv: 2604.25030 · v1 · submitted 2026-04-27 · 📊 stat.ME


Pith reviewed 2026-05-08 01:59 UTC · model grok-4.3

classification 📊 stat.ME

keywords: compositional data, Fisher-Bingham distribution, zeros, square-root transformation, Monte Carlo EM, score test, microbiota, spherical models


The pith

Compositional data with exact zeros can be modeled coherently by rectifying a latent Fisher-Bingham distribution on the square-root transformed sphere, without imputation or separate zero modeling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Compositional data such as microbial abundances often contain exact zeros that break standard models. The paper maps these data to the positive orthant of the unit sphere via square-root transformation and represents them as the output of a latent Fisher-Bingham distribution followed by a deterministic rectification step that sets some components exactly to zero. This construction supplies a single coherent likelihood for all observations. Parameters are estimated with a Monte Carlo expectation-maximization algorithm that accounts for the latent variables, and a score test is derived to compare compositions across groups. Simulations show the fitted distribution matches the data well and the test gains power over distance-based alternatives, especially when zeros are frequent; the method also detects intervention-related shifts in a dietary microbiota study.

Core claim

The paper shows that a latent Fisher-Bingham distribution on the sphere, combined with a deterministic rectification that induces exact zeros while renormalizing the remaining components, produces a valid likelihood for square-root transformed compositional data. This unified representation supports consistent parameter estimation through Monte Carlo EM and a score test for group differences without requiring zero imputation or two-part modeling.
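As a reading aid, here is one way to write down what a "valid likelihood" would have to mean. This is a sketch under stated assumptions, not the paper's derivation. The symbols are introduced here for illustration: $R$ is a hypothetical rectification map $R(x) = x_+ / \lVert x_+ \rVert$ (componentwise positive part, rescaled to unit length, undefined when no coordinate is positive), $f_\theta$ is the latent Fisher-Bingham density with respect to the surface measure $\sigma$ on $S^{d-1}$ in one common parameterization, and $Z(Y)$ is the set of zero coordinates of the observation $Y = R(X)$.

```latex
% Sketch only: the observed-data law as the pushforward of the latent law under an ASSUMED map R.
\[
  \Pr_\theta(Y \in B) \;=\; \int_{R^{-1}(B)} f_\theta(x)\,\sigma(dx),
  \qquad
  f_\theta(x) \;\propto\; \exp\!\bigl(\kappa\,\mu^\top x + x^\top A x\bigr),
\]
\[
  \Pr_\theta\bigl(Z(Y) = z\bigr)
  \;=\; \int_{\{x \in S^{d-1} \,:\, x_j \le 0 \iff j \in z\}} f_\theta(x)\,\sigma(dx).
\]
```

Under these assumptions the probability of a zero pattern is the latent mass of a union of orthant pieces of the sphere. Any density for the positive components would also have to integrate the latent density over the directions that $R$ collapses, which is the step the referee report asks to see written out.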

What carries the argument

The rectified Fisher-Bingham distribution, formed by a latent Fisher-Bingham random vector on the sphere followed by a deterministic rectification map that forces selected coordinates to zero and renormalizes the rest.
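The geometry of this construction can be sketched in a few lines. The rectification rule below (clip negative coordinates to zero, then rescale to unit length) is an assumption made for illustration, since the text here does not state the map explicitly. The function names `sqrt_transform`, `rectify`, and `to_composition` are hypothetical.

```python
import math

def sqrt_transform(composition):
    """Map a composition (nonnegative parts) onto the unit sphere."""
    total = sum(composition)
    return [math.sqrt(p / total) for p in composition]

def rectify(latent):
    """ASSUMPTION: clip negative coordinates to zero, then rescale to unit length."""
    clipped = [max(v, 0.0) for v in latent]
    norm = math.sqrt(sum(v * v for v in clipped))
    if norm == 0.0:
        raise ValueError("no positive coordinate: the map is undefined here")
    return [v / norm for v in clipped]

def to_composition(sphere_point):
    """Squaring a unit vector returns a point on the simplex."""
    return [v * v for v in sphere_point]

latent = [0.5, -0.5, 0.5, 0.5]          # a unit vector with one negative coordinate
observed = rectify(latent)              # second coordinate becomes an exact zero
composition = to_composition(observed)  # nonnegative parts summing to one
```

The point of the sketch is that many latent vectors map to the same observed point, which is why the induced likelihood involves an integral over latent directions and not a single density evaluation.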

If this is right

A single likelihood function covers both zero and nonzero observations, so estimation and inference proceed without ad-hoc adjustments.
Monte Carlo EM yields consistent parameter estimates that integrate over the latent sphere variables.
The score test provides a parametric alternative to distance-based methods and shows higher power for structured group differences when zeros are common.
The fitted model reproduces the induced zero pattern and marginals closely enough to improve detection of real compositional shifts in applications such as dietary intervention studies.
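To make the Monte Carlo EM idea concrete, the following is a deliberately simple univariate stand-in, not the paper's algorithm. It assumes a latent Normal(theta, 1) variable observed through `max(x, 0)`, so exact zeros hide a negative latent value. The E-step draws the hidden values from the truncated conditional distribution and the M-step averages the completed data. The function name `mcem_rectified_normal` and all settings are hypothetical.

```python
import random
from statistics import NormalDist

def mcem_rectified_normal(observed, n_iter=30, n_draws=20, seed=0):
    """Toy MCEM for theta when y = max(x, 0) and x ~ Normal(theta, 1)."""
    rng = random.Random(seed)
    std = NormalDist()
    positives = [y for y in observed if y > 0.0]
    n_zero = len(observed) - len(positives)
    theta = sum(observed) / len(observed)  # crude starting value
    for _ in range(n_iter):
        # E-step: Monte Carlo estimate of E[x | x <= 0] for each zero observation,
        # using inverse-CDF draws from the normal truncated to (-inf, 0].
        p_neg = std.cdf(-theta)
        imputed_sum = 0.0
        for _ in range(n_zero):
            draws = 0.0
            for _ in range(n_draws):
                u = rng.uniform(1e-12, p_neg)
                draws += theta + std.inv_cdf(u)
            imputed_sum += draws / n_draws
        # M-step: the complete-data MLE of a normal mean is the sample average.
        theta = (sum(positives) + imputed_sum) / len(observed)
    return theta

data_rng = random.Random(1)
latent = [data_rng.gauss(0.5, 1.0) for _ in range(500)]
observed = [max(x, 0.0) for x in latent]
theta_hat = mcem_rectified_normal(observed)
```

The spherical model replaces the truncated normal draw with a draw from the latent Fisher-Bingham distribution restricted to the pre-image of each observation, which is considerably harder to sample. The logic of the two steps is the same.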

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The rectification step could be adapted to other spherical or directional distributions when data exhibit hard boundaries or sparsity.
Direct incorporation of covariates into the latent Fisher-Bingham parameters would turn the model into a regression framework for compositional outcomes.
Because the likelihood is fully specified, posterior predictive checks for zero frequencies become straightforward and could guide model refinement.
The approach may reduce information loss in high-dimensional sparse settings compared with methods that treat zeros as missing or as a separate category.

Load-bearing premise

The square-root transformed data arise from a latent Fisher-Bingham distribution whose deterministic rectification accurately reproduces both the observed pattern of zeros and the marginal distributions of the nonzero components.

What would settle it

Generate data from the model with known parameters and many zeros; if the Monte Carlo EM procedure recovers parameters only with large bias or the score test exhibits incorrect size, the construction fails to deliver the claimed coherent likelihood and inference.
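A minimal sketch of the data-generation half of such a check is given below. It assumes a Fisher-Bingham kernel of the form exp(kappa * mu'x + x'Ax), which is one common parameterization, and draws from it by plain rejection sampling with a uniform proposal on the sphere. That is a simple technique suited to low concentration, not the samplers cited by the paper. The parameter values, the function names, and the rule that a non-positive latent coordinate becomes an observed zero are all assumptions for illustration.

```python
import math
import random
from collections import Counter

def fb_log_kernel(x, kappa, mu, A):
    """Unnormalised log-density kappa * mu'x + x'Ax."""
    linear = kappa * sum(m * v for m, v in zip(mu, x))
    quad = sum(x[i] * A[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))
    return linear + quad

def sample_fb(n, kappa, mu, A, log_bound, rng):
    """Rejection sampling; log_bound must be an upper bound on the log-kernel."""
    d = len(mu)
    out = []
    while len(out) < n:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        x = [v / norm for v in g]                      # uniform point on the sphere
        log_u = math.log(1.0 - rng.random())           # log of a draw in (0, 1]
        if log_u < fb_log_kernel(x, kappa, mu, A) - log_bound:
            out.append(x)
    return out

rng = random.Random(0)
kappa = 2.0
mu = [1.0 / math.sqrt(3.0)] * 3                        # unit mean direction
A = [[0.5, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -0.5]]
# For unit x: mu'x <= 1 and x'Ax <= largest eigenvalue of A (0.5 here).
samples = sample_fb(2000, kappa, mu, A, log_bound=kappa + 0.5, rng=rng)

# ASSUMPTION: a latent coordinate <= 0 becomes an observed zero after rectification.
patterns = Counter(tuple(v <= 0.0 for v in x) for x in samples)
freqs = {p: c / len(samples) for p, c in patterns.items()}
```

Comparing `freqs` against the zero-pattern frequencies implied by fitted parameters, over repeated simulated data sets, would give one direct handle on bias. The size check for the score test would follow the same loop with both groups drawn from identical parameters.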

Figures

Figures reproduced from arXiv: 2604.25030 by Eugene Han, Hannah D. Holscher, Marahi Perez-Tamayo, Ruoqing Zhu.

**Figure 1.** Visualization of the induced observed-data density under structured perturba…

**Figure 2.** Empirical power of the RRFB score test and PERMANOVA under structured…

**Figure 3.** Permutation distributions of the RRFB score test statistic under the null hypoth…

**Figure 4.** (A) Principal coordinates analysis of square-root transformed compositions us…

**Figure 5.** Exploratory visualizations of compositional changes from baseline to end of…

Original abstract

This paper introduces a rectified and renormalized Fisher-Bingham model for compositional data with zeros, motivated in part by the presence of zeros in microbiota studies. The approach represents compositions through a square-root transformation that maps data to the positive orthant of the unit sphere, and models them via a latent Fisher-Bingham followed by a deterministic transformation that induces exact zeros. This construction yields a coherent likelihood without requiring zero imputation or separate modeling of zero and nonzero components. Parameter estimation is performed using a Monte Carlo expectation-maximization algorithm that accommodates the latent structure. We further develop a score test for detecting structured differences in composition across groups, providing a parametric alternative to commonly used distance-based methods. Simulation studies demonstrate that the proposed method closely approximates the induced distribution and achieves higher power for detecting structured compositional changes, particularly when observations include many zero-valued components. An application to a dietary intervention study illustrates that the method identifies meaningful microbiota shifts not detected by standard approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a parametric model for compositional data with exact zeros via square-root transform plus rectified latent Fisher-Bingham, but the induced likelihood may not correctly match the observed zero patterns.


The main takeaway is a construction that models compositional data with zeros in one piece: square-root transform to the sphere, latent Fisher-Bingham, then a deterministic rectification that forces exact zeros before renormalization back to the simplex. This yields a single likelihood, MCEM estimation, and a score test for group differences, without imputation or separate zero/nonzero components. Simulations indicate it recovers the target distribution and outperforms distance-based methods on power, and the dietary intervention example finds shifts missed by standard approaches.

That is the concrete advance over existing work on compositional models with zeros. The construction itself is new in this combination, and the score test is a practical addition for microbiome-style applications. The paper does a reasonable job showing the idea works in controlled simulations and on real data.

The soft spot is exactly the one the stress test flags. The rectification map has to induce the right probability for each zero pattern from the latent measure, and the density on the positive parts has to be the correct conditional after accounting for the spherical geometry and renormalization. If the map is a simple threshold or projection, the pre-image measure generally will not equal the Fisher-Bingham density evaluated at the observed point, which would make the likelihood inconsistent and bias both the MCEM estimates and the score test. The abstract asserts coherence, but the provided text does not include the explicit derivation or any check that the measure is preserved, so this remains an open question rather than a minor detail.

This paper is for statisticians who analyze microbiome or other compositional data and want a parametric alternative to PERMANOVA-style tests. Readers already comfortable with Fisher-Bingham or spherical distributions will get the most out of the technical parts.

It deserves a serious referee because the model is new, the estimation and test are implementable, and the application area is active, even though the likelihood derivation needs to be verified before the claims can be taken as solid.

Referee Report

2 major / 2 minor

Summary. The paper proposes a rectified Fisher-Bingham model for compositional data containing zeros. Data are square-root transformed to the positive orthant of the unit sphere and modeled as arising from a latent Fisher-Bingham distribution followed by a deterministic rectification map that forces exact zeros before renormalization back to the simplex. This yields an observed-data likelihood that is claimed to be coherent without zero imputation or separate zero/nonzero modeling. Parameters are estimated via Monte Carlo EM; a score test is derived for group differences in composition. Simulations indicate the model approximates the induced distribution and the score test has higher power than distance-based alternatives, especially with many zeros; an application to a dietary intervention microbiota study is presented.

Significance. If the likelihood derivation is valid, the construction supplies a fully parametric, imputation-free model for zero-inflated compositional data together with a score test that can serve as a parametric counterpart to PERMANOVA-style methods. This would be useful in microbiome and other compositional applications where zeros are structural rather than missing. The MCEM procedure and simulation evidence for approximation quality are concrete strengths.
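For readers less familiar with the distance-based baseline, a PERMANOVA-style test can be sketched as follows. The pseudo-F statistic follows the standard one-way partition of squared distances. The use of Euclidean distance on square-root transformed compositions, the toy data, and the function names are assumptions for illustration and are not taken from the paper's simulation design.

```python
import math
import random

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def pseudo_f(points, labels):
    """One-way pseudo-F from pairwise squared distances."""
    n = len(points)
    groups = sorted(set(labels))
    ss_total = sum(sq_dist(points[i], points[j])
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(sq_dist(points[i], points[j])
                         for k, i in enumerate(idx) for j in idx[k + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(points, labels, n_perm=999, seed=0):
    """Permutation p-value: share of label shuffles with pseudo-F >= observed."""
    rng = random.Random(seed)
    observed = pseudo_f(points, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(points, shuffled) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)

# Toy compositions: group "a" has an exact zero in the third part.
group_a = [[0.70, 0.30, 0.0], [0.65, 0.35, 0.0], [0.75, 0.25, 0.0], [0.60, 0.40, 0.0]]
group_b = [[0.20, 0.30, 0.5], [0.25, 0.25, 0.5], [0.15, 0.35, 0.5], [0.20, 0.40, 0.4]]
points = [[math.sqrt(p) for p in comp] for comp in group_a + group_b]
labels = ["a"] * 4 + ["b"] * 4
f_obs, p_value = permanova(points, labels)
```

The test uses only pairwise distances, so it does not say which components moved or whether the change sits in the zero pattern. That is the gap a parametric score test is meant to fill.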

major comments (2)

[Likelihood construction (methods section)] The central claim that the rectified model produces a coherent observed-data likelihood (abstract and methods) rests on the deterministic rectification map inducing the correct probability measure on zero patterns and the correct conditional density on the positive components. The pre-image measure under the map must equal the latent FB probability of the corresponding region on the sphere, and the renormalized density must account for the spherical surface measure; no explicit derivation or verification of this equality is provided in the supplied text. Without it, the MCEM target is not guaranteed to be the true likelihood, which would bias both parameter estimates and the score test.
[Simulation studies] Simulation studies are described as showing that the method 'closely approximates the induced distribution,' but no quantitative metrics (e.g., Kolmogorov-Smirnov statistics, integrated squared error on marginals, or coverage of the score test under the null) are reported. This leaves the empirical support for the approximation claim difficult to assess.
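One of the diagnostics named in the second comment is easy to state in code. The sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between two empirical distribution functions, which could be applied to one nonzero marginal at a time. The sample values are made up for illustration and the function name `ks_statistic` is hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x so ties are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

observed_marginal = [0.10, 0.40, 0.35, 0.80]   # illustrative values only
simulated_marginal = [0.20, 0.50, 0.60, 0.90]  # illustrative values only
gap = ks_statistic(observed_marginal, simulated_marginal)
```

Because the observed marginals have a point mass at zero, a report would presumably need to state whether zeros are included in the comparison or handled separately through zero frequencies.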

minor comments (2)

[Model definition] Notation for the rectification map and the renormalization step should be introduced with an explicit equation early in the methods; the current description is informal.
[Score test] The score test derivation would benefit from an explicit statement of the null and alternative hypotheses in terms of the FB parameters and a clear expression for the test statistic.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed review. The comments highlight important aspects of the likelihood derivation and simulation evidence that we will address directly in the revision.


Referee: The central claim that the rectified model produces a coherent observed-data likelihood (abstract and methods) rests on the deterministic rectification map inducing the correct probability measure on zero patterns and the correct conditional density on the positive components. The pre-image measure under the map must equal the latent FB probability of the corresponding region on the sphere, and the renormalized density must account for the spherical surface measure; no explicit derivation or verification of this equality is provided in the supplied text. Without it, the MCEM target is not guaranteed to be the true likelihood, which would bias both parameter estimates and the score test.

Authors: We acknowledge that the original manuscript presents the likelihood construction at a conceptual level without supplying the full measure-theoretic derivation. This omission leaves the justification incomplete. In the revised methods section we will insert an explicit derivation: we will show that for each zero pattern the probability equals the latent Fisher-Bingham measure of the corresponding pre-image region on the sphere, and that the conditional density on the positive components is obtained by renormalizing with respect to the spherical surface measure restricted to the positive orthant. The resulting expression will be the target of the MCEM algorithm.

Revision: yes

Referee: Simulation studies are described as showing that the method 'closely approximates the induced distribution,' but no quantitative metrics (e.g., Kolmogorov-Smirnov statistics, integrated squared error on marginals, or coverage of the score test under the null) are reported. This leaves the empirical support for the approximation claim difficult to assess.

Authors: We agree that the simulation section would be strengthened by quantitative diagnostics. In the revision we will report Kolmogorov-Smirnov statistics for the marginal distributions of the positive components, integrated squared error between the empirical and model-induced densities where feasible, and empirical coverage probabilities of the score test under the null across the simulated settings.

Revision: yes

Circularity Check

0 steps flagged

No circularity: new rectified model built from standard FB plus deterministic map

full rationale

The derivation introduces a latent Fisher-Bingham on the square-root sphere, applies a deterministic rectification to force exact zeros, renormalizes to the simplex, then uses MCEM for the resulting observed-data likelihood and derives a score test. None of these steps reduce a prediction or parameter to a fitted input by construction, invoke self-citations for load-bearing uniqueness theorems, or smuggle ansatzes; the construction is presented as a direct extension of the known FB distribution with an explicit new transformation rule. The abstract and description contain no equations or claims that equate outputs to inputs tautologically.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the standard properties of the Fisher-Bingham distribution on the sphere and the validity of the square-root transformation for compositional data; no new free parameters or invented entities are introduced beyond the usual concentration parameters of the latent distribution.

free parameters (1)

Fisher-Bingham concentration parameters
Parameters of the latent distribution are estimated from data via MCEM and control the shape of the induced distribution.

axioms (1)

domain assumption: Square-root transformed compositions lie in the positive orthant of the unit sphere and can be modeled by a latent Fisher-Bingham distribution.
Invoked to justify the mapping and latent structure for compositional data.



[1] [1]

Optimization algorithms on matrix manifolds

P-A Absil, Robert Mahony, and Rodolphe Sepulchre. Optimization algorithms on matrix manifolds. In Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2009

work page 2009

[2] [2]

The statistical analysis of compositional data

John Aitchison. The statistical analysis of compositional data. Journal of the Royal Statistical Society: Series B (Methodological), 44 0 (2): 0 139--160, 1982

work page 1982

[3] [3]

A new method for non-parametric multivariate analysis of variance

Marti J Anderson. A new method for non-parametric multivariate analysis of variance. Austral ecology, 26 0 (1): 0 32--46, 2001

work page 2001

[4] [4]

Clustering on the unit hypersphere using von mises-fisher distributions

Arindam Banerjee, Inderjit S Dhillon, Joydeep Ghosh, Suvrit Sra, and Greg Ridgeway. Clustering on the unit hypersphere using von mises-fisher distributions. Journal of Machine Learning Research, 6 0 (9), 2005

work page 2005

[5] [5]

An antipodally symmetric distribution on the sphere

Christopher Bingham. An antipodally symmetric distribution on the sphere. The Annals of Statistics, pages 1201--1225, 1974

work page 1974

[6] [6]

JAX : composable transformations of P ython+ N um P y programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake Vander P las, Skye Wanderman- M ilne, and Qiao Zhang. JAX : composable transformations of P ython+ N um P y programs, 2018. URL http://github.com/jax-ml/jax

work page 2018

[7] J Roger Bray and John T Curtis. An ordination of the upland forest communities of southern Wisconsin. Ecological Monographs, 27(4):326--349, 1957.

[8] Yici Chen and Ken’ichiro Tanaka. Maximum likelihood estimation of the Fisher--Bingham distribution via efficient calculation of its normalizing constant. Statistics and Computing, 31(4):40, 2021.

[9] Tim R Davidson, Luca Falorsi, Nicola De Cao, Thomas Kipf, and Jakub M Tomczak. Hyperspherical variational auto-encoders. arXiv preprint arXiv:1804.00891, 2018.

[10] Kai-Tai Fang, Samuel Kotz, and Kai W Ng. Symmetric multivariate and related distributions. Chapman and Hall/CRC, 2018.

[11] Gilberto E Flores, J Gregory Caporaso, Jessica B Henley, Jai Ram Rideout, Daniel Domogala, John Chase, Jonathan W Leff, Yoshiki Vázquez-Baeza, Antonio Gonzalez, Rob Knight, et al. Temporal variability is a personalized feature of the human microbiome. Genome Biology, 15(12):531, 2014.

[12] Walter Gautschi. Orthogonal polynomials: computation and approximation. OUP Oxford, 2004.

[13] Gregory B Gloor, Jean M Macklaim, Vera Pawlowsky-Glahn, and Juan J Egozcue. Microbiome datasets are compositional: and this is not optional. Frontiers in Microbiology, 8:2224, 2017.

[14] Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359--378, 2007.

[15] Abhishek Goel, Omprakash Shete, Sourav Goswami, Amit Samal, Lavanya CB, Saurabh Kedia, Vineet Ahuja, Paul W O’Toole, Fergus Shanahan, and Tarini Shankar Ghosh. Toward a health-associated core keystone index for the human gut microbiome. Cell Reports, 44(3), 2025.

[16] Susanne M Henning, Jieping Yang, Shih Lung Woo, Ru-Po Lee, Jianjun Huang, Anna Rasmusen, Catherine L Carpenter, Gail Thames, Irene Gilbuena, Chi-Hong Tseng, et al. Hass avocado inclusion in a weight-loss diet supported weight loss and altered gut microbiota: a 12-week randomized, parallel-controlled trial. Current Developments in Nutrition, 3(8):nzz..., 2019.

[17] Abhishek Kaul, Siddhartha Mandal, Ori Davidov, and Shyamal D Peddada. Analysis of microbiome data in the presence of excess zeros. Frontiers in Microbiology, 8:2114, 2017.

[18] John T Kent. The Fisher-Bingham distribution on the sphere. Journal of the Royal Statistical Society: Series B (Methodological), 44(1):71--80, 1982.

[19] John T Kent, Asaad M Ganeiber, and Kanti V Mardia. A new unified approach for the simulation of a wide class of directional distributions. Journal of Computational and Graphical Statistics, 27(2):291--301, 2018.

[20] Rob Knight, Alison Vrbanac, Bryn C Taylor, Alexander Aksenov, Chris Callewaert, Justine Debelius, Antonio Gonzalez, Tomasz Kosciolek, Laura-Isobel McCall, Daniel McDonald, et al. Best practices for analysing microbiomes. Nature Reviews Microbiology, 16(7):410--422, 2018.

[21] Alfred Kume and Tomonari Sei. On the exact maximum likelihood inference of Fisher--Bingham distributions using an adjusted holonomic gradient method. Statistics and Computing, 28(4):835--847, 2018.

[22] Alfred Kume and Stephen G Walker. On the Fisher--Bingham distribution. Statistics and Computing, 19(2):167--172, 2009.

[23] Alfred Kume and Andrew TA Wood. Saddlepoint approximations for the Bingham and Fisher--Bingham normalising constants. Biometrika, 92(2):465--476, 2005.

[24] Diane Lambert. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics, 34(1):1--14, 1992.

[25] Pierre Legendre and Miquel De Cáceres. Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters, 16(8):951--963, 2013.

[26] Pierre Legendre and Eugene D Gallagher. Ecologically meaningful transformations for ordination of species data. Oecologia, 129(2):271--280, 2001.

[27] Hongzhe Li. Microbiome, metagenomics, and high-dimensional compositional data analysis. Annual Review of Statistics and Its Application, 2(1):73--94, 2015.

[28] Dong C Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1):503--528, 1989.

[29] Siddhartha Mandal, Will Van Treuren, Richard A White, Merete Eggesbø, Rob Knight, and Shyamal D Peddada. Analysis of composition of microbiomes: a novel method for studying microbial composition. Microbial Ecology in Health and Disease, 26(1):27663, 2015.

[30] Kanti V Mardia and Peter E Jupp. Directional statistics. John Wiley & Sons, 2000.

[31] Josep A Martín-Fernández, Carles Barceló-Vidal, and Vera Pawlowsky-Glahn. Dealing with zeros and missing values in compositional data sets using nonparametric imputation. Mathematical Geology, 35(3):253--278, 2003.

[32] John Mullahy. Specification and testing of some modified count data models. Journal of Econometrics, 33(3):341--365, 1986.

[33] Whitney K Newey and Daniel McFadden. Large sample estimation and hypothesis testing. Handbook of Econometrics, 4:2111--2245, 1994.

[34] Tullis C Onstott. Application of the Bingham distribution function in paleomagnetic studies. Journal of Geophysical Research: Solid Earth, 85(B3):1500--1510, 1980.

[35] Joseph N Paulson, O Colin Stine, Héctor Corrada Bravo, and Mihai Pop. Differential abundance analysis for microbial marker-gene surveys. Nature Methods, 10(12):1200--1202, 2013.

[36] Maria G Sanabria-Veaz, Tori A Holthaus, Maggie Oleksiak, David Revilla, David A Alvarado, Marahi Perez-Tamayo, Naiman A Khan, and Hannah D Holscher. Persea americana for total health (PATH-2): Effects of avocado consumption on gastrointestinal health in a randomized, crossover, complete feeding trial. medRxiv, pages 2026--03, 2026.

[37] Janice L Scealy and Alan H Welsh. Fitting Kent models to compositional data with small concentration. Statistics and Computing, 24(2):165--179, 2014.

[38] Michael R Schwob, Mevin B Hooten, Nicholas M Calzada, and Timothy H Keitt. Spatial hyperspheric models for compositional data. The Annals of Applied Statistics, 19(4):2644--2663, 2025.

[39] Leila M Shinn, Yutong Li, Aditya Mansharamani, Loretta S Auvil, Michael E Welge, Colleen Bushell, Naiman A Khan, Craig S Charron, Janet A Novotny, David J Baer, et al. Fecal bacteria as biomarkers for predicting food intake in healthy adults. The Journal of Nutrition, 151(2):423--433, 2021.

[40] Michael A Stephens. Use of the von Mises distribution to analyse continuous proportions. Biometrika, 69(1):197--203, 1982.

[41] Sharon V Thompson, Melisa A Bailey, Andrew M Taylor, Jennifer L Kaczmarek, Annemarie R Mysonhimer, Caitlyn G Edwards, Ginger E Reeser, Nicholas A Burd, Naiman A Khan, and Hannah D Holscher. Avocado consumption alters gastrointestinal bacteria abundance and microbial metabolite concentrations among adults with overweight or obesity: a randomized controlled trial. ..., 2021.

[42] James Townsend, Niklas Koep, and Sebastian Weichwald. Pymanopt: A Python toolbox for optimization on manifolds using automatic differentiation. Journal of Machine Learning Research, 17(137):1--5, 2016.

[43] Peter J Turnbaugh, Micah Hamady, Tanya Yatsunenko, Brandi L Cantarel, Alexis Duncan, Ruth E Ley, Mitchell L Sogin, William J Jones, Bruce A Roe, Jason P Affourtit, et al. A core gut microbiome in obese and lean twins. Nature, 457(7228):480--484, 2009.

[44] Aad W Van der Vaart. Asymptotic statistics, volume 3. Cambridge University Press, 2000.

[45] Greg CG Wei and Martin A Tanner. A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms. Journal of the American Statistical Association, 85(411):699--704, 1990.

[46] Halbert White. Maximum likelihood estimation of misspecified models. Econometrica: Journal of the Econometric Society, pages 1--25, 1982.

[47] Halbert White. Estimation, inference and specification analysis. Cambridge University Press, 1996.

[48] Lizhen Xu, Andrew D Paterson, Williams Turpin, and Wei Xu. Assessment and selection of competing models for zero-inflated microbiome data. PLoS ONE, 10(7):e0129606, 2015.

[49] Pelin Yilmaz, Laura Wegener Parfrey, Pablo Yarza, Jan Gerken, Elmar Pruesse, Christian Quast, Timmy Schweer, Jörg Peplies, Wolfgang Ludwig, and Frank Oliver Glöckner. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Research, 42(D1):D643--D648, 2014.