Mixed citations
Journal of Classification, 1985
Mixed citation behavior. Most common role is background (60%).
citing papers explorer
- Online Learning for Autoregressive Multilayer Stochastic Block Models under Stationarity and Non-Stationarity
  Introduces the AR(1)-MSBM for evolving multilayer networks and provides online estimators with minimax-optimal rates and community recovery guarantees under stationarity and non-stationarity via adaptive windowing.
- A Large-Scale Comparative Analysis of Imputation Methods for Single-Cell RNA Sequencing Data
  A large benchmark finds traditional imputation methods for scRNA-seq data generally outperform deep learning ones, but numerical recovery does not reliably improve biological downstream analyses and no method wins across all settings.
- From Documents to Segments: A Contextual Reformulation for Topic Assignment
  SBTA reformulates topic modeling to assign topics at the segment level rather than document level, yielding cleaner topics on a new SemEval-STM dataset created via LLM decomposition and human refinement.
- Class Angular Distortion Index for Dimensionality Reduction
  CADI quantifies the preservation of relative cluster angles in low-dimensional projections using internal angles from point triples.
- Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus
  Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.
- Why Network Segmentation Projects Fail
  Survey of 400 practitioners identifies four failure archetypes for network segmentation projects, with uniform preference for general IT project management fixes over segmentation-specific ones.
- Blurring Mean Shift for Clustering Functional Data: A Scalable Algorithm and Convergence Analysis
  Extends blurring mean shift to functional data in Hilbert space with convergence analysis and a scalable stochastic variant based on random partitioning.
- When Can Digital Personas Reliably Approximate Human Survey Findings?
  LLM digital personas improve alignment with human survey response distributions for stable attributes but remain limited for individual prediction and fail to recover multivariate respondent structure.
- ProfileGLMM: a R Package Extending Bayesian Profile Regression using Generalised Linear Mixed Models
  ProfileGLMM is an R package extending Bayesian profile regression with GLMMs to support hierarchical data, random effects, and cluster-covariate interactions for continuous or binary outcomes.
- Assessing the impact of dimensionality reduction on clustering performance -- a systematic study
  The effectiveness of dimensionality reduction before clustering depends on matching the specific technique and target dimension count to the data geometry and the clustering algorithm used.
- Time-dependent structural equation modeling of fans' football fever using activity tracking data during the 2025 DFB Cup final
  Football fever in spectators follows a V-shaped time course captured as a latent process from heart rate and stress data via time-dependent structural equation modeling.
- SoK: Practical Aspects of Releasing Differentially Private Graphs
  The authors provide a systematization of differentially private graph release methods along with an objective-based framework and two illustrative evaluations for social network analysts.
- An Explainable Unsupervised-to-Supervised Machine Learning Framework for Dietary Pattern Discovery Using UK National Dietary Survey Data
  An unsupervised-to-supervised ML pipeline on UK NDNS data discovers four dietary patterns, reproduces them with macro-F1 0.963 using a surrogate classifier, and interprets them via SHAP for potential clinical use.