Adaptive Bayesian computation for efficient biobank-scale genomic inference

Helene Ruffieux; John Whittaker; Sylvia Richardson; Yiran Li

arxiv: 2509.10736 · v2 · submitted 2025-09-12 · 📊 stat.AP

Pith reviewed 2026-05-18 17:41 UTC · model grok-4.3

classification 📊 stat.AP

keywords adaptive variational inference · biobank-scale genomics · pQTL mapping · hierarchical Bayesian models · coordinate ascent · multi-trait analysis · computational efficiency · UK Biobank

The pith

Adaptive focus strategy in variational inference cuts runtime by up to 50% for biobank pQTL mapping

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an adaptive focus strategy inside block coordinate ascent variational inference that updates only the parameter subsets judged relevant from current estimates. This targets the computational bottleneck in fitting hierarchical Bayesian models to biobank data that jointly analyze many traits or units. A sympathetic reader would care because full updates become prohibitive at genome-wide scale with thousands of traits and large samples. The approach is shown on a joint model of hierarchically linked regressions for protein QTL mapping, delivering up to 50 percent runtime reduction while preserving statistical performance in both simulated and real UK Biobank proteomic data.

Core claim

We propose an adaptive focus (AF) strategy within a block coordinate ascent variational inference (CAVI) framework that selectively updates subsets of parameters at each iteration, corresponding to units deemed relevant based on current estimates. We illustrate this approach in protein quantitative trait locus (pQTL) mapping using a joint model of hierarchically linked regressions with shared parameters across traits. In both simulated data and real proteomic data from the UK Biobank, AF-CAVI achieves up to a 50% reduction in runtime while maintaining statistical performance. We also provide a genome-wide pipeline for multi-trait pQTL mapping across thousands of traits.

What carries the argument

Adaptive focus (AF) strategy within block coordinate ascent variational inference (CAVI), which selects and updates only relevant parameter subsets based on current variational estimates to concentrate effort on biologically important units.
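
As a rough illustration of selective updating, here is a minimal sketch on a toy problem. It is not the paper's model or selection rule: it assumes a mean-field Gaussian approximation for a ridge-type linear regression (only the variational means are tracked), a hypothetical magnitude threshold that decides which coordinates count as relevant, and a hypothetical random re-inclusion probability so that skipped coordinates are revisited occasionally. The names `cavi_sweep`, `af_cavi`, `threshold`, and `explore` are illustrative only.

```python
import numpy as np


def cavi_sweep(X, y, m, active, lam=1.0):
    """One coordinate-ascent sweep over the coordinates listed in `active`.

    Each update is the mean-field Gaussian mean for a ridge-type prior
    (an assumption of this toy example, not the paper's model).
    """
    resid = y - X @ m
    col_sq = (X ** 2).sum(axis=0)
    for j in active:
        resid += X[:, j] * m[j]                      # remove coordinate j from the fit
        m[j] = X[:, j] @ resid / (col_sq[j] + lam)   # refresh its variational mean
        resid -= X[:, j] * m[j]                      # add the refreshed coordinate back
    return m


def af_cavi(X, y, n_iter=50, threshold=0.05, explore=0.1, seed=0):
    """Toy adaptive-focus loop: update only 'relevant' coordinates.

    `threshold` and `explore` are hypothetical tuning knobs.
    Returns the variational means and the number of coordinate updates done.
    """
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    m = np.zeros(p)
    m = cavi_sweep(X, y, m, range(p))                # warm-up: one full sweep
    n_updates = p
    for _ in range(n_iter - 1):
        relevant = np.abs(m) > threshold             # judged from current estimates
        revisit = rng.random(p) < explore            # random re-inclusion of skipped units
        active = np.flatnonzero(relevant | revisit)
        m = cavi_sweep(X, y, m, active)
        n_updates += active.size
    return m, n_updates
```

Setting `threshold` below zero makes every coordinate relevant, which recovers full sweeps and gives a baseline for counting updates on the same data.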

If this is right

Enables routine joint modeling of thousands of traits at biobank scale by cutting computation time.
Preserves statistical performance in pQTL mapping tasks compared with full updates.
Supports practical genome-wide pipelines for multi-trait analyses.
Applies to other hierarchical Bayesian models where effects concentrate in few units.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar selective updating could accelerate variational methods in other high-dimensional sparse-signal domains such as imaging or single-cell data.
The method suggests testing dynamic relevance criteria that evolve during optimization rather than fixing them early.
Combining the focus strategy with stochastic or parallel updates could yield further speed gains in even larger datasets.

Load-bearing premise

That parameter subsets identified as relevant from current variational estimates are sufficient to preserve the quality of the joint posterior approximation over the full high-dimensional space without systematic under-updating of important units.

What would settle it

Run both full CAVI and AF-CAVI on the same UK Biobank proteomic dataset with thousands of traits and compare the sets of discovered pQTLs and posterior effect estimates; large discrepancies in detected associations would falsify the claim of maintained performance.
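
One way such a comparison could be summarized is sketched below. It assumes each fit returns a SNP-by-trait array of posterior probabilities of inclusion (PPI) and a matching array of effect estimates; the function name `compare_fits`, the argument names, and the 0.5 PPI cutoff are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def compare_fits(ppi_full, ppi_af, beta_full, beta_af, ppi_cutoff=0.5):
    """Compare discoveries and effect estimates from two fits of the same data.

    All inputs are arrays of identical shape (e.g. SNPs x traits).
    `ppi_cutoff` is a hypothetical discovery threshold.
    """
    hits_full = np.asarray(ppi_full) > ppi_cutoff
    hits_af = np.asarray(ppi_af) > ppi_cutoff
    union = np.logical_or(hits_full, hits_af).sum()
    inter = np.logical_and(hits_full, hits_af).sum()
    # Jaccard index of the two discovery sets; defined as 1 when both are empty
    jaccard = inter / union if union > 0 else 1.0
    diff = np.abs(np.asarray(beta_full) - np.asarray(beta_af))
    return {
        "jaccard": float(jaccard),
        "only_full": int((hits_full & ~hits_af).sum()),   # found by full CAVI only
        "only_af": int((hits_af & ~hits_full).sum()),     # found by AF-CAVI only
        "max_abs_beta_diff": float(diff.max()),
    }
```

A Jaccard index near 1 with few one-sided discoveries would be consistent with maintained performance; a large `only_full` count would point to associations lost through skipped updates.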

Figures

Figures reproduced from arXiv: 2509.10736 by Helene Ruffieux, John Whittaker, Sylvia Richardson, Yiran Li.

**Figure 2.** Relative differences in runtime (local and total) and iterations for the RF-CAVI and series …

**Figure 3.** Pipeline for genome-wide joint pQTL mapping using the AF-CAVI algorithm for the …

**Figure 4.** Regression coefficients (a.k.a. BETA, left) and PPI (right) inferred by the vanilla CAVI and AF-CAVI algorithms in each locus. The maximum value is taken for each response in each locus.

… perturbation mechanism in the selection of local factors, which shares similarity with the adaptive scanning MCMC suggested in Richardson, Bottolo, and Rosenthal [24] for sparse Bayesian hierarchical regressions. Such ideas hav…

Original abstract

Motivation: Modern biobanks, with unprecedented sample sizes and phenotypic diversity, have become foundational resources for genomic studies, enabling powerful cross-phenotype and population-scale analyses. As studies grow in complexity, Bayesian hierarchical models offer a principled framework for jointly modeling multiple units such as cells, traits, and experimental conditions, increasing statistical power through information sharing. However, adoption of Bayesian hierarchical models in biobank-scale studies remains limited due to computational inefficiencies, particularly in posterior inference over high-dimensional parameter spaces. Deterministic approximations such as variational inference provide scalable alternatives to Markov Chain Monte Carlo, yet current implementations do not fully exploit the structure of genome-wide multi-unit modeling, especially when biological effects of interest are concentrated in a few units.

Results: We propose an adaptive focus (AF) strategy within a block coordinate ascent variational inference (CAVI) framework that selectively updates subsets of parameters at each iteration, corresponding to units deemed relevant based on current estimates. We illustrate this approach in protein quantitative trait locus (pQTL) mapping using a joint model of hierarchically linked regressions with shared parameters across traits. In both simulated data and real proteomic data from the UK Biobank, AF-CAVI achieves up to a 50% reduction in runtime while maintaining statistical performance. We also provide a genome-wide pipeline for multi-trait pQTL mapping across thousands of traits, demonstrating AF-CAVI as an efficient scheme for large-scale, multi-unit Bayesian analysis in biobanks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

AF-CAVI shows practical runtime cuts for multi-trait pQTL but the adaptive block selection needs checks on whether it preserves ELBO stationarity.

The main point is that this paper adds an adaptive focus step inside block CAVI to skip irrelevant parameter blocks during variational updates for hierarchically linked multi-trait regressions, and it reports up to 50% faster runtimes on simulated data plus UK Biobank proteomics while holding statistical performance steady. The concrete setup is pQTL mapping with shared parameters across traits, and the method uses current variational means or variances to pick which units to update at each iteration. That targeted sparsity exploitation is the actual novelty here, and the empirical side is straightforward: they run the pipeline genome-wide on thousands of traits and show the speed gain without obvious accuracy loss on both simulated and real data. Credit is due for shipping a usable large-scale implementation rather than just theory.

The soft spot is the one the stress test flags. Because updates are coordinate-wise and traits are coupled through shared parameters, an early low estimate for one trait can cause a unit to be dropped even if its contribution to the joint ELBO would grow later. The paper does not report the fraction of units permanently excluded, nor does it compare final ELBO values against plain CAVI, so it is not clear whether the final approximation is still a stationary point. The abstract claims maintained performance, but without sensitivity checks on the relevance threshold or exact baseline details, that claim stays somewhat provisional.

This is for people who already run variational methods on biobank-scale genomic data and need faster coordinate ascent without rewriting their whole pipeline. A reader working on scalable hierarchical models in genetics would get concrete implementation ideas and runtime numbers worth testing. The work has enough empirical grounding and addresses a real bottleneck to deserve serious referee time, even if the adaptive guarantee needs tightening.
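
For readers who want the stationarity concern in symbols, the standard mean-field setup (as reviewed in Blei et al. [11]) can be written as follows. The symbols θ_j, J, and the active set A_t used afterwards are notation introduced here for illustration, not taken from the paper, and the inequality assumes that each selected block receives its exact CAVI update.

```latex
\begin{align*}
\mathcal{L}(q) &= \mathbb{E}_{q}\big[\log p(y, \theta)\big] - \mathbb{E}_{q}\big[\log q(\theta)\big],
  \qquad q(\theta) = \prod_{j=1}^{J} q_j(\theta_j), \\
q_j^{\star}(\theta_j) &\propto \exp\Big\{ \mathbb{E}_{q_{-j}}\big[\log p(y, \theta)\big] \Big\}, \\
\mathcal{L}\big(q_j^{\star}, q_{-j}\big) &\ge \mathcal{L}\big(q_j, q_{-j}\big)
  \quad \text{for every block } j, \\
\text{stationarity:}\quad q_j &= q_j^{\star} \quad \text{for all } j \in \{1, \dots, J\}.
\end{align*}
```

If iteration t only updates blocks in an active set A_t ⊂ {1, …, J}, the inequality still holds for each update that is performed, but the fixed-point condition is only enforced for blocks that keep entering A_t. A block that is excluded from some iteration onward is never checked against its optimum, which is why the fraction of permanently skipped units matters.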

Referee Report

2 major / 3 minor

Summary. The manuscript introduces an adaptive focus (AF) strategy within a block coordinate ascent variational inference (CAVI) framework for scalable posterior inference in high-dimensional Bayesian hierarchical models applied to biobank-scale genomic data. The approach selectively updates parameter subsets deemed relevant based on current variational estimates and is illustrated in protein quantitative trait locus (pQTL) mapping via a joint model of hierarchically linked regressions with shared parameters across traits. Empirical evaluations on simulated data and real proteomic data from the UK Biobank report up to 50% runtime reduction while maintaining statistical performance, and a genome-wide pipeline for multi-trait pQTL mapping is presented.

Significance. If the adaptive selection reliably preserves the fidelity of the variational approximation to the joint posterior, the method could meaningfully expand the feasibility of Bayesian hierarchical modeling for multi-unit analyses at biobank scales. The empirical results on both simulated and real UK Biobank data, together with the provided pipeline, support practical utility for large-scale genomic inference.

major comments (2)

[Methods (AF-CAVI algorithm description)] The AF strategy selects parameter blocks for update using thresholds on current variational means or variances. In the hierarchically linked regression model, shared parameters across traits couple the units; an early underestimate for one trait can therefore cause a unit to be skipped even when its marginal contribution to the joint ELBO is non-negligible. Because CAVI updates are coordinate-wise, repeated skipping can leave the variational distribution at a point that is not a stationary point of the full ELBO, violating the usual monotonicity guarantee.
[Results (simulated and real-data experiments)] The paper reports runtime and statistical performance on simulated and UK Biobank data but does not quantify the fraction of units that are permanently excluded or compare final ELBO values between AF-CAVI and full CAVI. This information is needed to assess whether the adaptive approximation preserves quality over the full high-dimensional space.
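
The diagnostics requested in the second comment could be computed along the following lines. This is a sketch under assumptions: it supposes the algorithm logs, for every iteration, a boolean record of which units were updated, and that final ELBO values are available from both runs. The function names, the `update_mask` layout, and the `warmup` argument are hypothetical.

```python
import numpy as np


def exclusion_summary(update_mask, warmup=1):
    """Summarize how often units are skipped.

    update_mask: boolean array of shape (n_iterations, n_units),
                 True where a unit was updated at that iteration.
    warmup:      number of initial iterations to ignore (assumed full sweeps).
    """
    mask = np.asarray(update_mask, dtype=bool)
    after = mask[warmup:]                        # iterations where skipping can occur
    skipped_per_iter = 1.0 - after.mean(axis=1)  # fraction of units skipped, per iteration
    never_updated = ~after.any(axis=0)           # units not touched again after warm-up
    return {
        "mean_fraction_skipped": float(skipped_per_iter.mean()),
        "fraction_permanently_skipped": float(never_updated.mean()),
        "permanently_skipped_units": np.flatnonzero(never_updated),
    }


def relative_elbo_gap(elbo_full, elbo_af):
    """Relative shortfall of the AF-CAVI ELBO versus full CAVI (positive = AF is lower)."""
    return (elbo_full - elbo_af) / abs(elbo_full)
```

The summary is only meaningful when at least one iteration follows the warm-up; with no such iterations the means are undefined.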

minor comments (3)

[Abstract] The abstract states that AF-CAVI 'maintains statistical performance' but provides no details on the exact metrics (e.g., power, false discovery rate) or the precise baseline methods used for comparison.
[Results] Sensitivity of results to the choice of relevance threshold is not explored; reporting performance across a range of thresholds would strengthen the robustness claims.
Notation for the hierarchically linked regression model and the shared parameters could be introduced more explicitly with a small illustrative diagram to aid readers.
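
To make the metrics named in the first minor comment concrete, a minimal sketch of power and false discovery rate for simulated data is given below. It assumes ground-truth association indicators are known (which holds only in simulations) and that discoveries are declared by thresholding PPIs; the function name and the 0.5 cutoff are illustrative assumptions.

```python
import numpy as np


def power_and_fdr(ppi, truth, cutoff=0.5):
    """Empirical power and false discovery rate at a given PPI cutoff.

    ppi:   array of posterior probabilities of inclusion.
    truth: array of the same shape, nonzero where an association truly exists.
    """
    called = np.asarray(ppi) > cutoff
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(called & truth)                 # true discoveries
    fp = np.sum(called & ~truth)                # false discoveries
    power = tp / max(truth.sum(), 1)            # guard against no true signals
    fdr = fp / max(called.sum(), 1)             # guard against no discoveries
    return float(power), float(fdr)
```

Evaluating the same function on fits obtained under several relevance thresholds would give the sensitivity table suggested in the second minor comment.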

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the presentation.

Referee: [Methods (AF-CAVI algorithm description)] The AF strategy selects parameter blocks for update using thresholds on current variational means or variances. In the hierarchically linked regression model, shared parameters across traits couple the units; an early underestimate for one trait can therefore cause a unit to be skipped even when its marginal contribution to the joint ELBO is non-negligible. Because CAVI updates are coordinate-wise, repeated skipping can leave the variational distribution at a point that is not a stationary point of the full ELBO, violating the usual monotonicity guarantee.

Authors: We appreciate the referee's careful analysis of the convergence implications. The adaptive block selection in AF-CAVI is driven by current variational estimates with the goal of concentrating computation on units that contribute meaningfully to the joint posterior. We acknowledge that this dynamic selection means the standard monotonicity proof for fixed-block coordinate ascent does not apply directly, and the final variational distribution may not be a stationary point of the unrestricted ELBO. In the revised manuscript we will expand the Methods section to discuss this point explicitly, describe the safeguards built into our threshold rules, and report empirical checks confirming that the attained ELBO values remain close to those of full CAVI. revision: yes
Referee: [Results (simulated and real-data experiments)] The paper reports runtime and statistical performance on simulated and UK Biobank data but does not quantify the fraction of units that are permanently excluded or compare final ELBO values between AF-CAVI and full CAVI. This information is needed to assess whether the adaptive approximation preserves quality over the full high-dimensional space.

Authors: We agree that these diagnostics would provide valuable reassurance about approximation quality. In the revised manuscript we will add quantitative summaries of the fraction of units excluded at each iteration and the proportion that remain permanently skipped. We will also include direct ELBO comparisons between AF-CAVI and standard CAVI on the simulated data and on a representative subset of UK Biobank traits for which full CAVI remains computationally tractable. These additions will allow readers to evaluate the fidelity of the adaptive approximation more rigorously. revision: yes

Circularity Check

0 steps flagged

Empirical runtime gains rest on separate validation datasets with no derivation reducing to fitted inputs

The paper introduces an adaptive focus (AF) block-coordinate variational inference scheme for a hierarchically linked multi-trait regression model and reports up to 50% runtime reduction on simulated and UK Biobank pQTL data while preserving statistical performance. No equations or coordinate-ascent updates are shown to be equivalent by construction to the selection thresholds or to any fitted quantity; the central claims are supported by direct comparison against full CAVI on held-out data rather than by self-referential definitions or load-bearing self-citations. The method therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes the hierarchical regression structure and the validity of the relevance heuristic.

pith-pipeline@v0.9.0 · 5793 in / 1037 out tokens · 32177 ms · 2026-05-18T17:41:09.293542+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We propose an adaptive focus (AF) strategy within a block coordinate ascent variational inference (CAVI) framework that selectively updates subsets of parameters at each iteration, corresponding to units deemed relevant based on current estimates.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: AF-CAVI achieves up to a 50% reduction in runtime while maintaining statistical performance

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1] William Ollier, Tim Sprosen, et al. “UK Biobank: From Concept to Reality”. In: Pharmacogenomics 6.6 (Sept. 2005), pp. 639–646. DOI: 10.2217/14622416.6.6.639
[2] Wei Zhou, Masahiro Kanai, Kuan-Han H. Wu, et al. “Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease”. In: Cell Genomics 2.10 (Oct. 2022). DOI: 10.1016/j.xgen.2022.100192
[3] Clare Bycroft, Colin Freeman, Desislava Petkova, et al. “The UK Biobank resource with deep phenotyping and genomic data”. In: Nature 562.7726 (Oct. 2018), pp. 203–209. DOI: 10.1038/s41586-018-0579-z
[4] Benjamin B. Sun, Joshua Chiou, Matthew Traylor, et al. “Plasma proteomic associations with genetics and health in the UK Biobank”. In: Nature 622.7982 (Oct. 2023), pp. 329–338. DOI: 10.1038/s41586-023-06592-6
[5] Karsten Suhre. “Genetic associations with ratios between protein levels detect new pQTLs and reveal protein-protein interactions”. In: Cell Genomics 4.3 (Mar. 2024). DOI: 10.1016/j.xgen.2024.100506
[6] Mihir G. Sukhatme, Asha Kar, Uma Thanigai Arasu, et al. Integration of single cell omics with biobank data discovers trans effects of SREBF1 abdominal obesity risk variants on adipocyte expression of more than 100 genes. Nov. 2024. DOI: 10.1101/2024.11.22.24317804
[7] Ruiyan Luo and Hongyu Zhao. “Bayesian hierarchical modeling for signaling pathway inference from single cell interventional data”. In: The Annals of Applied Statistics 5.2A (2011), pp. 725–745. DOI: 10.1214/10-AOAS425
[8] Samprit Banerjee, Brian S. Yandell, and Nengjun Yi. “Bayesian Quantitative Trait Loci Mapping for Multiple Traits”. In: Genetics 179.4 (Aug. 2008), p. 2275. DOI: 10.1534/genetics.108.088427
[9] Timothée Flutre, Xiaoquan Wen, Jonathan Pritchard, et al. “A Statistical Framework for Joint eQTL Analysis in Multiple Tissues”. In: PLOS Genetics 9.5 (May 2013), e1003486. DOI: 10.1371/journal.pgen.1003486
[10] Youshu Cheng, Biao Cai, Hongyu Li, et al. “HBI: a hierarchical Bayesian interaction model to estimate cell-type-specific methylation quantitative trait loci incorporating priors from cell-sorted bisulfite sequencing data”. In: Genome Biology 25.1 (Oct. 2024), p. 273. DOI: 10.1186/s13059-024-03411-7
[11] David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. “Variational Inference: A Review for Statisticians”. In: Journal of the American Statistical Association 112.518 (Apr. 2017), pp. 859–877. DOI: 10.1080/01621459.2017.1285773
[12] Hemant Ishwaran and J. Sunil Rao. “Spike and slab variable selection: Frequentist and Bayesian strategies”. In: The Annals of Statistics 33.2 (Apr. 2005), pp. 730–773. DOI: 10.1214/009053604000001147
[13] Carlos M. Carvalho, Nicholas G. Polson, and James G. Scott. “The horseshoe estimator for sparse signals”. In: Biometrika 97.2 (2010), pp. 465–480
[14] Erika Assarsson, Martin Lundberg, Göran Holmquist, et al. “Homogenous 96-Plex PEA Immunoassay Exhibiting High Sensitivity, Specificity, and Excellent Scalability”. In: PLoS ONE 9.4 (Apr. 2014). Ed. by Jörg D. Hoheisel, e95192. DOI: 10.1371/journal.pone.0095192
[15] Benjamin B. Sun, Joshua Chiou, Matthew Traylor, et al. Genetic regulation of the human plasma proteome in 54,306 UK Biobank participants. June 2022. DOI: 10.1101/2022.06.17.496443
[17] Helene Ruffieux, Anthony C. Davison, Jorg Hager, et al. “Efficient inference for genetic association studies with multiple outcomes”. In: Biostatistics 18.4 (Oct. 2017), pp. 618–636. DOI: 10.1093/biostatistics/kxx007
[18] Marie Pier Scott-Boyer, Gregory C. Imholte, Arafat Tayeb, et al. “An Integrated Hierarchical Bayesian Model for Multivariate eQTL Mapping”. In: Statistical Applications in Genetics and Molecular Biology 11.4 (Jan. 2012). DOI: 10.1515/1544-6115.1760
[19] Kathryn E. Kemper, Philip J. Bowman, Benjamin J. Hayes, et al. “A multi-trait Bayesian method for mapping QTL and genomic prediction”. In: Genetics Selection Evolution 50.1 (Mar. 2018), p. 10. DOI: 10.1186/s12711-018-0377-y
[20] Tianxiao Huan, Chunyu Liu, Roby Joehanes, et al. “A Systematic Heritability Analysis of the Human Whole Blood Transcriptome”. In: Human Genetics 134.3 (Mar. 2015), pp. 343–358. DOI: 10.1007/s00439-014-1524-3
[21] Tomaz Berisa and Joseph K. Pickrell. “Approximately independent linkage disequilibrium blocks in human populations”. In: Bioinformatics 32.2 (Jan. 2016), pp. 283–285. DOI: 10.1093/bioinformatics/btv546
[22] Maria M. Barbieri, James O. Berger, Edward I. George, et al. “The Median Probability Model and Correlated Variables”. In: Bayesian Analysis 16.4 (Dec. 2021). DOI: 10.1214/20-BA1249
[23] Matthew D. Hoffman, David M. Blei, Chong Wang, et al. “Stochastic Variational Inference”. In: Journal of Machine Learning Research 14 (2013), pp. 1303–1347
[24] Sylvia Richardson, Leonardo Bottolo, and Jeffrey S. Rosenthal. “Bayesian Models for Sparse Regression Analysis of High Dimensional Data”. In: Bayesian Statistics 9. Ed. by José M. Bernardo, M. J. Bayarri, James O. Berger, et al. Oxford University Press, Oct. 2011, p. 0. DOI: 10.1093/acprof:oso/9780199694587.003.0018
[25] Yu. Nesterov. “Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems”. In: SIAM Journal on Optimization 22.2 (Jan. 2012), pp. 341–362. DOI: 10.1137/100802001
[26] Hélène Ruffieux, Anthony C. Davison, Jörg Hager, et al. “A global-local approach for detecting hotspots in multiple-response regression”. In: The Annals of Applied Statistics 14.2 (June 2020). DOI: 10.1214/20-AOAS1332
[27] Hélène Ruffieux. ECHOSEQ R-package (https://github.com/hruffieux/echoseq)
[28] Hélène Ruffieux, Jérôme Carayol, Radu Popescu, et al. “A fully joint Bayesian quantitative trait locus mapping of human protein abundance in plasma”. In: PLOS Computational Biology 16.6 (June 2020), e1007882. DOI: 10.1371/journal.pcbi.1007882
[29] Ani Manichaikul, Josyf C. Mychaleckyj, Stephen S. Rich, et al. “Robust relationship inference in genome-wide association studies”. In: Bioinformatics 26.22 (Nov. 2010), pp. 2867–2873. DOI: 10.1093/bioinformatics/btq559

Appendices. A: Details of the atlasQTL model. Here we provide details of the atlasQTL model by Ruffieux, Davison, Hager, et al. [26]. Given p...

[30] Define the dependence structure γ_st = 1{X_s is associated with y_t} for each pair of X_s, y_t.
[31] Simulate the error terms ε_t with a specified correlation structure in the responses.
[32] Simulate the effect sizes β_st. The rest of this section explains the details of each step, using functions in the echoseq package [27] and parameters selected according to Ruffieux, Carayol, Popescu, et al. [28]. No missing values are inserted in the simulated responses, for simplicity. Step 1: Defining the dependence structure. Randomly select a p percentag... “active”, i.e., associated with at least one response, while the other SNPs are set as “inactive”
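
The three simulation steps listed above can be sketched in a few lines. This is not the echoseq implementation and does not reproduce the authors' parameter choices, which are truncated in the extracted text: the genotype model, the equicorrelated error covariance, the Gaussian effect sizes, and every default value and name below (`simulate_pqtl`, `frac_active`, `prob_assoc`, `rho`, `effect_sd`) are assumptions made for illustration.

```python
import numpy as np


def simulate_pqtl(n=200, p=50, q=10, frac_active=0.1, prob_assoc=0.3,
                  rho=0.5, effect_sd=0.5, seed=0):
    """Toy multi-response simulation following the three listed steps.

    All defaults are hypothetical and chosen only so the example runs quickly.
    """
    rng = np.random.default_rng(seed)

    # Genotype-like predictors: allele counts in {0, 1, 2} (assumed model)
    maf = rng.uniform(0.05, 0.5, size=p)
    X = rng.binomial(2, maf, size=(n, p)).astype(float)

    # Step 1: dependence structure gamma[s, t] = 1 if SNP s is associated with response t
    n_active = max(1, int(round(frac_active * p)))
    active = rng.choice(p, size=n_active, replace=False)
    gamma = np.zeros((p, q), dtype=bool)
    gamma[active] = rng.random((n_active, q)) < prob_assoc
    for s in active:
        if not gamma[s].any():                  # an "active" SNP needs at least one response
            gamma[s, rng.integers(q)] = True

    # Step 2: error terms with equicorrelated responses (assumed structure)
    cov = np.full((q, q), rho) + (1.0 - rho) * np.eye(q)
    eps = rng.multivariate_normal(np.zeros(q), cov, size=n)

    # Step 3: effect sizes, nonzero only where gamma is True
    beta = np.where(gamma, rng.normal(0.0, effect_sd, size=(p, q)), 0.0)

    Y = X @ beta + eps
    return X, Y, gamma, beta
```

The returned `gamma` can serve as ground truth when computing power or false discovery rates for a fitted model.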

[1] [1]

UK Biobank: From Concept to Reality

William Ollier, Sprosen Tim, et al. “UK Biobank: From Concept to Reality”. In:Pharmacoge- nomics6.6 (Sept. 2005), pp. 639–646.DOI:10.2217/14622416.6.6.639

work page doi:10.2217/14622416.6.6.639 2005

[2] [2]

Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease

Wei Zhou, Masahiro Kanai, Kuan-Han H. Wu, et al. “Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease”. In:Cell Genomics2.10 (Oct. 2022).DOI: 10.1016/j.xgen.2022.100192

work page doi:10.1016/j.xgen.2022.100192 2022

[3] [3]

The UK Biobank resource with deep phenotyping and genomic data

Clare Bycroft, Colin Freeman, Desislava Petkova, et al. “The UK Biobank resource with deep phenotyping and genomic data”. en. In:Nature562.7726 (Oct. 2018), pp. 203–209.DOI: 10. 1038/s41586-018-0579-z

work page 2018

[4] [4]

Plasma proteomic associations with genetics and health in the UK Biobank

Benjamin B. Sun, Joshua Chiou, Matthew Traylor, et al. “Plasma proteomic associations with genetics and health in the UK Biobank”. en. In:Nature622.7982 (Oct. 2023), pp. 329–338.DOI: 10.1038/s41586-023-06592-6

work page doi:10.1038/s41586-023-06592-6 2023

[5] [5]

Genetic associations with ratios between protein levels detect new pQTLs and reveal protein-protein interactions

Karsten Suhre. “Genetic associations with ratios between protein levels detect new pQTLs and reveal protein-protein interactions”. English. In:Cell Genomics4.3 (Mar. 2024).DOI: 10.1016/ j.xgen.2024.100506

work page arXiv 2024

[6] [6]

Mihir G. Sukhatme, Asha Kar, Uma Thanigai Arasu, et al.Integration of single cell omics with biobank data discovers trans effects of SREBF1 abdominal obesity risk variants on adipocyte expression of more than 100 genes. en. Nov. 2024.DOI: 10.1101/2024.11.22.24317804

work page doi:10.1101/2024.11.22.24317804 2024

[7] [7]

Bayesian hierarchical modeling for signaling pathway inference from single cell interventional data

Ruiyan Luo and Hongyu Zhao. “Bayesian hierarchical modeling for signaling pathway inference from single cell interventional data”. In:The annals of applied statistics5.2A (2011), pp. 725–745. DOI:10.1214/10-AOAS425

work page doi:10.1214/10-aoas425 2011

[8] [8]

Bayesian Quantitative Trait Loci Mapping for Multiple Traits

Samprit Banerjee, Brian S. Yandell, and Nengjun Yi. “Bayesian Quantitative Trait Loci Mapping for Multiple Traits”. en. In:Genetics179.4 (Aug. 2008), p. 2275.DOI: 10.1534/genetics. 108.088427

work page doi:10.1534/genetics 2008

[9] [9]

A Statistical Framework for Joint eQTL Analysis in Multiple Tissues

Timothée Flutre, Xiaoquan Wen, Jonathan Pritchard, et al. “A Statistical Framework for Joint eQTL Analysis in Multiple Tissues”. en. In:PLOS Genetics9.5 (May 2013), e1003486.DOI: 10.1371/journal.pgen.1003486

work page doi:10.1371/journal.pgen.1003486 2013

[10] [10]

HBI: a hierarchical Bayesian interaction model to estimate cell-type-specific methylation quantitative trait loci incorporating priors from cell- sorted bisulfite sequencing data

Youshu Cheng, Biao Cai, Hongyu Li, et al. “HBI: a hierarchical Bayesian interaction model to estimate cell-type-specific methylation quantitative trait loci incorporating priors from cell- sorted bisulfite sequencing data”. In:Genome Biology25.1 (Oct. 2024), p. 273.DOI:10.1186/ s13059-024-03411-7

work page 2024

[11] [11]

Journal of the American Statistical Association , author =

David M. Blei, Alp Kucukelbir, and Jon D. McAuliffe. “Variational Inference: A Review for Statisticians”. en. In:Journal of the American Statistical Association112.518 (Apr. 2017), pp. 859– 877.DOI:10.1080/01621459.2017.1285773

work page doi:10.1080/01621459.2017.1285773 2017

[12] [12]

Spike and slab variable selection: Frequentist and Bayesian strategies

Hemant Ishwaran and J. Sunil Rao. “Spike and slab variable selection: Frequentist and Bayesian strategies”. In:The Annals of Statistics33.2 (Apr. 2005), pp. 730–773.DOI: 10.1214/009053604000001147

work page doi:10.1214/009053604000001147 2005

[13] [13]

The horseshoe estimator for sparse signals

Carlos M. Carvalho, Nicholas G. Polson, and James G. Scott. “The horseshoe estimator for sparse signals”. en. In:Biometrika97.2 (2010), pp. 465–480

work page 2010

[14] [14]

Homogenous 96-Plex PEA Immunoas- say Exhibiting High Sensitivity, Specificity, and Excellent Scalability

Erika Assarsson, Martin Lundberg, Göran Holmquist, et al. “Homogenous 96-Plex PEA Immunoas- say Exhibiting High Sensitivity, Specificity, and Excellent Scalability”. en. In:PLoS ONE9.4 (Apr. 2014). Ed. by Jörg D. Hoheisel, e95192.DOI:10.1371/journal.pone.0095192. 14

work page doi:10.1371/journal.pone.0095192 2014

[15] [15]

Genetic regulation of the human plasma proteome in 54,306 UK Biobank participants

Benjamin B. Sun, Joshua Chiou, Matthew Traylor, et al. Genetic regulation of the human plasma proteome in 54,306 UK Biobank participants. June 2022. DOI: 10.1101/2022.06.17.496443

work page doi:10.1101/2022.06.17 2022

[16] [17]

Efficient inference for genetic association studies with multiple outcomes

Helene Ruffieux, Anthony C. Davison, Jorg Hager, et al. “Efficient inference for genetic association studies with multiple outcomes”. In: Biostatistics 18.4 (Oct. 2017), pp. 618–636. DOI: 10.1093/biostatistics/kxx007

work page 2017

[17] [18]

An Integrated Hierarchical Bayesian Model for Multivariate eQTL Mapping

Marie Pier Scott-Boyer, Gregory C. Imholte, Arafat Tayeb, et al. “An Integrated Hierarchical Bayesian Model for Multivariate eQTL Mapping”. In: Statistical Applications in Genetics and Molecular Biology 11.4 (Jan. 2012). DOI: 10.1515/1544-6115.1760

work page doi:10.1515/1544-6115.1760 2012

[18] [19]

A multi-trait Bayesian method for mapping QTL and genomic prediction

Kathryn E. Kemper, Philip J. Bowman, Benjamin J. Hayes, et al. “A multi-trait Bayesian method for mapping QTL and genomic prediction”. In: Genetics Selection Evolution 50.1 (Mar. 2018), p. 10. DOI: 10.1186/s12711-018-0377-y

work page doi:10.1186/s12711-018-0377-y 2018

[19] [20]

A Systematic Heritability Analysis of the Human Whole Blood Transcriptome

Tianxiao Huan, Chunyu Liu, Roby Joehanes, et al. “A Systematic Heritability Analysis of the Human Whole Blood Transcriptome”. In: Human Genetics 134.3 (Mar. 2015), pp. 343–358. DOI: 10.1007/s00439-014-1524-3

work page doi:10.1007/s00439-014-1524-3 2015

[20] [21]

Approximately independent linkage disequilibrium blocks in human populations

Tomaz Berisa and Joseph K. Pickrell. “Approximately independent linkage disequilibrium blocks in human populations”. In: Bioinformatics 32.2 (Jan. 2016), pp. 283–285. DOI: 10.1093/bioinformatics/btv546

work page 2016

[21] [22]

The Median Probability Model and Correlated Variables

Maria M. Barbieri, James O. Berger, Edward I. George, et al. “The Median Probability Model and Correlated Variables”. In: Bayesian Analysis 16.4 (Dec. 2021). DOI: 10.1214/20-BA1249

work page doi:10.1214/20-ba1249 2021

[22] [23]

Stochastic Variational Inference

Matthew D. Hoffman, David M. Blei, Chong Wang, et al. “Stochastic Variational Inference”. In: Journal of Machine Learning Research 14 (2013), pp. 1303–1347

work page 2013

[23] [24]

Bayesian Models for Sparse Regression Analysis of High Dimensional Data

Sylvia Richardson, Leonardo Bottolo, and Jeffrey S. Rosenthal. “Bayesian Models for Sparse Regression Analysis of High Dimensional Data”. In: Bayesian Statistics 9. Ed. by José M. Bernardo, M. J. Bayarri, James O. Berger, et al. Oxford University Press, Oct. 2011, p. 0. DOI: 10.1093/acprof:oso/9780199694587.003.0018

work page arXiv 2011

[24] [25]

Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems

Yu. Nesterov. “Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems”. In: SIAM Journal on Optimization 22.2 (Jan. 2012), pp. 341–362. DOI: 10.1137/100802001

work page 2012

[25] [26]

A global-local approach for detecting hotspots in multiple-response regression

Hélène Ruffieux, Anthony C. Davison, Jörg Hager, et al. “A global-local approach for detecting hotspots in multiple-response regression”. In: The Annals of Applied Statistics 14.2 (June 2020). DOI: 10.1214/20-AOAS1332

work page doi:10.1214/20-aoas1332 2020

[26] [27]

Hélène Ruffieux. ECHOSEQ R-package (https://github.com/hruffieux/echoseq)

work page

[27] [28]

A fully joint Bayesian quantitative trait locus mapping of human protein abundance in plasma

Hélène Ruffieux, Jérôme Carayol, Radu Popescu, et al. “A fully joint Bayesian quantitative trait locus mapping of human protein abundance in plasma”. In: PLOS Computational Biology 16.6 (June 2020), e1007882. DOI: 10.1371/journal.pcbi.1007882

work page doi:10.1371/journal.pcbi.1007882 2020

[28] [29]

Robust relationship inference in genome-wide association studies

Ani Manichaikul, Josyf C. Mychaleckyj, Stephen S. Rich, et al. “Robust relationship inference in genome-wide association studies”. In: Bioinformatics 26.22 (Nov. 2010), pp. 2867–2873. DOI: 10.1093/bioinformatics/btq559.

Appendices

A Details of the atlasQTL model

Here we provide details of the atlasQTL model by Ruffieux, Davison, Hager, et al. [26]. Given p...

work page doi:10.1093/bioinformatics/btq559 2010

[29] [30]

Define the dependence structure γ_st = 1{X_s is associated with y_t} for each pair of X_s, y_t

work page

[30] [31]

Simulate the error terms ε_t with a specified correlation structure in the responses

work page

[31] [32]

active”, i.e., associated with at least one response, while the other SNPs are set as “inactive

Simulate the effect sizes β_st. The rest of this section explains the details of each step, using functions from the echoseq package [27] and parameters selected according to Ruffieux, Carayol, Popescu, et al. [28]. For simplicity, no missing values are inserted in the simulated responses. Step 1: Defining the dependence structure. Randomly select ap percentag...

work page 2006
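The three simulation steps listed above (dependence structure, correlated errors, effect sizes) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the echoseq implementation: the function name `simulate_pqtl`, every default value (sample size, proportions, minor allele frequency, correlation `rho`, effect-size standard deviation), the independent SNPs, the equicorrelated error structure, and the Gaussian effect sizes are hypothetical choices made here for illustration only.

```python
import random


def simulate_pqtl(n=200, p=30, q=5, prop_active=0.1, prop_assoc=0.4,
                  maf=0.3, rho=0.5, effect_sd=0.5, seed=1):
    """Toy simulation of q responses from p SNPs (all settings are assumptions)."""
    rng = random.Random(seed)

    # SNP matrix X (n x p): independent minor-allele counts in {0, 1, 2}.
    X = [[(rng.random() < maf) + (rng.random() < maf) for _ in range(p)]
         for _ in range(n)]

    # Step 1: dependence structure, gamma[s][t] = 1 if SNP s is associated
    # with response t. "Active" SNPs have at least one association.
    n_active = max(1, round(prop_active * p))
    active = sorted(rng.sample(range(p), n_active))
    gamma = [[0] * q for _ in range(p)]
    for s in active:
        forced = rng.randrange(q)  # guarantees one association per active SNP
        for t in range(q):
            if t == forced or rng.random() < prop_assoc:
                gamma[s][t] = 1

    # Step 2: errors with equicorrelation rho across responses, built from
    # one shared standard-normal factor per sample (assumed structure).
    eps = []
    for _ in range(n):
        shared = rng.gauss(0, 1)
        eps.append([rho ** 0.5 * shared + (1 - rho) ** 0.5 * rng.gauss(0, 1)
                    for _ in range(q)])

    # Step 3: effect sizes beta[s][t], drawn only where gamma[s][t] = 1.
    beta = [[rng.gauss(0, effect_sd) if gamma[s][t] else 0.0
             for t in range(q)] for s in range(p)]

    # Responses: y_t = sum_s X_s * beta_st + eps_t (inactive SNPs contribute 0).
    Y = [[sum(X[i][s] * beta[s][t] for s in active) + eps[i][t]
          for t in range(q)] for i in range(n)]
    return X, gamma, beta, Y, active
```

Calling `simulate_pqtl()` returns the genotype matrix, the binary association indicators, the effect sizes, the simulated responses, and the indices of the active SNPs. The shared-factor construction is used only because it yields a known correlation without a matrix library; any other specified correlation structure could be substituted.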