
arxiv: 2605.22665 · v1 · pith:FLT2OUJTnew · submitted 2026-05-21 · 🧬 q-bio.PE

Fitness Inference in Presence of Migrations between Coupled Evolving Populations

Pith reviewed 2026-05-22 03:49 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords quasi-linkage equilibrium · migration · fitness inference · epistasis · population genetics · multi-locus selection · time-series data

The pith

Low migration rates between two populations preserve the quasi-linkage equilibrium phase and enable accurate inference of additive fitness and epistatic interactions from genomic time-series data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper extends the quasi-linkage equilibrium concept from isolated populations to two populations connected by migration. It shows that when migration is sufficiently slow the equilibrium state is still maintained by the combined effects of selection, mutation, recombination, drift and migration. In that regime the authors derive analytical relations that recover both direct fitness effects of alleles and their pairwise interactions from whole-genome time-series data produced in simulations. A sympathetic reader would care because real populations are rarely isolated and low-level gene flow is common, so being able to infer fitness parameters without bias from migration matters for understanding evolution in structured settings.

Core claim

We extend QLE theory to two populations interacting via symmetric or asymmetric migration while evolving under multi-locus selection. Using whole-genome time-series data generated through simulations we demonstrate that the QLE phase persists under conditions of sufficiently low migration rates. In this regime we derive analytical inference relations that allow for the accurate and quantitative estimation of both additive fitness and epistatic interactions.
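
To make the coupling between the two populations concrete, the sketch below applies one deterministic migration step to per-locus allele frequencies. It assumes the textbook two-deme update, in which a fraction of each population is replaced by migrants from the other; this is an illustration only and not necessarily the scheme used in the paper or in FFPopSim. The function name `migrate` and the parameters `m_ab` (A to B) and `m_ba` (B to A) are hypothetical.

```python
import numpy as np

def migrate(x_a, x_b, m_ab, m_ba):
    """One deterministic migration step on per-locus allele frequencies.

    Assumption (textbook two-deme model, not the paper's exact scheme):
    after migration a fraction m_ba of population A consists of migrants
    from B, and a fraction m_ab of population B consists of migrants from A.
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    new_a = (1.0 - m_ba) * x_a + m_ba * x_b
    new_b = (1.0 - m_ab) * x_b + m_ab * x_a
    return new_a, new_b

# Illustrative frequencies at three loci (made-up numbers).
x_a = np.array([0.9, 0.2, 0.5])
x_b = np.array([0.1, 0.8, 0.5])

# Symmetric migration: both populations move toward each other.
sym_a, sym_b = migrate(x_a, x_b, m_ab=0.05, m_ba=0.05)

# Fully asymmetric migration: only B receives migrants, A is unchanged.
asym_a, asym_b = migrate(x_a, x_b, m_ab=0.05, m_ba=0.0)
```

With symmetric rates the sum of the two frequency vectors is conserved, while in the one-way case the source population is unaffected, which is the qualitative difference between the symmetric and asymmetric settings.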

What carries the argument

The quasi-linkage equilibrium phase extended to migrating populations, which maintains stationary allele-frequency statistics through the interplay of selection, mutation, recombination, genetic drift and low-rate migration.

If this is right

  • Accurate estimation of both additive fitness and epistatic interactions remains possible from time-series data even when low-rate migration couples the populations.
  • The inference relations apply to both symmetric and asymmetric migration between the two populations.
  • Whole-genome time-series data suffice to recover the fitness parameters quantitatively under the low-migration regime.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Migration can be treated as a small perturbation that does not destroy the equilibrium approximation used for inference.
  • The same relations may apply to larger numbers of subpopulations connected by limited gene flow.
  • The method could be applied to real genomic datasets from species known to have low but nonzero migration.

Load-bearing premise

The quasi-linkage equilibrium phase continues to hold and the derived analytical relations remain accurate when two populations are coupled by symmetric or asymmetric migration at low rates.

What would settle it

Simulations at migration rates above the low threshold in which the inferred additive fitness and epistasis values deviate substantially from the known ground-truth parameters used to generate the data.
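
One way to quantify "deviate substantially" is a relative root-mean-square error between inferred and ground-truth parameters. The sketch below is an assumption about how such a comparison could be scored; the paper's own error measure may differ. The function name `relative_rms_error` and all arrays are synthetic placeholders, not data from the paper.

```python
import numpy as np

def relative_rms_error(inferred, true):
    """sqrt( sum (inferred - true)^2 / sum true^2 ), an assumed error metric."""
    inferred = np.asarray(inferred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.sum((inferred - true) ** 2) / np.sum(true ** 2)))

# Synthetic placeholders standing in for ground-truth and inferred fitness.
rng = np.random.default_rng(0)
f_true = rng.normal(0.0, 0.05, size=20)
f_good = f_true + rng.normal(0.0, 0.001, size=20)  # small inference noise
f_bad = f_true + rng.normal(0.0, 0.05, size=20)    # noise as large as the signal

err_good = relative_rms_error(f_good, f_true)
err_bad = relative_rms_error(f_bad, f_true)
```

Applied separately to the additive terms and to the epistatic terms, a metric of this kind would show whether inference degrades as the migration rate is raised.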

Figures

Figures reproduced from arXiv: 2605.22665 by Bastien Dumont, Erik Aurell, Hong-Li Zeng, John Barton, Yu-Han Huang.

Figure 1. Schematic illustration of subpopulations A and B before and after migrations.
Figure 2. Effects of migration on the inferences of fitness parameters in two subpopulations with different selection strengths.
Figure 3. Effects of selection strength on the inference behavior of fitness parameters under asymmetric migration
Figure 4. Representative scatter plots comparing the migration inference theory (solids) and the no-migration approximation
Figure 5. Inference accuracy for the inter-population differences in fitness parameters under symmetric and asymmetric migration.
Figure 6. Supplementary analysis of the effects of migration on fitness parameter inference when the additive fitness terms
Figure 7. Supplementary analysis of the effect of selection strength on fitness parameter inference under symmetric migration

Original abstract

The phase of Quasi-Linkage Equilibrium (QLE) in evolutionary populations is analogous to the thermal equilibrium state in statistical mechanics, a concept pioneered by Kimura in 1965 for two-locus two-allele models. QLE describes a stationary state maintained by the interplay of selection, mutation, recombination and genetic drift. Here we extend QLE theory to populations connected by migration, a fundamental evolutionary force that couples the evolutionary dynamics of interacting subpopulations. Specifically, we examine two populations interacting via symmetric or asymmetric migration while evolving under multi-locus selection. Using whole-genome time-series data generated through FFPopSim, we demonstrate that the QLE phase persists under conditions of sufficiently low migration rates. In this regime, we derive analytical inference relations that allow for the accurate and quantitative estimation of both additive fitness and epistatic interactions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript extends quasi-linkage equilibrium (QLE) theory to two populations coupled by symmetric or asymmetric migration. It uses FFPopSim simulations of whole-genome time-series data to show that the QLE phase persists when migration rates are sufficiently low relative to recombination and selection, and derives analytical inference relations for estimating additive fitness effects and epistatic interactions.

Significance. If the central claims hold, the work is significant for fitness inference in spatially structured populations where migration is present. It provides a perturbative extension of single-population QLE results and supplies practical analytical formulas validated by simulations with multiple replicates showing low inference error. The explicit checks that linkage disequilibrium remains O(m/r) or smaller constitute a strength.
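
A check of this kind needs the linkage disequilibrium estimated from sampled genomes. The sketch below assumes the ±1 (Ising) coding of bi-allelic loci described in the paper's model section and takes the connected correlation χ_ij − χ_i χ_j as the measure of linkage disequilibrium; the function name `first_moments_and_ld` and the toy sample are hypothetical.

```python
import numpy as np

def first_moments_and_ld(genomes):
    """Return per-locus means chi_i and connected correlations chi_ij - chi_i*chi_j.

    genomes: array of shape (N, L) with entries +1 or -1 (assumed Ising coding).
    """
    g = np.asarray(genomes, dtype=float)
    chi = g.mean(axis=0)                 # first moments chi_i
    second = g.T @ g / g.shape[0]        # second moments chi_ij
    ld = second - np.outer(chi, chi)     # connected part
    return chi, ld

# Toy sample: loci 0 and 1 are perfectly linked, locus 2 is independent of locus 0.
genomes = np.array([
    [ 1,  1, -1],
    [ 1,  1,  1],
    [-1, -1,  1],
    [-1, -1, -1],
])
chi, ld = first_moments_and_ld(genomes)
```

In a QLE regime the off-diagonal entries of `ld` are expected to be small; tracking them as the migration rate grows is one way to see where the approximation stops being reliable.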

Major comments (1)
  1. §4.1, Eq. (8): the leading-order inference formula for additive fitness is derived under the assumption m ≪ r, s; the manuscript does not report the scaling of the residual error with m/r in the simulations, which is load-bearing for the claim of 'accurate and quantitative estimation' across the stated regime.
Minor comments (3)
  1. Figure 3 caption: the color coding for symmetric versus asymmetric migration is not explained in the legend; add explicit labels.
  2. §2.3: the definition of the epistatic interaction term appears after its first use in the inference relation; move the definition forward for clarity.
  3. References: the citation to Kimura (1965) is incomplete; supply the full journal details.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript, positive assessment of its significance, and recommendation for minor revision. We address the major comment below.

Point-by-point responses
  1. Referee: §4.1, Eq. (8): the leading-order inference formula for additive fitness is derived under the assumption m ≪ r, s; the manuscript does not report the scaling of the residual error with m/r in the simulations, which is load-bearing for the claim of 'accurate and quantitative estimation' across the stated regime.

    Authors: We agree with the referee that reporting the scaling of the residual inference error with m/r would strengthen the validation of the perturbative formulas. In the revised manuscript we will add a supplementary analysis (new figure or table) that explicitly plots the additive-fitness inference error versus m/r for the FFPopSim ensembles, confirming the expected O((m/r)^2) scaling in the regime m ≪ r, s.

    Revision: yes
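
The proposed scaling analysis amounts to fitting a power law to the error as a function of m/r. A minimal sketch of that fit follows; it uses a least-squares line in log-log space via `numpy.polyfit`. The function name `fit_power_law_exponent` is hypothetical and the data are synthetic placeholders constructed to follow a quadratic law, not results from the paper.

```python
import numpy as np

def fit_power_law_exponent(ratio, error):
    """Fit error ~ prefactor * ratio**exponent by linear regression in log-log space."""
    slope, intercept = np.polyfit(np.log(ratio), np.log(error), deg=1)
    return float(slope), float(np.exp(intercept))

# Synthetic placeholder data (NOT from the paper): an exact quadratic law.
m_over_r = np.array([0.01, 0.02, 0.05, 0.1, 0.2])
error = 3.0 * m_over_r ** 2

exponent, prefactor = fit_power_law_exponent(m_over_r, error)
```

On real simulation output the fitted exponent would be compared with the value 2 anticipated in the rebuttal, ideally with replicate-based uncertainty on each point.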

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper derives inference relations for additive fitness and epistasis by perturbative expansion of the single-population QLE solution under the explicit low-migration regime m ≪ r and m ≪ s. This expansion is independent of the target quantities being inferred; the resulting formulas are then validated on external FFPopSim simulations with multiple replicates. No self-definitional steps, fitted parameters renamed as predictions, or load-bearing self-citations that collapse the central claim to unverified inputs appear in the provided derivation outline. The analytic content remains self-contained against the simulation benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the persistence of the QLE regime when migration is added and on the validity of the low-migration approximation for the inference formulas.

axioms (1)
  • Domain assumption: quasi-linkage equilibrium continues to describe the stationary state when symmetric or asymmetric migration couples two multi-locus populations.
    This is the key extension stated in the abstract and is required for the analytical relations to hold.




Reference graph

Works this paper leans on

56 extracted references · 56 canonical work pages · 1 internal anchor

  1. [1]

    We find that migration introduces coupled correction terms in the cumulant dynamical equations, thereby altering the inference method and accuracy of fitness parameters

    and recently carried forward in [31], as well as other foundational work [32, 33], the present study incorporates migration into the QLE framework, constructing a model with migration between two subpopulations. We find that migration introduces coupled correction terms in the cumulant dynamical equations, thereby altering the inference method and accur...

  2. [2]

    Definition of QLE and Physical Analogy

    The concept of the Quasi-Linkage Equilibrium state was originally introduced by Kimura in 1965 for two-locus two-allele models with selection, mutation and recombination in evolving populations of finite size [1]. It was developed further by Neher and Shraiman for multi-locus two-allele models [2] and extended to multi-...

  3. [3]

    Genomic structure

    A haploid genome is a vector $g = (s_1, \dots, s_L)$ of $L$ loci $s_i$, where $i = 1, \dots, L$. The number $L$ of loci is fixed and equal for all individual genomes. A population is a collection $\{g_A\}_{A \in \mathcal{A}}$, where $\mathcal{A}$ is a set of indices. Each genome $g$ appears in the population with probability $P(g)$

  4. [4]

    Ising loci

    Loci are bi-allelic, i.e., there are two alleles at each locus. They can then be coded by spin-like variables $s_i = \pm 1 \ \forall i$

  5. [5]

    Constant population

    When considering an isolated population, the average number of individuals in this population is fixed. This hypothesis can model, e.g., the struggle for survival in an environment with limited resources

  6. [6]

    One-genome evolution

    The distribution of one genome in a population is given by a genome distribution $P(g, t)$. This distribution evolves in time driven by three operators representing natural selection, mutations and recombination. That recombination can be understood in terms of only one-genome distributions is a simplifying assumption analogous to Bolt...

  7. [7]

    Coupled one-genome evolutions

    To accommodate migration between spatially separated populations, we introduce in the same paradigm one distribution over genomes for each location. Each of these distributions will fulfill the conditions above, and will then also evolve by exchanging individuals,

  8. [8]

    Selection. The model for natural selection is based on a fitness function $F$ that can be understood as proportional to the expected number of offspring of an individual of genotype $g$

    Natural forces of population evolution. The action of the three evolutionary forces is then encoded in a master equation, i.e., a phenomenological first-order differential equation:
    $$\frac{d}{dt} P(g, t) = \left.\frac{d}{dt} P(g, t)\right|_{\mathrm{fit}} + \left.\frac{d}{dt} P(g, t)\right|_{\mathrm{mut}} + \left.\frac{d}{dt} P(g, t)\right|_{\mathrm{rec}} + \left.\frac{d}{dt} P(g, t)\right|_{\mathrm{mig}} \tag{23}$$
    In this section we discuss the first of these three terms of this master equation i...

  9. [9]

    the pairwise coupling appears as the weighted average of the two epistatic fitnesses, with weights determined by the migration rates relative to the other evolutionary forces

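    Note on the expression of the pairwise coupling in the case of symmetric migration. Going back to the expression of the pairwise coupling:
    $$J^A_{ij} = \frac{1}{2(4\mu + r c_{ij})}\left[ f^A_{ij} + f^B_{ij} + \left(f^A_{ij} - f^B_{ij}\right) \frac{4\mu + r c_{ij} + \tilde m_{B\to A} - \tilde m_{A\to B}}{4\mu + r c_{ij} + \tilde m_{B\to A} + \tilde m_{A\to B}} \right]. \tag{53}$$
    In the case of symmetric migration $m_{B\to A} = m_{A\to B} = m$, it becomes:
    $$J^A_{ij} = \frac{1}{(4\mu + r c_{ij})(4\mu + r c_{ij} + 2m)} \bigl(4\mu + r c_{ij} \dots$$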

  10. [10]

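    We note that if $\tilde f^A_i = \tilde f^B_i = \tilde f_i$, then $h^A_i = \tilde f_i / (2\mu) = h_i = h^B_i$

    Note on the expression of the local field when its derivative is zero. Combining (52) and its equivalent for population B and assuming $\dot h = 0$, we have
    $$h^A_i = \frac{1}{4\mu}\left[ \tilde f^A_i + \tilde f^B_i + \left(\tilde f^A_i - \tilde f^B_i\right) \frac{2\mu + \tilde m_{B\to A} - \tilde m_{A\to B}}{2\mu + \tilde m_{B\to A} + \tilde m_{A\to B}} \right] \tag{54}$$
    with $\tilde m_{A\to B} = m_{A\to B} \frac{Z_A}{Z_B}$ and $\tilde f^A_i = f^A_i + r \sum_{j \neq i} c_{ij} J^A_{ij} \chi^A_j$ the effective additive fitness due to the effects of rec...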

  11. [11]

    (61) which is (57) above

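    Entropy terms difference in the ratio of partition functions. Again, we write $\chi^B_i = \chi^A_i + \delta\chi_i$, then
    $$\begin{aligned} H\!\left(\frac{1+\chi^B_i}{2}\right) &= -\frac{1+\chi^A_i+\delta\chi_i}{2}\left[\log\frac{1+\chi^A_i}{2} + \log\left(1+\frac{\delta\chi_i}{1+\chi^A_i}\right)\right] \\ &= -\frac{1+\chi^A_i+\delta\chi_i}{2}\left[\log\frac{1+\chi^A_i}{2} + \log\frac{1+\chi^B_i}{1+\chi^A_i}\right] \\ &= H\!\left(\frac{1+\chi^A_i}{2}\right) - \frac{\delta\chi_i}{2}\log\frac{1+\chi^A_i}{2} - \frac{1}{2}\log\frac{1+\chi^B_i}{1+\chi^A_i} - \frac{\chi^B_i}{2}\log\frac{1+\chi^B_i}{1+\chi^A_i} \end{aligned} \tag{59}$$
    The same goes for the (1−...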

  12. [12]

    This differs from the main-text setting, where the two subpopulations have different mean additive fitness coefficients

    Effects of migration on the inference on fitness parameters. As a supplementary test, we examined the case in which populations A and B are sampled from the same underlying Gaussian fitness distribution, with identical mean and variance. This differs from the main-text setting, where the two subpopulations have different mean additive fitness coefficients....

  13. [13]

    Effect of standard deviation of additive fitness

    We further examined the effect of selection strength on fitness parameter inference under symmetric migration with $m_{A\to B} = m_{B\to A} = 0.05$. As shown in Fig. 7, increasing the mean additive fitness of two subpopulations mainly affects the inference of additive fitness terms $f_i$. When $\langle f_i \rangle_{A,B}$ is varied simultane...

  14. [14]

    M. Kimura, Genetics 52, 875 (1965)

  15. [15]

    R. A. Neher and B. I. Shraiman, Proceedings of the National Academy of Sciences 106, 6866 (2009)

  16. [16]

    R. A. Neher and B. I. Shraiman, Rev. Mod. Phys. 83, 1283 (2011)

  17. [17]

    H.-L. Zeng and E. Aurell, Physical Review E 101, 052409 (2020)

  18. [18]

    D. M. Weinreich, R. A. Watson, and L. Chao, Evolution 59, 1165 (2005)

  19. [19]

    J. A. G. De Visser and J. Krug, Nature Reviews Genetics 15, 480 (2014)

  20. [20]

    S. Cocco, C. Feinauer, M. Figliuzzi, R. Monasson, and M. Weigt, Reports on Progress in Physics 81, 032601 (2018)

  21. [21]

    S. Wright, Genetics 28, 114 (1943)

  22. [22]

    M. Slatkin, Annual review of ecology and systematics, 393 (1985)

  23. [23]

    H. Li, T. Kamath, R. Mazumder, X. Lin, and L. O’Connor, medRxiv, 2024 (2024)

  24. [24]

    W. Mc, Heredity 82, 117 (1999)

  25. [25]

    F. Rousset, Genetic structure and selection in subdivided populations, Vol. 40 (Princeton University Press, 2004)

  26. [26]

    T. Lenormand, Trends in ecology & evolution 17, 183 (2002)

  27. [27]

    S. Wright, Genetics 16, 97 (1931)

  28. [28]

    M. Kimura and G. H. Weiss, Genetics 49, 561 (1964)

  29. [29]

    J. Haldane, Journal of genetics 48, 277 (1948)

  30. [30]

    M. Slatkin, Genetics 75, 733 (1973)

  31. [31]

    N. H. Barton, Heredity 43, 333 (1979)

  32. [32]

    M. Notohara, Journal of mathematical biology 29, 59 (1990)

  33. [33]

    P. Beerli and J. Felsenstein, Proceedings of the national academy of sciences 98, 4563 (2001)

  34. [34]

    R. Nielsen and J. Wakeley, Genetics 158, 885 (2001)

  35. [35]

    G. Coop, D. Witonsky, A. Di Rienzo, and J. K. Pritchard, Genetics 185, 1411 (2010)

  36. [36]

    T. Günther and G. Coop, Genetics 195, 205 (2013)

  37. [37]

    E. Frichot, S. D. Schoville, G. Bouchard, and O. François, Molecular biology and evolution 30, 1687 (2013)

  38. [38]

    I. Mathieson and G. McVean, Genetics 193, 973 (2013)

  39. [39]

    W. Lyu, X. Dai, M. Beaumont, F. Yu, and Z. He, Molecular ecology resources 22, 1362 (2022)

  40. [40]

    P. A. P. Moran, Mathematical proceedings of the cambridge philosophical society 54, 60 (1958)

  41. [41]

    C. J. Battey, P. L. Ralph, and A. D. Kern, Genetics 215, 193 (2020)

  42. [42]

    H. Ringbauer, J. Novembre, and M. Steinrücken, Nature communications 12, 5425 (2021)

  43. [43]

    V. Dichio, H.-L. Zeng, and E. Aurell, Reports on Progress in Physics 86, 052601 (2023)

  44. [44]

    H.-L. Zeng, Y.-H. Huang, E. Aurell, and J. Barton, Phys. Rev. E 113, 044415 (2026)

  45. [45]

    J. G. Schraiber, S. N. Evans, and M. Slatkin, Genetics 203, 493 (2016)

  46. [46]

    E. Mavroudi and C. Nagel, Global migration: Patterns, processes and politics (Routledge, 2023)

  47. [47]

    H. C. Nguyen, R. Zecchina, and J. Berg, Advances in Physics 66, 197 (2017), https://doi.org/10.1080/00018732.2017.1341604

  48. [48]

    H.-L. Zeng, E. Mauri, V. Dichio, S. Cocco, R. Monasson, and E. Aurell, Journal of Statistical Mechanics: Theory and Experiment 2021, 083501 (2021)

  49. [49]

    F. Zanini and R. A. Neher, Bioinformatics 28, 3332 (2012)

  50. [50]

    P. Wainschtein, Y. Zhang, J. Schwartzentruber, I. Kassam, J. Sidorenko, P. P. Fiziev, H. Wang, J. McRae, R. Border, N. Zaitlen, et al., Nature, 1 (2025)

  51. [51]

    C.-Y. Gao, F. Cecconi, A. Vulpiani, H.-J. Zhou, and E. Aurell, Physical biology 16, 026002 (2019)

  52. [52]

    H. A. Orr, Nature Reviews Genetics 10, 531 (2009)

  53. [53]

    L. Peliti, Introduction to the statistical theory of Darwinian evolution, arXiv preprint cond-mat/9712027 (1997)

  54. [54]

    T. Tanaka, Neural Computation 12, 1951 (2000), https://direct.mit.edu/neco/article-pdf/12/8/1951/814585/089976600300015213.pdf

  55. [55]

    F. Ricci-Tersenghi, Journal of Statistical Mechanics: Theory and Experiment 2012, P08015 (2012)

  56. [56]

    Shun-ichi Amari, Applied Mathematical Sciences (Springer Tokyo, 2016)
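
The excerpts in entries [9] and [10] quote the paper's Eqs. (53) and (54) for the pairwise coupling and the local field. As a sanity check on how those expressions behave, the sketch below evaluates them numerically. It rests on several assumptions: the bracket structure of both equations is read from the garbled excerpt; the rescaled rates m̃ are passed in directly as `m_ba` (B to A) and `m_ab` (A to B); and the closed form in `pairwise_coupling_symmetric` is this editor's own algebra from Eq. (53) with equal rates, since the paper's symmetric expression is truncated in the excerpt. All function names are hypothetical.

```python
def pairwise_coupling(f_a, f_b, mu, r, c, m_ba, m_ab):
    """Eq. (53) as read from the excerpt: coupling J^A_ij in population A."""
    base = 4.0 * mu + r * c
    weight = (base + m_ba - m_ab) / (base + m_ba + m_ab)
    return (f_a + f_b + (f_a - f_b) * weight) / (2.0 * base)

def pairwise_coupling_symmetric(f_a, f_b, mu, r, c, m):
    """Editor's reduction of Eq. (53) for m_ba = m_ab = m (not quoted from the paper)."""
    base = 4.0 * mu + r * c
    return ((base + m) * f_a + m * f_b) / (base * (base + 2.0 * m))

def local_field(ft_a, ft_b, mu, m_ba, m_ab):
    """Eq. (54) as read from the excerpt: stationary local field h^A_i."""
    weight = (2.0 * mu + m_ba - m_ab) / (2.0 * mu + m_ba + m_ab)
    return (ft_a + ft_b + (ft_a - ft_b) * weight) / (4.0 * mu)

# Illustrative parameter values (made up for the check).
mu, r, c = 0.01, 0.5, 0.5
j_general = pairwise_coupling(0.3, -0.1, mu, r, c, m_ba=0.02, m_ab=0.02)
j_reduced = pairwise_coupling_symmetric(0.3, -0.1, mu, r, c, m=0.02)
j_isolated = pairwise_coupling(0.3, -0.1, mu, r, c, m_ba=0.0, m_ab=0.0)
h_equal = local_field(0.2, 0.2, mu, m_ba=0.03, m_ab=0.01)
```

Under these assumptions the symmetric case is a weighted average of the two epistatic fitnesses, the zero-migration limit recovers the single-population coupling f/(4μ + r c), and equal effective additive fitness in both populations gives h = f̃/(2μ) regardless of the migration rates, matching the remark quoted in entry [10].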