
arxiv: 2405.09216 · v2 · pith:7RFJZJVPnew · submitted 2024-05-15 · 🧬 q-bio.PE

The Human Genomic Landscape of Oceania

Pith reviewed 2026-05-24 01:22 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords Oceania · human genomics · population structure · admixture · Lapita expansion · Polynesian Outliers · archaic introgression · genetic variation

The pith

Genome-wide data from 92 populations across Oceania resolves island connections and traces layered migration histories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper generates and analyzes genome-wide data from 92 populations spanning 58 islands and 30 countries in Oceania and its perimeter. This dataset is used to map genetic relationships among islands and identify which island groups contributed to the settlement of several Polynesian Outliers. Ancestry-specific analyses separate historical layers, tracing Austronesian ancestry to the Lapita expansion and measuring variable archaic introgression within the basal Papuan component of Oceanians and Southeast Asians. The work also catalogs pronounced differences in allele frequencies for biomedically relevant variants across population groups.

Core claim

Leveraging this diverse dataset, we resolve genetic connections among islands, providing a detailed view of regional population structure and identifying the island groups involved in the settlement of several Polynesian Outliers. Ancestry-specific analyses allow us to deconvolve different layers of history, from tracing groups deriving their Austronesian ancestry via the Lapita expansion to quantifying variable archaic introgression across the basal Papuan component of Oceanians and Southeast Asians. We map biomedically relevant variants across Oceania and Southeast Asia, observing pronounced allele-frequency differences between population groups.

What carries the argument

Ancestry-specific analyses on genome-wide data from 92 populations that deconvolve Austronesian, Papuan, and archaic components while resolving island-level structure.

Load-bearing premise

The 92 sampled populations and 58 islands are representative enough of Oceania's full genetic diversity that ancestry deconvolution and structure analyses recover historical migration events without major sampling bias.

What would settle it

New genome-wide data from additional unsampled islands that shows ancestry proportions or island connections inconsistent with the identified settlement sources for Polynesian Outliers or the quantified archaic introgression levels would undermine the historical reconstructions.

Figures

Figures reproduced from arXiv: 2405.09216 by Abdul Salam M. Sofro, Adrian V. S. Hill, Alexander G. Ioannidis, Alexander J. Mentzer, Alissa L. Severson, Andrés Moreno-Estrada, Angela Allen, Carlos D. Bustamante, Carmina Barberena Jonas, Consuelo D. Quinto-Cortés, Francoise Friedlaender, Frederick Delfin, Genevieve L. Wojcik, George Aho, George Koki, Ishwar Verma, Javier Blanco-Portillo, Jonathan S. Friedlaender, Julian R. Homburger, J. Víctor Moreno-Mayar, Karla Sandoval, Kathryn Auckland, Kathryn Robson, Maria Corazon A. De Ungria, Mark Stoneking, Matthew Spriggs, Maude Phipps, Peter A. Gerlach, Petria R. Russell, Phillip Endicott, Ram González-Buenfil, Sofía Vieyra-Sánchez, Stephen Allen, Stephen Oppenheimer, Tom Parks, Toomas Kivisild, William Pomat.

Figure 1: A. Geographic distribution of all the samples used in this study. Each circle represents a population, colored by its geographic location, and its size is proportional to the number of samples. B. Pie charts show the average ancestry proportions for each location at K=9. The Neighbor-Joining tree shows the relationship between the genetic clusters inferred by ADMIXTURE. C. Bar charts show the frequency of 3…

Figure 2: Populations are colored by their geographic group. A. Principal component analysis of the OGVP dataset and ancient samples from the Pacific. Ancient Lapita samples are represented as black symbols. B. Neighbor-joining tree of the Lapita sequences from Tonga and Vanuatu together with the Austronesian ancestry-specific sequences from the modern samples.

Figure 3: A. Network based on the amount of IBD sharing between populations. The amount of shared IBD is shown as an edge between locations and the color of each edge is relative to the average cM of shared IBD. Only edges with values > 10 cM are shown. The nodes of the network are colored according to the value of the f3 statistic when Mamanwa and Manobo were used as test populations. B. Heatmap representing the am…

Figure 4: A. IBD segment length distributions for all pairs of individuals in the populations that share ￾ 20 cM with Mamanwa and Manobo, used to fit the exponential decay constants (l). B. Heatmap of Psi statistic values, depicting an increase in retained rare variant frequencies along paths of settlements in the Pacific. C. Map with the l results for each of the populations in their geographical location…

Figure 5: Proportion of Denisovan ancestry estimated by f4 ratio. A. In global ancestry. B. In Papuan ancestry. C. In East Asian ancestry. Symbol shape distinguishes significant from non-significant Denisovan signal. Symbol color in populations with significant signals gives the proportion of Denisovan signals in the ancestry component. Symbol size in panels B and C represents the average proportion of the human ance…
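
The Figure 5 caption refers to an f4-ratio estimate of Denisovan ancestry. As a rough illustration of how an f4 ratio recovers an admixture proportion, here is a minimal sketch on toy allele frequencies. The population roles (`a` sister to the source `b`, outgroup `o`, unadmixed reference `c`, target `x`), the function names, and the numbers are all assumptions made for this example; the paper's actual population configuration and software are not stated in the material reproduced here.

```python
from typing import Sequence


def f4(a: Sequence[float], b: Sequence[float],
       c: Sequence[float], d: Sequence[float]) -> float:
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d) on allele frequencies."""
    n = len(a)
    if n == 0 or not (len(b) == len(c) == len(d) == n):
        raise ValueError("all frequency vectors must be non-empty and equal length")
    return sum((ai - bi) * (ci - di)
               for ai, bi, ci, di in zip(a, b, c, d)) / n


def f4_ratio(a: Sequence[float], o: Sequence[float], x: Sequence[float],
             b: Sequence[float], c: Sequence[float]) -> float:
    """Estimate the share of B-related ancestry in X as
    f4(A, O; X, C) / f4(A, O; B, C)."""
    denom = f4(a, o, b, c)
    if denom == 0:
        raise ZeroDivisionError("f4(A, O; B, C) is zero; the ratio is undefined")
    return f4(a, o, x, c) / denom


# Toy (hypothetical) allele frequencies at four SNPs.
a = [0.10, 0.80, 0.30, 0.60]   # population related to the source B
o = [0.00, 1.00, 0.50, 0.20]   # outgroup
b = [0.20, 0.70, 0.10, 0.90]   # source population (e.g. an archaic genome)
c = [0.60, 0.20, 0.40, 0.30]   # unadmixed reference population

# Build a target X as an exact 25% / 75% mixture of B and C.
true_alpha = 0.25
x = [true_alpha * bi + (1 - true_alpha) * ci for bi, ci in zip(b, c)]

alpha_hat = f4_ratio(a, o, x, b, c)
print(round(alpha_hat, 6))
```

Because `x` is constructed as an exact linear mixture, the ratio returns the mixing proportion up to floating-point error. With real data the estimate is noisy and is usually reported with a block-jackknife standard error, which this sketch omits.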
Original abstract

Oceania and Island Southeast Asia have a rich, yet understudied, human genomic landscape. This region encompasses some of the first areas inhabited by humans following the out-of-Africa expansion, includes populations with the highest levels of archaic hominin introgression, and contains Pacific islands that are among the most remote continuously inhabited locations in the world. Here, we describe the first region-wide analysis of individuals from population groups spanning Oceania and its broad perimeter. In total we generate and analyze genome-wide data from 92 different populations, 58 separate islands, and 30 countries, covering one third of the planet. Leveraging this diverse dataset, we resolve genetic connections among islands, providing a detailed view of regional population structure and identifying the island groups involved in the settlement of several Polynesian Outliers. Ancestry-specific analyses allow us to deconvolve different layers of history, from tracing groups deriving their Austronesian ancestry via the Lapita expansion to quantifying variable archaic introgression across the basal Papuan component of Oceanians and Southeast Asians. Finally, we map biomedically relevant variants across Oceania and Southeast Asia, observing pronounced allele-frequency differences between population groups. Together, these findings refine models of oceanic settlement and admixture and establish a comprehensive reference that will advance global efforts to ensure broad and equitable representation in human genomics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript presents genome-wide data from 92 populations across 58 islands and 30 countries in Oceania and Island Southeast Asia. It resolves genetic connections among islands and identifies source groups for several Polynesian Outliers, deconvolves ancestry layers (including Lapita-derived Austronesian ancestry and the basal Papuan component), quantifies variable archaic introgression, and maps allele-frequency differences in biomedically relevant variants.

Significance. If the central claims hold, the scale of the dataset offers a valuable reference for an understudied region central to human migration history, with potential to refine settlement and admixture models while advancing equitable genomic representation.

major comments (2)
  1. [Sampling description (early Results/Methods)] The claim that 92 populations and 58 islands provide a representative view sufficient for ancestry deconvolution and structure resolution lacks quantitative support, such as genetic coverage metrics, completeness of reference panels for the Papuan/Austronesian split, or sensitivity tests to unsampled groups. This is load-bearing for the deconvolution claims.
  2. [Ancestry-specific analyses section] Without demonstrated representativeness, the tracing of Lapita-derived groups and the quantification of archaic introgression across the basal Papuan component risk misattribution if key source populations or remote islands are absent from the sample.
minor comments (1)
  1. [Abstract] The abstract provides no sample sizes per group, statistical thresholds, or validation details, which complicates initial assessment of the reported findings.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for recognizing the dataset's scale and its potential value as a reference for Oceanian genomics. We address the two major comments below. We agree that additional quantitative support for sampling representativeness is warranted and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: Sampling description (early Results/Methods): the claim that 92 populations and 58 islands provide a representative view sufficient for ancestry deconvolution and structure resolution lacks quantitative support such as genetic coverage metrics, completeness of reference panels for the Papuan/Austronesian split, or sensitivity tests to unsampled groups; this is load-bearing for the deconvolution claims.

    Authors: We agree that the original sampling description would benefit from explicit quantitative metrics. In the revised manuscript we have expanded the early Results and Methods sections to report per-population SNP coverage, call rates, and heterozygosity; we describe the published reference panels used for the Papuan/Austronesian split and their coverage of major source groups; and we include sensitivity tests that repeat the structure and deconvolution analyses on random subsamples and after removal of individual island groups. These additions confirm that the reported patterns are robust. revision: yes

  2. Referee: Ancestry-specific analyses section: without demonstrated representativeness, the tracing of Lapita-derived groups and quantification of archaic introgression across the basal Papuan component risks misattribution if key source populations or remote islands are absent from the sample.

    Authors: We have added a dedicated sensitivity subsection to the Ancestry-specific analyses section. This subsection reports re-estimation of Lapita ancestry proportions and archaic introgression levels after iterative removal of remote-island and potential source populations; the primary signals (source groups for Polynesian Outliers and variable archaic introgression) remain stable. We have also inserted a limitations paragraph that explicitly discusses the implications of any unsampled groups. revision: yes
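
The sensitivity tests described in the responses above (repeating an estimate after removing one island group at a time) can be sketched as follows. This is only an illustration of the general leave-one-group-out idea: the group names, the per-individual ancestry proportions, and the use of a plain mean as the estimator are hypothetical placeholders, not values or methods taken from the manuscript.

```python
from statistics import mean
from typing import Callable, Dict, List


def leave_one_group_out(
    values_by_group: Dict[str, List[float]],
    estimator: Callable[[List[float]], float] = mean,
) -> Dict[str, float]:
    """Re-run the estimator after dropping each group in turn.

    Returns a mapping from the dropped group's name to the estimate
    computed on all remaining groups.
    """
    results = {}
    for dropped in values_by_group:
        kept = [v for group, vals in values_by_group.items()
                if group != dropped for v in vals]
        if not kept:
            raise ValueError("need at least two non-empty groups")
        results[dropped] = estimator(kept)
    return results


def max_shift(
    values_by_group: Dict[str, List[float]],
    estimator: Callable[[List[float]], float] = mean,
) -> float:
    """Largest absolute change in the estimate caused by removing one group."""
    full = estimator([v for vals in values_by_group.values() for v in vals])
    loo = leave_one_group_out(values_by_group, estimator)
    return max(abs(est - full) for est in loo.values())


# Hypothetical per-individual ancestry proportions for three island groups.
toy = {
    "group_a": [0.2, 0.4],
    "group_b": [0.3],
    "group_c": [0.5, 0.6],
}

for dropped, est in leave_one_group_out(toy).items():
    print(f"without {dropped}: {est:.3f}")
print(f"max shift: {max_shift(toy):.3f}")
```

A small maximum shift relative to the full-sample estimate suggests that no single group drives the result. It does not address groups that were never sampled, which is the referee's underlying concern.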

Circularity Check

0 steps flagged

No circularity: empirical genomic analysis with no derived predictions or self-referential chains

Full rationale

The paper reports an empirical analysis of newly generated genome-wide data from 92 populations across 58 islands, with central claims (island connections, Polynesian Outlier settlement sources, Lapita-derived ancestry tracing, and archaic introgression quantification) presented as direct outputs of standard population-genetic methods applied to the dataset. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations appear in the abstract or described claims. The derivation chain consists of data collection followed by ancestry deconvolution and structure inference; these steps do not reduce to the inputs by construction. The analysis is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical population-genetics survey paper; central claims rest on standard ancestry deconvolution methods and sampling assumptions rather than new theoretical constructs.


