
arxiv: 2511.01943 · v2 · submitted 2025-11-03 · 🧬 q-bio.PE

Multilevel genomic constraints shape nuclear tRNA gene organization in plants

Pith reviewed 2026-05-18 02:02 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords tRNA genes · nuclear organization · plant genomics · evolutionary constraints · tDNA dosage · chromosomal distribution · cis-regulatory elements · comparative genomics

The pith

The proportions of tRNA genes for different amino acids remain nearly constant across plant lineages even though total tDNA copy numbers differ more than 100-fold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines the organization of nuclear tRNA genes (tDNAs) in 53 photosynthetic eukaryotes and seven non-plant outgroups. It shows that while the overall number of these genes can vary dramatically between species, the relative numbers assigned to each amino acid are highly conserved. This conservation points to strong evolutionary pressures that maintain balanced tRNA supplies for translation. The study also describes specific patterns in regulatory sequences in flowering plants and shows that these genes are dispersed along chromosome arms, with limited clustering and exclusion from centromeric regions.

Core claim

Nuclear tDNA copy numbers vary by more than two orders of magnitude across species, yet the relative representation of tRNA families corresponding to each amino acid remains strikingly conserved across lineages, revealing strong evolutionary constraints on tDNA dosage. In angiosperms, tDNAs show reinforced cis-regulatory features linked to RNA polymerase III transcription, including expanded AT-rich regions, enriched CAA motifs, and extended poly(T) tracts. At the chromosomal scale, tDNAs are predominantly dispersed along chromosome arms, with homogeneous spacing that scales with genome size, while also showing non-random chromosomal distribution, exclusion from centromeric regions, and a

What carries the argument

The conserved relative representation of tRNA isoacceptor families across species, which enforces dosage balance on tDNA copy numbers despite large variations in total gene count.
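
As a minimal sketch of what this conservation looks like numerically, the snippet below converts raw tDNA counts per amino acid into proportions and compares two species. The counts, species labels, and the `max_proportion_gap` metric are illustrative assumptions, not data or methods from the paper.

```python
def family_proportions(counts):
    """Convert raw tDNA copy numbers per amino acid into fractions summing to 1."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tDNAs counted")
    return {aa: n / total for aa, n in counts.items()}


def max_proportion_gap(a, b):
    """Largest absolute difference in family proportion between two species."""
    families = set(a) | set(b)
    return max(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in families)


# Made-up counts for illustration only (not taken from the paper).
small_genome = {"Ala": 4, "Gly": 6, "Leu": 8, "Ser": 2}
large_genome = {"Ala": 400, "Gly": 610, "Leu": 790, "Ser": 200}

p_small = family_proportions(small_genome)
p_large = family_proportions(large_genome)

# Totals differ 100-fold, yet the per-family proportions barely move.
fold_change = sum(large_genome.values()) / sum(small_genome.values())
gap = max_proportion_gap(p_small, p_large)
print(f"total copy-number ratio: {fold_change:.0f}x, largest proportion gap: {gap:.3f}")
```

A small gap alongside a large fold change is the pattern the paper describes; the choice of a maximum absolute difference is only one of several reasonable ways to summarize it.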

If this is right

  • Translation efficiency and accuracy depend on maintaining specific ratios of tRNAs for each amino acid rather than overall gene abundance.
  • Angiosperms have evolved stronger regulatory motifs around tDNAs to control their transcription by RNA polymerase III.
  • tDNAs avoid centromeres and show limited clustering, suggesting selection for even distribution along chromosomes.
  • Spacing between tDNAs increases with larger genome sizes, maintaining homogeneous distribution.
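
The spacing points above can be illustrated with a rough sketch that summarizes the gaps between consecutive tDNAs on one chromosome. The coordinates are hypothetical, and the coefficient of variation is an assumed stand-in for "homogeneous spacing", not necessarily the statistic used in the paper.

```python
from statistics import mean, pstdev


def spacing_summary(positions):
    """Return (mean gap, coefficient of variation) between consecutive loci."""
    ordered = sorted(positions)
    if len(ordered) < 3:
        raise ValueError("need at least three loci to summarize spacing")
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(gaps)
    return avg, pstdev(gaps) / avg


# Hypothetical start coordinates in base pairs (illustration only).
even = [100_000, 200_000, 300_000, 400_000, 500_000]
clustered = [100_000, 101_000, 102_000, 400_000, 500_000]

avg_even, cv_even = spacing_summary(even)
avg_clustered, cv_clustered = spacing_summary(clustered)

# Same mean gap in both cases, but clustering inflates the variation.
print(avg_even, round(cv_even, 2))
print(avg_clustered, round(cv_clustered, 2))
```

Both layouts share the same mean gap, so a mean alone cannot distinguish dispersed from clustered genes; a dispersion measure of some kind is needed.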

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar conservation might exist in animal genomes and could be tested by applying the same analysis to non-plant eukaryotes.
  • The conserved dosage could influence codon usage preferences in plant proteins across species.
  • If annotation errors are minimal, this provides a baseline to predict tRNA gene content in newly sequenced plant genomes.
  • Disrupting this balance experimentally might reveal impacts on protein synthesis rates in model plants.

Load-bearing premise

That the tDNA annotations in the analyzed genomes accurately reflect functional genes without including many non-functional pseudogenes or suffering from systematic identification errors.

What would settle it

Re-annotating the genomes of these species with stricter criteria for functional tDNAs and finding that the relative family proportions no longer appear conserved, or discovering a species with markedly different tRNA family ratios.

Figures

Figures reproduced from arXiv: 2511.01943 by Alexandre Berr, David Pflieger, Guillaume Hummel, Laurence Drouard, Valerie Cognat.

Figure 1: Evolutionary trajectories of key tRNA features in plants. a) Cladogram adapted from ref. 32, highlighting key evolutionary milestones such as terrestrialization and the emergence of flowering plants. Names of organisms are abbreviated as in Table S1. Purple circles indicate the presence of tRNASec (Sec). Red (nu I) and orange (pl I) circles mark nuclear and plastidial introns, respectively. Green circles (Imp) …
Figure 3: Tracing the origin of red algae and secondary endosymbiotic lineages
Figure 4: Sequence elements contributing to efficient tRNA gene transcription
Figure 5: Organization and chromosomal distribution of tRNA genes in the green lineage.
Figure 6: Identification and classification of tRNA gene clusters across evolution.
Figure 7: Evolutionary drivers of tRNA gene repertoires in photosynthetic organisms.
Original abstract

Transfer RNAs (tRNAs) are essential components of the translational machinery. Their abundance and diversity shape decoding capacity and protein synthesis efficiency and accuracy. Because tRNA abundance is encoded in the genome through tDNA copy number, chromosomal organization, and cis-regulatory sequences controlling transcription, these features are expected to influence translational. However, the principles governing nuclear tDNA organization remain poorly understood. Here, we analyzed nuclear tDNA repertoires across 53 photosynthetic eukaryotes spanning major Archaeplastida lineages and secondary endosymbionts, along with seven non-plant eukaryotic outgroups, using comparative genomic approaches at sequence, chromosomal, and genome-wide scales. To support these analyses and enable interactive exploration of tDNA organization, we developed ShinytRNA (https://nebula.ibmp.unistra.fr/shinytRNA/), a web application for genome-wide exploration of tDNA organization. Nuclear tDNA copy numbers vary by more than two orders of magnitude across species, yet the relative representation of tRNA families corresponding to each amino acid remains strikingly conserved across lineages, revealing strong evolutionary constraints on tDNA dosage. In angiosperms, tDNAs show reinforced cis-regulatory features linked to RNA polymerase III transcription, including expanded AT-rich regions, enriched CAA motifs, and extended poly(T) tracts. At the chromosomal scale, tDNAs are predominantly dispersed along chromosome arms, with homogeneous spacing that scales with genome size, while also showing non-random chromosomal distribution, exclusion from centromeric regions, and limited clustering. Together, these patterns reveal conserved yet lineage-specific principles governing nuclear tDNA organization in plants, and highlight how multiple genomic constraints shape the evolution of nuclear tDNA repertoires.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript reports a comparative genomic analysis of nuclear tDNA organization across 53 photosynthetic eukaryotes (Archaeplastida lineages plus secondary endosymbionts) and seven non-plant outgroups. Key findings include >100-fold variation in total nuclear tDNA copy number yet striking conservation of relative tRNA family (amino-acid level) representation; reinforced cis-regulatory features (AT-rich regions, CAA motifs, poly(T) tracts) in angiosperms; and chromosomal-scale patterns of dispersed, homogeneously spaced tDNAs that avoid centromeres and show limited clustering. The authors also introduce the ShinytRNA web application for interactive exploration of these data.

Significance. If the central patterns hold, the work demonstrates multilevel evolutionary constraints on tDNA dosage and organization that are relevant to translational efficiency and plant genome evolution. The broad phylogenetic sampling and provision of an open interactive tool (ShinytRNA) are clear strengths that enhance reproducibility and community utility.

major comments (1)
  1. [Methods (tDNA annotation)] Methods section on tDNA annotation and identification: the central claim of conserved relative tRNA-family representation across >100-fold copy-number variation is load-bearing on the accuracy of functional tDNA counts. The manuscript must specify the exact annotation pipeline (including tRNAscan-SE or equivalent parameters and score thresholds), pseudogene filtering criteria, and any cross-validation steps (e.g., against GtRNAdb) applied uniformly to all 53 species whose genomes differ in repeat content, GC bias, and assembly quality.
minor comments (2)
  1. [Abstract] Abstract: the clause 'these features are expected to influence translational.' is grammatically incomplete; please complete the sentence to indicate what aspect of translation is affected.
  2. [Results / Methods] Figure legends and Methods: clarify whether statistical tests for non-random chromosomal distribution and homogeneous spacing were performed with genome-size normalization and report the exact p-value thresholds and multiple-testing corrections used.
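
To make the second minor comment concrete, here is a hedged sketch of one length-normalized test for non-random chromosomal distribution: a chi-square goodness-of-fit statistic comparing observed tDNA counts per chromosome with counts expected if genes were placed in proportion to chromosome length. This is an assumed example of such a test, not the procedure the authors report, and the counts and lengths are invented.

```python
def chi_square_vs_length(counts, lengths):
    """Chi-square statistic and degrees of freedom for observed per-chromosome
    counts against expectations proportional to chromosome length."""
    if len(counts) != len(lengths):
        raise ValueError("counts and lengths must have the same length")
    total = sum(counts)
    genome = sum(lengths)
    stat = 0.0
    for observed, length in zip(counts, lengths):
        expected = total * length / genome
        stat += (observed - expected) ** 2 / expected
    return stat, len(counts) - 1


# Hypothetical chromosome lengths in Mb and tDNA counts (illustration only).
lengths = [30, 20, 10]
proportional = [30, 20, 10]   # counts track chromosome length exactly
skewed = [10, 20, 30]         # counts run against chromosome length

stat_prop, dof = chi_square_vs_length(proportional, lengths)
stat_skew, _ = chi_square_vs_length(skewed, lengths)

# For 2 degrees of freedom the 5% critical value is about 5.99.
print(stat_prop, dof)
print(round(stat_skew, 2), dof)
```

When many species or chromosomes are tested this way, the resulting p-values would still need a multiple-testing correction, which is the other half of the referee's request.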

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback, which highlights an important aspect of methodological transparency in our comparative analysis. We address the major comment below and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: Methods section on tDNA annotation and identification: the central claim of conserved relative tRNA-family representation across >100-fold copy-number variation is load-bearing on the accuracy of functional tDNA counts. The manuscript must specify the exact annotation pipeline (including tRNAscan-SE or equivalent parameters and score thresholds), pseudogene filtering criteria, and any cross-validation steps (e.g., against GtRNAdb) applied uniformly to all 53 species whose genomes differ in repeat content, GC bias, and assembly quality.

    Authors: We agree that explicit details on the annotation pipeline are necessary to substantiate the robustness of the conserved tRNA-family proportions we report. The original Methods section described the overall approach but did not include the full parameter set. In the revised manuscript we have added a new subsection ('tDNA annotation pipeline') that specifies: (i) uniform use of tRNAscan-SE v2.0.12 with eukaryotic mode and default covariance models; (ii) a minimum score threshold of 50 for inclusion as functional tDNAs; (iii) pseudogene filtering that removes predictions lacking intact anticodon loops or containing premature stops/introns inconsistent with known plant tRNA structures; and (iv) cross-validation against GtRNAdb for the 12 species with existing database entries, with manual curation of discrepancies. The identical pipeline was applied to all 53 photosynthetic eukaryote genomes plus the seven outgroups, with additional quality checks for low-coverage assemblies. These additions directly address the referee's concern without altering any results. revision: yes
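
The filtering step described in this simulated response can be sketched as follows, assuming predictions have already been parsed into simple records. The field names (`amino_acid`, `score`, `pseudo`) and the example values are hypothetical and do not reproduce tRNAscan-SE's output schema; only the threshold of 50 comes from the text above.

```python
from collections import Counter

MIN_SCORE = 50.0  # threshold quoted in the rebuttal above


def keep_prediction(pred, min_score=MIN_SCORE):
    """True if a prediction meets the score cut and is not flagged as a pseudogene."""
    return pred["score"] >= min_score and not pred["pseudo"]


def count_families(predictions, min_score=MIN_SCORE):
    """Count retained tDNAs per amino-acid family."""
    return Counter(
        p["amino_acid"] for p in predictions if keep_prediction(p, min_score)
    )


# Hypothetical records for illustration only.
predictions = [
    {"amino_acid": "Ala", "score": 72.4, "pseudo": False},
    {"amino_acid": "Ala", "score": 48.9, "pseudo": False},  # below threshold
    {"amino_acid": "Gly", "score": 65.0, "pseudo": True},   # flagged pseudogene
    {"amino_acid": "Gly", "score": 80.1, "pseudo": False},
    {"amino_acid": "Tyr", "score": 50.0, "pseudo": False},  # at threshold, kept
]

counts = count_families(predictions)
print(dict(counts))
```

Re-running `count_families` with a stricter `min_score` is a quick way to check whether family proportions are sensitive to the cutoff, which is the concern the referee raises.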

Circularity Check

0 steps flagged

No significant circularity: purely observational genomic patterns

Full rationale

The paper reports empirical observations from comparative analysis of tDNA annotations across 53 species and outgroups, documenting copy-number variation and conserved relative amino-acid family representation without any fitted parameters, predictive equations, or model-derived quantities. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the derivation chain; the central claims rest on direct data patterns and chromosomal-scale descriptions rather than reductions to prior inputs by construction. The analysis is self-contained as a descriptive genomic survey.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The study relies on established bioinformatics pipelines for tDNA annotation and standard comparative methods without introducing new fitted parameters or postulated entities; background assumptions concern the reliability of existing genome assemblies and tRNA gene finders.

axioms (1)
  • domain assumption: tDNAs can be reliably identified and classified into isoacceptor families using existing sequence-based annotation tools across diverse eukaryotic genomes
    The dosage conservation and chromosomal distribution claims depend on accurate and consistent tDNA calls from the sampled assemblies.

pith-pipeline@v0.9.0 · 5840 in / 1321 out tokens · 49679 ms · 2026-05-18T02:02:12.054959+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    Nuclear tDNA copy numbers vary by more than two orders of magnitude across species, yet the relative representation of tRNA families corresponding to each amino acid remains strikingly conserved across lineages, revealing strong evolutionary constraints on tDNA dosage.

  • IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_add echoes

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    tRNA gene copy number tightly co-varies with codon usage frequencies... the correlation between tRNA abundance and amino acid frequency is broadly conserved across eukaryotes

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
