MOSAIC: Codon Harmonization of Monte Carlo-Based Simulated Annealing for Linked Codons in Heterologous Protein Expression
Pith reviewed 2026-05-21 18:48 UTC · model grok-4.3
The pith
Monte Carlo annealing on linked codon sets produces more soluble ribosomal protein than wild-type genes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that applying Monte Carlo simulated annealing to sets of linked codons generates harmonized gene versions that, when expressed heterologously, deliver higher total yields of ribosomal protein S18 together with substantially more soluble protein than the corresponding wild-type gene.
What carries the argument
MOSAIC (Monte Carlo-based Simulated Annealing for Linked Codons), an algorithm that jointly optimizes groups of consecutive codons to match native translation timing.
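To make the search strategy concrete, here is a minimal sketch of simulated annealing over synonymous codons in which each move re-samples a small group of consecutive codons together. It is not the authors' implementation. The codon table, the `WEIGHT` values, the `TARGET` profile, the `toy_loss` function, and all parameter defaults (`tau0`, `eta`, `steps`, `group`) are hypothetical placeholders; only the Metropolis acceptance rule and the linear cooling form τ(t) = τ(0) − η t, which is quoted further down this page, are standard or taken from the source.

```python
import math
import random

# Toy synonymous-codon table for two amino acids (illustration only).
SYNONYMS = {"K": ["AAA", "AAG"], "E": ["GAA", "GAG"]}
# Hypothetical per-codon weights; NOT measured codon-usage data.
WEIGHT = {"AAA": 0.8, "AAG": 0.2, "GAA": 0.7, "GAG": 0.3}

PROTEIN = "KEKEKEKE"
# Hypothetical per-position target profile the search should match.
TARGET = [0.8, 0.3, 0.2, 0.7, 0.8, 0.7, 0.2, 0.3]


def toy_loss(codons):
    """Placeholder loss: summed absolute deviation from the target profile."""
    return sum(abs(WEIGHT[c] - t) for c, t in zip(codons, TARGET))


def anneal(protein, loss, tau0=1.0, eta=1e-3, steps=900, group=3, seed=0):
    """Simulated annealing with linear cooling, tau(t) = tau0 - eta * t.

    Each proposal re-samples `group` consecutive codons at once, so the
    codons in that group are accepted or rejected together.
    """
    rng = random.Random(seed)
    codons = [rng.choice(SYNONYMS[aa]) for aa in protein]
    current = loss(codons)
    best, best_loss = list(codons), current
    for t in range(steps):
        tau = max(tau0 - eta * t, 1e-9)  # clamp so tau never reaches zero
        start = rng.randrange(max(1, len(protein) - group + 1))
        proposal = list(codons)
        for i in range(start, min(start + group, len(protein))):
            proposal[i] = rng.choice(SYNONYMS[protein[i]])
        new = loss(proposal)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if new <= current or rng.random() < math.exp((current - new) / tau):
            codons, current = proposal, new
            if current < best_loss:
                best, best_loss = list(codons), current
    return best, best_loss


best, best_loss = anneal(PROTEIN, toy_loss)
```

Because every proposal draws only from the synonymous codons of the residue at that position, the encoded protein never changes; only the codon choice does. A real application would swap `toy_loss` for a profile-matching loss built from measured codon-usage tables.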
If this is right
- Higher expression levels become possible for other proteins whose folding depends on controlled translation speed.
- The fraction of soluble, active protein rises when codon groups are harmonized together.
- Recombinant production of sensitive proteins becomes more predictable without extensive trial-and-error codon screening.
- The method extends naturally to additional ribosomal or folding-sensitive targets beyond the four analyzed computationally (S18, S15, S10, and L11), of which only S18 was expressed.
Where Pith is reading between the lines
- The same linked-set approach could be tested on membrane proteins or enzymes that misfold under rapid translation.
- Combining the algorithm with measured ribosome dwell times from ribosome profiling might tighten the match to native kinetics.
- If the yield benefit holds across hosts, the technique could reduce the need for rare-codon supplementation in industrial strains.
Load-bearing premise
The measured gains in total protein and soluble fraction for S18 arise specifically from the linked-codon harmonization rather than from unmeasured differences in vector, host, or growth conditions.
What would settle it
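Express, under the same vector, host, and induction conditions, an S18 gene harmonized one codon at a time without the linked-codon step. If the MOSAIC-harmonized gene shows no gain in yield or solubility over that control, the linked-codon formulation is not what drives the improvement.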
Original abstract
Codon usage bias has a crucial impact on the translation efficiency and co-translational folding of proteins, necessitating the algorithmic development of codon optimization/harmonization methods, particularly for heterologous recombinant protein expression. Codon harmonization is especially valuable for proteins sensitive to translation rates, because it can potentially replicate native translation speeds, preserving proper folding and maintaining protein activity. This work proposes a Monte Carlo-based codon harmonization algorithm, MOSAIC (Monte Carlo-based Simulated Annealing for Linked Codons), for the harmonization of a set of linked codons, which differs from conventional codon harmonization, by focusing on the codon sets rather than individual ones. Our MOSAIC demonstrates robust computational performance on ribosomal proteins (S18, S15, S10, and L11) as model systems. Among them, the harmonized gene of RP S18 was expressed and compared with the expression of the wild-type gene. The harmonized gene clearly yielded a larger quantity of the protein, from which the amount of the soluble protein was also significant. These results underscored the potential of the linked codon harmonization approach to enhance the expression and functionality of sensitive proteins, setting the stage for more efficient production of recombinant proteins in various biotechnological and pharmaceutical applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces MOSAIC, a Monte Carlo-based simulated annealing algorithm for harmonizing sets of linked codons (rather than individual codons) to optimize heterologous protein expression while preserving native translation kinetics. It reports computational application to ribosomal proteins S18, S15, S10, and L11 as model systems, and provides experimental comparison for RP S18 showing that the harmonized gene produced a larger quantity of total protein with a significant soluble fraction relative to the wild-type gene.
Significance. If the experimental results can be shown to isolate the effect of linked-codon harmonization through matched controls, replicates, and quantitative metrics, the work could contribute a new search strategy for codon optimization that targets co-translational folding in sensitive proteins. This has potential relevance for recombinant protein production in biotechnology, though the current support for the central experimental claim remains limited.
major comments (2)
- [Abstract] Abstract (and corresponding results section): The claim that the MOSAIC-harmonized RP S18 gene 'clearly yielded a larger quantity of the protein' with 'the amount of the soluble protein was also significant' supplies no quantitative values, replicate numbers, statistical tests, error bars, or controls for confounding variables such as vector backbone, promoter, ribosome binding site, host strain, growth media, induction protocol, or protein quantification method. Without these, the observed difference cannot be attributed specifically to the linked-codon harmonization strategy.
- [Methods] Methods/Results (algorithm validation): The description of MOSAIC as a 'robust' Monte Carlo simulated annealing procedure for linked codons does not include sufficient implementation details (e.g., exact energy function for codon sets, annealing schedule parameters, convergence criteria, or comparison baselines against standard codon harmonization tools) to evaluate reproducibility or to confirm that performance gains arise from the linked-codon formulation rather than generic optimization.
minor comments (1)
- [Abstract] The abstract would be strengthened by adding at least one quantitative metric (e.g., fold-change in yield or solubility percentage) and a brief statement of the number of biological replicates performed.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review of our manuscript on MOSAIC. We address each major comment below and describe the revisions that will be made to improve the clarity, reproducibility, and rigor of the work.
Point-by-point responses
- Referee: [Abstract] Abstract (and corresponding results section): The claim that the MOSAIC-harmonized RP S18 gene 'clearly yielded a larger quantity of the protein' with 'the amount of the soluble protein was also significant' supplies no quantitative values, replicate numbers, statistical tests, error bars, or controls for confounding variables such as vector backbone, promoter, ribosome binding site, host strain, growth media, induction protocol, or protein quantification method. Without these, the observed difference cannot be attributed specifically to the linked-codon harmonization strategy.
  Authors: We agree that the current presentation of the experimental results for RP S18 lacks the quantitative detail needed to fully support the claims and to isolate the contribution of linked-codon harmonization. In the revised manuscript we will expand both the abstract and the corresponding results section to report specific protein yield values (total and soluble fractions), the number of biological and technical replicates, appropriate statistical tests with p-values, error bars or standard deviations, and a complete description of all experimental controls including vector backbone, promoter, RBS, host strain, media, induction conditions, and the protein quantification method employed. These additions will allow readers to evaluate the magnitude and specificity of the observed improvement.
  Revision: yes
- Referee: [Methods] Methods/Results (algorithm validation): The description of MOSAIC as a 'robust' Monte Carlo simulated annealing procedure for linked codons does not include sufficient implementation details (e.g., exact energy function for codon sets, annealing schedule parameters, convergence criteria, or comparison baselines against standard codon harmonization tools) to evaluate reproducibility or to confirm that performance gains arise from the linked-codon formulation rather than generic optimization.
  Authors: We acknowledge that the current Methods section does not provide enough implementation specifics for independent reproduction or for distinguishing the benefit of the linked-codon formulation. In the revised manuscript we will supply the precise energy function used to score sets of linked codons, the full annealing schedule (initial temperature, cooling rate, number of iterations per temperature), the convergence criteria, and explicit benchmark comparisons against representative individual-codon harmonization methods. These details will be presented in a new or expanded subsection to demonstrate that the reported performance improvements derive from the linked-codon treatment.
  Revision: yes
Circularity Check
No circularity: new Monte Carlo algorithm validated by direct experimental comparison
Full rationale
The paper introduces MOSAIC as a Monte Carlo simulated annealing procedure for linked-codon harmonization and reports computational performance on ribosomal proteins plus one experimental expression comparison (harmonized RP S18 vs wild-type). No derivation chain, fitted parameter, or self-citation is invoked to generate the reported yield improvement; the result is an empirical observation rather than a quantity forced by construction from the algorithm's inputs or prior self-referential claims. The derivation is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- Monte Carlo and annealing schedule parameters
axioms (1)
- domain assumption: Codon usage bias has a crucial impact on the translation efficiency and co-translational folding of proteins
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  unclear: relation between the paper passage and the cited Recognition theorem.
  Paper passage: MOSAIC … loss $\mathcal{L}(x^{(t)}) = \sum_k \left| \mathcal{M}^{(H)}(x^{\mathrm{WT}})_k - \mathcal{M}^{(E)}(x^{(t)})_k \right|$ … simulated annealing with linear cooling schedule $\tau(t) = \tau(0) - \eta t$
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  unclear: relation between the paper passage and the cited Recognition theorem.
  Paper passage: window size z = 10 … %MinMax profile … linked codon harmonization
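The quoted passages combine a windowed %MinMax profile with a sum of absolute profile differences as the loss. The sketch below shows one way these two pieces could fit together, using the commonly cited %MinMax definition (window-averaged codon usage rescaled between the window's minimum, average, and maximum synonymous usage). Whether the paper uses exactly this variant is an assumption. The usage values in `USAGE`, the `SYN` table, and the function names are hypothetical, and the demo uses z = 2 rather than the quoted z = 10 only to keep the example small.

```python
def minmax_profile(codons, usage, synonyms, z=10):
    """Return one %MinMax value per sliding window of z codons.

    usage:    codon -> usage frequency in the organism of interest
    synonyms: codon -> list of all codons for the same amino acid (itself included)
    """
    profile = []
    for start in range(len(codons) - z + 1):
        window = codons[start:start + z]
        actual = sum(usage[c] for c in window) / z
        families = [[usage[s] for s in synonyms[c]] for c in window]
        hi = sum(max(f) for f in families) / z
        lo = sum(min(f) for f in families) / z
        avg = sum(sum(f) / len(f) for f in families) / z
        if actual >= avg:
            # %Max branch: 0 at average usage, +100 at the most common codons.
            value = 100.0 * (actual - avg) / (hi - avg) if hi > avg else 0.0
        else:
            # %Min branch: -100 at the rarest codons.
            value = -100.0 * (avg - actual) / (avg - lo)
        profile.append(value)
    return profile


def profile_loss(target_profile, candidate_profile):
    """Sum over windows k of |target_k - candidate_k|."""
    return sum(abs(a - b) for a, b in zip(target_profile, candidate_profile))


# Hypothetical usage frequencies for illustration; NOT real codon-usage data.
USAGE = {"AAA": 0.8, "AAG": 0.2, "GAA": 0.6, "GAG": 0.4}
SYN = {"AAA": ["AAA", "AAG"], "AAG": ["AAA", "AAG"],
       "GAA": ["GAA", "GAG"], "GAG": ["GAA", "GAG"]}

fast = minmax_profile(["AAA", "GAA", "AAA"], USAGE, SYN, z=2)  # most common codons
slow = minmax_profile(["AAG", "GAG", "AAG"], USAGE, SYN, z=2)  # rarest codons
gap = profile_loss(fast, slow)
```

In a harmonization setting the target profile would be computed from the wild-type sequence with the native organism's usage table, and the candidate profile from the redesigned sequence with the expression host's table.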
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.