Network analysis of synonymous codon usage

Gabriel Wright; Jacob Piland; Jun Li; Khalique Newaz; Patricia Clark; Scott Emrich; Tijana Milenkovic

arxiv: 1907.03351 · v1 · pith:CKMVSQYMnew · submitted 2019-07-07 · 🧬 q-bio.MN

Pith reviewed 2026-05-25 01:03 UTC · model grok-4.3

keywords: synonymous codons · protein structure networks · network centrality · co-translational folding · rare codons · protein function · evolutionary conservation

The pith

In 84% of proteins, at least one codon category occupies significantly different network-central positions than the others.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper models protein three-dimensional structures as networks to compare the network positions of amino acids encoded by evolutionarily conserved rare codons, evolutionarily non-conserved rare codons, and commonly used codons. It reports that in 84% of the analyzed proteins, at least one of these three categories is significantly more or less central than the others. Proteins are then grouped according to the specific pattern of these centrality differences, and the groups turn out to be enriched for distinct biological functions. The paper takes this as evidence for a connection between codon usage at the sequence level, the structural positions of the encoded residues, and the functional roles of the finished proteins.
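The grouping step described above can be pictured with a small sketch. It assumes each protein has already been reduced to the set of significant "category X is more central than category Y" relationships; the protein IDs, the relationship encoding, and the function name `group_by_trend` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical per-protein results: each protein maps to the set of
# (more_central, less_central) category pairs found significant.
# IDs and relationships are made-up placeholders, not data from the paper.
significant = {
    "protA": {("conserved_rare", "common")},
    "protB": {("conserved_rare", "common")},
    "protC": {("common", "nonconserved_rare"), ("common", "conserved_rare")},
    "protD": set(),  # no significant relationship at all
}


def group_by_trend(results):
    """Group protein IDs by their exact set of significant relationships."""
    groups = defaultdict(list)
    for protein, relationships in results.items():
        # frozenset makes the pattern hashable so it can serve as a dict key
        groups[frozenset(relationships)].append(protein)
    return {trend: sorted(members) for trend, members in groups.items()}


trend_groups = group_by_trend(significant)
for trend, members in trend_groups.items():
    print(sorted(trend), members)
```

Proteins with identical patterns land in the same group, and the empty pattern forms its own group, which mirrors the "no relationship" group mentioned in the figure captions on this page.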

Core claim

By representing protein structures as networks and analyzing the network centrality of residues encoded by three codon categories (evolutionarily conserved rare, evolutionarily non-conserved rare, and commonly used), the analysis reveals that in 84% of the proteins at least one codon category occupies significantly more or less central positions than the others. Protein groups defined by their distinct codon-centrality trends are enriched in different biological functions, implying a link between codon usage, protein folding, and protein function.

What carries the argument

Network centrality measures computed on amino-acid nodes in graphs derived from protein three-dimensional structures, with nodes partitioned by the synonymous codon category that encodes each amino acid.
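As a rough illustration of this machinery, the sketch below builds a contact network from residue coordinates and splits one centrality measure by codon category. The distance cutoff, the toy coordinates, the category labels, and the choice of degree centrality are all assumptions made for the example; the figure captions on this page say the paper uses six centrality measures, and their exact definitions are not reproduced here.

```python
import math
from collections import defaultdict


def contact_network(coords, cutoff):
    """Adjacency dict linking residues whose coordinates lie closer than `cutoff`."""
    n = len(coords)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def degree_centrality(adj):
    """Fraction of the other nodes that each node touches (needs >= 2 nodes)."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def centrality_by_category(centrality, categories):
    """Collect centrality values per codon category, in residue order."""
    buckets = defaultdict(list)
    for node in sorted(centrality):
        buckets[categories[node]].append(centrality[node])
    return dict(buckets)


# Toy residue coordinates and codon-category labels (hypothetical values).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0),
          (9.0, 0.0, 0.0), (3.0, 3.0, 0.0)]
categories = ["common", "conserved_rare", "common",
              "nonconserved_rare", "common"]

adj = contact_network(coords, cutoff=4.0)  # cutoff chosen for the toy data only
centrality = degree_centrality(adj)
grouped = centrality_by_category(centrality, categories)
print(grouped)
```

The per-category lists in `grouped` are the kind of input a statistical test would then compare, one pair of categories at a time.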

If this is right

Protein groups defined by different codon-centrality trends are enriched in different biological functions.
The placement of rare codons may be tuned to the folding requirements of particular protein classes.
A connection exists between codon usage patterns, co-translational folding, and final protein function.
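The enrichment statement in the first point above is the kind of claim commonly backed by a hypergeometric test, and the paper's reference list does include a chapter on hypergeometric testing for gene set enrichment. Whether the paper applies it exactly as below is not stated on this page, so the following is a minimal sketch with hypothetical counts.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, group_size, hits):
    """P(X >= hits) when `group_size` proteins are drawn without replacement
    from `population` proteins, `annotated` of which carry the GO term."""
    total = comb(population, group_size)
    upper = min(annotated, group_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, group_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical example: 63 proteins overall, 6 carry the term,
# and a group of 5 proteins contains 2 of them.
p_value = hypergeom_upper_tail(population=63, annotated=6, group_size=5, hits=2)
print(p_value)
```

A small p-value would indicate that the term appears in the group more often than random draws from the full protein set would suggest.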

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Network centrality could serve as a filter to flag candidate codon sites whose mutation would most affect folding in a given protein.
The functional enrichment patterns could be used to generate testable predictions about which protein classes are most sensitive to codon optimization choices.
Overlaying codon-centrality data with known chaperone-binding sites might reveal whether central rare codons coincide with folding bottlenecks.

Load-bearing premise

The chosen network representation of each protein structure and the chosen centrality measure accurately capture positions that matter for co-translational folding.

What would settle it

A set of proteins in which the three codon categories show no difference in measured folding kinetics or chaperone interaction when their network-central positions are experimentally swapped would falsify the claimed link.

Figures

Figures reproduced from arXiv: 1907.03351 by Gabriel Wright, Jacob Piland, Jun Li, Khalique Newaz, Patricia Clark, Scott Emrich, Tijana Milenkovic.

**Figure 1.** The six possible relationships between amino acids in a protein (i.e., nodes in a PSN) encoded by conserved rare, non-conserved rare, and common codons. We perform the above six comparisons (i.e., test the six relationships) for each of the 63 proteins (i.e., PSNs), using each of the six network centrality measures. Hence, we perform 6×63×6 = 2,268 comparisons, i.e., Wilcoxon signed-rank tests, and obtain 2…

**Figure 2.** The 17 different codon centrality trends (i.e., different combinations of relationships between PSN positions of the three codon categories) present in our data. Section 3.2, "No meaningful codon usage trends can be observed from randomized codon usage data": If there is some biochemical signal behind our identified codon usage groups, we expect that if we randomize the codon usage data (i.e., randomly reshuffle label…

**Figure 3.** Numbers of proteins having the different codon usage trends. The 16 trends (i.e., codon usage groups) that exhibit at least one relationship with respect to at least one centrality measure are shown. The 17th “no codon usage” group with 10 proteins is left out, since no relationship is exhibited with respect to any centrality measure. The figure can be interpreted as follows. As an illustration, there are …

**Figure 4.** Functional enrichment of codon usage groups in terms of biological process GO terms. We consider those 13 out of all 17 groups that have more than two proteins, and we consider only those biological process GO terms that annotate at least two proteins in at least one of the 13 groups (Section 3.1). In the figure, a colored matrix cell indicates that the given GO term annotates at least two proteins in the …
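The Figure 2 text above mentions randomizing the codon usage data by reshuffling, but it breaks off before saying exactly what is reshuffled. The sketch below assumes the simplest reading: the codon-category labels of one protein's residues are permuted while the category sizes stay fixed. The function names, label names, and seed are hypothetical.

```python
import random


def shuffle_labels(categories, rng):
    """Return a reshuffled copy of the labels; category sizes are preserved."""
    shuffled = list(categories)
    rng.shuffle(shuffled)
    return shuffled


def randomized_datasets(categories, n_rounds, seed=0):
    """Generate `n_rounds` independent reshufflings of one protein's labels."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [shuffle_labels(categories, rng) for _ in range(n_rounds)]


# Hypothetical label vector for a ten-residue protein.
labels = ["conserved_rare"] * 2 + ["nonconserved_rare"] * 3 + ["common"] * 5
rounds = randomized_datasets(labels, n_rounds=4, seed=1)
for r in rounds:
    print(r)
```

Rerunning the centrality comparison on each reshuffled label vector gives a baseline for how many "trends" appear when the labels carry no structural signal.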

Original abstract

Most amino acids are encoded by multiple synonymous codons. For an amino acid, some of its synonymous codons are used much more rarely than others. Analyses of positions of such rare codons in protein sequences revealed that rare codons can impact co-translational protein folding and that positions of some rare codons are evolutionary conserved. Analyses of positions of rare codons in proteins' 3-dimensional structures, which are richer in biochemical information than sequences alone, might further explain the role of rare codons in protein folding. We analyze a protein set recently annotated with codon usage information, considering non-redundant proteins with sufficient structural information. We model the proteins' structures as networks and study potential differences between network positions of amino acids encoded by evolutionary conserved rare, evolutionary non-conserved rare, and commonly used codons. In 84% of the proteins, at least one of the three codon categories occupies significantly more or less network-central positions than the other codon categories. Different protein groups showing different codon centrality trends (i.e., different types of relationships between network positions of the three codon categories) are enriched in different biological functions, implying the existence of a link between codon usage, protein folding, and protein function.
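The headline numbers can be checked against the counts quoted in the figure captions on this page. The check assumes that the 63 proteins named in the Figure 1 caption and the 10 proteins in the "no codon usage" group of the Figure 3 caption refer to the same protein set, which the captions suggest but do not state outright.

```latex
\begin{align*}
6 \times 63 \times 6 &= 2268
  && \text{(relationships} \times \text{proteins} \times \text{centrality measures)}\\
\frac{63 - 10}{63} = \frac{53}{63} &\approx 0.841
  && \text{(proteins showing at least one relationship)}
\end{align*}
```

Under that assumption, the 84% in the abstract corresponds to 53 of 63 proteins.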

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper finds codon-type centrality differences in 84% of proteins via 3D networks and reports functional enrichments, but does not validate that centrality tracks co-translational folding positions.

This paper turns protein structures into residue contact networks and checks whether amino acids from conserved rare codons, non-conserved rare codons, and common codons sit in different network positions. It reports that in 84% of the proteins at least one codon category is significantly more or less central, then groups proteins by the pattern of those differences and finds functional enrichments in each group.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes a set of non-redundant proteins annotated with codon usage data. Protein structures are modeled as residue contact networks. The authors compare the network centrality of residues encoded by three codon categories (evolutionarily conserved rare codons, non-conserved rare codons, and common codons). They report that in 84% of proteins at least one category occupies significantly more or less central positions than the others. Proteins are grouped by their codon-centrality trend patterns; these groups show distinct functional enrichments, which the authors interpret as evidence for a link between codon usage, co-translational folding, and protein function.

Significance. If the network centrality differences are shown to mark positions relevant to folding, the 84% statistic and the function-specific trend enrichments would constitute a useful empirical observation linking codon bias to structural positioning in a large fraction of proteins. The work is grounded in existing annotations rather than new derivations or simulations and supplies no parameter-free predictions or machine-checked results. Its interpretive reach depends on the untested assumption that the chosen graph model and centrality statistic are proxies for co-translational folding constraints.

major comments (2)

[Abstract / Results] Abstract and Results: the central claim that 'in 84% of the proteins, at least one of the three codon categories occupies significantly more or less network-central positions' cannot be evaluated because the manuscript supplies no information on protein selection criteria, the precise definition of the residue contact network, the centrality measure employed, the statistical test used, or any multiple-testing correction. These omissions are load-bearing for the quantitative result.
[Abstract / Discussion] Abstract and Discussion: the interpretation that the observed centrality trends imply a link to co-translational folding is not anchored by any validation that the chosen network representation or centrality statistic identifies positions known to affect folding kinetics. No comparison is made to folding data, alternative structural encodings, or experimentally verified rare-codon sites.

minor comments (2)

[Abstract] The abstract states 'non-redundant proteins with sufficient structural information' without defining the redundancy or structural-quality thresholds applied.
[Throughout] Notation for the three codon categories is introduced only in the abstract; consistent labels should be used throughout the text and figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major point below, with clarifications on methodology and adjustments to interpretation where appropriate.

Referee: [Abstract / Results] Abstract and Results: the central claim that 'in 84% of the proteins, at least one of the three codon categories occupies significantly more or less network-central positions' cannot be evaluated because the manuscript supplies no information on protein selection criteria, the precise definition of the residue contact network, the centrality measure employed, the statistical test used, or any multiple-testing correction. These omissions are load-bearing for the quantitative result.

Authors: The protein selection criteria (non-redundant proteins with sufficient structural information from the annotated set), residue contact network definition (residues in spatial proximity), centrality measure, statistical tests for comparing category positions, and multiple-testing approach are detailed in the Methods section. To make the 84% result directly evaluable from the Abstract and Results without cross-reference, we will add a concise summary of these elements to the Results section. revision: yes
Referee: [Abstract / Discussion] Abstract and Discussion: the interpretation that the observed centrality trends imply a link to co-translational folding is not anchored by any validation that the chosen network representation or centrality statistic identifies positions known to affect folding kinetics. No comparison is made to folding data, alternative structural encodings, or experimentally verified rare-codon sites.

Authors: We acknowledge that the manuscript does not include direct comparisons to folding kinetics data or experimental rare-codon sites, as the study is observational and relies on existing structural annotations and functional enrichment analysis. The residue-contact network and centrality measures are motivated by established literature linking network centrality to structurally critical positions. We will revise the Discussion to present the co-translational folding link as an interpretive hypothesis supported by the observed patterns and functional enrichments, rather than a validated conclusion, and explicitly note the absence of direct kinetic validation. revision: partial
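The first exchange above turns on the multiple-testing approach, which this page does not spell out. The paper's reference list includes Benjamini and Hochberg's false discovery rate paper, so a step-up FDR procedure is one plausible candidate; the sketch below shows that procedure as an illustration only, with hypothetical p-values and an assumed `alpha`.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at FDR level `alpha`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, not results from the paper.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
decisions = benjamini_hochberg(example, alpha=0.05)
print(decisions)
```

With 2,268 tests, as quoted in the Figure 1 caption, the choice between no correction, a family-wise correction, and an FDR procedure can change how many proteins count toward the 84%, which is why the referee treats it as load-bearing.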

Circularity Check

0 steps flagged

No circularity: empirical statistical comparison of network centralities on external annotations

The manuscript performs direct computation of network centralities on residue-contact graphs derived from PDB structures, then applies standard statistical tests to compare positions of three codon classes drawn from an externally annotated protein set. No equations are presented, no parameters are fitted and relabeled as predictions, and no self-citations supply uniqueness theorems or ansatzes that the results depend upon. The 84% figure and functional-enrichment observations are data-driven counts, not reductions to the modeling choices by construction. The modeling assumptions (contact definition, centrality measure) are stated but remain external to the reported statistics; their biological interpretation is a separate validation question, not a circularity issue.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Abstract-only review; ledger populated from stated elements in the abstract. The analysis depends on standard definitions of codon rarity and conservation plus the validity of the network model.

free parameters (2)

rare-codon frequency threshold
Abstract does not specify how 'rare' versus 'common' codons are defined; this cutoff is required to assign amino acids to categories.
evolutionary conservation criterion
Abstract does not detail the sequence-alignment or conservation score threshold used to label codons as 'evolutionary conserved rare'.
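To make the first free parameter concrete, the sketch below labels codons as rare or common from their usage relative to synonymous alternatives. The relative-frequency definition, the 0.10 threshold, and the codon counts are assumptions for illustration; the abstract does not give the paper's actual definition.

```python
from collections import defaultdict

# Small excerpt of the standard genetic code (codon -> amino acid).
CODON_TO_AA = {
    "CTG": "Leu", "CTA": "Leu", "TTA": "Leu",
    "AAA": "Lys", "AAG": "Lys",
}


def relative_usage(codon_counts):
    """Frequency of each codon among the observed synonymous codons of its
    amino acid. Assumes every amino acid has a non-zero total count."""
    totals = defaultdict(int)
    for codon, count in codon_counts.items():
        totals[CODON_TO_AA[codon]] += count
    return {codon: count / totals[CODON_TO_AA[codon]]
            for codon, count in codon_counts.items()}


def label_codons(codon_counts, rare_threshold):
    """Label a codon 'rare' when its relative usage falls below the threshold."""
    usage = relative_usage(codon_counts)
    return {codon: ("rare" if freq < rare_threshold else "common")
            for codon, freq in usage.items()}


# Hypothetical genome-wide codon counts.
counts = {"CTG": 90, "CTA": 5, "TTA": 5, "AAA": 60, "AAG": 40}
labels = label_codons(counts, rare_threshold=0.10)
print(labels)
```

Moving `rare_threshold` reassigns residues between categories, so every downstream centrality comparison inherits this choice.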

axioms (2)

domain assumption: Protein structures can be represented as networks in which node centrality reflects positions relevant to co-translational folding.
The paper invokes this when interpreting differences in network positions as informative about folding.
domain assumption: The selected non-redundant proteins with sufficient structural information form a representative sample for the reported trends.
Abstract states the protein set but provides no justification or sampling details.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

[1] Sharp, P. M. & Li, W.-H. An evolutionary perspective on synonymous codon usage in unicellular organisms. Journal of Molecular Evolution 24, 28–38 (1986)
[2] Chaney, J. L. & Clark, P. L. Roles for synonymous codon usage in protein biogenesis. Annual Review of Biophysics 44, 143–166 (2015)
[3] Liu, S. S., Hockenberry, A. J., Jewett, M. C. & Amaral, L. A. A novel framework for evaluating the performance of codon usage bias metrics. Journal of The Royal Society Interface 15, 20170667 (2018)
[4] Ikemura, T. Codon usage and tRNA content in unicellular and multicellular organisms. Molecular Biology and Evolution 2, 13–34 (1985)
[5] Sharp, P. M. & Li, W.-H. The codon adaptation index – a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Research 15, 1281–1295 (1987)
[6] Kramer, E. B. & Farabaugh, P. J. The frequency of translational misreading errors in E. coli is largely determined by tRNA competition. RNA 13, 87–96 (2007)
[7] Zhou, T., Weems, M. & Wilke, C. O. Translationally optimal codons associate with structurally sensitive sites in proteins. Molecular Biology and Evolution 26, 1571–1580 (2009)
[8] Warnecke, T. & Hurst, L. D. GroEL dependency affects codon usage – support for a critical role of misfolding in gene evolution. Molecular Systems Biology 6, 340 (2010)
[9] Pechmann, S. & Frydman, J. Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding. Nature Structural & Molecular Biology 20, 237 (2013)
[10] Zhou, M., Wang, T., Fu, J., Xiao, G. & Liu, Y. Nonoptimal codon usage influences protein structure in intrinsically disordered regions. Molecular Microbiology 97, 974–987 (2015)
[11] Komar, A. A. A pause for thought along the co-translational folding pathway. Trends in Biochemical Sciences 34, 16–24 (2009)
[12] Kimchi-Sarfaty, C. et al. A "silent" polymorphism in the MDR1 gene changes substrate specificity. Science 315, 525–528 (2007)
[13] Zhou, M. et al. Non-optimal codon usage affects expression, structure and function of clock protein FRQ. Nature 495, 111 (2013)
[14] Komar, A. A., Lesnik, T. & Reiss, C. Synonymous codon substitutions affect ribosome traffic and protein folding during in vitro translation. FEBS Letters 462, 387–391 (1999)
[15] Sander, I. M., Chaney, J. L. & Clark, P. L. Expanding Anfinsen's principle: contributions of synonymous codon selection to rational protein design. Journal of the American Chemical Society 136, 858–861 (2014)
[16] Buhr, F. et al. Synonymous codons direct cotranslational folding toward different protein conformations. Molecular Cell 61, 341–351 (2016)
[17] Jacobson, G. N. & Clark, P. L. Quality over quantity: optimizing co-translational protein folding with non-'optimal' synonymous codons. Current Opinion in Structural Biology 38, 102–110 (2016)
[18] Illergård, K., Ardell, D. H. & Elofsson, A. Structure is three to ten times more conserved than sequence – a study of structural response in protein cores. Proteins: Structure, Function, and Bioinformatics 77, 499–508 (2009)
[19] Chaney, J. L. et al. Widespread position-specific conservation of synonymous rare codons within coding sequences. PLoS Computational Biology 13, e1005531 (2017)
[20] Jacobs, W. M. & Shakhnovich, E. I. Evidence of evolutionary selection for cotranslational folding. Proceedings of the National Academy of Sciences 114, 11434–11439 (2017)
[21] Ba, A. N. N. et al. Proteome-wide discovery of evolutionary conserved sequences in disordered regions. Science Signaling 5, rs1–rs1 (2012)
[22] González-Camacho, F. & Medina, F. J. Nucleolins from different model organisms have conserved sequences reflecting the conservation of key cellular functions through evolution. Journal of Applied Biomedicine 2, 151–161 (2004)
[23] Gupta, S., Majumdar, S., Bhattacharya, T. & Ghosh, T. Studies on the relationships between the synonymous codon usage and protein secondary structural units. Biochemical and Biophysical Research Communications 269, 692–696 (2000)
[24] Newman, M. Networks (Oxford University Press, 2018)
[25] Milenković, T., Filippis, I., Lappe, M. & Pržulj, N. Optimized null model for protein structure networks. PLoS ONE 4, e5967 (2009)
[26] Faisal, F. E. et al. GRAFENE: Graphlet-based alignment-free network approach integrates 3D structural and sequence (residue order) data to improve protein structural comparison. Scientific Reports 7, 14890 (2017)
[27] Newaz, K., Rahnama, A., Ghalehnovi, M., Antsaklis, P. J. & Milenkovic, T. Network-based protein structural classification. arXiv:1804.04725v2 (2018)
[28] Holm, L. & Rosenström, P. Dali server: conservation mapping in 3D. Nucleic Acids Research 38, W545 (2010)
[29] Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Research 33, 2302–2309 (2005)
[30] Harder, T., Borg, M., Boomsma, W., Røgen, P. & Hamelryck, T. Fast large-scale clustering of protein structures using Gauss integrals. Bioinformatics 28, 510–515 (2012)
[31] Xia, J., Peng, Z., Qi, D., Mu, H. & Yang, J. An ensemble approach to protein fold classification by integration of template-based assignment and support vector machine classifier. Bioinformatics 33, 863–870 (2016)
[32] Vendruscolo, M., Dokholyan, N. V., Paci, E. & Karplus, M. Small-world view of the amino acids that play a key role in protein folding. Physical Review E 65, 061910 (2002)
[33] Amitai, G. et al. Network analysis of protein structures identifies functional residues. Journal of Molecular Biology 344, 1135–1146 (2004)
[34] Sol, A. D., Fujihashi, H., Amoros, D. & Nussinov, R. Residue centrality, functionally important residues, and active site shape: analysis of enzyme and non-enzyme families. Protein Science 15, 2120–2128 (2006)
[35] Vacic, V., Iakoucheva, L. M., Lonardi, S. & Radivojac, P. Graphlet kernels for prediction of functional residues in protein structures. Journal of Computational Biology 17, 55–72 (2010)
[36] Brysbaert, G., Mauri, T., de Ruyck, J. & Lensink, M. F. Identification of key residues in proteins through centrality analysis and flexibility prediction with RINspector. Current Protocols in Bioinformatics e66 (2018)
[37] Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Research 28, 235–242 (2000)
[38] Faisal, F. E. & Milenković, T. Dynamic networks reveal key players in aging. Bioinformatics 30, 1721 (2014)
[39] Clarke, T. F. & Clark, P. L. Rare codons cluster. PLoS ONE 3, e3412 (2008)
[40] Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. Journal of Molecular Biology 215, 403–410 (1990)
[41] Bork, P. et al. Predicting function: from genes to genomes and back. Journal of Molecular Biology 283, 707–725 (1998)
[42] Holm, L. & Sander, C. Removing near-neighbour redundancy from large protein sequence collections. Bioinformatics 14, 423–429 (1998)
[43] Sikic, K. & Carugo, O. Protein sequence redundancy reduction: comparison of various method. Bioinformation 5, 234 (2010)
[44] Milenković, T., Memišević, V., Bonato, A. & Pržulj, N. Dominating biological networks. PLoS ONE 6, e23016 (2011)
[45] Greene, L. H. et al. The CATH domain structure database: new protocols and classification levels give a more comprehensive resource for exploring evolution. Nucleic Acids Research 35, D291–D297 (2006)
[46] Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247, 536–540 (1995)
[47] Pržulj, N., Corneil, D. G. & Jurisica, I. Modeling interactome: scale-free or geometric? Bioinformatics 20, 3508–3515 (2004)
[48] Falcon, S. & Gentleman, R. Hypergeometric testing used for gene set enrichment analysis. In Bioconductor Case Studies, 207–220 (Springer, 2008)
[49] Feise, R. J. Do multiple outcome measures require p-value adjustment? BMC Medical Research Methodology 2, 8 (2002)
[50] Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 289–300 (1995)
