The tAI Formula

The tRNA Adaptation Index (tAI) was first described by dos Reis, Savva & Wernisch (2004). It measures how well a gene's codon usage matches the cellular tRNA pool and is used as a predictor of translational efficiency.

$$w_i = \sum_{j} (1 - s_{ij}) \times \mathrm{tGCN}_{ij}$$
Per-codon weight, summing over all tRNA isoacceptors

Here $s_{ij}$ is the wobble base-pairing penalty between codon $i$ and anticodon $j$, and $\mathrm{tGCN}_{ij}$ is the tRNA gene copy number for anticodon $j$. The product $(1 - s_{ij}) \times \mathrm{tGCN}_{ij}$ captures both the abundance and the pairing efficiency of each isoacceptor.

The final tAI of a gene is the geometric mean of the relative adaptiveness values, $w_i / \max_k(w_k)$, across all its codons.
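As a rough illustration of the two formulas above, the following Python sketch computes the weight of one codon from a list of (penalty, copy number) pairs and then the tAI of a short gene as a geometric mean. The function names, the input shapes, and the copy numbers are assumptions made for this example and are not taken from the tAIatlas code.

```python
import math

def codon_weight(isoacceptors):
    """w_i = sum over j of (1 - s_ij) * tGCN_ij.

    `isoacceptors` is a list of (s_ij, tGCN_ij) pairs, one per tRNA that can
    read the codon (hypothetical input shape chosen for this sketch).
    """
    return sum((1.0 - s) * tgcn for s, tgcn in isoacceptors)

def tai(codons, weights):
    """Geometric mean of w_i / max(w) over the codons of one gene.

    Assumes every weight is strictly positive; zero weights would need to be
    handled before taking logarithms.
    """
    w_max = max(weights.values())
    # Sum logs instead of multiplying, to avoid underflow on long genes.
    log_sum = sum(math.log(weights[c] / w_max) for c in codons)
    return math.exp(log_sum / len(codons))

# Toy example: a single tRNA with anticodon GGC present in 10 copies (made up).
weights = {
    "GCC": codon_weight([(0.0, 10)]),   # G:C Watson-Crick pair, no penalty
    "GCU": codon_weight([(0.41, 10)]),  # G:U wobble pair, penalty 0.41
}
gene_tai = tai(["GCC", "GCU"], weights)
```

In this toy case the relative adaptiveness values are 1.0 and 0.59, so the gene-level tAI is their geometric mean.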

Wobble Base-Pairing

Not all codon–anticodon interactions are equally efficient. The wobble position (3rd codon base ↔ 1st anticodon base) allows non-standard pairings:

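| Anticodon base | Codon base | Penalty (s_ij) | Interaction |
|---|---|---|---|
| A | U | 0 | Watson-Crick |
| C | G | 0 | Watson-Crick |
| G | C | 0 | Watson-Crick |
| U | A | 0 | Watson-Crick |
| G | U | 0.41 | Wobble G:U |
| I | A, C, U | 0.28 | Inosine wobble |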

Penalty values from dos Reis et al. (2004). Inosine (I) results from adenosine-to-inosine editing at the anticodon wobble position.
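The pairing rules listed in this section can be turned into a small lookup. The sketch below enumerates the codons a given anticodon can read, together with the penalty for each. It transcribes only the pairs listed here; the names `WOBBLE_PENALTY` and `readable_codons` are hypothetical, and sequences are assumed to be written 5'→3' in RNA letters.

```python
# Penalties keyed by (anticodon wobble base, codon third base), as listed above.
WOBBLE_PENALTY = {
    ("A", "U"): 0.0, ("C", "G"): 0.0, ("G", "C"): 0.0, ("U", "A"): 0.0,
    ("G", "U"): 0.41,
    ("I", "A"): 0.28, ("I", "C"): 0.28, ("I", "U"): 0.28,
}
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}

def readable_codons(anticodon):
    """Return {codon: s_ij} for every codon the anticodon can read.

    The first anticodon base (position 34) faces the third codon base, so only
    that position is allowed to wobble. 'I' is accepted only at position 34.
    """
    wobble, mid, last = anticodon  # anticodon positions 34, 35, 36
    # Codon positions 1 and 2 pair strictly with anticodon positions 36 and 35.
    prefix = COMPLEMENT[last] + COMPLEMENT[mid]
    return {
        prefix + codon_base: s
        for (anti_base, codon_base), s in WOBBLE_PENALTY.items()
        if anti_base == wobble
    }

ala_ggc = readable_codons("GGC")  # reads GCC (Watson-Crick) and GCU (G:U wobble)
```

Keeping the rules in one dictionary makes it straightforward to swap in a different penalty set for another domain of life.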

Domain-Specific Modifications

Because tRNA anticodons can decode multiple synonymous mRNA codons through non-Watson-Crick base pairing, translation efficiency calculations must account for the thermodynamic efficiency of specific wobble interactions. Critically, these interactions are heavily influenced by clade-specific post-transcriptional modifications:

  • Eukaryotes: Base modifications are modeled under the assumption of ubiquitous adenosine-to-inosine (A → I) editing by ADAT enzymes, so the generalized inosine decoding rules apply across all 8 ANN anticodon families.
  • Bacteria: A → I editing is mediated by TadA and is strictly limited to the arginine anticodon ACG. In addition, U34 hypermodifications (e.g., cmo5U, mnm5s2U) are penalized separately because of their different pairing kinetics.
  • Archaea: We account for domain-specific decoding, such as the agmatidine modification of tRNA(Ile) with anticodon CAU, which allows decoding of the AUA codon while preventing misreading of the methionine codon AUG.
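The three rules above can be summarized as a decision function for the anticodon wobble position. This is a minimal sketch under stated assumptions: the function name, the domain strings, the `isotype` argument, and the `"agm2C"` label are all hypothetical, and U34 hypermodifications are left out because this section gives no penalty values for them.

```python
def wobble_base(anticodon, domain, isotype=None):
    """Return the label modelled at anticodon position 34 after modification.

    `domain` is "eukaryote", "bacteria" or "archaea"; `isotype` is the amino
    acid carried by the tRNA. All names and labels are illustrative only.
    """
    first = anticodon[0]
    if first == "A":
        # Eukaryotes: every A34 tRNA is modelled as inosine (ADAT editing).
        if domain == "eukaryote":
            return "I"
        # Bacteria: TadA editing is limited to the arginine anticodon ACG.
        if domain == "bacteria" and anticodon == "ACG":
            return "I"
    # Archaea: tRNA(Ile) with anticodon CAU carries agmatidine, which reads
    # AUA but not AUG. "agm2C" is just a label used by this sketch.
    if domain == "archaea" and anticodon == "CAU" and isotype == "Ile":
        return "agm2C"
    # Otherwise the genomically encoded base is used unchanged.
    return first

bacterial_arg = wobble_base("ACG", "bacteria")  # "I"
bacterial_ala = wobble_base("AGC", "bacteria")  # "A": not a TadA substrate
```

A downstream weight calculation could then look up penalties by this label rather than by the raw genomic base.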

Our Pipeline

From genome assembly to per-codon tAI weights

1. Genome Retrieval

Download RefSeq genome assemblies from NCBI for all available species across the three domains of life. We target chromosome- or scaffold-level assemblies for comprehensive tRNA gene detection.

2. tRNA Gene Prediction

Run tRNAscan-SE 2.0 on each genome assembly to predict tRNA gene locations, anticodon types, and quality scores. tRNAscan-SE uses covariance models and secondary structure analysis.
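A prediction step of this kind might be scripted as follows. The sketch builds a tRNAscan-SE command line (using the `-E`, `-B`, `-A` search-mode flags and `-o` for the results file) and parses the tabular results. The column layout assumed here (three header lines, then tab-separated name, number, begin, end, type, anticodon, intron begin, intron end, score, note) should be checked against the output of your installed version; the sample rows and all Python names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class TrnaHit:
    seq_name: str
    begin: int
    end: int
    isotype: str
    anticodon: str  # DNA letters, as printed by tRNAscan-SE
    score: float
    note: str

def trnascan_command(fasta_path, out_path, domain):
    """Build (but do not run) a tRNAscan-SE command for one genome."""
    mode = {"eukaryote": "-E", "bacteria": "-B", "archaea": "-A"}[domain]
    return ["tRNAscan-SE", mode, "-o", out_path, fasta_path]

def parse_trnascan(text):
    """Parse tabular tRNAscan-SE results (assumed 3 header lines + rows)."""
    hits = []
    for line in text.splitlines()[3:]:
        fields = [f.strip() for f in line.split("\t")]
        if len(fields) < 9:
            continue  # skip blank or malformed lines
        note = fields[9] if len(fields) > 9 else ""
        hits.append(TrnaHit(fields[0], int(fields[2]), int(fields[3]),
                            fields[4], fields[5], float(fields[8]), note))
    return hits

# Made-up sample in the assumed layout.
SAMPLE = (
    "Sequence\t\ttRNA\tBounds\ttRNA\tAnti\tIntron Bounds\tInf\t\n"
    "Name\ttRNA #\tBegin\tEnd\tType\tCodon\tBegin\tEnd\tScore\tNote\n"
    "--------\t------\t-----\t------\t----\t-----\t-----\t----\t------\t------\n"
    "chr1 \t1\t1000\t1072\tAla\tAGC\t0\t0\t74.5\t\n"
    "chr1 \t2\t5000\t5071\tArg\tTCT\t0\t0\t32.1\tpseudo\n"
)
hits = parse_trnascan(SAMPLE)
```

The command list can be passed to `subprocess.run` once tRNAscan-SE is installed and on the `PATH`.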

3. Quality Filtering

Remove pseudogenes, low-confidence predictions (Infernal score < 50), and mitochondrial tRNAs. Only high-confidence nuclear tRNA genes are retained for computing the codon weights w_i.
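The filter described in this step could look roughly like the sketch below. It assumes each prediction is a dictionary with `seq_name`, `anticodon`, `score` and `note` keys, that pseudogenes are flagged by the word "pseudo" in the note, and that mitochondrial sequences are identified by name. These are illustrative assumptions, as are the sample records.

```python
from collections import Counter

MIN_SCORE = 50.0  # score threshold stated in this step

def keep_hit(hit, mito_seq_names):
    """True if a predicted tRNA gene passes all three filters."""
    if "pseudo" in hit["note"]:
        return False  # flagged as a pseudogene
    if hit["score"] < MIN_SCORE:
        return False  # low-confidence prediction
    if hit["seq_name"] in mito_seq_names:
        return False  # located on a mitochondrial sequence
    return True

def anticodon_counts(hits, mito_seq_names=frozenset()):
    """Count retained tRNA genes per anticodon (the tGCN values)."""
    return Counter(h["anticodon"] for h in hits if keep_hit(h, mito_seq_names))

# Made-up predictions for illustration.
hits = [
    {"seq_name": "chr1", "anticodon": "AGC", "score": 74.5, "note": ""},
    {"seq_name": "chr1", "anticodon": "AGC", "score": 61.0, "note": ""},
    {"seq_name": "chr2", "anticodon": "TCT", "score": 32.1, "note": "pseudo"},
    {"seq_name": "chr3", "anticodon": "CCA", "score": 45.0, "note": ""},
    {"seq_name": "chrM", "anticodon": "TGC", "score": 70.0, "note": ""},
]
counts = anticodon_counts(hits, mito_seq_names={"chrM"})
```

Only the two high-scoring nuclear AGC genes survive in this toy input.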

4. Wobble Optimization

Apply the dos Reis wobble base-pairing rules to compute w_i for each of the 61 sense codons. Because of wobble pairing, a single tRNA anticodon can contribute to the weights of several codons.
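To make this step concrete, the sketch below accumulates weights for all 61 sense codons from a table of anticodon copy numbers and then normalizes them by the largest weight. It assumes the standard genetic code (three stop codons), RNA letters with `I` marking an edited wobble base, and the penalty values quoted on this page; the copy numbers and function names are invented for the example.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code assumed
SENSE_CODONS = [c for c in ("".join(p) for p in product("UCAG", repeat=3))
                if c not in STOP_CODONS]

# (anticodon wobble base, codon third base) -> penalty s_ij
PENALTY = {
    ("A", "U"): 0.0, ("C", "G"): 0.0, ("G", "C"): 0.0, ("U", "A"): 0.0,
    ("G", "U"): 0.41,
    ("I", "A"): 0.28, ("I", "C"): 0.28, ("I", "U"): 0.28,
}
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}

def codon_weights(anticodon_counts):
    """Accumulate w_i over every anticodon that can read each sense codon."""
    w = dict.fromkeys(SENSE_CODONS, 0.0)
    for anticodon, copies in anticodon_counts.items():
        wobble, mid, last = anticodon
        prefix = COMPLEMENT[last] + COMPLEMENT[mid]
        for (anti_base, codon_base), s in PENALTY.items():
            codon = prefix + codon_base
            if anti_base == wobble and codon in w:
                w[codon] += (1.0 - s) * copies
    return w

def relative_adaptiveness(w):
    """Divide every weight by the largest weight."""
    w_max = max(w.values())
    return {codon: value / w_max for codon, value in w.items()}

# Made-up copy numbers for two alanine tRNAs.
w = codon_weights({"GGC": 10, "IGC": 5})
rel = relative_adaptiveness(w)
```

Codons that no listed anticodon can read keep a weight of zero here, so a real calculation needs a rule for such codons before taking a geometric mean.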

5. Dynamic QC & Capping

Detect and cap tRNA gene counts inflated by SINE retrotransposon contamination (frequent in mammalian genomes), using a strict interquartile range (IQR) saturation limit so that a few extreme counts do not produce numerical artifacts in the weights.
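One common way to build an IQR-based cap is an upper Tukey fence, sketched below. The exact rule is not spelled out in this step, so the multiplier `k = 1.5`, the use of the per-genome distribution of anticodon counts, and the function name `iqr_cap` are all assumptions of this example rather than a description of the tAIatlas algorithm.

```python
import statistics

def iqr_cap(counts, k=1.5):
    """Cap gene counts at Q3 + k * IQR of the per-anticodon distribution.

    Returns (capped counts, fence). Both k and the choice of distribution are
    assumptions made for this sketch.
    """
    values = sorted(counts.values())
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    capped = {anticodon: min(c, fence) for anticodon, c in counts.items()}
    return capped, fence

# Made-up counts: one anticodon family is grossly inflated.
counts = {"AGC": 5, "CGC": 4, "TGC": 6, "GCC": 7,
          "TCC": 5, "CCC": 6, "CTT": 8, "TTT": 250}
capped, fence = iqr_cap(counts)
```

For these toy counts the quartiles are 5 and 7.25, so the fence sits at 10.625 and only the inflated family is reduced.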

6. gtAI Optimization

Deploy domain-specific genetic algorithms to refine the classical dos Reis penalty assumptions (e.g. restricting inosine editing in bacteria to the TadA substrate, the arginine anticodon ACG). Final dataset: 8,722 species × 61 codons.
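For readers unfamiliar with genetic algorithms, the sketch below shows the general shape of such an optimizer: a population of candidate penalty vectors in [0, 1] is improved by selection, crossover and mutation. It is a generic illustration and not the gtAI or tAIatlas implementation. The fitness function is a placeholder (distance to a made-up target), and every name and parameter value is an assumption.

```python
import random

def evolve(fitness, n_params, pop_size=20, generations=30, sigma=0.05, seed=0):
    """Minimal genetic algorithm over vectors of penalties in [0, 1]."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        # Truncation selection: the better half survives unchanged (elitism).
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped so penalties stay within [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Placeholder objective: closeness to an arbitrary target vector. A real run
# would score each candidate by how well the resulting tAI values track an
# independent measure of expression or codon usage.
TARGET = [0.0, 0.41, 0.28]

def toy_fitness(candidate):
    return -sum((c - t) ** 2 for c, t in zip(candidate, TARGET))

best = evolve(toy_fitness, n_params=3)
```

A domain constraint such as the bacterial TadA rule could be expressed by holding the affected penalties fixed and letting the algorithm vary only the remaining ones.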

Quality Control & Validation

To verify our automated pipeline across the Tree of Life, we cross-validated our NCBI-derived datasets against the Genomic tRNA Database (GtRNAdb; Chan and Lowe, 2016), the gold-standard resource for curated tRNA annotations.
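A comparison of this kind can be as simple as checking per-anticodon gene counts against a reference table. The sketch below assumes both sources have already been reduced to `{anticodon: gene count}` dictionaries; the function name and the numbers are hypothetical and do not reflect actual GtRNAdb values.

```python
def compare_counts(pipeline, reference):
    """Compare two {anticodon: gene count} tables.

    Returns the fraction of anticodons with identical counts and a dictionary
    of mismatches as {anticodon: (pipeline count, reference count)}.
    """
    anticodons = sorted(set(pipeline) | set(reference))
    mismatches = {
        a: (pipeline.get(a, 0), reference.get(a, 0))
        for a in anticodons
        if pipeline.get(a, 0) != reference.get(a, 0)
    }
    agreement = 1.0 - len(mismatches) / len(anticodons) if anticodons else 1.0
    return agreement, mismatches

# Made-up counts for three alanine anticodons.
ours = {"AGC": 12, "CGC": 4, "TGC": 9}
curated = {"AGC": 12, "CGC": 5, "TGC": 9}
agreement, mismatches = compare_counts(ours, curated)
```

Listing the mismatching anticodons, rather than only a summary score, makes it easier to spot families inflated by repeat-derived predictions.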

Our analysis showed that classical, uncorrected pipelines routinely overestimate tRNA abundance in mammals because of SINE retrotransposon contamination (e.g. Alu/B2 elements), and miscalculate bacterial translation efficiency by applying eukaryotic A → I wobble rules to bacterial tRNAs.

tAIatlas corrects both problems by enforcing species-specific TadA constraints and by dynamically capping SINE inflation with an interquartile range (IQR) algorithm.

Data Sources

NCBI RefSeq

8,722 pan-domain genome assemblies.
ncbi.nlm.nih.gov/refseq

gtAI Architecture

Genetic-algorithm optimization of the wobble penalties (s_ij values) (Anwar et al. 2023).
doi.org/10.3389/fmolb.2023.1218518

tRNAscan-SE 2.0

tRNA gene prediction engine (filtered at >55 bits).
lowelab.ucsc.edu/tRNAscan-SE

Authorship & Citation

Authors

Derek L. Thompson (1,2) and William C. Ray (1,2,3)

  1. Biophysics Graduate Program, The Ohio State University, Columbus, OH, USA
  2. Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
  3. Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA

Citation

If you use tAIatlas in your research, please cite our NAR Database Issue manuscript or link directly to: https://taiatlas.org.

Key References

  1. dos Reis, M., Savva, R. and Wernisch, L. (2004) Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res., 32, 5036–5044.
  2. Chan, P.P. and Lowe, T.M. (2016) GtRNAdb 2.0: an expanded database of transfer RNA genes identified in complete and draft genomes. Nucleic Acids Res., 44, D184–D189.
  3. Yoon, J., Chung, Y.J. and Lee, M. (2018) STADIUM: Species-Specific tRNA Adaptive Index Compendium. Genomics Inform., 16, e28.
  4. Anwar, A.M., Khodary, S.M., Ahmed, E.A., Osama, A., Ezzeldin, S., Tanios, A., Mahgoub, S. and Magdeldin, S. (2023) gtAI: an improved species-specific tRNA adaptation index using the genetic algorithm. Front. Mol. Biosci., 10, 1218518.
  5. Nawrocki, E.P. and Eddy, S.R. (2013) Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics, 29, 2933–2935.
  6. Chan, P.P., Lin, B.Y., Mak, A.J. and Lowe, T.M. (2021) tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res., 49, 9077–9096.
  7. Wolf, J., Gerber, A.P. and Keller, W. (2002) tadA, an essential tRNA-specific adenosine deaminase from Escherichia coli. EMBO J., 21, 3841–3851.
  8. Suzuki, T. (2021) The expanding world of tRNA modifications and their disease relevance. Nat. Rev. Mol. Cell Biol., 22, 375–392.
  9. Grosjean, H., de Crécy-Lagard, V. and Marck, C. (2010) Deciphering synonymous codons in the three domains of life: co-evolution with specific tRNA modification enzymes. FEBS Lett., 584, 252–264.
  10. Novoa, E.M., Pavon-Eternod, M., Pan, T. and Ribas de Pouplana, L. (2012) A role for tRNA modifications in genome structure and codon usage. Cell, 149, 202–213.
  11. Mandal, D., Köhrer, C., Su, D., Russell, S.P., Krivos, K., Castleberry, C.M., Blum, P., Limbach, P.A., Söll, D. and RajBhandary, U.L. (2010) Agmatidine, a modified cytidine in the anticodon of archaeal tRNA(Ile), base pairs with adenosine but not with guanosine. Proc. Natl. Acad. Sci. U. S. A., 107, 2872–2877.

Open Source

The tAIatlas pipeline code and all data files are available on GitHub. Contributions welcome.
