Measuring the tolerance of the genetic code to altered codon size

Abstract

Translation using four-base codons occurs in both natural and synthetic systems. What constraints contributed to the universal adoption of a triplet codon, rather than quadruplet codon, genetic code? Here, we investigate the tolerance of the Escherichia coli genetic code to tRNA mutations that increase codon size. We found that tRNAs from all 20 canonical isoacceptor classes can be converted to functional quadruplet tRNAs (qtRNAs). Many of these selectively incorporate a single amino acid in response to a specified four-base codon, as confirmed with mass spectrometry. However, efficient quadruplet codon translation often requires multiple tRNA mutations. Moreover, while tRNAs were largely amenable to quadruplet conversion, only nine of the twenty aminoacyl tRNA synthetases tolerate quadruplet anticodons. These may constitute a functional and mutually orthogonal set, but one that sharply limits the chemical alphabet available to a nascent all-quadruplet code. Our results suggest that the triplet codon code was selected because it is simpler and sufficient, not because a quadruplet codon code is unachievable. These data provide a blueprint for synthetic biologists to deliberately engineer an all-quadruplet expanded genetic code.

Editor's evaluation

Using a phage-based library generation and selection, the authors generated a suite of 4-base decoding tRNAs with improved efficiency in quadruplet decoding. The data represent an important step toward enhancing protein synthesis with 4-base codons. Overall, the approach to generate many tRNA variants with quadruplet anticodons is intriguing and provides a wealth of valuable information to the field. The results should become foundational for the field of synthetic biology.

https://doi.org/10.7554/eLife.76941.sa0

Introduction

The genetic code is determined by a combination of tRNAs and aminoacyl tRNA synthetases (AARSs). Codons are dictated by the three bases in the center of the anticodon loop of each tRNA, which undergo Watson-Crick base pairing to an mRNA transcript during translation, enabling accurate codon recognition. The correspondence between codons and amino acids – one tRNA isoacceptor class for each canonical amino acid – is dictated by the 20 AARSs, which specifically recognize bases (identity elements) in the tRNAs and attach the cognate amino acid only to the CCA 3’ terminus of the cognate tRNA. The aminoacylation process is exquisitely accurate, enabling high-fidelity protein synthesis (Reynolds et al., 2010). Anticodon mutations frequently alter or abolish selective charging with the cognate amino acid because most AARSs rely on bases in the anticodon to identify the cognate tRNA (Giegé et al., 1998). However, certain natural anticodon mutations generate ‘suppressor’ tRNAs that insert their cognate amino acid in response to 5’-UAG-3’ stop codons, which pair with their 5’-CUA-3’ anticodon (Eggertsson and Söll, 1988). Frameshift suppression, in which quadruplet tRNAs (qtRNAs) ‘suppress’ a +1 frameshift mutation, can also arise (Riddle and Carbon, 1973; Roth, 1981); for example, qtRNA-Gly-GGGG can decode the four-base codon 5’-GGGG-3’ in mRNA transcripts using the 5’-CCCC-3’ anticodon. In the presence of efficiently aminoacylated qtRNAs, the ribosome is capable of translation with individual non-canonical stop or quadruplet codons within an otherwise all-triplet transcript (Dunkelmann et al., 2020; de la Torre and Chin, 2021). If individual quadruplet codon translation is known to arise through simple point insertions, what functional constraints, if any, prevent the natural or synthetic evolution of an all-quadruplet genetic code?

These origin-of-life questions have newfound importance to engineering with the advent of genetic code expansion technology. An expanded all-quadruplet genetic code would offer 256 total codons (de la Torre and Chin, 2021), including hundreds of free codons that could be assigned to non-canonical amino acids (ncAAs), valuable chemical additions to the genetic code that enable improved protein therapeutics (Hutchins et al., 2011). Indeed, the efficiency of quadruplet translation can be so high that as many as four unique quadruplets can be translated within one transcript to incorporate natural (DeBenedictis et al., 2021) or non-canonical (Dunkelmann et al., 2021) amino acids site-specifically. These inspiring results raise the question: is it possible to create an all-quadruplet genetic code? Such a code would presumably require the ability to incorporate the 20 canonical amino acids directed by quadruplet codons. Might the necessary translation components be easily derived from existing tRNAs and AARSs?

To investigate these questions, we present a comprehensive study of whether tRNAs in general (from all 20 isoacceptor classes) can decode diverse quadruplet codons in Escherichia coli (we examined 57 of the 256 possible quadruplet codons). We systematically explored whether quadruplet codon translation can arise through simple point insertions in each of the tRNA anticodon loops, or through mutation of many bases in the anticodon. We then used directed evolution to determine how often additional mutations throughout the tRNA can improve translation of the resulting qtRNAs (Figure 1). Finally, we characterized the fidelity of quadruplet codon translation. Remarkably, we found that 12/20 isoacceptor classes of tRNAs can be readily converted to selectively charged qtRNAs, as confirmed with mass spectrometry. The efficiency of quadruplet decoding is often low, but can frequently be improved by accumulating additional mutations along the sides of the anticodon loop. Most of the resulting qtRNAs selectively incorporate a single amino acid in response to a quadruplet codon. Our results identify some of the barriers limiting the adoption of quadruplet codons by natural evolution, and present 9/20 qtRNAs necessary to synthetically create an all-quadruplet expanded genetic code.

Figure 1

Evolution of quadruplet tRNAs.

We studied whether qtRNAs can arise through simple changes to the anticodon followed by additional mutations accumulated during evolution.

Results

Evolution of qtRNAs through point insertions

Many known examples of frameshift suppressors contain single base insertions in the anticodon that convert a triplet tRNA into a qtRNA. We initially tested whether tRNAs can evolve into qtRNAs through simple point insertions. We selected 21 endogenous E. coli tRNAs, one cognate tRNA for each canonical amino acid and the initiator methionine tRNA (Supplementary file 1, Materials and methods – ‘tRNA scaffold selection’), to serve as ‘scaffolds’ into which we introduced anticodon point insertions. We tested two point insertion locations: if the original tRNA decodes the triplet codon ‘XYZ’, we created a qtRNA that decodes ‘XYZZ’ and a qtRNA that decodes ‘XYYZ’ (Figure 2A). These anticodon patterns preserve the nature of the bases in the anticodon loop, which most AARSs use to recognize the cognate tRNA (Giegé et al., 1998), and are found in known qtRNAs, such as sufD (Riddle and Carbon, 1973) (qtRNA^Gly_GGGG) and sufG (O’Connor, 2002) (qtRNA^Gln_CAAA). Throughout this paper, for ease of comparison to standard triplet codon tables, we use qtRNA^{three letter scaffold}_{four letter DNA codon} nomenclature to refer to qtRNAs; for example, a serine qtRNA bearing a 5’-UCUA-3’ anticodon is referred to as qtRNA^Ser_TAGA.
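The two insertion patterns and the naming convention described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the study: the function names are hypothetical, codons are written 5'→3' in the DNA alphabet (as in the qtRNA names), and anticodons are derived by strict Watson-Crick pairing, ignoring wobble pairing and base modifications.

```python
# Illustrative sketch; function names are hypothetical, not from the study.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def insertion_codons(triplet):
    """Return the (XYZZ, XYYZ) quadruplet codons for a triplet codon XYZ."""
    x, y, z = triplet
    return x + y + z + z, x + y + y + z


def anticodon_for(dna_codon):
    """Return the Watson-Crick anticodon (RNA, 5'->3') for a DNA-alphabet codon."""
    rna_codon = dna_codon.upper().replace("T", "U")
    # Pairing is antiparallel, so complement each base and reverse the order.
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna_codon))


def qtrna_name(scaffold, dna_codon):
    """Format a name in the qtRNA^{scaffold}_{codon} style used in the text."""
    return f"qtRNA^{scaffold}_{dna_codon}"


print(insertion_codons("CAG"))    # ('CAGG', 'CAAG')
print(anticodon_for("TAGA"))      # UCUA
print(qtrna_name("Ser", "TAGA"))  # qtRNA^Ser_TAGA
```

The second call reproduces the example from the text: the codon TAGA corresponds to a 5’-UCUA-3’ anticodon.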

Figure 2

Engineering qtRNAs with codon patterns.

(A) We measured quadruplet tRNAs (qtRNAs) that might arise through point insertions in the anticodon loop.

Each qtRNA is based upon a tRNA from the Escherichia coli genome that serves as a ‘scaffold’ (Supplementary file 1). We tested two quadruplet codon patterns: a tRNA decoding the triplet codon ‘XYZ’ to a qtRNA decoding the quadruplet codon ‘XYZZ’ or ‘XYYZ’. In instances in which XYZZ and XYYZ are the same, the qtRNA is depicted on the XYZZ graph. We use ‘qtRNA’-‘three letter scaffold’-‘four letter codon’ nomenclature to refer to qtRNAs; for example, a serine qtRNA bearing a 5’-UCUA-3’ anticodon that recognizes 5’-UAGA-3’ in mRNA transcripts is referred to as qtRNA^Ser_TAGA. (B) We measured qtRNAs using a luciferase readthrough assay. Measurements are taken kinetically and normalized to culture density, and efficiency is reported relative to luminescence produced by a wildtype (WT), all-triplet luciferase transcript. qtRNAs that are statistically >0 are annotated with their one-sample t-test p-value: 0.033 (*), 0.0021 (**), 0.0002 (***), <0.0001 (****). fMet qtRNAs are measured with a luciferase reporter bearing a quadruplet codon at residue 1; all others are measured with a quadruplet codon at residue 357 of luxAB. (C) Expression of qtRNAs can be toxic. Here, we report the fractional OD₆₀₀ density difference between cultures where qtRNA expression had been induced versus suppressed. Data in (B and C) represent the mean and standard deviation of three to eight technical replicates in one biological replicate. For raw data, see Figure 2—source data 1.

Figure 2—source data 1 Luminescence and growth metric of XYZZ and XYYZ qtRNAs.: https://cdn.elifesciences.org/articles/76941/elife-76941-fig2-data1-v2.csv

We used two techniques to characterize these qtRNAs. First, to measure the quadruplet codon translation efficiency, we used a luciferase readthrough assay (Materials and methods – ‘Luciferase readthrough assay’). This reporter contains a single quadruplet codon at permissive residue 357 of luxAB (DeBenedictis et al., 2021); failure to decode the quadruplet codon leads to premature termination, whereas successful four-base decoding results in full-length luxAB translation and luminescence (Figure 2B). Seven of the twenty ‘XYZZ’-decoding qtRNAs (qtRNA^Arg_CGTT, qtRNA^Gln_CAGG, qtRNA^Gly_GGGG, qtRNA^fMet_ATGG, qtRNA^Pro_CCGG, qtRNA^Ser_TCGG, qtRNA^Thr_ACGG) and two of the ‘XYYZ’-decoding qtRNAs (qtRNA^Gln_CAAG, qtRNA^Pro_CCCG) functionally decode a quadruplet codon during translation. The frequent functionality of the XYZZ codon pattern may be due to flexibility in synthetase recognition at the third position of the codon, which is frequently a wobble base pair. Next, we quantified the toxicity of qtRNA expression by comparing the growth defect (Materials and methods – ‘Growth defect’) of cultures with qtRNA expression induced or suppressed (Figure 2C). The qtRNAs fall into several categories: those that exhibit no fitness defect, such as the naturally occurring and highly functional qtRNA^Gly_GGGG; those that exhibit severe fitness defects and effectively halt bacterial growth upon induction, such as qtRNA^Ala_GCCC; and those that moderately slow bacterial growth. The translation efficiency and growth defect of qtRNAs depend upon more than just the interaction with the cognate AARS: AlaRS, LeuRS, and SerRS are all tRNA synthetases that do not interact with the anticodon loop of their cognate tRNA, yet qtRNAs derived from these scaffolds exhibit a range of behaviors depending upon the new anticodon. For example, qtRNA^Ser_TCCG exhibits a high growth defect, while qtRNA^Ser_TCGG exhibits no growth defect and modest quadruplet translation efficiency. Together, these data show that about a third of the 20 isoacceptor classes have access to single base insertions that enable modestly functional quadruplet codon translation. However, other point insertions create qtRNAs that do not functionally decode quadruplet codons or that incur large growth defects when expressed.
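The two measurements described above reduce to simple ratios, which can be sketched as follows. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the numbers are made up, and the exact formulas (for instance, whether background luminescence is subtracted, or how kinetic reads are summarized) are assumptions; the Materials and methods give the actual procedure.

```python
# Illustrative sketch; names, formulas, and numbers are assumptions.
from statistics import mean


def readthrough_efficiency(lum, od, wt_lum, wt_od):
    """Percent efficiency: density-normalized luminescence relative to WT."""
    sample = mean(l / d for l, d in zip(lum, od))
    wildtype = mean(l / d for l, d in zip(wt_lum, wt_od))
    return 100.0 * sample / wildtype


def growth_defect(od_induced, od_suppressed):
    """Fractional OD600 loss when qtRNA expression is induced vs. suppressed."""
    return (mean(od_suppressed) - mean(od_induced)) / mean(od_suppressed)


# Made-up replicate values for a single hypothetical qtRNA.
efficiency = readthrough_efficiency(
    lum=[30.0, 30.0], od=[0.5, 0.5],
    wt_lum=[2000.0, 2000.0], wt_od=[1.0, 1.0],
)
defect = growth_defect(od_induced=[0.5, 0.5], od_suppressed=[1.0, 1.0])
print(efficiency, defect)
```

With these made-up values the qtRNA would score 3% of wildtype luminescence and a fractional growth defect of 0.5.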

Evolution of qtRNAs through anticodon replacement

Next, we tested whether tRNAs could evolve into qtRNAs through whole anticodon replacement, as might occur during recombination or more intense mutagenesis. Antibiotic selection markers have previously been used to identify functional qtRNAs (Magliery et al., 2001). We applied an equivalent approach based on the use of an M13 bacteriophage tail fiber pIII as a selection marker (DeBenedictis et al., 2022). In this selection scheme, a qtRNA is encoded on the genome of a ΔpIII M13 bacteriophage. Phage are challenged to infect bacteria bearing a plasmid that encodes pIII containing a quadruplet codon at permissive residue 29 (Bryson et al., 2017). Functional qtRNAs are capable of producing full-length pIII and thus phage progeny, while nonfunctional qtRNAs result in production of truncated pIII and thus no further phage.

For each of the 20 representative E. coli tRNA scaffolds used above, we created a 256-member qtRNA library containing degenerate anticodons (Figure 3A, Materials and methods – ‘Phage library primer design’, ‘Phage library cloning’). We selected eight quadruplet codons of interest, focusing on codons for which at least one functional qtRNA was already known that could act as a positive control. We selected for functional qtRNAs from these libraries (Figure 3B, Materials and methods – ‘pIII-based selection of NNNN anticodon libraries’) to identify qtRNAs that decode the eight quadruplet codons of interest (Figure 3C). High final phage titers after selection indicate the presence of a functional qtRNA, and we selected 69 putative qtRNAs that exhibit high final phage titer for further characterization (Figure 3D). We used plaque assays (Materials and methods – ‘Phage plaque assays’) to isolate clonal phage variants and determined the anticodon of each highly selected variant with Sanger sequencing. We found that for most variants, the qtRNA agrees with the quadruplet codon in the reporter at all four positions, or at the first three positions of the codon (Figure 3E), in agreement with previous findings on quadruplet codon crosstalk with fourth-base mismatches (Anderson et al., 2002). We found that qtRNAs identified by this phage-based assay are also functional in the luciferase assay. These results demonstrate that most tRNA scaffolds are capable of supporting quadruplet codon translation through whole anticodon replacement.
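The library size and the way a sequenced anticodon is compared with the reporter codon can be illustrated with a short Python sketch. It is a hypothetical illustration rather than the study's analysis code: function names are invented, sequences are written 5'→3' in the DNA alphabet as they would appear in a Sanger read, and only strict Watson-Crick pairing is considered.

```python
# Illustrative sketch; function names are hypothetical, not from the study.
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nnnn_library():
    """Enumerate every four-base (NNNN) anticodon."""
    return ["".join(p) for p in product(BASES, repeat=4)]


def decoded_codon(anticodon):
    """Codon (5'->3') that pairs with an anticodon (5'->3') by Watson-Crick rules."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon))


def classify_match(anticodon, reporter_codon):
    """Compare the codon an anticodon should decode with the reporter codon."""
    codon = decoded_codon(anticodon)
    if codon == reporter_codon:
        return "all four positions"
    if codon[:3] == reporter_codon[:3]:
        return "first three positions"
    return "mismatch"


print(len(nnnn_library()))             # 256
print(20 * 8)                          # 160 library-by-reporter selections
print(classify_match("TCTA", "TAGA"))  # all four positions
print(classify_match("CCTA", "TAGA"))  # first three positions
```

Because pairing is antiparallel, a mismatch at the fourth codon base corresponds to the first (5') base of the anticodon, as in the last example.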

Figure 3 with 1 supplement

Selection for functional qtRNAs.

(A) We created quadruplet tRNA (qtRNA) libraries using degenerate primers to randomize the four bases in the anticodon. The qtRNA library is expressed from a ΔpIII M13 bacteriophage. (B) We selected these libraries by challenging the phage to infect and propagate in bacteria that require a quadruplet codon to be translated in order to produce a functional version of the essential phage gene, pIII. Plaques from selected libraries were Sanger sequenced. (C) We crossed each library with each pIII-based reporter for a total of 160 separate library selections. (D) The log fold change in phage titer when comparing the post-selection population to the pre-selection population. Data represent the mean in one biological replicate. For raw data, see Figure 3—source data 1. (E) For every filled square, we sequenced two plaques in order to determine the anticodon identity. Results include instances in which the qtRNAs match the reporter at all four positions, or match the first three bases of the codon (shades of blue). Additionally, we show instances in which qtRNAs were discovered that suppress a different quadruplet codon near residue 29. (F) Although we intend for the phage to suppress the quadruplet codon located at permissive residue 29 of pIII (Bryson et al., 2017), in several instances the selection identified qtRNAs that suppress a nearby quadruplet codon instead.

Figure 3—source data 1 Log fold change in phage titer.: https://cdn.elifesciences.org/articles/76941/elife-76941-fig3-data1-v2.csv

In addition to the expected anticodons, we found that some qtRNAs instead matched quadruplet codons that appear nearby in the sequence context of the reporter (Figure 3F). In these cases, qtRNAs were validated using this novel codon in position 357 of the luciferase reporter, confirming that they decode the novel codon in two different sequence contexts. We noticed that the novel codons often bear similarity to the qtRNA’s original codon; for example, qtRNA^Lys_AGAA is highly similar to the scaffold’s original AAA codon, and qtRNA^Tyr_TACA and qtRNA^Phe_TACA are similar to TAC and TTC, respectively. The emergence of these anticodons suggests that these isoacceptor classes favor quadruplet codons that are related to their natural triplet codon, and demonstrates that qtRNA evolution depends upon the sequence context of relevant ORFs.

Together, these experiments identified functional qtRNAs involving four or fewer mutations for 18/20 isoacceptor classes. For the remaining two isoacceptor classes, Met and Asn, we systematically tested additional codons and found that qtRNA^Met_AGGG and qtRNA^Asn_AGGA both exhibit weak quadruplet codon translation (Figure 3—figure supplement 1). Therefore, every isoacceptor class can give rise to qtRNAs capable of decoding quadruplet codons during protein translation.

Directed evolution of qtRNAs

Having found that qtRNAs that functionally decode quadruplet codons can arise generally through just a few mutations, we sought to understand other factors that may prevent more widespread use of quadruplet codons. Many qtRNAs were quite inefficient in translation: the presence of a single quadruplet codon in an mRNA transcript can reduce total protein yield to less than 3% relative to an all-triplet mRNA (Figure 2). Mutations at the anticodon loop sides of the qtRNAs have been observed to improve translation efficiency for TAGA-qtRNAs (DeBenedictis et al., 2021; DeBenedictis et al., 2022; Niu et al., 2013). Triplet tRNAs are known to exhibit patterns that relate the bases in the anticodon loop sides to the bases in the anticodon itself (Yarus, 1982), and similarly benefit from anticodon loop side mutations after anticodon replacement (Kleina et al., 1990; Raftery and Yarus, 1987; Cervettini et al., 2020). In some cases, mutations in this area can alter qtRNA charging; in others they improve quadruplet translation efficiency without altering the qtRNA’s interaction with the cognate AARS (DeBenedictis et al., 2021). We hypothesized that qtRNAs in general require mutations at bases 32, 37, and 38 to better accommodate a new codon, and that this requirement may present a key barrier preventing the natural evolution of efficient qtRNAs.

To test this hypothesis experimentally, we selected 41 functional qtRNAs, including at least one qtRNA for every unique scaffold, and cloned a library containing degenerate nucleotides at bases 32, 37, and 38 (Figure 4A, Materials and methods – ‘Phage library primer design’, ‘Phage library cloning’). We selected functional members of these libraries using the pIII-based selection and used next-generation sequencing (NGS) to characterize the abundance of each library member before and after selection (Figure 4B, Materials and methods – ‘pIII-based selection and NGS of libraries diversified at positions 32, 37, and 38’).

Figure 4

Directed evolution of anticodon loop sides.

(A) We created quadruplet tRNA (qtRNA) libraries using degenerate primers to randomize positions 32, 37, and 38 of the anticodon loop. The qtRNA library is expressed from a ΔpIII M13 bacteriophage. (B) We selected these libraries by challenging the phage to infect and propagate in bacteria that require a quadruplet codon to be translated in order to produce a functional version of the essential phage gene, pIII. Libraries were sequenced by next-generation sequencing (NGS) to >10× library size before and after selection. (C) tRNA^Ser_TCG is known to be a scaffold for the functional qtRNA^Ser_TAGA after anticodon replacement alone. Additional mutations to the sides of the anticodon loop are known to improve quadruplet codon translation efficiency. (D) Log fold enrichment of the population abundance and (E) translation efficiency as measured by a luciferase readthrough assay of the 64 possible combinations of nucleotides at positions 32, 37, and 38. (F) Log fold enrichment of all 41 qtRNA libraries for each of the 64 library members. Libraries that exhibit abundance changes during selection (above) are separated from those that exhibit no significant abundance changes (below). The anticodon loop sides present in the wildtype (WT) tRNA scaffold are boxed in gold. For raw data, see Figure 4—source data 1. (G) For each library, the trend in nucleotide preference for each position is listed. (H) Nucleotide preferences for select libraries were measured by cloning a qtRNA variant and measuring it using a luciferase readthrough assay. The fold improvement in activity over the WT values of bases 32, 37, and 38 is listed. In (E and F), the original identities of bases 32, 37, and 38 found in the WT triplet tRNA scaffold are boxed in gold.

Figure 4—source data 1 Log fold enrichment of anticodon loop side libraries.: https://cdn.elifesciences.org/articles/76941/elife-76941-fig4-data1-v2.csv

We began by assessing results for the well-studied qtRNA^Ser_TAGA, which is known to have an improved variant, qtRNA^Ser_TAGA-32A-38C, that exhibits improved quadruplet codon translation but unaltered, selective aminoacylation with serine (DeBenedictis et al., 2021; Figure 4C). Of the 64 possible combinations of DNA bases at positions 32, 37, and 38, the single library member A32 A37 C38 is enriched four log fold above all other variants (Figure 4D). We measured several qtRNA^Ser_TAGA variants that correspond to different levels of enrichment using a luciferase readthrough assay, and confirmed that the strongly enriched variant exhibits more efficient quadruplet decoding than deenriched variants (Figure 4E).
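The enrichment calculation for a 64-member loop-side library can be sketched as below. This is a minimal sketch, not the study's NGS pipeline: the function names are hypothetical, the read counts are made up, and both the base-10 logarithm and the pseudocount (added so that variants absent after selection do not produce a division by zero) are assumptions. Each variant is written as the bases at positions 32, 37, and 38, so "AAC" stands for A32 A37 C38.

```python
# Illustrative sketch; names, pseudocount, log base, and counts are assumptions.
from itertools import product
from math import log10


def loop_side_variants():
    """All 64 base combinations at positions 32, 37, and 38."""
    return ["".join(p) for p in product("ACGT", repeat=3)]


def log_fold_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """log10 of (post-selection frequency / pre-selection frequency) per variant."""
    variants = loop_side_variants()
    pre = {v: pre_counts.get(v, 0) + pseudocount for v in variants}
    post = {v: post_counts.get(v, 0) + pseudocount for v in variants}
    pre_total = sum(pre.values())
    post_total = sum(post.values())
    return {
        v: log10((post[v] / post_total) / (pre[v] / pre_total))
        for v in variants
    }


# Made-up counts: a uniform starting library in which one variant takes over.
pre_counts = {v: 100 for v in loop_side_variants()}
post_counts = {"AAC": 10000}
enrichment = log_fold_enrichment(pre_counts, post_counts)
top_variant = max(enrichment, key=enrichment.get)
print(top_variant)  # AAC
```

Working with frequencies rather than raw counts keeps the comparison independent of sequencing depth, which typically differs between the pre- and post-selection samples.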

Next, we applied the same procedure to quantify fold enrichment for the other 40 qtRNA libraries (Figure 4F). We identified 15 libraries that exhibit no selective pressure, 12 libraries that strongly enrich a single library member, like qtRNA^Ser_TAGA, and an additional 11 libraries that exhibit strong enrichment for library members with a specific base at one or two positions, but not all three (Figure 4G). For several of these libraries, we used a luciferase readthrough assay to measure the fold change in activity when mutations to bases 32, 37, and 38 are introduced. In most cases, introduction of these mutations substantially increases quadruplet codon translation efficiency (Figure 4H).

We were curious whether there are overall trends in the identity of optimal anticodon loop sides for efficient quadruplet codon translation. The optimal library member is not determined by the scaffold or codon independently, indicating that the mechanism by which these mutations improve quadruplet codon translation does not improve the qtRNA’s interaction with its respective AARS. Among the libraries there was a prominent preference for A37, a base known to be associated with reading frame maintenance (Agris, 2004). Additionally, C32 A37 T38 and T32 A37 G38 appear often among libraries that exhibit strong preference for a single library member. The presence of modified nucleotides is especially important at two sites: position 34, the first base of the anticodon, and position 37, the nucleotide downstream of the anticodon (Agris, 2004; Grosjean and Westhof, 2016). The location of ‘identity elements’ for some of these modifying enzymes within the anticodon loop sides has been implied (Grosjean and Westhof, 2016). Mutation of these bases may improve the RNA modification of the anticodon loop, which is essential for many tRNA functions (Edwards et al., 2020).

Which amino acid(s) do qtRNAs incorporate during translation?

Having established that anticodon loop changes can often produce tRNAs that decode four-base codons, we sought to determine which amino acid these qtRNAs incorporate during translation. To do so, we translated sfGFP mRNA containing a quadruplet codon at permissive residue 151 (Young et al., 2010) in the presence of a qtRNA. We then used mass spectral analysis to determine the nature and occupancy of amino acid 151 in the resulting protein (Cervettini et al., 2020) for qtRNAs based on the 20 distinct tRNA scaffolds (Figure 5, Materials and methods – ‘Quantification of qtRNA charging using mass spectrometry’, Supplementary file 3).

Figure 5

Characterization of amino acid incorporation by quadruplet tRNAs (qtRNAs).

(A) We characterized the amino acid incorporated during translation by co-expressing a qtRNA and an sfGFP-151-quad transcript, purifying the resulting GFP, and analyzing the occupancy of residue 151 using mass spectrometry (Supplementary file 3). (B) Results of applying this pipeline to at least one qtRNA based on each of the 20 canonical scaffolds (Materials and methods – ‘Quantification of qtRNA charging using mass spectrometry’). Charging of (*) qtRNA^Arg_TAGA and qtRNA^Tyr_TAGA has been previously reported (DeBenedictis et al., 2021). Data represent the mean of one biological replicate. For some qtRNAs, yield was too low to allow for charging characterization even when purified at 1 L scale (Supplementary file 2). Raw spectra have been deposited in the PRIDE database (Perez-Riverol et al., 2022), dataset identifier PXD031925 and 10.6019/PXD031925.

We were unable to characterize the acylation properties of six qtRNAs (qtRNA^Asn, qtRNA^Cys, qtRNA^His, qtRNA^Leu, qtRNA^Lys, and qtRNA^Thr) due to low sfGFP purification yield, even when purified at 1 L scale (Supplementary file 2). These qtRNAs would be unlikely to be selected in a natural system due to their inability to produce full-length protein efficiently. Eight qtRNAs are selectively acylated with the amino acid cognate to the original scaffold (qtRNA^Arg, qtRNA^Gln, qtRNA^Glu, qtRNA^Gly, qtRNA^Phe, qtRNA^Pro, qtRNA^Ser, qtRNA^Tyr). Four qtRNAs selectively incorporate Arg, rather than the amino acid cognate to the scaffold (qtRNA^Asp, qtRNA^Ile, qtRNA^Trp, qtRNA^Val). Two qtRNAs are charged with the cognate amino acid as well as promiscuously charged with Arg (qtRNA^Ala and qtRNA^Met). All 14 of these qtRNAs might be selected by a natural system to rescue a frameshift mutation. The 12 qtRNAs that are selectively charged might be further selected as components of an all-quadruplet genetic code.

These data highlight the plasticity of AARS recognition for altered codons, both in nucleobase composition and size. The presence of a quadruplet anticodon presumably distorts the structure of the anticodon binding domain, which is a major identity element for many AARSs (Giegé et al., 1998). How much distortion will the enzyme functionally accept, and/or will recognition of the additional base lead to mischarging with a different amino acid? Of the eight qtRNAs that retained the cognate specificity, tRNA^Gly and tRNA^Pro are known as naturally occurring functional frameshift suppressors (Riddle and Carbon, 1973; Sroga et al., 1992); we confirm that they are selectively charged by the cognate AARS. Together with tRNA^Gln, our results show that E. coli GlnRS, GlyRS, and ProRS recognize the first three bases of the quadruplet anticodon, and are capable of charging qtRNAs despite the increased anticodon size. Correct charging of qtRNAs derived from tRNA^Ser is expected, as E. coli SerRS does not interact with the anticodon (Biou et al., 1994). A striking result is the correct charging of qtRNA^Glu_CGGT and qtRNA^Phe_CGGC by their respective E. coli tRNA synthetases, as their triplet anticodon sequence is unlike that of the quadruplet anticodon. However, in both cases all other known critical identity elements (Giegé et al., 1998) are present in the tRNA scaffold. This suggests that the major recognition feature, the anticodon, can be outweighed by the sum of the other identity elements, creating in some cases an avenue for anticodon evolution.

Finally, why did we observe acylation of multiple qtRNAs with arginine? ArgRS is responsible for synthesis of a family of Arg-tRNAs needed for the recognition of six codons, including tRNAs that differ at both the first and last position of the anticodon. As a consequence, the identity of just one anticodon position is invariant (C35). Examination of the qtRNAs charged with Arg (Figure 5B) shows that all satisfy this anticodon identity element, causing widespread promiscuous charging with Arg, even for qtRNAs that lack other known ArgRS identity elements such as A20, which is absent in qtRNA^Asp_CGGC and qtRNA^Val_CGGC. For this reason, the presence of promiscuous ArgRS substantially increases the probability that tRNA point insertions in diverse scaffolds will result in an aminoacylated qtRNA.

Taken together, we found that in the majority of cases, qtRNAs exhibit properties that would render them evolutionarily favored building blocks: they are selectively charged by a single amino acid and incorporate that amino acid in response to their quadruplet codon.

Compiled trends in nascent qtRNA evolution

In total, we characterized 109 different qtRNAs based on 20 tRNA scaffolds that decode 20 unique quadruplet codons (Figure 6A). This greatly expands the total number of known qtRNAs, the diversity of triplet tRNA scaffolds they are based upon, and the diversity of quadruplet codons they recognize beyond what has previously been reported. We found that 60 out of 109 are functional, that is, they generate increased luminescence upon induction of qtRNA expression. In total, every tRNA scaffold we tested is capable of supporting quadruplet codon translation given an appropriate four-base codon choice (i.e., in Figure 6A each column has at least one filled square).

Figure 6

Compiled qtRNA measurements.

(A) Compiled results of quantifying quadruplet tRNA (qtRNA) translation efficiency (Materials and methods, ‘Luciferase readthrough assay’) and charging (Materials and methods, ‘Quantification of qtRNA charging using mass spectrometry’). The qtRNA scaffold (columns) and codon (rows) are indicated. The translation efficiency is indicated on each qtRNA that could be measured, or marked as ‘nonfunctional’, indicating that the qtRNA exhibits too strong a growth defect to measure, or does not produce a measurable increase in luminescence upon induction of qtRNA expression. Translation efficiency is measured as a percent of all-triplet translation (Materials and methods, ‘Luciferase readthrough assay’). fMet qtRNAs are measured with a luciferase reporter bearing a quadruplet codon at residue 1; all others are measured with a quadruplet codon at residue 357 of luxAB. Charging results are indicated on the table for qtRNAs derived from each scaffold-codon pair; the measured qtRNAs may contain additional mutations; see Supplementary files 2 and 3. Data represent the mean of three to eight technical replicates in each of between one and eight biological replicates. For complete compiled raw data, see Figure 6—source data 1. (B) A miniature all-quadruplet genetic code. For each of the 10 qtRNAs (rows), we measured readthrough of a luciferase transcript containing the indicated quadruplet codon at residue 357 (columns). For raw data, see Figure 6—source data 2.

Figure 6—source data 1 Luminescence measurements of qtRNAs.: https://cdn.elifesciences.org/articles/76941/elife-76941-fig6-data1-v2.csv
Figure 6—source data 2 Luminescence measurements of qtRNA crosstalk.: https://cdn.elifesciences.org/articles/76941/elife-76941-fig6-data2-v2.csv

These compiled trends can be interpreted in the context of known AARS identity elements (Giegé et al., 1998). Most prominently, we tested several qtRNAs that compete with the arginine codon AGG and found 18/20 to be functional. Further, we characterized the amino acid incorporated during translation and found that in 4/4 cases arginine is the dominant amino acid inserted. Together, these data indicate that the anticodon is a strong identity element for ArgRS, which charges tRNAs with arginine-like anticodons regardless of matching of other putative identity elements like A20 or A/G73. We also found that several isoacceptor classes were especially resistant to accepting a quadruplet anticodon. For example, 9/10 qtRNA^Asn species tested were nonfunctional. Only qtRNA^Asn_AGGA exhibits weak activity and is likely charged with arginine. This suggests that AsnRS does not tolerate expanded anticodons. Similarly, qtRNA^Ile and qtRNA^Met only tolerate arginine-like anticodons. Interestingly, we also found that certain isoacceptor classes do not tolerate new anticodons even when this would not be predicted from known identity elements. For example, qtRNA^Ser_TCCG and qtRNA^Ser_GCAT are both nonfunctional, even though SerRS does not interact with the anticodon loop of its cognate tRNA. The function of these qtRNAs may instead be governed by interaction with other enzymes that add chemical modifications.

To be used as part of a genetic code, qtRNAs must be capable of operating together with minimal crosstalk. We selected 10 qtRNAs and measured crosstalk between members of this set, revealing a high degree of orthogonality: each qtRNA is specific for its own four-base codon (Figure 6B).
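One way to make this notion of orthogonality concrete is to treat the crosstalk measurements as a matrix, with qtRNAs as rows and reporter codons as columns, and to compare each on-target value with the largest off-target value in the same row. The sketch below does this for a toy three-member set. The numbers are invented for illustration and are not taken from Figure 6—source data 2; the function name `specificity_ratios` and the use of an on-target/off-target ratio as the summary statistic are assumptions, not the analysis used in this study.

```python
def specificity_ratios(readthrough):
    """Return, per qtRNA, on-target readthrough divided by the largest off-target readthrough.

    `readthrough[qtrna_codon][reporter_codon]` holds a readthrough value for the
    qtRNA that targets `qtrna_codon`, measured on a reporter bearing `reporter_codon`.
    """
    ratios = {}
    for own_codon, row in readthrough.items():
        on_target = row[own_codon]
        off_target = max(v for codon, v in row.items() if codon != own_codon)
        # A qtRNA with no detectable off-target signal is reported as infinitely specific.
        ratios[own_codon] = on_target / off_target if off_target > 0 else float("inf")
    return ratios


# Hypothetical values (arbitrary units); NOT data from this study.
toy = {
    "CAGG": {"CAGG": 8.0, "CCGG": 0.2, "GGGG": 0.1},
    "CCGG": {"CAGG": 0.1, "CCGG": 5.0, "GGGG": 0.25},
    "GGGG": {"CAGG": 0.4, "CCGG": 0.2, "GGGG": 4.0},
}

for codon, ratio in specificity_ratios(toy).items():
    print(codon, round(ratio, 1))
```

A ratio well above 1 for every row corresponds to the pattern described above, in which each qtRNA reads its own codon far more strongly than any other codon in the set.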

Discussion

We present a comprehensive study of whether tRNAs can be altered to functionally decode quadruplet codons. Our results show that quadruplet decoding is a remarkably universal phenomenon: every tRNA we tested is capable of exhibiting weak quadruplet decoding activity with four or fewer mutations. Additionally, weak quadruplet decoding requires very few mutations in many cases. For example, a single point insertion in the anticodon is sufficient to generate functional qtRNAs for 7/20 isoacceptor classes.

The tRNA anticodon is a major recognition element for many tRNA synthetases (Giegé et al., 1998). Thus, we were surprised to find nine different AARSs (ArgRS, GluRS, GlnRS, GlyRS, HisRS, PheRS, ProRS, SerRS, TyrRS) that selectively charged the cognate qtRNA, creating the opportunity for all-quadruplet peptide translation that is diverse in both amino acids and codons. The diversity of codons that can be recognized by qtRNAs can easily enable codon-selective translation, as we demonstrate with a miniature all-quadruplet genetic code composed of 10 orthogonal qtRNAs based on unique isoacceptor classes. Together, these results demonstrate that qtRNAs capable of amino acid-selective efficient translation of quadruplet codons can arise during evolution.

However, our study identifies several factors that may have prevented universal adoption of an all-quadruplet code. First, we find that some qtRNAs exhibit strong fitness defects that would prevent them from being maintained and accumulated in genomes. Second, additional mutations outside the anticodon are often required for efficient translation, reducing the probability that efficient qtRNAs would arise by chance. Such mutations may affect identity elements of the tRNA modifying enzymes that chemically alter the bases surrounding the anticodon (Hori, 2014); similar patterns have been observed for canonical triplet tRNAs (Yarus, 1982). Third, the AARSs for only some amino acids are amenable to altered codon sizes: 12/20 modern E. coli AARSs are intolerant, inefficient, or promiscuous with qtRNAs. In particular, AARSs that arose later during evolution may be more finely tuned to precise anticodon base recognition and thus less tolerant of anticodon mutations or expansion. A nascent all-quadruplet code would initially have access to only the limited chemical vocabulary of the nine AARSs that are most amenable to quadruplet codon translation, decreasing the likelihood that valuable quadruplet-ORFs might arise naturally and be selected. Similarly, as evolution progressed, the translation apparatus would have been under increasingly strong selection favoring increased fidelity of translation to support a genetic code with an increased number of codons, making it increasingly difficult for an all-quadruplet codon code to arise. Finally, studies have shown that ORFs can also arise de novo from non-coding regions (Carvunis et al., 2012) and that random ORFs can rapidly evolve into functional proteins (Keefe and Szostak, 2001), offering an avenue for gene birth in a different codon size should the appropriate qtRNAs be available.
However, the fact that the canonical genetic code includes only 20 or 21 amino acids, even though there are 64 codons, suggests that there is little natural selective pressure to expand the genetic code much beyond the currently known canonical amino acids. Together, these factors contribute to the absence of naturally occurring quadruplet codon codes, despite the frequency with which individual quadruplet translation components can arise through evolution. We expect that naturally occurring instances of qtRNAs or quadruplet codon codes would arise even less frequently than other non-canonical genetic codes in bacteria (Shulgina and Eddy, 2021).

Although these factors explain the absence of naturally occurring all-quadruplet genetic codes, our results offer a blueprint for deliberately engineering an expanded all-quadruplet genetic code. While there is much current interest in the detailed aspects of quadruplet decoding on the ribosome (Gamper et al., 2021; Choi et al., 2020), more work is needed to understand the hurdles an mRNA quadruplet codon encounters on its way through the ribosome. In addition, tRNA modification of the anticodon loop may be crucial to endow the anticodon with the flexibility needed to efficiently pair with the quadruplet codon in the ribosome environment (Grosjean and Westhof, 2016). Our work was not designed to contribute to mechanistic understanding. Instead, we aim to survey the empirical properties of qtRNAs. Our data have revealed exactly which triplet codons can be reassigned to quadruplet codons without compromising amino acid selectivity: an initial all-quadruplet code will be composed of nine selectively charged qtRNAs from this study (qtRNA^Gln_CAGG, qtRNA^Pro_CCGG, qtRNA^Phe_CGGC, qtRNA^Glu_CGGT, qtRNA^Gly_GGGG, qtRNA^Ser_TCGG, qtRNA^Arg_AGGG, and either qtRNA^Tyr_TAGA or qtRNA^His_TAGA).
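To illustrate how such an initial code would read a message, the sketch below encodes the qtRNA codon assignments named above as a lookup table and decodes a sequence four bases at a time. Because qtRNA^Tyr_TAGA and qtRNA^His_TAGA share a codon, the table has eight distinct codons; assigning TAGA to Tyr here is an arbitrary choice. The function name `translate_quadruplet`, the example sequence, and the omission of start and stop signals are all simplifying assumptions for illustration only.

```python
# Codon assignments taken from the qtRNAs listed in the text (DNA alphabet).
# TAGA can be read by either qtRNA^Tyr or qtRNA^His, but not both at once;
# Tyr is chosen arbitrarily for this sketch.
QUADRUPLET_CODE = {
    "CAGG": "Gln",
    "CCGG": "Pro",
    "CGGC": "Phe",
    "CGGT": "Glu",
    "GGGG": "Gly",
    "TCGG": "Ser",
    "AGGG": "Arg",
    "TAGA": "Tyr",
}


def translate_quadruplet(orf, code=QUADRUPLET_CODE):
    """Decode `orf` in non-overlapping four-base steps and return the amino acids.

    Raises ValueError if the length is not a multiple of four or if a codon
    has no assigned qtRNA. Start and stop signals are deliberately ignored.
    """
    if len(orf) % 4 != 0:
        raise ValueError("ORF length must be a multiple of four")
    peptide = []
    for i in range(0, len(orf), 4):
        codon = orf[i:i + 4]
        if codon not in code:
            raise ValueError(f"no qtRNA assigned to codon {codon}")
        peptide.append(code[codon])
    return peptide


# Hypothetical four-codon message: CAGG CCGG GGGG TAGA
print(translate_quadruplet("CAGGCCGGGGGGTAGA"))
```

Swapping the TAGA entry to "His" gives the alternative code; the two cannot coexist because both qtRNAs would decode the same codon.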

Further, the flexibility of AARSs suggests that anticodon re-engineering may be a viable strategy for reassignment of the additional 11 canonical amino acids to quadruplet codons. We and others (Krishnakumar et al., 2013) have observed that ArgRS is a major source of promiscuous charging; refactoring or replacing ArgRS may be broadly valuable for enabling reassignment of arginine-like codons; that is, enabling selective charging of qtRNA^Ala_AGGG. A primary limitation of quadruplet codon translation is competition with the canonical triplet tRNAs, resulting in toxicity and low translation efficiency. To overcome this major limitation, strategies such as CCA tail engineering (Samaha et al., 1995; Terasaka et al., 2014), or ribosome engineering (Neumann et al., 2010; Schmied et al., 2018) to reject triplet codons, will be essential for creating an all-quadruplet genetic code. Additionally, current qtRNAs are known to crosstalk with four-base mismatched codons (DeBenedictis et al., 2021), and engineering to eliminate this behavior will be required to fully utilize the expanded codon set. Re-engineering the majority of endogenous tRNA synthetases is a considerable engineering investment, but advances in protein engineering likely make this task no more onerous than current genetic code expansion techniques, which require years of iterative whole genome synthesis and assembly to free up a large number of codons. Together, our deliberate exploration of the evolution of functional quadruplet translation will launch synthetic efforts to assemble a 256-codon genetic code.
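The codon counts referred to in this discussion follow directly from the four-letter nucleotide alphabet: there are 4^3 = 64 triplet codons and 4^4 = 256 quadruplet codons. The short sketch below enumerates both sets and, as one illustration, lists the quadruplet codons whose first three bases match a given triplet. The helper names `all_codons` and `extensions` are hypothetical, and treating a shared three-base prefix as the relevant relationship between a triplet and a quadruplet codon is an assumption made only for this example.

```python
from itertools import product

BASES = "ACGT"


def all_codons(size):
    """Return every codon of the given length over the DNA alphabet, in sorted order."""
    return ["".join(p) for p in product(BASES, repeat=size)]


def extensions(triplet):
    """Return the quadruplet codons whose first three bases equal `triplet`."""
    return [triplet + base for base in BASES]


triplets = all_codons(3)
quadruplets = all_codons(4)
print(len(triplets), len(quadruplets))

# Quadruplet codons that begin with the arginine triplet AGG.
print(extensions("AGG"))
```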

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Strain, strain background (Escherichia coli)	S2060	Addgene	#105064	K12 derivative optimized for directed evolution
Recombinant DNA reagent	pED1a6 (plasmid)	Addgene	#134787	Luciferase reporter for quadruplet codon translation
Recombinant DNA reagent	pED17 × 11 (plasmid)	Addgene	#134814	sfGFP reporter for quadruplet codon translation
Recombinant DNA reagent	pED7 × 10 (plasmid)	Addgene	#134812	pIII reporter for quadruplet codon translation
Recombinant DNA reagent	pED14xS-1 (plasmid)	Addgene	#134800	IPTG inducible expression of qtRNA

Author details

Erika Alden DeBenedictis

Dieter Söll

Kevin M Esvelt