DNA is the fundamental genetic material of all living organisms on Earth, storing and propagating genetic information. DNA has a highly uniform structure composed of four nucleobases (A, adenine; G, guanine; C, cytosine; T, thymine), 2′-deoxyribose sugars, and negatively charged phosphate backbones (Figure 1a). The four letters of the genetic alphabet form two base pairs (A:T and G:C) according to the complementarity rule, which is essential for the formation of the double helix and for the transmission of genetic information (Figure 1b).
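The complementarity rule can be illustrated with a minimal Python sketch. The function names `reverse_complement` and `is_duplex` are hypothetical, chosen only for illustration, and the model considers only perfect A:T and G:C pairing between antiparallel strands; it ignores mismatches, modified bases, and thermodynamics.

```python
# Watson-Crick pairing rule for the four canonical DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the complementary strand, written 5'->3'.

    The two strands of a duplex run antiparallel, so the input is
    read backwards while each base is swapped for its partner.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


def is_duplex(strand_a: str, strand_b: str) -> bool:
    """True if two 5'->3' strands pair perfectly along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(is_duplex("ATGC", "GCAT"))
```

Because each base determines its partner, either strand is sufficient to reconstruct the other, which is the property that template-directed replication relies on.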
Modified bases, including but not limited to N6-methyladenine (m6A), 5-methylcytosine (5mC), 5-formylcytosine (5fC), 5-carboxycytosine (5caC), 5-hydroxymethyluracil (5hmU), and 5-formyluracil (foU), have been found to convey epigenetic information (Figure 1c). In eukaryotes, such modifications function as regulatory elements, representing a second layer of information beyond that encoded in the base sequence. In mammals, 5mC and 5-hydroxymethylcytosine (5hmC) are abundant in the brain and play an important role in organizing chromatin structure and regulating gene expression. In bacteria, methylated nucleobases such as m6A and 5mC protect cellular DNA from the endonuclease-mediated cleavage that destroys invading bacteriophage or viral DNA. Moreover, m6A and 5mC are also involved in cellular processes such as DNA replication, transcription, transposition, and DNA repair.
To investigate the key chemical and structural parameters for genetic information storage, heredity, and evolution in vitro, a series of xenobiotic nucleic acids (XNAs) have been synthesized by replacing the natural bases, sugars, and phosphate linkages with unnatural counterparts. Modifying the three subunits of a nucleotide gives sugar-modified XNAs, phosphate-modified XNAs, base-modified XNAs, or combinations of these. Some of these XNAs can mimic natural nucleic acids and form a stable double helix with DNA/RNA or with themselves following Watson–Crick base-pairing rules. The biological evaluation of these XNAs offers insight into why nature chose DNA/RNA as genetic materials rather than other chemicals, a fundamental question about the origin of life.
Sugar-Modified XNAs
Sugars including arabinose, lyxose, threose, allose, mannose, and glucose are postulated to have coexisted with ribose under prebiotic conditions. Sugar-modified XNAs, such as bicyclo-DNA, LNA (locked nucleic acid), and HNA (hexitol nucleic acid), have been developed and display better chemical stability, superior nuclease resistance, and improved pharmacokinetic properties. Only a small number of XNAs can form a stable antiparallel duplex with DNA/RNA following the canonical base-pairing rules; examples include HNA, TNA (threose nucleic acid), CeNA (cyclohexenyl nucleic acid), ANA (arabino nucleic acid), FANA (2′-fluoro-arabino nucleic acid), GNA (glycol nucleic acid), and LNA. High-fidelity XNA replication systems have been built for six XNAs (HNA, CeNA, LNA, ANA, FANA, and TNA) using engineered polymerases. Replication between FANA and FANA, CeNA and CeNA, or HNA and CeNA was also achieved with these engineered polymerases.
Aptamers are single-stranded oligonucleotides that fold into complex three-dimensional structures and bind to specific targets. They are commonly produced by systematic evolution of ligands by exponential enrichment (SELEX), also referred to as in vitro selection or in vitro evolution. To select XNA aptamers, the XNA library used in SELEX has to be amplified by XNA polymerases, and the selected XNAs must be reverse-transcribed into cDNA. Starting from random XNA oligomer pools, functional XNAs (ANA, FANA, HNA, and CeNA) with RNA endonuclease and ligase activities in trans (XNAzymes) were obtained by in vitro selection.
Phosphate-Modified XNAs
The classic phosphate diester linkages are substituted by different functional groups in sulfone-DNA, PS-DNA, PN-DNA, NP-DNA, triazole-DNA, and dPhoNA.
PS-DNA, an XNA in which a non-bridging oxygen of the phosphate diester is replaced by a sulfur atom, was synthesized in the 1980s with the aim of developing antiviral agents against HIV. The phosphorothioate motif is structurally similar to the natural phosphate diester linkage and offers considerable advantages, such as enhanced resistance to nucleases. Naturally occurring phosphorothioation has also been discovered in bacterial DNA, where it is reported to play an important role in many cellular processes.
Recently, Herdewijn’s group reported an in vivo study of a synthetic genetic polymer bearing P3′→N5′ phosphoramidate linkages, denoted PN-DNA. Compared to DNA, PN-DNA is more stable under basic conditions but more acid-labile, so the phosphoramidate linkage can be cleaved under acidic conditions. Enzymatic synthesis of PN-DNA was achieved with Taq DNA polymerase together with KF (Klenow fragment) and Vent (exo−) polymerases. Multiple NH-dCTPs were incorporated into the R67DHFR gene sequence to give modified plasmids conferring trimethoprim resistance, which were then transformed into E. coli cells grown on trimethoprim-containing medium.
dPhoNA represents a family of nucleic acids that contain a phosphonate linkage rather than the natural phosphate diester. Phosphonate is an isostere of the phosphate diester that contains a stable P–C bond, which is resistant to chemical and enzymatic degradation. Therminator polymerase has been shown to catalyze the condensation of phosphonate derivatives of adenine to afford dPhoNA with enhanced stability against nucleolytic degradation.
Sugar- and Phosphate-Modified XNAs
The combination of phosphonate linkages and artificial sugar moieties gave rise to two orthogonal XNAs: tPhoNA (3′–2′ phosphonomethyl-threosyl nucleic acid) and ZNA (an XNA analogue with an acyclic methylphosphonate backbone). The former contains a threose ring, while the latter bears an acyclic backbone. The repeating nucleoside units of both tPhoNA and ZNA were initially developed for antiviral drug discovery. tPhoNA has served as a xenobiotic genetic material in vitro in the presence of engineered polymerases.
Invented by Nielsen, Egholm, and their collaborators, peptide nucleic acid (PNA) is a neutral oligomer in which the entire deoxyribose phosphodiester backbone is replaced by N-(2-aminoethyl)glycine units. It forms very stable duplexes with complementary DNA, RNA, or PNA. PNA has been used in gene therapy, genetic diagnostics, and nanotechnology.
Base-Modified XNAs
DZA, described as ‘a fully morphed DNA containing all four non-canonical nucleotides’, was developed by Herdewijn’s group. The unnatural bases used in DZAs share similar skeletons with their natural counterparts and include 5-chloro-2′-deoxyuridine (5ClU), 5-methyl-2′-deoxycytidine (5MeC), 5-fluoro-2′-deoxycytidine (5FC), 7-deaza-2′-deoxyadenosine (7dA), 7-deaza-2′-deoxyguanosine (7dG), 7-fluoro-7-deaza-2′-deoxyguanosine (7FG), and 2′-deoxyinosine (dI). Taq or Vent (exo−) DNA polymerases served as catalysts for the PCR amplification of DZAs. DZA fragments efficiently protect restriction sites from enzymatic cleavage, a favorable property for an alternative genetic material.
Unnatural Base Pairs (UBPs)
Numerous unnatural base pairs (UBPs), also known as artificially expanded genetic alphabets, have been developed since the first UBP (isoG:isoC) was synthesized and successfully incorporated into DNA/RNA by Steven Benner in 1989. Other UBPs based on altered hydrogen-bonding patterns, such as the dB:dS and P:Z base pairs, were subsequently created and used to construct a genetic system containing eight letters.
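The effect of expanding the alphabet from four to eight letters can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the single-letter symbols B, S, P, and Z are a notational shorthand for the dB:dS and P:Z pairs named above, the function names are hypothetical, and the information content per position is taken as log2 of the alphabet size, which assumes all letters are used with equal frequency.

```python
import math

# Pairing table: the two natural pairs plus the B:S and P:Z pairs
# (single-letter symbols are an assumed shorthand for illustration).
PAIRS = {
    "A": "T", "T": "A",
    "G": "C", "C": "G",
    "B": "S", "S": "B",
    "P": "Z", "Z": "P",
}


def reverse_complement(strand: str) -> str:
    """Return the antiparallel complement under the eight-letter rule."""
    return "".join(PAIRS[base] for base in reversed(strand))


def bits_per_position(alphabet_size: int) -> float:
    """Maximum information per position for equally likely letters."""
    return math.log2(alphabet_size)


if __name__ == "__main__":
    print(reverse_complement("APBG"))
    print(bits_per_position(4), bits_per_position(8))
```

Under this simple model, going from four to eight letters raises the maximum information density from 2 bits to 3 bits per position, while the pairing table remains one-to-one so that each strand still fully specifies its complement.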