WSSP Session 2: Life of Earth - Nucleic Acids, Proteins, and Carbohydrate Biopolymers

Paridhi Latawa
7 min read · Oct 1, 2021

In the second session of the Welch Summer Scholars Program, Dr. Eric Anslyn, professor at the University of Texas at Austin, gave a talk on assembly and sequencing for diagnostics and information storage. This talk was particularly interesting to me because it connected chemistry to genetics and biology, fields that I am greatly interested in! Here are some of the key takeaways from this session.

We started by discussing the life of Earth: nucleic acids, proteins, and carbohydrate biopolymers. We used the analogy that proteins are machines, and the instructions for building these protein machines are stored in the sequence of DNA (A, C, T, G). We discussed that if an organic chemist could generate a set of pairings that only pair at high fidelity, this would be something novel. It would make it possible to build a sequence of monomers that efficiently stores a sequence of information.

We then discussed four orthogonal dynamic covalent reactions, as reported by Dr. Helen Seifert in 2016. We explored the idea that we could utilize the interesting properties of nucleic acids to catalyze various reactions. Instead of using a nucleic acid backbone, we would use a peptide backbone.

Peptide pairing was found to be analogous to how DNA pairs with its complementary strand. The peptide strands were synthesized as shown in the diagram above, where the blue piece fits with the red. The aldehyde functional group pairs very specifically with the hydroxyl functional group.

Imagine mixing a peptide strand bearing three hydroxyl groups with an opposing strand bearing three aldehydes: we get a very clean pairing. It works with 3, 4, and all the way up to 6 units, as shown in the figures labeled 9, 10, 11, and 12. We could take a 6-mer and pair it with a 3-mer and it would still work. This is evidence of high fidelity: a given strand can sort through other strands and find the ones that pair appropriately with it.

We can tie this back to the original goal: nucleic acids have a certain sequence that can pair very specifically with its complementary sequence. Here, using only one monomer, we have shown that we can get a sequence to pair with its complementary sequence using chemistry that is different from nucleic acid chemistry. Interesting!

From there, we explored the concepts of templation and replication, asking whether we could use tunable orthogonal reversible covalent bonds (TORCs) to create sequence-specific replicators. This concept was researched by Joseph Maier.

When nucleic acids replicate, one strand is used to generate a complete copy of itself, as shown in the figure above, such that A pairs with T and C pairs with G. Once this happens, we have new letters that correspond to the new code or the new language. We could combine the new letters that repeat the claim structures to determine whether we could pair them up. If we are able to combine the components from the very specific pairing interactions, we would be on the way to creating life! While this might sound crazy, NASA defines life as a replicating system that can undergo evolution and consumes energy in doing so. Our example replicating system does not undergo evolution or consume energy in order to replicate, but the lab is working on the first step, replication, via a completely unnatural system that is nonetheless analogous to nucleic acids.
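The A-T and C-G pairing rule described above can be sketched in a few lines of Python. This is only an illustration of template-directed copying with a fixed pairing table; the function name `complement` is my own, and the sketch ignores strand direction and everything else about real replication chemistry.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRING = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand):
    """Return the strand that pairs letter-by-letter with `strand`."""
    return "".join(PAIRING[base] for base in strand)


template = "ACTG"
copy = complement(template)      # the template directs its complement
regenerated = complement(copy)   # the complement directs the original again

print(copy)         # TGAC
print(regenerated)  # ACTG
```

Two rounds of pairing bring back the original sequence, which is the sense in which one strand can template a copy of itself.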

Dr. Anslyn’s group took a peptide that had two aldehydes, paired it with a hydroxyl partner, and assessed whether they could make the complementary strand by sewing up the two ends.

If we sew up both ends, we make a complement to both ends. This actually works and produces a complementary dimer, which can template the formation of another dimer. Beyond dimers, the system can also form trimers and eventually tetramers, which can template other trimers and tetramers, respectively.

As presented by Lee Cronin from the University of Glasgow, the goal now is to start with nucleic acids as the inspiration and then make an artificial chemical analog of nucleic acids in an organic chemistry lab. This is an important concept because it makes us wonder whether life on other planets has to use nucleic acids to survive, or whether there are other chemical analogs. When we define life, we don’t specify the presence of nucleic acids in that definition, which suggests that life elsewhere in the universe may well not include nucleic acids.

We have nucleic acids on Earth because those are the chemical entities that life here has evolved to use, but elsewhere in the universe it’s highly doubtful that nucleic acids are used to store the information that creates and templates life. So how could we find life elsewhere in the universe by looking at chemistry, if we are not looking for DNA or proteins? We are not expecting any of the chemistry used on Earth to be used for life on other planets. We know that chemistry on Earth is pretty complex, so maybe there is a threshold of complexity beyond which only life could be creating the molecules. If we are simply looking at the complexity of molecules, we are not looking for any specific molecule that we expect to find; we are looking for that threshold of complexity. This could serve as a biosignature: a signature that says, when you’re on the moon or someplace without nucleic acids, that there are such complex molecules that there must be life there.

Lee Cronin designed a molecular complexity scale, the Cronin scale, which assigns a complexity number to a given molecule. Cronin defined complexity by the peaks in mass spectrometry graphs.

Dr. Anslyn and Dr. Ellington at UT Austin put the various NMR spectra through machine learning protocols to analyze complexity. Computer programs have been trained to recognize patterns and make predictions from the spectral data.

The applications of this research are very relevant. NASA’s upcoming missions, planned for 2060, include a research methodology that attempts to determine whether life exists on moons. To assess this, they need to run mass spec experiments and analyze the resulting data.

Through sequencing processes such as Edman degradation and fluorosequencing, a lot of information about polymers can be obtained. Edman degradation can identify the amino acids in a peptide or protein sequence (a polyurethane version is needed).

The slide on the right includes one of the complex syntheses (characteristic in orgo!) involved in Dr. Anslyn’s research.

After synthesis and development of the polyurethanes, the next order of business is determining how to store information this way. A main question of interest is whether we can write a language in the polyurethanes. Through Huffman encoding, it is possible to use polyurethanes to convey messages. As there are 8 different monomers in the polyurethane, octal code (0 to 7) is used (DNA writes in a four-letter code because of its four bases; English has a 26-letter code).
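To make the Huffman-plus-octal idea concrete, here is a minimal sketch in Python. It is not the lab's actual encoding scheme: it assumes a standard binary Huffman code whose bits are then read three at a time as digits 0 to 7, and the monomer labels `M0` to `M7` are hypothetical placeholders for the eight monomers.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a binary Huffman code: frequent characters get shorter bit strings."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:  # only one distinct character: give it a 1-bit code
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + c for ch, c in left.items()}
        merged.update({ch: "1" + c for ch, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def to_octal_digits(bits):
    """Pad to a multiple of 3 bits and read each 3-bit group as a digit 0-7."""
    padded = bits + "0" * (-len(bits) % 3)
    return [int(padded[i:i + 3], 2) for i in range(0, len(padded), 3)]


def from_octal_digits(digits, n_bits):
    """Expand digits back to bits and drop the padding."""
    return "".join(format(d, "03b") for d in digits)[:n_bits]


def decode(bits, code):
    """Walk the bit string, emitting a character whenever a code word completes."""
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


message = "information can be stored in polymers"
code = huffman_code(message)
bits = "".join(code[ch] for ch in message)
digits = to_octal_digits(bits)
monomers = ["M%d" % d for d in digits]  # hypothetical monomer labels
print(" ".join(monomers))
print(decode(from_octal_digits(digits, len(bits)), code))
```

Each octal digit stands for one monomer in the chain, so reading the monomers back in order recovers the digits, then the bits, then the message.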

The figure below explains molecular decoding and shows a Jane Austen quote conveyed through synthesized molecules!

Data storage is one of the primary applications of converting the polyurethanes into binary. Converting to binary first allows for data compression, as the binary can then be converted to hexadecimal. The unifying theme of this presentation is that information can be stored in polymers: on this planet, the information in nucleic acids can create life. On other planets, this information is stored in the complexity of the other molecules we are striving to find.
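As a small illustration of the binary-to-hexadecimal step, the sketch below converts a short run of octal digits (one per monomer) to bits and then to hex. The helper names and the example digits are my own, chosen only to show the arithmetic. Hex packs four bits per character instead of three, so the same bits take fewer characters to write down.

```python
def octal_to_binary(digits):
    """Write each octal digit (0-7) as exactly three bits."""
    return "".join(format(d, "03b") for d in digits)


def binary_to_hex(bits):
    """Pad to a multiple of 4 bits and write each 4-bit group as one hex character."""
    padded = bits + "0" * (-len(bits) % 4)
    return "".join(format(int(padded[i:i + 4], 2), "x")
                   for i in range(0, len(padded), 4))


digits = [7, 0, 5, 2]              # illustrative monomer digits, not real data
bits = octal_to_binary(digits)
hex_string = binary_to_hex(bits)

print(bits)        # 111000101010
print(hex_string)  # e2a
```

Here four octal digits become three hex characters that carry the same twelve bits.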

Paridhi Latawa

Pari is a student at MIT in Cambridge, MA, studying CS & Biology