Lackey Lab Journal Club: Shape JuMPing

Written by Luke Hatfield

8/20/21 – The world of RNA structure is bubbling with potential, and there is a real need for accurate and reliable methods of interpreting not only two-dimensional RNA structure, but three-dimensional structure as well. Previous studies have approached this with complex in silico simulations and with kinetics-based physics models that interpret 3D conformation from nucleotide-based SHAPE-MaP probing. Now the Weeks lab at UNC, Chapel Hill has added direct crosslinking of folded conformations with SHAPE-JuMP. The method relies on a SHAPE compound developed specifically for it, called trans-bis-isatoic anhydride (TBIA). TBIA acts like a typical SHAPE compound in that it reacts with a nucleotide’s 2’-OH group. However, unlike other SHAPE compounds, it is a double-ended molecule. Once one end has bonded to a nucleotide, the reactive group on the other end can attach to a nearby nucleotide, which can be paired or unpaired. This links two regions of the RNA that are in close proximity to each other, so that the specific conformation the structure is taking is tethered by TBIA. Reverse transcription is then performed with a uniquely mutated reverse transcriptase, RT-C8, that is capable of “jumping” the TBIA crosslink, inducing mutations where the nucleotides were linked. These mutations are then read out (to an impressive depth of 500,000 reads) and compared to standard sequencing reads and SHAPE-MaP reads in order to identify the locations of the mutations from SHAPE-JuMP. SHAPE-JuMP provides important details about which nucleotides were crosslinked and skipped, showing the interaction sites of the TBIA, while the other standard reads help fill in the gaps and build the 3D structures. This provides unique structural information about which sections of the RNA in question are in proximity to each other, giving insight into the 3D conformations the RNA takes in vitro.

This method is still fairly new and there are a fair few improvements to be made. However, this introductory experiment is promising and lends credence to the ability of this technique and related compounds to improve upon the current understanding and methodology for uncovering RNA three-dimensional structure. As our lab currently has two significant projects underway regarding RNA structure and form, observing and learning from techniques and experiments like these is crucial in moving forward with innovative and creative approaches to the questions at the forefront of RNA research. The primary drawback to utilizing this technique for our research projects is the limited scope of RNAs that it has been shown to probe. The P546 domain, VS ribozyme, RNase P, and Group II intron interrogated in this paper range from 158 nucleotides (P546 domain) to 412 nucleotides (Group II intron). The current focus of our research involves RNAs averaging around three kilobases in length, with some alternative structures reaching seven or even eight kilobases. It is unclear how well TBIA would be able to form the crosslinks that SHAPE-JuMP relies on in RNAs of this length, but the ideas and concepts presented here will help us with our experimental setup and planning in the future.

Lackey Lab Journal Club: Fishing for RNA

Written by Edward Mabry

Week 21 (7/2/21) – Dystrophin is an incredibly long gene of critical importance to crippling diseases such as Duchenne muscular dystrophy, but due to the limitations of traditional in situ hybridization (ISH) methods, its mRNA dynamics have remained difficult to study. We recently discussed a research article from the Comparative Neuromuscular Diseases Laboratory validating RNAscope as a method to reveal the complex transcriptional dynamics of dystrophin mRNA. Hildyard et al. utilize this sensitive ISH method to quantify the dp427 isoform in mouse skeletal muscle, localized both in the nucleus and in the sarcoplasm. This is made possible in part by RNAscope’s reduced off-target binding and its use of amplifiers to deposit high concentrations of fluorophore at the probe site. Probes were developed for both the 5’ and 3’ ends of the dystrophin mRNA so that nascent and mature dp427 could be detected and distinguished at a subcellular level for individual transcripts. Quantification of the transcripts by RNAscope, supported by qPCR analysis, implied that even in healthy mice the mature dp427 transcripts had a half-life of about 2 to 3 hours, even though the transcription of one such mRNA took about 16 hours. The researchers believe that this is due to the supply of transcripts for muscular dystrophin being higher than demand. In dystrophic mice, by contrast, transcription initiation events are less frequent, leading to fewer 5’ nascent signals, and transcripts are promptly degraded following nuclear export because of a premature termination codon and the NMD pathway. Overall, this ISH method shows promise for quantifying long transcripts with low mRNA expression, so that complex transcriptional pathways can be more readily perceived and understood.

Lackey Lab Journal Club – Human genomic variation

Written by Kaila Honaker (Undergraduate Researcher)

Week 19 (6/20/21) – We discussed the 2020 Nature paper “The Mutational Constraint Spectrum Quantified from Variation in 141,456 Humans” by Karczewski et al., including authors from the gnomAD consortium and the Neale, Daly and MacArthur laboratories. In this manuscript, the authors wanted to form a catalogue of rare predicted loss-of-function (pLoF) variants in humans that can be used to inform medical protocols and knowledge. They defined these variants as those that cause a premature stop codon, cause a frameshift, or alter the essential splice-site nucleotides of exons in the protein-coding transcript. The data they drew from were heavily filtered so as not to allow false-positive pLoF variants. The filtering used the LOFTEE (loss-of-function transcript effect estimator) package, which they created. The data mainly came from studies of common adult-onset diseases. One issue with this sample pool was its limited ancestral diversity, so in the future it might be beneficial to broaden its range. It was discovered that known haploinsufficient genes are very strongly depleted of pLoF variation, which may at first seem like the opposite of what one would expect. Constraint on genes and the prevalence of those genes in disease were found to be positively correlated. The data from this paper will be very beneficial in the future as a basis for further research on human diseases. The next step for this specific area of the research could be to perform mouse knockout experiments to determine the specific functions of each of these rare variants in a gene.

Lackey Lab Journal Club – Splicing in different cellular localizations

Written by Austin Herbert

Week 18 (6/11/21) – mRNAs can be localized to different cellular compartments. Mechanisms affecting mRNA localization include intron-based mRNA retention and polyadenylation. The extent to which post-transcriptional modifications alter the localization of RNA in different cell types is unknown. The Black laboratory recently published research on how splicing affects RNA localization. In this work, Yeom et al. devised a method based on sample fractionation to assess mRNA localization patterns in cytoplasmic, nucleoplasmic, and chromatin fractions. Sequencing different fractions of mouse embryonic stem cells, neuronal progenitor cells, and postmitotic neurons allowed the identification of differentially enriched RNAs between cell types. Yeom et al. further validated their method by describing differences in retained introns between cell types, and studied the neuronal gamma-aminobutyric acid B receptor 1 (Gabbr1) as an example of differentially retained introns. Gabbr1 RNA was found to be incompletely spliced and sequestered in the chromatin fraction of mouse embryonic stem cells, whereas after neuronal differentiation Gabbr1 RNA became fully processed and was exported for translation. This example shows how intron retention and chromatin anchoring act together as post-transcriptional regulators that prevent full gene expression. Finally, this method provides a large data set for analyzing differences in RNA localization and offers a pipeline for others to repeat the method on different cell types.

Lackey Lab Journal Club – RNA World

Written by Edward Mabry (Master’s Student)

Week 17 (5/28/21) – We recently discussed a research article from the Mast + Braun Lab describing primary sequence structure selection in a templated ligation replication system. An important argument that Kudella et al. presented in this paper is that Eigen’s error catastrophe would have made it difficult for base-by-base replication to select structural and functional oligomers from random ones at the beginning of life on Earth. They proposed that templated ligation does self-select for structure from randomness, such that the system studied in the paper contained all the elements necessary for Darwinian evolution. However, the study implied that this system was comparable to an ‘RNA world scenario’, in which the processes are all prebiotic. Unfortunately, there were too many discrepancies between their experiment and the prebiotic world.

The RNA was replaced with DNA, and they used an enzyme, Taq DNA ligase, to ligate these oligonucleotides in order to study a period before enzymes. The authors argued that both substitutions were acceptable since their paper was focused on the inherent properties of base pairing and not the chemical mechanisms of ligation. The application of the temperature cycling system for many hundreds of cycles, along with an extra-efficient ligation system and a more stable nucleotide molecule than would have been used in the prebiotic environment, does not inspire much confidence that this system gave an accurate depiction of how templated ligation may have occurred. To me, the biggest problem was the reduction of nucleobases in their study. The largest focus of this study was the reduction of sequence space to drive structured primary sequences in nucleotide strands. This structure would be driven by the avoidance of secondary structure hairpins, which prevent templated ligation. By reducing the number of nucleobases from 4 to 2, not only was the sequence space orders of magnitude smaller (4.096 x 10^3 versus ~1.678 x 10^7 sequences), but the likelihood of a random hairpin, which was the primary selection mechanism focused on in the paper, changed substantially as well. The formation of a 9 bp hairpin, which was a significant structure in their study, had a probability of ~1.953 x 10^-3 at random in the A & T nucleobase system, versus ~3.815 x 10^-6 at random in a G, C, A, & T nucleobase system. In fact, the study stated in the results that they had attempted to use all four bases under the same experimental conditions but found that no ligation occurred. Overall, the study did show strong statistical analysis, but the issue lies with the system used, at least when trying to apply this experimental data to the aforementioned prebiotic ‘RNA world’ scenario.
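The numbers quoted above can be checked with a short calculation. This is only a sketch under assumptions that are not stated in the write-up: the strand length of 12 nucleotides is inferred from 2^12 = 4,096, and the random hairpin is modeled simply as 9 positions that each independently happen to be the complement of their partner. The function names are made up for this illustration.

```python
# Worked check of the sequence-space and hairpin numbers quoted above.
# Assumptions (not stated in the write-up): strands are 12 nt long, and a
# 9 bp hairpin stem forms "at random" when each of its 9 positions
# independently happens to be the complement of its partner.

OLIGO_LENGTH = 12  # assumed, inferred from 2**12 == 4096
STEM_BP = 9        # hairpin stem length discussed in the text


def sequence_space(alphabet_size, length):
    """Number of distinct sequences of the given length."""
    return alphabet_size ** length


def random_stem_probability(alphabet_size, stem_bp):
    """Chance that stem_bp independent positions all match their complement."""
    return (1 / alphabet_size) ** stem_bp


for label, k in (("A/T only", 2), ("G/C/A/T", 4)):
    space = sequence_space(k, OLIGO_LENGTH)
    p = random_stem_probability(k, STEM_BP)
    print(f"{label}: {space:.3e} sequences, P(9 bp stem) = {p:.3e}")
```

Under these assumptions the two-base system has a sequence space 4,096 times smaller than the four-base system, and a random 9 bp stem is 512 times more likely, which matches the magnitudes quoted in the paragraph above.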

Lackey Lab Journal Club – Splicing/RNA structure/Breast Cancer

Written by Luke Hatfield (research technician)

We discussed a recent publication from the Goodarzi lab describing a structural motif bound by SNRPA1 that regulates a network of transcripts important for cancer metastasis. Fish et al. showed that cancer cell metastasis and invasion can be propagated through alternative splicing of mRNA. This propagation depends on specific structural elements within RNAs (PLEC and ERRFI1) that cause aberrant splicing, are targeted by SNRPA1, and are upregulated in highly metastatic breast cancer cells. They performed numerous assays to investigate existing data sets and verify experimental data. The paper was rigorous, with experimental procedures that were well designed to address the specific question being posed at every point. This was especially true of experiments like their RNA co-precipitation and transwell assays, the latter being a simple and effective test of the invasion potential of highly metastatic cells. This experiment may seem simple, but the data it provided allowed them to move forward with their knockdown experiment in mice, where they got to the crux of the disease-relevance aspect of the paper. They showed with reasonable confidence that SNRPA1 has a non-canonical role in regulating alternative splicing of RNA, and through it the pathways that lead to metastasis in breast and lung cancer. The only area where the analysis falls short is experimental RNA structure research. More complex experiments with DMS-seq and other mutational profiling could go a long way toward developing new theories about which isoforms are contributing to the increased activity of SNRPA1.

Lackey Lab Journal Club – circular RNAs and cancer

Week 15 (5/21/21) – We discussed research from the Pedersen laboratory on the biological role of circular RNAs (circRNAs) and their role in cancer. In this publication, Okholm et al. investigated circRNA profiles and circRNA-RBP interactions in several different cancer cell lines. They were able to identify ~160 highly expressed circRNAs in the HepG2 and K562 cell lines and characterize their similarities and genomic properties. These circRNAs all showed high coverage of RBP binding sites, both within themselves and in their flanking introns. Moreover, each circRNA was more highly expressed than its full-length transcript counterpart. Okholm et al. also characterized the effects of knocking down a circRNA, circCDYL, and RBPs in bladder cancer cell lines. Knockdown of circCDYL resulted in increased expression of key cancer genes and, because circCDYL acts as a sponge for the RNA binding protein GRWD1, circCDYL depletion counteracts the effects of GRWD1 depletion. In conclusion, the work by Okholm et al. highlights that no special sequencing methods are needed for circRNA analyses and that circRNAs may play a bigger role in cancer than once thought.

Lackey Lab Journal Club – Nascent RNA structure

Week 15 (5/7/21) – We discussed a recent publication from the Bentley lab on nascent RNA structure. Saldi and colleagues took advantage of the tNET-seq protocol, which uses RNA polymerase immunoprecipitation to select for newly synthesized RNA, to ask questions about the structure of nascent RNAs. They developed tNET-structure-seq by combining nascent RNA selection with enzymatic and chemical structure probing to generate structure data on precursor RNAs. Interestingly, they found that a subset of well-spliced junctions have steep differences in the amount of structure between the intronic and exonic regions. Structure could be an important element in defining splice junctions. Much of the highly structured nature of introns is due to Alu elements, which are often present as inverted repeats that can form double-stranded RNA. Structure is an inherent aspect of RNA. One question in the RNA field is what impact transcription speed has on RNA folding, especially in vivo. Saldi et al. used an RNA Pol II mutant that is about 3 times slower than normal to understand what altering the transcription rate does to nascent RNA across the transcriptome. This slower transcription results in higher amounts of structure overall. One surprising take-away is that many genes are actually spliced more efficiently at slower speeds. The ability of RNA Pol II speed to influence transcript structure also provides a mechanism for linking upstream effects on transcription to downstream post-transcriptional effects.

Lackey Lab Journal Club – Zika and Dengue RNA structure

Written by Edward Mabry (upcoming M.S. student)

Week 14 (4/23/21) – Edward led our discussion on long-range structure mapping of Dengue and Zika viruses published in 2019 by the Wan lab.

Goals

The goals of the paper were to develop a better understanding of how the genomes of Dengue and Zika viruses are structurally organized, including long-range interactions and structure conservation, and how these interactions form in cells versus in virions.

Method

The authors used NAI-MaP structure probing within virions and within HeLa cells and computationally analyzed mutation rates in the resultant cDNA to model RNA secondary structures.

They also performed pairwise interactome mapping using the SPLASH protocol. They performed SPLASH in virions, in solution, and in infected HeLa cells to map long-range interactions via crosslinking of the pairwise interactions followed by proximity ligation.

The authors analyzed their data with R-scape for covariation across the serotypes and viral species. They also analyzed structure and sequence conservation across viral genomes to create an identification method to find regions of functional interest.

Mutations that disrupted long range interactions were tested in interferon-deficient mice for their impact on viral fitness. Disruption of long range interactions diminished viral fitness.

Results

Utilizing NAI-MaP and SPLASH, the study was able to identify a number of highly structured and conserved functional regions within the genomic RNA and a large number of pairwise interactions. Many of these, in addition to being long range, showed high heterogeneity, suggesting that many regions can adopt many different structures and bind to different partners in the viral genome.

Comparative analysis showed that long-range interactions were significantly disrupted within cells compared to those within virions. However, when the interactions identified within virions were disrupted via mutagenesis, viral replication was severely reduced, and the long-range pairings were deemed important for viral fitness.

Discussion

One of the important topics our lab focused on was the use of SPLASH to identify long-distance pairwise interactions, which are difficult to study with SHAPE-MaP. This could be utilized to determine long-range pre-mRNA structures in order to understand transcript and/or gene regulation, even within long introns.

The study shows strong data for the regions of functional interest having a strong effect on viral fitness and does produce a better understanding of how the viruses in the Flaviviridae family are organized.

There was some confusion on figure labeling (such as the structure model in Figure 2b), but overall this was an informative paper utilizing a new procedure for RNA structure analysis.

Lackey Lab Journal Club – NCI RNA Biology Symposium

Week 13 (4/14-4/16/21) – We attended the 2021 RNA Biology Symposium organized by the National Cancer Institute. This Symposium spanned 3 days with a range of topics including small and noncoding RNAs, translation, and RNA modification. A major theme of the Symposium was phase transition, which we discussed in detail in February for our Journal Club on phase transition mediated by the SARS-CoV-2 nucleocapsid protein. Of the many informative presentations at the NCI Symposium, we had several favorites. One favorite presentation was “Genome Regulation by long-noncoding RNAs” by Dr. Howard Chang. Dr. Chang described the in-depth work his group has performed on the XIST lncRNA and embryonic and somatic X-inactivation, some of which has been published. Another amazing presentation was “RNA in Genomic Medicine, diagnosing rare disease and COVID-19 implications” by Dr. Diana Baralle. Dr. Baralle’s recent publication on differential expression of ACE2 isoforms during viral infection provides insight into her presentation for those unable to attend.

Upcoming (4/23/21) – Edward Mabry (M.S. student) will lead our discussion on long-range structure mapping of Dengue and Zika viruses published in 2019 and led by the Wan lab.