Lackey Lab Summer 2023 Journal Club

Our focus this summer is on RNA structure analysis methods. We are reading papers to understand analysis methods that quantify how similar two structures are to one another. Specifically, we are focusing on papers that describe methods for identifying conserved or enriched substructures across different RNAs, such as within precursor RNAs with alternatively spliced intron-exon junctions.

We are reading:

Lackey Lab Journal Club: Branchpoints brought to you by CoLa-Seq

11/19/21 – In our pre-Thanksgiving journal club we discussed a bioRxiv paper from the Li and Staley laboratories at the University of Chicago. In this work, Zeng and co-authors describe CoLa-Seq (co-transcriptional lariat sequencing). CoLa-Seq is a new technique that detects lariat-containing precursor RNAs and splicing by-products to identify branchpoints. The authors call the two lariat-containing species they detect with CoLa-Seq NLIs (nascent lariat intermediates) and ELIs (excised lariat introns). To obtain reads corresponding to NLIs and ELIs, the authors enriched for precursor RNAs by isolating chromatin. They further selected RNA in the process of splicing by decapping and degrading linear RNA, leaving only RNA protected by the 2’-5’ lariat linkage. Using CoLa-Seq, Zeng, et al. identified the largest number of branchpoints to date. In addition, CoLa-Seq provides a technique to continue branchpoint identification in other cell lines and under additional conditions. A reasonable protocol for branchpoint identification is important as, even with this study, many branchpoints remain unmapped. Branchpoint selection is important for recognition of the 3’ splice site and for understanding alternative splicing.

In addition to describing a new technique and documenting an extensive number of branchpoints, Zeng, et al., analyzed their results to draw several novel biological insights. They analyzed the timing of splicing by measuring the number of nucleotides past the 3’ splice site in NLI reads. One thing that surprised me from their timing data is the variability at the same intron. In addition, they found that splicing can happen in-order, out-of-order and concurrently. These three different splicing modes occur in different ratios in most transcripts. As spliced intermediates are rare and collected from a large population of cells, I wonder how the state of the cell and the level of transcript affect splicing timing and the order of splicing. Interestingly, splicing did not seem to depend on transcription of the downstream exon, even for long introns, as would be expected for the exon-definition model of splicing. Zeng, et al., also used extensive modeling to try to understand what elements control splicing timing and order. As an RNA structure lab, we were most intrigued by the role of GC content in splicing timing! However, as GC content captures both structural and sequence motifs, it is still too early to say what role RNA structure has in regulating splicing. We look forward to seeing the final version of this manuscript in press.

Lackey Lab Journal Club: AGO1 as a transcription factor

Written by Abigail Hatfield

10-22-21: For this week’s journal club, I elected to look at an AGO1 paper that was recently published in J Cell Biol by Acuña, et al. and is relevant to many of our lab’s ongoing experiments. This paper checked a lot of the boxes: timely, discussing argonaute, proposing functional roles for argonaute, utilizing immunocytochemistry, and incorporating large-scale experimental and historical ChIP-Seq.

The experiments performed by Acuña, et al. were interesting first and foremost because argonaute has been widely studied for its role in small RNA-mediated post-transcriptional processing and in transcriptional gene silencing. However, they propose an alternate function for argonaute-1 (AGO1): estradiol-triggered transcription in human cells. They used ChIP and ChIP-Seq on transfected MCF-7 cells to locate and identify ERα binding sites relative to AGO1, and found the association between the two to be significant. ERα binding motifs were present throughout the cells expressing AGO1, indicating that the two are indeed prominently linked in some capacity.

To further interrogate this connection, they treated these AGO1-transfected MCF-7 cells with estradiol (E2). Upon enrichment, AGO1 signal increased more than in untreated MCF-7 cells. Additionally, they detected no change in subcellular localization of AGO1 during E2 treatment. As a follow-up, they knocked down AGO1 and tested the response to E2, showing a marked decrease in ERα activity within the knockdown cells. They were also able to rescue this function and restore levels of production to near endogenous levels. More interestingly, in the experiments showing the AGO1-related response to E2, they also found that the relationship persisted when they silenced AGO1’s ability to incorporate RNA. This implies that AGO1 and ERα bind together on chromatin and that AGO1 acts as a transcriptional coactivator. They were also able to demonstrate that AGO1 is preferentially enriched at active transcriptional enhancers. All of this points to the possibility that AGO1 does indeed act in an estrogen-dependent manner as a transcription coactivator, at least as the authors suggest. Our main criticism of this paper is really a wish we would like to have granted: an investigation of this same system in the rest of the argonaute protein family.

Equally important as this conclusion, however, is the implication that proteins may harbor multiple and varied functions. In the case of AGO1, miRNAs were shown to not even be necessary for its binding on chromatin with ERα. Previous studies focused on AGO1’s interactions with small RNAs would have missed, and did miss, this potential function entirely. This gives rise to a compelling question: how do we identify multiple functions of a protein and, more importantly, what determines the significance and impact of any one of those functions? Is AGO1 more important to transcriptional gene silencing or to coactivation with ERα? In what ways would you determine the significance and relevance of AGO1’s influence in both of those mechanisms relative to each other? These are questions we may hope to answer soon, and we will wrestle with our principles of microbiology and genetics in doing so.

Lackey Lab Journal Club: Fishing for RNA

Written by Edward Mabry

Week 21 (7/2/21) Dystrophin is an incredibly long gene of critical importance to crippling diseases such as Duchenne muscular dystrophy, but due to the limitations of traditional in situ hybridization (ISH) methods, its mRNA dynamics have remained difficult to study. We recently discussed a research article from the Comparative Neuromuscular Diseases Laboratory validating RNAscope as a method to reveal the complex transcriptional dynamics of dystrophin mRNA. Hildyard et al. utilize this sensitive ISH method to quantify the Dp427 isoform in mouse skeletal muscle, localized both in the nucleus and in the sarcoplasm. This is possible in part because of RNAscope’s reduced off-target binding and its use of amplifiers to deposit high concentrations of fluorophore at the probe site. Probes were developed for both the 5’ and 3’ ends of the dystrophin mRNA such that nascent and mature Dp427 could be detected and distinguished at a subcellular level for individual transcripts. Quantification of the transcripts by RNAscope, supported by qPCR analysis, implied that even in healthy mice the mature Dp427 transcripts had a half-life of about 2 to 3 hours, even though the transcription of one such mRNA took about 16 hours. The researchers believe that this is due to the supply of transcripts for muscular dystrophin being higher than demand. In dystrophic mice, however, transcription initiation events are less frequent, leading to fewer 5’ nascent signals, and transcripts are promptly degraded following nuclear export due to the premature termination codon and the NMD pathway. Overall, this ISH method shows promise for quantifying long transcripts with low mRNA expression, such that complex transcriptional pathways could be more readily perceived and understood.

Lackey Lab Journal Club – Human genomic variation

Written by Kaila Honaker (Undergraduate Researcher)

Week 19 (6/20/21) – We discussed the 2020 Nature paper “The Mutational Constraint Spectrum Quantified from Variation in 141,456 Humans” by Karczewski, et al., including authors from the gnomAD consortium and the Neale, Daly and MacArthur laboratories. In this manuscript, the authors wanted to form a catalogue/library of rare predicted loss-of-function (pLoF) variants in humans that can be used to inform medical protocols/knowledge. They defined these variants as those that cause a premature stop codon, cause a frameshift mutation, or change the essential splice-site nucleotides of exons in the transcript. The sample pool that they drew their data from was heavily filtered so as not to allow for false positive pLoF variants. The data was filtered using the LOFTEE (loss-of-function transcript effect estimator) package, which they created. The data mainly came from studies completed on common adult-onset diseases. One issue with this sample pool was that it needed more ancestral diversity, so in the future it might be beneficial to increase its range. It was discovered that known haploinsufficient genes seem to be very strongly depleted of pLoF variation, which is something that one might at first think to be the opposite. Constraint on genes and the prevalence of those genes in disease were found to be positively correlated. The data from this paper will be very beneficial in the future as a basis for further research on human diseases. The next step for this specific area of the research could be to perform mouse knockout experiments to determine the specific functions of each of these rare variants in a gene.

Lackey Lab Journal Club – Splicing in different cellular localizations

Written by Austin Herbert

Week 18 (6/11/21) – mRNAs can be localized to different cellular compartments. Several mechanisms affecting mRNA localization include intron-based mRNA retention and polyadenylation. The extent to which post-transcriptional modifications alter the localization of RNA in different cell types is unknown. The Black laboratory recently published research on how splicing affects RNA localization. In this work, Yeom, et al. devised a method based on sample fractionation to assess mRNA localization patterns in cytoplasmic, nucleoplasmic, and chromatin fractions. Sequencing different fractions of mouse embryonic stem cells, neuronal progenitor cells, and postmitotic neurons allowed the identification of differently enriched RNAs between cell types. Yeom, et al. further validated their method by describing differences in retained introns between cell types and studied the neuronal gamma-aminobutyric acid B receptor 1 (Gabbr1) as an example of differentially retained introns. Gabbr1 RNA was found to be incompletely spliced and sequestered to the chromatin fraction of mouse embryonic stem cells, whereas after neuronal differentiation, Gabbr1 RNA became fully processed and exported for translation. This example describes how intron retention and chromatin anchoring simultaneously act as post-transcriptional regulators preventing full gene expression. Finally, this method provides a large data set for analyzing the differences in localization of RNAs and offers a pipeline for people to repeat this method on different cell types.

Lackey Lab Journal Club – RNA World

Written by Edward Mabry (Master’s Student)

Week 17 (5/28/21) – We recently discussed a research article from the Mast and Braun labs describing primary sequence structure selection in a templated ligation replication system. An important argument that Kudella et al. presented in this paper is that Eigen’s error catastrophe would have made it difficult for base-by-base replication to select structural and functional oligomers from random ones at the beginning of life on Earth. They proposed that templated ligation does self-select for structure from randomness, such that the system studied in the paper did contain all the elements necessary for Darwinian evolution. However, the study implied that this system was comparable to an ‘RNA world scenario’, in which the processes are all prebiotic. Unfortunately, there were too many discrepancies between their experiment and the prebiotic world.

The RNA was replaced with DNA, and they used an enzyme, Taq DNA ligase, to ligate these oligonucleotides in order to study a period before enzymes. The authors argued that both substitutions were acceptable since their paper was focused on inherent properties of base pairing and not the chemical mechanisms of ligation. The application of the temperature cycling system for many hundreds of cycles, along with an extra-efficient ligation system and a more stable nucleotide molecule than that which would have been used in the prebiotic environment, does not inspire much confidence that this system gave an accurate depiction of how templated ligation may have occurred. To me, the biggest problem was the reduction of nucleobases in their study. The largest focus of this study was the reduction of sequence space to drive structured primary sequence in nucleotide strands. This structure would be driven by the avoidance of secondary structure hairpins, which prevent templated ligation. By reducing the number of nucleobases from 4 to 2, not only was the sequence space orders of magnitude smaller (4.096 x 10^3 versus ~1.678 x 10^7), but the likelihood of a hairpin, which was the primary selection mechanism focused on in the paper, was changed significantly as well. A 9 bp hairpin, which was a significant structure in their study, had a ~1.953 x 10^-3 probability of forming at random in the A & T nucleobase system, versus ~3.815 x 10^-6 in a G, C, A, & T nucleobase system. In fact, the study stated in the results that they had attempted to use all four bases under the same experimental conditions but found that no ligation occurred. Overall, the study did show strong statistical analysis, but the issue lies with the system used, at least when trying to apply this experimental data to the aforementioned prebiotic ‘RNA world’ scenario.
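The figures quoted above can be reproduced with a few lines of arithmetic. This is only a sketch under two assumptions of mine, not something taken from the paper: that the sequence-space numbers correspond to 12-nt sequences (since 2^12 = 4,096 and 4^12 ≈ 1.678 x 10^7), and that the hairpin probability treats each of the 9 stem positions as an independent draw that matches its partner with probability 1/(number of bases). The function names are hypothetical.

```python
def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length over an alphabet."""
    return alphabet_size ** length


def random_stem_probability(alphabet_size: int, stem_bp: int) -> float:
    """Chance that stem_bp positions all pair with their partner by chance.

    Assumption: each position independently holds the one complementary
    base with probability 1 / alphabet_size.
    """
    return (1 / alphabet_size) ** stem_bp


if __name__ == "__main__":
    LENGTH = 12   # assumed oligomer length (inferred from 2**12 = 4096)
    STEM_BP = 9   # hairpin stem length discussed in the text
    for label, k in (("A/T only", 2), ("A/T/G/C", 4)):
        space = sequence_space(k, LENGTH)
        p_stem = random_stem_probability(k, STEM_BP)
        print(f"{label}: sequence space = {space:.3e}, "
              f"P({STEM_BP} bp stem) = {p_stem:.3e}")
```

Under these assumptions, the two-base system has a far smaller sequence space and a far higher chance of a random 9 bp stem than the four-base system.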

Lackey Lab Journal Club – Splicing/RNA structure/Breast Cancer

Written by Luke Hatfield (research technician)

We discussed a recent publication from the Goodarzi lab describing a structural motif bound by SNRPA1 that regulates a network of transcripts important for cancer metastasis. Fish et al. showed that cancer cell metastasis and invasion can be propagated through alternative splicing of mRNA. This propagation depends on specific structural elements within RNAs (PLEC and ERRFI1) that cause aberrant splicing, are targeted by SNRPA1, and are upregulated in highly metastatic breast cancer cells. They performed numerous assays to investigate existing data sets and verify experimental data. The paper was rigorous in its experimental procedures, which were well designed to address the specific question being posed at every point. This was especially true in experiments like their RNA co-precipitation and transwell assay experiment: a simple and effective test to determine the invasion potential of highly metastatic cells. This experiment may seem simple, but the data it provided allowed them to move forward with their knockdown experiment in mice, where they got to the crux of the disease-relevance aspect of the paper. They showed with relative confidence that SNRPA1 has a non-canonical role in regulating alternative splicing of RNA, and thereby the pathways that drive metastasis in breast and lung cancer. The only metric by which the analysis falls short is in experimental RNA structure research. More complex experiments with DMS-seq and other mutational profiling could go a long way in developing new theories about what isoforms are contributing to the increased activity of SNRPA1.

Lackey Lab Journal Club – circular RNAs and cancer

Week 15 (5/21/21) – We discussed research from the Pedersen laboratory on the biological role of circular RNAs (circRNAs) and their role in cancer. In this publication, Okholm et al. investigated circRNA profiles and circRNA-RBP interactions in several different cancer cell lines. They were able to identify ~160 highly expressed circRNAs in the HepG2 and K562 cell lines and characterize their similarities and genomic properties. These circRNAs all contained high coverage of RBP binding sites within themselves and their flanking introns. Moreover, each circRNA was more highly expressed than its full-length transcript counterpart. Additionally, Okholm et al. characterized the effects of knocking down a circRNA, circCDYL, and RBPs in bladder cancer cell lines. A knockdown of circCDYL resulted in increased expression of key cancer genes. They also found that, because circCDYL acts as a sponge for the RNA binding protein GRWD1, circCDYL depletion counteracts the effects of GRWD1 depletion. In conclusion, the work by Okholm et al. highlights that no special sequencing methods are needed for circRNA analyses and that circRNAs may play a bigger role in cancer than once thought.

Lackey Lab Journal Club – Nascent RNA structure

Week 15 (5/7/21) – We discussed a recent publication from the Bentley lab on nascent RNA structure. Saldi and colleagues took advantage of the tNET-seq protocol, which uses RNA polymerase immunoprecipitation to select for newly synthesized RNA, to ask questions about the structure of nascent RNAs. They developed tNET-structure-seq by combining nascent RNA selection with enzymatic and chemical structure probing to generate structure data on precursor RNAs. Interestingly, they found that a subset of well-spliced junctions have steep differences in the amount of structure between the intronic and exonic regions. Structure could be an important element of defining splice junctions. Much of the highly structured nature of introns is due to Alu elements, which are often present as inverted repeats that can form double-stranded RNA. Structure is an inherent aspect of RNA. One question in the RNA field is what the impact of transcription speed is on RNA folding, especially in vivo. Saldi, et al., use an RNA Pol II mutant that is about 3 times slower than normal to understand what altering the transcription rate does to nascent RNA across the transcriptome. This slower transcription results in higher amounts of structure overall. One surprising take-away is that many genes are actually spliced more efficiently at slower speeds. The ability of RNA Pol II speed to influence transcript structure also provides a mechanism for linking upstream effects on transcription to downstream post-transcriptional effects.