Computational identification of signals predictive for nuclear RNA exosome degradation pathway targeting

Mengjun Wu, Manfred Schmid, Torben Heick Jensen, Albin Sandelin*

*Corresponding author for this work

Publication: Contribution to journal › Journal article › Research › Peer-reviewed


Abstract

The RNA exosome degrades transcripts in the nucleoplasm of mammalian cells. Its substrate specificity is mediated by two adaptors: the 'nuclear exosome targeting (NEXT)' complex and the 'poly(A) exosome targeting (PAXT)' connection. Previous studies have revealed some DNA/RNA elements that differ between the two pathways, but how informative these features are for distinguishing pathway targeting, or whether additional genomic features that are informative for such classifications exist, is unknown. Here, we leverage the wealth of available genomic data and develop machine learning models that predict exosome targets and subsequently rank the features the models use by their predictive power. As expected, features around transcript end sites were most predictive; specifically, the lack of canonical 3′ end processing was highly predictive of NEXT targets. Other associated features, such as promoter-proximal G/C content and 5′ splice sites, were informative, but only for distinguishing NEXT and not PAXT targets. Finally, we discovered predictive features not previously associated with exosome targeting, in particular RNA helicase DDX3X binding sites. Overall, our results demonstrate that nucleoplasmic exosome targeting is to a large degree predictable, and our approach can assess the predictive power of previously known and new features in an unbiased way.

Original language: English
Article number: lqac071
Journal: NAR Genomics and Bioinformatics
Volume: 4
Issue number: 3
Number of pages: 18
DOI
Status: Published - 2022
