Big data and deep learning for RNA biology

  • Ching, T. et al. Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface 15, 20170387 (2018).

  • Sun, C., Shrivastava, A., Singh, S. & Gupta, A. Revisiting unreasonable effectiveness of data in deep learning era. In Proc. IEEE International Conference on Computer Vision. 843–852 (IEEE, 2017).

  • Stephens, Z. D. et al. Big data: astronomical or genomical? PLoS Biol. 13, e1002195 (2015).

  • Stark, R., Grzelak, M. & Hadfield, J. RNA sequencing: the teenage years. Nat. Rev. Genet. 20, 631–656 (2019).

  • LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  • He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 770–778 (IEEE, 2016).

  • Brown, T. et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 33, 1877–1901 (2020).


  • Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  • Ule, J. et al. CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212–1215 (2003).

  • Hafner, M. et al. CLIP and complementary methods. Nat. Rev. Methods Prim. 1, 20 (2021).

  • Dominissini, D. et al. Topology of the human and mouse m6A RNA methylomes revealed by m6A-seq. Nature 485, 201–206 (2012).

  • Rouskin, S., Zubradt, M., Washietl, S., Kellis, M. & Weissman, J. S. Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505, 701–705 (2014).

  • Hwang, B., Lee, J. H. & Bang, D. Single-cell RNA sequencing technologies and bioinformatics pipelines. Exp. Mol. Med. 50, 1–14 (2018).

  • Roscher, R., Bohn, B., Duarte, M. F. & Garcke, J. Explainable machine learning for scientific insights and discoveries. IEEE Access 8, 42200–42216 (2020).

  • ENCODE Project Consortium. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

  • Leipzig, J., Nüst, D., Hoyt, C. T., Ram, K. & Greenberg, J. The role of metadata in reproducible computational research. Patterns 2, 100322 (2021).

  • Barrett, T. et al. NCBI GEO: archive for functional genomics data sets—update. Nucleic Acids Res. 41, D991–D995 (2012).

  • Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–329 (2015).

  • Leinonen, R., Sugawara, H. & Shumway, M. The sequence read archive. Nucleic Acids Res. 39, D19–D21 (2011).

  • Gonçalves, R. S. & Musen, M. A. The variable quality of metadata about biological samples used in biomedical experiments. Sci. Data 6, 190021 (2019).

  • Giles, C. B. et al. ALE: automated label extraction from GEO metadata. BMC Bioinformatics 18, 509 (2017).

  • Serna Garcia, G., Leone, M., Bernasconi, A. & Carman, M. J. GeMI: interactive interface for transformer-based Genomic Metadata Integration. Database 2022, baac036 (2022).

  • ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57 (2012).

  • Moore et al. Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 583, 699–710 (2020).

  • Rozowsky, J. et al. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 186, 1493–1511.e40 (2023).

  • Hong, E. L. et al. Principles of metadata organization at the ENCODE data coordination center. Database 2016, baw001 (2016).

  • Parkinson, H. et al. ArrayExpress–a public database of microarray experiments and gene expression profiles. Nucleic Acids Res. 35, D747–D750 (2007).

  • Burgin, J. et al. The European Nucleotide Archive in 2022. Nucleic Acids Res. 51, D121–D125 (2023).

  • Abugessaisa, I. et al. FANTOM enters 20th year: expansion of transcriptomic atlases and functional annotation of non-coding RNAs. Nucleic Acids Res. 49, D892–D898 (2021).

  • Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).

  • Hon, C.-C. et al. An atlas of human long non-coding RNAs with accurate 5′ ends. Nature 543, 199–204 (2017).

  • Shiraki, T. et al. Cap analysis gene expression for high-throughput analysis of transcriptional starting point and identification of promoter usage. Proc. Natl. Acad. Sci. USA 100, 15776–15781 (2003).

  • Ramilowski, J. A. et al. Functional annotation of human long noncoding RNAs via molecular phenotyping. Genome Res. 30, 1060–1072 (2020).

  • GTEx Consortium et al. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

  • Weinstein, J. N. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).

  • Calabrese, C. et al. Genomic basis for RNA alterations in cancer. Nature 578, 129–136 (2020).

  • Goodfellow, I., Bengio, Y. & Courville, A. Deep Learning (MIT Press, 2016).

  • Chapelle, O., Schölkopf, B. & Zien, A. Semi-Supervised Learning (Adaptive Computation and Machine Learning) (MIT Press, 2006).

  • Krishnan, R., Rajpurkar, P. & Topol, E. J. Self-supervised learning in medicine and healthcare. Nat. Biomed. Eng. 6, 1346–1352 (2022).

  • Young, J. D., Cai, C. & Lu, X. Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma. BMC Bioinformatics 18, 381 (2017).

  • Li, X. et al. Deep learning enables accurate clustering with batch effect removal in single-cell RNA-seq analysis. Nat. Commun. 11, 2338 (2020).

  • Collobert, R. & Weston, J. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proc. 25th International Conference on Machine Learning 160–167 (ACM, 2008).

  • Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proc. 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) 4171–4186 (ACL, 2019).

  • Yang, F. et al. scBERT as a large-scale pretrained deep language model for cell type annotation of single-cell RNA-seq data. Nat. Mach. Intell. 4, 852–866 (2022).

  • Theodoris, C. V. et al. Transfer learning enables predictions in network biology. Nature 618, 616–624 (2023).

  • Zhou, Z. et al. Joint masking and self-supervised strategies for inferring small molecule-miRNA associations. Mol. Ther. Nucleic Acids 35, 102103 (2024).

  • Jin, W. et al. HydRA: deep-learning models for predicting RNA-binding capacity from protein interaction association context and protein sequence. Mol. Cell 83, 2595–2611.e11 (2023).

  • Peng, X. et al. RBP-TSTL is a two-stage transfer learning framework for genome-scale prediction of RNA-binding proteins. Brief. Bioinform. 23, bbac215 (2022).

  • Xu, C. & Jackson, S. A. Machine learning and complex biological data. Genome Biol. 20, 76 (2019).

  • Tzeng, E., Hoffman, J., Zhang, N., Saenko, K. & Darrell, T. Deep domain confusion: maximizing for domain invariance. Preprint at https://arxiv.org/abs/1412.3474 (2014).

  • Ganin, Y. et al. Domain-adversarial training of neural networks. J. Mach. Learn. Res. 17, 1–35 (2016).


  • Chen, J. et al. Deep transfer learning of cancer drug responses by integrating bulk and single-cell RNA-seq data. Nat. Commun. 13, 6494 (2022).

  • Shaw, D., Chen, H. & Jiang, T. DeepIsoFun: a deep domain adaptation approach to predict isoform functions. Bioinformatics 35, 2535–2544 (2018).

  • Kelley, D. R. Cross-species regulatory sequence activity prediction. PLoS Comput. Biol. 16, e1008050 (2020).

  • Kimmel, J. C. & Kelley, D. R. Semisupervised adversarial neural networks for single-cell classification. Genome Res. 31, 1781–1793 (2021).

  • Finn, C., Abbeel, P. & Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proc. 34th International Conference on Machine Learning 1126–1135 (PMLR, 2017).

  • Snell, J., Swersky, K. & Zemel, R. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems 30 (NIPS, 2017).

  • Brbić, M. et al. MARS: discovering novel cell types across heterogeneous single-cell experiments. Nat. Methods 17, 1200–1206 (2020).

  • Qiu, Y. L., Zheng, H., Devos, A., Selby, H. & Gevaert, O. A meta-learning approach for genomic survival analysis. Nat. Commun. 11, 6350 (2020).

  • Li, Z. et al. CoraL: interpretable contrastive meta-learning for the prediction of cancer-associated ncRNA-encoded small peptides. Brief. Bioinform. 24, bbad352 (2023).

  • Cai, J., Wang, T., Deng, X., Tang, L. & Liu, L. GM-lncLoc: lncRNAs subcellular localization prediction based on graph neural network with meta-learning. BMC Genomics 24, 52 (2023).

  • Shorten, C. & Khoshgoftaar, T. M. A survey on image data augmentation for deep learning. J. Big Data 6, 1–48 (2019).

  • Hill, S. T. et al. A deep recurrent neural network discovers complex biological rules to decipher RNA protein-coding potential. Nucleic Acids Res. 46, 8105–8113 (2018).

  • Cao, Y., Geddes, T. A., Yang, J. Y. H. & Yang, P. Ensemble deep learning in bioinformatics. Nat. Mach. Intell. 2, 500–508 (2020).

  • Sagi, O. & Rokach, L. Ensemble learning: a survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 8, e1249 (2018).

  • Pan, X. & Shen, H.-B. Predicting RNA–protein binding sites and motifs through combining local and global deep convolutional neural networks. Bioinformatics 34, 3427–3436 (2018).

  • Camargo, A. P., Sourkov, V., Pereira, G. A. G. & Carazzolle, M. F. RNAsamba: neural network-based assessment of the protein-coding potential of RNA sequences. NAR Genomics Bioinform. 2, lqz024 (2020).

  • Cheng, J. et al. MMSplice: modular modeling improves the predictions of genetic variant effects on splicing. Genome Biol. 20, 48 (2019).

  • Nguyen, T. A. et al. Direct identification of A-to-I editing sites with nanopore native RNA sequencing. Nat. Methods 19, 833–844 (2022).

  • Kalkatawi, M., Magana-Mora, A., Jankovic, B. & Bajic, V. B. DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions. Bioinformatics 35, 1125–1132 (2019).

  • Zhang, T., Tang, Q., Nie, F., Zhao, Q. & Chen, W. DeepLncPro: an interpretable convolutional neural network model for identifying long non-coding RNA promoters. Brief. Bioinform. 23, bbac447 (2022).

  • Aoki, G. & Sakakibara, Y. Convolutional neural networks for classification of alignments of non-coding RNA sequences. Bioinformatics 34, i237–i244 (2018).

  • Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. & Dean, J. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems 26 (NIPS, 2013).

  • Chaabane, M., Williams, R. M., Stephens, A. T. & Park, J. W. circDeep: deep learning approach for circular RNA classification from other long non-coding RNA. Bioinformatics 36, 73–80 (2020).

  • Farhadi, F., Allahbakhsh, M., Maghsoudi, A., Armin, N. & Amintoosi, H. DiMo: discovery of microRNA motifs using deep learning and motif embedding. Brief. Bioinform. 24, bbad182 (2023).

  • Song, Z. et al. Attention-based multi-label neural networks for integrated prediction and interpretation of twelve widely occurring RNA modifications. Nat. Commun. 12, 4011 (2021).

  • Le, Q. & Mikolov, T. Distributed representations of sentences and documents. In Proc. 31st International Conference on Machine Learning 1188–1196 (PMLR, 2014).

  • Xie, W., Luo, J., Pan, C. & Liu, Y. SG-LSTM-FRAME: a computational frame using sequence and geometrical information via LSTM to predict miRNA-gene associations. Brief. Bioinform. 22, 2032–2042 (2021).

  • Hendra, C. et al. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. Nat. Methods 19, 1590–1598 (2022).

  • Leung, M. K. K., Xiong, H. Y., Lee, L. J. & Frey, B. J. Deep learning of the tissue-regulated splicing code. Bioinformatics 30, i121–i129 (2014).

  • Lusk, R. et al. Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence. Nat. Commun. 12, 1652 (2021).

  • Zhang, Z. et al. Deep-learning augmented RNA-seq analysis of transcript splicing. Nat. Methods 16, 307–310 (2019).

  • Chen, Y., Li, Y., Narayan, R., Subramanian, A. & Xie, X. Gene expression inference with deep learning. Bioinformatics 32, 1832–1839 (2016).

  • Yu, G., Zhou, G., Zhang, X., Domeniconi, C. & Guo, M. DMIL-IsoFun: predicting isoform function using deep multi-instance learning. Bioinformatics 37, 4818–4825 (2021).

  • Zhang, K., Wang, C., Sun, L. & Zheng, J. Prediction of gene co-expression from chromatin contacts with graph attention network. Bioinformatics 38, 4457–4465 (2022).

  • Han, S. et al. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information and physicochemical property. Brief. Bioinform. 20, 2009–2027 (2019).

  • Wen, M., Cong, P., Zhang, Z., Lu, H. & Li, T. DeepMirTar: a deep-learning approach for predicting human miRNA targets. Bioinformatics 34, 3781–3787 (2018).

  • Budach, S. & Marsico, A. pysster: classification of biological sequences by learning sequence and structure motifs with convolutional neural networks. Bioinformatics 34, 3035–3037 (2018).

  • Ben-Bassat, I., Chor, B. & Orenstein, Y. A deep neural network approach for learning intrinsic protein-RNA binding preferences. Bioinformatics 34, i638–i646 (2018).

  • Townshend, R. J. L. et al. Geometric deep learning of RNA structure. Science 373, 1047–1051 (2021).

  • Yan, Z., Hamilton, W. L. & Blanchette, M. Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions. Bioinformatics 36, i276–i284 (2020).

  • Xiong, H. Y. et al. The human splicing code reveals new insights into the genetic determinants of disease. Science 347, 1254806 (2015).

  • Zhang, L. et al. A deep learning model to identify gene expression level using cobinding transcription factor signals. Brief. Bioinform. 23, bbab501 (2022).

  • Jha, A., Gazzara, M. R. & Barash, Y. Integrative deep models for alternative splicing. Bioinformatics 33, i274–i282 (2017).

  • McGeary, S. E. et al. The biochemical basis of microRNA targeting efficacy. Science 366, eaav1741 (2019).

  • Hornik, K., Stinchcombe, M. & White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 2, 359–366 (1989).

  • LeCun, Y., Bottou, L., Bengio, Y. & Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

  • Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

  • Yang, C. et al. LncADeep: an ab initio lncRNA identification and functional annotation tool based on deep learning. Bioinformatics 34, 3825–3834 (2018).

  • Mateos, P. A., Zhou, Y., Zarnack, K. & Eyras, E. Concepts and methods for transcriptome-wide prediction of chemical messenger RNA modifications with machine learning. Brief. Bioinform. 24, bbad163 (2023).

  • Hinton, G. E., Osindero, S. & Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006).

  • Zhang, S. et al. A deep learning framework for modeling structural features of RNA-binding protein targets. Nucleic Acids Res. 44, e32 (2015).

  • Krizhevsky, A., Sutskever, I. & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems 25 (NIPS, 2012).

  • Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat. Biotechnol. 33, 831–838 (2015).

  • Kelley, D. R. et al. Sequential regulatory activity prediction across chromosomes with convolutional neural networks. Genome Res. 28, 739–750 (2018).

  • Sahu, B. et al. Sequence determinants of human gene regulatory elements. Nat. Genet. 54, 283–294 (2022).

  • Zrimec, J. et al. Deep learning suggests that gene expression is encoded in all parts of a co-evolving interacting gene regulatory structure. Nat. Commun. 11, 6141 (2020).

  • Avsec, Ž., Barekatain, M., Cheng, J. & Gagneur, J. Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks. Bioinformatics 34, 1261–1269 (2018).

  • Cuperus, J. T. et al. Deep learning of the regulatory grammar of yeast 5′ untranslated regions from 500,000 random sequences. Genome Res. 27, 2015–2024 (2017).

  • Xia, Z. et al. DeeReCT-PolyA: a robust and generic deep learning method for PAS identification. Bioinformatics 35, 2371–2379 (2019).

  • Zheng, X., Fu, X., Wang, K. & Wang, M. Deep neural networks for human microRNA precursor detection. BMC Bioinformatics 21, 17 (2020).

  • Leung, M. K. K., Delong, A. & Frey, B. J. Inference of the human polyadenylation code. Bioinformatics 34, 2889–2898 (2018).

  • Jaganathan, K. et al. Predicting splicing from primary sequence with deep learning. Cell 176, 535–548.e24 (2019).

  • Zeng, T. & Li, Y. I. Predicting RNA splicing from DNA sequence using Pangolin. Genome Biol. 23, 103 (2022).

  • Luo, Z., Zhang, J., Fei, J. & Ke, S. Deep learning modeling m6A deposition reveals the importance of downstream cis-element sequences. Nat. Commun. 13, 2720 (2022).

  • Yu, F. & Koltun, V. Multi-scale context aggregation by dilated convolutions. In 4th International Conference on Learning Representations https://doi.org/10.48550/arXiv.1511.07122 (2016).

  • Szegedy, C. et al. Going deeper with convolutions. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 1–9 (IEEE, 2015).

  • Zhao, Y. et al. CUP-AI-Dx: A tool for inferring cancer tissue of origin and molecular subtype using RNA gene-expression data and artificial intelligence. eBioMedicine 61, 103030 (2020).

  • Lipton, Z. C., Berkowitz, J. & Elkan, C. A critical review of recurrent neural networks for sequence learning. Preprint at https://arxiv.org/abs/1506.00019 (2015).

  • Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 404, 132306 (2020).

  • Hochreiter, S. & Schmidhuber, J. Long short-term memory. Neural Comput. 9, 1735–1780 (1997).

  • Cho, K. et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 1724–1734 (ACL, 2014).

  • Sekhon, A., Singh, R. & Qi, Y. DeepDiff: DEEP-learning for predicting DIFFerential gene expression from histone modifications. Bioinformatics 34, i891–i900 (2018).

  • Graves, A., Mohamed, A.-R. & Hinton, G. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 6645–6649 (IEEE, 2013).

  • Bretschneider, H., Gandhi, S., Deshwar, A. G., Zuberi, K. & Frey, B. J. COSSMO: predicting competitive alternative splice site selection using deep learning. Bioinformatics 34, i429–i437 (2018).

  • Grønning, A. G. B. et al. DeepCLIP: predicting the effect of mutations on protein–RNA binding with deep learning. Nucleic Acids Res. 48, 7099–7118 (2020).

  • Arefeen, A., Xiao, X. & Jiang, T. DeepPASTA: deep neural network based polyadenylation site analysis. Bioinformatics 35, 4577–4585 (2019).

  • Trabelsi, A., Chaabane, M. & Ben-Hur, A. Comprehensive evaluation of deep learning architectures for prediction of DNA/RNA sequence binding specificities. Bioinformatics 35, i269–i277 (2019).

  • Vaswani, A. et al. Attention is all you need. In Advances in Neural Information Processing Systems 30 (NIPS, 2017).

  • Dosovitskiy, A. et al. An image is worth 16×16 words: transformers for image recognition at scale. In 9th International Conference on Learning Representations https://openreview.net/forum?id=YicbFdNTTy (2021).

  • Yu, H. & Dai, Z. SANPolyA: a deep learning method for identifying Poly(A) signals. Bioinformatics 36, 2393–2400 (2020).

  • Avsec, Ž. et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat. Methods 18, 1196–1203 (2021).

  • Zhou, J.-R., Wang, X.-F., Wen, J.-Y., Shang, X.-Q. & Niu, R. Predicting circRNA-miRNA interactions utilizing transformer-based RNA sequential learning and high-order proximity preserved embedding. iScience 27, 108592 (2023).

  • Gilmer, J. et al. Neural message passing for quantum chemistry. In Proc. 34th International Conference on Machine Learning 1263–1272 (PMLR, 2017).

  • Kipf, T. N. & Welling, M. Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations https://openreview.net/forum?id=SJU4ayYgl (2017).

  • Veličković, P. et al. Graph attention networks. In 6th International Conference on Learning Representations https://openreview.net/forum?id=rJXMpikCZ (2018).

  • Wu, Z. et al. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 4–24 (2020).

  • Forster, D. T. et al. BIONIC: biological network integration using convolutions. Nat. Methods 19, 1250–1261 (2022).

  • Statello, L., Guo, C.-J., Chen, L.-L. & Huarte, M. Gene regulation by long non-coding RNAs and its biological functions. Nat. Rev. Mol. Cell Biol. 22, 96–118 (2021).

  • Bartel, D. P. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233 (2009).

  • Peng, Y. & Croce, C. M. The role of microRNAs in human cancer. Signal Transduct. Target. Ther. 1, 15004 (2016).

  • Slack, F. J. & Chinnaiyan, A. M. The role of non-coding RNAs in oncology. Cell 179, 1033–1055 (2019).

  • Ha, M. & Kim, V. N. Regulation of microRNA biogenesis. Nat. Rev. Mol. Cell Biol. 15, 509–524 (2014).

  • Agarwal, V., Bell, G. W., Nam, J. W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, e05005 (2015).

  • Cao, H., Wahlestedt, C. & Kapranov, P. Strategies to annotate and characterize long noncoding RNAs: advantages and pitfalls. Trends Genet. 34, 704–721 (2018).

  • Yuan, J. et al. NPInter v2.0: an updated database of ncRNA interactions. Nucleic Acids Res. 42, D104–D108 (2013).

  • Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).

  • Fabregat, A. et al. The Reactome Pathway Knowledgebase. Nucleic Acids Res. 46, D649–D655 (2018).

  • Kristensen, L. S. et al. The biogenesis, biology and characterization of circular RNAs. Nat. Rev. Genet. 20, 675–691 (2019).

  • Kristensen, L. S., Jakobsen, T., Hager, H. & Kjems, J. The emerging roles of circRNAs in cancer and oncology. Nat. Rev. Clin. Oncol. 19, 188–206 (2022).

  • Chen, X. et al. circRNADb: a comprehensive database for human circular RNAs with protein-coding annotations. Sci. Rep. 6, 34985 (2016).

  • Harrow, J. et al. GENCODE: the reference human genome annotation for the ENCODE project. Genome Res. 22, 1760–1774 (2012).

  • Glažar, P., Papavasileiou, P. & Rajewsky, N. circBase: a database for circular RNAs. RNA 20, 1666–1670 (2014).

  • Meyer, K. D. & Jaffrey, S. R. The dynamic epitranscriptome: N6-methyladenosine and gene expression control. Nat. Rev. Mol. Cell Biol. 15, 313–326 (2014).

  • Wiener, D. & Schwartz, S. The epitranscriptome beyond m6A. Nat. Rev. Genet. 22, 119–131 (2021).

  • Delaunay, S., Helm, M. & Frye, M. RNA modifications in physiology and disease: towards clinical applications. Nat. Rev. Genet. 25, 104–122 (2024).

  • Barbieri, I. & Kouzarides, T. Role of RNA modifications in cancer. Nat. Rev. Cancer 20, 303–322 (2020).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Linder, B. et al. Single-nucleotide-resolution mapping of m6A and m6Am throughout the transcriptome. Nat. Methods 12, 767–772 (2015).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Schaefer, M., Pollex, T., Hanna, K. & Lyko, F. RNA cytosine methylation analysis by bisulfite sequencing. Nucleic Acids Res. 37, e12 (2009).

    Article 
    PubMed 

    Google Scholar
     

  • Zhong, Z.-D. et al. Systematic comparison of tools used for m6A mapping from nanopore direct RNA sequencing. Nat. Commun. 14, 1906 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Helm, M. & Motorin, Y. Detecting RNA modifications in the epitranscriptome: predict and validate. Nat. Rev. Genet. 18, 275–291 (2017).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Hasan, M. M. et al. Deepm5C: a deep-learning-based hybrid framework for identifying human RNA N5-methylcytosine sites using a stacking strategy. Mol. Ther. 30, 2856–2867 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Tahir, M., Tayara, H. & Chong, K. T. iPseU-CNN: identifying RNA pseudouridine sites using convolutional neural networks. Mol. Ther. Nucleic Acids 16, 463–470 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Mostavi, M., Salekin, S. & Huang, Y. Deep-2’-O-Me: predicting 2’-o-methylation sites by convolutional neural networks. In 2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2394–2397 (IEEE, 2018).

  • Garalde, D. R. et al. Highly parallel direct RNA sequencing on an array of nanopores. Nat. Methods 15, 201–206 (2018).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Liu, H. et al. Accurate detection of m6A RNA modifications in native RNA sequences. Nat. Commun. 10, 4079 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Hentze, M. W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 19, 327–341 (2018).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Ray, D. et al. A compendium of RNA-binding motifs for decoding gene regulation. Nature 499, 172–177 (2013).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Licatalosi, D. D. et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469 (2008).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Hafner, M. et al. Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141, 129–141 (2010).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Van Nostrand, E. L. et al. Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat. Methods 13, 508–514 (2016).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Taliaferro, J. M. et al. RNA sequence context effects measured in vitro predict in vivo protein binding and regulation. Mol. Cell. 64, 294–306 (2016).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Sanchez de Groot, N. et al. RNA structure drives interaction with proteins. Nat. Commun. 10, 3246 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Pan, X., Rijnbeek, P., Yan, J. & Shen, H.-B. Prediction of RNA-protein sequence and structure binding preferences using deep convolutional and recurrent neural networks. BMC Genomics 19, 511 (2018).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Duvenaud, D. K. et al. Convolutional networks on graphs for learning molecular fingerprints. In Advances in Neural Information Processing Systems 28 (NIPS, 2015).

  • Sun, L. et al. Predicting dynamic cellular protein–RNA interactions by deep learning using in vivo RNA structures. Cell Res. 31, 495–516 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Hu, J., Shen, L. & Sun, G. Squeeze-and-excitation networks. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 7132–7141 (IEEE, 2018).

  • Zhou, J. et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nat. Genet. 51, 973–980 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Baek, M. et al. Accurate prediction of protein–nucleic acid complexes using RoseTTAFoldNA. Nat. Methods 21, 117–121 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Tian, B. & Manley, J. L. Alternative polyadenylation of mRNA precursors. Nat. Rev. Mol. Cell Biol. 18, 18–30 (2017).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Matlin, A. J., Clark, F. & Smith, C. W. Understanding alternative splicing: towards a cellular code. Nat. Rev. Mol. Cell Biol. 6, 386–398 (2005).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • De Sandre-Giovannoli, A. et al. Lamin a truncation in Hutchinson-Gilford progeria. Science 300, 2055 (2003).

    Article 
    PubMed 

    Google Scholar
     

  • Baralle, F. E. & Giudice, J. Alternative splicing as a regulator of development and tissue identity. Nat. Rev. Mol. Cell Biol. 18, 437–451 (2017).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Zuallaert, J. et al. SpliceRover: interpretable convolutional neural networks for improved splice site prediction. Bioinformatics 34, 4180–4188 (2018).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Derti, A. et al. A quantitative atlas of polyadenylation in five mammals. Genome Res. 22, 1173–1183 (2012).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Jan, C. H., Friedman, R. C., Ruby, J. G. & Bartel, D. P. Formation, regulation and evolution of Caenorhabditis elegans 3′ UTRs. Nature 469, 97–101 (2011).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Gao, X., Zhang, J., Wei, Z. & Hakonarson, H. DeepPolyA: a convolutional neural network approach for polyadenylation site prediction. IEEE Access. 6, 24340–24349 (2018).

    Article 

    Google Scholar
     

  • Bogard, N., Linder, J., Rosenberg, A. B. & Seelig, G. A deep neural network for predicting and engineering alternative polyadenylation. Cell 178, 91–106.e23 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Beer, M. A. & Tavazoie, S. Predicting gene expression from sequence. Cell 117, 185–198 (2004).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Zhou, J. et al. Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat. Genet. 50, 1171–1179 (2018).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Agarwal, V. & Shendure, J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 31, 107663 (2020).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Roohani, Y., Huang, K. & Leskovec, J. Predicting transcriptional outcomes of novel multigene perturbations with GEARS. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01905-6 (2023).

  • Singh, R., Lanchantin, J., Robins, G. & Qi, Y. DeepChrome: deep-learning for predicting gene expression from histone modifications. Bioinformatics 32, i639–i648 (2016).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Tasaki, S., Gaiteri, C., Mostafavi, S. & Wang, Y. Deep learning decodes the principles of differential gene expression. Nat. Mach. Intell. 2, 376–386 (2020).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Sample, P. J. et al. Human 5′ UTR design and variant effect prediction from a massively parallel translation assay. Nat. Biotechnol. 37, 803–809 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Xiang, Y. et al. Pervasive downstream RNA hairpins dynamically dictate start-codon selection. Nature 621, 423–430 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Pardi, N., Hogan, M. J., Porter, F. W. & Weissman, D. mRNA vaccines—a new era in vaccinology. Nat. Rev. Drug Discov. 17, 261–279 (2018).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Childs-Disney, J. L. et al. Targeting RNA structures with small molecules. Nat. Rev. Drug Discov. 21, 736–762 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Warner, K. D., Hajdin, C. E. & Weeks, K. M. Principles for targeting RNA with drug-like small molecules. Nat. Rev. Drug Discov. 17, 547–558 (2018).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Winkle, M., El-Daly, S. M., Fabbri, M. & Calin, G. A. Noncoding RNA therapeutics—challenges and potential solutions. Nat. Rev. Drug Discov. 20, 629–651 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Setten, R. L., Rossi, J. J. & Han, S.-P. The current state and future directions of RNAi-based therapeutics. Nat. Rev. Drug Discov. 18, 421–446 (2019).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Byron, S. A., Van Keuren-Jensen, K. R., Engelthaler, D. M., Carpten, J. D. & Craig, D. W. Translating RNA sequencing into clinical diagnostics: opportunities and challenges. Nat. Rev. Genet. 17, 257–271 (2016).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Cummings, B. B. et al. Improving genetic diagnosis in Mendelian disease with transcriptome sequencing. Sci. Transl. Med. 9, eaal5209 (2017).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Mayhew, M. B. et al. A generalizable 29-mRNA neural-network classifier for acute bacterial and viral infections. Nat. Commun. 11, 1177 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Comitani, F. et al. Diagnostic classification of childhood cancer using multiscale transcriptomics. Nat. Med. 29, 656–666 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wang, T. et al. MOGONET integrates multi-omics data using graph convolutional networks allowing patient classification and biomarker identification. Nat. Commun. 12, 3445 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Schmauch, B. et al. A deep learning model to predict RNA-Seq expression of tumours from whole slide images. Nat. Commun. 11, 3877 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Chaudhary, N., Weissman, D. & Whitehead, K. A. mRNA vaccines for infectious diseases: principles, delivery and clinical translation. Nat. Rev. Drug Discov. 20, 817–838 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Qin, S. et al. mRNA-based therapeutics: powerful and versatile tools to combat diseases. Signal Transduct. Target. Ther. 7, 166 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Jiménez-Luna, J., Grisoni, F. & Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2, 573–584 (2020).

    Article 

    Google Scholar
     

  • Vamathevan, J. et al. Applications of machine learning in drug discovery and development. Nat. Rev. Drug Discov. 18, 463–477 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wayment-Steele, H. K. et al. Deep learning models for predicting RNA degradation via dual crowdsourcing. Nat. Mach. Intell. 4, 1174–1184 (2022).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Chen, T. & Guestrin, C. XGboost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 785–794 (ACM, 2016).

  • Wong, C. UK first to approve CRISPR treatment for diseases: what you need to know. Nature 623, 676–677 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Chuai, G. et al. DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol. 19, 80 (2018).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Xiang, X. et al. Enhancing CRISPR-Cas9 gRNA efficiency prediction by data integration and deep learning. Nat. Commun. 12, 3238 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Marquart, K. F. et al. Predicting base editing outcomes with an attention-based deep learning algorithm trained on high-throughput target library screens. Nat. Commun. 12, 5114 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Mathis, N. et al. Predicting prime editing efficiency and product purity by deep learning. Nat. Biotechnol. 41, 1151–1159 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Wessels, H.-H. et al. Prediction of on-target and off-target activity of CRISPR–Cas13d guide RNAs using deep learning. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-01830-8 (2023).

  • Park, S. et al. shRNAI: a deep neural network for the design of highly potent shRNAs. Preprint at bioRxiv https://doi.org/10.1101/2024.01.09.574789 (2024).

  • Gao, D. et al. A deep learning approach to identify gene targets of a therapeutic for human splicing disorders. Nat. Commun. 12, 3332 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Abascal, F. et al. Perspectives on ENCODE. Nature 583, 693–698 (2020).

    Article 

    Google Scholar
     

  • Regev, A. et al. The Human Cell Atlas. eLife 6, e27041 (2017).

  • Lindeboom, R. G. H., Regev, A. & Teichmann, S. A. Towards a Human Cell Atlas: taking notes from the past. Trends Genet. 37, 625–630 (2021).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Deng, J. et al. Imagenet: a large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).

  • Lin, T.-Y. et al. Microsoft coco: common objects in context. In Proc. Computer Vision–ECCV 2014: 13th European Conference 740–755 (Springer, 2014).

  • Rajpurkar, P., Zhang, J., Lopyrev, K. & Liang, P. Squad: 100,000+ questions for machine comprehension of text. In Proc. 2016 Conference on Empirical Methods in Natural Language Processing 2383–2392 (ACL, 2016).

  • Wang, A. et al. GLUE: a multi-task benchmark and analysis platform for natural language understanding. In Proc. 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP 353–355 (ACL, 2018).

  • Pagès-Gallego, M. & de Ridder, J. Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling. Genome Biol. 24, 71 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Horlacher, M. et al. A systematic benchmark of machine learning methods for protein–RNA interaction prediction. Brief. Bioinform. 24, bbad307 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Huang, Z. et al. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol. 20, 202 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Sasse, A. et al. Benchmarking of deep neural networks for predicting personal gene expression from DNA sequence highlights shortcomings. Nat. Genet. 55, 2060–2064 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Huang, C. et al. Personal transcriptome variation is poorly explained by current genomic deep learning models. Nat. Genet. 55, 2056–2059 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Khan, S. A. et al. Reusability report: Learning the transcriptional grammar in single-cell RNA-sequencing data using transformers. Nat. Mach. Intell. 5, 1437–1446 (2023).

    Article 

    Google Scholar
     

  • Tan, M. & Le, Q. Efficientnet: rethinking model scaling for convolutional neural networks. In Proc. 36th International Conference on Machine Learning 6105–6114 (PMLR, 2019).

  • Thompson, N. C., Greenewald, K., Lee, K. & Manso, G. F. The computational limits of deep learning. In Ninth Computing within Limits 2023. https://doi.org/10.21428/bf6fb269.1f033948 (LIMITS, 2023).

  • Vermeulen, C. et al. Ultra-fast deep-learned CNS tumour classification during surgery. Nature 622, 842–849 (2023).

  • Bauer, W. et al. A novel 29-messenger RNA host-response assay from whole blood accurately identifies bacterial and viral infections in patients presenting to the emergency department with suspected infections: a prospective observational study. Crit. Care Med. 49, 1664–1673 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Menghani, G. Efficient deep learning: a survey on making deep learning models smaller, faster, and better. ACM Comput. Surv. 55, 1–37 (2023).

    Article 

    Google Scholar
     

  • Micikevicius, P. et al. Mixed precision training. In 6th International Conference on Learning Representations https://openreview.net/forum?id=r1gs9JgRZ (2018).

  • Jacob, B. et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2704–2713 (IEEE, 2018).

  • He, Y., Zhang, X. & Sun, J. Channel pruning for accelerating very deep neural networks. In Proc. IEEE International Conference on Computer Vision 1389–1397 (IEEE, 2017).

  • Howard, A. G. et al. Mobilenets: efficient convolutional neural networks for mobile vision applications. Preprint at https://arxiv.org/abs/1704.04861 (2017).

  • Iandola, F. N. et al SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. Preprint at https://arxiv.org/abs/1602.07360 (2016).

  • Zhang, X., Zhou, X., Lin, M. & Sun, J. Shufflenet: an extremely efficient convolutional neural network for mobile devices. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 6848–6856 (IEEE, 2018).

  • Tan, M. & Le, Q. EfficientNetV2: Smaller models and faster training. In Proc. 38th International Conference on Machine Learning 10096–10106 (PMLR, 2021).

  • Penzar, D. et al. LegNet: a best-in-class deep learning model for short DNA regulatory regions. Bioinformatics 39, btad457 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Baltrušaitis, T., Ahuja, C. & Morency, L.-P. Multimodal machine learning: a survey and taxonomy. IEEE Trans. Pattern Anal. Mach. Intell. 41, 423–443 (2018).

    Article 
    PubMed 

    Google Scholar
     

  • Ashuach, T. et al. MultiVI: deep generative model for the integration of multimodal data. Nat. Methods 20, 1222–1231 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Long, D. et al. Potent effect of target structure on microRNA function. Nat. Struct. Mol. Biol. 14, 287–294 (2007).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Wang, X.-W., Liu, C.-X., Chen, L.-L. & Zhang, Q. C. RNA structure probing uncovers RNA structure-dependent biological functions. Nat. Chem. Biol. 17, 755–766 (2021).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Mortimer, S. A., Kidwell, M. A. & Doudna, J. A. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 15, 469–479 (2014).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Sato, K., Akiyama, M. & Sakakibara, Y. RNA secondary structure prediction using deep learning with thermodynamic integration. Nat. Commun. 12, 941 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Singh, J., Hanson, J., Paliwal, K. & Zhou, Y. RNA secondary structure prediction using an ensemble of two-dimensional deep neural networks and transfer learning. Nat. Commun. 10, 5407 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wang, W. et al. trRosettaRNA: automated prediction of RNA 3D structure with transformer network. Nat. Commun. 14, 7266 (2023).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Guidotti, R. et al. A survey of methods for explaining black box models. ACM Comput. Surv. (CSUR). 51, 1–42 (2018).

    Article 

    Google Scholar
     

  • Lundberg, S. M. & Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems Vol. 30. https://papers.nips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf (Curran Associates, Inc., 2017).

  • Janssens, J. et al. Decoding gene regulation in the fly brain. Nature 601, 630–636 (2022).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • de Almeida, B. P., Reiter, F., Pagani, M. & Stark, A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat. Genet. 54, 613–624 (2022).

    Article 
    PubMed 

    Google Scholar
     

  • Simonyan, K., Vedaldi, A. & Zisserman, A. Deep inside convolutional networks: visualising image classification models and saliency maps. In Proceedings of the International Conference on Learning Representations. https://doi.org/10.48550/arXiv.1312.6034 (2014)

  • Shrikumar, A., Greenside, P., Shcherbina, A. & Kundaje, A. Not just a black box: learning important features through propagating activation differences. Preprint at https://arxiv.org/abs/1605.01713 (2016).

  • Sundararajan, M., Taly, A. & Yan, Q. Axiomatic attribution for deep networks. In Proc. 34th International Conference on Machine Learning 3319–3328 (PMLR, 2017).

  • Bach, S. et al. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10, e0130140 (2015).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Shrikumar, A., Greenside, P. & Kundaje, A. Learning important features through propagating activation differences In International Conference on Machine Learning 3145–3153 (PMLR, 2017).

  • Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1, 206–215 (2019).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Bommasani R., et al. On the opportunities and risks of foundation models. Preprint at https://arxiv.org/abs/2108.07258 (2021).

  • Bubeck, S. et al. Sparks of artificial general intelligence: early experiments with gpt-4. Preprint at https://arxiv.org/abs/2303.12712 (2023).

  • OpenAI, et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

  • Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. Improving language understanding by generative pre-training. Preprint at https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (2018).

  • Ji, Y., Zhou, Z., Liu, H. & Davuluri, R. V. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37, 2112–2120 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Zhang, D. et al. DNAGPT: a generalized pretrained tool for multiple DNA sequence analysis tasks. Preprint at https://www.biorxiv.org/content/10.1101/2023.07.11.548628v1 (2023).

  • Celaj, A. et al. An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. Preprint at https://www.biorxiv.org/content/10.1101/2023.09.20.558508v1 (2023).

  • Chen, J. et al. Interpretable RNA foundation model from unannotated data for highly accurate RNA structure and function predictions. Preprint at https://www.biorxiv.org/content/10.1101/2022.08.06.503062v1.full (2022).

  • Badia-i-Mompel, P. et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat. Rev. Genet. 24, 739–754 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Cha, J. & Lee, I. Single-cell network biology for resolving cellular heterogeneity in human diseases. Exp. Mol. Med. 52, 1798–1808 (2020).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Sirugo, G., Williams, S. M. & Tishkoff, S. A. The missing diversity in human genetic studies. Cell 177, 26–31 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Consortium GT. The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369, 1318–1330 (2020).

    Article 

    Google Scholar
     

  • Spratt, D. E. et al. Racial/ethnic disparities in genomic sequencing. JAMA Oncol. 2, 1070–1074 (2016).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Xu, J. et al. Algorithmic fairness in computational medicine. EBioMedicine 84, 104250 (2022).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Sharma, S. et al. Data augmentation for discrimination prevention and bias disambiguation. In Proc. AAAI/ACM Conference on AI, Ethics, and Society; 358–364 (ACM, 2020).

  • Investigators AoURP. The “All of Us” research program. New Engl. J. Med. 381, 668–676 (2019).

    Article 

    Google Scholar
     

  • Gürsoy, G. et al. Functional genomics data: privacy risk assessment and technological mitigation. Nat. Rev. Genet. 23, 245–258 (2022).

    Article 
    PubMed 

    Google Scholar
     

  • Lunshof, J. E., Chadwick, R., Vorhaus, D. B. & Church, G. M. From genetic privacy to open consent. Nat. Rev. Genet. 9, 406–411 (2008).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Shokri, R. & Shmatikov, V. Privacy-preserving deep learning. In Proc. 22nd ACM SIGSAC Conference on Computer and Communications Security, 1310–1321 (ACM, 2015).

  • Wan, Z. et al. Sociotechnical safeguards for genomic data privacy. Nat. Rev. Genet. 23, 429–445 (2022).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Gymrek, M., McGuire, A. L., Golan, D., Halperin, E. & Erlich, Y. Identifying personal genomes by surname inference. Science 339, 321–324 (2013).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Acar, A., Aksu, H., Uluagac, A. S. & Conti, M. A survey on homomorphic encryption schemes: theory and implementation. ACM Comput. Surv. 51, 1–35 (2018).

    Article 

    Google Scholar
     

  • Gilad-Bachrach, R. et al. Cryptonets: applying neural networks to encrypted data with high throughput and accuracy. In International Conference on Machine Learning, PMLR, 201–210 (PMLR, 2016).

  • Konečný, J. et al. Federated learning: strategies for improving communication efficiency. Preprint at https://arxiv.org/abs/1610.05492 (2016).

  • Rieke, N. et al. The future of digital health with federated learning. NPJ Digit. Med. 3, 119 (2020).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Dayan, I. et al. Federated learning for predicting clinical outcomes in patients with COVID-19. Nat. Med. 27, 1735–1743 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Warnat-Herresthal, S. et al. Swarm learning for decentralized and confidential clinical machine learning. Nature 594, 265–270 (2021).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Wang, H. et al. Scientific discovery in the age of artificial intelligence. Nature 620, 47–60 (2023).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Katz, K. et al. The Sequence Read Archive: a decade more of explosive growth. Nucleic Acids Res. 50, D387–D390 (2022).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Hudson, T. J. et al. International network of cancer genome projects. Nature 464, 993–998 (2010).

    Article 
    CAS 
    PubMed 

    Google Scholar
     

  • Ko, G. et al. KoNA: Korean Nucleotide Archive as a new data repository for nucleotide sequence data. Genomics Proteomics Bioinformatics, qzae017 https://doi.org/10.1093/gpbjnl/qzae017 (2024).

  • Lee, B. et al. Introduction of the Korea BioData Station (K-BDS) for sharing biological data. Genomics Inform. 21, e12 (2023).

    Article 
    PubMed 
    PubMed Central 

    Google Scholar
     

  • Zou, Q., Xing, P., Wei, L. & Liu, B. Gene2vec: gene subsequence embedding for prediction of mammalian N(6)-methyladenosine sites from mRNA. Rna 25, 205–218 (2019).

    Article 
    CAS 
    PubMed 
    PubMed Central 

    Google Scholar
     

Source link: https://www.nature.com/articles/s12276-024-01243-w
