Modern multi-omics datasets require substantial programming expertise and practical knowledge to analyse. This skill barrier makes it difficult for many researchers to apply current statistical best practices. MultiScholaR is an R package that addresses this challenge by providing a novel pipeline for comprehensive multi-omics analysis, covering both single-omic analyses (e.g. transcriptomics, proteomics, phosphoproteomics and metabolomics datasets) and integrative multi-omics analysis. Through well-documented workflow templates, researchers can systematically apply best practices to every step of an integrative multi-omics analysis.
MultiScholaR implements stringent quality control measures for multi-omics analysis by incorporating criteria such as false discovery rate thresholds, filtering criteria, and limits on missing values across samples. It integrates several sophisticated analytical tools: the IQ tool for peptide-to-protein quantitative data summarization¹, RUVIII-C for removing unwanted variation², and edgeR³ and/or limma⁴ for sample normalization and linear modelling. Pathway analysis can be performed either using user-supplied annotations via clusterProfiler⁵ or through automated analysis with gProfiler2⁶. Multivariate and integrative multi-omics analyses are implemented using MOFA+⁷ and mixOmics⁸.
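As an illustration of the kind of quality control criteria described above, the following base R sketch filters features by their proportion of missing values per group and then applies a false discovery rate threshold. It is a minimal sketch under stated assumptions, not MultiScholaR's own code: the function name `filter_by_missingness`, the thresholds, the toy abundance matrix and the p-values are all hypothetical.

```r
# Hypothetical helper (not part of the MultiScholaR API).
# Keep a feature (row) if, in at least one experimental group, the
# proportion of missing values does not exceed `max_missing`.
filter_by_missingness <- function(abundance, groups, max_missing = 0.5) {
  stopifnot(ncol(abundance) == length(groups))
  keep <- apply(abundance, 1, function(x) {
    prop_missing <- tapply(is.na(x), groups, mean)
    any(prop_missing <= max_missing)
  })
  abundance[keep, , drop = FALSE]
}

# Toy abundance matrix: 3 features x 4 samples in two groups (made-up values).
abundance <- matrix(
  c(10, 11, NA, NA,
    NA, NA, NA,  9,
     8, NA,  7,  7),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("featA", "featB", "featC"), paste0("S", 1:4))
)
groups <- c("ctrl", "ctrl", "case", "case")

filtered <- filter_by_missingness(abundance, groups, max_missing = 0.4)

# Made-up p-values for the retained features, adjusted with the
# Benjamini-Hochberg procedure and thresholded at an assumed FDR of 0.05.
p_values <- c(featA = 0.001, featC = 0.04)
q_values <- p.adjust(p_values, method = "BH")
significant <- names(q_values)[q_values < 0.05]
```

In a real analysis the thresholds would be chosen per dataset and the p-values would come from the linear modelling step rather than being supplied by hand.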
MultiScholaR's architecture is built from modular, object-oriented components, which eases the integration of new tools as they emerge. Comprehensive, documented workbooks guide users through each analytical step, facilitating reproducibility and enabling public sharing of analyses. We demonstrate the pipeline's capabilities through the analysis of published data examining differences between motor neurons (iMNs) differentiated from induced pluripotent stem cells (iPSCs) derived from subjects with spinal muscular atrophy type 1, subjects with amyotrophic lateral sclerosis, and non-diseased controls⁹.
By streamlining complex multi-omics analyses, MultiScholaR makes advanced analytical techniques accessible to researchers across all levels of programming expertise. The complete library and step-by-step tutorial will be available as an R package via https://github.com/APAF-bioinformatics/MultiScholaR
1) PMID: 31909781
2) PMID: 32732981
3) PMID: 25605792
4) PMID: 39844453
5) PMID: 34557778
6) PMID: 33564394
7) PMID: 32393329
8) PMID: 29099853
9) PMID: 36631473