1 Introduction

OmnipathR is an R package built to provide easy access to the data stored in the OmniPath webservice (Türei, Korcsmáros, and Saez-Rodriguez 2016):

http://omnipathdb.org/

The webservice implements a simple REST-style API. The package retrieves the data by making HTTP requests, so a fast Internet connection is required for proper use of OmnipathR.

1.1 Query types

OmnipathR can retrieve five different types of data:

  • Interactions: protein-protein interactions organized in different datasets:
    • omnipath: the OmniPath data as defined in the original publication (Türei, Korcsmáros, and Saez-Rodriguez 2016) and collected from different databases.
    • pathwayextra: activity flow interactions without literature references.
    • kinaseextra: enzyme-substrate interactions without literature references.
    • ligrecextra: ligand-receptor interactions without literature references.
    • tfregulons: transcription factor (TF)-target interactions from DoRothEA (Garcia-Alonso et al. 2019).
    • tf-miRNA: transcription factor-miRNA interactions.
    • miRNA-target: miRNA-mRNA interactions.
    • lncRNA-mRNA: lncRNA-mRNA interactions.
  • Post-translational modifications (PTMs): enzyme-substrate reactions, provided in a form very similar to the interactions above. The PTM-related databases integrated in OmniPath include Phospho.ELM (Dinkel et al. 2010) and PhosphoSitePlus (Hornbeck et al. 2014).

  • Complexes: access to a comprehensive database of more than 22000 protein complexes. These data come from different resources, such as CORUM (Giurgiu et al. 2018) and Hu.map (Drew et al. 2017).

  • Annotations: a large variety of annotations about proteins and complexes. These data come from dozens of databases covering different topics, such as the Topology Data Bank of Transmembrane Proteins (TOPDB) (Dobson et al. 2014) and ExoCarta (Keerthikumar et al. 2016), a database of proteins identified in exosomes in multiple organisms.

  • Intercell: information on the roles of proteins in inter-cellular signaling, for instance whether a protein is a ligand, a receptor, an extracellular matrix (ECM) component, etc. These data do not come from original sources; we combined them from several databases. The source databases, such as CellPhoneDB (Vento-Tormo et al. 2018) and Receptome (Ben-Shlomo et al. 2003), are also referenced for each record.
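
To make the idea of a REST-style request more concrete, the following minimal sketch composes a query URL for one of the five query types using base R only. It is an illustration, not the package's implementation: the helper `build_query_url` is hypothetical, and the URL path names and the `datasets` parameter are assumptions about the webservice. Only the base URL is taken from the text above.

```r
# Hypothetical helper (not part of OmnipathR): compose a REST-style query URL.
# The path names and the `datasets` parameter below are assumptions about the
# webservice, used only for illustration.
build_query_url <- function(query_type, params = list(),
                            base_url = "http://omnipathdb.org") {
  # Assumed mapping from the five query types to URL paths.
  paths <- c(
    interactions = "interactions",
    ptms         = "ptms",
    complexes    = "complexes",
    annotations  = "annotations",
    intercell    = "intercell"
  )
  if (!query_type %in% names(paths)) {
    stop("Unknown query type: ", query_type)
  }
  url <- paste0(base_url, "/", paths[[query_type]])
  if (length(params) > 0) {
    # Collapse multi-valued parameters with commas, then join key=value pairs.
    values <- vapply(params, function(v) paste(v, collapse = ","), character(1))
    pairs <- paste0(names(params), "=", values)
    url <- paste0(url, "?", paste(pairs, collapse = "&"))
  }
  url
}

# Example: request two of the interaction datasets listed above.
url <- build_query_url(
  "interactions",
  params = list(datasets = c("omnipath", "pathwayextra"))
)
print(url)
```

In practice OmnipathR builds and sends such requests itself, so a sketch like this is only useful for understanding what a query might look like on the wire.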

Figure 1 shows an overview of the resources featured in OmniPath. For more detailed information about the original data sources integrated in OmniPath, please visit: