Authors: Arsenij Ustjanzew [aut, cre, cph] (https://orcid.org/0000-0002-1014-4521), Federico Marini [aut] (https://orcid.org/0000-0003-3252-7758)
Version: 1.12.0
Compiled date: 2024-04-30
License: AGPL-3 + file LICENSE

Introduction and scope

Intuitive visualization and interactive exploration of multidimensional cancer genomics data sets are essential to the field of cancer genomics. The cBioPortal for Cancer Genomics is an open-access, open-source tool that can integrate different types of alterations with clinical data. “The goal of cBioPortal is to significantly lower the barriers between complex genomic data and cancer researchers by providing rapid, intuitive, and high-quality access to molecular profiles and clinical attributes from large-scale cancer genomics projects, and therefore to empower researchers to translate these rich data sets into biologic insights and clinical applications.” (Read more about cBioPortal for Cancer Genomics here.) cBioPortal can also be installed as a private instance for analyzing your own data. Data uploaded to such an instance must follow specific file formats. Although these specifications are documented in detail here, creating such files is difficult for medical professionals or users without technical experience and is often very time-consuming.

The R package cbpManager provides an R Shiny application that facilitates the generation of files suitable for import into cBioPortal for Cancer Genomics. It enables the user to manage and edit clinical data and to maintain new patient data over time.

This tutorial gives an overview of the functionality of the Shiny application and explains how to create cancer studies and edit their metadata, upload mutation data, and create and edit clinical patient data, sample data, and timeline data.

Installation

cbpManager is a stand-alone R package, so a user can start and use the application locally without implementing it in a larger system context.

A local installation of the latest version of R and RStudio is required.

Alternatively, cbpManager can be installed using Docker. This has the advantage that a permanent Shiny server instance of the application runs as a container and can thus be integrated into a global system context.

Installation of the R package

The package can be installed with the remotes package:

remotes::install_github("arsenij-ust/cbpManager")

cbpManager uses the validateData.py script from cBioPortal for Cancer Genomics inside the application, which allows the user to validate the created files. For this purpose, a conda environment is installed. To avoid long loading times while using the application, we can set up the conda environment beforehand with the function cbpManager::setupConda_cbpManager().

After successful installation, the application is started with the following command (a browser window or the RStudio viewer with the application should open):

cbpManager::cbpManager()

The installation was successful if the application starts working.

A study to be loaded into cBioPortal essentially consists of a directory in which all data files are located (see here). It is common to store the individual study directories in one parent directory called, e.g., “study”. If you already have a cBioPortal instance installed and such a folder containing study subfolders, you should provide its path when starting the application:

cbpManager::cbpManager(
  studyDir="path/to/study",
  logDir="path/to/loggingDirectory"
)

Now you can select your existing studies in the dropdown menu.

Optionally, you can pass further parameters to the cbpManager::cbpManager() function; these are forwarded to shiny::runApp, e.g. host or port.

Attention: If you do not pass a path to the application before starting it, the “study” directory in the installed R package will be used by default. If the package is reinstalled at a later time, all studies and files in the folder will be lost. Thus it is advisable to always pass a path (e.g. of the working directory).
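The advice above can be sketched as a small launch script. It is only an illustration: the folder names "study" and "logs" and the port number are assumptions, not defaults of the package. The script creates both folders in the working directory and starts the application only in an interactive session where cbpManager is installed.

```r
# Hypothetical locations inside the current working directory.
study_dir <- file.path(getwd(), "study")
log_dir <- file.path(getwd(), "logs")

# Create both directories if they do not exist yet.
for (d in c(study_dir, log_dir)) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Launch only in an interactive session with cbpManager installed.
if (interactive() && requireNamespace("cbpManager", quietly = TRUE)) {
  cbpManager::cbpManager(
    studyDir = study_dir,
    logDir = log_dir,
    port = 3838  # assumed free port; forwarded to shiny::runApp
  )
}
```

Because the studies now live outside the installed package, reinstalling cbpManager does not touch them.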

Docker deployment

Running cbpManager as a Docker container has the advantage that the application can be integrated into a larger system context and accessed by multiple users. In addition, you can integrate authentication with ShinyProxy. More information about installing cbpManager using Docker can be found on this page LINK.

Functionality

File naming convention

For cbpManager to recognize the files of a study, they must be named as follows:

  • data_clinical_patient.txt (Clinical Data)

  • data_clinical_sample.txt (Clinical Data)

  • data_mutations_extended.txt (Mutation Data)

  • meta_study.txt (Cancer Study)

  • meta_clinical_patient.txt (Clinical Data)

  • meta_clinical_sample.txt (Clinical Data)

  • meta_mutations_extended.txt (Mutation Data)

Optional files:

  • data_timeline_surgery.txt / meta_timeline_surgery.txt
  • data_timeline_status.txt / meta_timeline_status.txt
  • data_timeline_treatment.txt / meta_timeline_treatment.txt

Further custom timeline tracks should follow the same naming pattern, e.g.:

data_timeline_<custom>.txt / meta_timeline_<custom>.txt

For further details see File Formats and the ‘testpatient’ study in this package under ‘inst/study/’.
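The naming convention can be checked with a few lines of base R before a study folder is opened in the application. The helper check_study_files() below is hypothetical and not part of cbpManager; it only compares file names against the list above and collects any timeline files that match the data_timeline_/meta_timeline_ pattern.

```r
# File names expected by the naming convention listed above.
expected_files <- c(
  "data_clinical_patient.txt", "data_clinical_sample.txt",
  "data_mutations_extended.txt", "meta_study.txt",
  "meta_clinical_patient.txt", "meta_clinical_sample.txt",
  "meta_mutations_extended.txt"
)

# check_study_files() is a hypothetical helper, not part of cbpManager.
# It returns the expected files that are missing and the timeline files found.
check_study_files <- function(study_path) {
  present <- list.files(study_path)
  timeline <- grep("^(data|meta)_timeline_.+\\.txt$", present, value = TRUE)
  list(
    missing = setdiff(expected_files, present),
    timeline = sort(timeline)
  )
}
```

Calling check_study_files("path/to/study/<study ID>") returns a list whose missing element is empty when all expected files are present.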

Editing studies:

On the “Study Metadata” page, new studies can be created and existing studies can be loaded.

Creating a new study:

To create a new study, you must first create the study's metadata. This is done in the box “Add new study” on the right.

After clicking the Add study button, the study is created in a new folder inside the defined study directory, and the file meta_study.txt is created in it. This folder is named after the value entered in the Add ID of cancer study field. The metadata of an existing study can be changed by specifying its ID.
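For orientation, a meta_study.txt file is a plain-text list of "key: value" lines. The sketch below writes such a file by hand in base R, which is roughly what the application does for us. The field names are our assumption based on the cBioPortal file format documentation, and the study ID, cancer type, and texts are made up; the file is written to a temporary directory so that no real study is touched.

```r
# Hypothetical study ID; the study folder is named after it.
study_id <- "example_study"
study_path <- file.path(tempdir(), "study", study_id)
dir.create(study_path, recursive = TRUE, showWarnings = FALSE)

# Field names assumed from the cBioPortal file format documentation;
# all values are made-up examples.
meta <- c(
  type_of_cancer = "mixed",
  cancer_study_identifier = study_id,
  name = "Example study",
  description = "Made-up study for illustration",
  add_global_case_list = "true"
)

# Write one "key: value" line per metadata field.
meta_file <- file.path(study_path, "meta_study.txt")
writeLines(paste0(names(meta), ": ", meta), meta_file)
```

In practice the application writes this file itself; the sketch only shows what to expect when opening it in a text editor.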

The “cancer type” can be entered either in the drop-down menu Select the cancer type or alternatively in the expanded table below the metadata fields:

Loading an existing study:

To work with the data of a study, the study must first be loaded. To do so, select the study in the drop-down menu Select ID of cancer study on the left side of the “Study Metadata” page. After pressing the Load study button, loading is confirmed by the appearance of the metadata table:

Editing patient data:

The “Patient” page is used to edit clinical patient data. In the upper area, it has a “Description” box containing important information on filling in the patient data as well as instructions for handling, and a “Sample from cBioPortal” box with an exemplary representation of the patient data in cBioPortal (both boxes are collapsed in the following image for better clarity). The “Patient manager” area contains several function buttons and a central table with the patient information. The first three light-blue rows must contain the short name, the long name, and the data type of each column. Each additional row defines a patient.
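The table layout described above can be pictured as a small data frame. This is only a sketch of the layout, not the internal representation used by cbpManager: the attributes PATIENT_ID and AGE and all values are made-up examples, and STRING and NUMBER are assumed to be valid data types as in the cBioPortal clinical data format.

```r
# Hypothetical attributes; all values are made up for illustration.
# Rows 1-3: short name, long name, data type. Rows 4+: one patient each.
patient_table <- data.frame(
  PATIENT_ID = c("Patient ID", "Patient identifier", "STRING", "P001", "P002"),
  AGE = c("Age", "Age at diagnosis in years", "NUMBER", "63", "57"),
  stringsAsFactors = FALSE
)

header_rows <- patient_table[1:3, ]   # short name, long name, data type
patients <- patient_table[-(1:3), ]   # one row per patient
```

Splitting the table this way makes it clear that a new patient is added as a new row below the three header rows, while a new attribute adds a column that needs all three header entries.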

Add new attributes (columns):

By clicking the Add column(s) button, new attributes can be added to the table. You can choose from predefined attributes or create a user-defined attribute: