psygenet2r 1.10.0
The psygenet2r package contains functions to query PsyGeNET [1], a resource on psychiatric diseases and their genes. The same queries can be run through the PsyGeNET web tool. However, the psygenet2r package goes a step further by including analysis and visualization functions to study psychiatric diseases, their genes and disease comorbidities, as well as the tissues and anatomical structures in which the genes are expressed. Special emphasis is placed on visualization of the results (not available in the web tool), with a variety of representation formats such as networks, heatmaps and barplots (Table 3).
In recent years there has been a growing interest in the genetics of psychiatric disorders, leading to a concomitant increase in the number of publications that report these studies [2]. However, understanding of the cellular and molecular mechanisms leading to psychiatric diseases is still limited, which has restricted the application of this wealth of data in clinical practice. This situation also applies to psychiatric comorbidities. Some of the factors that explain the current situation are the heterogeneity of the information about psychiatric disorders, its fragmentation into knowledge silos, and the lack of resources that collect this wealth of data, integrate it, and supply the information to the community in an intuitive, open-access manner. PsyGeNET has been developed to fill this gap. psygenet2r has been developed to facilitate statistical analysis of PsyGeNET data, allowing its integration with other packages available in R to develop data analysis workflows.
PsyGeNET is a resource for the exploratory analysis of psychiatric diseases and their associated genes. The second release of PsyGeNET (version 2.0) contains updated information on depression, bipolar disorder, alcohol use disorders and cocaine use disorders, and has been expanded to cover other psychiatric diseases of interest: schizophrenia, substance-induced depressive disorder, substance-induced psychosis and cannabis use disorder (Table 1). PsyGeNET allows the exploration of the molecular basis of psychiatric disorders by providing a comprehensive set of genes associated with each disease. Moreover, it allows the analysis of the molecular mechanisms underlying psychiatric disease comorbidities.
Long Name | Short Name | Acronym |
---|---|---|
Alcohol use disorders | Alcohol UD | AUD |
Bipolar disorders and related disorders | Bipolar disorder | BD |
Depressive disorders | Depression | DEP |
Schizophrenia spectrum and other psychotic disorders | Schizophrenia | SCHZ |
Cocaine use disorders | Cocaine UD | CUD |
Substance induced depressive disorder | SI-Depression | SI-DEP |
Cannabis use disorders | Cannabis UD | CanUD |
Substance induced psychosis | SI-Psychosis | SI-PSY |
The PsyGeNET database is the result of extracting data from the literature by text mining with BeFree [3], followed by manual curation by domain experts. A team of 22 experts participates as curators of the database. The current version of PsyGeNET (version 2.0) contains 3,771 associations between 1,549 genes and 117 psychiatric disease concepts.
With the psygenet2r package, the user can submit queries to PsyGeNET from R, perform a variety of analyses on the data, and visualize the results through different types of graphical representations.
The tasks that can be performed with the psygenet2r package are the following:
In the following sections the specific functions that can be used to address each of these tasks are presented.
The psygenet2r package is provided through Bioconductor. To install and load psygenet2r, the user must type the following commands in an R session:
source( "http://bioconductor.org/biocLite.R" )
biocLite( "psygenet2r" )
library( psygenet2r )
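On recent Bioconductor releases the biocLite.R script has been superseded by the BiocManager package. The following is a minimal sketch of the equivalent installation, assuming psygenet2r is available for the Bioconductor release in use. The wrapper name install_psygenet2r is hypothetical and exists only so that nothing is installed until the function is called explicitly.

```r
# Sketch only: install psygenet2r through BiocManager instead of biocLite.
# install_psygenet2r is a hypothetical helper name, not part of any package.
install_psygenet2r <- function() {
    # Install BiocManager from CRAN if it is not already present
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
        install.packages("BiocManager")
    }
    # Install psygenet2r from the Bioconductor release matching this R version
    BiocManager::install("psygenet2r")
}

# install_psygenet2r()   # uncomment to run the installation
# library(psygenet2r)    # then load the package as usual
```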
DataGeNET.Psy
A DataGeNET.Psy object is obtained when the psygenetGene and psygenetDisease functions are applied. This object is used as input for the rest of the psygenet2r functions, such as the plot function.
The DataGeNET.Psy object contains all the information retrieved from PsyGeNET about the diseases or genes associated with the gene or disease of interest. It also contains a summary of the search, such as the search input (gene or disease), the selected database, the gene or disease identifier, the number of associations found (N. Results) and the number of unique results obtained (U. Results).
t1
## Object of class 'DataGeNET.Psy'
## . Type: gene
## . Database: ALL
## . Term: 4852
## . Number of Results: 13
## . Number of unique Diseases: 13
## . Number of unique Genes: 1
class( t1 )
## [1] "DataGeNET.Psy"
## attr(,"package")
## [1] "psygenet2r"
This object comes with a series of functions that allow users to interact with the information retrieved from PsyGeNET. These functions are ngene, ndisease, extract and plot. The first function, ngene, returns the number of genes retrieved for a given query. ndisease is the analogous function for diseases. The function extract returns a formatted data.frame with the complete set of information downloaded from PsyGeNET. Finally, the plot function allows the visualization of the results in a variety of ways, such as gene-disease association networks or heatmaps.
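The accessor functions described above can be combined as in the following minimal sketch. It assumes that psygenet2r is installed and that the PsyGeNET data can be retrieved; if either assumption fails, the helper returns NULL instead of raising an error. The helper name summarise_query is hypothetical and not part of the package.

```r
# Sketch only: summarise a gene query with ngene, ndisease and extract.
# summarise_query is a hypothetical helper name.
summarise_query <- function(gene = 4852, database = "ALL") {
    # Skip quietly when the package is not installed
    if (!requireNamespace("psygenet2r", quietly = TRUE)) {
        return(NULL)
    }
    tryCatch({
        res <- psygenet2r::psygenetGene(gene = gene, database = database)
        list(
            genes    = psygenet2r::ngene(res),     # number of retrieved genes
            diseases = psygenet2r::ndisease(res),  # number of retrieved diseases
            table    = psygenet2r::extract(res)    # data.frame with all associations
        )
    }, error = function(e) NULL)  # e.g. PsyGeNET not reachable
}

info <- summarise_query()
if (!is.null(info)) {
    print(info$genes)
    print(info$diseases)
    print(head(info$table))
}
```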
3 PsyGeNET and psygenet2r
The PsyGeNET web interface can be explored by searching for a specific gene or a specific disease, and the psygenet2r package offers the same options. Therefore, the starting points for psygenet2r are the psygenetGene and psygenetDisease functions.
PsyGeNET data is classified according to the database used as the source of information (“source database”). Therefore, any query run on PsyGeNET requires specifying the source database through the database argument. Table 2 shows the source databases in PsyGeNET and their descriptions. By default, psygenet2r uses the database "ALL". For illustration purposes, database "ALL" is used in most of the code snippets in this vignette.
Name | Description |
---|---|
psycur15 | Genes associated with DEP, BD, AUD and CUD between 1980 and 2013 (PsyGeNET release v1.0) |
psycur16 | Genes associated with DEP, BD, AUD, CUD, SCHZ, SI-DEP, CanUD and SI-PSY between 1980 and 2015 |
ALL | All previous databases |
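To see how the choice of source database affects a query, the same gene can be looked up in each database from the table above. The following is a minimal sketch under the assumption that psygenet2r is installed and the data can be retrieved. The helper diseases_per_database is a hypothetical name, and it reports NA for any database that cannot be queried.

```r
# Source databases listed in the table above
databases <- c("psycur15", "psycur16", "ALL")

# Hypothetical helper: number of diseases retrieved for one gene in each
# source database; NA when the package or the data is unavailable.
diseases_per_database <- function(gene, dbs = databases) {
    vapply(dbs, function(db) {
        if (!requireNamespace("psygenet2r", quietly = TRUE)) {
            return(NA_integer_)
        }
        tryCatch({
            res <- psygenet2r::psygenetGene(gene = gene, database = db)
            as.integer(psygenet2r::ndisease(res))[1]
        }, error = function(e) NA_integer_)
    }, integer(1))
}

counts <- diseases_per_database("NPY")
print(counts)  # named integer vector, one entry per source database
```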
psygenet2r
The psygenet2r package allows exploring PsyGeNET information using a specific gene or a list of genes. It retrieves the information that is available in PsyGeNET (associated diseases, source database, PsyGeNET Evidence Index, number of publications, attributes of genes, etc.) and allows the results to be visualized in different ways.
To look up a single gene in PsyGeNET, we can use the psygenetGene function. This function retrieves PsyGeNET’s information using either the NCBI gene identifier or the official HUGO gene symbol. It also takes other arguments, such as the database to query (database argument) and the PsyGeNET evidence index (evidenceIndex argument).
As an example, the gene NPY, whose Entrez ID is 4852, is queried with the psygenetGene function, first using the identifier and then using the official HUGO gene symbol. In this example, database "ALL" is used.
t1 <- psygenetGene( gene = 4852,
database = "ALL")
t1
## Object of class 'DataGeNET.Psy'
## . Type: gene
## . Database: ALL
## . Term: 4852
## . Number of Results: 13
## . Number of unique Diseases: 13
## . Number of unique Genes: 1
t2 <- psygenetGene( gene = "NPY",
database = "ALL" )
t2
## Object of class 'DataGeNET.Psy'
## . Type: gene
## . Database: ALL
## . Term: NPY
## . Number of Results: 13
## . Number of unique Diseases: 13
## . Number of unique Genes: 1
Both cases result in a DataGeNET.Psy object:
class( t1 )
## [1] "DataGeNET.Psy"
## attr(,"package")
## [1] "psygenet2r"
class( t2 )
## [1] "DataGeNET.Psy"
## attr(,"package")
## [1] "psygenet2r"
In this particular example, by inspecting the DataGeNET.Psy object, we can see that the gene NPY is associated with 13 different diseases in PsyGeNET (with no restriction on the PsyGeNET evidence index).
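Since psygenetGene is described as accepting a list of genes as well as a single gene, the same query can be sketched for several genes at once. The gene symbols other than NPY are illustrative assumptions and are not taken from this document, and the helper name query_genes is hypothetical. The sketch assumes that psygenet2r is installed and the data can be retrieved, and returns NULL otherwise.

```r
# Illustrative gene symbols; only NPY appears in the text above
genes_of_interest <- c("NPY", "BDNF", "COMT")

# Hypothetical helper: query several genes in one call, NULL on failure
query_genes <- function(genes, database = "ALL") {
    if (!requireNamespace("psygenet2r", quietly = TRUE)) {
        return(NULL)
    }
    tryCatch(
        psygenet2r::psygenetGene(gene = genes, database = database),
        error = function(e) NULL
    )
}

tm <- query_genes(genes_of_interest)
if (!is.null(tm)) {
    print(tm)                       # summary of the DataGeNET.Psy object
    print(psygenet2r::ngene(tm))    # number of genes with results
    print(psygenet2r::ndisease(tm)) # number of associated diseases
}
```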
psygenet2r offers several options to visualize the results from PsyGeNET as networks by changing the type argument when applying the plot
function. A network showing the diseases (type = "GDA network"
) or the psyc