This package is for version 3.2 of Bioconductor; for the stable, up-to-date release version, see gskb.

Gene Set data for pathway analysis in mouse

Bioconductor version: 3.2

Gene Set Knowledgebase (GSKB) is a comprehensive knowledgebase for pathway analysis in mouse. Interpreting high-throughput genomics data in terms of biological pathways is a constant challenge, partly because of the lack of supporting pathway databases. We created a functional genomics knowledgebase in mouse that includes 33,261 pathways and gene sets compiled from 40 sources, such as Gene Ontology, KEGG, GeneSetDB, PANTHER, and microRNA and transcription factor target genes. In addition, we manually collected and curated 8,747 lists of differentially expressed genes from 2,526 published gene expression studies to enable the detection of similarity to previously reported gene expression signatures. Together, these two types of data constitute GSKB, which can be readily used by pathway analysis software such as gene set enrichment analysis (GSEA).

As a first step, we gathered annotation information from 40 existing databases for mouse-related gene sets. These gene sets are divided into 7 categories: Gene Ontology, curated pathways, metabolic pathways, transcription factor (TF) target genes, microRNA target genes, location (cytogenetic band), and others. We used information in GeneSetDB for some of the databases. Detailed information on these 40 sources and their citations is available.

The gene lists from the literature were retrieved manually from individual gene expression studies through a process similar to the one used to create AraPath, a similar resource for Arabidopsis[12]. As most expression studies upload raw data to repositories like GEO and ArrayExpress, we used the metadata in these databases to search for publications. We scanned all datasets we could find and retrieved 4,313 potentially useful papers reporting gene expression studies in mouse. These papers were individually read by curators to identify lists of differentially expressed genes in various conditions.

We compiled a total of 8,747 lists of differentially expressed genes from 2,518 papers. Each gene list was annotated with a unique name, a brief description, and publication information, similar to the protocol used in MSigDB and AraPath. These gene lists constitute a large collection of published gene expression signatures that forms a foundation for interpreting new gene lists and expression profiles. More information about these data is available here. A paper describing these data is currently in revision at Database: The Journal of Biological Databases and Curation.

Author: Valerie Bares, Xijin Ge

Maintainer: Valerie Bares <valerie.bares at>

Citation (from within R, enter citation("gskb")):


To install this package, start R and enter:

## try http:// if https:// URLs are not supported
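The command itself did not survive on this page. As a sketch, assuming the standard Bioconductor 3.2 installer script (`biocLite`) rather than anything specific to gskb, the installation would look roughly like this:

```r
# Assumption: the generic Bioconductor 3.2 installation procedure.
# Load the Bioconductor installer script, then install the package by name.
source("https://bioconductor.org/biocLite.R")
biocLite("gskb")
```

On current Bioconductor releases `biocLite` has been retired in favour of the BiocManager package, so `BiocManager::install("gskb")` is the likely equivalent there.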


To view documentation for the version of this package installed in your system, start R and enter:
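The command is missing here as well. A minimal sketch, assuming the package ships its documentation as vignettes in the usual Bioconductor way, uses the base R function `browseVignettes`:

```r
# Assumption: gskb is already installed and provides at least one vignette.
# Opens an index of the installed package's vignettes in the web browser.
browseVignettes("gskb")
```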



PDF gskb: mouse data
PDF   Reference Manual
Text   NEWS


biocViews: ExperimentData, Mus_musculus
Version: 1.2.0
License: Artistic-2.0
Depends: R (>= 3.2.0)
Depends On Me:
Imports Me:
Suggests Me:
Build Report:

Package Archives

Follow Installation instructions to use this package in your R session.

Package Source gskb_1.2.0.tar.gz
Windows Binary
Mac OS X 10.6 (Snow Leopard)
Mac OS X 10.9 (Mavericks)
Subversion source (username/password: readonly)
Package Short Url
Package Downloads Report Download Stats
