filteranot {PADOG} | R Documentation |
This function helps to deal with multiple probesets/probes per gene prior to geneset analysis.
filteranot(esetm=NULL,group=NULL,paired=FALSE,block=NULL,annotation=NULL,include.details=FALSE)
esetm |
A matrix containing log transfomed and normalized gene expression data. Rows correspond to genes and columns to samples. Rownames of esetm need to be valid probeset or probe names. |
group |
A character vector with the class labels of the samples. It can only contain "c" for control samples or "d" for disease samples. |
paired |
A logical value to indicate if the samples in the two groups are paired. |
block |
A character vector indicating the block ids of the samples classified by the group variable, if |
annotation |
A valid chip annotation package name (e.g. "hgu133plus2.db") |
include.details |
If set to true, will include all columns from limma's topTable for this dataset. |
See cited documents for more details.
A data frame containing the probeset IDs (and corresponding ENTREZ IDs) of the best probesets for each gene ;
Adi Laurentiu Tarca <atarca@med.wayne.edu>
Adi L. Tarca, Sorin Draghici, Gaurav Bhatti, Roberto Romero, Down-weighting overlapping genes improves gene set analysis, BMC Bioinformatics, 2012, submitted.
#run padog on a colorectal cancer dataset of the 24 datasets benchmark GSE9348 set="GSE9348" data(list=set,package="KEGGdzPathwaysGEO") x=get(set) #Extract from the dataset the required info exp=experimentData(x); dataset= exp@name dat.m=exprs(x) ano=pData(x) design= notes(exp)$design annotation= paste(x@annotation,".db",sep="") dim(dat.m) #get rid of duplicates in the same way as is done for PADOG and assign probesets to ENTREZ IDS #get rid of duplicates by choosing the probe(set) with lowest p-value; get ENTREZIDs for probes aT1=filteranot(esetm=dat.m,group=ano$Group,paired=(design=="Paired"),block=ano$Block,annotation) #filtered expression matrix filtexpr=dat.m[rownames(dat.m)%in%aT1$ID,] dim(filtexpr)