When features are shared.
To use TDbasedUFE for drug repositioning, we previously
proposed (Taguchi 2017a) the integrated analysis of two gene expression profiles,
one from drug-treated cell lines and one from disease cell lines.
First, we prepare two omics profiles, expDrug and expDisease,
which represent the gene expression profiles of cell lines treated with various drugs
and of a disease cell line, respectively, by
Cancer_cell_lines <- list(ACC.rnaseq, BLCA.rnaseq, BRCA.rnaseq, CESC.rnaseq)
Drug_and_Disease <- prepareexpDrugandDisease(Cancer_cell_lines)
expDrug <- Drug_and_Disease$expDrug
expDisease <- Drug_and_Disease$expDisease
rm(Cancer_cell_lines)
expDrug is taken from the RTCGA package and consists of the samples associated with drugs according to
(Ding, Zu, and Gu 2016). Those files are listed in drug_response.txt, included in Clinical
drug responses at http://lifeome.net/supp/drug_response/.
expDisease is composed of the files in BRCA.rnaseq that are not included in expDrug
(for more details, see the source code of prepareexpDrugandDisease).
Then prepare a tensor as
Z <- prepareTensorfromMatrix(
exprs(expDrug[seq_len(200), seq_len(100)]),
exprs(expDisease[seq_len(200), seq_len(100)])
)
sample <- outer(
colnames(expDrug)[seq_len(100)],
colnames(expDisease)[seq_len(100)], function(x, y) {
paste(x, y)
}
)
Z <- PrepareSummarizedExperimentTensor(
sample = sample, feature = rownames(expDrug)[seq_len(200)], value = Z
)
In the above, sample holds pairs of file IDs taken from expDrug and expDisease.
Since the full data cannot be processed because of memory restrictions, we restricted
the analysis to the first two hundred features and the first one hundred samples of each profile
(below, we introduce how to deal with the full data sets).
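The construction above can be illustrated on toy data. The following is a minimal base-R sketch, assuming that the tensor built from two matrices sharing features is the per-feature product z[i, j, k] = x_drug[i, j] * x_dis[i, k]; the names x_drug, x_dis, z, and sample_pairs are hypothetical and the code is not taken from the package itself.

```r
# Two toy matrices that share the same two features (rows).
x_drug <- matrix(1:6, nrow = 2,
                 dimnames = list(c("g1", "g2"), c("d1", "d2", "d3")))
x_dis <- matrix(1:4, nrow = 2,
                dimnames = list(c("g1", "g2"), c("c1", "c2")))

# Assumed construction: for each shared feature i, take the outer product
# of its drug-sample values and its disease-sample values.
z <- array(0, dim = c(nrow(x_drug), ncol(x_drug), ncol(x_dis)))
for (i in seq_len(nrow(x_drug))) {
  z[i, , ] <- outer(x_drug[i, ], x_dis[i, ])
}

# Labels for every (drug sample, disease sample) pair, built as in the text.
sample_pairs <- outer(colnames(x_drug), colnames(x_dis), paste)

dim(z)            # features x drug samples x disease samples
sample_pairs[2, 1]
```

The tensor has one entry per feature and per pair of samples, which is why its size grows quickly and why only a subset of features and samples is used here.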
Then HOSVD is applied to the tensor as
HOSVD <- computeHosvd(Z)
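For readers unfamiliar with HOSVD, the idea can be sketched in base R: each mode of the tensor is unfolded into a matrix, an SVD of that matrix gives the singular value vectors for the mode, and the core tensor is obtained by projecting onto them. This is only an illustrative sketch on random data; unfold and mode_product are hypothetical helper names, and computeHosvd may be implemented differently.

```r
# Unfold an array along mode n into a matrix with dim(x)[n] rows.
unfold <- function(x, n) {
  d <- dim(x)
  perm <- c(n, seq_along(d)[-n])
  matrix(aperm(x, perm), nrow = d[n])
}

# Mode-n product: apply matrix m to mode n of array x.
mode_product <- function(x, m, n) {
  d <- dim(x)
  perm <- c(n, seq_along(d)[-n])
  y <- m %*% unfold(x, n)
  d_new <- d
  d_new[n] <- nrow(m)
  # Restore the original ordering of the modes.
  aperm(array(y, dim = d_new[perm]), order(perm))
}

set.seed(1)
z <- array(rnorm(4 * 3 * 2), dim = c(4, 3, 2))

# Singular value vectors of each mode from the SVD of its unfolding.
u <- lapply(1:3, function(n) svd(unfold(z, n))$u)

# Core tensor: project every mode onto its singular value vectors.
core <- z
for (n in 1:3) core <- mode_product(core, t(u[[n]]), n)

# Multiplying the core back by the vectors recovers the tensor.
recon <- core
for (n in 1:3) recon <- mode_product(recon, u[[n]], n)
max(abs(recon - z))  # numerically close to zero
```

In the setting of this vignette, the three modes would correspond to features, drug-treated samples, and disease samples.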
Here we try to find whether Cisplatin causes distinct expression (0: cell lines
treated with drugs other than Cisplatin, 1: cell lines treated with Cisplatin)
and whether expression differs between two classes (1 vs 2) of BRCA (in this case, the
two classes have no meaning) within the first one hundred samples.
Cond <- prepareCondDrugandDisease(expDrug)
cond <- list(NULL, Cond[, colnames = "Cisplatin"][seq_len(100)], rep(1:2, each = 50))
Then we select the singular value vectors attributed to the objects.
Although you can do this in interactive mode when you try this vignette
(see below), here we assume that you have already finished the selection.
input_all <- selectSingularValueVectorLarge(HOSVD,cond,input_all=c(2,9)) #Batch mode
If you prefer to make the selection yourself, you can run the interactive mode.
input_all <- selectSingularValueVectorLarge(HOSVD,cond)
You will then see "Next", "Prev", and "Select" radio buttons with which you
can perform the selection, as well as a histogram and the standard deviation optimization
with which you can verify the success of the selection interactively.
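What it means for a singular value vector to be associated with a condition can be illustrated with simulated data. The sketch below is an assumption-laden illustration, not the package's selection procedure: it builds 0/1 labels like those used for Cisplatin and compares a simulated vector that shifts with the label against one that does not, using a t test as a hypothetical stand-in for the visual inspection.

```r
set.seed(7)

# Simulated labels: 0 = other drugs, 1 = Cisplatin (fifty samples each).
labels <- rep(0:1, each = 50)

# A vector whose values shift with the label, and one that ignores it.
v_related <- rnorm(100, mean = 2 * labels)
v_unrelated <- rnorm(100)

# A small P-value suggests the vector distinguishes the two classes.
p_related <- t.test(v_related ~ labels)$p.value
p_unrelated <- t.test(v_unrelated ~ labels)$p.value

c(related = p_related, unrelated = p_unrelated)
```

A vector like v_related is the kind one would pick for the corresponding mode; a vector like v_unrelated would normally be skipped.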
Next we select which genes’ expression is altered by Cisplatin.
index <- selectFeature(HOSVD,input_all,de=0.05)
You might need to specify a suitable value for de, which is the initial value of
the standard deviation.
Then we get the following plot.
Finally, list the genes selected as those associated with distinct expression.
head(tableFeatures(Z,index))
#> Feature p value adjusted p value
#> 4 ACADVL.37 2.233863e-24 4.467726e-22
#> 6 ACLY.47 1.448854e-19 1.448854e-17
#> 1 A2M.2 6.101507e-16 4.067671e-14
#> 3 ABHD2.11057 3.934360e-10 1.967180e-08
#> 2 AARS.16 1.449491e-06 5.797964e-05
#> 5 ACIN1.22985 6.510593e-06 2.170198e-04
rm(Z)
rm(HOSVD)
detach("package:RTCGA.rnaseq")
rm(SVD)
#> Warning in rm(SVD): object 'SVD' not found
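The P-values and adjusted P-values in the feature table can be understood through the following sketch. It assumes that the components of the selected singular value vector follow a Gaussian distribution for unaffected features, that P-values come from a chi-squared distribution, and that they are adjusted with the Benjamini-Hochberg method. The data are simulated, the standard deviation sigma is fixed by hand rather than optimized, and the 0.01 threshold is an illustrative choice, so this is not the package's exact procedure.

```r
set.seed(42)

# Simulated singular value vector components for 1000 features.
n_features <- 1000
sigma <- 0.02                        # assumed standard deviation of the null
u <- rnorm(n_features, sd = sigma)   # features unaffected by the drug
u[1:10] <- u[1:10] + 0.3             # ten hypothetical affected features

# P-value of each feature under the Gaussian null hypothesis.
p <- pchisq((u / sigma)^2, df = 1, lower.tail = FALSE)

# Correct for multiple comparisons and select significant features.
p_adj <- p.adjust(p, method = "BH")
selected <- which(p_adj < 0.01)

head(data.frame(feature = selected,
                p_value = p[selected],
                adjusted_p_value = p_adj[selected]))
```

In the package, de only supplies the starting value; the standard deviation itself is optimized, whereas the sketch keeps it fixed for simplicity.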
The methods described here have been used frequently
in studies by the maintainers (Taguchi 2017b) (Taguchi 2018) (Taguchi and Turki 2020).
Reduction of required memory using partial summation.
When there are a large number of features, it is impossible to apply
HOSVD to the full tensor (which is why we reduced the size of the tensor above).
In this case, we apply SVD instead of HOSVD to a matrix
generated from the tensor as follows.
In contrast to the above, where only the first two hundred features and the first one hundred
samples are included, the following includes all features and all samples, since
the partial summation over features reduces the required memory.
SVD <- computeSVD(exprs(expDrug), exprs(expDisease))
Z <- t(exprs(expDrug)) %*% exprs(expDisease)
sample <- outer(
colnames(expDrug), colnames(expDisease),
function(x, y) {
paste(x, y)
}
)
Z <- PrepareSummarizedExperimentTensor(
sample = sample,
feature = rownames(expDrug), value = Z
)
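Why the matrix product above corresponds to a partial summation can be checked on toy data. The sketch below assumes that the full tensor is the per-feature product of the two profiles; summing it over the feature mode then gives exactly t(x_drug) %*% x_dis, so the tensor never has to be stored. The names x_drug, x_dis, and z_full are hypothetical, and projecting a profile onto the singular vectors to recover feature vectors is an assumption about the approach, not a description of computeSVD.

```r
set.seed(1)

# Toy profiles sharing five features: 3 drug samples, 4 disease samples.
x_drug <- matrix(rnorm(5 * 3), nrow = 5)
x_dis <- matrix(rnorm(5 * 4), nrow = 5)

# Full tensor (features x drug samples x disease samples), built only
# to verify the identity; this is the object that is too large in practice.
z_full <- array(0, dim = c(5, 3, 4))
for (i in 1:5) z_full[i, , ] <- outer(x_drug[i, ], x_dis[i, ])

# Partial summation over the feature mode equals a matrix product.
z_sum <- apply(z_full, c(2, 3), sum)
z_mat <- t(x_drug) %*% x_dis
all.equal(z_sum, z_mat)

# SVD of the small matrix gives vectors for drug and disease samples.
s <- svd(z_mat)

# Assumed step: feature vectors obtained by projecting a profile onto them.
feature_drug <- x_drug %*% s$u
dim(feature_drug)  # features x number of singular vectors
```

With N features, M drug samples, and K disease samples, the full tensor has N * M * K entries while the summed matrix has only M * K, which is where the memory saving comes from.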
Next, we select the singular value vectors attributed to drugs and cell lines, then
identify the features associated with expression altered by treatment with
Cisplatin as well as with the distinction between the two classes. Again, all
samples of expDrug and expDisease are included.
cond <- list(NULL,Cond[,colnames="Cisplatin"],rep(1:2,each=dim(SVD$SVD$v)[1]/2))
Next we select the singular value vectors and optimize the standard deviation
in batch mode
index_all <- selectFeatureRect(SVD,cond,de=c(0.01,0.01),
input_all=3) #batch mode