parseInfoProfile {PhyloProfile} | R Documentation
Create the main dataframe for the input phylogenetic profiles based on the selected input taxonomy level (e.g. strain, species) and reference taxon. The output contains the number of paralogs, the percentage of species present in each supertaxon, and the max/min/mean/median of VAR1 and VAR2.
parseInfoProfile(inputDf, sortedInputTaxa, var1AggregateBy = "max", var2AggregateBy = "max")
inputDf | input profiles in long format
sortedInputTaxa | sorted taxonomy data for the input taxa (see sortInputTaxa())
var1AggregateBy | aggregation method for VAR1 (max, min, mean or median), used to calculate VAR1 for each supertaxon. Default = "max".
var2AggregateBy | aggregation method for VAR2 (max, min, mean or median), used to calculate VAR2 for each supertaxon. Default = "max".
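The effect of the two aggregation arguments can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data frame, its column names (geneID, supertaxon, var1, var2) and the helper aggregateVar() are assumptions made only for this illustration. It shows how several values per gene and supertaxon could be collapsed into one using max, min, mean or median.

```r
# Toy long-format profile. Column names are illustrative assumptions,
# not necessarily those used internally by PhyloProfile.
toyProfile <- data.frame(
    geneID     = c("g1", "g1", "g1", "g2", "g2"),
    supertaxon = c("A", "A", "B", "A", "B"),
    var1       = c(0.9, 0.5, 0.7, 0.2, 0.8),
    var2       = c(0.4, 0.6, 1.0, 0.3, 0.5),
    stringsAsFactors = FALSE
)

# Hypothetical helper: collapse one variable to a single value
# per gene and supertaxon using the chosen method.
aggregateVar <- function(df, varName, method = "max") {
    method <- match.arg(method, c("max", "min", "mean", "median"))
    aggregate(
        df[[varName]],
        by = list(geneID = df$geneID, supertaxon = df$supertaxon),
        FUN = match.fun(method)
    )
}

# The aggregated value is returned in a column named "x"
var1Max  <- aggregateVar(toyProfile, "var1", "max")
var2Mean <- aggregateVar(toyProfile, "var2", "mean")
```

In this sketch, gene g1 has two values in supertaxon A, so the choice of method decides which single value represents that supertaxon.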
A dataframe containing all information for the input phylogenetic profiles. This fully processed profile is required for several profiling analyses, e.g. estimation of gene age (?estimateGeneAge) or identification of core genes (?getCoreGene).
Vinh Tran tran@bio.uni-frankfurt.de
createLongMatrix, sortInputTaxa, calcPresSpec, mainLongRaw
data("mainLongRaw", package="PhyloProfile")
taxonIDs <- getInputTaxaID(mainLongRaw)
sortedInputTaxa <- sortInputTaxa(
    taxonIDs, "class", "Mammalia", NULL
)
var1AggregateBy <- "max"
var2AggregateBy <- "mean"
parseInfoProfile(
    mainLongRaw, sortedInputTaxa, var1AggregateBy, var2AggregateBy
)