parseInfoProfile {PhyloProfile}    R Documentation

Parsing info for phylogenetic profiles

Description

Creates the main data frame for the input phylogenetic profiles, based on the selected input taxonomy level (e.g. strain, species) and the reference taxon. The output contains the number of paralogs, the percentage of species present in each supertaxon, and the max, min, mean or median of VAR1 and VAR2.

Usage

parseInfoProfile(inputDf, sortedInputTaxa, var1AggregateBy = "max",
    var2AggregateBy = "max")

Arguments

inputDf

input profiles in long format

sortedInputTaxa

sorted taxonomy data for the input taxa (see ?sortInputTaxa)

var1AggregateBy

aggregation method for VAR1 ("max", "min", "mean" or "median"), used to calculate var1 for each supertaxon. Default = "max".

var2AggregateBy

aggregation method for VAR2 ("max", "min", "mean" or "median"), used to calculate var2 for each supertaxon. Default = "max".
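
As a rough illustration of what aggregating VAR1 and VAR2 per supertaxon means, the following minimal base R sketch collapses a toy long-format table to one value per gene and supertaxon. The table and its column names (geneID, supertaxon, var1, var2) are assumptions made for this illustration only; the sketch does not call PhyloProfile and is not the package's internal implementation.

```r
# Toy long-format table; the column names are hypothetical and chosen
# only for this illustration (not the package's actual schema).
toy <- data.frame(
    geneID     = c("g1", "g1", "g1", "g2", "g2"),
    supertaxon = c("A", "A", "B", "A", "B"),
    var1       = c(0.9, 0.5, 0.7, 0.2, 0.4),
    var2       = c(0.1, 0.3, 0.8, 0.6, 0.2)
)

# Aggregation methods given as strings, mirroring the argument style above.
var1AggregateBy <- "max"
var2AggregateBy <- "mean"

# Collapse each variable to one value per gene/supertaxon pair;
# match.fun() turns the method name into the corresponding function.
aggVar1 <- aggregate(
    var1 ~ geneID + supertaxon, data = toy, FUN = match.fun(var1AggregateBy)
)
aggVar2 <- aggregate(
    var2 ~ geneID + supertaxon, data = toy, FUN = match.fun(var2AggregateBy)
)

# Combine both summaries into one row per gene and supertaxon.
toyAggregated <- merge(aggVar1, aggVar2, by = c("geneID", "supertaxon"))
print(toyAggregated)
```

Here gene g1 has two entries in supertaxon A, so its var1 becomes the larger of the two values and its var2 their mean; the other gene/supertaxon pairs have a single entry and keep their values.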

Value

A data frame containing all information for the input phylogenetic profiles. This fully processed profile is required for several profiling analyses, e.g. estimation of gene age (?estimateGeneAge) or identification of core genes (?getCoreGene).

Author(s)

Vinh Tran tran@bio.uni-frankfurt.de

See Also

createLongMatrix, sortInputTaxa, calcPresSpec, mainLongRaw

Examples

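# load the example long-format profile shipped with the package
data("mainLongRaw", package="PhyloProfile")
# get the taxon IDs of the input profile and sort them at rank "class",
# using "Mammalia" as the reference taxon
taxonIDs <- getInputTaxaID(mainLongRaw)
sortedInputTaxa <- sortInputTaxa(
    taxonIDs, "class", "Mammalia", NULL
)
# aggregate VAR1 by max and VAR2 by mean for each supertaxon
var1AggregateBy <- "max"
var2AggregateBy <- "mean"
parseInfoProfile(
    mainLongRaw, sortedInputTaxa, var1AggregateBy, var2AggregateBy
)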

[Package PhyloProfile version 1.0.7 Index]