filterProfileData {PhyloProfile}R Documentation

Filter phylogentic profiles

Description

Create a filtered data needed for plotting or clustering phylogenetic profiles. NOTE: this function require some intermediate steps using the results from other functions. If you would like to get a full processed data from the raw input, please use the function fromInputToProfile() instead!

Usage

filterProfileData(DF, refTaxon = NULL,
    percentCO = c(0, 1), coorthoCOMax = 9999,
    var1CO  = c(0, 1), var2CO = c(0, 1), var1Rel = "protein",
    var2Rel = "protein", groupByCat = FALSE, catDt = NULL)

Arguments

DF

a reduced dataframe contains info for all phylogenetic profiles in the selected taxonomy rank.

refTaxon

selected reference taxon. NOTE: This taxon will not be affected by the filtering. If you want to filter all, set refTaxon <- NULL. Default = NULL.

percentCO

min and max cutoffs for percentage of species present in a supertaxon. Default = c(0, 1).

coorthoCOMax

maximum number of co-orthologs allowed. Default = 9999.

var1CO

min and max cutoffs for var1. Default = c(0, 1).

var2CO

min anc max cutoffs for var2. Default = c(0, 1).

var1Rel

relation of var1 ("protein" for protein-protein or "species" for protein-species). Default = "protein".

var2Rel

relation of var2 ("protein" for protein-protein or "species" for protein-species). Default = "protein".

groupByCat

group genes by their categories (TRUE or FALSE). Default = FALSE.

catDt

dataframe contains gene categories (optional, NULL if groupByCat = FALSE or no info provided). Default = NULL.

Value

A filtered dataframe for generating profile plot including seed gene IDs (or orthologous group IDs), their ortholog IDs and the corresponding (super)taxa, (super)taxon IDs, number of co-orthologs in each (super)taxon, values for two additional variables var1, var2, supertaxon, and the categories of seed genes (or ortholog groups).

Author(s)

Vinh Tran tran@bio.uni-frankfurt.de

See Also

parseInfoProfile and reduceProfile for generating input dataframe, fullProcessedProfile for a demo full processed profile dataframe, fromInputToProfile for generating fully processed data from raw input.

Examples

# NOTE: this function require some intermediate steps using the results from
# other functions. If you would like to get a full processed data from the
# raw input, please use the function fromInputToProfile() instead!
data("fullProcessedProfile", package="PhyloProfile")
superTaxonDf <- reduceProfile(fullProcessedProfile)
refTaxon <- "Mammalia"
percentCutoff <- c(0.0, 1.0)
coorthologCutoffMax <- 10
var1Cutoff <- c(0.75, 1.0)
var2Cutoff <- c(0.5, 1.0)
var1Relation <- "protein"
var2Relation <- "species"
groupByCat <- FALSE
catDt <- NULL
filterProfileData(
    superTaxonDf,
    refTaxon,
    percentCutoff,
    coorthologCutoffMax,
    var1Cutoff,
    var2Cutoff,
    var1Relation,
    var2Relation,
    groupByCat,
    catDt
)

[Package PhyloProfile version 1.2.8 Index]