fgga {fgga} R Documentation

Factor Graph GO Annotation model

Description

A hierarchical, graph-based machine learning model for the consistent GO annotation of protein-coding genes.

Usage

fgga(graphGO, tableGOs, dxCharacterized, dxTestCharacterized,
    kFold, kernelSVM, tmax, epsilon)

Arguments

graphGO

A graphNEL graph with ‘m’ GO node labels.

tableGOs

A binary matrix with ‘n’ proteins (rows) by ‘m’ GO node labels (columns).

dxCharacterized

A data frame with ‘n’ proteins (rows) by ‘f’ features (columns).

dxTestCharacterized

A data frame with ‘k’ proteins (rows) by ‘f’ features (columns).

kFold

An integer for the number of folds.

kernelSVM

The kernel used to calculate the variance (default: radial).

tmax

An integer indicating the maximum number of iterations (default: 200).

epsilon

A numeric value that sets the convergence criterion (default: 0.001).
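
The dimension requirements above can be sketched with toy inputs. This is only an illustration of the expected shapes: the object names (toyTableGOs, toyTrain, toyTest, toyGraph), the feature names, and the parent-to-child edge direction are assumptions, not taken from the package. The graph part runs only if the Bioconductor graph package is installed.

```r
# Toy dimensions (illustrative values, not taken from the package data)
n <- 4  # training proteins
k <- 2  # test proteins
m <- 3  # GO node labels
f <- 5  # features per protein

goTerms <- c("GO:0003674", "GO:0005488", "GO:0003824")
featNames <- paste0("feat", seq_len(f))  # hypothetical feature names

# tableGOs: binary n x m matrix, one row per protein, one column per GO term
toyTableGOs <- matrix(
    c(1, 1, 0,
      1, 0, 1,
      1, 1, 1,
      1, 0, 0),
    nrow = n, ncol = m, byrow = TRUE,
    dimnames = list(paste0("train", seq_len(n)), goTerms)
)

# dxCharacterized / dxTestCharacterized: data frames sharing the same f columns
set.seed(1)
toyTrain <- as.data.frame(matrix(rnorm(n * f), nrow = n,
    dimnames = list(rownames(toyTableGOs), featNames)))
toyTest <- as.data.frame(matrix(rnorm(k * f), nrow = k,
    dimnames = list(paste0("test", seq_len(k)), featNames)))

# Consistency checks implied by the argument descriptions
stopifnot(
    all(toyTableGOs %in% c(0, 1)),
    nrow(toyTableGOs) == nrow(toyTrain),
    identical(colnames(toyTrain), colnames(toyTest))
)

# graphGO: a graphNEL whose node labels match the columns of tableGOs
if (requireNamespace("graph", quietly = TRUE)) {
    toyGraph <- graph::graphNEL(nodes = goTerms, edgemode = "directed")
    # Assumption: edges point from parent term to child term
    toyGraph <- graph::addEdge("GO:0003674", "GO:0005488", toyGraph)
    toyGraph <- graph::addEdge("GO:0003674", "GO:0003824", toyGraph)
    stopifnot(setequal(graph::nodes(toyGraph), colnames(toyTableGOs)))
}
```

Toy inputs this small are meant for checking shapes and names only; they are unlikely to be enough to train the SVM classifiers.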

Details

The FGGA model is built in two main steps. In the first step, a core Factor Graph (FG) modeling hidden GO-term predictions and relationships is created. In the second step, the FG is enriched with nodes modeling observable GO-term predictions issued by binary SVM classifiers. In addition, probabilistic constraints modeling learning gaps between hidden and observable GO-term predictions are introduced. These gaps are assumed to be independent among GO-terms, locally additive with respect to observed predictions, and zero-mean Gaussian. FGGA predictions are issued by the native iterative message passing algorithm of factor graphs.
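
The additive, zero-mean Gaussian gap assumption can be illustrated for a single GO term. The sketch below is not the package's implementation: it assumes a hidden label x encoded as -1 or +1, an observation y = x + e with e drawn from N(0, sigma^2), and equal prior probabilities for both labels. The function name posteriorPositive is hypothetical.

```r
# Posterior probability that the hidden label is +1 given a noisy
# observation y, under the assumptions stated above.
posteriorPositive <- function(y, sigma) {
    likPos <- dnorm(y, mean = 1, sd = sigma)   # likelihood if x = +1
    likNeg <- dnorm(y, mean = -1, sd = sigma)  # likelihood if x = -1
    likPos / (likPos + likNeg)                 # Bayes' rule, equal priors
}

# An observation of 0 is equally compatible with both labels,
# so its posterior is 0.5; larger y favors the +1 label.
posteriorPositive(c(-1, 0, 1), sigma = 1)
```

Under these assumptions the posterior reduces to the logistic form 1 / (1 + exp(-2 * y / sigma^2)), so a larger noise variance pulls every prediction toward 0.5.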

Value

A named matrix with ‘k’ protein-coding genes (rows) by ‘m’ GO node labels (columns), where each element is a probabilistic prediction value.
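
A matrix of this shape can be turned into discrete annotation calls in base R. The sketch below uses a made-up prediction matrix (toyPred) rather than real fgga output, and the 0.5 cutoff is an arbitrary choice for illustration, not a value recommended by the package.

```r
goTerms <- c("GO:0003674", "GO:0005488", "GO:0003824")

# Hypothetical k x m matrix of prediction values (k = 2, m = 3)
toyPred <- matrix(
    c(0.95, 0.80, 0.10,
      0.90, 0.30, 0.70),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("protA", "protB"), goTerms)
)

cutoff <- 0.5  # arbitrary threshold, chosen only for this example

# Binary call matrix with the same row and column names
toyCalls <- (toyPred >= cutoff) * 1L

# GO terms called for each protein, as a named list
predictedTerms <- setNames(
    lapply(rownames(toyCalls), function(p) {
        colnames(toyCalls)[toyCalls[p, ] == 1]
    }),
    rownames(toyCalls)
)
```

In practice the cutoff would be tuned, for example per GO term on held-out data.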

Author(s)

Flavio E. Spetale and Elizabeth Tapia <spetale@cifasis-conicet.gov.ar>

References

Spetale F.E., Tapia E., Krsticevic F., Roda F. and Bulacio P. “A Factor Graph Approach to Automated GO Annotation”. PLoS ONE 11(1): e0146986, 2016.

Spetale F.E., Arce D., Krsticevic F., Bulacio P. and Tapia E. “Consistent prediction of GO protein localization”. Scientific Reports 7787(8), 2018.

See Also

fgga2bipartite, sumProduct, svmGO

Examples

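library(fgga)

# Load the example data set
data(CfData)

# GO graph as a graphNEL object
mygraphGO <- as(CfData[["graphCfGO"]], "graphNEL")

# Features of the first two test proteins
dxCfTestCharacterized <- CfData[["dxCf"]][CfData[["indexGO"]]$indexTest[1:2], ]

# GO labels and features of the first 300 training proteins
myTableGO <- CfData[["tableCfGO"]][
                    CfData[["indexGO"]]$indexTrain[1:300], ]

dataTrain <- CfData[["dxCf"]][
                    CfData[["indexGO"]]$indexTrain[1:300], ]

# Train on the 300 proteins and predict the two test proteins
fggaResults <- fgga(graphGO = mygraphGO,
                tableGOs = myTableGO, dxCharacterized = dataTrain,
                dxTestCharacterized = dxCfTestCharacterized, kFold = 2,
                tmax = 50, epsilon = 0.05)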


[Package fgga version 1.1.0 Index]