get_graph_structure_node_feature {DeepPINCS}    R Documentation

Graph structure and node features from SMILES strings

Description

Builds graph representations of molecules from SMILES strings. In a molecular graph, nodes represent atoms and edges represent bonds. The Chemistry Development Kit (CDK) is used as the cheminformatics toolkit to compute molecular features. The node features for each atom are its atomic symbol, its degree in the graph, and its implicit hydrogen count.
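To make the graph representation concrete, the following is a minimal base R sketch that writes out ethanol (SMILES "CCO") by hand instead of parsing it with CDK. The shortened element_list, the hand-entered hydrogen counts, and the choice to append degree and hydrogen count as plain numeric columns are assumptions for illustration. The package's actual feature encoding may differ.

```r
# Ethanol ("CCO") entered by hand: heavy atoms C1, C2, O3; bonds C1-C2 and C2-O3.
symbols    <- c("C", "C", "O")
implicit_h <- c(3, 2, 1)  # implicit hydrogens on each heavy atom

# Adjacency matrix: nodes are atoms, edges are bonds (undirected, so symmetric).
A <- matrix(0, nrow = 3, ncol = 3)
A[1, 2] <- A[2, 1] <- 1
A[2, 3] <- A[3, 2] <- 1

# Shortened element list, used here only to keep the output small (assumption).
element_list <- c("C", "N", "O", "S")

# One-hot encode each atomic symbol against element_list (one row per atom).
one_hot <- t(vapply(
    symbols,
    function(s) as.numeric(element_list == s),
    numeric(length(element_list)),
    USE.NAMES = FALSE
))
colnames(one_hot) <- element_list

# The degree of an atom is the number of bonds it has in the graph.
degree <- rowSums(A)

# Node feature matrix: symbol one-hot, degree, implicit hydrogen count.
X <- cbind(one_hot, degree = degree, implicit_h = implicit_h)
print(X)
```

Each row of X describes one atom, and A records which atoms are bonded.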

Usage

get_graph_structure_node_feature(smiles, max_atoms,
    element_list = c(
        "C", "N", "O", "S", "F", "Si", "P", "Cl",
        "Br", "Mg", "Na", "Ca", "Fe",  "Al", "I",
        "B", "K", "Se", "Zn", "H", "Cu", "Mn"))

Arguments

smiles

SMILES strings

max_atoms

maximum number of atoms per molecule; the adjacency and node feature matrices are padded or truncated to this size

element_list

list of atom symbols

Value

A_pad

a padded or truncated adjacency matrix for each SMILES string

X_pad

a padded or truncated node feature matrix for each SMILES string

feature_dim

dimension of node features

element_list

list of atom symbols
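The padding and truncation behind A_pad and X_pad can be sketched in base R as follows. The helper pad_or_truncate is hypothetical and not part of DeepPINCS, and zero padding at the bottom and right is an assumption. The package's own padding may be implemented differently.

```r
# Hypothetical helper: zero-pad or truncate a matrix to n_row x n_col.
pad_or_truncate <- function(m, n_row, n_col) {
    out <- matrix(0, nrow = n_row, ncol = n_col)
    keep_rows <- seq_len(min(nrow(m), n_row))
    keep_cols <- seq_len(min(ncol(m), n_col))
    out[keep_rows, keep_cols] <- m[keep_rows, keep_cols, drop = FALSE]
    out
}

# A 3-atom chain (for example ethanol, "CCO") with a toy 3 x 2 feature matrix.
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
X <- cbind(degree = rowSums(A), implicit_h = c(3, 2, 1))

max_atoms <- 5

# Fewer atoms than max_atoms: rows (and columns of A) are filled with zeros.
A_pad <- pad_or_truncate(A, max_atoms, max_atoms)
X_pad <- pad_or_truncate(X, max_atoms, ncol(X))

# More atoms than the limit: trailing atoms are dropped.
A_trunc <- pad_or_truncate(A, 2, 2)
X_trunc <- pad_or_truncate(X, 2, ncol(X))

print(dim(A_pad))
print(dim(X_trunc))
```

Fixing every molecule to the same max_atoms size is what allows graphs of different sizes to be stacked into a single array.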

Author(s)

Dongmin Jung

References

Balakin, K. V. (2009). Pharmaceutical data mining: approaches and applications for drug discovery. Wiley.

See Also

matlab::padarray, purrr::chuck, rcdk::get.adjacency.matrix, rcdk::get.atoms, rcdk::get.hydrogen.count, rcdk::get.symbol, rcdk::parse.smiles

Examples

get_graph_structure_node_feature(example_cpi[1, 1], 10)

[Package DeepPINCS version 1.1.6 Index]