get_seq_encode_pad {DeepPINCS} | R Documentation |
Encodes the characters (or n-grams) of each string as integers, then pads or truncates every encoded sequence to a fixed length.
get_seq_encode_pad(sequences, length_seq, ngram_max = 1, ngram_min = 1, lenc = NULL)
sequences |
SMILES strings or amino acid sequences |
length_seq |
length to which each encoded sequence is padded or truncated |
ngram_max |
maximum size of an n-gram (default: 1) |
ngram_min |
minimum size of an n-gram (default: 1) |
lenc |
encoded labels for characters, a LabelEncoder object fitted by "CatEncoders::LabelEncoder.fit" (default: NULL) |
sequences_encode_pad |
for each input string (SMILES or amino acid sequence), an encoded sequence which is padded or truncated to length_seq |
lenc |
encoded labels for characters |
num_token |
total number of characters |
Dongmin Jung
CatEncoders::LabelEncoder.fit, CatEncoders::transform, keras::pad_sequences, stringdist::qgrams, tokenizers::tokenize_ngrams
if (keras::is_keras_available() && reticulate::py_available()) {
  get_seq_encode_pad(example_cpi[1, 2], 10)
}
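
The encode-then-pad idea can also be sketched in base R, without keras or CatEncoders. This is only an illustration of the concept and not the package's implementation: the helper name `encode_pad_sketch`, its `vocab` argument, the use of 0 as the padding value, and padding or truncating at the end of the sequence are all assumptions made for this sketch. It covers only the single-character case (ngram_max = ngram_min = 1).

```r
# Hypothetical helper (not part of DeepPINCS): character-level label
# encoding followed by padding or truncation to a fixed length.
encode_pad_sketch <- function(sequences, length_seq, vocab = NULL) {
  # Split each string into single characters.
  tokens <- strsplit(sequences, "")
  # Build the vocabulary from the data unless one is supplied.
  if (is.null(vocab)) vocab <- sort(unique(unlist(tokens)))
  # Start from an all-zero matrix; 0 is the assumed padding value.
  encoded <- matrix(0L, nrow = length(sequences), ncol = length_seq)
  for (i in seq_along(tokens)) {
    ids <- match(tokens[[i]], vocab)  # 1-based integer codes, NA if unseen
    ids[is.na(ids)] <- 0L             # assumption: unseen characters map to 0
    n <- min(length(ids), length_seq) # truncate sequences that are too long
    if (n > 0) encoded[i, seq_len(n)] <- ids[seq_len(n)]
  }
  list(sequences_encode_pad = encoded, vocab = vocab, num_token = length(vocab))
}

res <- encode_pad_sketch(c("CCO", "C1=CC=CC=C1"), length_seq = 5)
dim(res$sequences_encode_pad)  # one row per string, length_seq columns
res$num_token                  # number of distinct characters seen
```

Passing the returned `vocab` back in for new strings mirrors the role of the `lenc` argument: the same character-to-integer mapping is reused, so codes stay consistent between data sets.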