mapToGenome-methods {Pbase} | R Documentation |
Map range coordinates between peptide features along proteins and genome (reference) space.
## S4 method for signature 'Proteins,GRangesList'
mapToGenome(x, genome, pcol, drop.empty.ranges = TRUE, ...)

## S4 method for signature 'Proteins,GRangesList'
pmapToGenome(x, genome, pcol, drop.empty.ranges = TRUE, ...)

## S4 method for signature 'Proteins,EnsDb'
mapToGenome(x, genome, pcol, id = "name", idType = "protein_id", drop.empty.ranges = TRUE, ...)
x |
|
genome |
A |
pcol |
character(1) specifying the name of the column in
|
drop.empty.ranges |
|
id |
character(1) indicating which metadata columns in |
idType |
character(1) specifying the type of the IDs found in
|
... |
Additional parameters passed to inner functions. Currently ignored. |
mapToGenome maps the pranges(x) to the ranges of genome. Unless x and genome are of length 1, both must be named, and items of x are matched to items of genome by name. Names that do not occur in both x and genome are ignored. For example, if seqnames(x) are "A", "B" and "C" and names(genome) are "C", "A", "a", "z", "A" and "A", the names of the output will be "A", "A", "A" and "C". The output is ordered by (1) seqnames(x) and (2) the order of the elements in genome. If fewer than length(x) elements are mapped, as for p["B"] above, a message informs the user.
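The matching and ordering rule described above can be reproduced with plain character vectors. This is a minimal base R sketch of the rule only, not the Pbase implementation; the variable names (x_names, genome_names, hits) are made up for illustration.

```r
# Names taken from the example in the text
x_names      <- c("A", "B", "C")                 # seqnames(x)
genome_names <- c("C", "A", "a", "z", "A", "A")  # names(genome)

# For each name in x (in order), find all genome elements with the same name.
# Matching is exact, so "a" does not match "A".
hits <- lapply(x_names, function(nm) which(genome_names == nm))

# Output is ordered by x first, then by position in genome
out_names <- genome_names[unlist(hits)]
out_names   # "A" "A" "A" "C"

# Elements of x without any match would trigger the message
unmapped <- x_names[lengths(hits) == 0L]
unmapped    # "B"
```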
mapToGenome,Proteins,EnsDb maps each of the pranges(x) ranges within the protein sequence to the corresponding genomic coordinates using annotations provided by the EnsDb object. To enable the mapping, the Proteins object has to provide IDs that can be used to identify the encoding transcript. Such IDs can be the Ensembl protein ID, the Uniprot ID or the Ensembl transcript ID. If a protein is annotated to multiple transcripts, the function selects the transcript whose CDS length best matches the protein sequence length.

The mapToGenome,Proteins,EnsDb method maps the pranges of all proteins in the Proteins object to the genome. See the examples below for more details.
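The transcript selection step can be pictured as follows. This is a hedged sketch, not the code used by the method: the transcript IDs and lengths are invented, and the assumption that the expected CDS length is three nucleotides per residue plus one stop codon is ours and may differ from what the method actually compares.

```r
# Hypothetical candidate transcripts for one protein (made-up values)
candidates <- data.frame(
  tx_id      = c("tx1", "tx2", "tx3"),
  cds_length = c(900L, 1203L, 1500L),   # nucleotides
  stringsAsFactors = FALSE
)
protein_length <- 400L                  # residues

# Assumption: 3 nt per residue plus a stop codon
expected_cds <- 3L * (protein_length + 1L)

# Pick the transcript whose CDS length is closest to the expectation
best <- candidates$tx_id[which.min(abs(candidates$cds_length - expected_cds))]
best   # "tx2"
```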
pmapToGenome is the element-wise (aka 'parallel') version of mapToGenome. The i-th pranges(x) is mapped to the i-th range in genome. x and genome must have the same length and do not need to be named (names are ignored).
A named GRangesList object, with names matching names(genome). For pmapToGenome, the return value will have the same length as the inputs.
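One way to see why a list of ranges is a natural return type is to work through the coordinate arithmetic by hand: a peptide that crosses an exon boundary maps to more than one genomic range. The sketch below is illustrative only. pep_to_genome is a hypothetical helper, not a Pbase function, and it assumes plus-strand, coding-only exons sorted by position, with coordinates invented for the example.

```r
# Hypothetical helper: map a peptide range (amino-acid positions) to genomic
# ranges, given the start/end of coding exons on the plus strand.
pep_to_genome <- function(pep_start, pep_end, exon_start, exon_end) {
  # Nucleotide positions along the CDS covered by the peptide (3 nt/residue)
  nt_start <- 3L * (pep_start - 1L) + 1L
  nt_end   <- 3L * pep_end

  # Position of each exon along the concatenated CDS
  widths    <- exon_end - exon_start + 1L
  cds_end   <- cumsum(widths)
  cds_start <- cds_end - widths + 1L

  # Keep exons that overlap the peptide, then clip to the peptide boundaries
  keep <- cds_start <= nt_end & cds_end >= nt_start
  data.frame(
    start = exon_start[keep] + pmax(nt_start, cds_start[keep]) - cds_start[keep],
    end   = exon_start[keep] + pmin(nt_end,   cds_end[keep])   - cds_start[keep]
  )
}

# Two exons of 30 and 60 nt; residues 9 to 12 span the exon junction
res <- pep_to_genome(9L, 12L,
                     exon_start = c(101L, 201L),
                     exon_end   = c(130L, 260L))
res   # two ranges: 125-130 and 201-206 (12 nt in total)
```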
Laurent Gatto, Johannes Rainer
See ?mapToAlignments in the GenomicAlignments package for mapping coordinates between reads (local) and genome (reference) space using a CIGAR alignment.
See ?mapToTranscripts in the GenomicFeatures package for mapping coordinates between features in the transcriptome and genome space.
See proteinCoding to remove non-protein-coding ranges before mapping peptides to their genomic coordinates.
See the mapping vignette for examples and visualisations.
See plotAsAnnotationTrack and plotAsGeneRegionTrack for more details about the two plotting functions.
data(p)
grl <- etrid2grl(acols(p)$ENST)
pcgrl <- proteinCoding(grl)
plotAsGeneRegionTrack(grl[[1]], pcgrl[[1]])

mp <- mapToGenome(p[4], pcgrl[4])
plotAsAnnotationTrack(mp[[1]], pcgrl[[4]])

pmapToGenome(p, pcgrl)

#######
## mapToGenome,Proteins,EnsDb
## load an EnsDb object providing the required annotations
library(EnsDb.Hsapiens.v86)
edb <- EnsDb.Hsapiens.v86

## Map the pranges of all proteins in p to the genome providing the proteins'
## Uniprot IDs (being the 'names' of the Proteins object) for the mapping.
mp <- mapToGenome(p, edb, id = "name", idType = "uniprot_id")