
Package: Pbase
Authors: Laurent Gatto and Sebastian Gibb
Last compiled: Mon Oct 17 19:31:09 2016
Last modified: 2016-10-17 16:10:57

0.1 Introduction

This vignette briefly introduces the central data object of the Pbase package, the Proteins instance, as depicted below. A Proteins object contains a set of proteins (10 in the figure below), each composed of a protein sequence (grey boxes) and annotation data (table on the left). Each protein links to a set of experimentally observed peptides (also in grey), which carry their own annotation data. The figure also shows the accessors for the different data slots, which are detailed in ?Proteins.

Proteins objects are populated with protein sequences read from a fasta file; the peptides originate from an LC-MSMS experiment.

The original data used below is a 10 fmol Peptide Retention Time Calibration Mixture spiked into 50 ng HeLa background, acquired on a Thermo Orbitrap Q Exactive instrument. The spectra were searched against a restricted set of high-scoring human proteins from UniProt release 2015_02 using the MSGF+ search engine.

0.2 The fasta database

library("Biostrings")
## Loading required package: XVector
fafile <- system.file("extdata/HUMAN_2015_02_selected.fasta",
                      package = "Pbase")
fa <- readAAStringSet(fafile)
fa
##   A AAStringSet instance of length 9
##     width seq                                          names               
## [1]  2602 MPVTEKDLAEDAPWKKIQQNT...LAVKWGEEHIPGSPFHVTVP sp|O75369|FLNB_HU...
## [2]  3374 MSPESGHSRIFEATAGPNKPE...TLSKDSLSNGVPSGRQAEFS sp|A4UGR9|XIRP2_H...
## [3]  2624 MFRRARLSVKPNVRPGVGARG...ATTVSEYFFNDIFIEVDETE sp|A6H8Y1|BDP1_HU...
## [4]   911 MVDYHAANQSYQYGPSSAGNG...VPGALDYKSFSTALYGESDL sp|O43707|ACTN4_H...
## [5]   417 MSLSNKLTLDKLDVKGKRVVM...ASLELLEGKVLPGVDALSNI sp|P00558|PGK1_HU...
## [6]   375 MDDDIAALVVDNGSGMCKAGF...WISKQEYDESGPSIVHRKCF sp|P60709|ACTB_HU...
## [7]   664 METPSQRRATRSGAQASSTPL...SYLLGNSSPRTQSPQNCSIM sp|P02545|LMNA_HU...
## [8]   364 MPYQYPALTPEQKKELSDIAH...PSGQAGAAASESLFVSNHAY sp|P04075|ALDOA_H...
## [9]   418 MARRKPEGSSFNMTHLSMAMA...PSGQAGAAASESLFVSNHAY sp|P04075-2|ALDOA...
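
The sequence names above follow the UniProt convention of three fields separated by `|` (database, accession, entry name). The following base R sketch, which is not Pbase's own parser, splits such names into the fields that later appear as the DB, AccessionNumber and EntryName annotation columns. The two names are copied in full from the accession column of the PSM table shown later in this vignette; the `isIsoform` column is added for illustration only.

```r
# Full names as they appear in the 'accession' column of the PSM table;
# the AAStringSet print-out above truncates them with "...".
headers <- c("sp|P04075|ALDOA_HUMAN", "sp|P04075-2|ALDOA_HUMAN")

# Split on the literal "|" separator.
fields <- strsplit(headers, "|", fixed = TRUE)

# One row per header; column names mirror the annotation columns of the vignette.
parsed <- data.frame(
    DB              = vapply(fields, `[`, character(1), 1),
    AccessionNumber = vapply(fields, `[`, character(1), 2),
    EntryName       = vapply(fields, `[`, character(1), 3),
    stringsAsFactors = FALSE
)

# Illustrative extra column: a "-<number>" suffix marks an isoform accession.
parsed$isIsoform <- grepl("-[0-9]+$", parsed$AccessionNumber)
parsed
```

Both entries share the entry name ALDOA_HUMAN and differ only in the accession, which is why the same peptide can later map to both of them.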

0.3 The PSM data

library("mzID")
idfile <- system.file("extdata/Thermo_Hela_PRTC_selected.mzid",
                      package = "Pbase")
id <- flatten(mzID(idfile))
## reading Thermo_Hela_PRTC_selected.mzid... DONE!
dim(id)
## [1] 137  29
head(id)
##     spectrumid scan number(s)
## 1    index=173          12256
## 1.1  index=173          12256
## 2    index=163          11860
## 2.1  index=163          11860
## 3    index=200          13408
## 3.1  index=200          13408
##                                                                                spectrum title
## 1   msLevel 2; retentionTime 2094.56706; scanNum 12256; precMz 1137.06665029649; precCharge 2
## 1.1 msLevel 2; retentionTime 2094.56706; scanNum 12256; precMz 1137.06665029649; precCharge 2
## 2   msLevel 2; retentionTime 2039.84424; scanNum 11860; precMz 1136.57450195803; precCharge 2
## 2.1 msLevel 2; retentionTime 2039.84424; scanNum 11860; precMz 1136.57450195803; precCharge 2
## 3   msLevel 2; retentionTime 2258.27868; scanNum 13408; precMz 703.038108542133; precCharge 3
## 3.1 msLevel 2; retentionTime 2258.27868; scanNum 13408; precMz 703.038108542133; precCharge 3
##     acquisitionnum passthreshold rank calculatedmasstocharge
## 1              173          TRUE    1               1136.574
## 1.1            173          TRUE    1               1136.574
## 2              163          TRUE    1               1136.574
## 2.1            163          TRUE    1               1136.574
## 3              200          TRUE    1                703.037
## 3.1            200          TRUE    1                703.037
##     experimentalmasstocharge chargestate ms-gf:denovoscore ms-gf:evalue
## 1                  1137.0667           2               132 2.597097e-18
## 1.1                1137.0667           2               132 2.597097e-18
## 2                  1136.5745           2               230 4.942664e-17
## 2.1                1136.5745           2               230 4.942664e-17
## 3                   703.0381           3               145 4.080429e-10
## 3.1                 703.0381           3               145 4.080429e-10
##     ms-gf:rawscore ms-gf:specevalue assumeddissociationmethod isotopeerror
## 1              118     2.276758e-22                       CID            1
## 1.1            118     2.276758e-22                       CID            1
## 2              186     4.333009e-21                       CID            0
## 2.1            186     4.333009e-21                       CID            0
## 3               98     3.578068e-14                       CID            0
## 3.1             98     3.578068e-14                       CID            0
##     isdecoy post pre end start               accession length
## 1     FALSE    C   K 134   112   sp|P04075|ALDOA_HUMAN    364
## 1.1   FALSE    C   K 188   166 sp|P04075-2|ALDOA_HUMAN    418
## 2     FALSE    C   K 134   112   sp|P04075|ALDOA_HUMAN    364
## 2.1   FALSE    C   K 188   166 sp|P04075-2|ALDOA_HUMAN    418
## 3     FALSE    Y   K 173   154   sp|P04075|ALDOA_HUMAN    364
## 3.1   FALSE    Y   K 227   208 sp|P04075-2|ALDOA_HUMAN    418
##                                                                description
## 1      Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 1.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
## 2      Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 2.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
## 3      Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA PE=1 SV=2
## 3.1 Isoform 2 of Fructose-bisphosphate aldolase A OS=Homo sapiens GN=ALDOA
##                      pepseq modified modification
## 1   GVVPLAGTNGETTTQGLDGLSER    FALSE         <NA>
## 1.1 GVVPLAGTNGETTTQGLDGLSER    FALSE         <NA>
## 2   GVVPLAGTNGETTTQGLDGLSER    FALSE         <NA>
## 2.1 GVVPLAGTNGETTTQGLDGLSER    FALSE         <NA>
## 3      IGEHTPSALAIMENANVLAR    FALSE         <NA>
## 3.1    IGEHTPSALAIMENANVLAR    FALSE         <NA>
##                             idFile                  spectrumFile
## 1   Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 1.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 2   Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 2.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 3   Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
## 3.1 Thermo_Hela_PRTC_selected.mzid Thermo_Hela_PRTC_selected.mgf
##                     databaseFile
## 1   HUMAN_2015_02_selected.fasta
## 1.1 HUMAN_2015_02_selected.fasta
## 2   HUMAN_2015_02_selected.fasta
## 2.1 HUMAN_2015_02_selected.fasta
## 3   HUMAN_2015_02_selected.fasta
## 3.1 HUMAN_2015_02_selected.fasta
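
In the flattened table, rows such as 1 and 1.1 describe the same spectrum matched to two protein entries. The sketch below rebuilds a few columns of the six rows printed above as a plain data frame (values copied from the output, nothing read from the mzid file) and checks two properties: that start and end are inclusive coordinates spanning exactly the peptide, and how far the coordinates are shifted between the canonical entry and its isoform.

```r
# Six rows copied from the head(id) output above (selected columns only).
psm <- data.frame(
    spectrumid = rep(c("index=173", "index=163", "index=200"), each = 2),
    accession  = rep(c("sp|P04075|ALDOA_HUMAN", "sp|P04075-2|ALDOA_HUMAN"),
                     times = 3),
    pepseq     = rep(c("GVVPLAGTNGETTTQGLDGLSER", "IGEHTPSALAIMENANVLAR"),
                     times = c(4, 2)),
    start      = c(112, 166, 112, 166, 154, 208),
    end        = c(134, 188, 134, 188, 173, 227),
    length     = rep(c(364, 418), times = 3),
    stringsAsFactors = FALSE
)

# Number of protein entries each spectrum is matched to.
table(psm$spectrumid)

# start/end are 1-based and inclusive, so the span equals the peptide length.
stopifnot(psm$end - psm$start + 1 == nchar(psm$pepseq))

# Coordinate shift between the isoform row and the canonical row, per spectrum.
shift <- psm$start[c(2, 4, 6)] - psm$start[c(1, 3, 5)]
shift

# Difference in protein length between the two entries.
diff(unique(psm$length))
```

For these six rows the shift equals the difference in protein length, which is consistent with the isoform carrying additional residues before the region the two entries share. This is an observation on the rows shown, not a general rule for isoforms.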

0.4 The Proteins object

library("Pbase")
p <- Proteins(fafile)
p <- addIdentificationData(p, idfile)
## Reading 1 identification files:
##   1. /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## done.
p
## S4 class type     : Proteins
## Class version     : 0.1
## Created           : Mon Oct 17 19:31:22 2016
## Number of Proteins: 9
## Sequences:
##   [1] A4UGR9 [2] A6H8Y1 ... [8] P04075-2 [9] P60709
## Sequence features:
##   [1] DB [2] AccessionNumber ... [11] Filename [12] npeps
## Peptide features:
##   [1] DB [2] AccessionNumber ... [27] acquisitionNum [28] filenames

A Proteins object is composed of a set of protein sequences, accessible with the aa accessor, and an optional set of peptide features that are mapped as coordinates along the proteins, available with pranges. The actual peptide sequences can be extracted with pfeatures.

aa(p)
##   A AAStringSet instance of length 9
##     width seq                                          names               
## [1]  3374 MSPESGHSRIFEATAGPNKPE...TLSKDSLSNGVPSGRQAEFS A4UGR9
## [2]  2624 MFRRARLSVKPNVRPGVGARG...ATTVSEYFFNDIFIEVDETE A6H8Y1
## [3]   911 MVDYHAANQSYQYGPSSAGNG...VPGALDYKSFSTALYGESDL O43707
## [4]  2602 MPVTEKDLAEDAPWKKIQQNT...LAVKWGEEHIPGSPFHVTVP O75369
## [5]   417 MSLSNKLTLDKLDVKGKRVVM...ASLELLEGKVLPGVDALSNI P00558
## [6]   664 METPSQRRATRSGAQASSTPL...SYLLGNSSPRTQSPQNCSIM P02545
## [7]   364 MPYQYPALTPEQKKELSDIAH...PSGQAGAAASESLFVSNHAY P04075
## [8]   418 MARRKPEGSSFNMTHLSMAMA...PSGQAGAAASESLFVSNHAY P04075-2
## [9]   375 MDDDIAALVVDNGSGMCKAGF...WISKQEYDESGPSIVHRKCF P60709
pranges(p)
## IRangesList of length 9
## $A4UGR9
## IRanges object with 36 ranges and 28 metadata columns:
##              start       end     width |    DB AccessionNumber   EntryName
##          <integer> <integer> <integer> | <Rle>     <character> <character>
##   A4UGR9      2743      2760        18 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9       307       318        12 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      1858      1870        13 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      1699      1708        10 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      2622      2637        16 |    sp          A4UGR9 XIRP2_HUMAN
##      ...       ...       ...       ... .   ...             ...         ...
##   A4UGR9        20        31        12 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      1712      1729        18 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9        48        61        14 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      2082      2094        13 |    sp          A4UGR9 XIRP2_HUMAN
##   A4UGR9      2743      2756        14 |    sp          A4UGR9 XIRP2_HUMAN
##          IsoformName
##                <Rle>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##      ...         ...
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##   A4UGR9        <NA>
##                                                                  ProteinName
##                                                                  <character>
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##      ...                                                                 ...
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##   A4UGR9 sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##          OrganismName GeneName          ProteinExistence SequenceVersion
##                 <Rle>    <Rle>                     <Rle>           <Rle>
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##      ...          ...      ...                       ...             ...
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##   A4UGR9 Homo sapiens    XIRP2 Evidence at protein level               2
##          Comment spectrumID chargeState      rank passThreshold
##            <Rle>   <factor>   <integer> <integer>     <logical>
##   A4UGR9    <NA>  index=124           3         1          TRUE
##   A4UGR9    <NA>   index=28           2         1          TRUE
##   A4UGR9    <NA>   index=20           2         1          TRUE
##   A4UGR9    <NA>  index=187           2         1          TRUE
##   A4UGR9    <NA>  index=211           3         1          TRUE
##      ...     ...        ...         ...       ...           ...
##   A4UGR9    <NA>   index=99           2         1          TRUE
##   A4UGR9    <NA>    index=9           2         1          TRUE
##   A4UGR9    <NA>  index=122           2         1          TRUE
##   A4UGR9    <NA>   index=87           2         1          TRUE
##   A4UGR9    <NA>   index=77           2         1          TRUE
##          experimentalMassToCharge calculatedMassToCharge
##                         <numeric>              <numeric>
##   A4UGR9                 715.0305               715.0308
##   A4UGR9                 715.9177               715.4117
##   A4UGR9                 786.9066               786.9081
##   A4UGR9                 629.8380               629.3386
##   A4UGR9                 645.3429               645.3511
##      ...                      ...                    ...
##   A4UGR9                 619.2888               618.7782
##   A4UGR9                1014.0198              1013.5117
##   A4UGR9                 821.4005               820.8909
##   A4UGR9                 720.3445               720.3527
##   A4UGR9                 821.9231               821.9254
##                    sequence    modNum   isDecoy     post      pre
##                    <factor> <integer> <logical> <factor> <factor>
##   A4UGR9 QEITQNKSFFSSVKESQR         0     FALSE        D        K
##   A4UGR9       LPVPKDVYSKQR         0     FALSE        N        R
##   A4UGR9      EQNNDALEKSLRR         0     FALSE        L        R
##   A4UGR9         SLKESSHRWK         0     FALSE        E        K
##   A4UGR9   LKMVPRKQREFSGSDR         0     FALSE        G        K
##      ...                ...       ...       ...      ...      ...
##   A4UGR9       PESGFAEDSAAR         0     FALSE        G        K
##   A4UGR9 QPDAIPGDIEKAIECLEK         1     FALSE        A        K
##   A4UGR9     MARYQAAVSRGDCR         1     FALSE        S        R
##   A4UGR9      TNTSTGLKMAMER         0     FALSE        S        K
##   A4UGR9     QEITQNKSFFSSVK         0     FALSE        E        K
##              start       end        DatabaseAccess DBseqLength DatabaseSeq
##          <integer> <integer>              <factor>   <integer>    <factor>
##   A4UGR9      2743      2760 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9       307       318 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      1858      1870 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      1699      1708 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      2622      2637 sp|A4UGR9|XIRP2_HUMAN        3374            
##      ...       ...       ...                   ...         ...         ...
##   A4UGR9        20        31 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      1712      1729 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9        48        61 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      2082      2094 sp|A4UGR9|XIRP2_HUMAN        3374            
##   A4UGR9      2743      2756 sp|A4UGR9|XIRP2_HUMAN        3374            
##          acquisitionNum
##               <numeric>
##   A4UGR9            124
##   A4UGR9             28
##   A4UGR9             20
##   A4UGR9            187
##   A4UGR9            211
##      ...            ...
##   A4UGR9             99
##   A4UGR9              9
##   A4UGR9            122
##   A4UGR9             87
##   A4UGR9             77
##                                                                               filenames
##                                                                                   <Rle>
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##      ...                                                                            ...
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
##   A4UGR9 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 
## ...
## <8 more elements>
pfeatures(p)
## AAStringSetList of length 9
## [["A4UGR9"]] A4UGR9=QEITQNKSFFSSVKESQR ... A4UGR9=QEITQNKSFFSSVK
## [["A6H8Y1"]] A6H8Y1=EDAEQVALEVDLNQKKRR ...
## [["O43707"]] O43707=QQRKTFTAWCNSHLR ... O43707=VGWEQLLTTIAR
## [["O75369"]] O75369=DLDIIDNYDYSHTVK ... O75369=VQAQGPGLKEAFTNK
## [["P00558"]] P00558=ELNYFAKALESPER P00558=DLMSKAEK ... P00558=GTKALMDEVVK
## [["P02545"]] P02545=METPSQRRATR ... P02545=RATRSGAQASSTPLSPTR
## [["P04075"]] P04075=GVVPLAGTNGETTTQGLDGLSER ...
## [["P04075-2"]] P04075-2=GVVPLAGTNGETTTQGLDGLSER ...
## [["P60709"]] P60709=DLTDYLMKILTER
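
The relationship between the coordinates returned by pranges and the sequences returned by pfeatures can be illustrated in base R: a peptide feature is the stretch of the protein sequence between start and end, both inclusive. The protein string and the coordinates below are hypothetical toy values, since the real sequences are abbreviated in the output above, and the code does not reproduce how Pbase performs the extraction internally.

```r
# Hypothetical toy protein sequence (not one of the nine proteins above).
protein <- "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

# Hypothetical peptide coordinates, 1-based and inclusive as in pranges().
ranges <- data.frame(start = c(2, 11, 17), end = c(8, 16, 21))
ranges$width <- ranges$end - ranges$start + 1

# Extract the stretch of the protein covered by each range.
ranges$sequence <- substring(protein, ranges$start, ranges$end)
ranges
```

The same arithmetic holds for the real data: the first A4UGR9 range runs from 2743 to 2760 with width 18, and the corresponding peptide QEITQNKSFFSSVKESQR has 18 residues.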

A Proteins instance is further described by general metadata, accessible with metadata. Protein sequence and peptide feature annotations can be accessed with ametadata and pmetadata (or acols and pcols), respectively.

metadata(p)
## $created
## [1] "Mon Oct 17 19:31:22 2016"
head(acols(p))
## DataFrame with 6 rows and 12 columns
##      DB AccessionNumber   EntryName IsoformName
##   <Rle>     <character> <character>       <Rle>
## 1    sp          A4UGR9 XIRP2_HUMAN          NA
## 2    sp          A6H8Y1  BDP1_HUMAN          NA
## 3    sp          O43707 ACTN4_HUMAN          NA
## 4    sp          O75369  FLNB_HUMAN          NA
## 5    sp          P00558  PGK1_HUMAN          NA
## 6    sp          P02545  LMNA_HUMAN          NA
##                                         ProteinName OrganismName GeneName
##                                         <character>        <Rle>    <Rle>
## 1     Xin actin-binding repeat-containing protein 2 Homo sapiens    XIRP2
## 2 Transcription factor TFIIIB component B'' homolog Homo sapiens     BDP1
## 3                                   Alpha-actinin-4 Homo sapiens    ACTN4
## 4                                         Filamin-B Homo sapiens     FLNB
## 5                         Phosphoglycerate kinase 1 Homo sapiens     PGK1
## 6                                      Prelamin-A/C Homo sapiens     LMNA
##            ProteinExistence SequenceVersion Comment
##                       <Rle>           <Rle>   <Rle>
## 1 Evidence at protein level               2      NA
## 2 Evidence at protein level               3      NA
## 3 Evidence at protein level               2      NA
## 4 Evidence at protein level               2      NA
## 5 Evidence at protein level               3      NA
## 6 Evidence at protein level               1      NA
##                                                                       Filename
##                                                                          <Rle>
## 1 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 2 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 3 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 4 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 5 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
## 6 /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/HUMAN_2015_02_selected.fasta
##       npeps
##   <integer>
## 1        36
## 2        23
## 3         6
## 4        13
## 5         5
## 6        12
head(pcols(p))
## SplitDataFrameList of length 6
## $A4UGR9
## DataFrame with 36 rows and 28 columns
##        DB AccessionNumber   EntryName IsoformName
##     <Rle>     <character> <character>       <Rle>
## 1      sp          A4UGR9 XIRP2_HUMAN          NA
## 2      sp          A4UGR9 XIRP2_HUMAN          NA
## 3      sp          A4UGR9 XIRP2_HUMAN          NA
## 4      sp          A4UGR9 XIRP2_HUMAN          NA
## 5      sp          A4UGR9 XIRP2_HUMAN          NA
## ...   ...             ...         ...         ...
## 32     sp          A4UGR9 XIRP2_HUMAN          NA
## 33     sp          A4UGR9 XIRP2_HUMAN          NA
## 34     sp          A4UGR9 XIRP2_HUMAN          NA
## 35     sp          A4UGR9 XIRP2_HUMAN          NA
## 36     sp          A4UGR9 XIRP2_HUMAN          NA
##                                                             ProteinName
##                                                             <character>
## 1   sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 2   sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 3   sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 4   sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 5   sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## ...                                                                 ...
## 32  sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 33  sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 34  sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 35  sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
## 36  sp|A4UGR9|XIRP2_HUMAN Xin actin-binding repeat-containing protein 2
##     OrganismName GeneName          ProteinExistence SequenceVersion
##            <Rle>    <Rle>                     <Rle>           <Rle>
## 1   Homo sapiens    XIRP2 Evidence at protein level               2
## 2   Homo sapiens    XIRP2 Evidence at protein level               2
## 3   Homo sapiens    XIRP2 Evidence at protein level               2
## 4   Homo sapiens    XIRP2 Evidence at protein level               2
## 5   Homo sapiens    XIRP2 Evidence at protein level               2
## ...          ...      ...                       ...             ...
## 32  Homo sapiens    XIRP2 Evidence at protein level               2
## 33  Homo sapiens    XIRP2 Evidence at protein level               2
## 34  Homo sapiens    XIRP2 Evidence at protein level               2
## 35  Homo sapiens    XIRP2 Evidence at protein level               2
## 36  Homo sapiens    XIRP2 Evidence at protein level               2
##     Comment spectrumID chargeState      rank passThreshold
##       <Rle>   <factor>   <integer> <integer>     <logical>
## 1        NA  index=124           3         1          TRUE
## 2        NA   index=28           2         1          TRUE
## 3        NA   index=20           2         1          TRUE
## 4        NA  index=187           2         1          TRUE
## 5        NA  index=211           3         1          TRUE
## ...     ...        ...         ...       ...           ...
## 32       NA   index=99           2         1          TRUE
## 33       NA    index=9           2         1          TRUE
## 34       NA  index=122           2         1          TRUE
## 35       NA   index=87           2         1          TRUE
## 36       NA   index=77           2         1          TRUE
##     experimentalMassToCharge calculatedMassToCharge           sequence
##                    <numeric>              <numeric>           <factor>
## 1                   715.0305               715.0308 QEITQNKSFFSSVKESQR
## 2                   715.9177               715.4117       LPVPKDVYSKQR
## 3                   786.9066               786.9081      EQNNDALEKSLRR
## 4                   629.8380               629.3386         SLKESSHRWK
## 5                   645.3429               645.3511   LKMVPRKQREFSGSDR
## ...                      ...                    ...                ...
## 32                  619.2888               618.7782       PESGFAEDSAAR
## 33                 1014.0198              1013.5117 QPDAIPGDIEKAIECLEK
## 34                  821.4005               820.8909     MARYQAAVSRGDCR
## 35                  720.3445               720.3527      TNTSTGLKMAMER
## 36                  821.9231               821.9254     QEITQNKSFFSSVK
##        modNum   isDecoy     post      pre     start       end
##     <integer> <logical> <factor> <factor> <integer> <integer>
## 1           0     FALSE        D        K      2743      2760
## 2           0     FALSE        N        R       307       318
## 3           0     FALSE        L        R      1858      1870
## 4           0     FALSE        E        K      1699      1708
## 5           0     FALSE        G        K      2622      2637
## ...       ...       ...      ...      ...       ...       ...
## 32          0     FALSE        G        K        20        31
## 33          1     FALSE        A        K      1712      1729
## 34          1     FALSE        S        R        48        61
## 35          0     FALSE        S        K      2082      2094
## 36          0     FALSE        E        K      2743      2756
##            DatabaseAccess DBseqLength DatabaseSeq acquisitionNum
##                  <factor>   <integer>    <factor>      <numeric>
## 1   sp|A4UGR9|XIRP2_HUMAN        3374                        124
## 2   sp|A4UGR9|XIRP2_HUMAN        3374                         28
## 3   sp|A4UGR9|XIRP2_HUMAN        3374                         20
## 4   sp|A4UGR9|XIRP2_HUMAN        3374                        187
## 5   sp|A4UGR9|XIRP2_HUMAN        3374                        211
## ...                   ...         ...         ...            ...
## 32  sp|A4UGR9|XIRP2_HUMAN        3374                         99
## 33  sp|A4UGR9|XIRP2_HUMAN        3374                          9
## 34  sp|A4UGR9|XIRP2_HUMAN        3374                        122
## 35  sp|A4UGR9|XIRP2_HUMAN        3374                         87
## 36  sp|A4UGR9|XIRP2_HUMAN        3374                         77
##                                                                          filenames
##                                                                              <Rle>
## 1   /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 2   /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 3   /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 4   /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 5   /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## ...                                                                            ...
## 32  /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 33  /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 34  /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 35  /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 36  /tmp/Rtmp1eCrf9/Rinst3e7835c84a5a/Pbase/extdata/Thermo_Hela_PRTC_selected.mzid
## 
## ...
## <5 more elements>
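
The protein-level and peptide-level annotations are linked: in the output above, A4UGR9 has npeps equal to 36 and its peptide table has 36 rows. The base R sketch below shows how such a per-protein count can be derived from a peptide-level table. The table is a small hypothetical subset built by hand from peptide sequences printed earlier, and the code is an analogy with plain data frames, not the way Pbase computes npeps.

```r
# Hypothetical peptide-level table: one row per peptide, keyed by accession.
peptides <- data.frame(
    AccessionNumber = c("A4UGR9", "A4UGR9", "A4UGR9",
                        "P60709", "O43707", "O43707"),
    sequence = c("QEITQNKSFFSSVKESQR", "LPVPKDVYSKQR", "QEITQNKSFFSSVK",
                 "DLTDYLMKILTER", "QQRKTFTAWCNSHLR", "VGWEQLLTTIAR"),
    stringsAsFactors = FALSE
)

# One table per protein, analogous to the list-like result of pcols().
byProtein <- split(peptides, peptides$AccessionNumber)

# Rows per protein: the same kind of count as the npeps column of acols().
npeps <- vapply(byProtein, nrow, integer(1))
npeps
```

Note that split orders the result by accession, so the counts come back in the order A4UGR9, O43707, P60709 rather than in the order of the input rows.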

Specific proteins can be extracted by index or name using [, and proteins and their peptide features can be plotted with the default plot method.

seqnames(p)
## [1] "A4UGR9"   "A6H8Y1"   "O43707"   "O75369"   "P00558"   "P02545"  
## [7] "P04075"   "P04075-2" "P60709"
plot(p[c(1,9)])
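
To see which proteins an integer index selects, or to find the index for a given name, the names printed by seqnames can be handled as an ordinary character vector. The sketch below uses a copy of that vector typed in by hand; it assumes, as the text above states, that [ on a Proteins object accepts either form, and it does not call Pbase itself.

```r
# Sequence names in the order printed by seqnames(p) above.
nms <- c("A4UGR9", "A6H8Y1", "O43707", "O75369", "P00558", "P02545",
         "P04075", "P04075-2", "P60709")

# Names selected by the integer index used in plot(p[c(1, 9)]).
nms[c(1, 9)]

# Integer positions of the two ALDOA entries, looked up by name.
idx <- match(c("P04075", "P04075-2"), nms)
idx
```

Under that assumption, the positions in idx and the corresponding names would select the same pair of proteins.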

More details can be found in ?Proteins. The object generated above is also directly available as data(p).

0.5 Session information

sessionInfo()
## R version 3.3.1 (2016-06-21)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.1 LTS
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
##  [1] grid      stats4    parallel  stats     graphics  grDevices utils    
##  [8] datasets  methods   base     
## 
## other attached packages:
##  [1] mzID_1.12.0          Biostrings_2.42.0    XVector_0.14.0      
##  [4] Pbase_0.14.0         Gviz_1.18.0          GenomicRanges_1.26.0
##  [7] GenomeInfoDb_1.10.0  IRanges_2.8.0        S4Vectors_0.12.0    
## [10] Rcpp_0.12.7          BiocGenerics_0.20.0  BiocStyle_2.2.0     
## 
## loaded via a namespace (and not attached):
##  [1] Biobase_2.34.0                httr_1.2.1                   
##  [3] vsn_3.42.0                    AnnotationHub_2.6.0          
##  [5] splines_3.3.1                 foreach_1.4.3                
##  [7] Formula_1.2-1                 shiny_0.14.1                 
##  [9] assertthat_0.1                interactiveDisplayBase_1.12.0
## [11] affy_1.52.0                   latticeExtra_0.6-28          
## [13] BSgenome_1.42.0               Rsamtools_1.26.0             
## [15] impute_1.48.0                 yaml_2.1.13                  
## [17] RSQLite_1.0.0                 lattice_0.20-34              
## [19] biovizBase_1.22.0             limma_3.30.0                 
## [21] chron_2.3-47                  digest_0.6.10                
## [23] RColorBrewer_1.1-2            colorspace_1.2-7             
## [25] preprocessCore_1.36.0         htmltools_0.3.5              
## [27] httpuv_1.3.3                  Matrix_1.2-7.1               
## [29] plyr_1.8.4                    MALDIquant_1.15              
## [31] XML_3.98-1.4                  biomaRt_2.30.0               
## [33] zlibbioc_1.20.0               xtable_1.8-2                 
## [35] scales_0.4.0                  affyio_1.44.0                
## [37] cleaver_1.12.0                BiocParallel_1.8.0           
## [39] tibble_1.2                    ggplot2_2.1.0                
## [41] SummarizedExperiment_1.4.0    GenomicFeatures_1.26.0       
## [43] nnet_7.3-12                   survival_2.39-5              
## [45] magrittr_1.5                  mime_0.5                     
## [47] evaluate_0.10                 doParallel_1.0.10            
## [49] foreign_0.8-67                mzR_2.8.0                    
## [51] Pviz_1.8.0                    BiocInstaller_1.24.0         
## [53] tools_3.3.1                   data.table_1.9.6             
## [55] formatR_1.4                   matrixStats_0.51.0           
## [57] stringr_1.1.0                 MSnbase_2.0.0                
## [59] munsell_0.4.3                 cluster_2.0.5                
## [61] AnnotationDbi_1.36.0          ensembldb_1.6.0              
## [63] pcaMethods_1.66.0             RCurl_1.95-4.8               
## [65] iterators_1.0.8               dichromat_2.0-0              
## [67] VariantAnnotation_1.20.0      bitops_1.0-6                 
## [69] rmarkdown_1.1                 gtable_0.2.0                 
## [71] codetools_0.2-15              DBI_0.5-1                    
## [73] reshape2_1.4.1                R6_2.2.0                     
## [75] GenomicAlignments_1.10.0      gridExtra_2.2.1              
## [77] knitr_1.14                    rtracklayer_1.34.0           
## [79] Hmisc_3.17-4                  ProtGenerics_1.6.0           
## [81] stringi_1.1.2                 rpart_4.1-10                 
## [83] acepack_1.3-3.3