1 Introduction

The BiocIO package is intended primarily for developers, who can use the abstract classes and generics it provides to develop their own related classes and methods.

2 Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("BiocIO")
library("BiocIO")

2.1 Import and Export

The functions import and export load and save objects from and to particular file formats. This package contains the following generics for the import and export methods used throughout the Bioconductor package suite.

getGeneric("import")
## standardGeneric for "import" defined from package "BiocIO"
## 
## function (con, format, text, ...) 
## standardGeneric("import")
## <bytecode: 0x56190dd11d78>
## <environment: 0x56190df5a5b0>
## Methods may be defined for arguments: con, format, text
## Use  showMethods("import")  for currently available ones.
getGeneric("export")
## standardGeneric for "export" defined from package "BiocIO"
## 
## function (object, con, format, ...) 
## standardGeneric("export")
## <bytecode: 0x56190e031ca0>
## <environment: 0x56190e0d7e58>
## Methods may be defined for arguments: object, con, format
## Use  showMethods("export")  for currently available ones.
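
When writing a method for either generic, the method's formal arguments need to line up with those of the generic shown above. The following is a minimal sketch, assuming BiocIO is installed, that uses only base R and the methods package to inspect the argument names and list the methods registered in the current session.

```r
library(BiocIO)

# Argument names that a new method should reproduce for each generic.
import_args <- names(formals(import))
export_args <- names(formals(export))
print(import_args)
print(export_args)

# List the methods currently registered for each generic; the result
# depends on which packages are loaded in the session.
showMethods("import")
showMethods("export")
```

The list printed by showMethods() grows as other packages that define import and export methods are loaded.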

2.2 The BiocFile Class

BiocFile is a base class for high-level file abstractions, where subclasses are associated with a particular file format/type. It wraps a low-level representation of a file, currently either a path/URL or connection.

2.3 CompressedFile

CompressedFile is a base class that extends BiocFile and offers high-level file abstractions for compressed file formats. As with the BiocFile class, it takes either a path/URL or a connection as an argument. This package also includes other File classes that extend CompressedFile: BZ2File, XZFile, GZFile, and BGZFile, which extends the GZFile class.
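
The inheritance relationships described above can be checked with extends() from the methods package. This is a sketch only; it assumes the class definitions named in the text are visible to the S4 class lookup once BiocIO is attached, which may differ between package versions.

```r
library(BiocIO)

# Each call asks whether the first class inherits from the second.
compressed_is_biocfile <- extends("CompressedFile", "BiocFile")
gz_is_compressed <- extends("GZFile", "CompressedFile")
bgz_is_gz <- extends("BGZFile", "GZFile")

print(c(CompressedFile = compressed_is_biocfile,
        GZFile = gz_is_compressed,
        BGZFile = bgz_is_gz))
```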

3 For developers

3.1 Converting existing “File” Classes

As of the current release, classes that extend the rtracklayer package’s RTLFile, RTLList, or CompressedFile classes emit the following warnings when they are initialized. The warnings can currently be seen with the LoomFile class from LoomExperiment.

library(LoomExperiment)
file <- tempfile(fileext = ".loom")
LoomFile(file)

### LoomFile object
### resource: file.loom
### Warning messages:
### 1: This class is extending the deprecated RTLFile class from
###     rtracklayer. Use BiocFile from BiocIO in place of RTLFile.
### 2: Use BiocIO::resource()

The first warning indicates that the RTLFile class from rtracklayer is being deprecated. The second warning indicates that the resource method from rtracklayer has been moved to BiocIO.

To resolve this issue, simply replace the contains="RTLFile" argument in setClass with contains="BiocFile".

## Old
setClass('LoomFile', contains='RTLFile')

## New
setClass('LoomFile', contains='BiocFile')
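
After the change, a quick check can confirm that instances inherit from BiocFile and that the resource is reachable through BiocIO::resource(). The sketch below uses a hypothetical class named DemoFile, introduced only for illustration, so that it does not depend on LoomExperiment.

```r
library(BiocIO)

# "DemoFile" is a hypothetical class name used only for this illustration.
.DemoFile <- setClass("DemoFile", contains = "BiocFile")

path <- tempfile(fileext = ".demo")
f <- .DemoFile(resource = path)

# The object should now be recognised as a BiocFile ...
print(is(f, "BiocFile"))

# ... and its underlying path retrieved with the BiocIO accessor.
print(BiocIO::resource(f))
```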

3.2 Creating classes and methods that extend BiocFile’s class and methods

The primary purpose of this package is to provide high-level classes and generics to facilitate file IO within the Bioconductor package suite. The remainder of this vignette will detail how to create File classes that extend the BiocFile class and create methods for these classes. This section will also detail using the filter and select methods from the tidyverse dplyr package to facilitate lazy operations on files.

The CSVFile class defined in this package will be used as an example. The purpose of the CSVFile class is to represent a CSV file so that IO operations can be performed on it. The following code defines the CSVFile class, which extends the BiocFile class via the contains argument. The CSVFile function serves as a constructor and requires only the resource argument (either a character string or a connection).

.CSVFile <- setClass("CSVFile", contains = "BiocFile")

CSVFile <-
    function(resource)
{
    .CSVFile(resource = resource)
}

Next, the import and export functions are defined. These functions are meant to import the data into R in a usable format (a data.frame or another user-friendly R class), then export that R object into a file. For the CSVFile example, the base read.csv() and write.csv() functions are used as the body for our methods.

setMethod("import", "CSVFile",
    function(con, format, text, ...)
{
    read.csv(resource(con), ...)
})

setMethod("export", c("data.frame", "CSVFile"),
    function(object, con, format, ...)
{
    write.csv(object, resource(con), ...)
})

Finally, here is a demonstration of the CSVFile class and its import/export methods in action.

temp <- tempfile(fileext = ".csv")
csv <- CSVFile(temp)

export(mtcars, csv)
df <- import(csv)
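
Because both methods forward `...` to read.csv() and write.csv(), arguments of those base functions can be supplied at the call site. The following self-contained sketch repeats the definitions from this section and uses row.names = 1 on import to recover the row names that write.csv() writes by default. It is one possible way to round-trip the data, not the only one.

```r
library(BiocIO)

# Definitions repeated from this section so the block runs on its own.
.CSVFile <- setClass("CSVFile", contains = "BiocFile")
CSVFile <- function(resource) .CSVFile(resource = resource)

setMethod("import", "CSVFile",
    function(con, format, text, ...)
{
    read.csv(resource(con), ...)
})

setMethod("export", c("data.frame", "CSVFile"),
    function(object, con, format, ...)
{
    write.csv(object, resource(con), ...)
})

csv <- CSVFile(tempfile(fileext = ".csv"))
export(mtcars, csv)

# With the defaults, the row names come back as an extra first column.
df_default <- import(csv)
print(ncol(df_default) - ncol(mtcars))

# Passing row.names = 1 through `...` turns that column back into row names.
df <- import(csv, row.names = 1)
print(head(df, 3))
```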

Session info

## R version 4.0.3 (2020-10-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.5 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.12-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.12-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] BiocIO_1.0.1     BiocStyle_2.18.0
## 
## loaded via a namespace (and not attached):
##  [1] bookdown_0.21          IRanges_2.24.0         digest_0.6.27         
##  [4] bitops_1.0-6           GenomeInfoDb_1.26.0    stats4_4.0.3          
##  [7] magrittr_1.5           evaluate_0.14          zlibbioc_1.36.0       
## [10] rlang_0.4.8            stringi_1.5.3          XVector_0.30.0        
## [13] S4Vectors_0.28.0       rmarkdown_2.5          tools_4.0.3           
## [16] stringr_1.4.0          RCurl_1.98-1.2         parallel_4.0.3        
## [19] xfun_0.19              yaml_2.2.1             compiler_4.0.3        
## [22] BiocGenerics_0.36.0    BiocManager_1.30.10    GenomicRanges_1.42.0  
## [25] htmltools_0.5.0        knitr_1.30             GenomeInfoDbData_1.2.4