CHANGES IN VERSION 0.16.0 ------------------------- NEW FEATURES o Added 'as.sparse' argument to read_block() (see ?read_block) and to AutoRealizationSink() (see ?AutoRealizationSink). o SparseArraySeed objects now can hold dimnames. As a consequence read_block() now also propagates the dimnames to sparse blocks, not just to dense blocks. o Matrix multiplication is now sparse-aware via sparseMatrices. o Added is_sparse<- generic (with methods for HDF5Array/HDF5ArraySeed objects only, see ?HDF5Array in the HDF5Array package). o Added viewportApply() and viewportReduce() to the blockApply() family. o Added set_grid_context() for testing/debugging callback functions passed to blockApply() and family. SIGNIFICANT USER-VISIBLE CHANGES o Renamed first write_block() argument 'x' -> 'sink' o Renamed: RealizationSink() -> AutoRealizationSink() get/setRealizationBackend() -> get/setAutoRealizationBackend() blockGrid() -> defaultAutoGrid() row/colGrid() -> row/colAutoGrid() o Improved support of sparse data: - Slightly more efficient coercion from SparseArraySeed to dgCMatrix/lgCMatrix (small speedup and memory footprint reduction). This provides a minor speedup to the sparse aware block-processed row/col summarization methods for DelayedMatrix objects when the object is sparse. (These methods are: row/colSums(), row/colMeans(), row/colMins(), row/colMaxs(), and row/colRanges(). The methods defined in DelayedMatrixStats are not sparse aware yet so are not affected.) - Made the following block-processed operations on DelayedArray objects sparse aware: anyNA(), which(), max(), min(), range(), sum(), prod(), any(), all(), and mean(). With a typical 50%-60% speedup when the DelayedArray object is sparse. - Implemented a bunch of methods to operate natively on SparseArraySeed objects. Their main purpose is to support the above i.e. to support block processed methods for DelayedArray objects like sum(), mean(), which(), etc... when the object is sparse. Note that more are needed to also support the sparse aware block-processed row/col summarization methods for DelayedMatrix objects so we can finally ditch the costly coercion from SparseArraySeed to dgCMatrix/lgCMatrix that they currently rely on. o The utility functions for retrieving grid context for the current block/viewport should now be called with no argument (previously one needed to pass the current block to them). These functions are effectiveGrid(), currentBlockId(), and currentViewport(). o DelayedArray now depends on the MatrixGenerics package. BUG FIXES o Various fixes and improvements to block processing of sparse logical DelayedMatrix objects (e.g. DelayedMatrix object with a lgCMatrix seed from thr Matrix package). o Fix extract_sparse_array() inefficiency on dgCMatrix and lgCMatrix objects. o Switch matrix multiplication to bplapply2() from bpiterate() to fix error handling. CHANGES IN VERSION 0.14.0 ------------------------- NEW FEATURES o Support 'type(x) <- new_type' to change the type of a DelayedArray object. o 1D-style single bracket subsetting of DelayedArray objects now supports subsetting by a numeric matrix with one column per dimension. SIGNIFICANT USER-VISIBLE CHANGES o No more parallel evaluation by default, that is, getAutoBPPARAM() now returns NULL on a fresh session instead of one of the parallelization backends defined in BiocParallel. It is now the responsibility of the user to set the parallelization backend (with setAutoBPPARAM()) if they wish things like matrix multiplication, rowsum() or rowSums() use parallel evaluation again. Also BiocParallel has been moved from Depends to Suggests. o Replace arrayInd2() and linearInd() with Lindex2Mindex() and Mindex2Lindex(). The new functions are implemented in C for better performances and they properly handle L-index values greater than INT_MAX (2^31 - 1) in the input and output. o 2x speedup to coercion from DelayedArray to SparseArraySeed or dgCMatrix. DEPRECATED AND DEFUNCT o arrayInd2() and linearInd() are now deprecated in favor of Lindex2Mindex() and Mindex2Lindex(). BUG FIXES o Fix handling of linear indices >= 2^31 in 1D-style single bracket subsetting of DelayedArray objects. o rowsum() & colsum() methods for DelayedArray objects now respect factor level ordering (issue #59). o Coercion from DelayedMatrix to dgCMatrix now propagates the dimnames. o No more quotes around the NA values of a DelayedArray of type "character". o Better error message when Ops methods for DelayedArray objects reject their operands. CHANGES IN VERSION 0.12.0 ------------------------- NEW FEATURES o Add isPristine() o Delayed subassignment now accepts a right value with dimensions that are not strictly the same as the dimensions of the selection as long as the "effective dimensions" are the same o Small improvement to delayed dimnames setter: atomic vectors or factors in the supplied 'dimnames' list are now accepted and passed thru as.character() SIGNIFICANT USER-VISIBLE CHANGES o Improve show() method for DelayedArray objects (see commit 54540856) BUG FIXES o Setting and getting the dimnames of a DelayedArray object or derivative now preserves the names on the dimnames o Some fixes related to DelayedArray objects with list array seeds (see commit 6c94eac7) CHANGES IN VERSION 0.10.0 ------------------------- NEW FEATURES o Many improvements to matrix multiplication (%*%) of DelayedMatrix objects by Aaron Lun. Also add limited support for (t)crossprod methods. o Add rowsum() and colsum() methods for DelayedMatrix objects. These methods are block-processed operations. o Many improvements to the RleArray() contructor (see messages for commits 582234a7 and 0a36ee01 for more info). o Add seedApply() o Add multGrids() utility (still a work-in-progress, not documented yet) CHANGES IN VERSION 0.8.0 ------------------------ NEW FEATURES o Add get/setAutoBlockSize(), getAutoBlockLength(), get/setAutoBlockShape() and get/setAutoGridMaker(). o Add rowGrid() and colGrid(), in addition to blockGrid(). o Add get/setAutoBPPARAM() to control the automatic 'BPPARAM' used by blockApply(). o Reduce memory usage when realizing a sparse DelayedArray to disk On-disk realization of a DelayedArray object that is reported to be sparse (by is_sparse()) to a "sparsity-optimized" backend (i.e. to a backend with a memory efficient write_sparse_block() like the TENxMatrix backend imple- mented in the HDF5Array package) now preserves sparse representation of the data all the way. More precisely, each block of data is now kept in a sparse form during the 3 steps that it goes thru: read from seed, realize in memory, and write to disk. o showtree() now displays whether a tree node or leaf is considered sparse or not. o Enhance "aperm" method and dim() setter for DelayedArray objects. In addition to allowing dropping "ineffective dimensions" (i.e. dimensions equal to 1) from a DelayedArray object, aperm() and the dim() setter now allow adding "ineffective dimensions" to it. o Enhance subassignment to a DelayedArray object. So far subassignment to a DelayedArray object only supported the **linear form** (i.e. x[i] <- value) with strong restrictions (the subscript 'i' must be a logical DelayedArray of the same dimensions as 'x', and 'value' must be an ordinary vector of length 1). In addition to this linear form, subassignment to a DelayedArray object now supports the **multi-dimensional form** (e.g. x[3:1, , 6] <- 0). In this form, one subscript per dimension is supplied, and each subscript can be missing or be anything that multi-dimensional subassignment to an ordinary array supports. The replacement value (a.k.a. the right value) can be an array-like object (e.g. ordinary array, dgCMatrix object, DelayedArray object, etc...) or an ordinary vector of length 1. Like the linear form, the multi-dimensional form is also implemented as a delayed operation. o Re-implement internal helper simple_abind() in C and support long arrays. simple_abind() is the workhorse behind realization of arbind() and acbind() operations on DelayedArray objects. o Add "table" and (restricted) "unique" methods for DelayedArray objects, both block-processed. o range() (block-processed) now supports the 'finite' argument on a DelayedArray object. o %*% (block-processed) now works between a DelayedMatrix object and an ordinary vector. o Improve support for DelayedArray of type "list". o Add TENxMatrix to list of supported realization backends. o Add backend-agnostic RealizationSink() constructor. o Add linearInd() utility for turning array indices into linear indices. Note that linearInd() performs the reverse transformation of base::arrayInd(). o Add low-level utilities mapToGrid() and mapToRef() for mapping reference array positions to grid positions and vice-versa. o Add downsample() for reducing the "resolution" of an ArrayGrid object. o Add maxlength() generic and methods for ArrayGrid objects. SIGNIFICANT USER-VISIBLE CHANGES o Multi-dimensional subsetting is no more delayed when drop=TRUE and the result has only one dimension. In this case the result now is returned as an **ordinary** vector (atomic or list). This is the only case of multi-dimensional single bracket subsetting that is not delayed. o Rename defaultGrid() -> blockGrid(). The 'max.block.length' argument is replaced with the 'block.length' argument. 2 new arguments are added: 'chunk.grid' and 'block.shape'. o Major improvements to the block processing mechanism. All block-processed operations (except realization by block) now support blocks of **arbitrary** geometry instead of column-oriented blocks only. 'blockGrid(x)', which is called by the block-processed operations to get the grid of blocks to use on 'x', has the following new features: 1) It's "chunk aware". This means that, when the chunk grid is known (i.e. when 'chunkGrid(x)' is not NULL), 'blockGrid(x)' defines blocks that are "compatible" with the chunks i.e. that any chunk is fully contained in a block. In other words, blocks are chosen so that chunks don't cross their boundaries. 2) When the chunk grid is unknown (i.e. when 'chunkGrid(x)' is NULL), blocks are "isotropic", that is, they're as close as possible to an hypercube instead of being "column-oriented" (column-oriented blocks, also known as "linear blocks", are elongated along the 1st dimension, then along the 2nd dimension, etc...) 3) The returned grid has the lowest "resolution" compatible with 'getAutoBlockSize()', that is, the blocks are made as big as possible as long as their size in memory doesn't exceed 'getAutoBlockSize()'. Note that this is not a new feature. What is new though is that an exception now is made when the chunk grid is known and some chunks are >= 'getAutoBlockSize()', in which case 'blockGrid(x)' returns a grid that is the same as the chunk grid. These new features are supposed to make the returned grid "optimal" for block processing. (Some benchmarks still need to be done to confirm/quantify this.) o The automatic block size now is set to 100 Mb (instead of 4.5 Mb previously) at package startup. Use setAutoBlockSize() to change the automatic block size. o No more 'BPREDO' argument to blockApply(). o Replace block_APPLY_and_COMBINE() with blockReduce(). BUG FIXES o No-op operations on a DelayedArray derivative really act like no-ops. Operating on a DelayedArray derivative (e.g. RleArray, HDF5Array or GDSArray) will now return an objet of the original class if the result is "pristine" (i.e. if it doesn't carry delayed operations) instead of degrading the object to a DelayedArray instance. This applies for example to 't(t(x))' or 'dimnames(x) <- dimnames(x)' etc...