CHANGES IN VERSION 1.48.0
-------------------------

NEW FEATURES

    o Add subtract() for subtracting a set of genomic ranges from a GRanges
      object. This is similar to bedtools subtract.

    o Add 'na.rm' argument to makeGRangesFromDataFrame().

DEPRECATED AND DEFUNCT

    o Remove the GenomicRangesList() constructor. This constructor got
      deprecated in BioC 3.10 and defunct in BioC 3.13.

BUG FIXES

    o Make sure promoters() works on GPos objects.


CHANGES IN VERSION 1.46.0
-------------------------

    - No changes in this version.


CHANGES IN VERSION 1.44.0
-------------------------

SIGNIFICANT USER-VISIBLE CHANGES

    o Replace KEGG.db usage with KEGGREST in vignettes and examples.

DEPRECATED AND DEFUNCT

    o The GenomicRangesList() constructor is now defunct (got deprecated
      in BioC 3.10).


CHANGES IN VERSION 1.42.0
-------------------------

NEW FEATURES

    o Add nearestKNeighbors() method for GenomicRanges derivatives.

    o coverage() now supports 'method="naive"'. This is in addition to the
      already supported methods "sort" and "hash". This new method is a
      slower version of the "hash" method that has the advantage of avoiding
      floating point artefacts in the no-coverage regions of the numeric-Rle
      object returned by coverage() when the weights are supplied as a
      numeric vector of type 'double'.
      See "FLOATING POINT ARITHMETIC CAN BRING A SURPRISE" example in
      '?coverage' in the IRanges package.


CHANGES IN VERSION 1.40.0
-------------------------

NEW FEATURES

    o Add trim() method for GRangesList objects.


CHANGES IN VERSION 1.38.0
-------------------------

NEW FEATURES

    o GPos objects now exist in 2 flavors: UnstitchedGPos and StitchedGPos.
      GPos is now a virtual class with 2 concrete subclasses: UnstitchedGPos
      and StitchedGPos. In an UnstitchedGPos instance the positions are
      stored as an integer vector. In a StitchedGPos instance, like with old
      GPos instances, the positions are stored as an IRanges object where
      each range represents a run of consecutive positions.
      This is analogous to the IPos/UnstitchedIPos/StitchedIPos situation.
      See ?GPos for more information.
      Old serialized GPos instances can be converted to StitchedGPos
      instances with updateObject().

    o GPos objects now can hold names.

    o Coercion to GPos now propagates the names.

    o Add GRangesFactor class (Factor derivative). See ?GRangesFactor

SIGNIFICANT USER-VISIBLE CHANGES

    o Export from_GPos_to_GRanges().

    o Some reorganization of the GenomicRangesList hierarchy (see commit
      f988a5a9).

    o Swap order of arguments 'seqlengths' and 'seqinfo' of the GRanges()
      constructor so now the latter comes before the former.

DEPRECATED AND DEFUNCT

    o Remove findOverlaps, seqnames, and seqinfo<- methods for RangedData
      objects. These methods were deprecated in BioC 3.8 and defunct in
      BioC 3.9.

BUG FIXES

    o Coercion from RangesList to GRanges is more robust to seqlevel
      differences.

    o Fix bug in isSmallGenome() (introduced by change in sum() in
      R >= 3.5).


CHANGES IN VERSION 1.36.0
-------------------------

NEW FEATURES

    o findOverlaps() now supports type="equal" on GRangesList objects.

DEPRECATED AND DEFUNCT

    o After being deprecated in BioC 3.8, the seqinfo() setter, seqnames(),
      and findOverlaps() are now defunct on RangedData objects.


CHANGES IN VERSION 1.34.0
-------------------------

NEW FEATURES

    o Add coercions from GenomicRanges to IRangesList and from GenomicRanges
      to CompressedIRangesList. These 2 new coercions are equivalent to
      coercion from GenomicRanges to IntegerRangesList, that is, if 'gr' is
      a GenomicRanges object, the 3 following coercions are equivalent and
      return the same CompressedIRangesList object:
          as(gr, "IntegerRangesList")
          as(gr, "IRangesList")
          as(gr, "CompressedIRangesList")

DEPRECATED AND DEFUNCT

    o Deprecate several RangedData methods: seqinfo, seqinfo<-, seqnames,
      and findOverlaps#RangedData#GenomicRanges
      RangedData objects will be deprecated in BioC 3.9 (their use has been
      discouraged since BioC 2.12, that is, since 2014).
      Package developers that are still using RangedData objects need to
      migrate their code to use GRanges or GRangesList objects instead.

BUG FIXES

    o Make [[, as.list(), lapply(), and unlist() fail more gracefully on a
      GenomicRanges object.

    o Make "show" methods for GenomicRanges and GPos objects robust to
      special metadata column names like "stringsAsFactors".

    o Export the "update" method for GRanges objects. This addresses
      https://github.com/Bioconductor/GenomicRanges/issues/7


CHANGES IN VERSION 1.32.0
-------------------------

NEW FEATURES

    o 2 improvements to the "promoters" method for GenomicRanges objects:
      - The 'upstream' and 'downstream' arguments now can be integer
        vectors parallel to 'x'.
      - The 'use.names' argument now is supported. This is for consistency
        with the other intra range transformations.

SIGNIFICANT USER-VISIBLE CHANGES

    o GenomicRanges now is a List subclass. This means that GRanges objects
      and their derivatives are now considered list-like objects (even
      though [[ doesn't work on them yet, this will be implemented in
      Bioconductor 3.8).

    o Add the CompressedGRangesList class as a replacement for the
      GRangesList class. The long term goal is that GRangesList becomes a
      virtual class with CompressedGRangesList as a concrete subclass.
      Note that the GRangesList() constructor now returns a
      CompressedGRangesList instance instead of a GRangesList instance.

    o GenomicRangesList is now a virtual class (like IntegerRangesList is).

    o GRanges derivatives no longer support the 'x[i, j] <- value' form of
      subassignment. This feature was of very limited usefulness and no
      Bioconductor package was using it.

    o Improve performance of nearest(), precede(), and follow() on a
      GRanges object.

    o Improve performance of coverage() on a GPos object.

    o Improve performance of sort() on a GRangesList object. Also now it
      supports 'ignore.strand'.
      See https://github.com/Bioconductor/GenomicRanges/issues/1 (and note
      how unnicely these changes were requested).

    o Improve performance and error handling of coercion from RleList to
      GRanges. This is a 50x speedup or more when the RleList object to
      coerce has thousands of list elements or more.

BUG FIXES

    o Fix coercion from RleList to GRanges when some list elements in the
      object to coerce have length 0 (see
      https://support.bioconductor.org/p/105926/ for original report by
      Xiaotong Yao).

    o Fix bug in nearest() when an unstranded range in 'query' precedes or
      follows more than one range in 'subject'.


CHANGES IN VERSION 1.30.0
-------------------------

NEW FEATURES

    o Support GPos-based GRangesList objects.

    o Add 'na.rm' argument to binnedAverage().

SIGNIFICANT USER-VISIBLE CHANGES

    o Change 'maxgap' and 'minoverlap' defaults for findOverlaps() and
      family (i.e. countOverlaps(), overlapsAny(), and subsetByOverlaps()).
      This change addresses 2 long-standing issues: (1) by default
      zero-width ranges are not excluded anymore, and (2) control of
      zero-width ranges and adjacent ranges is finally decoupled (only
      partially though). New default for 'minoverlap' is 0 instead of 1.
      New default for 'maxgap' is -1 instead of 0. See ?findOverlaps for
      more information about 'maxgap' and the meaning of -1. For example,
      if 'type' is "any", you need to set 'maxgap' to 0 if you want
      adjacent ranges to be considered as overlapping.

    o GPos now extends GRanges but with a ranges slot that must be an IPos
      object. Update "old" GPos objects with updateObject().

    o Move pos() generic to IRanges package.

    o Move rglist() generic to IRanges package.

    o Rename GenomicRangesORmissing and GenomicRangesORGRangesList classes
      -> GenomicRanges_OR_missing and GenomicRanges_OR_GRangesList,
      respectively.

    o Remove "seqinfo" method for RangesList objects.

    o Remove "stack" method for GenomicRangesList objects.

DEPRECATED AND DEFUNCT

    o Remove 'force' argument from seqinfo() and seqlevels() setters (the
      argument got deprecated in BioC 3.5 in favor of new and more flexible
      'pruning.mode' argument).
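
The effect of the new 'maxgap' and 'minoverlap' defaults described in this
release can be checked with a small sketch like the one below. The ranges
and the "chr1" seqname are made up for illustration; only findOverlaps(),
GRanges(), IRanges(), and subjectHits() are existing functions.

```r
library(GenomicRanges)

# Hypothetical toy data: one query range and three subject ranges.
query   <- GRanges("chr1", IRanges(start = 1, end = 10))
subject <- GRanges("chr1", IRanges(start = c(11, 5, 30),
                                   end   = c(20, 8, 40)))

# With the defaults (maxgap = -1, minoverlap = 0) only subject range 2
# (5-8), which shares positions with the query, is reported.
hits_default <- findOverlaps(query, subject)

# With maxgap = 0 the adjacent subject range 1 (11-20) also counts as a
# hit, because it is separated from the query by a gap of 0 positions.
hits_adjacent <- findOverlaps(query, subject, maxgap = 0)

subjectHits(hits_default)
sort(subjectHits(hits_adjacent))
```

Code written before this release that relied on adjacent ranges being
reported would presumably need an explicit 'maxgap = 0'.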

BUG FIXES

    o nearest() and distanceToNearest() now call findOverlaps() internally
      with maxgap=0 and minoverlap=0. This fixes incorrect results obtained
      in some situations e.g. in the situation reported here:
          https://support.bioconductor.org/p/99369/ (zero-width ranges)
      but also in this situation:
          nearest(GRanges("chr1", IRanges(5, 10)),
                  GRanges("chr1", IRanges(1, 4:5)),
                  select="all")
      where the 2 ranges in the subject are *both* nearest to the 5-10
      range.

    o '$' completion on GenomicRanges works in RStudio.

    o Minor tweaks to conversion from character to GRanges and reverse
      conversion.


CHANGES IN VERSION 1.28.0
-------------------------

NEW FEATURES

    o Add coercion from ordinary list to GRangesList. Also the
      GRangesList() constructor function now accepts a list of GRanges as
      input (and just calls new coercion from list to GRangesList on it
      internally).

    o seqlevels() setter now supports "fine" and "tidy" pruning modes on
      GRangesList objects (in addition to "coarse" mode, which is the
      default).

    o "range" methods now have a 'with.revmap' argument (like "reduce" and
      "disjoin" methods).

    o Add a bunch of range-oriented methods for GenomicRangesList objects.

SIGNIFICANT USER-VISIBLE CHANGES

    o Some changes/improvements to "precede" and "follow" methods for
      GenomicRanges objects motivated by discussion on support site:
      https://support.bioconductor.org/p/90664/

    o Some changes/improvements to "rank" method for GenomicRanges objects:
      - now supports the same ties methods as base::rank() (was only
        supporting ties methods "first" and "min" until now)
      - default ties method now is "average", like base::rank()
      - now supports additional argument 'ignore.strand'.

DEPRECATED AND DEFUNCT

    o Argument 'force' of seqinfo() and seqlevels() setters is deprecated
      in favor of new and more flexible 'pruning.mode' argument.

BUG FIXES

    o Fix severe performance regression introduced in Bioconductor 3.3 in
      "intersect" and "setdiff" methods for GRangesList objects.
      Thanks to Jens Reeder for catching and reporting this.


CHANGES IN VERSION 1.26.0
-------------------------

NEW FEATURES

    o Add 'with.revmap' argument to "reduce" method for GRangesList
      objects.

    o Add 'with.revmap' argument to various "disjoin" methods.

    o makeGRangesFromDataFrame() now tries to turn the "start" and "end"
      columns of the input data frame into numeric vectors if they are not
      already.

    o Add makeGRangesListFromDataFrame() function.

    o Add "summary" method for GenomicRanges objects.

    o Add 'use.names' argument to the granges(), grglist(), and rglist()
      generics and methods, as well as to a bunch of "ranges" methods (for
      GRanges, GPos, GNCList, GRangesList, and DelegatingGenomicRanges).
      Default is TRUE to preserve existing behavior.

    o Add 'use.mcols' arguments to the "ranges" methods for GPos objects.

SIGNIFICANT USER-VISIBLE CHANGES

DEPRECATED AND DEFUNCT

BUG FIXES

    o Fix bug in distanceToNearest() related to ranges starting at zero.

    o Fix GRanges(Seqinfo()).


CHANGES IN VERSION 1.24.0
-------------------------

NEW FEATURES

    o Add the GPos class, a container for storing a set of "genomic
      positions" (i.e. genomic ranges of width 1). Even though a GRanges
      object can be used for that, using a GPos object can be much more
      memory-efficient, especially when the object contains long runs of
      adjacent positions.

    o Add a bunch of "invertStrand" methods to support strand inversion of
      any "stranded" object (i.e. any object with a strand() getter and
      setter). E.g. invertStrand() works on GRanges, GRangesList,
      GAlignments, GAlignmentPairs, GAlignmentsList, and
      RangedSummarizedExperiment objects.

    o Add "is.unsorted" method for GenomicRanges objects (contributed by
      Pete Hickey).

    o base::rank() gained a new 'ties.method="last"' option and
      base::order() a new argument ('method') in R 3.3. Thus so do the
      "rank" and "order" methods for GenomicRanges objects.

    o Add "selfmatch" method for GenomicRanges objects.

    o Add "union" method for GRangesList objects.
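
As a rough illustration of the GPos class and of invertStrand() introduced
in this release, here is a minimal sketch. It assumes the GPos() constructor
interface of recent releases (arguments 'seqnames' and 'pos'), which may
differ from the one shipped in 1.24.0; the data are made up.

```r
library(GenomicRanges)

# 1000 consecutive positions on a hypothetical "chr1", stored as a GPos
# object (constructor interface of recent releases, an assumption here).
gpos <- GPos(seqnames = Rle("chr1", 1000), pos = 1:1000)

# The same positions stored as an ordinary GRanges of width-1 ranges.
gr <- GRanges("chr1", IRanges(start = 1:1000, width = 1))

# Compare in-memory sizes; the numbers depend on the installed versions.
print(object.size(gpos))
print(object.size(gr))

# invertStrand() swaps "+" and "-" and leaves "*" untouched.
stranded <- GRanges("chr1", IRanges(start = c(1, 20, 40), width = 5),
                    strand = c("+", "-", "*"))
inverted <- invertStrand(stranded)
as.character(strand(inverted))
```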

SIGNIFICANT USER-VISIBLE CHANGES

    o Remove old SummarizedExperiment class from the GenomicRanges package
      (this class is now defined in the SummarizedExperiment package).

    o Move the following generic functions from the GenomicRanges package
      to the SummarizedExperiment package:
      - SummarizedExperiment
      - exptData, "exptData<-"
      - rowRanges, "rowRanges<-"
      - colData, "colData<-"
      - assayNames, "assayNames<-"
      - assays, "assays<-"
      - assay, "assay<-"

    o Rename "pintersect" and "psetdiff" methods for GRangesList objects
      -> "intersect" and "setdiff" without changing their behavior (they
      still do mendoapply(intersect, x, y) and mendoapply(setdiff, x, y),
      respectively). The old names were misnomers (see svn commit message
      for commit 113793 for more information).

    o Remove the ellipsis (...) from all the setops methods, except from:
      - "punion" method for signature GRanges#GRangesList;
      - "pintersect" and "psetdiff" methods for signature
        GRangesList#GRangesList;
      - "pgap" method for GRanges objects.

    o Use DESeq2 instead of DESeq in the vignettes (better late than
      never).

DEPRECATED AND DEFUNCT

    o Remove GIntervalTree class and methods (were defunct in BioC 3.2).

    o Remove mapCoords() and pmapCoords() (were defunct in BioC 3.2).


CHANGES IN VERSION 1.22.0
-------------------------

NEW FEATURES

    o Support coercions back and forth between a GRanges object and a
      character vector (or factor) with elements in the format
      'chr1:2501-2800' or 'chr1:2501-2800:+'.

    o Add facilities for manipulating "genomic variables": bindAsGRanges(),
      mcolAsRleList(), and binnedAverage(). See ?genomicvars for more
      information.

    o Add "narrow" method for GRangesList objects.

    o Enhancement to the GRanges() constructor. If the 'ranges' argument is
      not supplied then the constructor proceeds in 2 steps:
      1. An initial GRanges object is created with
         'as(seqnames, "GRanges")'.
      2. Then this GRanges object is updated according to whatever other
         arguments were supplied to the call to GRanges().
      Because of this enhancement, GRanges(x) is now equivalent to
      'as(x, "GRanges")' e.g. GRanges() can be called directly on a
      character vector representing ranges, or on a data.frame, or on any
      object for which coercion to GRanges is supported.

    o Add 'ignore.strand' argument to "range" and "reduce" methods for
      GRangesList objects.

    o Add coercion from SummarizedExperiment to RangedSummarizedExperiment
      (also available via updateObject()). See 1st item in DEPRECATED AND
      DEFUNCT section below for more information about this.

    o GNCList objects are now subsettable.

    o "coverage" methods now accept 'shift' and 'weight' supplied as an
      Rle.

SIGNIFICANT USER-VISIBLE CHANGES

    o Modify behavior of "*" strand in precede() / follow() to mimic
      'ignore.strand=TRUE'.

    o Revisit "pintersect" methods for GRanges#GRanges,
      GRangesList#GRanges, and GRanges#GRangesList:
      - Sanitize their semantics.
      - Add 'drop.nohit.ranges' argument (FALSE by default).
      - If 'drop.nohit.ranges' is FALSE, the returned object now has a
        "hit" metadata column added to it to indicate the elements in 'x'
        that intersect with the corresponding element in 'y'.

    o binnedAverage() now treats 'numvar' as if it was set to zero on
      genomic positions where it's not set (typically happens when 'numvar'
      doesn't span the entire chromosomes because it's missing the trailing
      zeros).

    o GRanges() constructor no longer mangles the names of the supplied
      metadata columns (e.g. if the column is "_tx_id").

    o makeGRangesFromDataFrame() now accepts "." in strand column (treated
      as "*").

    o GNCList() constructor now propagates the metadata columns.

    o Remove "seqnames" method for RangesList objects.

DEPRECATED AND DEFUNCT

    o The SummarizedExperiment class defined in GenomicRanges is deprecated
      and replaced by 2 new classes defined in the new SummarizedExperiment
      package: SummarizedExperiment0 and RangedSummarizedExperiment.
      In BioC 3.3, the SummarizedExperiment class will be removed from the
      GenomicRanges package and the SummarizedExperiment0 class will be
      renamed SummarizedExperiment. To facilitate this transition, a
      coercion method was added to coerce from old SummarizedExperiment to
      new RangedSummarizedExperiment (this coercion is performed when
      calling updateObject() on an old SummarizedExperiment object).

    o makeSummarizedExperimentFromExpressionSet() and related stuff was
      moved to the new SummarizedExperiment package.

    o After being deprecated in BioC 3.1, the rowData accessor is now
      defunct (replaced with the rowRanges accessor).

    o After being deprecated in BioC 3.1, GIntervalTree objects and the
      "intervaltree" algorithm in findOverlaps() are now defunct.

    o After being deprecated in BioC 3.1, mapCoords() and pmapCoords() are
      now defunct.

BUG FIXES

    o 2 tweaks to subsetting *by* a GenomicRanges:
      - Improve speed when the object to subset is a SimpleList (e.g.
        SimpleRleList).
      - Fix issue when the GenomicRanges subscript is empty.


CHANGES IN VERSION 1.20.0
-------------------------

NEW FEATURES

    o Add coercion methods to go back and forth between ExpressionSet and
      SummarizedExperiment.

    o Add 'assayNames', 'assayNames<-' for SummarizedExperiment.

    o assays() supports arrays of up to 4 dimensions.

    o Add GNCList() for preprocessing a GenomicRanges object into a GNCList
      object that can be used for fast overlap search with findOverlaps().
      GNCList() is a replacement for GIntervalTree() that uses Nested
      Containment Lists instead of interval trees. Unlike GIntervalTree(),
      GNCList() supports preprocessing of a GenomicRanges object with
      ranges located on a circular sequence. For a one time use, it's not
      advised to explicitly preprocess the input. This is because
      findOverlaps() or countOverlaps() will take care of it and do a
      better job at it (that is, they preprocess only what's needed when
      it's needed and release memory as they go).

    o All "findOverlaps" methods now support 'select' equal to "last" or
      "arbitrary" (in addition to "all" and "first").

    o Add absoluteRanges() and relativeRanges() to transform back and forth
      between absolute and relative genomic ranges.

SIGNIFICANT USER-VISIBLE CHANGES

    o Renamed 'rowData' and 'rowData<-' -> 'rowRanges', 'rowRanges<-'. Old
      names still work but are deprecated.

    o Some improvements to makeGRangesFromDataFrame():
      - Improve internal logic used for finding the GRanges columns in the
        input.
      - If 'seqinfo' is not supplied, the seqlevels are now ordered
        according to the output of GenomeInfoDb::rankSeqlevels().
      - Now an attempt is made to turn 'df' into a data frame (with
        'as.data.frame(df)') if it's not a data frame or a DataFrame
        object.

    o The GRanges() constructor now propagates the metadata cols that are
      on 'ranges' if no metadata cols are explicitly passed to the
      constructor.

DEPRECATED AND DEFUNCT

    o Deprecated 'rowData' and 'rowData<-' in favor of 'rowRanges' and
      'rowRanges<-'.

    o Deprecated mapCoords() and pmapCoords(). They're replaced by
      mapToTranscripts() and pmapToTranscripts() from the GenomicFeatures
      package and mapToAlignments() and pmapToAlignments() from the
      GenomicAlignments package.

    o Deprecated GIntervalTree objects.

    o Removed "map" and "splitAsListReturnedClass" methods (were defunct in
      GenomicRanges 1.18.0).

    o Removed makeSeqnameIds() (was defunct in GenomicRanges 1.18.0).

    o Removed 'with.mapping' argument from "reduce" methods (was defunct in
      GenomicRanges 1.18.0).

BUG FIXES

    o Fix 'findOverlaps(..., type="start")' on GRangesList objects which
      has been broken for years.

    o Fix self overlap search on a GRanges object when 'ignore.strand=TRUE'
      (i.e. 'findOverlaps(gr, ignore.strand=TRUE)').


CHANGES IN VERSION 1.18.0
-------------------------

NEW FEATURES

    o Add 'use.mcols' arg to "ranges" method for GRangesList objects.

    o "assays<-" methods may be invoked with 'withDimnames' arg.

    o Add mapCoords() generic and methods (replacing map()).

    o Add granges,GenomicRanges method.

    o Add strand<-,GRangesList,character method for global replacement
      (i.e., all strands become 'value').

    o Add resize,GRangesList-method.

    o Add DelegatingGenomicRanges class and vignette on how to extend
      GenomicRanges.

    o Document subsetting a named list-like object by a GRanges subscript.

SIGNIFICANT USER-VISIBLE CHANGES

    o Modify "show" methods for GRanges and GRangesList objects so they
      print a 1-line summary of the seqinfo component.

    o Remove as.data.frame,GRangesList-method; use as.data.frame,List.

    o "trim" method for GenomicRanges only trims out-of-bound ranges on
      non-circular sequences whose length is not NA. This behavior is
      consistent with the GenomicRanges validity method.

    o Changes to flank(), resize() and start/end/width setters:
      - no longer trim the result ranges when called on a GRanges
      - warning is issued by GenomicRanges validity method when
        out-of-bound ranges are on non-circular sequences whose length is
        not NA
      Note this behavior is now consistent with that of shift().

    o Speed up validation of GenomicRanges objects by 1.2x.

    o Speed up trim() on GenomicRanges objects by 1.2x.

    o Improve warning when GenomicRanges object contains out-of-bound
      ranges.

    o Work on vignette HOWTOs:
      - split 'How to read BAM files into R' into 3 HOWTOs
      - split 'How to prepare a table of read counts for RNA-Seq
        differential gene expression' into 3 HOWTOs
      - split 'How to extract DNA sequences of gene regions' into 2 HOWTOs
      - make individual HOWTOs subsections of single HOWTO section

    o Follow renaming of TranscriptDb class to TxDb.

    o Replace references to plantsmart21 with plantsmart22.

DEPRECATED AND DEFUNCT

    o Defunct map() (skip deprecation). Replace with mapCoords().

BUG FIXES

    o [cr]bind,SummarizedExperiment methods respect derived classes.

    o assays(se, withDimnames=TRUE) <- value no longer tries to access a
      slot 'withDimnames'.

    o cbind and rbind,SummarizedExperiment-methods respect derived classes.

    o "ranges" method for GRangesList objects should not propagate inner
      metadata columns by default.

    o GRanges() constructor now preserves the seqlevels in the order
      supplied by the user.

    o Ensure tileGenome() breakpoints do not extend past end of genome.

    o Fix "show" method for GenomicRanges objects when 'showHeadLines'
      global option is set to Inf.

    o [rc]bind,SummarizedExperiment-methods now compare all elements.

    o Remove "==" and "<=" methods for GenomicRanges objects (not needed).


CHANGES IN VERSION 1.16.0
-------------------------

NEW FEATURES

    o Add "subset" method for SummarizedExperiment objects.

    o Allow DataFrame in SummarizedExperiment assays.

    o Add 'use.mcols' arg (FALSE by default) to the granges(), grglist(),
      and rglist() generics (a.k.a. the range-squeezer generics).

    o Add coercion method from GRangesList to RangesList.

    o Add score() setter for GRangesList objects.

    o findOverlaps(..., type="within") now works on circular chromosomes.

    o Add 'ignore.strand' arg to "sort" method for GRanges objects.

    o Support subsetting of a named list-like object *by* a GenomicRanges
      subscript.

    o Support sort(granges, by = ~ score), i.e., a formula-based interface
      for sorting by the mcols.

SIGNIFICANT USER-VISIBLE CHANGES

    o Move many functionalities to the new GenomicAlignments package:
      - The GAlignments, GAlignmentPairs, and GAlignmentsList classes.
      - The qnarrow() generic and methods.
      - The "narrow" and "pintersect" methods for GAlignments and
        GAlignmentsList objects.
      - The low-level CIGAR utilities.
      - The "findOverlaps" methods for GAlignment* objects.
      - The summarizeOverlaps() generic and methods, and the "Counting
        reads with summarizeOverlaps" vignette.
      - findCompatibleOverlaps() and countCompatibleOverlaps().
      - The findSpliceOverlaps() generic and methods.
      - The "overlap encodings" stuff i.e.
        the "encodeOverlaps" method for GRangesList objects, flipQuery(),
        selectEncodingWithCompatibleStrand(), isCompatibleWithSplicing(),
        isCompatibleWithSkippedExons(), extractSteppedExonRanks(),
        extractSpannedExonRanks(), extractSkippedExonRanks(), and
        extractQueryStartInTranscript(), and the "OverlapEncodings"
        vignette.

    o Rename 'with.mapping' arg -> 'with.revmap' in "reduce" methods. The
      old arg name is still working but deprecated.

    o Move makeSeqnameIds() function to the new GenomeInfoDb package and
      rename it rankSeqlevels(). The old name is still working but
      deprecated.

    o The "strand" methods now perform stricter checking and are guaranteed
      to always return a factor (or factor-Rle) with the "standard strand
      levels" and no NAs. Or to fail.

BUG FIXES

    o Tweaks and fixes to various "strand" methods:
      - Methods for character vectors and factors do not accept NAs anymore
        (they raise an error).
      - Methods for integer and logical vectors map NAs to * (instead of
        NA).
      - Method for Rle objects now also works on character-, factor-, and
        integer-Rle objects (in addition to logical-Rle objects).


CHANGES IN VERSION 1.14.0
-------------------------

NEW FEATURES

    o Add coercion from GenomicRangesList to RangedDataList.

    o Add "c" method for GAlignmentPairs objects.

    o Add coercion from GAlignmentPairs to GAlignmentsList.

    o Add 'inter.feature' and 'fragment' arguments to summarizeOverlaps().

    o Add seqselect,GAlignments-method.

    o Add CIGAR utilities:
        explodeCigarOps(), explodeCigarOpLengths()
        cigarRangesAlongReferenceSpace(), cigarRangesAlongQuerySpace()
        cigarRangesAlongPairwiseSpace(), extractAlignmentRangesOnReference()
        cigarWidthAlongReferenceSpace(), cigarWidthAlongQuerySpace()
        cigarWidthAlongPairwiseSpace().

    o Add seqlevels0() and restoreSeqlevels().

    o Add seqlevelsInUse() getter for GRanges, GRangesList, GAlignments,
      GAlignmentPairs, GAlignmentsList and SummarizedExperiment objects.

    o Add update,GAlignments method.

    o Add GIntervalTree class.

    o Add coercion from GAlignmentPairs to GAlignments.

    o Add sortSeqlevels().

    o Add tileGenome().

    o Add makeGRangesFromDataFrame() and coercion from data.frame or
      DataFrame to GRanges.

SIGNIFICANT USER-VISIBLE CHANGES

    o Renaming (with aliases from old to new names):
      - classes
          GappedAlignments -> GAlignments
          GappedAlignmentPairs -> GAlignmentPairs
      - functions
          GappedAlignments() -> GAlignments()
          GappedAlignmentPairs() -> GAlignmentPairs()
          readGappedAlignments() -> readGAlignments()
          readGappedAlignmentPairs() -> readGAlignmentPairs()

    o Remove 'asProperPairs' argument to readGAlignmentsList().

    o Modify "show" method for Seqinfo object to honor showHeadLines and
      showTailLines global options.

    o 50x speedup or more when merging 2 Seqinfo objects, 1 very small and
      1 very big.

    o Add dependency on new XVector package.

    o Enhanced examples for renaming seqlevels in seqlevels-utils.Rd.

    o More efficient reference class constructor for 'assays' slot of
      SummarizedExperiment objects.

    o 'colData' slot of SummarizedExperiment produced from call to
      summarizeOverlaps() now holds the class type and length of 'reads'.

    o 4x speedup to cigarToRleList().

    o Relax SummarizedExperiment class validity.

    o Renaming (with aliases from old to new names):
      cigarToWidth() -> cigarWidthOnReferenceSpace(), and
      cigarToQWidth() -> cigarWidthOnQuerySpace().

    o Improvements to summarizeOverlaps():
      - mode 'Union': 1.5x to 2x speedup
      - mode 'IntersectionNotEmpty': 2x to 8x speedup + memory footprint
        reduced by ~ half

    o Change default 'use.names' to FALSE for readGAlignmentsList().

    o Implement 'type="equal"' for findOverlaps,SummarizedExperiment
      methods.

    o Modify summarizeOverlaps() examples to use 'asMates=TRUE' instead of
      'obeyQname=TRUE'.

    o Remove unneeded "window" method for GenomicRanges objects.

    o Speed up seqinfo() getter and setter on SummarizedExperiment objects
      and derivatives (e.g. VCF) by using direct access to 'rowData' slot.

    o coverage,GenomicRanges method now uses .Ranges.coverage() when using
      the defaults for 'shift' and 'width'.

    o Remove restriction that metadata column names must be different on a
      GRangesList and the unlisted GRanges.

    o GenomicRangesUseCases vignette has been redone and renamed to
      GenomicRangesHOWTOs.

DEPRECATED AND DEFUNCT

    o Defunct all "match" and "%in%" methods in the package except for
      those with the GenomicRanges,GenomicRanges signature.

    o Deprecate GappedAlignment*:
      - GappedAlignments and GappedAlignmentPairs classes
      - GappedAlignments() and GappedAlignmentPairs() constructors
      - readGappedAlignments() and readGappedAlignmentPairs() functions

    o Deprecate cigar util functions:
        cigarToWidth(), cigarToQWidth(), cigarToIRanges()
        splitCigar(), cigarToIRanges(), cigarToIRangesListByAlignment()
        cigarToIRangesListByRName(), cigarToWidth(), cigarToQWidth()
        cigarToCigarTable(), summarizeCigarTable()

    o Deprecate seqselect().

BUG FIXES

    o Fix bug in c,GAlignments for case when objects were unnamed.

    o Fix bug in flank,GenomicRanges (when 'ignore.strand=TRUE' 'start' was
      being set to TRUE).

    o Fix bug in behavior of summarizeOverlaps() count mode
      'IntersectionNotEmpty' when 'inter.feature=FALSE'. Shared regions are
      now removed before counting.

    o Fix bug in cigarToIRangesListByAlignment() when 'flag' is supplied
      and indicates some reads are unmapped.

    o Fix bug in summarizeOverlaps(..., mode='IntersectionNotEmpty') when
      'features' has '-' and '+' elements and 'ignore.strand=TRUE'.

    o match,GenomicRanges,GenomicRanges method now properly handles objects
      with seqlevels not in the same order.


CHANGES IN VERSION 1.12.0
-------------------------

NEW FEATURES

    o Implement "seqnameStyle" replacement method for Seqinfo object.
      'seqnameStyle(x) <- style' works on any object with a "seqinfo"
      replacement method.

    o Add trim,GenomicRanges-method to trim out-of-bound ranges.

    o Add promoters,GenomicRanges and promoters,GRangesList methods.

    o Add "overlapsAny" methods as a replacement for the deprecated "%in%"
      methods.

    o Add 'ignore.strand' argument to match,GenomicRanges-method.

    o Add 'with.mapping' argument to "reduce" method for GenomicRanges
      objects.

    o Add "unname" method to remove dimnames from SummarizedExperiment.

    o Add "cbind" and "rbind" methods for SummarizedExperiment.

    o Add "seqselect", "seqselect<-" and "split" methods for
      SummarizedExperiment.

    o Add GAlignmentsList class.

    o Add readGAlignmentsList generic and methods.

SIGNIFICANT USER-VISIBLE CHANGES

    o resize,GenomicRanges method no longer checks that 'fix' is
      length-compatible with 'x' when 'x' is length zero. This allows for
      resize(x, w, fix = "end") without worrying about 'x' being
      zero-length.

    o Change the behavior of "distance". Previously adjacent ranges had a
      distance of 1 and overlapping had a distance of 0. Now both adjacent
      AND overlapping have a distance of 0.

    o shift,GenomicRanges-method no longer trims out-of-bound ranges.

    o "distanceToNearest" no longer drops ranges that have no hit but
      returns 'NA' for 'subjectHits' and 'distance'.

    o "genome" is no longer an invalid metadata colname for GenomicRanges
      objects.

    o 4x-8x speedup for doing coverage() on a GRanges or GRangesList with
      many seqlevels.

    o Remove ">=", "<", and ">" methods for GenomicRanges objects.

    o Speedup "seqinfo" setters for GenomicRanges and GappedAlignments by
      avoiding validation when not necessary.

    o readGappedAlignments can now pass a BamFile to
      readBamGappedAlignments.

    o Remove unneeded "unique" and "sort" methods for GenomicRanges
      objects.

    o Change behavior of "match" and "%in%" on GenomicRanges objects to use
      equality instead of overlap for comparing elements between
      GenomicRanges objects 'x' and 'table'.

    o match,GenomicRanges-method gets the same 'method' argument as the
      "duplicated" method for these objects.

    o Remove unneeded "countOverlaps" methods.

    o "classNameForDisplay" shortens the name of data type when displayed.

    o Add global options 'showHeadLines' and 'showTailLines' to control the
      number of head/tail lines displayed in show,GRanges and
      show,GappedAlignments methods.

    o "distanceToNearest" now returns a Hits object instead of DataFrame.

DEPRECATED AND DEFUNCT

    o Remove defunct countGenomicOverlaps(), grg(), and globalToQuery().

    o Defunct previously deprecated '.ignoreElementMetadata' argument of
      c,GenomicRanges-method.

    o Deprecate all "match" and "%in%" methods in the package except for
      those with the GenomicRanges,GenomicRanges signature.

    o Deprecate "resolveHits" methods.

BUG FIXES

    o Several bug fixes to "nearest".

    o Output of "findSpliceOverlaps" now displays 'NA' for ranges with no
      hits.


CHANGES IN VERSION 1.10.0
-------------------------

NEW FEATURES

    o SummarizedExperiment gains direct GRanges / GRangesList interface to
      rowData.

    o Add "distanceToNearest" method for GenomicRanges objects.

    o SummarizedExperiment class can now be subset by row when there are no
      'columns', and by column when there are no 'rows'.

    o Add 'drop.D.ranges' argument to coverage,GappedAlignments and
      coverage,GappedAlignmentPairs methods.

    o findOverlaps() now supports 'select="last"' and 'select="arbitrary"'
      (in addition to 'select="all"' and 'select="first"') on GenomicRanges
      objects.

    o summarizeOverlaps(..., mode="IntersectionStrict") now handles
      circular chromosomes. A warning is issued and circular chromosomes in
      'reads' are omitted from counting.

    o Add disjoin,GRangesList method.

    o Add findSpliceOverlaps() for identifying ranges (reads) that are
      compatible with a specific transcript isoform (the non-compatible
      ranges are analyzed for the presence of novel splice events).

    o Add ngap,GappedAlignmentPairs method.

    o Add introns() generic with methods for GappedAlignments and
      GappedAlignmentPairs objects.

    o No more arbitrary max of 3 gaps per read in
      isCompatibleWithSplicing() and isCompatibleWithSkippedExons().

    o Add findCompatibleOverlaps() and countCompatibleOverlaps().

    o Passing '...'
      down through as.data.frame(GRanges, ...) so user can tweak
      stringsAsFactors default for metadata columns.

    o Add extractSteppedExonRanks(), extractSpannedExonRanks() and
      extractQueryStartInTranscript() utilities (work with single- and
      paired-end reads).

    o Add 'flip.query.if.wrong.strand' arg (FALSE by default) to
      "encodeOverlaps" method for GRangesList objects.

    o Add makeSeqnameIds() low-level utility.

SIGNIFICANT USER-VISIBLE CHANGES

    o SummarizedExperiment rowData and assays operations have significant
      performance improvements.

    o mcols() is now the preferred way (over elementMetadata() or values())
      to access the metadata columns of a GenomicRanges, GRangesList,
      GappedAlignments, GappedAlignmentPairs, SummarizedExperiment object,
      or any Vector object. elementMetadata() and values() might go away
      at some point in the (not so close) future.

    o Add "$" and "$<-" methods for GenomicRanges *only*. Provided as a
      convenience and as the result of strong popular demand. Note that
      those methods are not consistent with the other "$" and "$<-"
      methods in the IRanges/GenomicRanges infrastructure, and might
      confuse some users by making them believe that a GenomicRanges
      object can be manipulated as a data.frame-like object. It is
      therefore recommended to use them only interactively, and their use
      in scripts or packages is discouraged. For the latter, use
      'mcols(x)$name' instead of 'x$name'.

    o No more warning when doing as(x, "GRanges") on a RangedData object
      with no "strand" column.

    o Refactor "[" method for GenomicRanges objects. The new implementation
      always preserves the names of the selected elements instead of
      trying to return a GenomicRanges object with unique names. This new
      behavior is consistent with subsetting of ordinary vectors and other
      Vector objects defined in IRanges/GenomicRanges.
      Also modify "seqselect" method for GenomicRanges objects so it also
      preserves the names of the selected elements (and thus remains
      consistent with new behavior of "[" method for GenomicRanges
      objects).

    o No more names on the integer vector returned by "ngap" method for
      GappedAlignments objects.

    o Many improvements to the "Overlap encodings" vignette.

    o Remove 'param' argument from summarizeOverlaps() generic.

DEPRECATED AND DEFUNCT

    o Defunct previously deprecated grg() function.

    o Defunct previously deprecated countGenomicOverlaps() generic and
      methods.

BUG FIXES

    o Fix several issues with "precede", "follow", "nearest", and
      "distance" methods for GenomicRanges objects.

    o Fix bug in summarizeOverlaps(..., ignore.strand=TRUE).

    o 6x speedup (and a 6x memory footprint reduction) or more when using
      encodeOverlaps() on big GRangesList objects.

    o Fix bug in renameSeqlevels() wrt order of rename vector.

    o Fix bug in selectEncodingWithCompatibleStrand().

CHANGES IN VERSION 1.8.0
------------------------

NEW FEATURES

    o Add GappedAlignmentPairs class (with accessors first(), last(),
      left(), right(), seqnames(), strand(), isProperPair()), and
      readGappedAlignmentPairs() for dealing with paired-end reads.
      Most of the GappedAlignments functionalities (e.g. coercion to
      GRangesList, "findOverlaps" and related methods, "coverage", etc...)
      work on a GappedAlignmentPairs object.

    o Add encodeOverlaps,GRangesList,GRangesList,missing and related
      utilities flipQuery(), selectEncodingWithCompatibleStrand(),
      isCompatibleWithSplicing(), isCompatibleWithSkippedExons() and
      extractSkippedExonRanks().

    o Add 'order.as.in.query' arg to grglist() and rglist().

    o SummarizedExperiment gains direct access to colData columns with $,
      $<-, [[, and [[<- methods.

    o Add map,GenomicRanges,GRangesList and
      map,GenomicRanges,GappedAlignments methods. These allow mapping from
      genome space to transcript space, and genome space to read space,
      respectively.
    o Add seqinfo methods (and friends) for RangedData, RangesList, and
      other IRanges data structures. These use metadata(x)$seqinfo.

    o Add disjointBins,GenomicRanges.

    o Add score,GRangesList and score,GenomicRanges (gets the score column
      like for RangedData).

    o Add RangedDataList -> GenomicRangesList coercion.

    o Add RleViewsList -> GRanges coercion.

    o Add pintersect,GRangesList,GRangesList.

    o Add stack,GenomicRangesList.

    o ignore.strand argument now more uniformly supported on set
      operations.

    o Add Ops,GenomicRanges (from rtracklayer).

    o Add strand,Rle (only logical-Rle is supported).

    o Add compare,GenomicRanges.

    o Add 'drop.empty.ranges' arg (FALSE by default) to low-level cigar
      utilities cigarToIRanges(), cigarToIRangesListByAlignment(), and
      cigarToIRangesListByRName().

    o Add 'reduce.ranges' arg to cigarToIRangesListByAlignment().

SIGNIFICANT USER-VISIBLE CHANGES

    o grglist,GappedAlignments now carries over metadata columns.

    o Names are no longer forced to be unique when unlisting a GRangesList
      with use.names=TRUE.

    o seqnames() is now preferred over rname() on a GappedAlignments
      object.

    o cigarToIRangesListByAlignment() now returns a CompressedIRangesList
      instead of CompressedNormalIRangesList.

    o Low-level CIGAR utilities now ignore CIGAR operation P (instead of
      throwing an error).

    o The 'weight' arg in "coverage" method for GenomicRanges objects now
      can also be a single string naming a column in elementMetadata(x).

    o Ranges outside the sequence bounds of the underlying sequences are
      now accepted (with a warning) in
      GenomicRanges/GRangesList/GappedAlignments objects.

    o When called with 'ignore.strand=TRUE', the "range" and "disjoin"
      methods for GenomicRanges objects now behave as if they set the
      strand of the input to "*" before they do any computation.
    o When called with 'ignore.strand=TRUE', "reduce" method for
      GenomicRanges objects, and "union", "intersect" and "setdiff"
      methods for GRanges objects now set the strand of their arguments to
      "*" prior to any computation.

    o No more mangling of the names when combining GRanges objects ("c"
      method for GRanges objects was trying to return unique names).

    o Remove isCircularWithKnownLength() generic and methods (nobody knows,
      uses, or needs this).

BUG FIXES

    o flank,GRangesList no longer forces 'use.names' to TRUE and 'both' to
      FALSE.

    o range,GenomicRanges was broken when object had no ranges.

    o Fix integer overflow issue that can occur in cigarNarrow() or
      cigarQNarrow() when the cigar vector is very long.

CHANGES IN VERSION 1.6.0
------------------------

NEW FEATURES

    o seqlevels() and seqinfo() setters have a new arg ('force', default is
      FALSE) to force dropping sequence levels currently in use.

    o Seqinfo objects now have a genome column that can be accessed with
      genome() getter/setter.

    o Add "pgap" method for c(x="GRanges", y="GRanges").

    o Add comparison (==, <=, duplicated, unique, etc...) and ordering
      (order, sort, rank) methods for GenomicRanges objects.

    o Add "flank" method for GRangesList objects.

    o Add "isDisjoint" and "restrict" methods for GRanges and GRangesList
      objects.

    o Add GRangesList constructor makeGRangesListFromFeatureFragments().

    o Add "names" and "names<-" methods for GappedAlignments objects.

    o Add 'ignore.strand' arg to a number of methods:
        - findOverlaps,GRangesList,RangesList
        - findOverlaps,GappedAlignments,ANY
        - findOverlaps,ANY,GappedAlignments

    o 'shift' and 'weight' arguments of "coverage" method for GenomicRanges
      objects now can be numeric vectors in addition to lists.

    o Add "c" method for GappedAlignments objects.

SIGNIFICANT USER-VISIBLE CHANGES

    o readGappedAlignments() supports 2 new arguments: (1) 'use.names'
      (default is FALSE) for using the query template names (QNAME field
      in a SAM/BAM file) to set the names of the returned object, and
      (2) 'param' (default is NULL, otherwise a ScanBamParam object) for
      controlling what fields and which records are imported.
      readGappedAlignments() doesn't support the 'which' arg anymore.

    o The names of a GRanges/GRangesList/GappedAlignments object are not
      required to be unique anymore.

    o By default, the rownames are not set anymore on the DataFrame
      returned by elementMetadata() on a
      GRanges/GRangesList/GappedAlignments object.

    o 'width' arg of "coverage" method for GenomicRanges objects now must
      be NULL or numeric vector (instead of NULL or list).

DEPRECATED AND DEFUNCT

    o Deprecate countGenomicOverlaps() in favor of summarizeOverlaps().

    o Deprecate grg() in favor of granges().

BUG FIXES

    o Fix bug in "pintersect" methods operating on GappedAlignments
      objects.