1 Introduction

When running a large benchmark study, it is not uncommon for a single method or a small subset of methods to fail during execution. This may be the result of misspecified parameters, an underlying bug in the software, or any number of other reasons. By default, errors thrown by methods that fail during buildBench() or updateBench() (see Feature: Iterative Benchmarking for details on updateBench()) are caught and handled in a user-friendly way. As long as at least one method executes without error, a SummarizedBenchmark object is returned as usual, with the assay columns of failed methods set to NA. The corresponding error messages are stored in the metadata of the object for reference.
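The catch-and-record behavior described above can be illustrated with a minimal base R sketch. This is not the package's implementation; the toy methods goodMethod and badMethod, and the results and errors objects, are hypothetical names used only to show the idea of keeping an NA column for a failed method while storing its error message.

```r
# Toy stand-ins for benchmark methods; "badMethod" always fails.
methods <- list(
    goodMethod = function() c(1, 2, 3),
    badMethod  = function() stop("misspecified parameter")
)

# One column per method, pre-filled with NA.
results <- matrix(NA_real_, nrow = 3, ncol = length(methods),
                  dimnames = list(NULL, names(methods)))
errors <- list()

for (m in names(methods)) {
    # Return the condition object instead of aborting the loop.
    out <- tryCatch(methods[[m]](), error = function(e) e)
    if (inherits(out, "error")) {
        # Leave the NA column in place and keep the message for reference.
        errors[[m]] <- conditionMessage(out)
    } else {
        results[, m] <- out
    }
}

results
errors
```

The failing method does not stop the loop, so the output of goodMethod is preserved alongside the recorded message for badMethod.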

2 Simple Case Study

As an example, consider a case where we benchmark two simple methods. The first, slowMethod, draws 5 random normal samples after waiting 5 seconds; the second, fastMethod, draws 5 random normal samples immediately. Each method is then passed through two post-processing functions: keepSlow and makeSlower for slowMethod, and keepFast and makeSlower for fastMethod. This results in three partially overlapping assays: keepSlow, keepFast and makeSlower. With this example, we also demonstrate how mismatched assays are handled across methods.

We run these methods in parallel using parallel = TRUE and specify a timeout limit of 1 second for the BPPARAM. With this limit, slowMethod is expected to fail, and fastMethod is expected to fail during the makeSlower post-processing function.
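A setup along these lines might look like the following sketch. It assumes the BenchDesign(), addMethod() and buildBench() interfaces of SummarizedBenchmark and the SerialParam() back end with bptimeout() from BiocParallel. The object names toy, bd, bpp and sb, the placeholder data set, and the body of makeSlower (a 5 second wait) are assumptions made for illustration.

```r
library(SummarizedBenchmark)
library(BiocParallel)

# Placeholder data set (assumption): the toy methods below ignore it.
toy <- data.frame(x = 1:5)

bd <- BenchDesign(data = toy)

# slowMethod waits 5 seconds before drawing 5 normal samples.
bd <- addMethod(bd, label = "slowMethod",
                func = function() { Sys.sleep(5); rnorm(5) },
                post = list(keepSlow = identity,
                            makeSlower = function(x) { Sys.sleep(5); x }))

# fastMethod draws 5 normal samples immediately.
bd <- addMethod(bd, label = "fastMethod",
                func = function() { rnorm(5) },
                post = list(keepFast = identity,
                            makeSlower = function(x) { Sys.sleep(5); x }))

# Back end with a timeout limit of 1 second.
bpp <- SerialParam()
bptimeout(bpp) <- 1

sb <- buildBench(bd, parallel = TRUE, BPPARAM = bpp)
```

Whether the timeout is actually enforced can depend on the back end and the installed BiocParallel version, so the set of methods that fail may differ between systems.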

Notice that during the execution process, errors caught by buildBench() are printed to the console along with the name of the failed method and post-processing function when appropriate.

We can verify that a valid SummarizedBenchmark object is still returned with the remaining results.

## class: SummarizedBenchmark 
## dim: 5 2 
## metadata(1): sessions
## assays(3): keepSlow makeSlower keepFast
## rownames: NULL
## rowData names(3): keepSlow makeSlower keepFast
## colnames(2): slowMethod fastMethod
## colData names(4): func.pkg func.pkg.vers func.pkg.manual session.idx

We can also check the values of the assays.

## $keepSlow
##       slowMethod fastMethod
## [1,]  0.75650626         NA
## [2,]  0.91036653         NA
## [3,]  0.39366979         NA
## [4,] -1.61575252         NA
## [5,] -0.07700346         NA
## 
## $makeSlower
##       slowMethod fastMethod
## [1,]  0.75650626 -1.3274499
## [2,]  0.91036653 -0.4026993
## [3,]  0.39366979  1.8109946
## [4,] -1.61575252 -0.9919988
## [5,] -0.07700346  1.0552455
## 
## $keepFast
##      slowMethod fastMethod
## [1,]         NA -1.3274499
## [2,]         NA -0.4026993
## [3,]         NA  1.8109946
## [4,]         NA -0.9919988
## [5,]         NA  1.0552455

Notice that some columns contain only NA values. An all-NA column arises either when a method returned an error or when a post-processing function was not defined for the method, e.g. no keepSlow function was defined for the fastMethod method. The NA values alone do not reveal which of these two causes applies, but this is documented in the sessions list of the SummarizedBenchmark metadata. Although the sessions object is a list containing information for all previous sessions, we are only interested in the current, first session. (For more details on why multiple sessions may be run, see the Feature: Iterative Benchmarking vignette.)

## [1] "methods"     "results"     "parameters"  "sessionInfo"

In sessions, there is a "results" entry which includes a summary of the results for each combination of method and post-processing function (assay). The entries of results can take one of three values: "success", "missing", or an error message of class buildbench-error. The easiest way to view these results is by passing them to the base R function simplify2array().
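To show what simplify2array() does with a nested list of this shape, here is a small base R sketch. The list res is a mock with hypothetical method and assay names, not output from a real benchmark.

```r
# Mock "results" list: one entry per method, each naming the same assays
# in the same order. (Hypothetical values for illustration only.)
res <- list(
    methodA = list(assay1 = "success", assay2 = "missing"),
    methodB = list(assay1 = "missing", assay2 = "success")
)

# Collapse the nested list into a table: rows are assays, columns are methods.
tab <- simplify2array(res)
tab

# tab is a list-matrix, so single cells are extracted with [[ ]].
tab[["assay2", "methodA"]]
```

Note that simplify2array() takes the row names from the first list element only, so the table is labeled correctly only when every method lists its assays in the same order.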

##            slowMethod fastMethod
## keepFast   "missing"  "missing" 
## keepSlow   "success"  "success" 
## makeSlower "success"  "success"

In the returned table, columns correspond to methods, and rows correspond to assays. Any method that failed, for example by exceeding the specified time limit, has its error message in place of "success". Checking a single entry more closely shows either the status string or, for a failure, a buildbench-error object that records where the error occurred ("origin"), e.g. during the "main" function.
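Failed entries can also be located programmatically by testing each entry for the buildbench-error class. The sketch below uses a mock results list: the class name follows the text above, while the structure of the error object (its "message" and "origin" fields) and all method and assay names are assumptions made for illustration.

```r
# Hypothetical stand-in for a stored error. The "message" and "origin"
# fields are assumptions of this sketch, not the package's documented layout.
timeoutError <- structure(
    list(message = "reached elapsed time limit", origin = "main"),
    class = c("buildbench-error", "list")
)

res <- list(
    methodA = list(assay1 = timeoutError, assay2 = timeoutError),
    methodB = list(assay1 = "success", assay2 = "missing")
)

# For every method, flag the assays whose entry is an error object.
failed <- lapply(res, function(m) {
    vapply(m, inherits, logical(1), what = "buildbench-error")
})
failed

# Look up where the failure of methodA occurred.
res$methodA$assay1$origin
```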

## [1] "success"

3 Disable Handling

If this error handling is not wanted, and the user would like the benchmark experiment to terminate when an error is thrown, then the optional parameter catchErrors = FALSE can be specified to either buildBench() or updateBench(). Generally, this is advised against, as the outputs computed for all non-failing methods will also be lost. As a result, the entire benchmarking experiment will need to be re-executed.
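A minimal sketch of this behavior, assuming the same BenchDesign(), addMethod() and buildBench() interfaces as the package provides. The toy methods okMethod and badMethod and the placeholder data set are hypothetical. The call is wrapped in tryCatch() only so that the sketch itself runs to completion.

```r
library(SummarizedBenchmark)

# Placeholder data set (assumption): the toy methods below ignore it.
bd <- BenchDesign(data = data.frame(x = 1:5))
bd <- addMethod(bd, label = "okMethod",
                func = function() { rnorm(5) })
bd <- addMethod(bd, label = "badMethod",
                func = function() { stop("method failed") })

# With catchErrors = FALSE the error is not handled by buildBench(),
# so no SummarizedBenchmark object is returned, not even for okMethod.
res <- tryCatch(buildBench(bd, catchErrors = FALSE),
                error = function(e) e)
inherits(res, "error")
```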
