A Future for R

Download this episode

Download Video


A future is an abstraction for a value that may be available at some point in the future and which state is either unresolved or resolved. When a future is resolved the value is readily available. How and when futures are resolved is given by their evaluation strategies, e.g. synchronously in the current R session or asynchronously on a compute cluster or in background processes. Multiple asynchronous futures can be created without blocking the main process providing a simple yet powerful construct for parallel processing. It is only when the value of an unresolved future is needed it blocks. We present the future package which defines a unified API for using futures in R, either via explicit constructs f <- future({ expr }) and v <- value(f) or via implicit assignments (promises) v %<-% { expr }. From these it is straightforward to construct classical *apply() mechanism. The package implements synchronous eager and lazy futures as well as multiprocess (single-machine multicore and multisession) and cluster (multi-machine) futures. Additional future types can be implemented by extending the future package, e.g. BatchJobs and BiocParallel futures. We show that, because of the unified API and because global variables are automatically identified and exported, an R script that runs sequentially on the local machine can with a single change of settings run in, for instance, a distributed fashion on a remote cluster with values still being collected on the local machine. The future package is cross-platform and available on CRAN with source code on GitHub (https://github.com/HenrikBengtsson/future).





Available formats for this video:

Actual format may change based on video formats available and browser capability.

    The Discussion

    Comments closed

    Comments have been closed since this content was published more than 30 days ago, but if you'd like to send us feedback you canĀ Contact Us.