On-Premises Hadoop and Revolution R: Architectures and Solutions
For machine learning, Hadoop offers new performance capabilities, but not without accompanying tradeoffs in performance, resource consumption, and data management. Machine learning users should consider Hadoop one part of a solution, not the be-all and end-all. Alternatives such as dedicated servers, in-database deployment, and in-memory platforms like Apache Spark can be combined with Hadoop to address a far broader array of opportunities. Fortunately for Revolution R users, Revolution R Enterprise (RRE) allows analytical scripts and models built in RRE to be ported between platforms with relative ease. In this session, we'll review the considerations for R developers, including performance, resource management, and data handling, when deploying on Hadoop, individual servers, clusters and grids, in-database, and in-memory platforms, including Apache Spark. We'll also dive briefly into the internals of RRE on Hadoop to deepen awareness of some of the tradeoffs.