Taking R to new heights for scalability and performance
Big Data is all the rage, but how can enterprises extract value from the large accumulations of data found in growing corporate "data lakes" or "data reservoirs"? Extracting value from big data demands high-performance, scalable tools, in both hardware and software. Increasingly, enterprises take on massive predictive modeling projects, where the goal is to build models on multi-billion-row tables or to build thousands or even millions of models. Data scientists need to address use cases that range from modeling individual customer behavior, whether to understand aggregate behavior or to tailor predictions at the individual customer level, to monitoring Internet of Things sensors for anomalous behavior. While R is cited as the most widely used statistical language, limitations in scalability and performance frequently restrict its use for big data. In this talk, we present scenarios using R on both Hadoop and database platforms. We illustrate how Oracle Advanced Analytics' R Enterprise interface and Oracle R Advanced Analytics for Hadoop take R to new heights for scalability and performance.