Building A Scalable Data Science Platform with R and Hadoop

Play Building A Scalable Data Science Platform with R and Hadoop
Sign in to queue


Hadoop is famously scalable. Cloud Computing is famously scalable. R – the thriving and extensible open source Data Science software – not so much. But what if we seamlessly combined Hadoop, Cloud Computing, and R to create a scalable Data Science platform? Imagine exploring, transforming, modeling, and scoring data at any scale from the comfort of your favorite R environment. Now, imagine calling a simple R function to operationalize your predictive model as a scalable, cloud-based Web Service. Learn how to leverage the magic of Hadoop on-premises or in the cloud to run your R code, thousands of open source R extension packages, and distributed implementations of the most popular machine learning algorithms at scale.



Download this episode

The Discussion

Add Your 2 Cents