Building A Scalable Data Science Platform with R and Hadoop

Download this episode

Download Video


Hadoop is famously scalable. Cloud Computing is famously scalable. R – the thriving and extensible open source Data Science software – not so much. But what if we seamlessly combined Hadoop, Cloud Computing, and R to create a scalable Data Science platform? Imagine exploring, transforming, modeling, and scoring data at any scale from the comfort of your favorite R environment. Now, imagine calling a simple R function to operationalize your predictive model as a scalable, cloud-based Web Service. Learn how to leverage the magic of Hadoop on-premises or in the cloud to run your R code, thousands of open source R extension packages, and distributed implementations of the most popular machine learning algorithms at scale.



Available formats for this video:

Actual format may change based on video formats available and browser capability.

    The Discussion

    Comments closed

    Comments have been closed since this content was published more than 30 days ago, but if you'd like to send us feedback you can Contact Us.