Building a Kickass Data Science Pipeline with Azure Batch, Ubuntu, and Microsoft R Server
A typical data science pipeline involves feature engineering, model building (often with cross-validation and hyperparameter optimization), and then production deployment of those models. There are many ways to build such a pipeline, but the cloud plays a key role in almost all of them. In this session we'll use a range of open source tools on Microsoft Azure to build out our pipeline. This includes Microsoft R Server, Azure Batch, Azure Functions, Ubuntu, and a sprinkling of Python. The example is based on a real-world production system that has been ported to a sample dataset, so you'll be able to take all of the code and scripts away to use yourself.
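To make the model-building step concrete, here is a minimal Python sketch of why cross-validation combined with hyperparameter optimization suits a batch service: every (hyperparameter, fold) pair is an independent task. This is not the session's code. The one-dimensional ridge model, the toy data, and the names `k_fold_indices`, `fit_ridge_1d`, `run_task`, and `grid_search_cv` are all assumptions made for illustration, and the tasks run locally in a loop rather than on Azure Batch.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n, k):
    """Split range(n) into k (train, test) index pairs (hypothetical helper)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def fit_ridge_1d(xs, ys, lam):
    """Closed-form slope for y ~ w * x with an L2 penalty lam."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def run_task(task, xs, ys):
    """Score one (hyperparameter, fold) pair; needs nothing from other tasks."""
    lam, (train, test) = task
    w = fit_ridge_1d([xs[i] for i in train], [ys[i] for i in train], lam)
    return lam, mean((ys[i] - w * xs[i]) ** 2 for i in test)


def grid_search_cv(xs, ys, lams, k=3):
    """Build the task list, score every task, and pick the best penalty."""
    tasks = list(product(lams, k_fold_indices(len(xs), k)))
    # Run sequentially here; a batch service could run these tasks in parallel.
    results = [run_task(t, xs, ys) for t in tasks]
    scores = {
        lam: mean(mse for task_lam, mse in results if task_lam == lam)
        for lam in lams
    }
    return min(scores, key=scores.get), scores


# Toy data (an assumption for illustration): y is exactly 2 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]
best_lam, scores = grid_search_cv(xs, ys, lams=[0.0, 1.0, 10.0])
```

Three penalties and three folds give nine tasks. Because each task depends only on its own inputs, the loop in `grid_search_cv` is the part that could be handed to a pool of workers, with only the final aggregation done in one place.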