Building a Kickass Data Science Pipeline with Azure Batch, Ubuntu, and Microsoft R Server

Description

A typical data science pipeline involves feature engineering, model building (often with cross-validation and hyperparameter optimization), and production deployment of the resulting models. There are many ways to build such a pipeline, but the cloud plays a key role in almost all of them. In this session we'll use a range of open source tools on Microsoft Azure to build out our pipeline, including Microsoft R Server, Azure Batch, Azure Functions, Ubuntu, and a sprinkling of Python. The example is based on a real-world production system that has been ported to a sample dataset, so you'll be able to take all of the code and scripts away to use yourself.

Day: 3

Level: 300

Session Type: Breakout

Code: M364

Room: Marlborough Room (SKYCITY)
