Part 2: Predicting airline delay using an end-to-end Big Data solution using HDInsight and Hive AND Predicting delay using HDInsight and Azure Machine Learning Part two
This is the second part of a two-part session. In the first part, we saw how to collect and model data using HDInsight and Hive.
Machine learning as a service isn't new, but when Microsoft offers it as part of Azure you have to take notice. It is also worth noting that Microsoft already has a cloud-based service for big data built on Hadoop: Azure HDInsight. One of the key components of Azure HDInsight is Mahout, a scalable machine learning library that provides a number of algorithms running on the Hadoop platform. But now there is a new kid in Machine Learning town: Azure Machine Learning, which can also use HDInsight as a data source. Let's find out whether Azure ML is aimed at users who are unwilling or unable to learn the technicalities of implementing a Hadoop solution. In this session we will use the CRISP-DM data mining methodology to build an ML solution from end to end, using HDInsight as a data source. We will look at ML Studio, the algorithms, the types of learning, Hadoop MapReduce on our datasets, ML support for R, the ML API Service, how to interpret the results and, finally, how to deploy the solution to a large-scale, production-like environment using SSIS and C# to build customer recommendations. We will use the data generated in Part 1 of the presentation to predict whether your upcoming flight is going to be delayed. If time allows, we will also check the predictions against the actual outcomes to see how accurate the implemented algorithms are.
Attend this session to learn the concepts behind Azure Machine Learning, how to work with ML Studio, and how to use it in a production-like environment.