02 Juv Chan - Interactive IIS Log Analysis with Azure HDInsight Spark (Linux)

Description

This video shows how to perform interactive IIS log analysis and visualization using Python (PySpark), a Jupyter notebook, and a custom Python library on an Azure HDInsight Spark cluster running on Linux.

Video Table of Contents:

[00:41] Goal

[01:22] Agenda

[02:00] Pre-requisites

[02:44] What is Spark

[03:43] Azure HDInsight Spark (Linux)

[04:53] Management Dashboard Snapshot

[05:49] What is Jupyter Notebook

[07:26] What is RDD

[08:13] RDD Operations

[08:50] Python Spark API (PySpark) and Libraries

[10:35] Code Walk-through Overview 

[12:06] Demo

[23:15] Sample Code GitHub Repository

[23:40] Useful Resources and References

Day: 1
