Spark Performance Tuning - Part 3

This week's Data Exposed show welcomes back Maxim Lukiyanov to talk more about performance tuning in Spark 2.x. Maxim is a Senior PM on the big data HDInsight team and is in the studio today to present Part 3 of his four-part series.

Topics in today's video:

[00:45] - Recap and overview of the first two videos

[03:40] - Join Types (SortMerge and Broadcast)

[09:30] - Cost-based Optimizer

[21:35] - Outliers and Data Skew

Spark 2.2 rc4 on Azure HDInsight: Script action

Be sure to follow the Data Exposed show on Twitter at @DataExposed!