Date: 2017-06-27
Spark Performance Tuning - Part 2

This week's Data Exposed show welcomes back Maxim Lukiyanov to talk more about Spark performance tuning with Spark 2.x. Maxim is a Senior PM on the big data HDInsight team and is in the studio today to present Part 2 of his 4-part series.

Topics in today's video:

[01:40] - Datasets vs. DataFrames vs. RDDs

[10:45] - Garbage Collection Overhead and Executor Size

[18:20] - Data Formats

[22:35] - Data Partitioning

[26:25] - Caching

Data, Performance, Spark, Big Data

