Data Preparation with the U-SQL language
In this episode of Data Exposed, Scott welcomes Michael Rys, Principal Program Manager in the Big Data group, to the show. Michael is in the studio to talk about an exciting new language called U-SQL, the big data query language of the Azure Data Lake Analytics service. There has been a lot of buzz around U-SQL since its announcement, and Michael wastes no time explaining what U-SQL is and, more importantly, why it is needed. He builds the case for U-SQL by first explaining some of the characteristics of big data analytics and then discussing use cases that support the need for the language.
Michael gives us some insight into the current landscape for querying big data, covering the pros and cons of existing technologies, including SQL-based languages (such as Hive) and general-purpose programming languages. He then discusses how U-SQL addresses many of these challenges by unifying the benefits of a declarative query language with the extensibility of a programming language in a single language.
At the 8:25 mark, Michael wraps up the show with a short U-SQL code sample that extracts tweet data, counts the tweets, and writes the results to a new CSV file. The demo shows plain SQL operating over unstructured data. Michael then extends the demo by adding C# expressions to the script.
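For readers who have not seen U-SQL before, a script of the kind described might look roughly like the sketch below. This is not the code from the episode: the file paths, the column names (`created`, `author`, `tweet`), and the rowset names are assumptions made for illustration. It uses the built-in `Extractors.Csv()` and `Outputters.Csv()` to read and write CSV files, and a C# string method inside the `SELECT` to normalize author names before counting.

```usql
// Hypothetical input path and schema; adjust both to match your own data.
@tweets =
    EXTRACT created string,
            author  string,
            tweet   string
    FROM "/input/tweets.csv"
    USING Extractors.Csv();

// C# expression inside SELECT: normalize author names before grouping.
@normalized =
    SELECT author.ToLowerInvariant() AS author,
           tweet
    FROM @tweets;

// Count tweets per author with ordinary SQL aggregation.
@counts =
    SELECT author,
           COUNT(*) AS tweetcount
    FROM @normalized
    GROUP BY author;

// Write the result to a new CSV file (hypothetical output path).
OUTPUT @counts
TO "/output/tweet_counts.csv"
ORDER BY tweetcount DESC
USING Outputters.Csv();
```

The C# call is kept in its own rowset so that the `GROUP BY` operates on a plain column, which keeps the aggregation step as close to standard SQL as possible.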
This is a great introduction to the U-SQL language. We will definitely be having Michael back for further discussions on this awesome language.