In February 2017, we announced SQL Data Warehouse (SQLDW) PolyBase support for Azure Data Lake Store (ADLS). Customers can rapidly load cooked (processed) data from ADLS into SQLDW using PolyBase, shortening the time it takes to start interactive analytics.
However, that support requires users to initiate the load from SQLDW itself. Customers want to run ingestion, preparation, and data loading pipelines on a schedule. They want cooked data to be loaded into SQLDW automatically at the end of each pipeline run, ready for analysis and dashboarding. Azure Data Factory (ADF) has been used both to run the pipeline activities and to perform the scheduled copying of data. The copy speed is bottlenecked, however, because data is copied into SQLDW one row at a time instead of being loaded in parallel.
To address this gap, Azure Data Factory now supports using PolyBase in the copy activity to load data from ADLS into SQLDW. Data can now be copied many times faster than with earlier copy activities that did not use PolyBase. In addition, Azure Data Factory now supports Service Principal authentication for ADLS, so your copy activities can run uninterrupted without needing to be reauthorized every couple of months. PolyBase support in ADF requires activities to use Service Principal authentication with ADLS, which means you get high-throughput data loading without interruptions.
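As a rough illustration of the two pieces described above, the sketch below builds the JSON for an ADLS linked service that uses Service Principal authentication and for a copy activity whose SQLDW sink has PolyBase enabled. The property names (`servicePrincipalId`, `servicePrincipalKey`, `tenant`, `AzureDataLakeStoreSource`, `SqlDWSink`, `allowPolyBase`) follow the ADF JSON format as we understand it and should be checked against the current documentation. The helper functions, the linked service, activity, and dataset names, and all placeholder values are hypothetical and chosen only for this example.

```python
import json


def adls_linked_service(store_uri, sp_id, sp_key, tenant):
    """Build an ADLS linked service definition that uses Service Principal auth.

    The linked service name is a hypothetical example value.
    """
    return {
        "name": "AzureDataLakeStoreLinkedService",
        "properties": {
            "type": "AzureDataLakeStore",
            "typeProperties": {
                "dataLakeStoreUri": store_uri,
                "servicePrincipalId": sp_id,
                "servicePrincipalKey": sp_key,
                "tenant": tenant,
            },
        },
    }


def polybase_copy_activity(input_dataset, output_dataset):
    """Build a copy activity that loads from ADLS into SQLDW with PolyBase enabled."""
    return {
        "name": "CopyFromADLSToSQLDW",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{"name": input_dataset}],
        "outputs": [{"name": output_dataset}],
        "typeProperties": {
            "source": {"type": "AzureDataLakeStoreSource"},
            # allowPolyBase asks the copy activity to load through PolyBase
            # rather than inserting rows one at a time.
            "sink": {"type": "SqlDWSink", "allowPolyBase": True},
        },
    }


# Placeholder values only; substitute your own account, application, and tenant.
linked_service = adls_linked_service(
    store_uri="https://<account-name>.azuredatalakestore.net/webhdfs/v1",
    sp_id="<application-id>",
    sp_key="<application-key>",
    tenant="<tenant-id-or-domain>",
)
activity = polybase_copy_activity("ADLSCookedData", "SQLDWTargetTable")

print(json.dumps(linked_service, indent=2))
print(json.dumps(activity, indent=2))
```

The printed JSON is meant as a starting point for the linked service and pipeline definitions, not as a complete pipeline; datasets, scheduling, and the SQLDW linked service still need to be defined separately.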
The video below shows you how to use these new features. Learn how to operate your big data pipeline in a convenient, automated way, and use the time you save to focus on getting even more insights from your data. Give it a try today.