Streaming Big Data on Azure with HDInsight Kafka, Storm and Spark
Implementing big data streaming pipelines for robust enterprise use cases is hard, and doing so with open source technologies is even harder. To help with this, HDInsight recently added Kafka as a managed service, completing a scalable big data streaming scenario on Azure. This service processes millions of events per second and petabytes of data per day to power scenarios such as Toyota’s connected car, Office 365’s clickstream analytics, and fraud detection for large banks. We will discuss the streaming landscape and the challenges of building production-ready streaming services, then build an enterprise-grade real-time pipeline. We will close with lessons learned and future investments in managed open source streaming through Azure HDInsight.