Thomas Alex joins Lara Rubbelke to discuss how Microsoft uses Apache Kafka for HDInsight to power Siphon, a data ingestion service for internal use. Apache Kafka for HDInsight is an enterprise-grade, open-source streaming ingestion service. Microsoft created Siphon as a highly available and reliable service to ingest massive amounts of data for processing in near real time. Siphon ingests over a trillion events per day across multiple business-critical scenarios at Microsoft. In this episode, learn how Siphon uses Apache Kafka for HDInsight as its scalable pub/sub message queue.
For more information:
- Quickstart: Create a Kafka on HDInsight cluster
- Tutorial: Use Spark Structured Streaming with Kafka on HDInsight
- Apache Kafka for HDInsight overview
- Azure HDInsight pricing
- Siphon: Streaming data ingestion with Apache Kafka
- Create a free account (Azure)