IoT Analytics Architecture Whiteboard with David Crook

Description

When working with IoT, one of the more common questions we get is: "What is the typical architecture in IoT scenarios?" In this video, David Crook uses a whiteboard to diagram and discuss a very common architecture for working with IoT devices, and then addresses some questions the audience had at the end of the talk.

    The Discussion

    • Lars

      Great explanation. What is your suggested datastore? Would Azure Table storage be OK, or how should I store the files to be able to use them in Hadoop (or Spark)?

    • DrCrook

      @Lars: Blob storage is HDFS compliant via wasb://. However, for new products I would suggest Azure Data Lake, as it is also HDFS compliant, has effectively unlimited capacity, and will support U-SQL and Azure Data Lake Analytics packages.
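      A minimal PySpark sketch of what that HDFS compatibility means in practice: the same read works against either store because both expose an HDFS-compatible file system. The storage account, container, and paths below are illustrative placeholders, not values from the video.

      from pyspark.sql import SparkSession

      spark = SparkSession.builder.appName("iot-telemetry").getOrCreate()

      # Telemetry landed in Blob storage (wasb/wasbs scheme); the cluster must
      # already be configured with the storage account credentials.
      blob_df = spark.read.json(
          "wasbs://telemetry@mystorageaccount.blob.core.windows.net/devices/2016/"
      )

      # The same layout in Azure Data Lake Store (adl scheme).
      adls_df = spark.read.json(
          "adl://mydatalake.azuredatalakestore.net/telemetry/devices/2016/"
      )

      blob_df.printSchema()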

    • larsoleruben

      OK, thanks a lot for your feedback. I really like this architecture. Do you have any customers doing data validation in the Stream Analytics part before saving data to storage? I mean, building a model of the devices the data comes from and then using machine learning to mark data as questionable? Would that be possible for very large amounts of data?

    • DrCrook

      I don't see why not. I'm interested to hear the use case. One thing to note: just because I can do something doesn't mean I necessarily will. For example, to generate an ML model on the fly in a stream, you only have access to a windowed snapshot of the data, which is likely not very much data. You could theoretically bring in the historical stores as well, but then in my opinion you are defeating the purpose of Stream Analytics.

      I would generate an ML model from my historical stores first, then dynamically pull up that model from the stream and compare incoming objects against it. I also do normalization of windowed objects (if necessary) in the stream. You have to architect your ML algorithm fairly intelligently to use it in Stream Analytics, because to update the query itself you need to recycle the stream job. You could theoretically stand up a second job and then shut down the first. I haven't tried it, but it should work.
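      A hedged sketch of that "train offline, score in the stream" pattern, assuming a model trained on the historical store and serialized to a file; the file name, feature names, and -1/1 labelling are illustrative assumptions, not part of the original discussion.

      import pickle
      import numpy as np

      # Model trained earlier on the historical store (e.g. in a batch job).
      with open("device_anomaly_model.pkl", "rb") as f:
          model = pickle.load(f)

      def score_window(events):
          """Normalize one windowed batch of readings and flag questionable ones."""
          readings = np.array([[e["temperature"], e["humidity"]] for e in events])
          # Simple per-window normalization; replace with whatever the model expects.
          normalized = (readings - readings.mean(axis=0)) / (readings.std(axis=0) + 1e-9)
          flags = model.predict(normalized)  # e.g. -1 = questionable, 1 = OK
          return [e for e, flag in zip(events, flags) if flag == -1]

      suspect = score_window([
          {"deviceId": "dev-01", "temperature": 21.4, "humidity": 40.2},
          {"deviceId": "dev-02", "temperature": 98.7, "humidity": 5.1},
      ])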

      As for the quantity of data this will handle, you get up to 16 channels per hub; here is the page for input: https://azure.microsoft.com/en-us/documentation/articles/event-hubs-availability-and-support-faq/

      You can then have different stream jobs listening to one or many channels and, if necessary, nest them by having one job's output feed into another job's input.
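      As a rough illustration of that nesting idea, a first-stage job can write its aggregated output to a second Event Hub that a downstream job reads as its input. The connection string, hub name, and use of the azure-eventhub Python SDK below are assumptions for the sketch.

      import json
      from azure.eventhub import EventHubProducerClient, EventData

      # Producer for the hub that serves as the downstream job's input.
      producer = EventHubProducerClient.from_connection_string(
          conn_str="<first-stage output connection string>",
          eventhub_name="aggregated-telemetry",
      )

      aggregate = {"deviceId": "dev-01", "windowEnd": "2016-01-01T00:05:00Z", "avgTemp": 21.7}

      batch = producer.create_batch()
      batch.add(EventData(json.dumps(aggregate)))
      producer.send_batch(batch)
      producer.close()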

      Sounds like a great session topic :) 

    • meeran

      ok

    • Giuseppe Mascarella

      Great job in making it so simple and easy to remember.
