Artur Laksberg


Niner since 2009


  • Device to device communication using Azure IoT Hub

    Divya's blog post with more detail and links to the source is here.

  • GoingNative 31: Easy Parallelization with Parallel STL

    @Norbert: You can find the implementation here.

    To get reliable results, you need to perform multiple runs, preferably on different types of hardware. The variance is usually more significant than the precision of the timer, so measuring small differences in performance comes down to probability and statistics. We've found Student's t-test to be useful for measuring the difference between the sets of runs.
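
    A minimal sketch of the kind of comparison described above, using Welch's t-statistic (a variant of Student's t-test that tolerates unequal variances). The function name and the sample timings are illustrative, not from any PPL or STL source:

    ```cpp
    #include <cassert>
    #include <cmath>
    #include <vector>

    // Welch's t-statistic for two independent samples of run times.
    // A large |t| suggests the performance difference is real, not timer noise.
    double welch_t(const std::vector<double>& a, const std::vector<double>& b) {
        auto mean = [](const std::vector<double>& v) {
            double s = 0.0;
            for (double x : v) s += x;
            return s / v.size();
        };
        auto var = [](const std::vector<double>& v, double m) {
            double s = 0.0;
            for (double x : v) s += (x - m) * (x - m);
            return s / (v.size() - 1);  // unbiased sample variance
        };
        double ma = mean(a), mb = mean(b);
        double se = std::sqrt(var(a, ma) / a.size() + var(b, mb) / b.size());
        return (ma - mb) / se;
    }

    int main() {
        // Hypothetical elapsed times (ms) from two sets of benchmark runs.
        std::vector<double> baseline  = {102.1, 99.8, 101.5, 100.2, 100.9};
        std::vector<double> optimized = {95.4, 96.1, 94.8, 95.9, 95.2};
        double t = welch_t(baseline, optimized);
        assert(t > 2.0);  // well separated: the speedup is unlikely to be noise
        return 0;
    }
    ```

    In practice you would compare |t| against the t-distribution's critical value for your chosen significance level and degrees of freedom, rather than a fixed threshold.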

  • Casablanca: C++ on Azure

    @maxbeard12: Correct! Azure is not required for consuming REST services.

    BTW, we're not actively monitoring this page, so in the future I suggest the Casablanca forums for questions and feedback:

    Artur Laksberg,
    Casablanca Team

  • Parallel Programming for C++ Developers: Tasks and Continuations, Part 1 of 2


    It's a great question, but any meaningful answer must be qualified with "compared with what?"

    Tasks definitely have overhead over plain OS threads, and if the amount of work you're doing in a task is small, the overhead becomes more pronounced. Because tasks are built on top of other PPL/ConcRT constructs, they have overhead over these constructs. So one way to look at the efficiency is to try to solve the same problem using different constructs and compare the results.

    To take one specific example, I can calculate a Fibonacci number (using the naïve recursive algorithm) in a parallel_for loop from 0 to 100, then do the same by spawning 100 tasks, and compare the elapsed time of both solutions. The data I get on my laptop shows that the task-based solution is about 5% slower than the parallel_for-based solution. I'm not too worried about this kind of overhead because a) it's small and, more importantly, b) you would never do this in a real-world application: parallel_for is the better tool for this problem.
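
    The shape of that experiment can be sketched in standard C++; I'm using std::async as a rough stand-in for PPL tasks and a plain loop in place of parallel_for, and a smaller upper bound than 100 so the naïve recursion finishes quickly. The constants and variable names are illustrative:

    ```cpp
    #include <cassert>
    #include <chrono>
    #include <future>
    #include <vector>

    // Naive recursive Fibonacci, the deliberately expensive workload.
    long long fib(int n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }

    int main() {
        using clock = std::chrono::steady_clock;
        const int kTop = 25;  // illustrative; far smaller than the original 100

        // Variant 1: a plain loop, standing in for parallel_for.
        auto t0 = clock::now();
        std::vector<long long> loop_results(kTop + 1);
        for (int i = 0; i <= kTop; ++i) loop_results[i] = fib(i);
        double loop_ms =
            std::chrono::duration<double, std::milli>(clock::now() - t0).count();

        // Variant 2: one task per Fibonacci number, standing in for PPL tasks.
        t0 = clock::now();
        std::vector<std::future<long long>> tasks;
        for (int i = 0; i <= kTop; ++i)
            tasks.push_back(std::async(std::launch::async, fib, i));
        std::vector<long long> task_results;
        for (auto& f : tasks) task_results.push_back(f.get());
        double task_ms =
            std::chrono::duration<double, std::milli>(clock::now() - t0).count();

        assert(loop_results == task_results);  // both variants must agree
        // Compare loop_ms and task_ms over many runs (and machines) with the
        // t-test approach above before drawing any conclusion; one run is noise.
        (void)loop_ms; (void)task_ms;
        return 0;
    }
    ```

    Note that std::async gives each call a real thread here, so its overhead profile differs from PPL's work-stealing scheduler; the point is the experimental setup, not the specific numbers.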

    Some of our other performance tests show that the overhead can be quite significant, so more optimization is in order.

    Now to my main point. For problems where PPL tasks offer a more productive programming model, their performance should be "good enough" so that you don't have to fall back to a less productive programming model, such as OS threads. If we have accomplished that, we have succeeded. If you want to use PPL tasks but are forced to use some lower-level constructs to get the performance you need, we have failed.