Coffeehouse Thread

17 posts

Asynchrony, Progress Reporting and Cancellation

  • exoteric

    The new async/await pattern in C# is a great step forward but when dealing with asynchronous computations or more generally processes, three things inevitably show up:

    • progress reporting
    • cancellation
    • multi-valued returns

    There's an article about how to incorporate both cancellation and progress reporting into asynchronous operations but at that point algorithms start to become less elegant in my opinion.

    If we add multi-valued returns into the mix, the only real alternative in town appears to be Rx.

    Should we use Rx, cancellation is more elegant than with async and CancellationToken; we can just call Dispose() on the subscription.

    But how to deal with progress reporting?

    We can simply model progress as an event too. In a language that supports discriminated unions like F# this is trivial but it's of course also possible to model in C# etc.

    An Rx event stream can be used for single-value return functions as well (and a zero-returning function can be treated as an error case) but async doesn't appear so well suited for representing multi-valued returns because the returns come back in one chunk, unless you use something like IAsyncEnumerable.

    But where does async really excel over Rx (complementary as they may be)? Language support! Async/await is a language feature whereas Rx is not. That is, until someone came up with the idea of using F# workflow builders. This allows one to add custom semantics to existing syntactic constructs like for, while, use etc.

    I do wonder, though, what a language would look like if its designer had all these things in mind from the outset, because I still feel that asynchrony has not been fully embraced yet.

    Here's an alternative way to deal with asynchrony and progress reporting which plugs nicely into any GUI application (in this case a side-effecting (IO) operation where we are not really interested in values but progress reporting):

    process
    |> Rx.map
        (fun signal ->
            match signal with
            | ValueSignal x ->
                ()
            | ErrorSignal e ->
                ()
            | StatusSignal s ->
                processStatusLabel.Text <- s
            | ProgressSignal p ->
                processProgressBar.Value <- p
            signal)
    |> Rx.subscribe
    |> ignore
    

    In this model a "process" is a signal of signals, where we can pattern match over the aspects we are interested in.
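
    For readers following along in C#, the same "signal of signals" shape can be approximated with a closed class hierarchy. The following is only a rough sketch, not code from this thread: it assumes C# 9 or later (records and switch expressions), and the Signal<T> type, its case names and the Describe helper are hypothetical names chosen to mirror the F# cases above.

```csharp
using System;

// Hypothetical C# counterpart of the F# signal union (names are illustrative).
public abstract record Signal<T>
{
    // Private constructor: only the nested cases below can derive from Signal<T>.
    private Signal() { }

    public sealed record Value(T Item) : Signal<T>;
    public sealed record Error(Exception Cause) : Signal<T>;
    public sealed record Status(string Text) : Signal<T>;
    public sealed record Progress(int Percent) : Signal<T>;
}

public static class SignalDemo
{
    // Mirrors the F# match: react to the cases of interest, ignore the rest.
    public static string Describe<T>(Signal<T> signal) => signal switch
    {
        Signal<T>.Status s   => $"status: {s.Text}",
        Signal<T>.Progress p => $"progress: {p.Percent}%",
        Signal<T>.Error e    => $"error: {e.Cause.Message}",
        _                    => "value (ignored)",
    };
}
```

    In a GUI, the switch arms would update the status label and progress bar instead of returning text; an IObservable<Signal<T>> would then play the role of the "process" above.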

    The Rx approach incidentally is also great for hooking up logging and other listeners as a separate aspect.

    What do you think about this whole problem area, where .NET and WinRT are heading, and the state of language support today?

  • vesuvius

    You could always use the BackgroundWorker component. It is not a language-level feature, but it has yet to be equalled when it comes to ease of use and reporting progress.

    I realise that a background worker is cumbersome for something like WinRT. The reason async and await are so useful is that anywhere between 50% and 90% of the time you are making web service calls, or starting/suspending applications, where progress reporting is not required. If you do have a granular procedure that requires progress reporting, then .NET already has APM and EAP; TAP and your exposition are merely the new kids on the block.

  • exoteric

    There are many nice acronyms but few nice technologies.

    APM is pretty horrible to work with in my opinion. Rx has methods to encapsulate the pattern so that it becomes practical and usable. EAP is probably better but it doesn't look all that attractive or compositional either.

    There are several ugly ways to skin a cat - I'm just trying to find the cleanest one - to use a slightly disgusting metaphor.

    I find Rx quite superior but it could benefit from some kind of language support. F# provides a means to customize semantics of existing syntactic constructs and override the semantics. It doesn't quite feel like the optimal solution yet though (dealing with side-effects, and how to best use multiple interpretations of a for loop inside a workflow body, for example).

  • wkempf

    vesuvius wrote:

    You could always use the BackgroundWorker component. It is not a language-level feature, but it has yet to be equalled when it comes to ease of use and reporting progress.

    I realise that a background worker is cumbersome for something like WinRT. The reason async and await are so useful is that anywhere between 50% and 90% of the time you are making web service calls, or starting/suspending applications, where progress reporting is not required. If you do have a granular procedure that requires progress reporting, then .NET already has APM and EAP; TAP and your exposition are merely the new kids on the block.

    BackgroundWorker uses a background thread. Not all asynchronous operations require a background thread, and so BackgroundWorker is a pretty heavyweight solution. Moreover, the only thing BackgroundWorker provides over IProgress<T>/CancellationToken is designer support... which honestly I question the utility of. IOW, I'll have to disagree with your assertion that it "has yet to be equalled when it comes to ease of use and reporting progress." In any event, exoteric has already dismissed this with "but at that point algorithms start to become less elegant in my opinion." He didn't expand on that, so I'm not sure why he has this opinion, but BackgroundWorker isn't going to cut it for him.

    Me, I find Rx to become horribly complicated rather quickly. It's great for small things, especially in handling UI events, but it doesn't scale well. Part of that stems just from not having a lot of solid functional experience, but unless you're developing on your own that's something to keep in mind, because I have a lot more experience here than your average .NET developer. Also, I think part of it is from design problems with Rx. Given a choice between Rx and async/await with IProgress/CancellationToken I'll take the latter 9 times out of 10. I believe the resulting code is easier to understand and maintain, especially for your average .NET developers.

    I think async/await code composes fairly nicely, though there are some tricks one has to learn in order to ensure good results. Asynchronous coding is hard, plain and simple, and I don't think anyone's found a silver bullet for it yet, nor am I overly optimistic that anyone will. I like the mixture of solutions available to us in .NET, because the various solutions are each the best choice for specific scenarios. I'm also looking forward to the writeable/readable/immutable/isolated changes being researched, which is another key component in this whole discussion.

  • Richard.Hein

    exoteric wrote:

    There are many nice acronyms but few nice technologies.

    APM is pretty horrible to work with in my opinion. Rx has methods to encapsulate the pattern so that it becomes practical and usable. EAP is probably better but it doesn't look all that attractive or compositional either.

    There are several ugly ways to skin a cat - I'm just trying to find the cleanest one - to use a slightly disgusting metaphor.

    I find Rx quite superior but it could benefit from some kind of language support. F# provides a means to customize semantics of existing syntactic constructs and override the semantics. It doesn't quite feel like the optimal solution yet though (dealing with side-effects, and how to best use multiple interpretations of a for loop inside a workflow body, for example).

    Rx is more general than async/await and therefore has more applications. Rx doesn't force you to decide if something is async or sync in advance. The composition is the same regardless. Rx applies to single or multiple values, and therefore has more power than async/await, which does not deal with multiple values.

    Rx shares the same pattern across many languages.  Async/await does not.  You can use Rx from any .NET language.  You can use Rx in JavaScript and the semantics and everything you learn about Rx from .NET applies to it except the specialized schedulers which vary depending on the implementation.   Learning Rx means you will be able to leverage it in any language (eventually) and also deal with asynchrony in any language - even C/C++ are getting Rx.  F# and Rx work very well together and show how simple Rx could be with language support. 

    The complexity of Rx scales very well, because you can guarantee composability and you cannot do that with APM/TPM - you simply cannot because it's not obeying the fundamental rules of composition. Rx is built on mathematical rules of composition which can be expressed in many languages and there's no escaping having to understand these laws of composition for a programmer today.  You must understand these at some point to deal with complexity. 

    The limitations of C# are the problem with Rx, not Rx, in terms of complexity, as managing continuations seems to be a problem for large Rx compositions, as the number of arguments grows. The code is inside out, but this is a simple matter of experience to grok, and everyone is capable. There are also so many ways to write a complex composition in Rx that one can write horribly unreadable code in Rx which once refactored becomes a beautiful pearl. There are code samples that show how Rx on the server can communicate with the client in just an absolutely simple pattern of LINQ queries, everywhere.

    But what can I say, without code to back it all up? Later, after work, I'll try to post a few examples of crappy Rx code versus beautiful Rx code and then perhaps we can really see where the complexity lies, and why Rx has this real/perceived level of difficulty that is a barrier to entry.

  • vesuvius

    @wkempf: To be honest, a background worker is as useful and limiting as you have mentioned, but depends on what problems you have to fix.

    I sometimes neglect to recollect the diverse nature of programmers: what some people use 80% of the time, others use 0.08% of the time, and so they run through code samples and leave disgusted and confused as to why one would ever suggest X over Y.

  • vesuvius

    Richard.Hein wrote:

    *snip*

    The limitations of C# are the problem with Rx, not Rx, in terms of complexity, as managing continuations seems to be a problem for large Rx compositions, as the number of arguments grows. The code is inside out, but this is a simple matter of experience to grok, and everyone is capable.

    I wouldn't call it a limitation as much as "prior art". C# has for the best part of a decade been imperative. Resistance to Rx is solely based on its IoC, which is what a lot of people find hard about dealing with the IAsyncResult, Begin and End style programming when dealing with asynchrony. Once you start using ManualResetEvent and WaitHandle, people's heads start hurting, and I would interject that Rx semantically has similar issues with complexity.

    Rx has this real/perceived level of difficulty that is a barrier to entry because a lot of people plain don't like programming this way. I have seen some brilliant implementations of Rx in programs, and would use it after choosing wisely. The problem with a lot of developers is that they are more concerned with just using it for a problem (because it's new and cool), rather than really identifying whether it is the best tool for the job.

  • exoteric

    Richard.Hein wrote:

    *snip*

    Rx is more general than async/await and therefore has more applications. Rx doesn't force you to decide if something is async or sync in advance. The composition is the same regardless. Rx applies to single or multiple values, and therefore has more power than async/await, which does not deal with multiple values.

    Rx shares the same pattern across many languages.  Async/await does not.  You can use Rx from any .NET language.  You can use Rx in JavaScript and the semantics and everything you learn about Rx from .NET applies to it except the specialized schedulers which vary depending on the implementation.   Learning Rx means you will be able to leverage it in any language (eventually) and also deal with asynchrony in any language - even C/C++ are getting Rx.  F# and Rx work very well together and show how simple Rx could be with language support. 

    The complexity of Rx scales very well, because you can guarantee composability and you cannot do that with APM/TPM - you simply cannot because it's not obeying the fundamental rules of composition. Rx is built on mathematical rules of composition which can be expressed in many languages and there's no escaping having to understand these laws of composition for a programmer today.  You must understand these at some point to deal with complexity. 

    That's not 100% true though, is it? Side-effects can cross boundaries and wreak havoc. In queries involving well-tamed primitives it shouldn't be an issue though. IO is actually a very useful area to apply Rx - as you also imply below.

    The limitations of C# are the problem with Rx, not Rx, in terms of complexity, as managing continuations seems to be a problem for large Rx compositions, as the number of arguments grows. The code is inside out, but this is a simple matter of experience to grok, and everyone is capable. There are also so many ways to write a complex composition in Rx that one can write horribly unreadable code in Rx which once refactored becomes a beautiful pearl. There are code samples that show how Rx on the server can communicate with the client in just an absolutely simple pattern of LINQ queries, everywhere.

    But what can I say, without code to back it all up? Later, after work, I'll try to post a few examples of crappy Rx code versus beautiful Rx code and then perhaps we can really see where the complexity lies, and why Rx has this real/perceived level of difficulty that is a barrier to entry.

    That pretty much states my point (perhaps you are not replying to me). I don't find Rx complicated to use - it eliminates a lot of complexity - but still do find the syntactic expression of Rx semantics not as nice as it could be.

    As a purist (albeit a pragmatic one) I care about simplicity and cosmetics at every level but work with the popular tools as they are currently defined (and they have indeed improved a lot in recent times).

    F# provides a means to improve the syntactic expression of Rx semantics but it doesn't feel like we've arrived at the "final destination" in terms of elegance yet - it's "just" ahead of the curve.

    The pattern I show is just one more layer of unification; where additional semantics is added to model progress and status updates, where single and multi-valued asynchronous methods with cancellation were already modelled with Rx semantics.

    It would also be useful if one could constrain an IObservable to a single-valued collection, either by type or (perhaps) by Code Contracts.

  • evildictaitor

    The limitations of C# are the problem with Rx...

    I wish Microsoft would stop changing C#. .NET was always designed to be a family of languages that can interoperate between module boundaries. Frankly a lot of these new features are things that can already be trivially implemented via libraries in C# (like async, await could just be implemented as two functions on an IAsyncAwait interface).

    If Rx requires you to change from thinking imperatively to thinking functionally and there are complaints that the language gets in the way of, rather than enables the feature, why don't they just put it in F#?

  • Charles

    These are the kinds of threads in the Coffeehouse that make me smile :)

    Thank you. (Would love to see the ugly vs beautiful Rx code samples, Richard!).

    C

  • JoshRoss

    It seems to me that many of the problems that people have with Rx could be alleviated with FxCop-like static code analysis and/or visualizers like the Concurrency Visualizer.

    Then again, the last time I tried to use Rx to solve a seemingly simple problem, I ran through a gamut of emotions, from inspired to stupid to angry and finally to sad, within a two-day span.

    -Josh

  • exoteric

    JoshRoss wrote:

    It seems to me that many of the problems that people have with Rx could be alleviated with FxCop-like static code analysis and/or visualizers like the Concurrency Visualizer.

    Then again, the last time I tried to use Rx to solve a seemingly simple problem, I ran through a gamut of emotions, from inspired to stupid to angry and finally to sad, within a two-day span.

    -Josh

    Hidden Markov models are perhaps not the ideal topic for your very first Rx program ;)

    Brian's code looked very clean though but I didn't have the energy to experiment with it at the time (don't we all have extra hobby projects on the side). One wonders whether it was originally prototyped in Mathematica.

  • spivonious

    wkempf wrote:

    *snip*

    BackgroundWorker uses a background thread. Not all asynchronous operations require a background thread, and so BackgroundWorker is a pretty heavyweight solution.

    Wait, why is this bad? I would have thought background threads would be lighter than foreground threads.

  • evildictaitor

    spivonious wrote:

    *snip*

    Wait, why is this bad? I would have thought background threads would be lighter than foreground threads.

    Depends what you mean by lighter. Background threads have shorter timeslices and different semantics when the application closes, but they are allocated the same system memory and handle resources as foreground threads.

    In any case, the premise is false. Background workers use a ThreadPool, so constructing large numbers of background workers to run for short periods of time isn't a bad thing since the cost of constructing the Thread (which is huge) is amortized over all of the BackgroundWorkers.

    The number of actual threads assigned to the thread pool is equal to the maximum number of background workers you had running at the same time - this means if you have N background worker threads, all of them are scheduled to run.

    This is in contrast to Task<T>s, which often lie dormant until space becomes available on one of the task schedulers and run synchronously if they are awaited before being scheduled on a background core. This means that Task<T> is really useful for speeding up short partial computations, whereas BackgroundWorker is really good for offloading medium-length work to a background thread.

    Thread is useful for activities that are very long lasting, where you don't really care that it might take seconds to start up the thread.

    So for example, if you were designing a web-server, you might want your main TcpListener to live in a Thread, each web-request to be offloaded to a BackgroundWorker, and you might want internal parts of your LINQ queries on large datasets as part of that page to be handled by Task<T> and Future<T> objects.

  • wkempf

    What's really missing in this discussion are asynchronous operations from I/O. These may occur within the "UI thread" event queue (keyboard and mouse input) or via I/O completion ports. These types of asynchronous operations are what I was mostly referring to. It's a lot "lighter" to use one of the new Task based asynchronous I/O operations to, for instance, download data from a web service than it is to use a BackgroundWorker and synchronous I/O operations. It's unfortunate that WebClient.DownloadStringTaskAsync and friends don't have overloads for handling progress and cancellation (can't imagine why this was left out, actually), but it is possible to roll your own extension methods for this. In fact, there's a Code Project article that does this with the CTP. The result uses I/O completion ports rather than background threads to achieve the asynchronous I/O operation and includes progress (and you could easily add cancellation) support. http://www.codeproject.com/Articles/129447/Progress-Reporting-in-C-5-Async
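
    As a rough illustration of the "roll your own extension methods" idea, here is a minimal sketch of a stream copy that accepts an IProgress<long> and a CancellationToken. It is not the Code Project article's code: CopyWithProgressAsync and its parameters are hypothetical names, and whether the underlying I/O actually goes through completion ports depends on the streams passed in.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StreamProgressExtensions
{
    // Hypothetical helper (not part of the BCL): copies source to destination,
    // reporting the running byte count and honouring cancellation between chunks.
    public static async Task<long> CopyWithProgressAsync(
        this Stream source,
        Stream destination,
        IProgress<long> progress = null,
        CancellationToken cancellationToken = default,
        int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // ReadAsync returns 0 at end of stream; no thread is blocked while awaiting.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read, cancellationToken);
            total += read;
            progress?.Report(total);                        // progress is optional
            cancellationToken.ThrowIfCancellationRequested();
        }

        return total;
    }
}
```

    A caller would typically pass a Progress<long> created on the UI thread, so that reports are marshalled back to it, together with a token from a CancellationTokenSource wired to a Cancel button.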

    Oh, and I'm fully aware that BackgroundWorker operates using the thread pool. This still means it's using a background thread, it's just a pooled thread. This removes the overhead of having to create the thread every time, but it doesn't remove the overhead involved with context switching and scheduling of the thread. So the "premise" isn't "false". The remarks on the number of threads used by the thread pool are also way off the mark as the scheduling is far more complicated than just "the number of background workers you have running at the same time." Further, there's very little difference in how Task<T> and BackgroundWorker schedule work on the thread pool (at least in the typical scenario, since scheduling is something configurable for Task<T>). BackgroundWorker is no more or less suited for small or medium size work than Task<T> is.

    The recommendation for when to use Thread is fairly accurate, but the example with TcpListener isn't, really. Rather than use a dedicated thread there, you'd be better off with I/O completion ports again, using one of the async operations. There's little use for a BackgroundWorker in a server scenario, so for everything else you could rely on either more I/O completion ports and/or Tasks. Also, just FYI, there is no Future<T>. Future<T> became Task<T> in beta 1 of .NET 4.0.

  • Richard.Hein

    I have a bunch of Rx tests and samples, but they are on a BitLocker-encrypted drive and I can't find the key since I put in a new SSD and upgraded to Win8 ... :(

    I have some Kinect samples I will post later when I get the chance as well, but I don't have my Kinect here, so I can't test it.  In the meantime I will start with one example where I feel like there is boilerplate C# for no reason, and it's completely missing out on the FromEventPattern operator:

    class FileWatcher
    {
        public class FileChangedEvent{}
    
        public FileWatcher(string path, string filter, TimeSpan throttle)
        {
            Path = path;
            Filter = filter;
            Throttle = throttle;
        }
    
        public string Path { get; private set; }
        public string Filter { get; private set; }
        public TimeSpan Throttle { get; private set; }
    
        public IObservable<FileChangedEvent> GetObservable()
        {
            return Observable.Create<FileChangedEvent>(observer => {
    
                    FileSystemWatcher fileSystemWatcher = new FileSystemWatcher(Path, Filter) { EnableRaisingEvents = true };
                    FileSystemEventHandler created = (_, __) => observer.OnNext(new FileChangedEvent());
                    FileSystemEventHandler changed = (_, __) => observer.OnNext(new FileChangedEvent());
                    RenamedEventHandler renamed = (_, __) => observer.OnNext(new FileChangedEvent());
                    FileSystemEventHandler deleted = (_, __) => observer.OnNext(new FileChangedEvent());
                    ErrorEventHandler error = (_, errorArg) => observer.OnError(errorArg.GetException());
                    
                    fileSystemWatcher.Created += created;
                    fileSystemWatcher.Changed += changed;
                    fileSystemWatcher.Renamed += renamed;
                    fileSystemWatcher.Deleted += deleted;
    
                    fileSystemWatcher.Error += error;
    
                    return () => {
                            fileSystemWatcher.Created -= created;
                            fileSystemWatcher.Changed -= changed;
                            fileSystemWatcher.Renamed -= renamed;
                            fileSystemWatcher.Deleted -= deleted;
                            fileSystemWatcher.Error -= error;
                            fileSystemWatcher.Dispose();
                        };
                }).Throttle(Throttle);
        }
    }

    This could be much simpler, something like this:

    var observable =
            Observable.Return(new FileSystemWatcher(@"D:\Documents"))
                .Do(watcher => watcher.EnableRaisingEvents = true)
                .Do(watcher => watcher.IncludeSubdirectories = true)
            .SelectMany(watcher => 
                Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>(
                    h => watcher.Created += h, h => watcher.Created -= h)
                .Select(e => new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name })
                .Merge(
                Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>(
                    h => watcher.Deleted += h, h => watcher.Deleted -= h)
                .Select(e => new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name }))
                .Merge(
                Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>(
                    h => watcher.Changed += h, h => watcher.Changed -= h)
                .Select(e => new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name }))
                .Merge(
                Observable.FromEventPattern<RenamedEventHandler, RenamedEventArgs>(
                    h => watcher.Renamed += h, h => watcher.Renamed -= h)
                .Select(e => new { e.EventArgs.ChangeType, e.EventArgs.FullPath, 
                                Name = e.EventArgs.OldName + " renamed to " + e.EventArgs.Name }))
            );
     

    One of the techniques I use to really get Rx is to try to avoid semi-colons as much as possible. It might not be the best way to write code in the end, but during development it makes you think and ensures that there are no unnecessary variables and potential side-effects. So, in my samples, you'll see a lot of Do's. Do's from within the monad make it obvious there is some side-effect.

    There isn't a need for Observable.Create in the first snippet, which adds unnecessary complexity, but I can see why the user might want an IObservable<FileChangedEvent>. Even if I were to return an IObservable<FileChangedEvent>, I would be sure to use Observable.FromEventPattern.

    So, which is easier to understand at a glance? This may not be the best example, because I would normally want to eliminate duplicate calls to Observable.FromEventPattern, but that's not easy to do because you can't pass a watcher.Deleted event around. However, the code, in my opinion, is easier to understand because you can start at the top, read down to the bottom and there are no surprises as to what it will do, even though things like the order of events are completely irrelevant.

    More/better examples with the Kinect later....

  • exoteric

    @Richard.Hein: There are pros and cons to both styles.

    The examples are a little unbalanced however.

    The first example is unnecessarily verbose:

    • uses explicit variable typing instead of type-inference
    • uses pointless class instead of fully parameterized extension method

    The second example is too simple:

    • no use of using
    • no use of exception handling
    • no use of throttling

    I realize this was just to show a point about a fluent style of writing Rx of course.

    I prefer your style as it has a certain simplicity about it and it appears completely lazy.

    Since we do in fact have some level of syntactic support for Rx, namely LINQ, we could sugar up your example a little bit:

    var observable =
        from _ in Observable.Return(0)
        let watcher =
            new FileSystemWatcher(@"D:\Documents")
            {
                EnableRaisingEvents = true,
                IncludeSubdirectories = true,
            }
        let created =
            from e in Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>
                            (h => watcher.Created += h, h => watcher.Created -= h)
            select new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name }
        let deleted =
            from e in Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>
                            (h => watcher.Deleted += h, h => watcher.Deleted -= h)
            select new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name }
        let changed =
            from e in Observable.FromEventPattern<FileSystemEventHandler, FileSystemEventArgs>
                            (h => watcher.Changed += h, h => watcher.Changed -= h)
            select new { e.EventArgs.ChangeType, e.EventArgs.FullPath, e.EventArgs.Name }
        let renamed =
            from e in Observable.FromEventPattern<RenamedEventHandler, RenamedEventArgs>
                            (h => watcher.Renamed += h, h => watcher.Renamed -= h)
            select new
            {
                e.EventArgs.ChangeType,
                e.EventArgs.FullPath,
                Name = e.EventArgs.OldName + " renamed to " + e.EventArgs.Name
            }
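        // Flatten with a second 'from' so the query yields change events
        // rather than a single nested observable.
        from change in Observable.Merge
        (
            created, deleted, changed, renamed
        )
        select change;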
    

    This is more verbose and unfortunately less homogeneous but I suspect more readable.
