ARCast.net - ARCast Rapid Response with Michele Leroux Bustamante

Announcer: It's Monday, July 9, 2007, and you're listening to ARCast.
Ron Jacobs: Hey, welcome back to ARCast, friends. This is your host, Ron Jacobs. Today we are going to do something a little different, something I started kind of recently, as I was browsing around on the MSDN Architecture Forums. I noticed that a lot of times, people are asking questions and they're not getting very good answers; or maybe they are getting good answers, but it's just interesting stuff. And I thought, wouldn't it be fun to get some of our architect MVPs or other MVPs on an audio recording, just talking about the question that someone is asking.


So we're calling this an ARCast Rapid Response. And originally, I said, let's try to keep it three to five minutes, but then sometimes we just go overboard. I had recently recorded a couple of these with Michele Leroux Bustamante, who's just a very, very smart WCF person, who works with the IDesign Consulting firm. So we have a couple of those for you today.


[siren]
Announcer: Your post on the MSDN Architecture Forum has been selected for an ARCast Rapid Response.


[siren]
Ron: Welcome to an ARCast Rapid Response. This is Ron Jacobs, and today, I'm joined by Michele Leroux Bustamante. Welcome, Michele.
Michele Leroux Bustamante: Hi there, Ron, nice to hear you.
Ron: Yeah. And so Michele, I'm looking up this message on the MSDN Architecture General Forum, and this guy's asking about Authorization Manager and WCF. His question basically boils down to this: he has a Smart Client Application, he's calling a web service so he can retrieve the security permissions that person has, and he's going to cache them on the client side, and decide what they can do. Maybe he wants to show a menu item or not show it, or gray it out because they don't have permissions to do it.


So they're defining the security permissions on the server, you call the web service, you cache them and you apply them. It sounds like an interesting issue, kind of a thorny one too. I'm curious what you think about this?
Michele: Well, I think it's actually a pretty common problem nowadays because a lot of folks are using services in terms of a distributed application, client applications that can work offline and then reconnect. Usually the authentication is not on the client machine. So essentially, you need to log in to the service. And if you want to do role-based limitation of UI on the client, that implies that you need to ask the service, "What am I allowed to do over there?" and that can change.


So if they're adjusting whether or not you're in the administrators group or some other group while you're in the middle of using the application, in theory, there should also be notifications to the client application, if you're online, to let you know that the restrictions on your account have changed.


So I think the first problem to address really is just what is the model for making a request to a service to authenticate and grab authorization information. And there's probably nothing really baked, today, for this, in terms of XML web service standards, with the exception that we could certainly leverage security tokens that exist today, like SAML tokens that carry claims. Those claims would then have information including possibly your roles and other things you are allowed to do.
Ron: And also, a SAML token like that would have expiration to it as well. So it would give you claims for a certain period of time. The thing I worry about is what if we hire the guy, he's doing some work and then we realize he's a thief, so we fire him and revoke his claims. But he's offline and still able to mess around with stuff and the system hasn't updated his claims yet. You have to think about that, right?
Michele: Right. Well keep in mind though that if they are offline, they might be able to do some work to the offline data store and see certain UI for a period of time, but when synchronization takes place of whatever they did offline, the new roles will kick in. And so that particular user attempting to synchronize is going to be restricted by their new account.
Ron: OK.
Michele: So I think that would be one way to address that; you weren't able to do that anymore. And the chances of that happening are probably slim, but we know that it will be addressed; I think that's the point. So it might not be the most elegant solution for somebody to do a bunch of things and then be told, "No, you can't do that anymore". So whatever you entered, you're not allowed to update that record anymore. But if that happened, it's probably for a good reason, so I suppose that's what we deal with, right? Sorry, did you say something?
Ron: Yeah. But you would have to take into account that maybe you give him a token and it expires tomorrow at noon. So tomorrow at noon, the guy's running the application, the application would have to sense, "Hey that token's about to expire, I better call in and refresh the token, get a new one so that we can keep working, right?"
Michele: Right. So if the token expires, then that would definitely imply they have to get online at that point. But if something were to change between the time the token was issued and the time it expired, that would be what we're talking about. In theory, if it's a whole day before the expiry, and their account was revoked in the meantime because of some issue, like they were let go from the company, then of course, what you want is the end result, which is they can't even update with that live token anymore. Even though the token's still valid, they're no longer valid at the server, because that will be checked as well: there would have been some sort of invalidation of the token.


So when you issue a security token from an official security token service, which would mean from a WS-Trust standpoint, there are functions to issue the token, functions to validate the token, and functions to cancel the token. So on the server side, if you update a user's account, you're going to cancel any tokens that are currently issued to that particular account, so that would invalidate any future use.
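The issue, validate, and cancel operations described here can be sketched in a few lines. This is a toy in-memory stand-in, not WS-Trust or WCF code: the names `Token`, `TokenService`, and `update_account` are assumptions made up for illustration, and a real security token service would sign and usually encrypt what it issues.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class Token:
    token_id: str
    account: str
    roles: tuple
    expires_at: float  # seconds since the epoch


class TokenService:
    """Hypothetical in-memory stand-in for a security token service."""

    def __init__(self):
        self._issued = {}  # token_id -> Token

    def issue(self, account, roles, lifetime_seconds, now=None):
        now = time.time() if now is None else now
        token = Token(secrets.token_hex(8), account, tuple(roles),
                      now + lifetime_seconds)
        self._issued[token.token_id] = token
        return token

    def validate(self, token, now=None):
        now = time.time() if now is None else now
        # A cancelled token is no longer in the issued table, so it fails
        # validation even if its expiry time has not been reached.
        return token.token_id in self._issued and now < token.expires_at

    def cancel(self, token):
        self._issued.pop(token.token_id, None)

    def update_account(self, account):
        """Cancel every outstanding token for an account that was changed."""
        for token in [t for t in self._issued.values() if t.account == account]:
            self.cancel(token)
```

With this shape, a token that is still inside its lifetime stops validating as soon as the account it was issued to is updated, which is the behavior Michele describes for the revoked user.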
Ron: Here's another thorny issue with this. Let's imagine that I'm working on the application, and my token is valid, but then, it expires. Now I try to do some more work after my token has expired, but because I'm offline, the application doesn't know my token's expired or is not valid anymore. Maybe it lets me keep working because it's just going to give me the benefit of the doubt.


Then I've got a bunch of work, but the next time that I can connect, some of it happened when my token was valid, and some of it happened after it expired or maybe it wasn't renewed. I guess you'd have this problem that now none of that work will be accepted but it might be something you have to think about; deciding what to do, if some of the work was valid, some of it was not. I don't even know if you want to try and mess with that.
Michele: I would think not. I think at some point you have to say, "Good enough is good enough," right? I...you have to, you know, build the application in a way that it's manageable and maintainable and, you know, ensure that from a development perspective you're not, you know, throwing in so much custom code that it's error prone.


So you have to really decide, is it really that important to me? I mean, if something happened to the validity of this user during the time they had a token, and it either expired or was invalidated because of a change to their account, at the end of the day, whatever you did just doesn't count anymore.


And that doesn't mean you have to lose the data, by the way. It's possible you could build into the back end that the synchronization process still receives what they did and maybe decides what to do with that offline, in some sort of administration process, in order to resolve any issues. And it depends on what the application does, whether that really matters.


For example, if it's a CRM application and it was a salesman on the road, you want to synchronize their computer, you want to grab whatever they did while they were on the road; but they no longer work for the company. So maybe you don't allow updates to happen, in case there was something malicious done or something like that. But you grab the data and you take a look at it more closely; and again, how often would you do this? Probably not too often.
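A minimal sketch of that "keep the data, but don't apply it" idea follows. Everything in it is an assumption for illustration: `synchronize`, `apply_change`, and `review_queue` are hypothetical names, and the decision about whether the token is still valid is assumed to come from the server at the time of synchronization.

```python
def synchronize(changes, token_is_valid, apply_change, review_queue):
    """Apply offline changes, or park them for an administrator to inspect.

    changes        -- list of changes recorded while the client was offline
    token_is_valid -- the server's verdict on the caller's token at sync time
    apply_change   -- callable that commits one change to the live data
    review_queue   -- list that collects changes held back for manual review
    Returns the number of changes that were applied.
    """
    if token_is_valid:
        for change in changes:
            apply_change(change)
        return len(changes)
    # The token expired or was cancelled: nothing is committed, but nothing
    # is thrown away either.
    review_queue.extend(changes)
    return 0
```

This deliberately treats the batch as all-or-nothing rather than trying to split it by when each change was made, in line with the "good enough is good enough" position above.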


I think the bigger issue is actually in the model for the token issuance itself. Because, although token issuance is something that's built in to some things that we're used to today, like for example in a federated environment, I can have a WCF service that forces client applications to go get a token from a security token service.


And that token is sent back to the client application, right, and we can make it so that token contains role information, so it's there at the client and it's available. But that token is also the same thing that's used to make future calls in a session-like environment to the service, right, or a set of services, let's say.


I think that the harder part is, how do I pull that token out of the proxy and work with it and handle the, you know, regeneration of a token when it expires? Because that's not really elegant today; WCF will hold the token in the proxy, but when it expires, you get a session expired exception, right? So...
Ron: OK.
Michele: ...it's not built in that you're just going to go in and all of a sudden, "I get a new token, voilà!" I actually need to reauthenticate with my security token service to get my updated token.
Ron: So, when you get that session expired exception, you could try to reauthenticate, but you might be offline at the moment and you can't authenticate.
Michele: In which case, your token has exp...and that's OK actually because your token has expired, but the worst part is that the proxy can't be used, right? So let's go through a use case. I open the app for the first time, I log in as whoever I am, I try to go to the service or do my authentication call to the service, if that's the case; that redirects me to a security token service, that issues me a token that has roles in it, which is great.


Now we don't really see that token, it's opaque when it arrives to the client. So that token is actually, you know, some...in terms of getting at the contents of the token, that can be tricky if it's been encrypted by the security token service for the client...for the service that we're calling.


That's one issue to deal with, is can we decrypt that token or not. And that depends on an exchange of keys and so forth. And it depends on whether the security service encrypts the token and by default that usually is the case although you can come up with your own custom models. So where I'm going with that is, the first question is, can I get at the token in the proxy, right?


Now assuming I can, then I don't care if the proxy, you know, suddenly can't call the service anymore because the session expired; what I care about is that I've got the token cached somewhere else and I'm able to look at the contents and I'm able to do my role-based stuff at the client machine. So I'd have to write my own custom code to care about the token expiring, because at this point, what I've done is I've cached. OK, I've pulled out the roles, great, I know your user, I know your roles.


So there's nothing built in that will say, "Hey, these roles might not be valid anymore," right? The token has expired, but that session expiry only has validity when you call the service that is relying on the token.
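The custom code being described, a client-side cache that tracks the expiry itself, might look something like the sketch below. It assumes the roles and expiry time have already been read out of the token, which is the hard part discussed above and is not shown; `CachedRoles` and the five-minute refresh margin are hypothetical.

```python
import time


class CachedRoles:
    """Roles pulled out of a token, kept together with the token's expiry."""

    def __init__(self, user, roles, expires_at):
        self.user = user
        self._roles = frozenset(roles)
        self.expires_at = expires_at  # seconds since the epoch

    def is_expired(self, now=None):
        now = time.time() if now is None else now
        return now >= self.expires_at

    def needs_refresh(self, now=None, margin_seconds=300):
        """True once the expiry is close enough that a new token is worth fetching."""
        now = time.time() if now is None else now
        return now >= self.expires_at - margin_seconds

    def is_allowed(self, role, now=None):
        """Expired roles are treated as no roles at all."""
        return not self.is_expired(now) and role in self._roles
```

The UI would consult `is_allowed` before enabling a menu item, and the application would call `needs_refresh` periodically so it can go back to the security token service while it is still online.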


So there's really...this is a different model for using the token that's extremely useful but just doesn't have any code built in, if you will.
Ron: So would you use...on the back end, when you create the security token service, would you create your roles and assignments and all that with Authorization Manager, and the security token service would issue the tokens based on that information?
Michele: Yeah, so a security token service can be backed by AD or AzMan, or it can be a custom database, like SQL Server, that has roles built in. And by the way, we don't have to look at roles, we can look at claims or permissions at a more granular level too, which is actually more useful.


I wrote a couple of articles about that, that sort of, you know, describe the whole mental switch to claims-based security. And I think that that's one way you might want to look at this if you're going to do it.
Ron: OK, and now you wrote a book on WCF. Anything in your book that relates to this?
Michele: Actually, I do go through...I have a lab in the security chapter on claims-based security. So security is probably the longest chapter of the whole book, there's so much to it. And I tell you, I could have probably writt...there were so many details, but yeah, I do have a section that talks about claims-based security and talks about security token issuance and then I've added onto that so that would be a foundation to read and then my articles are sort of going even deeper into it.


So yeah. It's definitely one of my favorite subjects, there's a lot of really interesting things that you can do. It's just that some of the things are built in and some of the things you have to work a little harder for, and the unfortunate side of that is that means you have to understand a little bit about, you know, SAML and/or the keys that are used to sign and encrypt tokens and how that works. So that's a bit nasty for most people today.
Ron: And the name of the book?
Michele: Oh, "Learning WCF."
Ron: All right. Well, thanks so much, Michele, for being with me today.
Michele: Sure, thanks Ron.


[siren]
Announcer: This has been an ARCast Rapid Response brought to you by ARCast TV and Architect MVPs Worldwide.


[siren]
Ron: Hey, that was pretty fun. So I decided why not today, since it was so long, let's put a second one together. We'll just make an ARCast radio episode out of these two. So here's another ARCast Rapid Response.


[siren]
Announcer: Your post on the MSDN Architecture Forum has been selected for an ARCast Rapid Response.


[siren]
Ron: This is Ron Jacobs, and it's time for an ARCast Rapid Response. And once again, I'm joined by Michele Leroux Bustamante. Welcome, Michele.
Michele: Hello, Ron.
Ron: So I'm looking at this post on the MSDN Architecture General Forum that says "Monitoring .NET Application." And he's asking about the best practices for monitoring .NET applications. There is a "patterns and practices" white paper or book on operating .NET Framework applications, but it's a little bit old; it's .NET 1.0 and Windows 2000. So he's wondering if there's new or better information.


So just in general, I'm going to build an application and let's say it's a web application with some web services. I want to know if the thing is healthy, and I want to tell my operations people, here's how we know that the thing is healthy. What kind of advice do you have for people on that?
Michele: This is a several-tiered approach, I think that's why this is such a complicated subject for people because it crosses from the developer to IT. From a development perspective, what we really care about is whether or not there is a configuration setting, for example, for ASP.NET, WCF for your web services, or other services that can turn on the built-in performance counters that are written by the application.


So by default, for example, ASP.NET has counters for the number of page requests, errors and exceptions, rejected requests, and other related information about the health of the applications that will help you do things like monitor how long it takes to execute a page request from the moment that the request is received by the web server, to the moment the last byte goes out. Also, the size of request, so that you can monitor average size per page load, how many requests per second. That's a really important statistic that you can pick up based on service level agreements and what you care about in terms of throughput.


How many requests per second can I handle in this particular worker process or even machine? That helps me know how to scale out and how many other servers I might need to do some capacity planning. With WCF, for example, we can turn on WMI and performance counters, which will then track those performance counters related to services, and that can be service level, as a whole service, looking at that view and what are the requests like, and the frequency of requests, and size of requests, and so forth, and health.


If requests are being rejected, we want to know about that. We want a trigger that will hook when requests are being rejected because that means that all of the worker threads are being used and we are not processing requests fast enough to get new requests in. And so they time out after a minute or whatnot. So these are things that we want to make sure we capture. And those are built in; we just have to turn them on with a configuration setting in the diagnostics section.
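The "trigger when requests are being rejected" idea can be sketched independently of any particular monitoring product. The sketch below assumes something else reads the cumulative rejected-request counter and passes the value in; `RejectedRequestWatch` and its callback are hypothetical names, and no real performance counter API is used.

```python
class RejectedRequestWatch:
    """Fires a callback whenever a cumulative rejected-request count grows."""

    def __init__(self, on_rejections):
        self._last_total = None          # no sample seen yet
        self._on_rejections = on_rejections

    def sample(self, rejected_total):
        """Feed the latest cumulative count; returns True if the trigger fired."""
        previous = self._last_total
        self._last_total = rejected_total
        if previous is not None and rejected_total > previous:
            # Report only the new rejections since the previous sample.
            self._on_rejections(rejected_total - previous)
            return True
        return False
```

The first sample only establishes a baseline, so a service that already had rejections before monitoring started does not raise an alert until the count moves again.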
Ron: I was going to say that I did a very interesting ARCast episode, it hasn't aired yet, but it will be out soon, where we were talking about this. It was Nigel Watson who was on with me, and we were talking about vital signs. If I'm not feeling well and I go to a doctor they take my blood pressure, they check my pulse. Those are like performance counters, right? They say something about the state of my health at any given moment.


Now it's one thing to know my pulse is this, but it's another thing to know if that's good or bad. So a doctor has the knowledge to know how to interpret these numbers. So I think that's a really important issue. You can tell your operations people, "Hey watch for these numbers." But they say, "Well I see a number, but I don't know if it's good or bad, healthy or unhealthy, or if I should do something or not do something." So you have to somehow also give them the knowledge of what they should be looking for, right?
Michele: Exactly, so that's where we limit. There are so many different performance counters you can look at. You usually want to limit it to a core few that you're collecting metrics on, because again, collecting the metrics carries overhead. I guess where I was starting with is that the first step is turning them on.


But once the developer has arranged for their part in that, then we really move to the IT person in setting up MOM, for example, or System Center Operations Manager, to listen to certain counters. And they should know which ones matter, and that sometimes means getting together with development to say, "Which ones do you think I should care about? Let's evaluate this. What do I really need to know?"


I really find what helps is to think about it from a service level agreement perspective because you're going to report that to your customers. What do you need to tell your customers about your health? You need to be able to say that you can process this many requests per second, and that all requests will process in under two seconds, at the most, and maybe on rare occasions, it might be up to 10 seconds. But we will guarantee we will never go over 10 seconds for a request. You have to come up with what those metrics are, based on what your clients need. And then you look at the counters and say these are the ones we care about because that will tell us if we're OK.
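To make the SLA framing concrete, here is a minimal sketch that checks a batch of measured request durations against the kind of limits mentioned above: a throughput floor, a two-second typical limit, and a ten-second hard ceiling. The function name, the parameters, and the 1% allowance for "rare occasions" are assumptions for illustration, not figures from the conversation.

```python
def check_sla(durations_seconds, window_seconds, min_requests_per_second,
              typical_limit=2.0, hard_limit=10.0, max_slow_fraction=0.01):
    """Compare measured request durations against illustrative SLA limits.

    durations_seconds -- one entry per request completed in the window
    window_seconds    -- length of the measurement window
    """
    count = len(durations_seconds)
    throughput = count / window_seconds
    # Requests slower than the typical limit are the "rare occasions".
    slow = sum(1 for d in durations_seconds if d > typical_limit)
    worst = max(durations_seconds, default=0.0)
    return {
        "throughput_ok": throughput >= min_requests_per_second,
        "typical_ok": count == 0 or slow / count <= max_slow_fraction,
        "hard_limit_ok": worst <= hard_limit,
    }
```

Each key in the result corresponds to one promise in the agreement, which is also a hint about which counters are worth collecting in the first place.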
Ron: Yeah. I like the idea. So when you're working out your SLA, that's a good place to look at what counters define the limits you're looking at. Good answer. Thanks so much, Michele, for being with me on this ARCast Rapid Response.


[siren]
Announcer: This has been an ARCast Rapid Response brought to you by ARCast TV and Architect MVPs Worldwide.


[siren]


[music]
Ron: Hey, is it just me or has the traffic on the MSDN Architecture Forum actually picked up since I started doing this? I don't know. I don't want to take credit. It may not be me. But I think it's great when we as a community have a dialogue together, helping each other, because there are so many things to know. Nobody can know it all. I don't know it all, certainly I don't.


If you listen to ARCast, though, you probably know more than your average architect out there, which is what's great about this medium, and it's a way where we can have a really rich dialogue about some interesting questions and things that people are facing.


And as we've started a new fiscal year for ARCast and ARCast TV, I'm looking forward to many, many more episodes. I'd love to hear from you. If you've never written me a note, write an email today to arcast@microsoft.com. Tell me what you think, what you like, what you don't like. This show is about you. Let's make it better.