Announcer:

It's Monday, April 23, 2007, and you're watching ARCast.TV.


[intro music in background]

Ron Jacobs:

Hey, this is Ron Jacobs, and I'm here on the outskirts of Sydney at a place called La Perouse. It was named after the first Frenchman to come to Australia, Jean-François de La Pérouse, in 1788. Now, he wasn't the first European to discover Australia. Captain Cook and his men landed just across the harbor here at Botany Bay some years before.


But La Perouse was on a mission of discovery throughout the Pacific, as they were trying to chart potential sites where France could extend its power. Unfortunately, just one year later, the French Revolution would mean that France's dream of empire in the South Pacific would be cut short. That might be fortunate from some people's perspective, by the way--but we're here for a different purpose, and that's to learn about architecture.


And we've been thinking about user experience here in Sydney, and how there are lots of different aspects of user experience. And, of course, one of them is performance. For this purpose, we contacted Bill Karagounis, who runs a performance consulting company here in Sydney. And Bill is going to have some great insight for us about what it takes to build a great user experience in terms of performance.


[end intro]

Ron:

Well, I am Ron Jacobs, here in Sydney with Bill Karagounis, and, Bill, we've been talking about user experience here in Sydney, and you've been thinking about user experience through your performance work. Tell us who you are and what you do.

Bill Karagounis:

Sure. I'm Bill Karagounis. I have a company called Webperf, which essentially does consulting around Web performance. We basically go and tune and optimize large-scale systems. We mostly tend to work in the financial services industry--a lot of my clients are banks--but we also tend to work in the government sector as well.


So, yeah. It's all about, "How do I run something and make it really responsive when there are four or five thousand people using it concurrently?" Along those sorts of lines.

Ron:

Now, when we start talking about user experience, often people think about very clever designs, and nice colors, and beautiful pages, but really, I think, one of the elements of poor user experience is when you have a site which has bad performance. Have you seen this kind of thing?

Bill:

Oh, yeah, sure, absolutely. The scary thing for us is, sometimes we're brought in when somebody's actually pushed the panic button, and said, "It's not working, we're about to release this system, and it's falling down. It's clogging up. For some reason, there's something wrong with it. It looks great when there are two people using it, but as soon as we throw some load at it, we have a lot of problems."


So, then, we go into the mode of, "How do we fix this up from the back end?" We have to work all the way up, and figure out, "Where are all the little problems?" Ideally, it's the kind of thing, from an architectural standpoint, you look at before you've even started building the crux of this system. As you're doing your prototyping, as you're building your user interface, you also want to be thinking about, "Here are all these pieces of software, and here's how this architecture's going to be put together--what are the bits that are likely to break?"


It's kind of like when you're building a building--you're going to look at the ground you're going to build it on, and it's like, "Is there bedrock there, or am I building on quicksand?" And, ideally, that's when you want to start having that discussion. That's when I prefer to be involved in a project.

Ron:

Yeah, and in fact, we're sitting here on these rocks here at La Perouse...

Bill:

Sure! This is a great foundation!

Ron:

[laughing] Yeah, that's right! This feels very solid. More solid than I'd like right now, but-- [laughs] This is actually a great place to think about the beginnings of a project, because right across the water here is the place where Captain Cook first landed, discovering Australia--at least from his point of view.


So, at the very beginning, there are some things that architects need to think about. What do you think those are?

Bill:

Well, it's a case of, first of all, "Have I used the right kinds of components for the amount of people that are going to be using this system?" I think, from a fundamental standpoint, it's "How many people do I expect? How many users do I expect will register on my site or register in my system?"


And then, once you understand that, "What level of concurrency am I going to have? Am I going to have three or four people using this thing concurrently, or am I going to have a few hundred, or am I going to have a few thousand?"


If the answer to that question is "I'm going to have a few thousand," you really need to start thinking, "OK, I need to essentially build the wind tunnel model of my app." When you think about it, car manufacturers and building designers actually build prototypes, and they throw them in wind tunnels, and they figure out, "How's this thing going to work?"


From my standpoint, especially in these large financial systems, we will tend to go away and we'll build a skeleton to make sure that all of the decisions we've made from a software architecture standpoint are going to be able to handle what the system is going to need them to do in production.


If it's not possible to absolutely do that, we will come up with a model that lets us figure out whether we feel confident about it from a scalability standpoint. So things like statelessness, and how much data I'm pushing across every time I make a call into the system, start to become important.


So, that's the approach we tend to take, and we work from there, and we optimize as we go on.

Ron:

I like the picture of a wind tunnel. In fact, the wind's blowing quite nicely out here today. [laughs]

Bill:

It's a great little wind tunnel we've got going on here.

Ron:

You know, it's great, because when you think about that, people will build their model of their skyscraper or something, they'll put it in the wind tunnel, they'll even run a little smoke through the wind tunnel to see how the smoke flows around the building. They might eventually increase the speed of the wind to the point where they can learn when the building falls over--how far it would have to go. Because they're making a very serious estimate of how strong the building needs to be.

Bill:

Right.

Ron:

Do you think that a lot of projects, though, never bother to go through that work?

Bill:

Yes. I think that's one of the biggest issues. For me and my company, people ring us up, and they're really close to the end, and they realize--they've had this "Oh, my God" moment, where it's not right, there's something wrong--and, unfortunately, the fixing of that is not a trivial exercise. Quite often the application designers have to take a lot of pieces out of their system and rework a whole bunch of stuff.


And you know what? It's not the kind of thing you want to go back to the board and say, "Yeah, you know what? We need another three million bucks," and it's going to be another six months before this thing's done.


So, you really want to be thinking about that kind of stuff up front. If the answers to your questions around capacity and performance are going to be, "Well, there's only two or three users concurrently, and I only have a transaction rate of two or three a second at peak," maybe I'm not going to be as concerned about it. But you definitely have to ask yourself those questions.


I think the other thing to think about, too, from a user interface design standpoint is, quite often these large systems are reliant on other systems, where they will send a message, and at some point that other system will come back. It might not come back straightaway.


Now, from a user experience standpoint, this is generally going to leave the user hanging around waiting for that. That's actually really bad for you, as well, from a scalability perspective, because you've got to tie up all these resources for this really long period of time. Twenty seconds on a system can be a really long period of time when you've got thousands and thousands of people wanting to get serviced.


So, from a user design standpoint, maybe the appropriate model then is, "Hey, look! We've taken your piece of information, we're working on this thing, come back in a little while, we'll let you know when it's done." Maybe I'll send you back a message through some other channel to let you know, "Hey, that piece of work is done. You've accomplished that transaction."


So, it's stuff like that you really want to start thinking through.
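The "we've taken your piece of information, come back in a little while" model Bill describes can be sketched roughly as follows. This is a minimal, hypothetical illustration in Python--none of the class or method names come from any real framework or from the systems discussed in the show. The point is only the shape of the pattern: accept the work immediately, hand back a ticket, and let the user check back (or be notified) later.

```python
import queue
import threading
import uuid

class AsyncJobService:
    """Accepts work up front and processes it in the background."""

    def __init__(self):
        self._jobs = {}                  # ticket -> {"status", "result"}
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, payload):
        """Take the user's piece of information and return a ticket immediately."""
        ticket = str(uuid.uuid4())
        with self._lock:
            self._jobs[ticket] = {"status": "pending", "result": None}
        self._queue.put((ticket, payload))
        return ticket                    # the user is free to go do other things

    def status(self, ticket):
        """The 'come back in a little while and check' call."""
        with self._lock:
            return dict(self._jobs[ticket])

    def _worker(self):
        while True:
            ticket, payload = self._queue.get()
            result = payload.upper()     # stand-in for a slow back-end call
            with self._lock:
                self._jobs[ticket] = {"status": "done", "result": result}
            self._queue.task_done()

service = AsyncJobService()
ticket = service.submit("transfer funds")
service._queue.join()                    # in real life: poll status(), or get notified
print(service.status(ticket)["status"])  # → done
```

The scalability win is exactly the one Bill points to: the front end never ties up a thread waiting twenty seconds on another system; it only holds the work item and a status record.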

Ron:

OK. So, Bill, you were talking about kind of an asynchronous model of sort of working with the back-end systems.

Bill:

Mm-hmm.

Ron:

Which I think is really a great idea, but a lot of architects shy away from it because it is just so much easier to do things synchronously. But it brings up a technology there is a lot of buzz about right now, which is sort of the AJAX stuff.

Bill:

Yeah.

Ron:

What's your thought on that?

Bill:

Well, it's interesting. You do take on a little bit more complexity when you go down the AJAX model, and you need a few more skills in your dev shop to do it. But if you think about it from the standpoint of the back-end systems, AJAX systems that are built well can really take the pressure off what is happening on the back end. So, I don't necessarily need 40 servers or 20 servers or something along those lines when I'm giving more data to the client and I'm using the processing power that is living there on the client.


Interestingly, because the AJAX model can be very asynchronous, the other thing that is happening as well is that, from a user's experience standpoint, they are actually getting quite a good, responsive site. They are seeing things happening. And even if you have higher latencies than you expected, sometimes the experience is just as good.

Ron:

Well, in fact, a lot of times isn't it really a matter of setting the right expectation with your user?

Bill:

Yeah.

Ron:

So that, you know, if you give somebody a progress meter or some kind of a thing that says, "This is going to take a while, we'll get back to you," or whatever. With some of these systems that are built synchronously it's very hard to know how long something is taking, but if I can do something useful while I am waiting then I am not so worried about it.

Bill:

That's true. And also from an AJAX standpoint, if you look at some of the newer-age UI controls and objects, there are a lot of rich components out there that can alert users quite well when things are going on in the background. Users don't mind.

Ron:

I'm here next to a tower that was built in the early 1800s as a means of defense for this harbor, which is not Sydney Harbor by the way, but it is Botany Bay, right?

Bill:

Yes.

Ron:

And it is also a point where they could watch for smugglers who were trying to avoid taxes of all kinds. [laughs] But it makes me think about security, because most applications have need for something like this tower, where they are watching for intruders and people trying to avoid the proper authentication and whatnot. Yet this has an impact on performance. So I'm wondering, you know, how do you think about the trade-off between security and performance?

Bill:

Like anything, I agree, it is a trade-off, and you do need to tread carefully. Quite often, from a performance standpoint, there are some things you can do. You can apply acceleration type of technologies if you need to. You can optimize the way security protocols are working, you can... There are opportunities to fix those kinds of things. But I agree, I mean, you know, I've lost many arguments with security architects around, you know, do we absolutely need this layer of firewalls? Well, actually we do.


[laughter]


And that's fine, you know. Like anything in architecture you have got to trade off the requirements of the organization. And for some organizations the operational risk of some kind of security breach is significant. So, you just have to work with that and you have to work around that. And you know what, quite often security technology, when you think about it hard enough you find another way around and are able to sort of deliver what you need to from a performance standpoint as well, and from a reliability standpoint too.

Ron:

OK. So, we are down here at the beach and I'm staring at the remains of many former sea creatures that didn't quite live up to their performance goals.


[laughter]


It reminds me of software systems that haven't done it, and they have been left to the scrap heap of history.

Bill:

Sure, sure.

Ron:

How do you find out if your system is going to survive, or if it is going to end up like one of these?

Bill:

OK. Well, coming back to that wind-tunnel analogy. Ideally, early on in the day you take your software architecture and the components that you are planning to build this thing on top of, and you also have to spend some time thinking about your infrastructure architecture as well. So, these two worlds, you want to bring them together.


From that standpoint, you want to essentially build out the skeletons of the major subsystems and the componentry, so that you can take those skeletal pieces, come up with the loads you expect, and throw those loads against them. And that is typically an environment you want to set up early on in your software architecture, before you have actually built the rest of the system, because you might need to trim or change that architecture in some way. When you have got that test environment set up, what you do from then on is push the system under test as hard as you can: you throw enough load at it to take it to 100%.


The first question is, can you take it to 100% CPU? If not, that automatically tells you that you have got a scalability issue in that software that you need to resolve. But then you take it to 100%, you look at it, you tweak it, you tune it a little bit more, you look at its throughput, and you figure out, "Well, is this going to do the job I need it to do?" If the thing is coming back and doing 600-700 requests a second, chances are you are going to be OK, depending on what your loads are.


So, it is really a case of building that out, isolating that application, and building an environment that is going to test the skeleton well enough. Then you essentially sign off the pieces: "OK, that architecture is going to work for what I need to do." You then build your system, and along the way you checkpoint it again in the same sort of mode. Ideally, from a test standpoint, you always have an environment around, with stubs in place, so that you can retest the system you have built, even after major releases, to make sure that it is performing.
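A toy version of that "wind tunnel" harness might look like the sketch below. It hammers a stand-in endpoint with concurrent callers and reports throughput and median latency; in practice the endpoint would be an HTTP call into the skeleton of your real system, and the concurrency and request counts here are arbitrary illustrative values.

```python
import statistics
import threading
import time

def endpoint():
    """Stand-in for a call into the skeletal system under test."""
    time.sleep(0.001)  # simulate ~1 ms of back-end work

def load_test(concurrency=20, requests_per_worker=50):
    """Throw concurrent load at the endpoint and measure how it holds up."""
    latencies = []
    lock = threading.Lock()

    def worker():
        for _ in range(requests_per_worker):
            start = time.perf_counter()
            endpoint()
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    wall_start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    wall = time.perf_counter() - wall_start

    total = concurrency * requests_per_worker
    return {
        "requests_per_second": total / wall,
        "median_latency_ms": statistics.median(latencies) * 1000,
    }

report = load_test()
print(report)
```

Running this against a skeleton early, and again at each checkpoint, gives you the throughput number ("is this doing 600-700 requests a second?") to compare against the loads you actually expect.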

Ron:

Now, is this the kind of thing that you want to do continuously going forward, so that every time you have got to roll a release you do it again?

Bill:

Yeah, sure, absolutely--for the major releases. The interesting thing about performance analysis and performance work is that it takes one really dumb check-in from, you know, a developer, to completely break the system. It can take something that is just inadvertent that can really affect your performance. You might put something in there which basically gives you a lock that everybody coming into the system gets blocked behind--this one piece of code--and that basically tanks the whole system. So, for every major release it is a good idea to go away and revalidate what you've built, to understand how it is performing.
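That "one inadvertent lock tanks the whole system" failure mode is easy to demonstrate. In this hypothetical sketch, both handlers do the same 10 ms of simulated work, but one of them takes a global lock, so concurrent callers end up going through single file:

```python
import threading
import time

GLOBAL_LOCK = threading.Lock()

def handler_with_lock():
    with GLOBAL_LOCK:          # every caller queues behind this one lock
        time.sleep(0.01)       # 10 ms of simulated work

def handler_without_lock():
    time.sleep(0.01)           # same work, but callers proceed independently

def timed_run(handler, concurrency=10):
    """Fire `concurrency` simultaneous callers and time the whole batch."""
    threads = [threading.Thread(target=handler) for _ in range(concurrency)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serialized = timed_run(handler_with_lock)    # ~10 callers x 10 ms, back to back
parallel = timed_run(handler_without_lock)   # all 10 overlap: ~10 ms total
print(f"with lock: {serialized:.3f}s, without: {parallel:.3f}s")
```

The locked version takes roughly ten times as long for ten concurrent callers, which is exactly the kind of regression a per-release load test catches and a two-user functional test never will.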


The other thing you want to think about as well is, what are you going to build into your system, or what pieces of infrastructure are you going to have, so that the system itself can report how it is performing at runtime. For example, for one of the clients I am working with right now, we are essentially building instrumentation into the system so that at every key piece of the transaction flow it is actually taking a timestamp, aggregating, and reporting that stuff back out. So, we can tell immediately if there is some sort of performance issue, and where it is.
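That kind of self-reporting instrumentation can be sketched as a simple timing decorator that stamps each key step of a transaction flow and aggregates the samples. This is an illustrative pattern only--the step names and the two stand-in functions are hypothetical, not from the client system Bill mentions:

```python
import collections
import functools
import time

# Aggregated timings, keyed by step name: step -> list of durations (seconds).
TIMINGS = collections.defaultdict(list)

def instrumented(step_name):
    """Wrap a transaction step so every call records how long it took."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                TIMINGS[step_name].append(time.perf_counter() - start)
        return wrapper
    return decorate

@instrumented("validate")
def validate(order):
    time.sleep(0.002)           # stand-in for real validation work

@instrumented("settle")
def settle(order):
    time.sleep(0.005)           # stand-in for a slow back-end call

for order in range(3):
    validate(order)
    settle(order)

# Report: average time per step, in milliseconds.
for step, samples in TIMINGS.items():
    print(f"{step}: {1000 * sum(samples) / len(samples):.1f} ms avg")
```

With per-step aggregates flowing out at runtime, a sudden jump in one step's average points you straight at where the performance issue lives.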

Ron:

Well, thanks Bill, for joining me today.

Bill:

Thank you.

Ron:

Wow! Talk about a wind tunnel. [laughs] I'm telling you, it was a brisk winter day at La Perouse that day, where the wind was just coming in off the ocean, and you could kind of hear it there. There are actually some clips we just couldn't use because there was so much noise.


But I love that picture of the wind tunnel, you know. Who would build a plane, or even an automobile or a building, without testing the basics of the design shape under stress in a wind tunnel? And, you know, if you are building an application where performance is critical, where you think you're going to have a lot of traffic, you need to build the wind tunnel. I love that picture.


I hope you are learning a lot from this, I know I am; it's just incredible. And we have so much more to come on ARCast.TV, so, see you next time.

Announcer:

ARCast.TV is a production of the...