Thanks for all the great replies -- hope everybody had fun! We read all of them, and think that this posting covers everything (which is why it's so long); if not, we're sure you'll let us know -- the thread is now unlocked!
We liked this one so much that we used it not once but twice on our internal mailing list -- one developer had a great description:
Wow! It's so screwed up it doesn't even make it to wrong.
A first question is whether this routine even needed to be written; of the various random-number generators available, weren't any suitable? Random number generators are hard (see the classic discussion in Knuth's Seminumerical Algorithms -- and then see Knuth's comments at http://www-cs-faculty.stanford.edu/~knuth/news02.html about the issue that was discovered in 2002), and there are plenty of existing random number generators out there; if one of those is appropriate, why not just use it? There are known issues in the randomness of rand, but it's substantially better than that provided by this code fragment; or, depending on your needs random or (on Windows) CryptGenRandom may be more appropriate … but if the code hadn't been written, then we couldn't spot the bugs, so I guess it's for the best.
Assuming that the code did need to be written, it probably could have been written better. On to the bugs.
To discuss this example, I'll use the terminology convention that [m, n) denotes a number >= m and < n. I'll also simplify the discussion by assuming that LONG, ULONG, and DWORD are all 32 bit quantities; the code appears to make this assumption in a couple of places.
1. The general format of a LCRNG (linear-congruential random number generator) is seed = seed * C1 + C2, and the programmer apparently thought that's what he or she was doing. However, that's not what the code says; the code says seed = seed * (C1 + C2). Since C1 and C2 are both odd, this has the effect of seed = seed * even -- so seed will never come out as an odd number. This means that if an even lower and upper bound are passed in (more about that below), the result will always be even ... not good. And in fact, several people pointed out that the number of zero-bits at the end of seed increases by one for every time this thing gets called seed becomes zero; at that point, there's another call to GetTickCount and things start again.
The statement should have been written
seed = (seed * 0x41C64E6D) + 0x3039;
Several people hypothesized something along the lines of
Oh, I get it -- someone "corrected" this perfectly correct code:
seed = seed * 0x41C64E6D + 0x3039;
to the far more concise, incorrect *= form. Nice.
It's certainly possible. I don't know about the "Nice", though.
2. The modulus operation is wrong in a couple of ways. As written, some value is computed, and then taken mod UpperBound; but this does not consider LowerBound at all, so (LowerBound > 0) could return a result < LowerBound. At first it looks like just a parenthezation problem; should it have been
rand = LowerBound + ((seed & 0x7FFFFFFF) % UpperBound); // STILL WRONG!
Nope; now you're getting a number in the range [LowerBound, LowerBound + UpperBound) ... so
rand = LowerBound + ((seed & 0x7FFFFFFF) % (UpperBound - LowerBound)); // Better, but see #3 and #4
3. What about the signedness? Several people questioned why seed is a LONG rather than a ULONG (since so much care is to make everything else a ULONG); that's a good question. It's worse than just a clarity problem; the masking of the seed with 0x7FFFFFFF essentially chops out half the range. Consider the case where LowerBound = 0 and UpperBound > 0x80000000 (which is possible, since it is a ULONG). Now seed & 0x7FFFFFFF will always be less than 0x8000000, so there's an entire chunk of possible values that won't actually be generated.
It could be objected that for the particular application that this function was used in, this situation would never come up -- it would always be called with values that were originally signed longs, and hence < 0x8000000. Well, fine; but if that's really the intent, it's much better to make the function prototype LONG GenerateRandom(LONG, LONG) ... but of course that brings up the question as to whether this could work on negative values ... hmm, this is also problematic. It seems to me that it's better to leave it as ULONGs, change the declaration of seed to be a ULONG as well, and remove the unnecessary bitwise-and.
rand = LowerBound + (seed % (UpperBound - LowerBound)); // Better still, but see #4 and #9
There's also a signed/unsigned mismatch with DWORD and LONG. This seems harmless (yes, it could lead to an overflow on conversion, but this routine is intentionally introducing overflows all the time), but it deserves a comment at the very least -- and perhaps an explicit cast to make it obvious that this is what is intended.
4. If the program is called with UpperBound == 0, you'll get a divide-by-zero exception. I hate it when that happens.
The fix in #3 (where you'll get a divide-by-zero if UpperBound == LowerBound) makes it clear that the underlying problem here is that the "specification" (comment describing the behavior) is somewhat ambiguous: it doesn't specify any preconditions, and it doesn't say what happens if UpperBound <= LowerBound.
So, what should be done here? There are various possibilities. The easiest in some ways is simply to update the comment to make it clear that this is illegal (in which case it's on the caller's head if she does it); this gets the implementer off the hook, but if you want the function to be used in more than one place, puts a lot of burden on the caller. Another possibility is to check to ensure the correctness of the parameters and do something if they're not; in this case, you could conceivably return 0xFFFFFFFF, which isn't a possible return value for any possible input values (or you could raise or throw an exception); now the caller is responsible for checking the return value, and it seems like this could be a very easy thing to overlook. Still another possibility is to change the function prototype to return a status of some kind -- a BOOL, HRESULT or NTSTATUS or whatever the preferred status type is for the API that it's a part of. Once again the burden is on the caller to check, but it's somewhat clearer now just from the function's prototype that checking needs to be done. The Win32 API CryptGenRandom, for example, returns a BOOL. Finally, it might be possible to extend the specification so that the function doesn't fail for any value of inputs, something along the lines of "returns a random integer in the range [min(LB, UB), max(LB, UB)] unless LB==UB in which case it returns LB"; but you really have to ask yourself whether this would lead to anything more usable.
An advantage of returning an explicit status is that you can then use some kind of static analysis tool to verify whether the return value is checked by the caller. The disadvantage is that this doesn't tell you whether it's handled correctly -- a lot of defects occur in cleanup code.
I'm going to wimp out and avoid taking a position on this; the right answer depends too much on context.
5. There's a race condition here: multiple threads could use the same static variable, and there are no synchronization primitives (critical sections, mutexes, etc.). It could be pointed out that the person writing this somehow new that it could only get called from a single threaded program, in which case it wouldn't be a problem in the short term … just a lurking time bomb in case the environment changes or somebody tries to reuse this. At the very least, a comment (probably both here, at the start of the source file, and in the header file) indicating the restriction to single-threaded enviornments is needed. [Incidentally, the VC implementations of the varoius rand/random functions are careful to store their seed in thread-local storage to ensure that there aren't any race conditions. ]
6. A couple of people pointed out that the constants in LCRNG's are traditionally prime, and 0x41C64E6D isn't (AT was even kind enough to factor it). There are actually important number-theoretic reasons for having these be prime, so this is a bug, not just a "failure to observe tradition". A couple of people asked where the magic constants came from (beats me), which points out another problem: a comment giving some explanation (or even better, source) for these would certainly be helpful for anybody reviewing or modifying the code. Even if you're not doing code reuse by calling an existing routine, you often find other ways to leverage existing work.
7. Is the resolution of GetTickCount an issue? The documentation says that that it's the number of milliseconds since system restart time; well, the fact that it's milliseconds rather than microseconds or seconds is okay (although if you're in an environment where systems are rebooted regularly, it does lead to some bias) … but the description in http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sysinfo/base/gettickcount.asp makes this intriguing point about GetTickCount
It is limited to the resolution of the system timer.
If the system timer on some system is only accurate to (say) 100 milliseconds, then the two least significant bits of the seed will always be 0 after the first call to GetTickCount, which could be an issue.
9. Even after making all these fixes, the function isn't particularly random. There are some well-defined mathematical tests for randomness, but here are a couple of specific issues:
a. It's not fair. For example, if LowerBound is zero and UpperBound is 1,500,000,000, values between 0 and about 500,000,000 will be returned twice as often as other values.
b. It relies primarily on the least significant bits of seed, which are the least random parts. The worst case example, if LowerBound is 0 and UpperBound is 1, the results will strictly alternate between 0 and 1: 0, 1, 0, 1, 0, 1 … completely nonrandom. If the bounds are 0, 10 -- it generates even, odd, even, odd, even, odd.
10. Several people pointed out that the code relied on ULONG and LONG both being 32-bit quantities. Should that be commented? [But then do you comment every place you have a dependency on ULONG and LONG being 32-bit?] Or is it better to rewrite it to remove the dependency?
As always, people spotted some "bugs" that weren't there. A couple of examples in particular are worth highlighting:
A. Several repsonders suggested that the static variable would get reinitialized each time the function is called. No; whether this is C or C++ code, static and global variables are initialized at program startup time. Is the explicit initialization needed? Microsoft compilers (and I believe gcc as well) do initialze statics and globals to zero, but if you're writing code that needs to work on arbitrary C compilers, a little web searching will reveal that this is something you can't necessarily rely on; and in any case, I would say that putting it in helps communication by pointing out that it actually does matter that the seed starts at zero.
B. There are integer overflows here! Yeah, so what? Sometimes you do this intentionally for efficiency purposes -- and random number generators are one of those places.
Even with all of these problems, this is a story with a happy ending -- the actual bug was found (and fixed) in a system-wide code review before Win2K. Still, looking at old bugs can also be instructive. The person who originally sugested this as a good "Spot the Bug" fragment noticed
The original code had a comment saying they didn't understand why the lower nibbles were not very random. Very funny.
Ha ha.
jon