Regarding the 32-bit/64-bit thing - lots of Sysinternals tools do in fact require running the 64-bit version - for example procexp and procmon come to mind, as well as other utilities that require loading temporary drivers into Windows.
The reason you probably haven't noticed is that the 32-bit version of these detects that it's running under WOW64, silently drops the 64-bit procexp64 or procmon64 to disk, and launches that instead.
This means that although many Sysinternals tools aren't really 64/32-bit agnostic, they pretend to be in a pretty clever way.
Secondly, unlike AWE, PAE affects only things that look at physical addresses, i.e. some kernel-mode drivers and the memory management in the kernel itself - it doesn't affect normal user-mode programs. Consequently, to use physical memory above 4GB on a 32-bit system, only your kernel and some of your drivers need to be PAE aware - i.e. the ones that handle physical addresses but foolishly assume that the physical addresses are 32-bit (with PAE they are wider than that: 36 bits originally, and up to 52 bits in the page table format, just to be confusing). This truncation is the reason why 32-bit Windows chooses by default to ignore physical memory above 4GB and why use of that memory is disabled for Client SKUs.
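To make the truncation problem concrete, here is a minimal sketch in Python of what happens when a physical address above 4GB is squeezed into a 32-bit field. The function name and the sample addresses are made up for illustration; a real driver would do this by accident through a 32-bit variable rather than an explicit mask.

```python
# Hypothetical illustration only: models a driver that stores a physical
# address in a 32-bit field and so silently loses the upper bits.

FOUR_GB = 1 << 32


def truncate_to_32_bits(physical_address):
    """Keep only the low 32 bits, as a 32-bit integer field would."""
    return physical_address & 0xFFFFFFFF


below_4gb = 0x7FFF0000          # fits in 32 bits, survives unchanged
above_4gb = FOUR_GB + 0x1000    # needs 33 bits, the top bit is dropped

print(hex(truncate_to_32_bits(below_4gb)))  # 0x7fff0000
print(hex(truncate_to_32_bits(above_4gb)))  # 0x1000
```

The second address ends up pointing at a completely different page low in memory, which is why a single non-PAE-aware driver can corrupt memory on a machine with more than 4GB of RAM.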
You also need to be running the PAE version of Windows Server. There's a flag in boot.ini (or bcdedit) called /PAE, but this has been enabled by default ever since XP SP2. On a side note, PAE is a requirement of hardware DEP, so if you have DEP enabled, PAE is enabled. It's also a requirement of Windows 7, so if you're running Windows 7, you've got PAE.
Anyway, all that said, unless you've got a really good reason not to, you should be moving to a 64-bit OS. That way all of these complicated things just work and you don't have to think too hard about it.
The Software Engineer in Test career is only boring if you want it to be - and lots of really senior developers at Microsoft have at one point or another been an SET.
It's a bit like a battle of wills between the SETs and the normal devs; the dev writes some new code, and the job of the SET is to prove himself by finding the bugs in the code. The job of the dev is to not write the bugs in the first place. As the developers get better at not writing bugs, the SET must get better and more ingenious at finding the bugs in the first place.
If you think about it, SETs have arguably a much more fun job than devs. They don't need to spend their life thinking about customers or features or compatibility and upgrade paths and bashing out more code to comply with EU law. All they need to do is look at the code that someone else wrote, beat it with a stick and send back the mangled remains to the original developer with a note saying "I bashed it with a stick and it broke. Please fix".
- Is it possible to write C programs that do not depend on the CRT? What are the limitations?
Yes, but you have to opt out of the CRT explicitly. In Visual Studio you can go Properties -> Linker -> Input -> Ignore All Default Libraries -> True. If you do this you'll also need to turn off buffer security checks and runtime checks in the compiler options, and under Linker -> Advanced set the entry point to "DllMain" if writing a DLL or to "main" if writing an exe.
Things that have to live underneath the CRT (e.g. the Windows libraries, drivers and the kernel) need to do this if they are to be built with Visual Studio, but it's expected that most programs will want the CRT as it provides all sorts of nice low-overhead abstractions on top of the functions exported by the Windows API.
- What is exactly the relationship between the Windows API and the CRT? It looks like the CRT is built on top of Win32 (e.g. malloc calls HeapAlloc) but I came across a few Win32 functions that actually rely on the CRT to do their work.
Microsoft's CRT is built on top of the Windows API (Win32). I'm not sure any core Win32 functions rely on the CRT, but do tell me if I'm wrong.
- Is the Windows NT CRT dll (system32\msvcrt.dll) different from the one that ships with Visual Studio? Can we link our applications with this version of the CRT (and no longer ship the CRT dll with our code)?
Yes. The Windows CRT is the one used by the Windows API dlls, e.g. kernel32.dll uses msvcrt.dll, but your dll will use the CRT shipped as part of your version of Visual Studio. The msvcrt.dll in system32 tends to change only with Windows service packs.
- I would love to hear more about the way SEH is implemented (what really happens at the cpu and memory level when an exception is thrown, the search for exception handlers, stack unwinding and so forth). The operating system and the compiler are definitely involved but is the CRT also a part of that?
On i386-family processors (the x86 and x64 family if you want to call them that), Windows has used PAE paging to reference memory ever since Windows XP SP1. PAE allows pages of granularity 4096 bytes to be marked as "paged in" or "not available", and it can put additional restrictions on them like "not executable", "not writable" or "uncacheable". When your code tries an invalid access to a page in memory, the CPU faults to the page fault handler in the kernel.
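As a rough sketch of the page-level checks described above, the following Python models a 32-bit PAE virtual address being split into its table indexes and a 64-bit page table entry being checked for a fault. The bit positions (present = bit 0, writable = bit 1, cache disable = bit 4, no-execute = bit 63) and the 2/9/9/12 address split follow the x86 PAE page table format; the function names are made up for illustration, and the model ignores details such as user/supervisor checks and large pages.

```python
# Simplified model of 4KB PAE paging; helper names are hypothetical.

PAGE_SIZE = 4096

PTE_PRESENT = 1 << 0         # page is "paged in"
PTE_WRITABLE = 1 << 1        # clear means "not writable"
PTE_CACHE_DISABLE = 1 << 4   # set means "uncacheable"
PTE_NO_EXECUTE = 1 << 63     # set means "not executable" (hardware DEP)


def split_virtual_address(va):
    """Split a 32-bit virtual address into PAE table indexes and offset."""
    return {
        "pdpt_index": (va >> 30) & 0x3,   # 2 bits: 4 page directories
        "pd_index": (va >> 21) & 0x1FF,   # 9 bits: 512 entries each
        "pt_index": (va >> 12) & 0x1FF,   # 9 bits: 512 entries each
        "offset": va & (PAGE_SIZE - 1),   # 12 bits: byte within the page
    }


def would_fault(pte, access):
    """Return True if this access ('read', 'write' or 'execute') faults."""
    if not pte & PTE_PRESENT:
        return True
    if access == "write" and not pte & PTE_WRITABLE:
        return True
    if access == "execute" and pte & PTE_NO_EXECUTE:
        return True
    return False


data_page = 0x12345000 | PTE_PRESENT | PTE_WRITABLE | PTE_NO_EXECUTE
print(split_virtual_address(0xC0201004))
print(would_fault(data_page, "write"))    # False
print(would_fault(data_page, "execute"))  # True
```

The "execute" case is the DEP scenario: the page is present and writable, but the no-execute bit makes an instruction fetch from it fault.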
A lot of the time page faults are "false positives" - you're touching some memory that is paged out (i.e. put in the pagefile to free up physical ram for other processes), part of a memory mapped file that hasn't been read yet and so on, and these page faults are serviced transparently by the kernel so that user-mode won't ever see them, but occasionally the page is marked as "not available" because you're trying to do something invalid like read an unmapped page or execute on a DEP-protected (NX) page.
At this point the kernel's exception dispatcher kicks in and checks whether the previous mode was kernel mode. If it was, the kernel services the exception (or bluescreens). If the previous mode was user mode, it hands the exception to ntdll's KiUserExceptionDispatcher, which is responsible for the user-mode part of SEH. ntdll checks if there are any vectored exception handlers that need to be run, then checks for SEH handlers on the stack via the SEH exception chain. It calls each handler in turn, and if none of the registered exception handlers want to service the exception, ntdll tells the kernel to tear down the process and invoke the Dr. Watson crash dialog.
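The user-mode search order described above can be sketched as a toy model in Python. This is not how ntdll is written; the function name and the handler callables are invented for illustration, and each handler simply returns True if it wants to service the exception.

```python
# Toy model of the user-mode dispatch order; all names are hypothetical.

STATUS_ACCESS_VIOLATION = 0xC0000005
STATUS_INTEGER_DIVIDE_BY_ZERO = 0xC0000094


def dispatch_exception(code, vectored_handlers, frame_handlers):
    """Return which stage serviced the exception, or 'unhandled'."""
    # Vectored handlers are process-wide and are consulted first.
    for handler in vectored_handlers:
        if handler(code):
            return "vectored"
    # Then the frame-based chain is walked, innermost frame first.
    for handler in frame_handlers:
        if handler(code):
            return "frame"
    # Nobody wanted it: the process would be torn down at this point.
    return "unhandled"


def only_divide_by_zero(code):
    return code == STATUS_INTEGER_DIVIDE_BY_ZERO


def never_handles(code):
    return False


print(dispatch_exception(STATUS_INTEGER_DIVIDE_BY_ZERO,
                         [never_handles], [only_divide_by_zero]))  # frame
print(dispatch_exception(STATUS_ACCESS_VIOLATION,
                         [never_handles], [only_divide_by_zero]))  # unhandled
```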
The CRT is not typically involved in the SEH chain, but there are exceptions - the CRT does put SEH checks around several of its functions (i.e. the CRT uses SEH) and the compiler emits vectored exception handlers for many "SEH" try/catch blocks (because vectored exception handlers are safer from attack), but it's not really a CRT thing, it's more of an OS/ntdll thing.
- I believe that some runtime floating point support is built inside the CRT but the source code is not available. Can you share with us some implementation details?
The CRT does have code for "upgrading" the floating point performed on the CPU's FPU chip on processors that have low precision by essentially emulating the floating point behaviour in software. This is configurable in the properties of the compiler as either "fast" or "precise" floating point mode.
- Ntdll.dll also contains some C runtime library functions (mostly string related though), are they similar to the ones implemented in the CRT?
Yes and no. NTDLL is the ultimate base of the Windows user-mode - it's the first thing to be run in the process (long before your main function and long before the CRT) and it provides lots of functionality to the other Windows dlls such as kernel32, advapi, user32 and gdi32. The functionality that it provides is basically added on an ad-hoc basis almost like a somewhat manually invoked CRT for the Win32 libraries themselves, but it's not the same as the Visual Studio CRT even if they share code or if the CRT passes down to the NTDLL implementations.
- How is the CRT designed to be thread-safe and do we still need to call functions like _beginthreadex instead of CreateThread to avoid memory leaks?
The CRT is mostly threadsafe, but you'll have to check on a per function basis. In particular the core functions like heap/thread/file/security are all thread safe.
The _beginthreadex/CreateThread is an old issue, and what matters more than which to use is that you pair _beginthreadex's with _endthreadex and CreateThreads with ExitThread. _beginthreadex is better for the CRT (it gives you SEH wrappers for the entire thread for example) and mixing and matching will cause a small memory leak in the _beginthreadex/ExitThread direction and a possible crash in the CreateThread/_endthreadex direction, so choose one and stick to it.
@damiandixon: If you're doing complex 3D geometry you probably shouldn't be using Direct2D to do it. Direct3D will always be faster at doing graphics than OpenGL on Windows because Direct3D is a thin layer to the graphics driver's HAL, and OpenGL on Windows thunks to Direct3D anyway.
I'm not sure if that is correct. Incomputability does not equal non-existence. For example, take the concept of pi. We know it exists though we can't compute it.
Incomputability doesn't mean non-existence, but it does mean the non-existence of a computational answer (i.e. it may exist but it cannot exist within mathematics, since mathematics has a deterministic axiom base and all theorems of mathematics are provable within a finite (albeit large) amount of time). Your case of Pi is slightly flawed. Pi is computable, and indeed every digit of Pi is computable. If you give me a big enough computer and enough time, then for any digit d of Pi I can give you its value. This is because although Pi is defined as an infinite sum, the fact that it converges means that if some error margin is allowed, the sum can be truncated to a finite one. Thus if the dth digit is required, we just set the error margin to 10^-(d+1), which leaves a finite sum and is thus computable.
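As a small sketch of that argument, the Python below computes the dth decimal digit of Pi by truncating a convergent series once its terms fall below the working precision. The choice of series is mine, not part of the argument above: it uses Machin's formula, pi = 16*arctan(1/5) - 4*arctan(1/239), with plain integer arithmetic, and it carries a few extra guard digits (an assumption, 10 by default) so that rounding in the discarded tail does not disturb the digit being asked for.

```python
# Minimal sketch: d-th decimal digit of Pi from a truncated series.
# Machin's formula and the guard-digit count are illustrative choices.


def arctan_inverse(x, scale):
    """Return arctan(1/x) * scale using the alternating Taylor series."""
    term = scale // x          # first term: 1/x
    total = term
    x_squared = x * x
    n = 1
    sign = 1
    while term:                # stop once terms drop below the precision
        term //= x_squared     # next odd power of 1/x
        n += 2
        sign = -sign
        total += sign * (term // n)
    return total


def pi_digit(d, guard=10):
    """Return the d-th digit of Pi after the decimal point (d >= 1)."""
    scale = 10 ** (d + guard)
    pi_scaled = 4 * (4 * arctan_inverse(5, scale) - arctan_inverse(239, scale))
    return (pi_scaled // 10 ** guard) % 10


print("".join(str(pi_digit(i)) for i in range(1, 16)))  # 141592653589793
```

Each call does a finite amount of work for a finite d, which is the whole point: any particular digit is reachable, even though the full expansion never finishes.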
On the other hand it is impossible to compute all the digits of Pi within a finite amount of time (finding a digit d inside a convergent sum of n non-zero values takes time ~O(d/p), where p is the convergence rate of the series; in general, therefore, finding any digit that involves an infinity term (such as the last digit of Pi, the infinity-th digit) will take O(infinity) time and is therefore incomputable via this method in finite time).
Similarly we can know that a particle exists somewhere and has a certain probability of being in some region without knowing for sure if it is there or not: that is the definition of probabilistic, not deterministic, theories. We can claim that quantum mechanics is right because the resulting probabilities match what experiment shows the distribution to be, without necessitating that we can put an exact velocity and position vector on each particle in the universe.
As a mathematician at heart I would decline to comment on whether quantum mechanics is "correct" - correctness in physics is dependent on the maths being correct and the world-view transform being correct. If I took the problem of pushing a 1kg mass with 1 Newton of force, but failed to take into account the friction of the object on a surface, then the inaccuracy of my result is due to an inconsistency of my world-view transform rather than of my maths.
The problem with things such as statistical mechanics and quantum-observation phenomena is not that the result isn't there, but that it can't be measured due to our limited ability to measure it. The Heisenberg effect (which states that measuring a quantum particle limits its degrees of freedom by 1, or more simply that you can't measure speed, spin and velocity of a quantum particle all at once) is dependent, so far as I am aware, on the limitation that measuring can only be done by projectile analysis. In future we may improve upon this by using other methods, and the solution may become calculable.