Posted By: Dan Fernandez | Jul 16th, 2008 @ 12:22 PM | 89,756 Views | 16 Comments
Daniel goes through the four key reasons why BSODs happen, how Windows allocates memory, and why developers need to be careful when setting kernel-mode memory. He then walks through a real-world example of a faulty device driver and shows how to debug and diagnose such issues. He also shows how to read and write data in an application process, such as Notepad, using WinDbg.
karnokd:

It's simple really - boot device drivers have special requirements, which lead to the ability to save crash dumps.

A storage driver written any old way won't be able to do this.  Windows loads an extra copy of the boot driver, kernel crash dump writing code, plus a bitmap of the page file on the boot drive, then checksums the lot.  At blue screen time, the checksum is verified and if good, the crash dump is written directly to the sectors known to be used by the page file on the boot drive.
Can you save crash dumps in other cases?  Make sure your page file on the boot drive (the one with the Windows directory) is large enough to save the kind of dump you've selected and turn off "Automatically restart" so you can see the blue screen (including the bugcheck code and parameters) and the results of the attempt to save the crash dump.

Then download notmyfault from here: http://technet.microsoft.com/en-us/sysinternals/bb963901.aspx

Use notmyfault to crash your system and verify that crash dumps can be saved.
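
Before crashing the box on purpose, it can help to confirm the settings mentioned above. This is only a rough sketch, assuming the dump type and "Automatically restart" options are stored as the CrashDumpEnabled and AutoReboot values under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl. The helper names (describe_crash_control, read_crash_control) are made up for this example, and it does not check the page file size.

```python
# Sketch only: report the crash dump settings discussed above.
# Helper names are hypothetical; the registry read works on Windows only.

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Meaning of the CrashDumpEnabled value.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump",
}


def describe_crash_control(values):
    """Turn raw CrashControl values (a dict) into readable findings."""
    findings = []
    dump_type = values.get("CrashDumpEnabled", 0)
    label = DUMP_TYPES.get(dump_type, "unknown (%d)" % dump_type)
    findings.append("dump type: " + label)
    if dump_type == 0:
        findings.append("warning: crash dumps are disabled")
    if values.get("AutoReboot", 0):
        findings.append(
            "warning: automatic restart is on, the blue screen will not stay visible"
        )
    return findings


def read_crash_control():
    """Read the two settings from the registry (Windows only)."""
    import winreg  # imported here so the rest of the sketch loads on any OS

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        for name in ("CrashDumpEnabled", "AutoReboot"):
            try:
                values[name], _ = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                pass  # value missing; describe_crash_control treats it as 0
    return values
```

On a Windows machine, printing each line of describe_crash_control(read_crash_control()) lists the configured dump type and warns if automatic restart is still enabled. The real test is still to run notmyfault and see whether a dump gets written.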

Daniel's video nicely complements a talk I've delivered on crash and hang analysis at various conferences. You can check out the on-demand web cast from TechEd a couple years ago here:

TechEd On-Demand Webcast: Windows Hang and Crash Dump Analysis

I answer some of the questions raised here in the comments, like how to debug a frozen system and why a dump file requires a paging file on the boot volume (the one with \Windows).

Unfortunately, your only real option is to attach a kernel debugger to the system and wait for it to crash again. If it's only just started happening and is rather frequent, then I would suspect some sort of hardware error, but get a debugger attached and we'll get some answers.
Mark's video goes into more detail about the requirements for a successful dump, but at a bare minimum you should create a 16 MB paging file on the partition where Windows is installed and set the system to perform a small memory dump. That way, once you triage the minidumps, you'll be able to determine whether the same problem is causing all of the bugchecks or whether another action plan needs to be followed.

If you find it's your display driver causing the problem and it's not something the vendor has seen before, it's possible you'll need to provide them with a kernel memory dump. In that case you'll need to increase the size of the paging file on your boot partition and switch your options to kernel memory dump.