Daniel Pearson: Debugging a Windows Blue Screen of Death


Daniel covers the four key reasons BSODs happen, how Windows allocates memory, and why developers need to be careful when setting kernel-mode memory. He then walks through a real-world example of a faulty device driver and shows how to debug and diagnose such issues. He also shows how to read and write data in an application process, such as Notepad, using WinDbg.




    The Discussion

    • Chadk
      This is awesome. I would love to see much more like this. 

      It was interesting to see how you could find information from the memory dump.
    • stevo_
      I was just wondering today, what would happen if the 'code' that handles the BSOD crashed? Like, what if my processor was on its last legs and caused an execution fault?

      Let me guess, something really boring like everything going blank, or my computer restarting (or not)?
    • Charles
      I've had something like this happen to my machine twice in the last two days. There's some sort of kernel crash that causes the system to reboot - with no telemetry whatsoever. Certainly a critical system failure, but the system is unable to record any data before restarting.

    • PerfectPhase

      If this sort of thing is of interest to you, the Sysinternals Video Library is a must-see.

    • karnokd
      Nice video. However, I never understood why no minidump is created when the swap file is moved to a different partition/hard disk in WinXP. (In my case, I play 3D games and the video card driver sometimes crashes my system, and I have no way of sending that crash dump to its manufacturer.)
    • tomkirbygreen
      Hardcore! :) Nice to see some close-to-the-metal stuff in with all the managed goodness. Helps keep us honest as developers ;)
    • stevo_

      That doesn't sound good, Charles ;) I was interested because, if there's no one "there", i.e. the kernel itself has crashed out while running its bugcheck code due to a hardware fault, does my system just sit there? I assume I'd still have display because the GPU would just be outputting the last buffer it was given?

      But I've watched the video now. Really cool, loved the idea of trapping a driver by putting it on a "known offenders" list and luring it into doing something it will get caught red-handed for :D

    • littleguru
      Probably you get a freeze. A complete halt with the most current video buffer re-drawn over and over again.
    • Charles
      Unhandled exceptions in kernel mode lead to a reboot by policy (or, if you're lucky, a bluescreen with data capture for debugging purposes). Anytime something goes wrong in kernel world the system must commit temporary suicide (or start the reincarnation process, to be more positive in tone :)). There's too much weird and invalid state to deal with when this happens, and typically it's not worth it (given the ensuing instability and total strangeness that user mode would experience as a result)...

      This keeps happening on my machine and there's no way for me to debug given that no data on the fault is preserved (or even captured). Clearly, it's a device driver malfunction. I suspect it's a driver that's not Vista Ready... :)

      What's one to do in this case, Daniel?
    • ZXTT95

      It's simple really - boot device drivers have special requirements, which lead to the ability to save crash dumps.

      A storage driver written any old way won't be able to do this.  Windows loads an extra copy of the boot driver, kernel crash dump writing code, plus a bitmap of the page file on the boot drive, then checksums the lot.  At blue screen time, the checksum is verified and if good, the crash dump is written directly to the sectors known to be used by the page file on the boot drive.
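
As a rough illustration of the sequence ZXTT95 describes (checksum the dump machinery while the system is healthy, verify it at blue screen time, then write straight to the sectors recorded for the page file), here is a minimal Python sketch. It is a toy model, not how Windows implements it: the names `DumpContext`, `prepare` and `write_dump` are hypothetical, the "disk" is just a dictionary, and `zlib.crc32` stands in for whatever checksum the kernel actually uses, which the comment above does not specify.

```python
import zlib
from dataclasses import dataclass


@dataclass
class DumpContext:
    driver_copy: bytes       # stand-in for the extra copy of the boot driver
    dump_code: bytes         # stand-in for the crash dump writing code
    pagefile_sectors: tuple  # stand-in for the page file sector map
    checksum: int            # computed once, while the system is healthy


def _checksum(driver_copy, dump_code, sectors):
    # Assumption: CRC32 is used purely for illustration.
    blob = driver_copy + dump_code + repr(tuple(sectors)).encode()
    return zlib.crc32(blob)


def prepare(driver_copy, dump_code, sectors):
    """Record the dump machinery and its checksum ahead of any crash."""
    sectors = tuple(sectors)
    return DumpContext(driver_copy, dump_code, sectors,
                       _checksum(driver_copy, dump_code, sectors))


def write_dump(ctx, disk, dump_bytes, sector_size=512):
    """At 'blue screen time': verify the checksum, then write the dump.

    Returns False without touching the disk if verification fails.
    Data that does not fit in the recorded sectors is dropped here.
    """
    current = _checksum(ctx.driver_copy, ctx.dump_code, ctx.pagefile_sectors)
    if current != ctx.checksum:
        return False  # dump machinery was corrupted; do not trust it
    for i, sector in enumerate(ctx.pagefile_sectors):
        chunk = dump_bytes[i * sector_size:(i + 1) * sector_size]
        if not chunk:
            break
        disk[sector] = chunk  # write directly to a known page file sector
    return True
```

The point of the model is the ordering: nothing is written unless the pre-computed checksum still matches, which is why a corrupted dump path results in no dump at all rather than a damaged disk.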
    • ZXTT95
      Can you save crash dumps in other cases?  Make sure your page file on the boot drive (the one with the Windows directory) is large enough to save the kind of dump you've selected and turn off "Automatically restart" so you can see the blue screen (including the bugcheck code and parameters) and the results of the attempt to save the crash dump.

      Then download notmyfault from here: http://technet.microsoft.com/en-us/sysinternals/bb963901.aspx

      Use notmyfault to crash your system and verify that crash dumps can be saved.
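
Before crashing the machine on purpose, it can help to confirm what the system is currently configured to do. The sketch below is one hedged way to check this from Python. It assumes the settings live under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl` in the `CrashDumpEnabled` and `AutoReboot` values, with 0 to 3 meaning none, complete, kernel and small dump respectively; verify those against the documentation for your Windows version. The function names `describe` and `read_crash_control` are made up for this example.

```python
import sys

# Assumed meaning of CrashDumpEnabled; other values are reported as unknown.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}


def describe(crash_dump_enabled, auto_reboot):
    """Turn the two raw registry values into a readable summary."""
    kind = DUMP_TYPES.get(crash_dump_enabled,
                          f"unknown ({crash_dump_enabled})")
    restart = "on" if auto_reboot else "off"
    return f"dump type: {kind}; automatic restart: {restart}"


def read_crash_control():
    """Read the crash settings from the registry (Windows only)."""
    import winreg  # imported here so the module still loads elsewhere
    key_path = r"SYSTEM\CurrentControlSet\Control\CrashControl"
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        dump, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        reboot, _ = winreg.QueryValueEx(key, "AutoReboot")
    return dump, reboot


if __name__ == "__main__" and sys.platform == "win32":
    print(describe(*read_crash_control()))
```

If the summary shows automatic restart as on, the blue screen and the result of the dump attempt will flash past before you can read them, which is the reason for turning it off first.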
    • Mark Russinovich

      Daniel's video nicely complements a talk I've delivered on crash and hang analysis at various conferences. You can check out the on-demand web cast from TechEd a couple years ago here:

      TechEd On-Demand Webcast: Windows Hang and Crash Dump Analysis

      I answer some of the questions raised here in the comments, like how to debug a frozen system and why a dump file requires a paging file on the boot volume (the one with \Windows).

    • Daniel Pearson
      Unfortunately your only real option is to attach a kernel debugger to the system and wait for it to crash again. If it's only just started happening and is rather frequent then I would suspect some sort of hardware error but get a debugger attached and we'll get some answers.
    • Daniel Pearson
      Mark's video goes into more detail about the requirements for a successful dump, but at a bare minimum you should create a 16 MB paging file on the partition where Windows is installed and set the system to perform a small memory dump. That way, once you triage the minidumps, you'll be able to determine whether the same problem is causing all of the bugchecks or whether another action plan needs to be followed.

      If you find it's your display driver causing the problem and it's not something the vendor has seen before, it's possible you'll need to provide them with a kernel memory dump. In that case you'll need to increase the size of the paging file on your boot partition and switch your options to kernel memory dump.
    • jasone

      If you like debugging, check out http://blogs.msdn.com/ntdebugging/.
