Daniel Pearson: Debugging a Windows Blue Screen of Death

Description

Daniel goes through the four key reasons why BSODs happen, how Windows allocates memory, and why developers need to be careful when writing to kernel-mode memory. He then walks through a real-world example of a faulty device driver and shows how to debug and diagnose the problem. Daniel also shows how to read and write data in an application process, such as Notepad, using WinDbg.

The Discussion

  • Chadk
    This is awesome. I would love to see much more like this.

    It was interesting to see how you could find information in the memory dump.
  • stevo_
    I was just wondering today, what would happen if the 'code' that handles the BSOD... crashed? Like, what if my processor was on its last legs and caused an execution fault?

    Let me guess, something really boring like everything going blank, or my computer restarting (or not)?
  • Charles
    I've had something like this happen to my machine twice in the last two days. There's some sort of kernel crash that causes the system to reboot - with no telemetry whatsoever. Certainly a critical system failure, but the system is unable to record any data before restarting.

    C
  • PerfectPhase
    If this sort of thing is of interest to you, this is a must-see: Sysinternals Video Library

  • karnokd
    Nice video. However, I never understood why no minidump is created when the swap file is moved to a different partition/hard disk in WinXP. (In my case, I play 3D games and the video card driver sometimes crashes my system, and I cannot hope to send that crash dump to its manufacturer.)
  • tomkirbygreen
    Hardcore! :) Nice to see some close-to-the-metal stuff in with all the managed goodness. Helps keep us honest as developers ;)
  • stevo_

    That doesn't sound good, Charles ;).. I was interested because, if there's no one "there", i.e. the kernel itself has crashed out of its bugcheck code due to a hardware fault, does my system just sit there? I assume I'd still have display because the GPU would just be outputting the last buffer it was given?

    But I've watched the video now, really cool. Loved the idea of trapping a driver by putting it on a "known offenders" list and luring it into doing something it will get caught red-handed for.. :D

  • littleguru
    Probably you get a freeze. A complete halt with the most current video buffer re-drawn over and over again.
  • Charles
    Unhandled exceptions in kernel mode lead to reboot by policy (or if you're lucky a bluescreen with data capture for debugging purposes). Anytime something goes wrong in kernel world the system must commit temporary suicide (or start the reincarnation process, to be more positive in tone :)). There's too much weird and invalid state to deal with when this happens, and typically it's not worth it (given the ensuing instability and total strangeness that user mode gets to experience as a result)...

    This keeps happening on my machine and there's no way for me to debug given that no data on the fault is preserved (or even captured). Clearly, it's a device driver malfunction. I suspect it's a driver that's not Vista Ready... :)

    What's one to do in this case, Daniel?
    C
  • ZXTT95
    karnokd:

    It's simple really - boot device drivers have special requirements, which lead to the ability to save crash dumps.

    A storage driver written any old way won't be able to do this.  Windows loads an extra copy of the boot driver, kernel crash dump writing code, plus a bitmap of the page file on the boot drive, then checksums the lot.  At blue screen time, the checksum is verified and if good, the crash dump is written directly to the sectors known to be used by the page file on the boot drive.
  • ZXTT95
    Can you save crash dumps in other cases?  Make sure your page file on the boot drive (the one with the Windows directory) is large enough to save the kind of dump you've selected and turn off "Automatically restart" so you can see the blue screen (including the bugcheck code and parameters) and the results of the attempt to save the crash dump.

    Then download notmyfault from here: http://technet.microsoft.com/en-us/sysinternals/bb963901.aspx

    Use notmyfault to crash your system and verify that crash dumps can be saved.
  • Mark Russinovich

    Daniel's video nicely complements a talk I've delivered on crash and hang analysis at various conferences. You can check out the on-demand web cast from TechEd a couple years ago here:

    TechEd On-Demand Webcast: Windows Hang and Crash Dump Analysis

    I answer some of the questions raised here in the comments, like how to debug a frozen system and why a dump file requires a paging file on the boot volume (the one with \Windows).

  • Daniel Pearson
    Unfortunately your only real option is to attach a kernel debugger to the system and wait for it to crash again. If it's only just started happening and is rather frequent then I would suspect some sort of hardware error but get a debugger attached and we'll get some answers.
  • Daniel Pearson
    Mark's video goes into more detail about the requirements for a successful dump, but at a bare minimum you should create a 16 MB paging file on the same partition where Windows is installed and set the system to perform a small memory dump. That way, once you triage the minidumps, you'll be able to determine if it's the same problem causing all of the bugchecks or if another action plan needs to be followed.

    If you find it's your display driver causing the problem and it's not something the vendor has seen before, it's possible you'll need to provide them with a kernel memory dump. In that case you'll need to increase the size of the paging file on your boot partition and switch your options to kernel memory dump.
  • jasone
    If you like debugging, see http://blogs.msdn.com/ntdebugging/
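
The dump settings that ZXTT95 and Daniel Pearson describe in the thread are stored in the registry under HKLM\SYSTEM\CurrentControlSet\Control\CrashControl. As a rough sketch, not an official tool, the Python below reads the CrashDumpEnabled and AutoReboot values with the standard winreg module and flags the problems mentioned in the comments. The helper names (review_settings, read_crash_control) are invented for illustration, the boot-drive page file size has to be supplied by hand, and the 16 MB threshold is simply Daniel's bare-minimum figure; kernel and complete dumps need considerably more.

```python
import sys

# Registry key (under HKEY_LOCAL_MACHINE) that holds the crash dump settings.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Classic meanings of the CrashDumpEnabled value.
# Newer Windows versions define additional values not listed here.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump",
}


def review_settings(crash_dump_enabled, auto_reboot, boot_pagefile_mb):
    """Return warnings for settings the thread says prevent useful dumps.

    Hypothetical helper: boot_pagefile_mb is the size of the page file on
    the drive that holds the Windows directory, supplied by the caller.
    """
    warnings = []
    if crash_dump_enabled == 0:
        warnings.append("crash dumps are disabled")
    elif boot_pagefile_mb < 16:
        # 16 MB is only the bare minimum for a small memory dump.
        warnings.append("page file on the boot drive is under 16 MB")
    if auto_reboot:
        warnings.append("automatic restart is on, so the blue screen will not stay visible")
    return warnings


def read_crash_control():
    """Read (CrashDumpEnabled, AutoReboot) from the registry. Windows only."""
    import winreg  # imported here so the module also loads on other platforms

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        dump_type, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        auto_reboot, _ = winreg.QueryValueEx(key, "AutoReboot")
    return dump_type, auto_reboot


if __name__ == "__main__" and sys.platform == "win32":
    # Usage: pass the boot-drive page file size in MB as the first argument.
    pagefile_mb = int(sys.argv[1]) if len(sys.argv) > 1 else 0
    dump_type, auto_reboot = read_crash_control()
    print("Dump type:", DUMP_TYPES.get(dump_type, "other (%d)" % dump_type))
    for warning in review_settings(dump_type, auto_reboot, pagefile_mb):
        print("Warning:", warning)
```

Once the warnings are cleared, crashing the machine deliberately with notmyfault, as ZXTT95 suggests, is still the real test that a dump can be written.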
