

 
 
Posted 4th December, 2002, 07:35 PM

Guide to troubleshooting blue screen (BSOD) errors

As requested by Daniel~, I have posted my BSOD troubleshooting guide as a FAQ.

The first thing you can do in troubleshooting blue screens is enable a complete memory dump and turn off automatic reboot. This is done under the system properties in Startup and Recovery. It's best to do this right after you install Windows, before you start getting any BSODs. Sometimes the blue screen will show the name of the driver that is causing the problem. If nothing else, you can look up the STOP code and see what the problem is.
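If you would rather script these two settings than click through the dialog, here is a rough Python sketch. It assumes that Startup and Recovery stores its options in the CrashControl registry key as the CrashDumpEnabled and AutoReboot values; that mapping and the function names are my own additions for illustration, so check them on your system before relying on them.

```python
# Sketch only: the registry key and value names below are assumptions about
# where Startup and Recovery keeps its settings; the function names are made up.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"


def crash_control_settings():
    """Return the DWORD values for a complete dump with no automatic reboot."""
    return {
        "CrashDumpEnabled": 1,  # assumed: 1 = complete memory dump
        "AutoReboot": 0,        # assumed: 0 = stay on the blue screen
    }


def apply_crash_control():
    """Write the settings to the registry (Windows only, needs admin rights)."""
    import winreg  # imported here so the rest of the file loads on any OS

    with winreg.OpenKey(
        winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY, 0, winreg.KEY_SET_VALUE
    ) as key:
        for name, value in crash_control_settings().items():
            winreg.SetValueEx(key, name, 0, winreg.REG_DWORD, value)
```

Call apply_crash_control() from an administrator account, then open Startup and Recovery to confirm that the dialog shows what you expect.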

The most important info to record from the blue screen is the technical info, which will look something like this:

*** STOP: 0x000000EB (0x00000032, 0x00002345, 0xABCDEF00, 0x00000000)

The first number after STOP is the bug check code. In this case it translates to DIRTY_MAPPED_PAGES_CONGESTION. (I just picked that one at random; I've never actually seen it on a live system.) The four numbers in parentheses are the bug check parameters, which are explained in the debugger help file. For example, in this case the first parameter is the total number of dirty pages.
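If you write the STOP line down exactly, splitting it into its parts is mechanical. Here is a small Python sketch of that; the function name and the lookup table are just for illustration, and the table only holds the two bug checks mentioned in this guide, not the full list from the help file.

```python
import re

# Only the two bug checks named in this guide; the debugger help file has the rest.
KNOWN_BUG_CHECKS = {
    0xEB: "DIRTY_MAPPED_PAGES_CONGESTION",
    0xE2: "MANUALLY_INITIATED_CRASH",
}

# Matches "STOP: 0x000000EB (0x..., 0x..., 0x..., 0x...)"
STOP_LINE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_stop_line(line):
    """Split a STOP line into (bug check code, list of parameters) as integers."""
    match = STOP_LINE.search(line)
    if match is None:
        raise ValueError("not a STOP line: %r" % line)
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


code, params = parse_stop_line(
    "*** STOP: 0x000000EB (0x00000032, 0x00002345, 0xABCDEF00, 0x00000000)"
)
print(hex(code), KNOWN_BUG_CHECKS.get(code, "look it up in the help file"))
print(params[0])  # first parameter: the total number of dirty pages for this code
```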

If the info on the blue screen isn't enough to pinpoint the problem, you will need to install the Microsoft debugging tools. These can be downloaded from:

http://www.microsoft.com/ddk/debugging/default.asp

Don't download the beta; get version 6.0.17.0, or whatever the current non-beta version is.

Even if you never actually use the debugger itself, the help file with its section on blue screens is very useful. It lists the explanations for all 200+ of them, as well as possible troubleshooting steps.

If you can get into the system in safe mode with networking support, install the tools on your PC. If not, take the HD from your PC and put it in another system, but don't boot from your HD; boot from the working PC's HD. The key is that you want to be able to read the memory dump from the hard drive with the debugger.

Then, set the symbol file path to use the MS symbol server. Do this from the File menu, and set the path to:

srv*c:\websymbols*http://msdl.microsoft.com/download/symbols

Next, set the image file path to the root of whatever drive contains the memory dump. If it is your own PC, this is probably c:\. If you install the HD in another PC, it will probably be something like e:\.

Then, open the crash dump file (again from the File menu). This will usually be in your WINDOWS or WINNT folder with the name MEMORY.DMP. Usually only the most recent dump file is stored here. It could also be a mini dump, in which case it will be located in the Minidump folder. You can have more than one mini dump; they are named according to the date and time of the crash. In any case, open it up. Be sure that you choose to open a crash dump and not a source file.
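If you aren't sure which dumps exist on a drive, something like the following Python sketch can list them. It only looks in the places mentioned above; the function name is made up, and the assumption that mini dumps end in .dmp is mine.

```python
import os


def find_dump_files(drive_root):
    """Return existing crash dump paths under drive_root, newest first."""
    candidates = []
    for windows_dir in ("WINDOWS", "WINNT"):
        base = os.path.join(drive_root, windows_dir)
        full_dump = os.path.join(base, "MEMORY.DMP")
        if os.path.isfile(full_dump):
            candidates.append(full_dump)
        minidump_dir = os.path.join(base, "Minidump")
        if os.path.isdir(minidump_dir):
            for name in os.listdir(minidump_dir):
                # Assumption: mini dump files carry a .dmp extension.
                if name.lower().endswith(".dmp"):
                    candidates.append(os.path.join(minidump_dir, name))
    # Sort by modification time so the most recent crash comes first.
    return sorted(candidates, key=os.path.getmtime, reverse=True)
```

For a drive moved from another PC you would call it with that drive's root, for example find_dump_files("e:\\").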

After some time, you will get a brief analysis of the crash with the name of the driver that likely caused it. Look at the screen to be sure you don't have any messages about wrong symbols or invalid image paths. If you get those, the analysis may not be accurate. Also, close any windows that come up with assembly code in them. The info you want is in the main window and is in reasonably plain English.

If you want more info, give this command on the debugger command line:

!analyze -v

That will give you a more detailed analysis of the crash. If you can't make sense of it, cut and paste it here and I will try to figure it out.
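If you end up doing this often, the whole sequence (symbol path, image path, open the dump, !analyze -v) can be run in one shot from the console debugger instead of the WinDbg menus. Here is a rough Python sketch. It assumes kd.exe from the debugging tools is on your PATH and that your version accepts the -z, -y, -i and -c switches; the function names are just ones I made up.

```python
import subprocess


def symbol_server_path(cache_dir, server="http://msdl.microsoft.com/download/symbols"):
    """Build a srv*<local cache>*<symbol server> symbol path."""
    return "srv*%s*%s" % (cache_dir, server)


def analyze_command(dump_file, image_path, cache_dir=r"c:\websymbols", debugger="kd"):
    """Return the argument list for a one-shot '!analyze -v' run."""
    return [
        debugger,
        "-z", dump_file,                      # crash dump to open
        "-y", symbol_server_path(cache_dir),  # symbol file path
        "-i", image_path,                     # image file path
        "-c", "!analyze -v; q",               # analyze, then quit the debugger
    ]


def run_analysis(dump_file, image_path):
    """Run the debugger and return its text output (needs the tools installed)."""
    result = subprocess.run(
        analyze_command(dump_file, image_path),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Something like run_analysis(r"C:\WINDOWS\MEMORY.DMP", "c:\\") should then return the same text you would otherwise cut and paste from the debugger window.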

Finally, if you can't find a memory dump, then your machine is probably crashing before the hard drive controller drivers are loaded. This prevents a memory dump from being written.

Whew, did you get all of that? If not, let me know what parts you need clarification on and I will try to help.

You can use WinDbg to troubleshoot just about any BSOD; it is really quite handy. It's always easier if you have a complete memory dump, so you probably want to set that in your system properties. The help file also includes info on all of the different BSODs, including some you have probably never heard of before like EMPTY_THREAD_REAPER_LIST and SYSTEM_PTE_MISUSE. Just look in help under the category of bug checks (blue screen).

In case anyone is curious, here is a sample output from the debugger when analyzing a crash dump:
Code:
Microsoft (R) Windows Debugger  Version 6.0.0017.0
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\WINDOWS\MEMORY.DMP]
Kernel Dump File: Full address space is available

Symbol search path is: srv*c:\websymbols*http://msdl.microsoft.com/download/symbols
Executable search path is: c:\
Windows XP Kernel Version 2600 (Service Pack 1) UP Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 2600.xpsp1.020828-1920
Kernel base = 0x804d4000 PsLoadedModuleList = 0x8054be30
Debug session time: Tue Dec 03 22:34:21 2002
System Uptime: 0 days 0:00:49.250
Loading Kernel Symbols
...........................................................
Loading unloaded module list
.........
Loading User Symbols
*********************
*                   *
* Bugcheck Analysis *
*                   *
*********************

Use !analyze -v to get detailed debugging information.

BugCheck E2, {0, 0, 0, 0}

Probably caused by : i8042prt.sys ( i8042prt!I8xProcessCrashDump+235 )

Followup: MachineOwner
---------
In this case, BugCheck E2 is MANUALLY_INITIATED_CRASH, which means that I held down the right CTRL key and pressed Scroll Lock twice to force a blue screen. You have to set a registry value for this to work; search Google Groups for CrashOnCtrlScroll and you can find the exact value you need to set.

The next line (Probably caused by) is the most important one: that is where it tells you which driver caused the crash. In this case it is the keyboard driver, as it always is for an E2. The information after that is the name of the function that was being called. You can sometimes deduce what is going on from the name of the function that caused the crash.

The Followup line almost always says MachineOwner. That's a nice way of saying that it's something wrong with your system. I've also seen PoolCorruption on that line, but I don't know what that's really supposed to mean.
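If you are collecting results from several dumps, the "Probably caused by" line is easy to pick apart. This is only a sketch based on the one line format shown in the output above; the function name is made up, and the offset is treated as hexadecimal because the same offset appears as +0x235 in the detailed output.

```python
import re

CAUSED_BY = re.compile(
    r"Probably caused by\s*:\s*(?P<image>\S+)\s*\(\s*(?P<symbol>[^)]*?)\s*\)"
)


def parse_probable_cause(line):
    """Return image name, module, function and offset from the summary line."""
    match = CAUSED_BY.search(line)
    if match is None:
        return None
    symbol = match.group("symbol")      # e.g. i8042prt!I8xProcessCrashDump+235
    location, _, offset = symbol.partition("+")
    module, _, function = location.partition("!")
    return {
        "image": match.group("image"),
        "module": module,
        "function": function or None,   # None if no function name was printed
        "offset": int(offset, 16) if offset else None,
    }


cause = parse_probable_cause(
    "Probably caused by : i8042prt.sys ( i8042prt!I8xProcessCrashDump+235 )"
)
print(cause["image"], cause["function"])
```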

Then, if you do an !analyze -v, you get all the gory details:
Code:
kd> !analyze -v
*********************
*                   *
* Bugcheck Analysis *
*                   *
*********************

MANUALLY_INITIATED_CRASH (e2)
The user manually initiated this crash dump.
Arguments:
Arg1: 00000000
Arg2: 00000000
Arg3: 00000000
Arg4: 00000000

Debugging Details:
------------------


BUGCHECK_STR:  MANUALLY_INITIATED_CRASH

DEFAULT_BUCKET_ID:  DRIVER_FAULT

LAST_CONTROL_TRANSFER:  from f8668681 to 805266db

STACK_TEXT:  
80541e5c f8668681 000000e2 00000000 00000000 nt!KeBugCheckEx+0x19
80541e78 f8667efb 0025e0d8 01541ec6 00000000 i8042prt!I8xProcessCrashDump+0x235
80541ec0 804ebb04 820d6d98 8225e020 0001001a i8042prt!I8042KeyboardInterruptService+0x21c
80541ec0 804f1d67 820d6d98 8225e020 0001001a nt!KiInterruptDispatch+0x3d
ffdff980 ffdff980 f8952000 0000233b 00000000 nt!KiIdleLoop+0x12


FOLLOWUP_IP: 
i8042prt!I8xProcessCrashDump+235
f8668681 5d               pop     ebp

FOLLOWUP_NAME:  MachineOwner

SYMBOL_NAME:  i8042prt!I8xProcessCrashDump+235

MODULE_NAME:  i8042prt

IMAGE_NAME:  i8042prt.sys

DEBUG_FLR_IMAGE_TIMESTAMP:  3d6de41d

STACK_COMMAND:  kb

BUCKET_ID:   MANUALLY_INITIATED_CRASH_i8042prt!I8xProcessCrashDump+235

Followup: MachineOwner
---------
As you can see, there is a lot of weird stuff in there, but if you look at the names of what is being called, it sometimes makes pretty good sense. The STACK_TEXT section tells you what was being called right before the crash, with the topmost line being the last thing that happened before the crash, and then it goes backward in time as you read down. So if you see a line like this:

i8042prt!I8042KeyboardInterruptService+0x21c

That means that the function I8042KeyboardInterruptService was called in the i8042prt driver. Following the same logic, this:

nt!KeBugCheckEx+0x19

Means that the function KeBugCheckEx was called in the NT kernel. This will almost always be the last call before the system BSODs, because that is the actual function in the code that displays the blue screen and handles the memory dump.
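To make the module!function+offset reading concrete, here is a small Python sketch that pulls those three pieces out of each STACK_TEXT line. It assumes the call is always the last field on the line, as it is in the dump above; the function name is mine.

```python
def parse_stack_text(lines):
    """Turn STACK_TEXT lines into (module, function, offset) tuples, newest call first."""
    frames = []
    for line in lines:
        fields = line.split()
        if not fields or "!" not in fields[-1]:
            continue  # skip blank lines and anything without module!function
        module, _, rest = fields[-1].partition("!")
        function, _, offset = rest.partition("+")
        frames.append((module, function, int(offset, 16) if offset else 0))
    return frames


stack = [
    "80541e5c f8668681 000000e2 00000000 00000000 nt!KeBugCheckEx+0x19",
    "80541e78 f8667efb 0025e0d8 01541ec6 00000000 i8042prt!I8xProcessCrashDump+0x235",
    "80541ec0 804ebb04 820d6d98 8225e020 0001001a i8042prt!I8042KeyboardInterruptService+0x21c",
]
frames = parse_stack_text(stack)
print(frames[0])   # the last call before the blue screen
print(frames[-1])  # the earliest call in this excerpt
```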

Finally, if you get something that looks like this:
Code:
*********************
*                   *
* Bugcheck Analysis *
*                   *
*********************

Use !analyze -v to get detailed debugging information.

BugCheck E2, {0, 0, 0, 0}

***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*** ERROR: Module load completed but symbols could not be loaded for nv4_mini.sys
Probably caused by : i8042prt.sys ( i8042prt+2681 )

Followup: MachineOwner
---------
Then that means that your symbol path or image path is wrong, and your analysis may be incomplete or incorrect. Notice how in this case you don't get any function names due to the symbols not being loaded. Some drivers may not have symbols available, so there is a chance you could get the second error even if all your paths are right. But if you get the error about kernel symbols being wrong, then you definitely need to fix your symbol path.
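If you save the debugger output to a file, a quick check like this Python sketch can flag both kinds of symbol trouble before you trust the analysis. It only looks for the two messages shown above; the function name is made up.

```python
import re

MODULE_ERROR = re.compile(r"symbols could not be loaded for (\S+)")


def check_symbol_warnings(output):
    """Report whether kernel symbols are wrong and which modules lack symbols."""
    return {
        # This one means the symbol path itself needs fixing.
        "kernel_symbols_wrong": "Kernel symbols are WRONG" in output,
        # These can show up even with correct paths if a driver has no symbols.
        "modules_without_symbols": MODULE_ERROR.findall(output),
    }


sample = """\
***** Kernel symbols are WRONG. Please fix symbols to do analysis.

*** ERROR: Module load completed but symbols could not be loaded for nv4_mini.sys
Probably caused by : i8042prt.sys ( i8042prt+2681 )
"""
report = check_symbol_warnings(sample)
print(report["kernel_symbols_wrong"], report["modules_without_symbols"])
```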

So that is today's lesson on troubleshooting BSODs. See, they aren't as mysterious as they look.

Last edited by Daniel ~; 4th December, 2002 at 09:24 PM.