Coffeehouse Thread

23 posts

I hate computers...

  • User profile image
    Sven Groot

    Sometimes, something happens on a computer, that just cannot be explained, no matter how much you know of computers. Take a look at this example.

    I've been short on hard disk space for a while now, so today I figured, let's buy an additional hard drive. My plan was to install the new drive, copy my data to it, repartition the original drive and reinstall Windows (I also wanted to increase the size of my system partition, and this is the safest way to do it).

    So I got the drive, installed it, booted Windows, and WHAM, it crashes. Reboot, seems to last a little longer, and WHAM, it crashes. Reboot again, by now I've got the new drive partitioned, start copying files, and after a few minutes, WHAM it crashes. There's no pattern to it, even the STOP error given isn't the same one all the time (although PAGE_FAULT_IN_NON_PAGED_AREA and DRIVER_IRQL_NOT_LESS_OR_EQUAL were most common).

    So I unhook the new drive. Crashes remain. I switch all kinds of stuff around with my hardware, crashes remain. I revalidate connections, jumper settings, everything, try removing and reinstalling drivers, crashes remain. Another strange thing I notice is that sometimes the old drive doesn't initialize fast enough to be detected by the BIOS, which it's never done before.

    I reconnect the new drive, but this time as master, and attempt to install Windows XP on it. It crashes during hardware detection. Remove the old drive from the system, continue setup, still crashes during hardware detection. Start setup from the beginning again, no crash in hardware setup, but it crashes later during the setup.

    Unhook new drive, rehook old drive, try Windows again, still crashes. Unhook CD and DVD drives (I'm getting desperate now), still crashes. Boot Knoppix from CD, that too crashes. Unhook all hard drives from the system, boot Knoppix, it still crashes!

    By now I'm certain that I must have damaged some hardware part while installing the drive, but because of the random nature of the crashes, I can't figure out what.

    I keep trying Knoppix, it keeps crashing. I try again, but this time I try doing nothing. It doesn't crash! Since I used Mozilla on the previous attempts, I decide to try something not related to browsing. In g++, I test a little ISO C++ conformance bug I had found in Visual C++ (namely friend name injection; it turns out g++ has the bug too), no crash.

    I start using the browser again, no crash. I use the system for more than 40 minutes, without it crashing, while previously, for hours and hours on end of trying, anything over 3 minutes was a miracle.

    I hook up the old drive again, boot Windows. I start browsing, watch the video on Nomad, type this lengthy post, etc. all in all I've been working for about 45 minutes now, and still no crash!

    So an inexplicable problem that appeared for no good reason, with no identifiable cause and no symptoms besides random crashes, now seems to have disappeared again just as inexplicably! Or so it seems. It could just as well crash again a few seconds from now. But the fact is, it's been running now for longer than it has been able to all afternoon.

    So now I'm in doubt. Do I retry connecting the new drive? It seems doubtful that it was the problem, but perhaps there's something wrong with the controller, so that it doesn't want two devices on the primary IDE channel? But then why would the problem persist even with the original situation restored, and then vanish randomly? I can't explain it anymore! I paid €65 for a hard drive which I can't really tie to the problem anymore except that it started when I hooked it up (but didn't go away until hours after removing it again), so I simply don't know if trying to hook it up again is safe. If I hook it up again and it crashes, does that mean the crash is caused by the new drive, or does it mean this crash-free session was a fluke and the problem hasn't gone away?

    In other words, I hate computers!

  • User profile image

    Get the hard disk replaced. Check for bad sectors. If they exist replace it right now.

  • User profile image

    Did you even read what he said?

  • User profile image
    Sven Groot

    It just crashed again. :(

    I had installed the Windows Command Shell Preview, and it crashed as soon as I clicked the icon in the start menu (exactly simultaneous with my click, so I doubt it has anything to do with that) and it was gone again. Even though I've set it not to restart on STOP errors, it restarted anyway.

    Now I have no idea if this was a symptom of the same problem, an aftershock, a random crash... I've been messing with the IDE drivers so much, and Windows had so many sudden crashes while the problem persisted, this may just be an effect of the damage that may have caused... I simply don't know.
    I'm not back to the near-instant crashes again, otherwise I would not've been able to finish this post, it wouldn't give me that much time.

    So now I'm even more confused than before.

  • User profile image

    PSU ?

  • User profile image
    Sven Groot

    I don't think so. That wouldn't cause an actual software crash. You see, even though Windows rebooted in spite of me telling it not to, there's an actual Save Dump event in the event log.

    If the PSU decided to trigger a hardware reboot, that wouldn't have happened.

    In Knoppix, it would just hang completely, not rebooting or giving any kind of helpful message as to what had happened at all (and yes, I did try if I could still access another virtual console, but I couldn't).

  • User profile image
    Sven Groot

    Online Crash Analysis on that last crash told me it was caused by memory corruption. All the previous crashes were attributed to a device driver.

    If I did damage a piece of hardware while installing the drive, the memory DIMM isn't even an unlikely candidate. I'll try running WinDiag later on.

    For now I've just tried reinstalling the SiS IDE drivers (which were left in a bit of a half-installed state due to my meddling), to see if that makes a difference.

  • User profile image

    If the PSU wasn't supplying power to the devices correctly, or only intermittently, it could be causing the problems you describe. It could also be memory, as you said. Test memory first.

    Download Windows Memory Diagnostic

  • User profile image

    Well, I am convinced that computers are made by human beings and these creatures constantly add to some historical record written to a hard drive that may crash---but it is often erased and reformatted.

    So I depend on the "historians" to tell me what a reliable computer system is before I try to build a new computer---or add new parts to my computer. I used to depend on PC Magazine but these guys are too commercial now which means they represent a virtual person we call a corporation (so they write articles for virtual people) while the folks of ArsTechnica (for the time being) are human beings who may or may not be incorporated.

    So my question (that I am sure you are not wont to answer this late in the game) to you is, "What brand of hard drive and motherboard are you using?"

  • User profile image
    Sven Groot

    WinDiag completed all tests without problems.

    The computer, which is now running in the same setup as it has been for almost a year without any trouble, has an Asus motherboard (SiS chipset), and both the new and the old HDD are Maxtor DiamondMax Plus 9 drives, both 80GB. I've been using Maxtor drives in systems for as long as I can remember, and have never had any problems.

  • User profile image

    I hate to admit it, but I actually installed the connection into an IDE drive upside down once (the lug was missing on the cable) but the computer survived fine. Modern PCs are pretty tough.

    I used to have a similar problem with my old computer on hot days. The system would just reboot and I think it was just the CPU overheating (it can get hot down here... 108F last week). The case was sitting on the floor and the carpet was blocking the cool air getting in the bottom. Anyway, perhaps your CPU is overheating. You should be able to find a utility that tells you your CPU temp, so you can see if there is any connection between the crashes and the CPU temp. Check your CPU fan.

    An oldie, but goodie, is to wiggle (or press down on) any loose chips on the board (like the memory, etc). It only takes one bad connection to one bit on one track of the memory and you could get a parity error.

    My other suggestion would be to replace the power supply. They have caused me trouble in the past as well.

    I'm not sure if any of these are the actual problem, but I'm just trying to help, as I know how frustrating this type of thing can be. Sadly, the fact that everything worked fine in the past does not matter... everything works fine until it breaks :)

    Best of luck!!

  • User profile image
    Sven Groot

    MrJelly wrote:
    I hate to admit it, but I actually installed the connection into an IDE drive upside down once (the lug was missing on the cable) but the computer survived fine. Modern PCs are pretty tough.

    I actually once accidentally pulled the soundcard out of its PCI slot while the computer was on, it crashed but otherwise survived fine (that was not on this system though). :)

    Asus PC Probe says my current CPU temperature is 47 Celsius, that's about normal. I actually checked it a few days ago too by coincidence, and it was even a bit higher then.

    CPU Fan is running at 3200 rpm, same as always.

    Motherboard temperature is 29 Celsius, chassis fan is running at 2500 rpm, nothing out of the ordinary there. PSU fan isn't monitored, but I can see it's running normally through the window in my computer case.

    Voltage levels are all normal.

    I'll keep PC Probe running in the background for a while. Should any anomalous readings occur before a crash (if it crashes again), it'll warn me about them. Provided they happen sufficiently far before a crash of course, but if it's the CPU overheating it should go gradually. I don't expect to catch PSU problems this way though.
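
    A minimal sketch of the kind of threshold check described above, using the readings quoted in this post. The limits and the names `LIMITS` and `out_of_range` are hypothetical and chosen only for illustration; they are not PC Probe's actual defaults or API.

```python
# Hypothetical limits for illustration only; they are NOT PC Probe defaults.
LIMITS = {
    "cpu_temp_c": (0, 70),
    "board_temp_c": (0, 50),
    "cpu_fan_rpm": (2000, 6000),
    "chassis_fan_rpm": (1500, 6000),
}


def out_of_range(readings):
    """Return the names of readings that fall outside their allowed range."""
    alerts = []
    for name, value in readings.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            alerts.append(name)
    return alerts


# The readings quoted in the post above.
current = {
    "cpu_temp_c": 47,
    "board_temp_c": 29,
    "cpu_fan_rpm": 3200,
    "chassis_fan_rpm": 2500,
}
print(out_of_range(current))
```

    A check like this only helps if the anomaly shows up some time before the crash, which is why it would probably catch gradual overheating but not a sudden PSU glitch.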

  • User profile image
    Sven Groot

    Well, I'm gonna call it a night (it's 1:43 AM here now). No more crashes so far, let's pray everything will work tomorrow.

  • User profile image

    Is this either a PCChips or ECS motherboard with an SiS 735 chipset? I had the exact problem you describe.

    I bought a new hard drive. The computer began acting funny a week later. It went downhill. I plugged the HD into my Dad's computer and it had no problems (Identical Motherboard). When I put it back into my computer, it was worse yet. It couldn't find ANY boot devices.

    I ordered a new motherboard with an nForce2 chipset from NewEgg. Two days before it arrived, my old motherboard started working again.

    I put the new one in anyway, though, for reliability's sake. It's faster, too.

    BTW, AMD Athlon 1.0GHz, 256MB DDR.

  • User profile image
    Sven Groot

    No, it is not. For the sake of completeness, here are my full system specs:

    Asus P4S8X-X mb, SiS 648 chipset.
    Pentium 4 2667MHz, 533MHz FSB (no hyperthreading)
    512MB PC2700 DDR-RAM in one bank
    Maxtor 6Y080P0 80GB UDMA-6 HDD (old drive)
    (Maxtor 6Y080L0 80GB UDMA-6 HDD (new drive, currently not connected))
    ATI Radeon 9200SE
    Creative Labs Sound Blaster Audigy2
    Leadtek TV2000XP Expert (TV-card)
    NEC DV-5800C DVD-drive
    NEC NR-9300A CD-R/W drive
    Microsoft Wireless Optical Desktop

    So far, so good today. 21 minutes uptime and counting.

  • User profile image
    Sven Groot

    And it crashed again, after 33 minutes. The bugcheck indicates it was 0x1000007f (UNEXPECTED_KERNEL_MODE_TRAP_M), which is the same as what happened yesterday, and according to MSDN the same as bugcheck 0x7f (UNEXPECTED_KERNEL_MODE_TRAP). The trap code 0x8 indicates a double fault:

    0x00000008, or Double Fault, is when an exception occurs while trying to call the handler for a prior exception. Normally, the two exceptions can be handled serially. However, there are several exceptions that cannot be handled serially, and in this situation the processor signals a double fault. There are two common causes of a double fault:

    1. A kernel stack overflow. This occurs when a guard page is hit, and then the kernel tries to push a trap frame. Since there is no stack left, a stack overflow results, causing the double fault. If you suspect this has occurred, use the !thread debugger extension to determine the stack limits, and then use the kb (Display Stack Backtrace) debugger command with a large parameter (for example, kb 100) to display the full stack.
    2. A hardware problem.

    I'm betting on number 2. The article also remarks that memory is an especially likely culprit. And the more I think about it, the more this seems like a memory problem. However, that does not explain why adding a disk drive caused the problem (unless I did something while hooking it up that damaged some other component) or why WinDiag can't find the problem.

    In any case, I have just removed and reseated the RAM chip. I don't expect it to make any difference, but you never know.
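
    The relationship between the two bugcheck codes and the trap code can be sketched as follows. This is an illustration, not anything taken from the debugger: it assumes the `_M` variant is simply the base code with the `0x10000000` bit set (which is what the two codes quoted above suggest), the names `VARIANT_FLAG`, `base_bugcheck` and `describe_trap` are made up for this example, and the trap table is only a partial list of x86 exception numbers.

```python
# Assumption: the "_M" variant of a bugcheck is the base code with this bit set,
# as 0x1000007f vs 0x7f in the post above suggests.
VARIANT_FLAG = 0x10000000

# A few x86 processor exception numbers; deliberately not a complete list.
TRAP_NAMES = {
    0x0: "Divide Error",
    0x6: "Invalid Opcode",
    0x8: "Double Fault",
    0xD: "General Protection Fault",
    0xE: "Page Fault",
}


def base_bugcheck(code):
    """Clear the variant bit so both forms map to the same base bugcheck."""
    return code & ~VARIANT_FLAG


def describe_trap(trap):
    """Look up a readable name for a trap number, if this sketch knows it."""
    return TRAP_NAMES.get(trap, "unknown trap 0x%X" % trap)


print(hex(base_bugcheck(0x1000007F)), describe_trap(0x8))
```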

  • User profile image

    Sven Groot wrote:
    And it crashed again, after 33 minutes.

    Spooky given what happens in the first episode of Battlestar Galactica Monday night (not a spoiler if you are still waiting in the US - it's the title of the episode).

  • User profile image
    Sven Groot

    2 hours 20 minutes, still no crash. Could this ridiculously simple action (reseating the RAM) have finally solved this problem? I'm not getting too optimistic yet, but this seems a good sign anyway. If it succeeds in running without crashing for the remainder of the day, then I'll start thinking about reconnecting the new hard drive again.
