Today I discovered yet another reason why I prefer running Linux servers instead of Windows servers…
Hardware Failures
Over the past week, I noticed that the Decatur server had been rebooting for no apparent reason at least once, and sometimes twice, a day. My first thought was the power supply.
I changed out the power supply with a new one – and the server rebooted again for no reason.
Next I checked the memory using the memtest86+ memory tester. The server has two 2 GB G.Skill DDR2-800 memory modules, and the motherboard has two banks. I ran the test, and at about 2% (roughly five minutes in) the server rebooted again!
So I thought it was possibly the memory. I took out one of the memory modules, ran memtest86+ again, and the test fully completed an hour later.
I shut the server down again and put the other module back into bank 2. I re-ran the test, thinking the earlier reboot might have been a fluke, and again the server restarted at about 2%.
This time I took out the module in bank 1, the one that had previously tested just fine, so only the untested module was left, in bank 2. I turned the server on, and it wouldn't even POST.
Oh boy, I thought, I've got some bad memory. So I turned the server off and moved the module from bank 2 into bank 1, this time figuring I would check the memory controller on the motherboard. I turned the server on, and again it wouldn't even POST.
I took that module out of bank 1, set it off to the side, and put the originally tested module back into bank 1. This setup had worked before, and yet the server still wouldn't POST.
Luckily, a few months back I upgraded our home computer and had a full motherboard/processor/memory combo in storage. I took it out, got it all set up, hooked in the server's hard drive and CD-ROM, and poof... off and running. Before I booted from the server hard drive, I booted off the CD and made a full image of the server hard drive, just in case the new motherboard/processor/memory combo caused problems.
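For anyone curious what "a full image" amounts to, here is a minimal sketch in Python of a byte-for-byte copy with a checksum. This is not the tool I used from the CD; the function name `image_disk` and the paths in the usage note are made up for illustration.

```python
import hashlib


def image_disk(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source_path to image_path byte for byte.

    Returns (bytes_copied, sha256_hex) so the image can be verified later.
    Hypothetical helper for illustration, not the tool used in the post.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)  # read in chunks to keep memory use small
            if not chunk:
                break  # end of the source reached
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

On a real system the source would be a block device such as /dev/sda (an assumed name), read as root from a live CD so the disk is not mounted, with the image written to a different drive.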
I rebooted after making the image, and voilà! The Linux server kicked right off, needing no additional drivers or anything else loaded. I did have to go in and fix the network card's interface name: because the new card has a different MAC address, it was showing up as eth1 with no configuration. Easily done. I just opened /etc/udev/rules.d/70-persistent-net.rules, removed the line for the old network card, and changed "eth1" to "eth0" at the end of the new card's line.
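I made that edit by hand, but the same change can be sketched in Python. This is only an illustration: the helper name `rewrite_net_rules` is made up, both MAC addresses are passed in by the caller, and it assumes the usual rule layout where each card's line contains ATTR{address}=="..." and ends with NAME="ethN".

```python
import re


def rewrite_net_rules(rules_text, old_mac, new_mac, wanted_name="eth0"):
    """Drop the rule for old_mac and rename the rule for new_mac.

    Hypothetical helper; assumes one rule per line, with the card's MAC
    address somewhere on the line and NAME="ethN" as the interface name.
    """
    kept = []
    for line in rules_text.splitlines():
        is_rule = not line.lstrip().startswith("#")  # leave comments alone
        lowered = line.lower()
        if is_rule and old_mac.lower() in lowered:
            continue  # remove the old network card's rule
        if is_rule and new_mac.lower() in lowered:
            # point the new card at the wanted interface name
            line = re.sub(r'NAME="[^"]*"', 'NAME="%s"' % wanted_name, line)
        kept.append(line)
    return "\n".join(kept) + "\n"
```

To use it, read the rules file into a string, call the function with the two MAC addresses, and write the result back after saving a backup copy. The comment line above the old card's rule is left in place, which is harmless.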
Much easier than a Windows server, where you might have faced re-activation, new drivers to download and configure, and who knows what other potential problems.
The Decatur server's motherboard is only about 1.5 years old, and I am appalled that the memory controller has gone bad. I did notice one capacitor that was raised and looked like it was just beginning to leak, so I wonder if it is the culprit. The original board was an ECS AMD69GM-M2. I just ordered a new JetWay JM26GT4-LF motherboard today. I got it for a steal at $35 including shipping. Otherwise, I was contemplating leaving the system as-is with the AMD XP3200+ 64-bit processor and 1 GB of memory. The other processor/memory was an AMD 64 X2 2.2 GHz and 4 GB of memory.