Saturday, September 15, 2007

Overheating woes

Beast seemed to be getting a bit too slow for my liking, and I had an unused Athlon64 3800 CPU sitting around at school, so I thought I'd try it out. After swapping out the old A64 3000, I ran some benchmarks compiling some code. The first time around, compile time dropped from 13 seconds to 10, but oddly, subsequent runs got slower and slower. Eventually it took 25 seconds to compile the same code, and then I got a kernel panic (Caps and Scroll lock blinking). Odd, I thought... why would a relatively small bump in CPU speed raise the temps so much? So I went to swap the old 3000 CPU back in, and found the heatsink was so hot I couldn't hold on to it for more than a few seconds. Weird.

With the 3000 back in, I monitored temps and they were high but not dangerously so (55-60 C under load). I thought the problem might be improper heatsink grease application (my tube was almost empty), so I bought a tube of Arctic Silver 5 from NewEgg. While I was there, I blew some more cash on extra RAM and, thinking the newer 90 nm process would help, an A64 X2 4200 (plus they are the last dual core CPUs my motherboard supports).

Turned out the X2 had the same problem: temps went all the way to 90 C (the thermal limit of the CPU) and I got random kernel panics. In desperation, I thought the MSI Neo4 motherboard wasn't sending the right Vcore to the CPU, so I searched forums and updated the BIOS... no good. Finally, when I was putting the 3000 CPU back in, I noticed I couldn't see the fins on the heatsink through the fan blades, so I unscrewed the fan from the heatsink. What I saw next almost made me sick:

Heatsink before cleaning

Three years of almost non-stop operation (save for the 1 month I went to India) and the fan had blown a huge layer of crud over the heatsink fins, completely covering them up, meaning zero airflow. Yikes!

I disassembled the heatsink (it's a fairly decent Kingwin all-copper heatsink. Nice and heavy!), used tweezers to remove the layers of dust bunnies, blew out as much dust as I could, then washed it in the sink. The result was a nice, shiny copper heatsink. This is what it looked like with everything but the fan back in place:

Heatsink after cleaning

After reassembling the whole thing, idle temps were around 33-35 C (my room isn't air conditioned), and under load they went up to maybe 48 C. Much better.

Makes me wonder... many people complain that when they buy a new computer, everything works well and things are fast. Then later, despite reinstalling operating systems and so on, things are unstable and prone to lock up. I wonder if accumulated dust may be the reason. Modern CPUs consume a lot of power, and nearly all of it ends up as heat that has to be carried away. When the heatsink can't keep up, CPUs are designed to throttle back their speed to protect themselves, and in extreme cases, shut off completely.

One way to detect such problems early on is to use hardware monitoring programs. Linux users can configure and set up lm_sensors (the sensors-detect script makes this easy), then run monitoring programs like gkrellm or the GNOME sensors applet. Windows users can run SpeedFan or MBM5. Watch the CPU temps for signs of abnormal behavior, especially under load.
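
If you'd rather script the check than watch an applet, here is a rough sketch in Python. It is not how gkrellm or lm_sensors do it. It assumes your kernel's sensor driver exposes readings as temp*_input files (in millidegrees Celsius) under /sys/class/hwmon, which is the usual sysfs layout but may differ on your machine. The function names and the 70 C warning threshold are my own picks for illustration, so check your CPU's rated limit before trusting that number.

```python
import glob
import os

# Assumed warning threshold in degrees C; not a value from any CPU datasheet.
WARN_CELSIUS = 70.0


def read_temps(root="/sys/class/hwmon"):
    """Return {"hwmonN/tempM": degrees C} for each temp*_input file under root."""
    temps = {}
    pattern = os.path.join(root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                millideg = int(f.read().strip())
        except (OSError, ValueError):
            continue  # sensor unreadable or not a number; skip it
        chip = os.path.basename(os.path.dirname(path))
        sensor = os.path.basename(path)[: -len("_input")]
        temps[chip + "/" + sensor] = millideg / 1000.0
    return temps


def hot_sensors(temps, limit=WARN_CELSIUS):
    """Return only the readings at or above the limit."""
    return {name: t for name, t in temps.items() if t >= limit}


if __name__ == "__main__":
    readings = read_temps()
    too_hot = hot_sensors(readings)
    for name, t in readings.items():
        flag = "  <-- HOT" if name in too_hot else ""
        print("%s: %.1f C%s" % (name, t, flag))
```

Run it once at idle and once in the middle of a long compile. If the load number keeps climbing from run to run instead of leveling off, it's probably time to take the fan off and look at the fins.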