November 2018

UCS C240 M3

I purchased a Cisco UCS C240 M3 on eBay for $195. The unit came equipped with dual E5-2609s, 8 GB of RAM, heatsinks, and four drive trays. Initially the fans were too loud, so I upgraded the firmware using the Host Upgrade Utility.

This allowed me to bring the fans down to a very reasonable level. However, after running LINPACK on the machine, I noticed that it was rebooting after being under heavy load. I checked the RAM and everything else I could, but the same thing happened. None of the capacitors on the motherboard appear swollen or leaky.

I tried replacing the processors. Since the unit wasn't enclosed in antistatic wrap during shipping, I figured the processors may have experienced some static discharge, rendering them unreliable.

Before I changed the processors, I was getting errors in CIMC like "PVCCP_P1: Processor 1 voltage is lower critical". I am still getting reboots after the server has been under heavy load. Now it happens not while the processors are stressed, but after the process has been killed and the machine has idled for a while.

Running Geekbench is fine; running LINPACK is too much. I think I have no choice but to return this server, which is a bummer because I'm going to be out my shipping costs.


Switching the processor energy profile to high performance has prevented failures. The longest uptime so far has been 8 hours, and I'm quite confident it can stay running for longer now. I wanted to work up to something longer gradually, since a system failure results in the fans hitting 100%, which can be quite annoying in the middle of the night.
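
One way to keep track of how long the machine survives between reboots is a small heartbeat logger. The following is only a rough sketch of that idea, not something taken from CIMC or any Cisco tooling: the file name heartbeat.log, the 60-second interval, and the function names are all made up for illustration. It writes a timestamp to disk once a minute, and a gap in the timestamps marks where the machine went down.

```python
import os
import time
from pathlib import Path


def append_heartbeat(log_path, now=None):
    """Append one Unix timestamp per line and force it to disk.

    The fsync matters here: an unexpected reboot would otherwise
    lose whatever was still sitting in the write cache.
    """
    stamp = time.time() if now is None else now
    with open(log_path, "a") as f:
        f.write(f"{stamp:.0f}\n")
        f.flush()
        os.fsync(f.fileno())


def read_stamps(log_path):
    """Read the timestamps back as a list of floats."""
    return [float(line) for line in Path(log_path).read_text().split()]


def uptime_runs(stamps, interval=60, slack=2.0):
    """Return the length in seconds of each uninterrupted run.

    A gap longer than interval * slack between two heartbeats is
    treated as a reboot (an assumption; a stalled logger would look
    the same).
    """
    runs = []
    start = prev = None
    for t in sorted(stamps):
        if start is None:
            start = prev = t
        elif t - prev > interval * slack:
            runs.append(prev - start)
            start = prev = t
        else:
            prev = t
    if start is not None:
        runs.append(prev - start)
    return runs


if __name__ == "__main__":
    log = Path("heartbeat.log")  # hypothetical file name
    while True:
        append_heartbeat(log)
        time.sleep(60)
```

Left running in the background during a stress test, this makes it possible to call max(uptime_runs(read_stamps("heartbeat.log"))) afterwards and see the longest stretch the server stayed up, without having to be awake when it fails.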

December update

Unfortunately, the problems resurfaced after I attempted to fill all the RAM slots on processor 0. Stepping back down to only 4 DIMMs fixed the issue. I still suspect a problem with the power supply. The seller refunded me $60 because of the situation, and I hope to use that money to buy a new PSU for this machine.