I'm putting this here in case anyone else comes across something similar and needs ideas. While I'm not yet 100% sure I've completely fixed it, I am about 90% sure, so it should probably be documented.
Some time ago I got a new 40 inch 4K monitor, as previously mentioned. It is amazingly good, and indeed I ended up buying another one so I have a matched pair, which is very nice indeed.
To drive these things I needed a card with two DisplayPort outputs and picked up a second-hand AMD Radeon 7870 card, which has the required ports as miniDP, along with a dual-link DVI port and an HDMI one, so it can actually run five monitors. The motherboard I had, a Gigabyte one, also had two more ports, DVI and HDMI, so I could if necessary run 7 of the things, although only two 4K ones at full resolution and 60Hz.
The machine was nothing special: the aforementioned Gigabyte motherboard, which I'd originally picked because it had three PCI-E graphics card slots, which were required for the three dual-port cards I was running at the time, and a PCI slot for my 8-port RS232 card. In the machine was an Intel i5 processor, 16GB of RAM, a 1.5TB Samsung HDD, an 840 EVO 250GB SSD, and a 1kW PSU. It was fast enough for my purposes, reliable, and fairly quiet. It was running Windows 7 as well.
The problems arose about two months after I got the first monitor. The machine suddenly became reluctant to start. Normally I hibernate the thing when I stop using it, because I have so many programs in operation that even with the SSD restart times are annoying due to needing to reload everything. It had been working fine like that for a couple of years. Yet, for no obvious reason, it wouldn't start up when you pressed the power button.
It would bring up all the fans, ponder life for a few seconds, beep, have another good think, then go dead right down to the power light going out. Then, after a couple of seconds, it would jump back into life and do the same thing. This could repeat anywhere from a dozen times to indefinitely.
Sometimes hitting the reset button was enough to make it sort its life out and get on with it, but more often it wouldn't do anything useful. Turning it off and back on, again, might sometimes jolt it into action, but again, not always. Turning it off at the wall and waiting for thirty seconds, though, pretty much always made it work.
The problem was made more annoying by the fact that it didn't always display this behaviour, only about eight times out of ten. Sometimes it would work fine. And it never did it unless it had been off for several hours, i.e. overnight, which made testing the various possibilities very time consuming.
I tried everything I could think of to no avail and ended up living with it for a couple of months. Then, one day, right in the middle of a major project of course, it died completely. No way to get it to come up at all.
The end result was I had to put together a new machine in a hurry. The motherboard was a socket 1155 one, which by now (August this year) was sufficiently obsolete that finding a replacement was difficult. I had to bite the bullet and buy not only a new motherboard (a socket 1150 Gigabyte Z97X-Gaming 5), but a new processor and RAM to match. I took the opportunity to increase the RAM to 32GB and stick in another SSD, a Samsung 850 EVO, to avoid the issues that the 840 EVO had, which is another story.
I also replaced the PSU, on the basis I didn't trust it, since the problems smelled of a power fault of some sort. It seems to test OK but even so it was worth a try. I retained the large CPU cooler, the USB3 extension adapter, the case, the graphics card, and the serial card, but everything else got replaced.
Once I'd cloned the old SSD onto the new one, booted in safe mode and uninstalled all the hardware drivers, and rebooted a few times with new drivers being added where necessary, I had the machine back in working order with all my data and programs in place.
Life was good.
The machine is faster, just as stable, even quieter, and everything is working nicely. I got the project finished, which ironically provided exactly the right amount of income to cover the machine needed to do it, and was able to continue working.
Right up to the point that the new machine developed the exact same symptoms...
Press power button, fans come on, beep, fans go off, pause, power cycle, repeat. Over and over again.
Turn it off at the wall, wait thirty seconds, turn it back on, it works.
A month later it died completely.
This time I took it completely to pieces and tested the hell out of everything. By the time I'd finished I could prove it did the same thing with a motherboard and processor only, nothing else at all connected. One of them was clearly bad. Replacing the motherboard was cheapest and also most likely to solve it, so I bought an identical one and put the processor and ram in it. It worked fine.
OK, bad motherboard, coincidental fault. The bad board did the same thing with three different power supplies, the power supply out of the machine worked fine on different hardware, so I put it all back together, leaving the serial card out just in case, and got back to work. Amazon replaced the faulty board, so I now had a spare.
A week later...
You guessed it, exact same initial symptoms.
Clearly something else is wrong.
Replacing the PSU with a known good one, the larger cooler with the stock one, and disconnecting absolutely everything not immediately required didn't help. I even tried taking the UPS I have connected to the machine out just in case there was some esoteric AC issue in play, with no luck.
By this point I'm wondering how the case is eating motherboards. Last night I spent hours poking around on Google looking for possible similar problems, which I've done several times before to no effect, and finally found something that might explain it. But it's not at all obvious.
It's basically all about DisplayPort cables, I suspect. It turns out that there is a weird gotcha with the damn things, which is that a VESA-spec cable should NOT connect pin 20 from the sink (the monitor) to the source (the PC). Most cables don't. However, some cheap ones do, presumably because whoever made them didn't read the spec and just, fairly logically, assumed you connected all the pins through.
The problem is that both the sink and the source are supposed to provide 3.3V at 500mA on pin 20. I assume this is probably meant to drive an active cable or something similar, but if it's connected straight through, what ends up happening is that the monitor is trying to power the PC's 3.3V rail, which is never going to end well.
I checked with a meter and sure enough, one of the two cables was indeed providing 3.3V on pin 20 at the PC end when it was unplugged from the PC itself.
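If you have a pile of cables to check and want to keep a note of the results, the meter test above can be written down as a few lines of Python. This is only a rough sketch: the function names, the 0.5V cut-off and the example readings are my own illustrative assumptions, not anything from the VESA spec. The only figures taken from the description above are 3.3V and 500mA.

```python
# Rough sketch for logging pin 20 meter readings taken at the PC end of a
# DisplayPort cable, with the monitor on and the cable unplugged from the PC.
# The names and the 0.5 V threshold are illustrative assumptions.

DP_PWR_NOMINAL_V = 3.3  # voltage quoted above for pin 20
DP_PWR_MAX_A = 0.5      # current quoted above for pin 20 (500 mA)


def cable_passes_pin20(measured_volts, threshold_volts=0.5):
    """Return True if the reading suggests pin 20 is wired straight through."""
    return abs(measured_volts) >= threshold_volts


def worst_case_backfeed_watts():
    """Upper bound on the power the monitor could push towards the PC."""
    return DP_PWR_NOMINAL_V * DP_PWR_MAX_A


# Example readings in volts; these values are made up for illustration.
readings = {"cable A": 3.3, "cable B": 0.0}

for name, volts in readings.items():
    verdict = "suspect, replace it" if cable_passes_pin20(volts) else "looks OK"
    print(f"{name}: {volts:.2f} V on pin 20 -> {verdict}")

print(f"Worst-case back-feed: about {worst_case_backfeed_watts():.2f} W")
```

Anything much above zero on that pin at the PC end means the cable is carrying the monitor's supply through, and a watt or two being pushed into a rail that isn't expecting it is plenty to cause trouble over time.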
In my case it seems to be slowly killing something in the power circuitry, quite possibly by reverse-polarising a capacitor or something along those lines. It starts off with this refusal to boot properly, for some reason I'm not entirely sure about, and eventually progresses to a dead motherboard. It's the first actually measurable and testable potential problem that might explain the difficulties I've been having and I'm very hopeful it's the root cause of the issue.
Another indicator that it's doing something is that I noticed when scrabbling around in the dark under the bench that one of the LEDs on the back of the graphics card stays dimly illuminated when that cable is plugged in and the monitor is on, which is proof that the power is in fact going somewhere it shouldn't. Unplugging the cable makes the LED go out.
I replaced the cable with a spare I had, identical to the other one that doesn't exhibit this issue, double-checked there was no voltage present, and turned everything off. This morning when I turned it on, it all worked fine.
It will take some time to be reasonably certain I've fixed it, and there's always the possibility that either the graphics card or the motherboard is now damaged, but early indications are promising.
Anyway, if you have weird problems with a PC and you've got DisplayPort cables connected to it, it might be worth checking them.
pca