
Could do with some second opinions about what might be behind several hard lock-ups/freezes I've been experiencing lately.

Unfortunately, I have no other symptoms to go on than that. The computer simply freezes in the middle of whatever it was doing and I have to hard-reboot. No BSODs, no error messages. Nothing.

So far I've tried running the full memory test suite (the Windows 10 built-in one, not Memtest86), which passed several times with no errors, so I'm pretty certain it's not memory related.

I've run sfc /scannow a bunch of times, and DISM.exe too, and neither detected any corruption.
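For anyone wanting to repeat these checks in one go, here is a minimal sketch in Python. The post does not say which DISM switches were used, so `/Online /Cleanup-Image /ScanHealth` is an assumption, and `run_integrity_checks` is a hypothetical helper name. Both tools need an elevated prompt on Windows.

```python
import subprocess

# Assumed commands: the post names sfc /scannow and DISM.exe but not the DISM
# switches, so /ScanHealth here is a guess at what was run.
INTEGRITY_CHECKS = [
    ["sfc", "/scannow"],
    ["DISM.exe", "/Online", "/Cleanup-Image", "/ScanHealth"],
]


def run_integrity_checks(run=subprocess.run):
    """Run each check in order and return {command string: exit code}."""
    results = {}
    for command in INTEGRITY_CHECKS:
        completed = run(command)
        results[" ".join(command)] = completed.returncode
    return results
```

Calling `run_integrity_checks()` from an administrator session runs both tools back to back; a non-zero exit code is a cue to read that tool's own output.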

Currently, my only other hypothesis is that it may be an overheating issue, though there is some evidence for and against this...

For:

  • Computer does run quite hot in general (GPU temps anyway, CPU seems pretty happy).

  • I've seen GPU temps in-game >80 degrees (I'm running Xfire'd XFX RX480s and they really pump out some heat).

  • I think I've seen hard-locks as a symptom of overheating in the past (though my memory may be wrong).

  • I recently had to move the PC to a more confined space and it seems to be occurring more frequently.

Against:

  • I would expect the GPUs to throttle before simply grinding to a halt, and this supposedly doesn't kick in until about 93 degrees.

  • It happens when sat idle at the desktop occasionally (though much more frequently in game), where the load is minimal and the idle temps are about 53 on the master GPU and 38 on the second card.

  • I updated the Radeon drivers and it doesn't seem to have made a great deal of difference.
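To put the temperature argument in numbers, a rough sketch that compares the readings quoted above with the supposed 93 degree throttle point. The function name and labels are made up for illustration, and the in-game figure is taken as 80, which is only a lower bound.

```python
def throttle_headroom(readings_c, throttle_c=93):
    """Return {label: degrees below the throttle point} for each reading."""
    return {label: throttle_c - temp for label, temp in readings_c.items()}


# Figures quoted in the post; "in-game" uses 80 as a lower bound since the
# post only says ">80".
observed = {"in-game": 80, "idle, master GPU": 53, "idle, second GPU": 38}
headroom = throttle_headroom(observed)
```

A freeze that happens with 40 or more degrees of headroom is hard to pin on GPU temperature alone, which is the crux of the "against" list.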

I think my next step will be to remove one of the GPUs in case it's Xfire itself that's breaking something, or the combined heat from the 2 cards is the culprit, but if anyone has any better suggestions for things to try/ways to get some more diagnostic info, I'm all ears!

Full specs below in case it's useful:

AMD FX-9590 Octacore 4.7-5.0GHz (stock OC) cooled w/ H100i
16GB DDR3 1600MHz CAS 8-8-8-24 Crucial Ballistix
Gigabyte 990FXA-UD5 rev 02 MOBO
2x XFX RX480 Double Dissipation Black Edition GPUs (16x/8x Crossfire)
Kingston HyperX 256 GB OS Boot drive
  • Look at this answer for strategy on isolating the cause of the problem – I say Reinstate Monica Feb 14 '18 at 22:39
  • I’m not really looking for a general strategy, I’m more than happy isolating the issues by trial and error. I’m more on the lookout for any more specific suggestions anyone might have with these particular symptoms. – Joe Healey Feb 14 '18 at 22:43
  • You don't know what's causing the issue. You need to systematically eliminate possibilities. That's exactly what the answer I linked helps you do. – I say Reinstate Monica Feb 14 '18 at 22:57
  • Yes, I realise that. I don't need help eliminating possibilities. I'm looking for specific input about whether the symptoms fit a GPU issue or whether I should look elsewhere. – Joe Healey Feb 14 '18 at 22:59
  • From the info you have gathered, it could be a GPU, but it could just as easily not be. You could try stress testing the GPU to see if that turns anything up. If not, grant me the observation that you only have a guess as to what's wrong. If it comes to a place where you feel that's indeed the case, I suggest the divide & conquer strategy previously suggested. Good luck. – I say Reinstate Monica Feb 14 '18 at 23:06
  • What we know from your description is the problem is not software related. Hard freezes like this are never software related. So you can eliminate any process of trying to update drivers or fix software. The easiest thing to do would be to run Memtest86, then gently tap on the various components inside your computer with the plastic end of a screwdriver. You may find that tapping on the memory or GPU or some other component causes Memtest86 to freeze or generate errors. Reseating devices and power cords can often fix simple bad connections. You should also make sure there is zero overclocking. – Appleoddity Feb 14 '18 at 23:09
  • You didn't mention whether this was a custom-built computer, or whether the problem has occurred since it was new or started more recently. These can be indications of bad memory settings, or incompatible components. – Appleoddity Feb 14 '18 at 23:11
  • It's custom built by me, yes. I started noticing the freezes maybe about 6 months ago, but they were infrequent so I didn't think much of it. They're now occurring multiple times in a single session of use. The fact that the machine recently moved to a more enclosed space does correlate with this too, which is why I mentioned it in the context of the overheating. I'm just not sure a hard lock is consistent with overheating. The computer has been otherwise working flawlessly in its current configuration since I built it about 2 years ago – Joe Healey Feb 14 '18 at 23:19

2 Answers


So after some swapping in and out of components, monitoring hardware statuses and various checks, I pretty much ruled out any hardware issues.

Searching around further on the web and speaking to some friends of mine, it seems many people have been experiencing hard lockups lately, and with increased frequency. There is some suggestion that it is due to the Windows Fall Creators update. (I'm on build 16299.248)

It seemed that this was therefore a software issue.

I made 2 changes in particular that seem to have (touch wood) helped - though much more prolonged testing is needed on my part:

Temp

I emptied the temp folder following the instructions of a thread I found elsewhere. I did this in concert with the step below, so I'm not sure if this explicitly had anything to do with any improvement - it may just be coincidental.

Simply type temp in an open Run box, and empty the contents. Try to quit as many applications as possible first, so that as few files in the folder as possible are in active use.
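If you'd rather script the clean-up than do it by hand, something along these lines should work. It's a sketch only: `empty_folder` is a hypothetical helper, you pass it whichever folder the Run box opened for you, and anything still locked by a running application is skipped instead of forced.

```python
import shutil
from pathlib import Path


def empty_folder(folder):
    """Delete everything inside `folder`, skipping entries that cannot be removed.

    Returns (removed, skipped) counts of top-level entries.
    """
    removed, skipped = 0, 0
    for entry in Path(folder).iterdir():
        try:
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed += 1
        except OSError:
            # Typically a file still held open by a running application.
            skipped += 1
    return removed, skipped
```

The skipped count gives a rough idea of how many files were still in use, which is why closing applications first helps.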

Paging

I suspect this is where the key lies, and given the symptoms it seems a likely cause. On inspecting the paging file size/usage, the file was approximately 2.5GB for my machine, and its size was set to be determined automatically. This seemed on the low side, especially given that my PC has 16GB RAM.

I overruled the automatic settings, and set a custom minimum size of 1000MB (it complains if it's lower than 800MB anyway), and set the maximum to 1.5x my RAM size for now (24000MB) for my C: drive. This is probably excessive, but I'll tweak it later.*

*NB: don't do this if your boot drive doesn't have sufficient space to allow it.

My paging filesize has now increased to a little over 4GB, so I suspect this could well have been the culprit all along.
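The sizing arithmetic above can be sketched as follows. This is only an illustration of the numbers I used: the function names are made up, and the 1000 MB per GB default is there purely to reproduce the round 24000MB figure (pass `mb_per_gb=1024` for the binary equivalent). The second helper covers the free-space caveat in the NB.

```python
import shutil


def pagefile_bounds_mb(ram_gb, multiplier=1.5, minimum_mb=1000, mb_per_gb=1000):
    """Return (initial_mb, maximum_mb) for a custom paging file."""
    maximum_mb = int(ram_gb * mb_per_gb * multiplier)
    return minimum_mb, maximum_mb


def drive_has_room(path, required_mb):
    """True if the drive holding `path` has at least `required_mb` free."""
    free_mb = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mb >= required_mb
```

For a 16GB machine, `pagefile_bounds_mb(16)` gives the 1000MB and 24000MB values from above, and `drive_has_room("C:\\", 24000)` is a quick check that the boot drive can actually hold the maximum.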

0

A follow up to my previous posts, as I think I finally resolved the issue.

I'm posting this as a second answer as my first one did seem to help somewhat, so I don't think it's 'wrong' per se.


There are reports online of Vishera-series CPUs in particular suffering from issues with their intelligent power management (though unfortunately, I can't find the specific link that tipped me off now).

I won't post specific instructions as it no doubt varies by board/BIOS, but since changing the following BIOS settings, I have experienced exactly 0 crashes in several months:

  • Disable C6 Power state
  • Disable Cool & Quiet
  • Disable HPC Mode

I think C6 Power State is the particular culprit, but I changed a few things at once and it's been stable ever since, so I didn't really have the inclination to go back and test them one by one. I'd start with C6, perhaps.

Hope this helps someone else!