
PC freezing - no solution in sight :(

Status
Not open for further replies.

JaseC

gave away the keys to the kingdom.
JaseC - I'm still pretty unclear on what I'm doing with the clock speeds of the CPU using your method. It's basically me setting the clock speeds myself rather than letting the mobo run it at the default settings? So I should set the CPU frequency to maybe 37, how would I work out the voltage to use then? I'm tempted to make this the first step, but I feel like it requires heaps of attempts before I can rule it out as the issue (i.e. fiddling with voltages each time until I decide, fuck it, it's not the CPU voltage).

You should be able to adjust the voltage (vcore) without fiddling with the multiplier. Leave the latter at its default value (which I believe is 35 if you do need to set it manually) and increase the former to something that you'd typically associate with a generous overclock, such as 1.36v.
 

PaNaMa

Banned
1) Try connecting your PC to another power outlet in a different part of your house (on a different circuit).
2) A UPS helped me identify a problem with my system on a particular outlet. Under full gaming load my PC would just lock up. I finally connected it to a UPS, and at peak load the UPS would eventually kick over to battery. Something was happening with the wall outlet: either something was shorting in the socket or I was drawing too much power from it (7970 CrossFire plus a CPU overclock), because that circuit also powered a 1500-watt baseboard heater, lights and whatever else.

Another thing to try is a new power cable. I've seen cords fail or be flaky, so try that too.
Anyway, a UPS and a different outlet on a different circuit fixed my problem. No harm in trying, man.
 
Don't mean to be rude - why would it be a thermal paste problem if the temperatures don't indicate an overheating CPU?

The mobo could be fucked. You applied the paste, so you could have put on too much when the CPU was in place, and it spread too far and touched places it shouldn't.
 

Resilient

Member
Well I guess it's getting to that point where there are about 10-20 different combinations of things I need to try. Looks like this is gonna take at least a month to close out..this is fucked though. Tossing up whether I should take it somewhere to get checked out.

You should be able to adjust the voltage (vcore) without fiddling with the multiplier. Leave the latter at its default value (which I believe is 35 if you do need to set it manually) and increase the former to something that you'd typically associate with a generous overclock, such as 1.36v.

Ok. I'm going to try this last as I want to eliminate the other issues.

1) Try connecting your PC to another power outlet in a different part of your house (on a different circuit).
2) A UPS helped me identify a problem with my system on a particular outlet. Under full gaming load my PC would just lock up. I finally connected it to a UPS, and at peak load the UPS would eventually kick over to battery. Something was happening with the wall outlet: either something was shorting in the socket or I was drawing too much power from it (7970 CrossFire plus a CPU overclock), because that circuit also powered a 1500-watt baseboard heater, lights and whatever else.

Another thing to try is a new power cable. I've seen cords fail or be flaky, so try that too.
Anyway, a UPS and a different outlet on a different circuit fixed my problem. No harm in trying, man.

I'll give it a shot in another room then and see if it happens.
If there is no issue I'll look into the UPS.
I've already switched the cables, problem still appeared.

The mobo could be fucked. You applied the paste, so you could have put on too much when the CPU was in place, and it spread too far and touched places it shouldn't.

I see - fair enough.
 

LilJoka

Member
As some people have said the next thing to try is downclocking.

Make sure the RAM is at 2133 MHz.

For the GPU, use MSI Afterburner (MSI AB) to drop the GPU core and memory clocks by 100 MHz.
Test with heaven again.

If that fails drop the CPU multiplier by one and try again.

If that fails then bump the DRAM voltage to 1.25 V at 2133 MHz.

If this still fails then I would start the RMA process on CPU, motherboard and GPU.
 

Grath

Member
I had a similar (not the exact same, my system would not totally crash, but stop everything for a few minutes, then continue normally again for 10-20 minutes) problem, and it turned out one of my HDDs had some internal errors which were not shown in any diagnostic software. Try to take out the HDDs, leave in only the SSD, and try booting that way. In the OP I didn't see this, maybe it will help you too.
 

Resilient

Member
I just ran chkdsk on my D: drive (HDD, 1TB secondary storage)

Errors detected in sparse file record segment ... (probably 200+ entries here)

Errors found. CHKDSK cannot continue in read-only mode.

is this a something?
 
Well I guess it's getting to that point where there are about 10-20 different combinations of things I need to try. Looks like this is gonna take at least a month to close out..this is fucked though. Tossing up whether I should take it somewhere to get checked out.

Start from the bottom! Jesus Christ why are you making it so hard on yourself.

Remove all devices which are not needed for PC operation and work your way by adding them one by one. You will be able to verify whether your MOBO/CPU/RAM/SSD work correctly with a single config.

You need only 4 configs to test everything: base, base + GPU, base + secondary drives, base + all RAM. What is more, you can do most activities and play games in all of those configs, so you are not wasting your time stress-testing all night long and rendering the PC useless.

Right now you reduce some random settings in BIOS and continue testing all components at once.
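
For anyone following along, the four-config approach can be written down as a tiny checklist script. This is only a rough sketch in Python: the part names, the assumption that "base" means a single RAM stick, and the `is_stable` callback are all made up for illustration (in practice "stable" means you used that config for a few days without a freeze), and nothing here talks to real hardware.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple

# Hypothetical part names; "base" is assumed to be the minimum needed to boot.
BASE: FrozenSet[str] = frozenset({"motherboard", "cpu", "ram stick 1", "ssd"})

# The four configurations from the post, each adding one thing to the base.
CONFIGS: List[Tuple[str, FrozenSet[str]]] = [
    ("base", BASE),
    ("base + GPU", BASE | {"gpu"}),
    ("base + secondary drives", BASE | {"hdd"}),
    ("base + all RAM", BASE | {"ram stick 2"}),
]


def first_unstable_config(
    is_stable: Callable[[FrozenSet[str]], bool],
) -> Optional[str]:
    """Walk the configs in order and return the name of the first one
    that fails, or None if every config ran without a freeze."""
    for name, parts in CONFIGS:
        if not is_stable(parts):
            return name
    return None
```

If `first_unstable_config` returns "base", the fault is in the motherboard, CPU, first RAM stick or SSD; if it returns one of the other names, the part added in that config is the prime suspect.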
 
When my PC was doing this for seemingly no reason, I removed my nvidia graphics card, cleaned it out super well with qtips, put it back and it hasn't happened since. Could've been the cleaning, could've been the reseat, but whatever it was it worked.
 

Sevenfold

Member
I just ran chkdsk on my D: drive (HDD, 1TB secondary storage)

Errors detected in sparse file record segment ... (probably 200+ entries here)

Errors found. CHKDSK cannot continue in read-only mode.

is this a something?

Was that on restart? Try chkdsk /r from an elevated command prompt (right click open as admin)
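
A note on why the first run stopped: plain `chkdsk d:` only scans in read-only mode, which is what the "cannot continue in read-only mode" message refers to. `/r` looks for bad sectors and also implies the `/f` repair pass. A minimal Python sketch of building and running that command follows; the helper name `chkdsk_command` is hypothetical, and it needs an elevated prompt on Windows to actually do anything.

```python
import platform
import subprocess
from typing import List


def chkdsk_command(drive: str, repair: bool = True) -> List[str]:
    """Build the chkdsk argument list for a drive letter such as 'd:'.

    Without /r (or /f) chkdsk only reports problems and changes nothing.
    """
    if len(drive) != 2 or not drive[0].isalpha() or drive[1] != ":":
        raise ValueError("expected a drive letter like 'D:'")
    cmd = ["chkdsk", drive.upper()]
    if repair:
        cmd.append("/r")  # locate bad sectors and recover readable data
    return cmd


if __name__ == "__main__":
    cmd = chkdsk_command("d:")
    if platform.system() == "Windows":
        # Must be started from an elevated (administrator) prompt.
        subprocess.run(cmd, check=False)
    else:
        print("Would run:", " ".join(cmd))
```

A `/r` pass on a 1TB drive can take hours, so it is worth starting it before leaving the PC alone for a while.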
 

Engell

Member
Errors detected in sparse file record segment ... (probably 200+ entries here)?

Shouldn't make your computer crash
But it can be a sign of a defective controller on motherboard or CPU.. can also just be a result of your computer repeatedly crashing.
 
b-b-but PC master race!!!

For real though shit like this is why I stopped PC gaming, I have had such bad luck in the past with hardware that I couldn't be bothered anymore.

I hope you solve the problem OP as I understand how shitty it feels when your expensive PC isn't working properly.
 

Resilient

Member
Gonna do what Evolution of Metal said then report back

However after running chkdsk I took out the HDD and haven't had a crash for 2 hours and 20 minutes..

Also, one other thing I noticed when I took the HDD out and logged in to Windows. Usually, the lock screen says welcome, then it logs in and goes to desktop. USUALLY there is a brief flash of a full blue screen, like, less than half a second. I always thought that was really weird. Then it just loads the desktop. Well, when I took the HDD out, that didn't come up. Is that something? Who knows. W10 is weird.

When my PC was doing this for seemingly no reason, I removed my nvidia graphics card, cleaned it out super well with qtips, put it back and it hasn't happened since. Could've been the cleaning, could've been the reseat, but whatever it was it worked.

when i start removing components i'll do this.

Was that on restart? Try chkdsk /r from an elevated command prompt (right click open as admin)

this was just on start, cmd, run as admin, chkdsk d:

Shouldn't make your computer crash
But it can be a sign of a defective controller on motherboard or CPU.. can also just be a result of your computer repeatedly crashing.

so far so good having taken it out...fingers crossed?

b-b-but PC master race!!!

For real though shit like this is why I stopped PC gaming, I have had such bad luck in the past with hardware that I couldn't be bothered anymore.

I hope you solve the problem OP as I understand how shitty it feels when your expensive PC isn't working properly.

yeah, it stinks, PC master race is most definitely not a thing.
 

Resilient

Member
Can someone make a summary so I can see what the exact problems are and try and help?

When gaming, the PC will crash. It might be 4 hours into a session or it could be 1. But the game just crashes, the screen freezes on the game, and sometimes the audio goes weird (stuttering, crazy distortion sounds). The PC loses functionality (can't open Task Manager, can't alt+tab, can't ctrl alt del) - but the fans remain on - and I need to hold the power button down for 5 seconds and turn it back on again.

Specs are in the OP.

This happened with the same build using my old GPU (7850 OC, Gigabyte, 2GB) and my EVGA 1070 SC.

Steps I've taken:

1. First occurrence - update all drivers. This led to a W10 boot failure/BSOD. Reformatted. Updated all drivers. Updated MOBO BIOS.

2. Second attempt to fix (when it reappeared) - test RAM using memtest86. Tested each stick individually until it achieved 8 "Pass" marks - each stick achieved with 0 errors.

3. Replaced the SSD (now using a Samsung 850 PRO 256GB). Fresh W10 install. Updated all drivers. Didn't update BIOS (BIOS version was too recent (Jan2017) so didn't want to risk it just yet). The error occurred again just now.

Turning off CPU OC - crashed when gaming.
Swapping SATA ports around with SSD/HDD - crashed when gaming - although one time here, the PC just reset in game. Like, bam, restart, login screen.

8 hour prime95 28.10 stress test; no errors found
1 hour Unigine Heaven stress test; crashed, took the PC with it
4 hour Unigine Heaven stress test; crashed, didn't crash the PC
Turning off Geforce Experience - crashed when gaming

Installed new PSU - crashed when gaming

Right now, I've just taken out the HDD (secondary storage) as I ran chkdsk and had a lot of errors (Errors detected in sparse file record segment ... (probably 200+ entries here) Errors found. CHKDSK cannot continue in read-only mode). I've had no crashes for 3 hours in game, which is a good sign, but i have no idea anymore.

The next crash, planning to do as Evolution of Metal outlined above to identify which part is causing the issues.
 

ChryZ

Member
It could be the mainboard, micro crack in or on the PCB, cold=contact, board warms up under use, crack expands, throws spanner in the works, everything freezes up.
 

Resilient

Member
Have you checked event viewer? Are you receiving BSOD when the PC is crashing or is it just hard crashing?

At this point in time it could be anywhere from a driver conflict with a piece of hardware or the hardware itself.

So these crashes are only happening when gaming? The PC didn't crash when running prime 95 solely?

no crash with prime95.

event viewer has never turned up anything other than "the pc shut down unexpectedly" when the crash happens. there is one other error that floats about.

"The application-specific permission settings do not grant Local Activation permission for the COM Server application with CLSID
{8D8F4F83-3594-4F07-8369-FC3C3CAE4919}
and APPID
{F72671A9-012C-4725-9D2F-2A4D32D65169}
to the user NT AUTHORITY\SYSTEM SID (S-1-5-18) from address LocalHost (Using LRPC) running in the application container Unavailable SID (Unavailable). This security permission can be modified using the Component Services administrative tool."

right now with the 2nd HDD out, i've not had a problem with crashing, so im going to continue like this and see if that was the issue.

also, occasionally "breakpoint has been reached" as an error when i shutdown while firefox is still open...not sure if that's relevant to anything, who knows at this stage.
 
Don't mean to be rude - why would it be a thermal paste problem if the temperatures don't indicate an overheating CPU?

http://www.techspot.com/community/topics/cpu-overheating-possible-thermal-paste-problem.178768/

http://www.computing.net/answers/windows-7/cpu-usage-spikes-to-100-almost-randomly/13119.html

http://www.tomshardware.com/forum/352659-28-crazy-temp-spikes-dried-thermal-paste

I have that exact same cooler. My M/B never goes above 45C (currently at 37C) and my CPU is currently running at 51C.

You mentioned large spikes in temperature the moment you run something intensive, which means the heat isn't being dissipated properly.
 

Resilient

Member
http://www.techspot.com/community/topics/cpu-overheating-possible-thermal-paste-problem.178768/

http://www.computing.net/answers/windows-7/cpu-usage-spikes-to-100-almost-randomly/13119.html

http://www.tomshardware.com/forum/352659-28-crazy-temp-spikes-dried-thermal-paste

I have that exact same cooler. My M/B never goes above 45C (currently at 37C) and my CPU is currently running at 51C.

You mentioned large spikes in temperature the moment you run something intensive, which means the heat isn't being dissipated properly.

is the 51 you are getting under load? im usually getting 55-65 when in a game, and i thought that was safe temps from what i had read online for Intel.

so to get this straight: the system isnt shutting down due to the CPU overheating directly, however, due to the high spikes, you think there is potentially a thermal paste issue that may be causing the system to become unstable. that right?

i'll investigate. thank you.
 
is the 51 you are getting under load? im usually getting 55-65 when in a game, and i thought that was safe temps from what i had read online for Intel.

so to get this straight: the system isnt shutting down due to the CPU overheating directly, however, due to the high spikes, you think there is potentially a thermal paste issue that may be causing the system to become unstable. that right?

i'll investigate. thank you.

And keep in mind this is with an overclocked 2600K which has been going for 6 years. The fans are not even going full speed, the m/b automatically shuts off past 45C.

Under load the CPU never goes past 61C. If I had the fans running full speed like when I was in Florida, it would never come close to 71C.

I have personally replaced the paste before in this computer after I bought some bad paste soon after I built the computer. Those spikes should never go that high.
 

LilJoka

Member
is the 51 you are getting under load? im usually getting 55-65 when in a game, and i thought that was safe temps from what i had read online for Intel.

so to get this straight: the system isnt shutting down due to the CPU overheating directly, however, due to the high spikes, you think there is potentially a thermal paste issue that may be causing the system to become unstable. that right?

i'll investigate. thank you.


PC wouldn't shut down even if it hit 100C. It would just downclock.
Unless all cores hit 100C for a sustained period.
 
Downloading a program like Core Temp will allow you to see the temperatures for each core.

Core Temp
http://www.alcpu.com/CoreTemp/


http://www.alcpu.com/forums/viewtopic.php?f=63&t=892
What is TjMax?
Tjunction Max (TjMax) is the maximum temperature the manufacturer has rated their processor at. This value represents the maximum temperature the hottest part of the processor core should not exceed.
This value should not be confused with the TCaseMax rating, which indicates the maximum temperature the top-center of the processor's heatspreader should not exceed.
If your CPU is rated for 100C TjMax, and it was nearing the 100C value in the temperature fields, that is a sign of overheating. The temperature should not exceed this value, or it may cause instability, shorten the life of the CPU and cause massive performance issues.
A rule of thumb dictates that the temperature should be kept around 20C or lower below the TjMax value while under full load.

What is considered to be a safe temperature for my processor?
For processors with the "TjMax" value being shown in Core Temp it is usually considered best to keep the temperature 15-20C below that value when the processor is under full load.
For chips which don't provide a TjMax value, such as the AMD K8 family of chips, it's best to keep the temps under 70C full load.
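
The rule of thumb quoted above is simple arithmetic, so here is a small Python sketch of it. The 100C TjMax default and the 20C margin are taken from the FAQ text; the function name and the three labels are just assumptions for illustration, not anything Core Temp itself reports.

```python
def classify_core_temp(
    temp_c: float, tjmax_c: float = 100.0, margin_c: float = 20.0
) -> str:
    """Compare a full-load core temperature against TjMax.

    headroom = TjMax - temperature. The FAQ suggests keeping roughly
    15-20C of headroom under full load; the stricter 20C is used here.
    """
    headroom = tjmax_c - temp_c
    if headroom <= 0:
        return "at or above TjMax"
    if headroom < margin_c:
        return "inside the safety margin"
    return "ok"


if __name__ == "__main__":
    # 55-65C are the in-game readings reported earlier in the thread.
    for reading in (55, 65, 85, 100):
        print(reading, classify_core_temp(reading))
```

With a 100C TjMax, the 55-65C readings reported in this thread leave 35-45C of headroom, which is comfortably outside the margin.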

I am running an overclocked Sandy Bridge 2600K, and under full load not a single core goes beyond 63 degrees C.
 

Resilient

Member
Well fuck, it happened again gang. This was with everything but the 2nd HDD as I thought it was causing things to go fucky.

Next step: GPU out and both sticks of RAM. If this still crashes I'm cracking open the CPU to check the paste. After that I'm gonna RMA the ... board and or CPU?
 

LilJoka

Member
Well fuck, it happened again gang. This was with everything but the 2nd HDD as I thought it was causing things to go fucky.

Next step: GPU out and both sticks of RAM. If this still crashes I'm cracking open the CPU to check the paste. After that I'm gonna RMA the ... board and or CPU?

Board and cpu.
Did you try downclocking gpu and cpu?
 

Arsenic

Member
hey, I didn't read the entire thread but I too had a problem with the PC I built in mid-2015, where it would randomly freeze in any situation at random times, especially during gaming. It was incredibly frustrating for about a year until I finally got lucky with a solution. I still can't explain it, but it worked, and since summer 2016 I can count the number of freezes I've had on one hand.

I downloaded Driver Booster 3 by IObit, ran it once, it updated my drivers for me, and suddenly my problem disappeared.

I don't know which specific driver was the cause/solution, since I had, I think, as many as 90 updated/replaced at once.

The other weird thing is I tried updating drivers myself for every component but the results weren't the same.

IObit is a bit of a shady company however, so proceed with caution on their other products and on running this thing all the time. But I do credit them with saving my PC build.


EDIT:


I want to add that every Windows update/patch "cured" my PC for a day or two, then the problem would start again. I checked Event Viewer for the common problems, and narrowed it to about three, but following instructions online for disabling certain services to get rid of the issues did not help. Manual driver updates did not work. Isolating components did not work.

I noticed the only component we have in common is the Crucial DDR4 memory sticks. I also assumed they were a problem, but I had no backup sticks to check. Moving them around slots didn't help.
 

Resilient

Member
Ok so bear with me.

Saturday I did the following:
Put the GPU in the 2nd PCIe slot.
The SATA cable behind the GPU originally seemed a bit loose...I reseated this properly.
GPU BIOS updated (sorry I know I should have done this sooner).
Plugged into a different power point (may be on the same line as the other, but what the heck).

Lastly. Googled my mobo (ASUS Z170 Pro Gaming) and the issue. Found a dude who had the exact same issue as me. He changed his RAM XMP profile to the motherboard setting. I had done this once before but I allowed it to OC the CPU automatically (LilJoka may remember this issue a few months back). Rebooted.

Managed to play about 8 hours Saturday without a crash and just logged another 6 today, no issues.

I'm not getting my hopes up, so we will see what happens.

I know I may catch some ire from those who have been helping as I've been diagnosing the issue in an unorthodox way. But I thought i would give these things a try before I started taking things apart. Thanks for your patience.

LilJoka - I haven't started to RMA anything just yet. Gonna wait a little longer in light of the recent developments.
Arsenic - believe it or not this was the first thing I did last June when it first happened. It ended up corrupting my W10 install and I had to reinstall it lol.
 

NimbusD

Member
The system just turns off? I had a power strip I'd been using for 10 years, and when I was playing games with heavy power usage the power strip would heat up and turn itself off. I didn't realize what it was until it happened a few times.

This may not be your problem, just throwing it out there.
Yeah seems power related if temps aren't rising on CPU or GPU.

I had this happening last year, it was a shitty psu. Replaced it and it was peachy keen after.

I remembered later that the same thing happened to me a long time ago w the first computer I ever built.

Idk happens sometimes and it drives you insane because there's no internal PC diagnostic to figure it out.
 

Resilient

Member
Holy shit it did it again! 4 straight days of no crashing, I thought it was fixed. I logged a fair amount of time each day. What an absolute cunt of a problem. Going to take out the GPU and see how it goes now. fuck fuck fuck

Lol turned it on again to try once more out of curiosity and it crashed within 10min
 

inm8num2

Member
Ugh, I'm sorry. Was hoping the issue was fixed. :/

I don't have much else to add that hasn't already been said. Since the lockups aren't generating blue screens or dump files, that seems to make it less likely this is a driver or software issue (I'm not certain on this).

Maybe the problem does come from the motherboard. It seems like you've practically ruled out every other piece of hardware (PSU, RAM, GPU, SSD, HDD) in your testing. I was going to suggest buying a different Z170 board in the interim and doing a new OS install for that setup. Given how Win 10 licensing works (tied to motherboard/hardware) you'd likely need to call MS to have them move your license over to the new board (then potentially back to your Asus Z170 replacement if the newly bought board doesn't eliminate your problem). What you choose also depends on whether you have a different PC or laptop to work with in the interim, but it might be worthwhile to test a different brand/model motherboard while you wait for the RMA to process.

I wish you all the best in resolving this. I've been plagued by some BSODs on my PC over the past few months and haven't been able to nail it down. Not as serious or aggravating as your freezing issue, but I sympathize with the frustration of an ongoing problem that just won't go away despite your best and diligent efforts to diagnose and repair it.

edit - just more thoughts. Based on post #169, prime95 doesn't crash the system while the GPU benchmark does. And your freezing/lockups only happen during gaming? I was thinking it might be an issue with the PCIE slot (maybe the GPU isn't getting enough power under load due to a faulty motherboard and/or some bizarre issue with the PCIE network adapter). Also, according to the post above mine you experienced another crash within 10 minutes. Is that within 10 minutes of gaming or just within 10 minutes of booting into the OS? Had you removed the GPU yet or is it still installed? At first it sounded like you removed the GPU then turned the PC back on and got a crash within 10 minutes while on the integrated GPU, but I just want to verify which case it is.

Plus, the way the system is freezing would ordinarily indicate that the CPU itself is locking up or ceasing operation (the system stops working entirely despite everything else still being powered). My other random thought was, when you did your most recent/current installation of Windows was your OC still on? I don't know if it makes a difference but I've seen in various forums/threads that it's best to install the OS on stock, OC and test until stable, record stable clocks, turn off OC, reinstall Windows w/ OC still off, then OC to stable levels. Otherwise there may be issues if the OS was installed while OC'ed to potentially unstable levels, even if turning off the OC later on. Don't know if there's any weight to this, but again just something I recall seeing discussed a couple times.

But yea, the motherboard (and CPU) seem to be at the center of the problem or at the very least need to be the next components isolated in testing (by replacing the motherboard). Again, best wishes and keep us updated.
 

Resilient

Member
Ugh, I'm sorry. Was hoping the issue was fixed. :/

I don't have much else to add that hasn't already been said. Since the lockups aren't generating blue screens or dump files, that seems to make it less likely this is a driver or software issue (I'm not certain on this).

Maybe the problem does come from the motherboard. It seems like you've practically ruled out every other piece of hardware (PSU, RAM, GPU, SSD, HDD) in your testing. I was going to suggest buying a different Z170 board in the interim and doing a new OS install for that setup. Given how Win 10 licensing works (tied to motherboard/hardware) you'd likely need to call MS to have them move your license over to the new board (then potentially back to your Asus Z170 replacement if the newly bought board doesn't eliminate your problem). What you choose also depends on whether you have a different PC or laptop to work with in the interim, but it might be worthwhile to test a different brand/model motherboard while you wait for the RMA to process.

I wish you all the best in resolving this. I've been plagued by some BSODs on my PC over the past few months and haven't been able to nail it down. Not as serious or aggravating as your freezing issue, but I sympathize with the frustration of an ongoing problem that just won't go away despite your best and diligent efforts to diagnose and repair it.

edit - just more thoughts. Based on post #169, prime95 doesn't crash the system while the GPU benchmark does. And your freezing/lockups only happen during gaming? I was thinking it might be an issue with the PCIE slot (maybe the GPU isn't getting enough power under load due to a faulty motherboard and/or some bizarre issue with the PCIE network adapter). Also, according to the post above mine you experienced another crash within 10 minutes. Is that within 10 minutes of gaming or just within 10 minutes of booting into the OS? Had you removed the GPU yet or is it still installed? At first it sounded like you removed the GPU then turned the PC back on and got a crash within 10 minutes while on the integrated GPU, but I just want to verify which case it is.

Plus, the way the system is freezing would ordinarily indicate that the CPU itself is locking up or ceasing operation (the system stops working entirely despite everything else still being powered). My other random thought was, when you did your most recent/current installation of Windows was your OC still on? I don't know if it makes a difference but I've seen in various forums/threads that it's best to install the OS on stock, OC and test until stable, record stable clocks, turn off OC, reinstall Windows w/ OC still off, then OC to stable levels. Otherwise there may be issues if the OS was installed while OC'ed to potentially unstable levels, even if turning off the OC later on. Don't know if there's any weight to this, but again just something I recall seeing discussed a couple times.

But yea, the motherboard (and CPU) seem to be at the center of the problem or at the very least need to be the next components isolated in testing (by replacing the motherboard). Again, best wishes and keep us updated.

Thanks for your post.

To clarify, I turned the PC right back on (I got kicked out of an instance in FF14 and wanted to join it again) - that's when it froze again within probably 10 minutes.

I'm convinced it's a motherboard issue but I wanted to narrow down a few more things I found online from people who had issues with the same mobo. I've already tested the RAM with no errors (8 hours each stick), but I've just swapped the channels now as another poster from a different forum said it worked for them.

After this, GPU is out.

After that, I'm gonna take a squizz at the thermal paste.

After that, RMA on the board. I'll buy a new board in the meantime (I can afford the float, and I really want to fix the issue too..).

A few posters have said to clear the thermal paste/strip all parts first, but I'm trying to knock out what other people with the same mobo have done to clear the same kind of issue.

re: the W10 install, only the 3rd W10 install had the OC still on, the 1st and 2nd installs had the same crashing issue.

I dunno, it's weird man. The audio thing has to be some sort of clue, it goes absolute bonkers when it crashes. Funnily enough, when it crashed yesterday, I had 0 sound going into FF14 and couldn't get it to work, then it crashed shortly after. Weird coincidence? Or a sign that the audio drivers are truly fucked for this mobo?
 

Resilient

Member
Shameless bump.

Is it possible for the grilles on the case (at the back, the tiny ones that you screw in where there is no PCIe card, but if there were a PCIe card (GPU) you would remove them) to short the motherboard and make it act funky?

I ask because when i was taking the GPU out I noticed that one of them was a bit close to the motherboard, probably too close. now it has me shook that that may have been the root of the problem, and when the motherboard expanded, it touched that and triggered the crash. any thoughts?
 

inm8num2

Member
I dunno, it's weird man. The audio thing has to be some sort of clue, it goes absolute bonkers when it crashes. Funnily enough, when it crashed yesterday, I had 0 sound going into FF14 and couldn't get it to work, then it crashed shortly after. Weird coincidence? Or a sign that the audio drivers are truly fucked for this mobo?

If the audio drivers were an issue you'd probably be getting BSODs with your crashes. Still, maybe in rare cases different hardware and their drivers just have bizarre conflicts and incompatibilities that don't necessarily have a solution (other than the drivers being fixed by the hardware manufacturer). The repeated crashes could also potentially be corrupting your OS which could lead to other issues like audio not appearing when you relaunched the game after rebooting.

Also, when you swap out a different model motherboard you may want to reinstall Windows on a spare drive if you have one. If not you may be able to get away with uninstalling the chipset/LAN/etc. drivers for the Asus board then installing those drivers for the new one. Either way you'll likely need to call MS to re-authenticate your Windows license when the new board is installed.

Shameless bump.

Is it possible for the grilles on the case (at the back, the tiny ones that you screw in where there is no PCIe card, but if there were a PCIe card (GPU) you would remove them) to short the motherboard and make it act funky?

I ask because when i was taking the GPU out I noticed that one of them was a bit close to the motherboard, probably too close. now it has me shook that that may have been the root of the problem, and when the motherboard expanded, it touched that and triggered the crash. any thoughts?

How small was the gap between the plates and the motherboard? This could potentially be the issue - I don't know how much the board would expand as it heats up. You may try putting some black electrical tape on the thin metal plate where the gap is smallest and see if that prevents lockups.

Given the nature of the entire system freezing, as if the CPU/motherboard have stopped communicating with the rest of the components, it does seem like it's some kind of freak electrical problem related to the motherboard. Hopefully your continued testing narrows it down.
 

Resilient

Member
If the audio drivers were an issue you'd probably be getting BSODs with your crashes. Still, maybe in rare cases different hardware and their drivers just have bizarre conflicts and incompatibilities that don't necessarily have a solution (other than the drivers being fixed by the hardware manufacturer). The repeated crashes could also potentially be corrupting your OS which could lead to other issues like audio not appearing when you relaunched the game after rebooting.

Also, when you swap in a different model motherboard you may want to reinstall Windows on a spare drive if you have one. If not you may be able to get away with uninstalling the chipset/LAN/etc. drivers for the Asus board then installing those drivers for the new one. Either way you'll likely need to call MS to re-authenticate your Windows license when the new board is installed.

How small was the gap between the plates and the motherboard? This could potentially be the issue - I don't know how much the board would expand as it heats up. You may try putting some black electrical tape on the thin metal plate where the gap is smallest and see if that prevents lockups.

Given the nature of the entire system freezing, as if the CPU/motherboard have stopped communicating with the rest of the components, it does seem like it's some kind of freak electrical problem related to the motherboard. Hopefully your continued testing narrows it down.

I see. Just thought the audio was a strange red flag, so I guess that's another thing we can knock off the list of causes.

Yeah, I'll just do a clean wipe to be safe, and do a fresh install.

We're talking... like, less than 1mm. Not touching by a hair. I'm disappointed I didn't notice that when I built it. It's one of the grilles that I had never needed to move, so I never touched it. I took it out and bent it inwards so that when it was screwed in, it couldn't touch. They are all clear of the motherboard now by about 3-4mm.

i played ff14 last night for about 2.5 hours, had no issues. the CPU never went past 54c the whole time, so I'm fairly confident there is no issue with the paste. however if it crashes again, that's what I will check next as I prepare to RMA the mobo. thanks again for your post.
 

Xisiqomelir

Member
Short term solution:

[Image: 140812REDMackieWin7.jpg]


Long term solution:

[Image: Freebsd-logo.png]
 

Resilient

Member
EVGA have had issues with their 1070 heat pads, they're exceptionally thin and wear away easily. I think maybe you should RMA your 1070.

While I agree about the 1070, my PC was still crashing with the old GPU (same problem) so it's not the crux of the issue.

While I'm at it though: I bought it from Amazon US (I'm in Australia). Would they usually do a straight refund of the card? Gonna open a ticket with support regardless if I do go that route.
 

Resilient

Member
So I've been running it the last few days with just RAM, CPU and SSD/HDD. No issues so far, I've had it in game for 3 hours and it doesn't get any hotter than 53c which is strange. That's running FF at full settings (20fps lol).

Thoughts?

GPU? Even though it happened with my old one? And I never once had an issue with that old GPU in my old PC?

Motherboard not handling it when the GPU is plugged in?

Thermal paste causing the GPU to not be handled well in this build?

Or the rogue piece of metal on the case that was nearly touching the MB, then the MB expanded and went funky?
 
@op, when you changed some settings in the mobo, you had a 6 day streak of zero accidents. I think that is no coincidence. But I'm not sure how that relates to removing the gpu tho, did you disable the integrated gpu when the 1070 was in?
There is definitely something to do with the mobo.

Edit: what's your RAM voltage in the BIOS? Maybe change it to the voltage Corsair recommends?
 
Have you tried keeping your CPU at a single speed/voltage? Could be OC, could be stock, just keep at a constant speed/voltage that works.

Like I mentioned in my other post, I had an extremely similar situation with my build, in that it wasn't the overclock that was causing the issue but the CPU downclock that was causing the crash. Basically the CPU got undervolted on the power down, starved for power, and crashed.
 

Resilient

Member
Update:

So, didn't bump into the issue when I was running off the CPU's integrated graphics and 2x8GB sticks of RAM.

As a precaution I have RMAd the RAM and will have a couple of new sticks by the end of the week. I was speaking to my local tech shop, who suggested it was pretty symptomatic of a RAM issue. I had trouble in the past with Corsair so will swap to Kingston.

re Thermal Paste: I haven't reapplied because the temps have been clear/positive. Didn't push past 52c when running off the integrated GPU. When using factory default Mobo settings (voltage automatically increased/decreased for the CPU, but not OC), temps have hovered around 22c idle and 58c under load (with GPU plugged back in). I don't think this is the issue.

re GPU: going to RMA this too, purely because of the Thermal Pad issues with EVGA cards, as a precaution.

Questions/thoughts on the below.

@op, when you changed some settings in the mobo, you had a 6 day streak of zero accidents. I think that is no coincidence. But I'm not sure how that relates to removing the gpu tho, did you disable the integrated gpu when the 1070 was in?
There is definitely something to do with the mobo.

Edit: what's your RAM voltage in the BIOS? Maybe change it to the voltage Corsair recommends?

The recommended RAM voltage is 1.2V and it's controlled by the BIOS. The rated speed is 2400MHz but I have XMP turned off at the moment, so it's running at 2133MHz. The issue occurred with XMP both on and off, however.

Enter MB Bios. Select "Reset to Defaults". Save and exit.

Have done that, and put the GPU back in. Going to see if the problem appears again.

Have you tried keeping your CPU at a single speed/voltage? Could be OC, could be stock, just keep at a constant speed/voltage that works.

Like I mentioned in my other post, I had an extremely similar situation with my build, in that it wasn't the overclock that was causing the issue but the CPU downclock that was causing the crash. Basically the CPU got undervolted on the power down, starved for power, and crashed.

More and more I am convinced it's this, so I'm going to fiddle around with it tonight.

The problem ONLY happens when gaming, and when the GPU is in. So I'm thinking: the PC is being pushed hard, the CPU is working hard... it's not getting enough juice and it's crashing. JaseC mentioned this a few times, and now that I have ruled out most of the other hardware as being the issue, it's time to start tinkering on this end.

It seems like such an easy scapegoat to blame the motherboard, but I feel like there is more here, as some other posters have experienced the same issue and resolved it by increasing the voltage.

Thanks again people.
 

Engell

Member
So, didn't bump into the issue when I was running off the CPU's integrated graphics and 2x8GB sticks of RAM.


just wanted to write a quick note here..
So had my own laptop acting strange the last week.. crashing out of games / blue screening and doing auto shutdowns etc.

Finally diagnosed the problem yesterday as a defective RAM module (DDR3 8GB 1600MHz CL9 SODIMM). "Funny" thing was that this problem didn't show up in memtest86 as I would normally expect, so it's hard to diagnose anything when you don't have anything that will reliably crash the computer or generate an error.

finally the program i used that would make the computer fuck up enough to test scenarios was a program called y-cruncher
http://www.numberworld.org/y-cruncher/#Download
It didn't crash when running it, but the program will report back that there is a calculation error. When starting the program just choose "1 component stress tester" and then "0 start testing"; this would constantly fail with the defective RAM module inserted in the machine. Under the stress test option, you can also focus the test on mixed, RAM, or CPU.

So basically its just a good program to test stability(doesn't test GPU ofc), and good to hear that you are getting closer to having a working machine.
This will be my goto program in the future to test for cpu/ram stability
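For anyone wondering what "reports a calculation error instead of crashing" means in practice, here is a rough Python sketch of the general idea: do the same deterministic work over and over on a block of memory and flag any run that gives a different answer. This is NOT how y-cruncher works internally and it won't load the hardware anywhere near as hard; the function names and sizes are made up purely for illustration.

```python
import hashlib
import os


def checksum_pass(buffer):
    """Hash the whole buffer; identical data must always give the same digest."""
    return hashlib.sha256(buffer).hexdigest()


def consistency_test(size_mb=16, passes=5):
    """Return the pass numbers whose digest differed from the reference digest.

    Hypothetical helper: on healthy hardware the returned list is empty,
    because hashing unchanged data is fully deterministic.
    """
    # Fill a block of RAM with random bytes and record a reference digest.
    data = os.urandom(size_mb * 1024 * 1024)
    reference = checksum_pass(data)

    failures = []
    for n in range(1, passes + 1):
        # Re-read and re-hash the same memory; any mismatch means a bit
        # changed in RAM or the CPU computed something differently.
        if checksum_pass(data) != reference:
            failures.append(n)
    return failures


if __name__ == "__main__":
    bad = consistency_test()
    print("mismatched passes:", bad if bad else "none")
```

A clean run here proves very little, since a real stress tester pushes far more memory and far heavier maths for much longer. It only shows why a silent wrong answer can reveal a fault that never produces a crash.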
 

Aureon

Please do not let me serve on a jury. I am actually a crazy person.
Did you test all the RAM in the same slot, or did you test all the slots you use when a freeze occurs?

RAM slots being broken is fairly common.

Also reset the OC until you clear this, and try to run on a single RAM stick if possible. The fewer moving parts, the easier it is to find out what's wrong.

CPU/mobo issues are extremely rare, but that's looking more likely if RAM, PSU, GPU, HDD all check out.
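To keep track of the one-stick-at-a-time testing, something like this small Python sketch could help. It lists every stick/slot pairing to try, then points at any stick or slot that froze in every test it took part in. The two sticks match the 2x8GB kit mentioned in the thread; the slot names and the helper functions are hypothetical, so swap in whatever the board's manual calls its slots.

```python
from itertools import product


def single_stick_plan(sticks, slots):
    """Every (stick, slot) pairing, so each stick is tried alone in each slot."""
    return list(product(sticks, slots))


def suspects(results):
    """Given {(stick, slot): froze}, return (bad_sticks, bad_slots).

    A stick or slot is a suspect only if every test involving it froze.
    """
    sticks = {stick for stick, _ in results}
    slots = {slot for _, slot in results}
    bad_sticks = sorted(
        s for s in sticks
        if all(froze for (stick, _), froze in results.items() if stick == s)
    )
    bad_slots = sorted(
        s for s in slots
        if all(froze for (_, slot), froze in results.items() if slot == s)
    )
    return bad_sticks, bad_slots


if __name__ == "__main__":
    # Hypothetical names: two sticks, four slots labelled as on many boards.
    plan = single_stick_plan(["stick_1", "stick_2"], ["A1", "A2", "B1", "B2"])
    for stick, slot in plan:
        print(f"test {stick} alone in slot {slot}")

    # Made-up results for illustration: only tests using slot B1 froze.
    results = {pair: pair[1] == "B1" for pair in plan}
    print(suspects(results))
```

With freezes that can take days to show up, each pairing needs a long enough run before it counts as a pass, so the table is only as good as the patience behind it.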
 

Resilient

Member
just wanted to write a quick note here..
So had my own laptop acting strange the last week.. crashing out of games / blue screening and doing auto shutdowns etc.

Finally diagnosed the problem yesterday as a defective RAM module (DDR3 8GB 1600MHz CL9 SODIMM). "Funny" thing was that this problem didn't show up in memtest86 as I would normally expect, so it's hard to diagnose anything when you don't have anything that will reliably crash the computer or generate an error.

finally the program i used that would make the computer fuck up enough to test scenarios was a program called y-cruncher
http://www.numberworld.org/y-cruncher/#Download
It didn't crash when running it, but the program will report back that there is a calculation error. When starting the program just choose "1 component stress tester" and then "0 start testing"; this would constantly fail with the defective RAM module inserted in the machine. Under the stress test option, you can also focus the test on mixed, RAM, or CPU.

So basically its just a good program to test stability(doesn't test GPU ofc), and good to hear that you are getting closer to having a working machine.
This will be my goto program in the future to test for cpu/ram stability

thanks man. i ran this but it passed everything and didn't crash lol :(

new RAM at some point this week...will see if the problem rears its head again. when i get the new RAM i will tinker with the voltage to the CPU.

after that, RMA motherboard.
 