[Resolved] Looking for assistance with crashes, faults, bluescreens on a new build
Hi folks! This is a more thorough continuation of what I brought up in Discord: I recently built an all-AMD system, and when it works it's an absolute monster. However...
I experience intermittent bluescreens, no matter the configuration or part setup or drivers. Some more finicky games (WoW is the major one, and I think PUBG is included) also throw exceptions quite a bit. These happen under all types of load (idling, gaming, zipping large packages) and at wildly varying intervals (two minutes after a restart, four days of no problems, everything in between). I've done a TON of testing and debugging and checking, to no avail. And so I turn to you!
I haven't found anything that reliably causes a crash. Today, after restarting from a crash, I started my music player, walked away to get the mail, and heard it bluescreen and restart as I was sorting letters. It's crashed during stream viewing, YouTube viewing, game playing, and playing music with nothing else open.
To date, it has NOT crashed while running memtests for more than a day, and putting the system under significant benchmarking strain doesn't cause it to happen more frequently, so far as I can tell.
Bluescreen codes seem random. KMODE_EXCEPTION_NOT_HANDLED, DRIVER_OVERRAN_STACK_BUFFER, PAGE_FAULT_IN_NONPAGED_AREA, several others.
Sometimes the bluescreen is yellow!
Sometimes the system just freezes and restarts without a code or dump.
WoW's errors are uniformly ACCESS_VIOLATIONs stating that a pointer referenced memory that couldn't be read from or written to. Researching this as its own issue has yielded no results, so I'm including it as a symptom here. No guarantees.
OS: Windows 10 Enterprise (currently trial)
PSU: EVGA Supernova 850W (site)
CPU: AMD Ryzen 7 3700X (site) with stock Wraith Cooler
Motherboard: Gigabyte X570 Gaming X Rev 1 (site)
RAM: G.Skill Ripjaws V DDR4-3200C16S-16GKV at XMP timings and voltage, x2 in dual channel (site) (on the motherboard's QVL)
Video: Gigabyte RX5700 8G (site)
Monitors: Dual AOC G2460 @ 144 Hz, 1920x1080 8-bit RGB via DisplayPort (have used HDMI as well)
4 drives: OCZ Agility 3s and WD Blacks on SATA, Mushkin Pilot-E on NVMe, each used solo in testing (including a known-good drive from another machine that I formatted just for testing).
USB: DeathAdder 3500 mouse, Microsoft 4000 Ergonomic keyboard
Sound: Tested with onboard, USB headphones, X-Fi Xtreme PCI card
RMAs: RMA'd the graphics card AND the motherboard, both came back as okay.
Temperature: Temps are stable, all remain under 72 °C even under load.
Voltages: No real voltage jitter seen during idle or under load.
Memory: MemTest86 for 48 hours, MemTest64 for 10 hours, Windows 10 memory test tool twice, all passed.
Tried running at default speed and voltage.
Tried single sticks in each of the 4 slots available.
Never ran at a voltage higher than XMP's recommended 1.35 V.
CPU: The triangle lined up, seated with 0 force, lever locked into place, CPU was not loose after the lock went in. Used thermal paste, seated heatsink, locked heatsink down.
Original purchase from Newegg had totally legit-looking packaging but someone had stolen the 3700X and left a poorly-cleaned 1700X in its place. I sent that back and my current one totally shows up as a 3700X in HWiNFO and other tools sooooooo...
Cabling: All cabling came from the rig this is upgrading which worked without issue.
SATA cables swapped with other known-good cables.
Modular power supply cables swapped out.
Checked for loops, kinks, cuts, weird looseness.
Tried using different SATA power plugs on the power lines.
Tried different SATA slots on the motherboard (there are 6, tried 6).
Power: Motherboard has 24-pin and 8-pin power connectors; both are plugged in.
Swapped in another known good power supply (Thermaltake Toughpower 750W Modular (site)). Issue still occurred, even after trying different modular cabling.
Tested grounding through case, and all motherboard screws, to the surge strip ground successfully.
Tested system laid out on non-conductive surface without a case at all.
Video: Swapped in a known-good Nvidia 970. Seeeeeemed to crash less, but no real data and still bluescreened.
Turned off all special settings in the Radeon software, set to a single monitor at 60 Hz 1920x1080 over HDMI to an AOC monitor.
Set PCIe to version 3.
Disabled hardware acceleration for browsing and Discord.
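One way to turn "seemed to crash less" into real data would be to log hours run and crash counts per configuration and compare rates. A minimal Python sketch, assuming a hand-kept log; the function name and the (label, hours, crashes) tuple layout are made up for illustration and not taken from any tool:

```python
from collections import defaultdict


def crashes_per_day(sessions):
    """Return crashes per 24 hours of uptime for each configuration.

    sessions: iterable of (config_label, hours_run, crash_count) tuples,
    a hypothetical hand-kept log format.
    """
    hours = defaultdict(float)
    crashes = defaultdict(int)
    for label, hours_run, crash_count in sessions:
        hours[label] += hours_run
        crashes[label] += crash_count
    # Skip configurations with no recorded uptime to avoid dividing by zero.
    return {
        label: 24.0 * crashes[label] / hours[label]
        for label in hours
        if hours[label] > 0
    }
```

Each entry would be one stretch of uptime on a given setup (for example, one label for the RX5700 and one for the 970). With crashes this intermittent, it would likely take many days per configuration before a difference in rate means much.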
Drivers and OS: Verified Win10 install ISO via sha256sum. Installed via USB.
DISM and sfc come back clean (other than some noise that other places on the internet say is the Windows default antivirus interfering and is completely normal).
Clean installs to various other hard drives tested solo (no other large disks) to no avail.
I've used Gigabyte's drivers (chipset, driver) for motherboard and video, and the AMD-available newer ones, in available sequence.
Also installed Gigabyte's motherboard tools to no effect.
BIOS for motherboard has been updated and rolled back and updated again multiple times, testing each available new update.
Used Display Driver Uninstaller in safe mode and in normal mode to remove video drivers while disconnected from the internet, then installed video drivers in normal mode, still without an internet connection.
Just in case, even installed non-generic drivers for the monitors to no avail.
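For anyone wanting to repeat the ISO check without sha256sum, it amounts to roughly the following. This is a minimal Python sketch; the function names are made up for illustration, and the expected digest has to come from wherever the ISO was published:

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so a multi-GB ISO never sits fully in RAM.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with a published one, ignoring case and spaces."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A match only shows the download is intact; it says nothing about whether the USB stick it was written to is healthy.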
God that's a lot. Sorry. I've gone through quite a bit trying to get this working. Thanks for nosing through, and thanks to the folks in Discord who tried there too!
Let me know if there is more information you'd need or suggestions for things I haven't done yet. My hope is I've missed something elementary that I just don't know to do, or something. I do not have the kind of money that would allow for randomly buying parts, and nobody near me has current-gen parts they can spare for testing. I'm mostly ready to just switch back to the old parts until things have gotten better a couple months from now, but I want this to work, dammit!