#First time 2v2 AMD driver timeout
thank you for the reports! We're looking into GPU crash issues - here's a couple troubleshooting articles that might help, but it could be a game issue too
https://theorycraft.helpshift.com/hc/en/3-theorycraft/faq/25-outdated-graphics-drivers/
as a compooper engineer I'm fairly sure the fix has to come either from AMD, with a game-compatibility update, or from you, though :)
Maybe even both.
oh yes, this is very possible! This is like when you call for network support from your ISP and they say "is the router plugged in? Can you turn it on and off again?" and you roll your eyes like "yeah of course". Gotta check the obvious stuff first 
shot in the dark here, but can you check if you have "Lumen" enabled in your video settings, and if so, turn it off? It should only affect the lobby, but I see some caching misses in the log 
Not likely to be a fix, but worth a shot
Is it the same as with the 4v4 crash? Because testing the crash in 4v4 will be much faster after turning off lumen :)
ah - would also be handy if you can record a video showing when it happens
That would be much easier if Steam Recording worked in supervive
but i'll enable Adrenalin Replay in background and try to catch it.
Though the video will be most likely damaged the moment the gpu driver crashes and adrenalin restarts
It feels like it happens when some champion is doing something~
Or I could try using OBS CPU-based recording that would not interrupt on screen restart. Hmm.
we have an issue I'm thinking of where some folks seem to be crashing when certain UI shows up, so I'm wondering if this is the same issue hitting you here
yeah OBS is the way to go IMO if you have it setup
Miiight be this!
And if the game itself somehow uses the UE5 built-in internet browser, that also might be an issue. AMD REALLY does not like hardware acceleration in non-gaming apps turned on when gaming apps are opened. I've crashed more in a month of using AMD than in 15 years of using nvidia because of this.
yeah AMD drivers are not the best IMHO, love the CPUs but the GPUs can have weird stuff like that
Also AMD has biiig issues with DX12 lately.
Now, starting the recording session and jumping into the 4v4 queue. Should crash in a couple of games max.
I'll actually stream it to twitch so it will autoarchive 
Soooo.
I've played a couple of 4v4 matches and didn't crash once. That's after disabling Lumen.

hmmmmm, that is... very suspicious
because it really shouldn't matter. I'll pass this along to our rendering pro
I can add that I've had a single crash in SUPERVIVE, and I think it was after enabling Lumen after seeing guntharf talk about it in a post. It only happened to me once, though.
But I'm pretty sure it was after I enabled the setting. I just looked it up: I enabled it on the 27th and I crashed a couple of days ago, about 4 days ago at most.
Interesting, Lumen is only active as opt in from menus and only active on Lobby/Hero select. We force disable when in game
keep us posted - if you continue to have no crashes that would be interesting information. I'm wondering if maybe it's just luck of the draw since your crashes don't look that frequent
Played a bit more and still no crashes.
I've checked my 4v4 stats on supervive stats website and at the moment of creating this thread I had a crash every 10 games.
I've played more than 10 games now and still no crash.
Will try more tomorrow!
thanks for the update! Very confusing results but I'll take the W
After 12 games I crashed during a teamfight (it's always during a teamfight).
It crashed the moment I pressed Alt-Q to upgrade my last spell.
And It was also my first game as Celeste for a looong time.
can you do the following steps please?
Head to your Temp folder C:\Users\%USERNAME%\AppData\Local\Temp
Look for recently created files ending in dmp with the format PACKER_*_*.dmp
Examples: %TEMP%\PACKER_9719935093_25724.dmp or %TEMP%\PACKER_954071203_58416.dmp
Zip them up and send them to me directly (do not share here as they may contain sensitive information). They may be large, if so they can be uploaded to a secure location such as Google Drive and then shared directly to me 
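The collection steps above could be scripted roughly like this. It is only a sketch: the function name `collect_packer_dumps` and the archive name `supervive_dumps.zip` are made up for illustration, and it assumes the dumps sit directly in the Temp folder with the `PACKER_*_*.dmp` naming described above.

```python
import glob
import os
import tempfile
import zipfile


def collect_packer_dumps(temp_dir=None, out_zip="supervive_dumps.zip"):
    """Zip every PACKER_*_*.dmp file found directly in temp_dir.

    Returns the list of files that were added. If none are found,
    no archive is written and an empty list is returned.
    """
    # tempfile.gettempdir() normally resolves to %TEMP% on Windows.
    temp_dir = temp_dir or tempfile.gettempdir()
    pattern = os.path.join(temp_dir, "PACKER_*_*.dmp")
    # Newest first, so the most recent crash is easy to spot.
    dumps = sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
    if not dumps:
        return []
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in dumps:
            # Store only the file name, not the full user-profile path.
            archive.write(path, arcname=os.path.basename(path))
    return dumps
```

Calling `collect_packer_dumps()` with no arguments looks in the current user's Temp folder. The resulting archive should still be sent privately, as requested above, since dumps may contain sensitive information.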
From today's crash I don't have any .dmp files in my local AppData Temp folder. If you're interested, I have the selected two from today's crash and also some older ones. I see there are assertion crash dumps for SUPERVIVE in the Steam dumps.
The QML renderer dump seems to be from the moment I've been updating gpu drivers
https://www.twitch.tv/videos/2319145643?t=01h43m09s
This is the crash moment. It ends with the crash (obs stopped transmission)
||(don't judge my disabled elderly elephant on a aim skillz)||
I'm wondering if it's crashing only when I play certain characters.
I do seem to remember that on each of the crashes I had, there was an enemy Kingpin in my vision range. But I can't say for sure.
if it's a rendering thing, it's possible certain visual effects cause the issue, tough to isolate though
Okay, my GPU driver just crashed with the game in the background for a couple of minutes. xP But that might be because yesterday I turned on Discord hardware acceleration, since I wanted video attachments to actually play in 4K. That's AMD's fault, if that was the case.
mm yeah, unsure what occurred here
```
[2024.12.10-19.15.16:153][ 46]LogD3D12RHI: Error: CurrentQueue.Fence.D3DFence->GetCompletedValue() failed
at C:\TheoryCraft\build-staging\Engine\Source\Runtime\D3D12RHI\Private\D3D12Submission.cpp:1013
with error DXGI_ERROR_DEVICE_REMOVED with Reason: DXGI_ERROR_DEVICE_HUNG
```
seems to indicate just a GPU/Driver timeout and app closing because of it
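For anyone comparing several crash logs, a small parser can pull the error and reason codes out of entries like the one quoted above. This is a sketch, not anything from the game or engine: the regular expression is fitted only to the log lines quoted in this thread, and the name `find_device_removed` is made up for illustration.

```python
import re

# Fitted to the D3D12RHI error entries quoted in this thread; other log
# layouts may need a different pattern.
DEVICE_REMOVED = re.compile(
    r"\[(?P<timestamp>[\d.\-:]+)\]\[\s*(?P<frame>\d+)\]LogD3D12RHI: Error: .*?"
    r"with error (?P<error>DXGI_ERROR_\w+)"
    r"(?: with Reason: (?P<reason>DXGI_ERROR_\w+))?",
    re.DOTALL,  # the entry spans several lines, so "." must match newlines
)


def find_device_removed(log_text):
    """Return (timestamp, frame, error, reason) for each matching log entry."""
    return [
        (m["timestamp"], int(m["frame"]), m["error"], m["reason"])
        for m in DEVICE_REMOVED.finditer(log_text)
    ]
```

A reason of `DXGI_ERROR_DEVICE_HUNG` on every crash would point at the same timeout each time, whereas a different reason code would suggest a separate problem.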
So yes, confirming it's the AMD problem with hardware acceleration.
If more than one app runs on the GPU, AMD is losing its shit basically.
Any internet browser with hardware acceleration, Discord with hardware acceleration, STEAM with hardware acceleration. Alt-tabbing to them while in a resource-intensive game results in a driver timeout for some godforsaken reason.
But AMD does not give a square crap about it. :/
Yeah, it's unclear what internal scheduling and logic inside the driver might be causing the instability.
Some folks online have mentioned that modifying settings to favor performance over power savings has helped in this area
Today's crash was on ranked 4v4, with Kingpin on my team and Celeste on the enemy team, and someone was constantly broadcasting on voice chat.
But I was on Felix, so that's new.
Unsure what occurred here.
```
[2024.12.16-13.02.08:021][ 69]LogD3D12RHI: Error: CurrentQueue.Fence.D3DFence->GetCompletedValue() failed
at C:\TheoryCraft\build-staging\Engine\Source\Runtime\D3D12RHI\Private\D3D12Submission.cpp:1013
with error DXGI_ERROR_DEVICE_REMOVED with Reason: DXGI_ERROR_DEVICE_HUNG
```
DirectX 12 and the AMD driver reported a GPU timeout and made the game crash
another one
And again. Seems I'm crashing a bit more lately.
The last 2 reports don't contain any logs or crash info..
Do you have any overlays enabled? AMD, Discord, Steam?
Oops, wrong folder zipped. xP
Steam and amd yes. Discord overlay is disabled.
mm same issue as before. the GPU seems to be timing out.
Does it occur when disabling Lumen from the graphics settings?
If it still crashes, does it crash on other quality settings? Low, Mid, High?
Lumen was disabled as Koalifier suggested first in the thread
This crash was on lowest game settings (btw i haven't noticed quality difference xD)
Also I'll add what I've told safelocked:
Locating the source of this problem might be ridiculously hard. AMD drivers are pure bullshit and for past 1-2 years they had enormous problems with DX12 in many games.
For example:
Firefox with hardware acceleration with a certain website opened as active tab -> launching Diablo 4 -> 100% crash each time when trying to enter the world
Changing the tab to any other website, no crash, proper load... And if I remember correctly it got fixed. 
I've already been through disabling hardware acceleration in Discord, Firefox and Steam, because it was causing driver timeouts, mostly when alt-tabbing.
I got curious,
TBH, it sounds like something is dying in the GPU / FW, causing an unhandled issue and then a timeout.
Does AMD have some kind of post mortem tool that gives you drawcall and compute dispatches that can create an indepth dump ? 🤔 That'd be the best bet imo
I've sent AMD bug report dumps to Koalifier, and I think he did pass it to Felipe
AMD bug reports are not necessarily a proper post mortem tool
Most game crashes are resolved by AMD after a couple of months
well I don't have anything better than that
From user pov it's just driver freezing and watchdog kicking in and restarting it.
The interesting part is that it's not always freezing. Sometimes the watchdog kicks in even though everything was rendered smoothly~
:D
Felipe can ask what he wants. but i was thinking of something like this https://gpuopen.com/radeon-gpu-detective/
Installing right now. Will send him the dumperinos
wait until he wants them
or not
maybe he want something else or better :)). i think AMD has multiple post mortem tools IIRC
I just work as a gpu driver engineer so i'm used to these types of things :'')) its why i got curious since its a gpu and DX issue
I could take a look at the dumps to see if there is anything special or a pattern there.. but this doesn't seem to be widespread enough for us to suspect it is due to a code defect in Unreal Engine or the AMD driver.
I've been searching online for these issues on AMD and oh boy.. there are a ton of speculative solutions and nothing seems to be a sure thing. Some folks had luck disabling XMP, some folks updated motherboard bios, some folks replaced power supply and/or power cables, some folks found success raising the TdrDelay values in the Registry, disabling Windows MPO worked for some
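For reference on the TdrDelay suggestion: the value is a REG_DWORD, in seconds, under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`, and Windows defaults to 2 seconds when it is absent. The sketch below only builds the text of a `.reg` file and does not touch the registry; the function name `tdr_delay_reg` and the 2 to 60 range check are assumptions made for illustration. A longer delay only makes Windows wait longer before resetting the driver, so like the other ideas in that list it is a speculative workaround and not a confirmed fix.

```python
TDR_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"


def tdr_delay_reg(seconds):
    """Build the text of a .reg file that sets TdrDelay to `seconds`."""
    if not 2 <= seconds <= 60:
        # Guard chosen for this sketch: 2 is the Windows default,
        # 60 is an arbitrary upper bound, not a documented limit.
        raise ValueError("seconds should be between 2 and 60")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{TDR_KEY}]\n"
        # .reg files store DWORD values as 8 hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}\n'
    )
```

Saving the output of `tdr_delay_reg(10)` to a `.reg` file and importing it needs administrator rights, and the change typically only takes effect after a reboot.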
I've been running here locally with a few AMD (RX 480 and 580) cards lately to see if I hit some of the issues but haven't hit any obvious problems
Yeah, I went through them all. I've had more driver crashes in a year of AMD usage than in almost 15 years of nvidia usage.
Hardware acceleration is the #1 issue. Then come the drivers, which they update much less frequently. With Diablo 4 I've had plenty of crashes, but after some updates I haven't had any since... well, feels like forever now. I'll see if they'll update the driver for SUPERVIVE on launch. Hopefully.
I've tried the post-mortem tool but after I think 10+ games (which I've streamed also), no crashes. Though, my FPS were affected baaadly. Dips even below 40 FPS on 7900 XTX on medium. This tool is a butcher
but if it will produce anything interesting on crash, I'm down to continue using it.
I wonder if the throttling from the tool (copying GPU state to ram, etc) reduces load enough to improve stability..
Since it's an external tool doing the capture, it has to modify the cmd buffers that are being submitted / activate some states. So the time between submissions of cmd buffers is going to be higher.
Though it means the problem is likely not contained in a single submit, but potentially spread over multiple submits. (Including across apps)
To me it sounds like a scheduling issue on the hw side of AMD.
Hard to say what it could be without an actual hw dump though with these kinda issues :(
I've not looked at the actual dump though since I don't have a laptop with me 🙈
Well today was something new. I've made a screenshot of my lobby that contained mostly bots, and it crashed. Lol.
radeon post mortem tool dump is really small, I guess it's missing some configuration? @feral pasture
Also, this time I've got three .dmp files.
If it's all text files, it could be correct
can you open the dmps in the tool?
they're binary files
the text files are in the same 7z
also this
that's supervive crash folder
first .zip is from amd
can you open the RGD file ?
the text file just says what mem and GPU are being used; the RGD file is the Radeon file that should have all the information
that's the .txt that opens when i double click rgd file in the radeon developer panel
Ahhh
MARKERS IN PROGRESS
INFO: no markers in progress since no command buffers were in flight during the crash.
=====================
EXECUTION MARKER TREE
INFO: execution marker tree is empty since no command buffers were in flight during the crash.
==================
PAGE FAULT SUMMARY
INFO: no page fault detected.
All work is completed but the VK fence didn't get an answer back
🙄
It's useful to know that everything completed correctly but if this is accurate and the VK fence didn't get signaled this sounds like an amd driver issue.
yeah it doesn't look like this
That's fine, is there anything suspicious in the output log at the bottom?
I think they just updated their ui
that's the log ¯_(ツ)_/¯
Well, if the log is correct, then the GPU is basically idling because it has no work, and then the fence times out since it didn't get signaled. Not sure what's happening :/
We have a full-on crash today. While I was capturing a base on 4v4.
It's a bit bigger also. 41 MB RGD file.
@feral pasture @gentle pumice
For now I'm disabling the post-mortem capture tool because it's decimating my FPS and also makes my CPU stay on 75°C+ during the game instead of ~50s.
If you'd like any other post-mortem tool to be used Moozels just tell me which one
inspecting these, it mostly seems caused by some page fault (reading or writing from a restricted or unavailable part of memory) from the driver side..
DirectX usually fails or returns with error codes when the application (Unreal Engine) feeds it bad inputs and parameters, this issue however seems to be mostly on the driver side.
Offending VA: 0x61000 mostly says that there is no permission to read or write into that offset, or the offset is invalid
Tool doesn't seem to link that offset to any relevant resource (texture, buffer, render target, etc)
So you're confirming my thoughts about it. AMD does something incorrectly when shuffling resources between applications, maybe even application threads. That's why hardware acceleration with more than one hardware accelerated application running results in driver crashes.
So we're left with driver update for the game when it's released, if they care enough.
It’s been this dance between Unreal Engine, AMD/Nvidia and Microsoft..
It really helps that Fortnite is still doing well as that has encouraged those 3 companies to work well toward making UE5, DirectX 12 and GPUs play nicely. Things used to be quite bad like a year ago (I feel bad for any UE5 game that shipped during that window). We try to keep the version of Unreal 5 fairly up to date (one large revision behind) to take on all the fixes to improve overall performance and stability.
It’s too bad that AMD doesn’t have a version of what Nvidia calls “Studio Drivers” which tend to sacrifice a tiny bit of performance but with much added stability
Oh there were actually. I can't find them right now, but those were drivers without Adrenalin bloatware. I was considering using them, but the Radeon Chill feature is only available with Adrenalin unfortunately, and I really like using it.
the page fault itself is real weird.
cause you're still seeing a timeout (since the fence is hitting 5 seconds). I feel if there was such a fatal page fault, the GPU should just crash and reset directly rather than wait and time out. Unless this is how AMD HW works.
It feels more like the fault is due to the timeout happening then as memory gets cleared you hit a wrong VA.
also none of the MDI's have started which should mean nothing should (?) have even begun being accessed.
I feel bad for any engine makers trying to get performance tbh 😢 Having to support so many products, each needing its own WA to get performance out of it 😂
@rugged canopy I wonder if having the 2 GPUs being recognized and used is causing some weird issues with shuffling of resources and allocating resources to the wrong places
I get this all the time too, happens during games a lot. sometimes my entire pc just crashes which idk if that is related to sv but it would never happen before.
Hmm, maybe. I tried it back when I had an Intel GPU, but now that both are AMD it might be a separate issue. Disabled the Ryzen GPU, enabled Discord hardware acceleration, let's crash some drivers
It looks to be the same pattern, so I think it's the same thing. With hangs in general you would see more things alive in the cmd buffers. But these show nothing alive that could cause a hang. For example:
Draw
Barrier
Draw
yeah just crashed on last 4 teams in game on small circle...
Well, I've played a couple of games now, with the new driver, and I did not crash even once.
Suspicious.
Today's crash was different. First, the game graphics froze, but the mouse kept working, and voice chat and audio were working correctly until it all froze and crashed about 30s later.
@rugged canopy still facing this?
I did post new crash 2 days ago?
working now or what
It's different than your crash.
ok bro.
Still seems like a GPU Timeout
```
[2025.03.01-08.16.52:363][259]LogD3D12RHI: Error: CurrentQueue.Fence.D3DFence->GetCompletedValue() failed
at C:\TheoryCraft\build-staging\Engine\Source\Runtime\D3D12RHI\Private\D3D12Submission.cpp:1013
with error DXGI_ERROR_DEVICE_REMOVED with Reason: DXGI_ERROR_DEVICE_HUNG
```
@gentle pumice I wonder, is there a possibility of introducing GPU re-hooking in the case of gpu crash?
Some games do it.
After gpu crashing, the application does not close itself, but rather re-hooks to the new instance (?) of the gpu that reboots itself after the driver crash, and continues normally after.
For example - Heroes of the Storm. My gpu can crash there, but in 8/10 of cases the game does not close. It uses the after-crash instance of GPU and continues working, except for some small visuals like cooldowns opacity.
It would almost solve the problem of the timeouts by reducing the time to reconnect to the game by 99% (steam keeps thinking SV is running, so you have to restart steam, it takes time...)
I haven't crashed even once since servers are up, that's 7 hours 
Never mind, it crashed after 9h. That must be something with hardware acceleration.
/edit:
I'm crashing quite frequently now :(
Heya,
About re-hooking the GPU
I think it's very unlikely that we would have the bandwidth and resources to implement that on our own, at least not at the current team size and resources.
Unreal Engine's RHI layer is massive, and the DX12 layer especially is "fat" in complexity. UE is also not great at the moment at resetting state, doing smart reloads, etc. Even for simple state changes, the solution is to "restart the client", which is a bit absurd at times.
We would hope that Epic adds more robustness to the DX12 RHI layer to support restoring state and re-hooking after a crash, and on our side we can make sure SUPERVIVE is able to easily merge the latest changes made to Unreal Engine
Thank you for the explanation!
I'll need to make a .bat script to kill steam background processes to make restarting quicker then :)
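A rough idea of what such a script could do, written in Python instead of a .bat file. It is only a sketch: the process names in `STEAM_PROCESSES` are assumptions (check Task Manager for what actually lingers after a crash), and the helper names are made up for illustration. `taskkill` and its `/F`, `/T` and `/IM` switches are the standard Windows ones.

```python
import subprocess

# Assumed process names; adjust to whatever is still running after a crash.
STEAM_PROCESSES = ["steam.exe", "steamwebhelper.exe"]


def taskkill_commands(process_names=STEAM_PROCESSES):
    """Build one taskkill command per process name (Windows only)."""
    # /F forces termination, /T also ends child processes,
    # /IM selects the process by image name.
    return [["taskkill", "/F", "/T", "/IM", name] for name in process_names]


def kill_steam(dry_run=True):
    """Print the commands, and run them only when dry_run is False."""
    for command in taskkill_commands():
        print(" ".join(command))
        if not dry_run:
            # check=False: taskkill exits non-zero if the process is not running.
            subprocess.run(command, check=False)
```

With the default `dry_run=True` this only prints what would be executed, which makes it safe to try before letting it actually kill anything.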