RenderDoc ‘Ref All Resources’ + Vulkan + Memory Aliasing = Beware


By default RenderDoc only downloads resources used by the frame being captured.

Resources not used during that frame are not saved.

Unless ‘Ref All Resources’ is enabled, that is.

I recently fixed a bug in Ogre where enabling Ref All Resources would randomly corrupt GPU memory after a capture: in my application, in RenderDoc’s capture, or both.

At first I thought it might be a RenderDoc bug (and I went kinda insane chasing it, even though there was a high chance the bug was of my own doing). I mean, the game ran 100% fine without RenderDoc. But it would randomly corrupt after hitting F12 when run with RenderDoc.

Sometimes textures would become faceted or show the wrong tiles. Other times the vertex buffers would contain random NaNs or garbage, resulting in exploding triangles.

The heightmap is completely corrupted. The terrain’s normal map is halfway there. Sometimes the corruption was mild, but it was there.

Furthermore it would easily reproduce under both AMDVLK and RADV on Linux, but it did not seem to repro on Windows 10.

That is, until I discovered that Ref All Resources was related to the corruption, and then noticed stray VkImages and VkImageViews in the texture list that should’ve been destroyed a long time ago. In fact, their contents were complete garbage.

These VkImages were still bound to valid memory ranges, but internally we had already reassigned their backing memory to other resources. In Vulkan terms, these resources were aliased and their contents were in an undefined state.
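To make the aliasing condition concrete, here is a minimal sketch of how an engine could detect it on its own side. It is not Ogre or RenderDoc code: `BoundRange` and `findAliasedPairs` are hypothetical names, handles are stored as plain integers, and it only checks for raw byte overlap inside the same memory object, ignoring the finer points of the Vulkan spec.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical record of one binding (not an Ogre or RenderDoc type).
struct BoundRange {
    std::uint64_t resource;  // VkImage / VkBuffer handle stored as an integer
    std::uint64_t memory;    // VkDeviceMemory handle the resource is bound to
    std::uint64_t offset;    // byte offset used when binding
    std::uint64_t size;      // size taken from the memory requirements
};

// Returns every pair of resources whose byte ranges overlap in the same memory object.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
findAliasedPairs(std::vector<BoundRange> ranges) {
    // Sort by memory object, then by offset, so overlapping ranges become neighbours.
    std::sort(ranges.begin(), ranges.end(),
              [](const BoundRange &a, const BoundRange &b) {
                  if (a.memory != b.memory) return a.memory < b.memory;
                  return a.offset < b.offset;
              });

    std::vector<std::pair<std::uint64_t, std::uint64_t>> aliased;
    for (std::size_t i = 0; i < ranges.size(); ++i) {
        assert(ranges[i].size > 0);
        const std::uint64_t end = ranges[i].offset + ranges[i].size;
        for (std::size_t j = i + 1; j < ranges.size(); ++j) {
            // Stop at the first range that lives elsewhere or starts past our end.
            if (ranges[j].memory != ranges[i].memory || ranges[j].offset >= end) break;
            aliased.emplace_back(ranges[i].resource, ranges[j].resource);
        }
    }
    return aliased;
}
```

Feeding it every resource that is still alive (not just the ones the engine believes it is using) is the point: a leaked VkImage would show up paired with whatever now occupies its old memory.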

It turns out that when RenderDoc performs a resource transition to download these VkImages, the transition corrupts whatever other resources alias the same memory.

In this case, it was an Ogre bug (basically our API handles were leaking). But there are legit cases where this would be a problem. Thus if you alias resources: Beware.

I contacted Baldur about this problem and he is unsure how to fix it. Checking aliasing ranges between resources is an N! process and could result in false positives. I guess doing a raw memcpy backup of the whole memory before performing transitions and then restoring it (after doing every transition?) may be a possible solution; but it would double VRAM consumption and cause slower captures.

So if you get corruption on Vulkan after captures, check whether you’re leaking handles or have purposely aliased resources.
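For the leaked-handle case, a minimal sketch of a debug-only registry is shown here. `LiveHandleRegistry` and its methods are hypothetical, not part of Ogre, Vulkan or RenderDoc; the idea is simply to call `onCreate` next to every vkCreateImage / vkCreateImageView and `onDestroy` next to the matching destroy call, then see what is left over.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical debug helper: remembers which API handles are still alive.
class LiveHandleRegistry {
public:
    // Call right after the handle is created; the label is any debug name.
    void onCreate(std::uint64_t handle, std::string label) {
        mLive[handle] = std::move(label);
    }

    // Call right before the handle is destroyed.
    // Returns false if the handle was unknown (double destroy or never registered).
    bool onDestroy(std::uint64_t handle) {
        return mLive.erase(handle) == 1;
    }

    // Labels of everything that was created but never destroyed, ordered by handle value.
    std::vector<std::string> liveLabels() const {
        std::vector<std::string> labels;
        for (const auto &entry : mLive) labels.push_back(entry.second);
        return labels;
    }

private:
    std::map<std::uint64_t, std::string> mLive;
};
```

Dumping `liveLabels()` at a point where you expect a resource to be gone (for example after unloading a terrain) should make stray handles like the ones described above stand out.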