Here I’m playing “Spelunky 2” on my laptop and simultaneously replaying the same Vulkan calls on an ARM board with an Adreno GPU running the open source Turnip Vulkan driver. Hint: it’s an x64 Windows game that doesn’t run on ARM.
The bottom right is the game I’m playing on my laptop, the top left is GFXReconstruct immediately replaying Vulkan calls from the game on the ARM board.
How is it done? And why would it be useful for debugging? Read below!
Debugging issues a driver faces with real-world applications requires the ability to capture and replay graphics API calls. For mobile GPUs this is even more challenging, because the main “source” of real-world workloads for a Vulkan driver is x86-64 apps that run via Wine + DXVK, mainly games that were made for desktop x86-64 Windows and do not run on ARM. Efforts are being made to run these apps on ARM, but that is still a work in progress. And we want to test the drivers NOW.
The obvious solution would be to run those applications on an x86-64 machine, capture all Vulkan calls, and then replay those calls on a second machine where we cannot run the app. This way it would be possible to test the driver even without running the application directly on it.
The main trouble is that Vulkan calls made on one GPU + driver combo are generally not compatible with another GPU + driver combo, sometimes even within one GPU vendor. There are different memory capabilities (VkPhysicalDeviceMemoryProperties), different memory requirements for buffers and images, different extensions available, and different optional features supported. It is easier with OpenGL, but there are some incompatibilities there too.
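To make the memory part concrete, here is a small Python sketch of the usual way an application picks a memory type: scan the device's memory types for the first one that is allowed by the resource and has the wanted property flags. The flag values come from the Vulkan specification, but the two memory type tables are invented for illustration and do not describe any real GPU. The same request lands on a different index on each table, which is why memory type indices recorded in a capture cannot simply be reused on another device.

```python
# Vulkan memory property flag bits (values from the Vulkan specification).
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2
HOST_COHERENT = 0x4
HOST_CACHED = 0x8


def find_memory_type(memory_types, type_bits, required):
    """Return the first memory type index that is allowed by type_bits and
    whose property flags contain every bit in required, or None."""
    for index, flags in enumerate(memory_types):
        allowed = type_bits & (1 << index)
        if allowed and (flags & required) == required:
            return index
    return None


# Hypothetical memory type tables, invented for illustration only.
DESKTOP_TYPES = [
    DEVICE_LOCAL,
    HOST_VISIBLE | HOST_COHERENT,
    HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,
]
MOBILE_TYPES = [
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,
]

wanted = HOST_VISIBLE | HOST_COHERENT
desktop_index = find_memory_type(DESKTOP_TYPES, 0b1111, wanted)
mobile_index = find_memory_type(MOBILE_TYPES, 0b11, wanted)
print("desktop index:", desktop_index, "mobile index:", mobile_index)
```

A replayer that blindly reused the desktop index on the second table would pick a type with different properties, or an index that does not exist at all.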
There are two open-source vendor-agnostic tools for capturing Vulkan calls: RenderDoc (captures a single frame) and GFXReconstruct (captures multiple frames). RenderDoc at the moment isn’t suitable for the task of capturing applications on desktop GPUs and replaying on mobile because it doesn’t translate memory types and requirements (see issue #814). GFXReconstruct, on the other hand, has the necessary features for this.
I’ll show a couple of tricks with GFXReconstruct I’m using to test things on Turnip.
Capturing with GFXReconstruct
At this point you either have the application itself or, if it doesn’t use Vulkan, a trace of its calls that could be translated to Vulkan. There are detailed instructions on how to use GFXReconstruct to capture a trace on a desktop OS. There are no clear instructions on how to do this on Android (see issue #534), but fortunately Android’s documentation has some:
Android how-to

For Android 9 you should copy layers to the application which will be traced.
For Android 10+ it's easier to copy them to com.lunarg.gfxreconstruct.replay.
You should have a userdebug build of Android or probably a rooted Android.

# Push GFXReconstruct layer to the device
adb push libVkLayer_gfxreconstruct.so /sdcard/

# Since there is no APK for the capture layer,
# copy the layer to e.g. the folder of com.lunarg.gfxreconstruct.replay
adb shell run-as com.lunarg.gfxreconstruct.replay cp /sdcard/libVkLayer_gfxreconstruct.so .

# Enable layers
adb shell settings put global enable_gpu_debug_layers 1

# Specify target application
adb shell settings put global gpu_debug_app <package_name>

# Specify layer list (from top to bottom)
adb shell settings put global gpu_debug_layers VK_LAYER_LUNARG_gfxreconstruct

# Specify packages to search for layers
adb shell settings put global gpu_debug_layer_app com.lunarg.gfxreconstruct.replay
If the target application doesn’t have rights to write into external storage, you should change where the capture file is created:
adb shell "setprop debug.gfxrecon.capture_file '/data/data/<target_app_folder>/files/'"
However, replaying the trace you captured on another GPU will most likely result in an error:
[gfxrecon] FATAL - API call vkCreateDevice returned error value VK_ERROR_EXTENSION_NOT_PRESENT that does not match the result from the capture file: VK_SUCCESS. Replay cannot continue.
Replay has encountered a fatal error and cannot continue: the specified extension does not exist
Or other errors/crashes. Fortunately, we can limit the capabilities of the desktop GPU with VK_LAYER_LUNARG_device_simulation.
When simulating another GPU, VK_LAYER_LUNARG_device_simulation should be told to intersect the capabilities of both GPUs, making the capture compatible with both of them. This can be achieved with recently added environment variables:
VK_DEVSIM_MODIFY_EXTENSION_LIST=whitelist
VK_DEVSIM_MODIFY_FORMAT_LIST=whitelist
VK_DEVSIM_MODIFY_FORMAT_PROPERTIES=whitelist
The name whitelist is rather confusing because it essentially means “intersection”.
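As a rough picture of what “intersection” means for the extension list, here is a tiny Python sketch. It is not how the layer is implemented, and both extension lists are invented for illustration: the application ends up seeing only the extensions that both the capturing GPU and the simulated target expose.

```python
def intersect_extensions(host_extensions, target_extensions):
    """Keep only the extensions that both devices expose, sorted by name."""
    return sorted(set(host_extensions) & set(target_extensions))


# Hypothetical extension lists, not taken from any real driver.
host = [
    "VK_KHR_swapchain",
    "VK_KHR_maintenance1",
    "VK_EXT_transform_feedback",
    "VK_NV_mesh_shader",
]
target = [
    "VK_KHR_swapchain",
    "VK_KHR_maintenance1",
    "VK_EXT_transform_feedback",
    "VK_QCOM_render_pass_transform",
]

common = intersect_extensions(host, target)
for name in common:
    print(name)
```

Anything the application enables from that reduced list is then available on both machines, so vkCreateDevice has no reason to fail with VK_ERROR_EXTENSION_NOT_PRESENT during replay.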
One also needs a JSON file that describes the target GPU’s capabilities. It can be obtained by running the following on the target device:
vulkaninfo -j &> <device_name>.json
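One thing to watch for: &> sends stderr to the same file as stdout, so if anything prints a warning while vulkaninfo runs, it ends up inside the JSON. A quick sanity check before capturing could look like the sketch below. The function name is my own, and it only checks that the file parses as a JSON object; it assumes nothing about what the device simulation layer expects inside.

```python
import json


def load_device_profile(path):
    """Parse the vulkaninfo output and fail loudly if it is not a JSON object."""
    with open(path, "r", encoding="utf-8") as handle:
        try:
            profile = json.load(handle)
        except json.JSONDecodeError as error:
            raise ValueError(f"{path} is not valid JSON: {error}") from error
    if not isinstance(profile, dict):
        raise ValueError(f"{path} does not contain a JSON object at the top level")
    return profile
```

If this raises, opening the file usually shows stray text before or after the JSON that can be removed by hand.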
The final command to capture a trace would be:
VK_LAYER_PATH=<path/to/device-simulation-layer>:<path/to/gfxreconstruct-layer> \
VK_INSTANCE_LAYERS=VK_LAYER_LUNARG_gfxreconstruct:VK_LAYER_LUNARG_device_simulation \
VK_DEVSIM_FILENAME=<device_name>.json \
VK_DEVSIM_MODIFY_EXTENSION_LIST=whitelist \
VK_DEVSIM_MODIFY_FORMAT_LIST=whitelist \
VK_DEVSIM_MODIFY_FORMAT_PROPERTIES=whitelist \
<the_app>
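If you capture often, the same setup can be wrapped in a small launcher. The Python sketch below only sets the environment variables shown above and starts the application; the function names and parameters are mine and purely illustrative.

```python
import os
import subprocess


def build_capture_env(devsim_layer_dir, gfxrecon_layer_dir, device_json, base_env=None):
    """Return a copy of the environment with the capture variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        # os.pathsep is ":" on Linux, matching the command line above.
        "VK_LAYER_PATH": os.pathsep.join([devsim_layer_dir, gfxrecon_layer_dir]),
        "VK_INSTANCE_LAYERS": "VK_LAYER_LUNARG_gfxreconstruct:VK_LAYER_LUNARG_device_simulation",
        "VK_DEVSIM_FILENAME": device_json,
        "VK_DEVSIM_MODIFY_EXTENSION_LIST": "whitelist",
        "VK_DEVSIM_MODIFY_FORMAT_LIST": "whitelist",
        "VK_DEVSIM_MODIFY_FORMAT_PROPERTIES": "whitelist",
    })
    return env


def run_capture(command, env):
    """Start the application with the capture environment and return its exit code."""
    return subprocess.run(command, env=env, check=False).returncode


# Example call (all paths and the command are placeholders):
# env = build_capture_env("<path/to/device-simulation-layer>",
#                         "<path/to/gfxreconstruct-layer>",
#                         "<device_name>.json")
# run_capture(["<the_app>"], env)
```

Passing the environment explicitly keeps the layers out of the shell session, so other Vulkan programs started from the same terminal are not captured by accident.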
Replaying with GFXReconstruct
gfxrecon-replay -m rebind --skip-failed-allocations <trace_name>.gfxr
-m: Enable memory translation for replay on GPUs with memory types that are not compatible with the capture GPU’s.
rebind: Change memory allocation behavior based on resource usage and replay memory properties. Resources may be bound to different allocations with different offsets.
--skip-failed-allocations: Skip vkAllocateMemory, vkAllocateCommandBuffers, and vkAllocateDescriptorSets calls that failed during capture.
Without these options replay would fail.
Now you could easily test any app/game on your ARM board, if you have enough RAM =) I even successfully ran a capture of “Metro Exodus” on Turnip.
But what if I want to test something that requires interactivity?
Or what if you don’t want to save a huge trace on disk, which could grow to tens of gigabytes if the application runs for a considerable amount of time?
During recording, GFXReconstruct just appends calls to a file; there are no additional post-processing steps. Given that, the next logical step is to skip writing to disk and send the Vulkan calls over the network!
This would allow us to interact with the application and immediately see the results on another device with a different GPU. And so I hacked together crude support for over-the-network replay.
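The idea itself is simple enough to show in a few lines of Python. This is only an illustration of “append to a socket instead of a file”, not the actual GFXReconstruct change: the length-prefixed framing and the payloads are made up, and a local socket pair stands in for the two machines.

```python
import socket
import struct
import threading


def send_blocks(sock, blocks):
    """Capture side: write each block with a 4-byte length prefix."""
    for block in blocks:
        sock.sendall(struct.pack("<I", len(block)) + block)
    sock.shutdown(socket.SHUT_WR)  # tell the replay side the trace has ended


def recv_exact(sock, size):
    """Read exactly size bytes; return None if the stream ends first."""
    data = bytearray()
    while len(data) < size:
        chunk = sock.recv(size - len(data))
        if not chunk:
            return None
        data.extend(chunk)
    return bytes(data)


def recv_blocks(sock):
    """Replay side: yield blocks as they arrive until the sender closes."""
    while True:
        header = recv_exact(sock, 4)
        if header is None:
            return
        (size,) = struct.unpack("<I", header)
        payload = recv_exact(sock, size)
        if payload is None:
            raise EOFError("stream ended in the middle of a block")
        yield payload


# Stand-in payloads; a real capture would carry serialized API calls.
blocks = [b"vkCreateInstance", b"vkCreateDevice", b"vkQueueSubmit"]
capture_end, replay_end = socket.socketpair()
sender = threading.Thread(target=send_blocks, args=(capture_end, blocks))
sender.start()
received = list(recv_blocks(replay_end))
sender.join()
capture_end.close()
replay_end.close()
print("blocks match:", received == blocks)
```

Because the receiver handles each block as soon as it arrives, nothing has to be stored on disk and the replay follows the application with only network latency in between.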
The only difference from ordinary tracing is that now, instead of a file, we have to specify the network address of the target device:
VK_LAYER_PATH=<path/to/device-simulation-layer>:<path/to/gfxreconstruct-layer> \
...
GFXRECON_CAPTURE_FILE="<ip>:<port>" \
<the_app>
And on the target device:
while true; do gfxrecon-replay -m rebind --sfa ":<port>"; done
Why while true? It is common for DXVK to call vkCreateInstance several times, which leads to the creation of several traces. When replaying over the network we therefore want gfxrecon-replay to restart immediately when one trace ends, so that it is ready for the next one.
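The same restart loop can be written in Python if you want a bit more control, for example a cap on the number of restarts or a short pause between them. This is just a sketch: the helper names and the max_runs and pause parameters are my own, and the ":<port>" argument only works with the hacked network-replay build described above.

```python
import subprocess
import time


def replay_command(port):
    """Build the replay command line used on the target device."""
    return ["gfxrecon-replay", "-m", "rebind", "--sfa", f":{port}"]


def replay_forever(port, run=subprocess.call, max_runs=None, pause=0.0, sleep=time.sleep):
    """Restart the replayer every time it exits.

    max_runs=None loops forever like the shell one-liner; a number stops
    after that many runs. Returns how many times the replayer was started.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        run(replay_command(port))  # blocks until this trace ends
        runs += 1
        if pause:
            sleep(pause)  # optional breather if the replayer exits instantly
    return runs
```

Calling replay_forever(<port>) with the defaults behaves like the one-liner; the run parameter exists only so the loop can be exercised without starting a real process.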
stranglevk -f 10
You have seen the result at the start of the post.