I went through a lot of Vulkan tutorials, and each of them talks about fences, which is good, but none of them clearly explains what exactly happens on the GPU. It is so frustrating trying to learn Vulkan's synchronization...
As far as I understand, I have a command buffer that I submit for execution with vkQueueSubmit, and I can attach a fence to that submit.
The fence is signaled when all command buffers in the submit have finished executing. So on the CPU I can poll the fence with vkGetFenceStatus, or block on it with vkWaitForFences, to find out whether the submit is done.
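For reference, this is roughly how I check a fence on the CPU side. It is only a sketch: `device` and `fence` are assumed to be valid handles created elsewhere, and the helper names `fence_is_signaled` and `wait_for_fence` are made up for this post.

```c
#include <stdbool.h>
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Non-blocking check: returns true once the fence has been signaled.
 * vkGetFenceStatus returns VK_SUCCESS (signaled), VK_NOT_READY (unsignaled),
 * or an error such as VK_ERROR_DEVICE_LOST, which this sketch simply
 * treats as "not signaled". */
bool fence_is_signaled(VkDevice device, VkFence fence)
{
    return vkGetFenceStatus(device, fence) == VK_SUCCESS;
}

/* Blocking check: waits up to timeout_ns nanoseconds for the fence.
 * Returns VK_SUCCESS if it was signaled, VK_TIMEOUT if the timeout
 * expired first, or an error code. */
VkResult wait_for_fence(VkDevice device, VkFence fence, uint64_t timeout_ns)
{
    /* One fence, so the waitAll flag (VK_TRUE) makes no difference here. */
    return vkWaitForFences(device, 1, &fence, VK_TRUE, timeout_ns);
}
```

Both calls only tell the CPU whether the fence has been signaled; they say nothing about what the GPU does in the meantime, which is what the rest of this question is about.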
But what happens to the GPU execution flow? Let's take a look at the following scenarios:
Scenario 1:
Suppose I have recorded multiple commandBuffers and submit them in a single operation.
- vkQueueSubmit(commandBuffer1, commandBuffer2, commandBuffer3, commandBuffer4, fence1)
On the GPU, the queue is filled with commandBuffer1 to commandBuffer4. I have read that command buffers start executing in submission order, so commandBuffer1 starts before commandBuffer2, commandBuffer2 before commandBuffer3, and so on. But the GPU can run them in parallel, so it is not guaranteed that commandBuffer1 has finished its work before commandBuffer2 starts executing. It is therefore also possible that commandBuffer2 finishes its work before commandBuffer1 does.
fence1 is signaled when commandBuffer1, 2, 3 and 4 have all finished their work (see Figure 1).
Figure 1: Example of the execution of commands within a GPU with a fence
Scenario 2:
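To make my mental model of Scenario 1 concrete, here is a tiny toy model in plain C (no Vulkan involved). The timings are made up, in arbitrary time units, and only chosen so that the start order is 1, 2, 3, 4 while the finish order is 2, 4, 1, 3. Under my assumption that the single fence signals once every command buffer in the submit has finished, the fence time is simply the latest finish time.

```c
#include <stddef.h>

/* Hypothetical timeline entry for one command buffer (arbitrary time units). */
typedef struct {
    double start;
    double finish;
} CmdBufTiming;

/* Scenario 1 as I understand it: the single fence signals once every
 * command buffer in the submit has finished, i.e. at the latest finish time. */
double fence_signal_time(const CmdBufTiming *timings, size_t count)
{
    double latest = 0.0;
    for (size_t i = 0; i < count; ++i) {
        if (timings[i].finish > latest) {
            latest = timings[i].finish;
        }
    }
    return latest;
}

/* Made-up timings: starts are ordered, finishes are not. */
static const CmdBufTiming example_timings[4] = {
    { 0.0, 6.0 }, /* commandBuffer1 */
    { 1.0, 3.0 }, /* commandBuffer2 */
    { 2.0, 8.0 }, /* commandBuffer3 */
    { 3.0, 5.0 }, /* commandBuffer4 */
};
```

With these numbers, `fence_signal_time(example_timings, 4)` gives 8.0, the moment commandBuffer3 finishes, even though commandBuffer2 was already done at 3.0.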
As in Scenario 1, we have 4 command buffers to submit. This time, however, they are submitted in separate vkQueueSubmit calls, each with its own fence.
- vkQueueSubmit(commandBuffer1, fence1)
- vkQueueSubmit(commandBuffer2, fence2)
- vkQueueSubmit(commandBuffer3, fence3)
- vkQueueSubmit(commandBuffer4, fence4)
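The calls in my lists are shorthand. With the real vkQueueSubmit signature, the two scenarios would look roughly like the sketch below. It assumes that `queue`, the command buffers and the fences are valid handles created elsewhere, that the fences are unsignaled, and that no semaphores are involved. The helper names `submit_single_batch` and `submit_separately` are made up for this post.

```c
#include <stdint.h>
#include <vulkan/vulkan.h>

/* Scenario 1: one vkQueueSubmit call with one batch that holds all
 * command buffers, plus a single fence for the whole submit. */
VkResult submit_single_batch(VkQueue queue, const VkCommandBuffer *cmdBufs,
                             uint32_t count, VkFence fence)
{
    VkSubmitInfo info = {0};            /* no wait or signal semaphores */
    info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    info.commandBufferCount = count;
    info.pCommandBuffers = cmdBufs;
    return vkQueueSubmit(queue, 1, &info, fence);
}

/* Scenario 2: one vkQueueSubmit call per command buffer, each with
 * its own fence. Stops at the first call that fails. */
VkResult submit_separately(VkQueue queue, const VkCommandBuffer *cmdBufs,
                           const VkFence *fences, uint32_t count)
{
    for (uint32_t i = 0; i < count; ++i) {
        VkSubmitInfo info = {0};        /* no wait or signal semaphores */
        info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
        info.commandBufferCount = 1;
        info.pCommandBuffers = &cmdBufs[i];

        VkResult result = vkQueueSubmit(queue, 1, &info, fences[i]);
        if (result != VK_SUCCESS) {
            return result;
        }
    }
    return VK_SUCCESS;
}
```

In both cases the same four command buffers reach the same queue in the same order; the only difference is how many submit calls and fences are used.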
The start of execution is still ordered, so commandBuffer1 starts before commandBuffer2... same goes for commandBuffer3 and commandBuffer4.
But what happens on the GPU? Is commandBuffer2 blocked by the fence until commandBuffer1 is finished? Or does a fence not block at all?
If fences do not block GPU commands: with the timings from Figure 1, does this mean that fence2 signals first, then fence4, then fence1, then fence3?
