Explicit Vulkan

Vulkan attempts to turn as much of the problematic implicit behaviour of older APIs as possible into explicit application choices: there are no longer any implicit dependencies, and memory allocation is handled entirely by the application. All of the explicit portions of Vulkan are needed in combination to achieve this, as no single part solves all the listed issues in isolation.

Explicit Allocation

Allocating a resource in Vulkan is significantly more explicit than it was in OpenGL ES. The sequence of commands required to create an image and allocate memory for it, for example, is as follows:

  1. Create an image object with the desired format, size, etc.
  2. Query the image object for its memory requirements
  3. Pick a suitable memory type/heap to allocate from
  4. Allocate that memory
  5. Bind the memory to the image
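
As an illustration of step 3, the following is a minimal sketch of how an application might pick a memory type. It assumes the usual Vulkan convention that the memory requirements query returns a bitmask with one bit per permitted memory type, and that each memory type advertises a set of property flags. The `MemoryType` and `MemoryProperties` structs and the `find_memory_type` function are hypothetical stand-ins so that the example compiles without the Vulkan headers; in a real application the same loop would read `VkMemoryRequirements::memoryTypeBits` and `VkPhysicalDeviceMemoryProperties`.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Vulkan types, so the sketch builds without <vulkan/vulkan.h>.
   The flag values mirror VK_MEMORY_PROPERTY_*_BIT from the Vulkan headers. */
enum {
    MEM_DEVICE_LOCAL  = 0x1, /* VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT  */
    MEM_HOST_VISIBLE  = 0x2, /* VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT  */
    MEM_HOST_COHERENT = 0x4  /* VK_MEMORY_PROPERTY_HOST_COHERENT_BIT */
};

#define MAX_MEMORY_TYPES 32u

typedef struct {
    uint32_t property_flags; /* mirrors VkMemoryType::propertyFlags */
    uint32_t heap_index;     /* mirrors VkMemoryType::heapIndex     */
} MemoryType;

typedef struct {
    uint32_t   type_count;
    MemoryType types[MAX_MEMORY_TYPES];
} MemoryProperties;

/* Returns the index of the first memory type that is both permitted by
   type_bits (bit i set means type i is allowed for this resource) and has
   every flag in `required`. Returns -1 if no such type exists. */
int32_t find_memory_type(const MemoryProperties *props,
                         uint32_t type_bits,
                         uint32_t required)
{
    assert(props->type_count <= MAX_MEMORY_TYPES);
    for (uint32_t i = 0; i < props->type_count; ++i) {
        int allowed  = (type_bits & (1u << i)) != 0;
        int suitable = (props->types[i].property_flags & required) == required;
        if (allowed && suitable) {
            return (int32_t)i;
        }
    }
    return -1;
}
```

Returning the first match is a simplification; an application may prefer to rank candidates, for example favouring device-local memory for images that are never mapped.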

Most of these steps are not normally trivial, as large memory objects should be allocated and portions of them bound to each resource. On certain platforms, there is even a limit on the number of memory objects that can be created, a limit that could be hit before the system is out of memory.
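
One common way to stay under that limit is to sub-allocate: make a single large allocation and hand out aligned offsets within it. The sketch below is a simple linear ("bump") sub-allocator, offered as an assumption about how an application might do this rather than as a prescribed method. `Arena`, `align_up` and `arena_alloc` are hypothetical names. The sketch relies on the alignment being a power of two, which Vulkan guarantees for the alignment reported in an image's memory requirements.

```c
#include <assert.h>
#include <stdint.h>

/* Bookkeeping for one large device memory allocation (hypothetical type). */
typedef struct {
    uint64_t size;   /* total size of the large allocation, in bytes */
    uint64_t offset; /* first byte that has not been handed out yet  */
} Arena;

#define SUBALLOC_FAILED UINT64_MAX

/* Rounds value up to the next multiple of alignment (a power of two). */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Reserves `size` bytes at the required alignment and returns the offset
   within the large allocation, or SUBALLOC_FAILED if it does not fit.
   The arena is left unchanged on failure. */
uint64_t arena_alloc(Arena *arena, uint64_t size, uint64_t alignment)
{
    uint64_t start = align_up(arena->offset, alignment);
    if (start > arena->size || size > arena->size - start) {
        return SUBALLOC_FAILED;
    }
    arena->offset = start + size;
    return start;
}
```

In a real application the returned offset would be passed as the `memoryOffset` argument of `vkBindImageMemory`, with the size and alignment taken from the image's memory requirements. A linear allocator never reuses freed space, so it suits resources that share a lifetime; longer-lived applications typically need a free list or a similar scheme.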

Explicit Data Transfer

In addition to resource allocation, actually uploading data to resources is also more explicit in Vulkan than in OpenGL ES. Uploading data to resources is now performed either through a direct memory map operation for resources with linear memory, or via a copy from another resource for non-linear memory. For example, to upload a texture, the following chain of operations is typically performed:

  1. Create and bind memory to a linear staging buffer
  2. Persistently map that memory to the host CPU
  3. Read texture data from disk, directly into that mapped memory
  4. Submit a command to copy data from the buffer to the non-linear final image

In addition to these operations, there is a large amount of synchronisation to handle, as device and host operations complete asynchronously. If a copy from the buffer is attempted while data is still being written to it, or more data is written to it while an earlier copy is still in flight, synchronisation objects must be used to coordinate the operations. There is no real scope for ghosting, as data hazards and flushing are explicitly exposed in Vulkan.

Explicit Dependencies and Synchronisation

Vulkan requires some form of explicit synchronisation for almost every operation. The API provides few implicit ordering guarantees, even between individual commands in a command buffer. Different architectures process work in different orders. For example, Tile-Based Deferred Renderers (TBDR GPUs) execute all vertex processing before any fragment processing, while Immediate Mode Renderers (IMR GPUs) will pipeline everything together.

By explicitly stating that there is no guaranteed order between operations, Vulkan allows the implementation to run as fast as possible while still giving the user the opportunity to explicitly ask for a guaranteed order if they require it. This is in contrast to OpenGL ES, where an application could assume that it would get everything in a guaranteed order, and an implementation had to try to pick out any work that could potentially be optimised. The same is true of CPU-to-GPU work; Vulkan gives the host the ability to wait on events from the GPU, and also some ability to directly trigger events for the GPU.

Explicit Work Submission

Queue submission is separated from command generation. Vulkan guarantees that no drawing work will begin when you call a draw function; all that happens is that draw calls get recorded into a command buffer. Once the command buffers are recorded, the application controls exactly when the work starts by submitting them to a queue. No GPU work can happen on a command buffer before that point, and the application can also receive explicit notification of when that work has finished.

Explicit Render Pass Delimiting

A large number of implicit flushes in OpenGL ES, particularly for tiling architectures, are caused by operations that occur when a tiler would otherwise be mid-render. Vulkan explicitly delimits where render passes begin and end, with the information being used by a tile-based GPU to handle transitioning data to and from the framebuffer.

The API forbids any operation inside a render pass that would cause a flush. This is helpful in that it makes application developers consider what they are doing when they introduce a dependency. More importantly, it means there is only one place where an application specifies tile load/store operations, so a tile-based GPU has all the information it requires to perform the transition as efficiently as possible. More details can be found in Architecture Positive, where render pass structure is discussed, as well as other changes to the API.