Key Differences Between OpenGL ES and Vulkan

This section describes the more important differences in the design of OpenGL ES and Vulkan.

User-managed memory allocation

In OpenGL ES, a developer can only specify the amount of memory required and give certain hints about how it will be used.

In Vulkan, the developer can request large memory blocks from the driver and then sub-allocate from them as required. This allows finer-grained control over memory allocation and better resource lifetime tracking, but it means the developer must take responsibility for freeing any allocated memory.

Vulkan memory is split up into two categories: host memory and device memory.

Host memory is the memory needed by the application for non-device-visible storage. Vulkan allows the application to supply a custom host allocator for allocations in this memory. The allocator is optional; if it is not supplied, Vulkan takes care of these allocations itself.

Device memory is the memory visible to the device. This is where opaque resources such as images and buffers reside. In Vulkan, the number of device allocations is limited by the driver (the limit is reported as maxMemoryAllocationCount in the physical device limits). The recommended way of allocating buffers and images is therefore to back several images or buffers with a single allocation. Vulkan also provides resource pools for allocating resources such as command buffers and descriptor sets. The actual contents of these resources are written indirectly by the driver.
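The idea of backing several resources with one allocation can be sketched as a simple linear sub-allocator. This is a minimal model under stated assumptions, not a Vulkan implementation: LinearBlock and SubAllocation are hypothetical names, and only offsets are tracked. In a real application the size and alignment would come from a query such as vkGetBufferMemoryRequirements, and the returned offset would be passed to a call such as vkBindBufferMemory.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of one large device allocation. Only offsets are
// tracked; no real memory is touched.
struct SubAllocation {
    std::uint64_t offset; // byte offset inside the large block
    std::uint64_t size;   // size of the sub-allocation in bytes
};

class LinearBlock {
public:
    explicit LinearBlock(std::uint64_t capacity) : capacity_(capacity) {}

    // Reserve `size` bytes aligned to `alignment` (must be a power of two).
    // Returns std::nullopt when the block has no room left.
    std::optional<SubAllocation> allocate(std::uint64_t size, std::uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uint64_t aligned = (head_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned) return std::nullopt;
        head_ = aligned + size;
        return SubAllocation{aligned, size};
    }

    // Release every sub-allocation at once; the large block itself is kept.
    void reset() { head_ = 0; }

    // Number of bytes consumed so far, including alignment padding.
    std::uint64_t used() const { return head_; }

private:
    std::uint64_t capacity_;
    std::uint64_t head_ = 0;
};
```

A linear scheme like this can only free everything at once. It is chosen here for brevity; a production allocator would typically also support freeing individual sub-allocations.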

Explicit data transfers

The performance of content-heavy OpenGL ES applications sometimes suffers from a phenomenon known as ghosting. Ghosting happens when a resource is used in multiple frames by the driver and the hardware while the application is continuously updating it. As the resource is still in use by the hardware, the driver has to create 'ghost' copies of the resource, which are invisible to the developer, so that it can be updated.

In Vulkan, developers have to explicitly manage allocated memory and synchronise memory accesses. This approach completely eliminates ghosting, but developers need to implement a method, such as a ring buffer with sufficient synchronisation, to allow for modifying objects that are being used by the hardware.
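One way to picture the ring buffer approach is to give each frame in flight its own region of a buffer and to reuse a region only once the GPU has finished with it. The sketch below is a hypothetical model: FrameRing and its member names are invented for illustration, and a plain boolean flag stands in for a real synchronisation primitive such as a fence.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model: one region of a buffer per frame in flight.
// gpuFinished_ stands in for a signalled fence; here it is just a flag.
template <std::size_t FramesInFlight>
class FrameRing {
public:
    explicit FrameRing(std::uint64_t regionSize) : regionSize_(regionSize) {
        gpuFinished_.fill(true); // nothing has been submitted yet
    }

    // On success, writes the byte offset the CPU may update for the new frame
    // and returns true. Returns false (leaving `offset` untouched) when the
    // GPU is still reading the next region, so the caller has to wait.
    bool tryBeginFrame(std::uint64_t& offset) {
        if (!gpuFinished_[next_]) return false;
        gpuFinished_[next_] = false; // region now belongs to the new frame
        offset = next_ * regionSize_;
        next_ = (next_ + 1) % FramesInFlight;
        return true;
    }

    // Called once the fence guarding `slot` has been observed as signalled.
    void markGpuFinished(std::size_t slot) {
        assert(slot < FramesInFlight);
        gpuFinished_[slot] = true;
    }

private:
    std::uint64_t regionSize_;
    std::array<bool, FramesInFlight> gpuFinished_{};
    std::size_t next_ = 0;
};
```

With two frames in flight, a third call to tryBeginFrame fails until the first region is marked as finished, which is the point at which an OpenGL ES driver would instead have created a ghost copy.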

Full control over resource lifespan

In OpenGL ES, a developer is only able to mark resources to be deleted, but the driver decides when to actually free the memory as the resource might still be in use by the GPU. This can result in spikes in frame times and unpredictable performance.

In Vulkan, the opposite is true: the developer has full control over when resources are freed. This results in predictable performance and fewer frame time spikes, but it can lead to errors if the application releases resources that are still in use by the GPU.
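A common way to avoid releasing a resource that is still in use is to postpone destruction until the last frame that referenced it has completed. The following is a minimal sketch of that idea; DeferredDeleter is a hypothetical helper, and the destroy callback stands in for whatever call actually frees the resource (for example vkDestroyBuffer followed by vkFreeMemory).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: destruction is postponed until the frame that last
// used the resource is known to have completed on the GPU.
class DeferredDeleter {
public:
    // `lastUsedFrame` is the index of the last frame whose commands reference
    // the resource. Assumes resources are retired in non-decreasing frame order.
    void retire(std::uint64_t lastUsedFrame, std::function<void()> destroy) {
        assert(pending_.empty() || pending_.back().frame <= lastUsedFrame);
        pending_.push_back({lastUsedFrame, std::move(destroy)});
    }

    // Runs every destroy callback whose frame is <= completedFrame and
    // returns how many resources were destroyed.
    std::size_t collect(std::uint64_t completedFrame) {
        std::size_t freed = 0;
        while (!pending_.empty() && pending_.front().frame <= completedFrame) {
            pending_.front().destroy();
            pending_.pop_front();
            ++freed;
        }
        return freed;
    }

private:
    struct Entry {
        std::uint64_t frame;
        std::function<void()> destroy;
    };
    std::deque<Entry> pending_;
};
```

Calling collect once per frame with the index of the most recently completed frame keeps the cost of freeing predictable, because the application decides exactly when it happens.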

Scaling to multiple cores

The single-threaded nature of older APIs like OpenGL ES has become more and more of a bottleneck over the years. There were many attempts to utilise multiple cores, such as driver-side multi-core work consumption, or deferred contexts in D3D11. However, these did not deliver enough of a performance benefit to justify the effort required to implement them.

Vulkan is designed to take advantage of multi-core architectures, providing many features that help distribute work over multiple cores. The API splits draw call submission into two parts: command buffer generation and command buffer submission. This allows the generation of individual command buffers to be distributed across multiple cores. After this is done, a single core can submit all of the generated command buffers. Command buffer submission is relatively cheap, so the serial part of this workload is minimised.
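The split between parallel generation and serial submission can be modelled without any graphics API at all. In the sketch below, CommandList, recordInParallel and submitAll are hypothetical stand-ins: a command buffer is reduced to a list of strings, and "submission" simply counts the recorded commands. In a real Vulkan application each recording thread would also need its own command pool, since command pools must not be used from several threads at once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorded command buffer: a list of command names.
using CommandList = std::vector<std::string>;

// Record `drawCount` draws split across `workerCount` threads, one list per
// thread. Each thread writes only to its own list, so recording needs no locks.
std::vector<CommandList> recordInParallel(std::size_t drawCount, std::size_t workerCount) {
    assert(workerCount > 0);
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back([&lists, w, drawCount, workerCount] {
            for (std::size_t i = w; i < drawCount; i += workerCount)
                lists[w].push_back("draw " + std::to_string(i));
        });
    }
    for (auto& t : workers) t.join(); // all recording finishes before submission
    return lists;
}

// The serial part: a single thread hands every recorded list to the queue.
// Returns the total number of commands submitted.
std::size_t submitAll(const std::vector<CommandList>& lists) {
    std::size_t submitted = 0;
    for (const auto& list : lists) submitted += list.size();
    return submitted;
}
```

The expensive loop runs inside the worker threads, while submitAll does only a small amount of work per list, which mirrors why the serial portion stays cheap.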

For further information, see the blog post on this topic, "Vulkan: Scaling to multiple threads".

Precompiling shaders

In OpenGL ES, shader compilation is an expensive operation that has to be done on the target device, as each vendor uses its own binary format. Although shader caching can be utilised, it has helped less and less as the number of shader variations that applications need to compile has increased. The driver usually compiles shaders just before they are needed, leading to spikes in frame time.

During the development of Vulkan, a new vendor-independent intermediate format called SPIR-V was proposed. This format allows developers to precompile shaders into an optimised, driver-friendly format. Only register allocation, validation, shader patching and binary translation need to be done on the target device.

Shader caching has also changed, as entire pipeline states can now be cached. This somewhat increases the number of shader variations that need to be cached, but knowing the entire state helps the graphics driver a great deal. The developer has full control over when a shader is compiled and potentially cached.
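Caching by entire pipeline state can be illustrated with a small lookup table keyed on the whole state rather than on the shader alone. This is a hypothetical, heavily reduced model: PipelineState has only three invented fields, PipelineCache is not the Vulkan object of a similar name, and "compilation" is represented by a counter.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>

// Hypothetical, heavily reduced pipeline description.
struct PipelineState {
    std::string shader;
    bool blending;
    bool depthTest;

    // Ordering over every field, so that any state change yields a new key.
    bool operator<(const PipelineState& o) const {
        return std::tie(shader, blending, depthTest) <
               std::tie(o.shader, o.blending, o.depthTest);
    }
};

class PipelineCache {
public:
    // Returns the handle for `state`, "compiling" only the first time
    // that exact combination of shader and fixed-function state is seen.
    int get(const PipelineState& state) {
        auto it = cache_.find(state);
        if (it != cache_.end()) return it->second;
        ++compiles_;
        const int handle = static_cast<int>(cache_.size());
        cache_.emplace(state, handle);
        return handle;
    }

    // Number of cache misses, that is, simulated compilations.
    std::size_t compileCount() const { return compiles_; }

private:
    std::map<PipelineState, int> cache_;
    std::size_t compiles_ = 0;
};
```

The same shader with blending switched on produces a second entry, which shows why the number of cached variations grows, while repeated requests for an identical state cost only a lookup.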

Error handling

One of the many things that increase driver overhead with OpenGL ES is that the driver needs to validate the global state for errors all the time. This is because an application can query the error state after each and every API call, resulting in a lot of unnecessary overhead in release environments.

To alleviate the driver overhead, Vulkan removed the runtime error-checking mechanism and introduced a validation layer-based system. These layers are not part of the core API, so they can be switched on in debug builds and off in release builds. This way developers can still validate their API usage effectively, but release builds no longer carry this overhead.
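The layering idea can be sketched as a call chain that is assembled once at start-up, with the checking code present only when it is requested. Everything in this example is hypothetical: DrawFn, withValidation and buildDispatch are invented names, and the "driver" is any callable. In a real application, validation is typically enabled by requesting a layer such as VK_LAYER_KHRONOS_validation when the instance is created.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical API entry point: the "driver" trusts its arguments.
using DrawFn = std::function<void(int vertexCount)>;

// Wraps an entry point in a validation layer that records misuse in
// `messages` before forwarding the call unchanged.
DrawFn withValidation(DrawFn next, std::vector<std::string>& messages) {
    return [next = std::move(next), &messages](int vertexCount) {
        if (vertexCount <= 0)
            messages.push_back("draw: vertexCount must be positive");
        next(vertexCount);
    };
}

// Build the call chain once: the layer is inserted only when requested,
// so a release configuration calls the driver directly.
DrawFn buildDispatch(DrawFn driver, bool enableValidation,
                     std::vector<std::string>& messages) {
    assert(driver != nullptr);
    if (enableValidation) return withValidation(std::move(driver), messages);
    return driver;
}
```

With validation disabled, an invalid call reaches the driver unchecked and no message is recorded, which is the trade-off the text describes: the checks cost nothing in release builds because they are simply not in the call path.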

Pixel local storage support

Tile-based architectures can accelerate effects with fast Pixel Local Storage (PLS) memory. Utilising this memory significantly reduces bandwidth usage and, as a result, saves power and battery life. OpenGL ES does support PLS, but only through extensions: GL_EXT_shader_pixel_local_storage and GL_EXT_shader_pixel_local_storage2.

In Vulkan, PLS memory (and therefore tile-based architecture) support is built-in through use of render passes and subpasses. These two features allow the developer to specify dependencies between effects so that intermediate data can be kept in PLS memory, rather than being written out to system memory.