Render Passes

Render passes group rendering commands into well-defined chunks. In this respect they are not dissimilar to command buffers, which is somewhat deliberate: render passes are the effective unit of work on a tile-based architecture. Commands that could cause a mid-frame flush on a tiler are explicitly disallowed inside a render pass. Conversely, draw calls and certain other commands are only allowed within a render pass, because a tiler needs the information the render pass provides in order to run efficiently. A render pass consists of one or more sub-passes, each of which can pass information to later sub-passes by means of data local to a given pixel location. Between them, a render pass and its sub-passes define:

  • The load and store operations which take place on the attachments used
  • The execution dependencies on other sub-passes
  • The list of attachments expected to be preserved or read from previous passes

By using sub-pass communication, an application can perform simple post-processing techniques such as colour grading or vignetting without ever writing intermediate results out to main memory. Similarly, an application using a deferred shading technique can express the G-buffer in terms of sub-pass dependencies and input attachments. In many cases, attachments that are only used as intermediates do not even need to be allocated, because Vulkan has the concept of explicitly transient attachment usage. Any attachment that is only going to be used during a render pass can be tagged as transient, which allows a lazy allocation strategy for the memory objects bound to it. Vulkan provides a lazily allocated memory type for this purpose; memory objects of this type may not have any physical memory backing them when first created.

Depending on the architecture, these objects may start out completely empty, partially allocated, or fully allocated. In the majority of situations, the objects should remain in their initial state for their lifetime. If for any reason the implementation needs to commit more of that memory, it can do so as a late binding operation. The maximum size of the memory object is known ahead of time, and the application can query how much memory is currently committed to it.

Render Passes on PowerVR GPUs

As discussed earlier, render passes disallow anything that would cause a mid-frame flush on a tile-based architecture. Because the render pass explicitly describes the start and end of rendering, along with the load and store operations for each attachment, PowerVR GPUs are able to operate with minimal bandwidth, only writing out the attachments that are needed after the render pass completes.

Sub-pass dependencies do not permit anything that is certain to cause a flush during rasterization on the PowerVR architecture, which means that there is the potential to combine multiple sub-passes into a single rendering operation. The upshot is that rendering does not need to stop and start between sub-passes, and intermediate attachments that are not explicitly stored do not need to be written to memory, as they can remain in tile memory. Input attachments to a sub-pass, when written by a previous sub-pass, map exactly to on-chip tile memory in the hardware. In effect, attachments get the equivalent of the OpenGL ES extensions EXT_shader_pixel_local_storage or EXT_shader_framebuffer_fetch.

Lazily allocated memory on the PowerVR architecture starts out unallocated, because there is usually no reason to allocate any memory for it. This is because the attachment it backs usually exists only as an API construct, mapping to a formatted block of registers in on-chip memory during rendering. Occasionally, a small amount of memory is allocated at framebuffer creation time or, in rare instances, the entire memory object.