Efficient Sprite Rendering

Sprites can have a significant impact on performance if handled incorrectly.

Rendering sprites efficiently may seem like a trivial exercise. However, without careful consideration an application can become unresponsive and sluggish due to poor graphics performance. Traditional sprite rendering draws textures onto quads using alpha blending. These quads often contain large areas that are either completely transparent (an alpha value of zero) or partially transparent. Completely transparent areas are traditionally discarded using either the discard keyword or alpha testing, while partially transparent areas undergo blending. Both approaches cost more than rendering fully opaque objects, so a large number of inefficiently drawn sprites can seriously harm performance.

The discard keyword should be avoided in favour of alpha blending, which is much faster.

Even when favouring alpha blending, performance can still suffer if there are many sprites. One way to minimise the impact of several layers of blended sprites is to increase the geometric complexity of each sprite so that fewer transparent fragments are wasted. For example, if a circular sprite is rendered using the tightest-fitting quad, roughly 21.5% of the fragments processed (1 - π/4 of the quad's area) are redundant. Trimming this wasted transparency with more complex geometry can yield significant performance improvements.

PowerVR hardware has excellent vertex processing capabilities and is designed to handle large amounts of geometry data, far more than is present in most sprite-based applications. Increasing geometric complexity should therefore have minimal performance impact, and any cost is most likely outweighed by the savings from rendering less transparency. For example, replacing the tightly fitting quad around the circular sprite from the previous case with a dodecagon (twelve-sided polygon) reduces the wasted fragment processing to under 3%.
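As an illustration of what the extra geometry might look like, here is a minimal sketch that builds triangle-fan vertices for a polygon enclosing a circular sprite. The function name `circumscribed_polygon`, the (x, y, u, v) vertex layout, and the half-step rotation are all assumptions made for this example and are not taken from any PowerVR API.

```python
import math

def circumscribed_polygon(sides, radius=1.0):
    """Return (x, y, u, v) triangle-fan vertices covering a circular sprite.

    Assumes a regular polygon whose edges just touch the circle, so the
    whole sprite stays inside the geometry. The first vertex is the centre.
    """
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    # Corners sit slightly outside the circle so edge midpoints touch it.
    corner_radius = radius / math.cos(math.pi / sides)
    corners = []
    for i in range(sides):
        # Half-step rotation: for side counts that are a multiple of 4 this
        # keeps every corner inside the texture's unit square. Other counts
        # may push u or v slightly outside [0, 1].
        angle = 2.0 * math.pi * (i + 0.5) / sides
        x = corner_radius * math.cos(angle)
        y = corner_radius * math.sin(angle)
        # The circle is assumed to span the full texture in u and v.
        u = 0.5 + 0.5 * x / radius
        v = 0.5 + 0.5 * y / radius
        corners.append((x, y, u, v))
    # Centre first, then the corners, then the first corner again to close.
    return [(0.0, 0.0, 0.5, 0.5)] + corners + [corners[0]]

fan = circumscribed_polygon(12)
print(len(fan))  # 14: centre + 12 corners + repeated first corner
```

The resulting list could be uploaded as a vertex buffer and drawn as a triangle fan in place of the usual four-vertex quad.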

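The relationship between geometric complexity and wasted processing can be expressed as a formula. Assuming the sprite is a circle and the geometry is a regular polygon with n sides whose edges just touch the circle, the fraction of processed fragments that is wasted is:

  wasted = 1 - π / (n × tan(π / n))

The wasted fraction shrinks quickly as n increases.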
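The trade-off between side count and wasted fragments can also be tabulated numerically. The sketch below assumes a regular polygon with n sides circumscribing a circle of unit radius, so the polygon's area is n × tan(π / n) and the circle's area is π. The helper name `wasted_fraction` is hypothetical.

```python
import math

def wasted_fraction(sides):
    """Fraction of a circumscribed regular polygon lying outside the circle."""
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    # Area of a regular polygon whose edges touch a circle of radius 1.
    polygon_area = sides * math.tan(math.pi / sides)
    # Everything outside the circle (area pi) is wasted transparency.
    return 1.0 - math.pi / polygon_area

for sides in (4, 6, 8, 12, 16):
    print(f"{sides:2d} sides: {wasted_fraction(sides) * 100:.1f}% wasted")
```

Most of the saving comes from the first few extra sides, so a modest polygon is likely to capture the bulk of the benefit.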

Also consider splitting the opaque and alpha-blended objects that appear in the scene, such as UI elements, into separate draw submissions.

For the rasterization to be performed as efficiently as possible, the elements should be rendered in the following order:

  1. opaque scene elements
  2. alpha blended scene elements
  3. alpha blended UI elements
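The ordering above can be sketched as a simple sort over pending draw submissions. The `DrawCall` record, the `Pass` enum, and the helper names are hypothetical and exist only for this example. Treating opaque UI elements as part of the opaque pass is also an assumption, since the list does not mention them.

```python
from dataclasses import dataclass
from enum import IntEnum

class Pass(IntEnum):
    OPAQUE_SCENE = 0
    BLENDED_SCENE = 1
    BLENDED_UI = 2

@dataclass
class DrawCall:
    name: str
    is_ui: bool
    uses_blending: bool

def render_pass(call):
    """Pick the pass for a draw call, following the three-step order."""
    if not call.uses_blending:
        # Assumption: anything opaque, UI included, goes in the first pass.
        return Pass.OPAQUE_SCENE
    return Pass.BLENDED_UI if call.is_ui else Pass.BLENDED_SCENE

def order_draw_calls(calls):
    """Sort by pass; sorted() is stable, so order within a pass is kept."""
    return sorted(calls, key=render_pass)

calls = [
    DrawCall("health_bar", is_ui=True, uses_blending=True),
    DrawCall("smoke", is_ui=False, uses_blending=True),
    DrawCall("terrain", is_ui=False, uses_blending=False),
]
print([c.name for c in order_draw_calls(calls)])
# ['terrain', 'smoke', 'health_bar']
```

Because the sort is stable, any back-to-front ordering already applied to the blended sprites is preserved within each pass.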