Performance Guidelines

Recommendations to optimise application performance on PowerVR Rogue.

PowerVR Rogue Graphics Cores are capable of efficiently executing a range of parallel processing algorithms. To achieve the best performance, it is important to follow the guidelines throughout this section as closely as possible. These guidelines are based on four main strategies:

For all of these strategies, profile the application first to find out where the processing bottlenecks are. The profile shows where to target optimisation efforts, and profiling again afterwards measures the effect of any given optimisation. The PowerVR SDK provides PVRTune, a real-time hardware profiler that visualises how the Graphics Core is being used.
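As a rough host-side complement to a hardware profiler such as PVRTune, the measure-before-and-after loop can be sketched as below. This is a minimal illustration, not part of the PowerVR SDK: `median_seconds`, `speedup`, `baseline` and `optimised` are hypothetical names, and the two workloads are CPU stand-ins for the "before" and "after" versions of real work. When timing Graphics Core work this way, the timed callable would need to block until the work has completed (for example with `clFinish` in OpenCL), because enqueue calls return before execution ends.

```python
import statistics
import time


def median_seconds(work, repeats=5):
    """Return the median wall-clock time of calling work() `repeats` times."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        work()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(baseline_seconds, optimised_seconds):
    """Ratio of the two timings; a value above 1.0 means the optimised run was faster."""
    return baseline_seconds / optimised_seconds


N = 200_000  # hypothetical problem size


def baseline():
    """Stand-in for the unoptimised workload: sum of squares with an explicit loop."""
    total = 0
    for i in range(N):
        total += i * i
    return total


def optimised():
    """Stand-in for the optimised workload: the same sum in closed form."""
    return (N - 1) * N * (2 * N - 1) // 6


# Check that the optimisation preserved the result before comparing timings.
same_result = baseline() == optimised()
before = median_seconds(baseline)
after = median_seconds(optimised)
print(f"results match: {same_result}")
print(f"before: {before:.6f} s, after: {after:.6f} s, speedup: {speedup(before, after):.1f}x")
```

Taking the median of several runs reduces the influence of one-off stalls. Wall-clock timing of this kind only shows whether a change helped overall; it does not show which part of the Graphics Core was the bottleneck, which is what the hardware counters in PVRTune are for.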

Table 1. The methods outlined and the APIs to which they apply
Method Applicable to
OpenCL OpenGL OpenGL ES RenderScript
Create Allocations with Correct USAGE Flags
Create Shared Memory Objects with CL_MEM_ALLOC_HOST_PTR
Create Buffers with the Correct Usage Hints
Use Zero-Copy Paths if Possible
Use Shared/Local Memory Sensibly
Avoid Barrier Directly after Local or Constant Memory Access
Use Built-ins for Type Conversions