Introduction to the PowerVR Compute Development Recommendations

The PowerVR family of GPU Cores from Imagination Technologies is designed to perform both compute and graphics tasks. Its programmable core architecture allows for extremely efficient and high-performance compute execution.

Document Overview

This document describes the recommended usage guidelines for achieving optimal compute performance on Imagination’s PowerVR GPU Cores. Most of the guide describes constructs and patterns that emerge directly from the PowerVR core architecture, so developers can make the correct decisions regardless of which API or programming language they prefer. Additionally, several specific details are included for OpenCL, Vulkan Compute, and OpenGL ES Compute. These details have been written against the following API versions:

  • OpenGL ES 3.2

  • OpenCL 3.0

  • Vulkan 1.4

This document assumes that the reader has a good working knowledge of at least one of these APIs. It is also assumed that the reader has worked through the examples of the relevant PowerVR Compute SDK.

After reading this document, readers should have a solid understanding of how compute works on PowerVR GPU Cores and of how to develop efficient, well-optimised code for these devices.

This document also provides an optimisation strategy quick reference sheet.

Glossary

This document makes extensive use of the terminology defined in the tables below.

Common terms

| Term | Also referred to as | Description |
|---|---|---|
| USC (Unified Shading Cluster) | Shading Cluster, Shading Unit, Execution Unit | A semi-autonomous part of the GPU Core that can typically execute an entire work-group. Other large parts like Texture Units can be shared among USCs. |
| Core | Processor, GPU Core | An almost completely autonomous part of the GPU Core. Typically, a collection of USCs and possibly supporting hardware such as texture units. |
| Task | Thread Group, Warp, Wavefront | The native grouping of threads that a USC executes. Number of threads varies per PowerVR Core. |
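
As a rough illustration of how the Task definition relates to work-group sizing, the sketch below counts how many tasks a work-group would occupy and how many thread slots would sit idle in the last task. It assumes that a work-group is split into whole tasks, which is the usual behaviour for warp-style hardware but is not stated in this section. The function name and the `task_size` value of 32 are hypothetical, chosen only for the example; the real task size varies per PowerVR Core.

```python
def tasks_per_work_group(work_group_size, task_size):
    """Return (tasks needed, idle thread slots in the last task).

    Assumption: a work-group is split into whole tasks of `task_size`
    threads, so a partially filled final task leaves slots unused.
    """
    if work_group_size <= 0 or task_size <= 0:
        raise ValueError("sizes must be positive")
    tasks = -(-work_group_size // task_size)  # ceiling division
    idle_slots = tasks * task_size - work_group_size
    return tasks, idle_slots


# task_size=32 is an illustrative value, not a property of any specific core.
print(tasks_per_work_group(64, 32))  # (2, 0)
print(tasks_per_work_group(48, 32))  # (2, 16)
```

Under these assumptions, a work-group size that is a multiple of the task size leaves no idle slots.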

API-specific terms

| Term | OpenGL ES | OpenCL | Vulkan |
|---|---|---|---|
| Kernel / shader | Compute Shader | Kernel | Compute Shader |
| Thread | Shader invocation | Work-item | Shader invocation |
| Work-group | Work-group | Work-group | Work-group |
| Shared memory | shared variables | local memory | shared variables |
| Image | Texture, image | Image | Texture, image |
| Constant, constant memory | const / uniform variable, uniform block, uniform buffer | constant memory | const / uniform variable, uniform block, uniform buffer |
| Private memory | Local variables, temporaries | Variables, private memory | Local variables, temporaries |
| Dataset | Dispatch size, dataset | Global work, ND Range | Dispatch size, dataset |
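
To make the relationship between dataset, work-group, and thread more concrete, the following host-side sketch computes how many work-groups are needed to cover a one-dimensional dataset. The helper name and the sizes used are hypothetical. It reflects how the APIs express the dataset: `vkCmdDispatch` (Vulkan) and `glDispatchCompute` (OpenGL ES) take a work-group count, whereas `clEnqueueNDRangeKernel` (OpenCL) takes a global work size measured in work-items.

```python
def dispatch_dimensions(dataset_size, work_group_size):
    """Return (work-group count, padded thread count) for a 1D dataset.

    The work-group count is what a Vulkan or OpenGL ES dispatch call
    expects; the padded thread count is a global work size for OpenCL
    that is a whole multiple of the work-group size.
    """
    if dataset_size <= 0 or work_group_size <= 0:
        raise ValueError("sizes must be positive")
    groups = (dataset_size + work_group_size - 1) // work_group_size
    return groups, groups * work_group_size


# Hypothetical example: 1000 elements processed by work-groups of 64 threads.
groups, global_size = dispatch_dimensions(1000, 64)
print(groups)       # 16
print(global_size)  # 1024
```

In this example, 24 threads fall outside the dataset, so the kernel or shader would need a bounds check before reading or writing its element.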

Architecture Types

| Architecture Type | Core | Family |
|---|---|---|
| Type 1 | 6/6XT, 7/7XT, 8XE/XEP, 9XE/XM | Rogue |
| Type 2 | AXE, BXE, BXS (64 and Smaller) | Rogue |
| Type 3 | 8XT, 9XTP, AXM, BXM | Volcanic |
| Type 4 | AXT, BXT, CXT/CXTP, DXT/DXTP/DXS | Volcanic |