Exploiting the SOP/MAD FP16 Pipeline

Both the PowerVR Rogue and Volcanic architectures have a powerful FP16 pipeline optimised for common graphics operations. This section describes how to write shader code that takes advantage of it.

Note

Converting the inputs to FP16 and then converting the output to FP32 is free.
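In shader terms, the note above can be sketched as follows. This is a minimal illustration, assuming GLSL ES 3.00 and that `mediump` is the qualifier that makes arithmetic eligible for the FP16 pipeline; the input name `vColor` is hypothetical, and the instructions actually generated depend on the compiler.

```glsl
#version 300 es
precision highp float;

in highp vec4 vColor;        // hypothetical FP32 input (name is an assumption)
out highp vec4 fragColor;    // FP32 output

void main()
{
    mediump vec4 c = vColor;          // FP32 -> FP16 conversion on the input
    mediump vec4 r = c * 0.5 + 0.5;   // multiply-add evaluated at mediump
    fragColor = r;                    // FP16 -> FP32 conversion on the output
}
```

The arithmetic is kept in `mediump` temporaries so that only the read and the final write involve a precision change.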

With SOP (sum of products) and MAD (multiply-add) there are a number of options for execution in one cycle:

Two SOPs
Two MADs
One MAD and one SOP
Four FP16 MADs

Here are some examples.

Executing four MADs in one cycle:

fragColor.x = t.x * t.y + t.z;
fragColor.y = t.y * t.z + t.w;
fragColor.z = t.z * t.w + t.x;
fragColor.w = t.w * t.x + t.y;
{sopmad, sopmad, sopmad, sopmad}

// Instructions on Volcanic:
fragColor.x = t.x * t.y + t.z;
fragColor.y = t.y * t.z + t.w;
fragColor.z = t.z * t.w + t.x;
fragColor.w = t.w * t.x + t.y;
{fma}
{fma}
{fma}
{fma}
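The four per-component MADs above can also be written as a single vector expression using swizzles. The sketch below is arithmetically identical to the scalar form; it assumes GLSL ES 3.00 and that `t` is a `mediump` input, and it makes no claim about which instructions a particular compiler version emits.

```glsl
#version 300 es
precision mediump float;

in mediump vec4 t;             // assumed FP16-precision input, as in the text
out mediump vec4 fragColor;

void main()
{
    // x: t.x * t.y + t.z
    // y: t.y * t.z + t.w
    // z: t.z * t.w + t.x
    // w: t.w * t.x + t.y
    fragColor = t * t.yzwx + t.zwxy;
}
```

The scalar and vector spellings express the same four multiply-adds, so the choice between them is mostly one of readability.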

A SOP with a choice of operation between the results of the two multiplies:

fragColor.z = t.y * t.z OP t.w * t.x;
fragColor.w = t.x * t.y OP t.z * t.w;

where OP can be either an addition, a subtraction, a min() or a max():

fragColor.z = t.y * t.z + t.w * t.x;
fragColor.z = t.y * t.z - t.w * t.x;
fragColor.z = min(t.y * t.z, t.w * t.x);
fragColor.z = max(t.y * t.z, t.w * t.x);
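Two common expressions already have this "product OP product" shape: a 2D dot product (addition) and a 2D cross product or determinant (subtraction). The sketch below is illustrative only. It assumes GLSL ES 3.00 and that `t` packs two hypothetical 2D vectors, `a` in `t.xy` and `b` in `t.zw`; whether each line compiles to a single SOP depends on the compiler.

```glsl
#version 300 es
precision mediump float;

in mediump vec4 t;             // assumption: xy = vector a, zw = vector b
out mediump vec4 fragColor;

void main()
{
    // a.x * b.x + a.y * b.y : two multiplies joined by an addition
    mediump float dot2 = t.x * t.z + t.y * t.w;

    // a.x * b.y - a.y * b.x : two multiplies joined by a subtraction
    mediump float cross2 = t.x * t.w - t.y * t.z;

    fragColor = vec4(dot2, cross2, 0.0, 1.0);
}
```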

It is possible to apply a negate, an abs(), or a clamp() (saturate) to any of the inputs:

fragColor.z = -t.y * abs(t.z) + clamp(t.w, 0.0, 1.0) * -t.x;

Finally, it is also possible to apply a clamp() (saturate) to the end result:

fragColor.z = clamp(t.y * t.z OP t.w * t.x, 0.0, 1.0);
fragColor.z = clamp(t.y * t.z + t.w, 0.0, 1.0);
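As an illustration of the saturated result, a weighted blend of two values clamped to [0, 1] has exactly the SOP-plus-saturate shape shown above. This is a sketch assuming GLSL ES 3.00; the packing of values and weights into `t` is hypothetical, and the instruction selection is up to the compiler.

```glsl
#version 300 es
precision mediump float;

in mediump vec4 t;             // assumption: x, y = values; z, w = their weights
out mediump vec4 fragColor;

void main()
{
    // SOP form with the result saturated: clamp(A * B + C * D, 0, 1)
    mediump float blend = clamp(t.x * t.z + t.y * t.w, 0.0, 1.0);

    // MAD form with the result saturated: clamp(A * B + C, 0, 1)
    mediump float biased = clamp(t.x * t.z + t.w, 0.0, 1.0);

    fragColor = vec4(blend, biased, 0.0, 1.0);
}
```

Writing the clamp directly around the expression, rather than on a later line after other work, keeps the saturate adjacent to the arithmetic it applies to.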

Putting all of this together, the example below shows off the power of this pipeline by using every feature described above in one cycle:

// one cycle
mediump vec4 fp16 = t;
highp vec4 res;
res.x = clamp(min(-fp16.y * abs(fp16.z), clamp(fp16.w, 0.0, 1.0) * abs(fp16.x)), 0.0, 1.0);
res.y = clamp(abs(fp16.w) * -fp16.z + clamp(fp16.x, 0.0, 1.0), 0.0, 1.0);
fragColor = res;
{sop, sop}

// Instructions on Volcanic:
{mov}
{mov}
{mov}
{mul}
{mul}
p0 = i0.f16.e1 <min i0.f16.e0
{movc}
{mov}
{fma}