Exploiting the SOP/MAD FP16 Pipeline
The PowerVR Rogue architecture has a powerful FP16 pipeline optimised for common graphics operations. This section describes how to take advantage of this.
Note
Converting the inputs to FP16 and then converting the output to FP32 is free.
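As a rough illustration of the note above, the sketch below narrows highp inputs to mediump, performs a multiply-add at that precision, and widens the result back to highp. The function name madFp16 and the parameters a, b, and c are hypothetical. It assumes the compiler maps mediump to FP16 on the target, and the actual cost should be confirmed in the compiler's disassembly.

```glsl
// Hypothetical helper: madFp16, a, b and c are illustrative names,
// not taken from the original text.
highp float madFp16(highp float a, highp float b, highp float c)
{
    // Narrow the FP32 inputs to FP16
    // (assumes mediump maps to FP16 on the target).
    mediump float ha = a;
    mediump float hb = b;
    mediump float hc = c;

    // A single multiply-add carried out at mediump precision.
    mediump float r = ha * hb + hc;

    // The FP16 result is widened back to FP32 on return.
    return r;
}
```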
With SOP/MAD there are a number of options for execution in one cycle:
- Two SOPs
- Two MADs
- One MAD and one SOP
- Four FP16 MADs
Here are some examples.
Executing four MADs in one cycle:
fragColor.x = t.x * t.y + t.z;
fragColor.y = t.y * t.z + t.w;
fragColor.z = t.z * t.w + t.x;
fragColor.w = t.w * t.x + t.y;
{sopmad, sopmad, sopmad, sopmad}
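The four scalar MADs above can also be written as one component-wise expression. The following is a minimal sketch of a complete GLSL ES 3.00 fragment shader built around it. Declaring t as a mediump uniform is an assumption made for illustration, since its origin is not stated here. Whether the compiler emits the same four instructions for this form should be checked in the disassembly.

```glsl
#version 300 es
// Sketch only: declaring t as a mediump uniform is an assumption
// made for illustration.
uniform mediump vec4 t;

out mediump vec4 fragColor;

void main()
{
    // Component-wise multiply-add; each lane matches one scalar line:
    //   x: t.x * t.y + t.z
    //   y: t.y * t.z + t.w
    //   z: t.z * t.w + t.x
    //   w: t.w * t.x + t.y
    fragColor = t * t.yzwx + t.zwxy;
}
```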
SOP with a choice of operation between the results of the two multiplies:
fragColor.z = t.y * t.z OP t.w * t.x;
fragColor.w = t.x * t.y OP t.z * t.w;
where OP can be either an addition, a subtraction, a min(), or a max():
fragColor.z = t.y * t.z + t.w * t.x;
fragColor.z = t.y * t.z - t.w * t.x;
fragColor.z = min(t.y * t.z, t.w * t.x);
fragColor.z = max(t.y * t.z, t.w * t.x);
It is possible to apply either a negate, an abs(), or a clamp() (saturate) to all of the inputs:
fragColor.z = -t.y * abs(t.z) + clamp(t.w, 0.0, 1.0) * -t.x;
Finally, it is also possible to apply a clamp() (saturate) to the end result:
fragColor.z = clamp(t.y * t.z OP t.w * t.x, 0.0, 1.0);
fragColor.z = clamp(t.y * t.z + t.w, 0.0, 1.0);
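One familiar expression with this shape is a saturated two-component dot product, which has the form clamp(a * b + c * d, 0.0, 1.0). The sketch below is illustrative only: the function name saturatedDot2 and the parameters n and l are hypothetical. That it compiles to a single SOP is an assumption based on the pattern described above, not a verified result.

```glsl
// Hypothetical helper: saturatedDot2, n and l are illustrative names,
// not taken from the original text.
mediump float saturatedDot2(mediump vec2 n, mediump vec2 l)
{
    // Two multiplies combined with an addition, followed by a
    // clamp to [0, 1] on the end result.
    return clamp(n.x * l.x + n.y * l.y, 0.0, 1.0);
}
```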
Putting all of this together, the example below uses every one of these features in a single cycle:
// one cycle
mediump vec4 fp16 = t;
highp vec4 res;
res.x = clamp(min(-fp16.y * abs(fp16.z), clamp(fp16.w, 0.0, 1.0) * abs(fp16.x)), 0.0, 1.0);
res.y = clamp(abs(fp16.w) * -fp16.z + clamp(fp16.x, 0.0, 1.0), 0.0, 1.0);
fragColor = res;
{sop, sop}