Exploiting the SOP/MAD FP16 Pipeline

Making the best use of the SOP/MAD FP16 pipeline

The PowerVR Rogue architecture has a powerful FP16 pipeline optimised for common graphics operations. This section describes how to write shader code that maps onto its SOP (sum of products) and MAD (multiply-add) instructions.

Important: Converting the inputs to FP16 and then converting the output to FP32 is free.
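In shader source, the conversions described above correspond to moving values between highp and mediump variables. The following is a minimal sketch, assuming a GLSL ES 3.00 fragment shader; the names vData and fragColor are hypothetical, and whether the compiler actually selects the FP16 path for a given expression should be confirmed in the shader disassembly.

```glsl
#version 300 es
// Sketch only: vData and fragColor are illustrative names.
precision highp float;

in highp vec4 vData;        // FP32 input
out highp vec4 fragColor;   // FP32 output

void main()
{
    // Copying into a mediump variable allows the value to be treated as FP16.
    mediump vec4 t = vData;

    // Arithmetic on mediump operands is eligible for the FP16 pipeline.
    // This scale-and-bias has the shape of one MAD per component.
    mediump vec4 r = t * 0.5 + 0.5;

    // Assigning to the highp output is the conversion back to FP32.
    fragColor = r;
}
```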

With SOP/MAD, any one of the following combinations can execute in a single cycle:

  • Two SOPs
  • Two MADs
  • One MAD and one SOP
  • Four FP16 MADs

Here are some examples.

Executing four MADs in one cycle:

fragColor.x = t.x * t.y + t.z;
fragColor.y = t.y * t.z + t.w;
fragColor.z = t.z * t.w + t.x;
fragColor.w = t.w * t.x + t.y;
{sopmad, sopmad, sopmad, sopmad}
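The same four multiply-adds can also be written as a single vector expression using swizzles. This is a sketch under stated assumptions: t is declared here as a mediump input only to keep the shader self-contained, and it is an assumption that a given compiler version emits the same four instructions for both forms, so it is worth checking the disassembly.

```glsl
#version 300 es
precision mediump float;

in mediump vec4 t;            // declared here only to make the sketch self-contained
out mediump vec4 fragColor;

void main()
{
    // Component by component this expands to:
    //   x: t.x * t.y + t.z
    //   y: t.y * t.z + t.w
    //   z: t.z * t.w + t.x
    //   w: t.w * t.x + t.y
    fragColor = t * t.yzwx + t.zwxy;
}
```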

A SOP allows a choice of operation between the results of the two multiplies:

fragColor.z = t.y * t.z OP t.w * t.x;
fragColor.w = t.x * t.y OP t.z * t.w;

where OP can be an addition, a subtraction, a min(), or a max():

fragColor.z = t.y * t.z + t.w * t.x;
fragColor.z = t.y * t.z - t.w * t.x;
fragColor.z = min(t.y * t.z, t.w * t.x);
fragColor.z = max(t.y * t.z, t.w * t.x);

It is possible to apply a negate, an abs(), or a clamp() (saturate) to each of the inputs:

fragColor.z = -t.y * abs(t.z) + clamp(t.w, 0.0, 1.0) * -t.x;

Finally, it is also possible to apply a clamp() (saturate) to the end result:

fragColor.z = clamp(t.y * t.z OP t.w * t.x, 0.0, 1.0);
fragColor.z = clamp(t.y * t.z + t.w, 0.0, 1.0);
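As an illustration of where this pattern appears in practice, a weighted blend of two colours followed by a saturate has the same shape as the first line above with OP as addition. This is only a sketch: colourA, colourB and weights are hypothetical inputs, and the number of cycles the compiler achieves for the whole vector is not claimed here.

```glsl
#version 300 es
precision mediump float;

// Hypothetical inputs, named for illustration only.
in mediump vec4 colourA;
in mediump vec4 colourB;
in mediump vec2 weights;      // x weights colourA, y weights colourB
out mediump vec4 fragColor;

void main()
{
    // Per component: clamp(a * wA + b * wB, 0.0, 1.0),
    // i.e. two multiplies, an add between them, and a saturate on the result.
    fragColor = clamp(colourA * weights.x + colourB * weights.y, 0.0, 1.0);
}
```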

The example below combines all of these features (input modifiers, a choice of operation, and a saturated result) and still executes in one cycle:

// one cycle
mediump vec4 fp16 = t;
highp vec4 res;
res.x = clamp(min(-fp16.y * abs(fp16.z), clamp(fp16.w, 0.0, 1.0) * abs(fp16.x)), 0.0, 1.0);
res.y = clamp(abs(fp16.w) * -fp16.z + clamp(fp16.x, 0.0, 1.0), 0.0, 1.0);
fragColor = res;
{sop, sop}