Abs, Neg, and Saturate
On PowerVR architecture, modifiers such as abs(), neg(), and clamp(…, 0.0, 1.0) (also known as saturate()) are free in certain cases, so it is essential to use them where they qualify.
abs() and neg() (unary negation, written as - in the examples below) are free when they are applied to an input of an operation, in which case the compiler turns them into a free input modifier. saturate(), however, becomes a free modifier when it is applied to the output of an operation.
Note
Complex and sampling/interpolation instructions are exceptions to this rule. saturate() is not free when used on a texture sampling output, or on a complex instruction output. When these functions are not used accordingly, they may introduce additional MOV instructions, which may increase the cycle count of the shaders.
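The exception in the note can be illustrated with a minimal sketch, assuming GLSL ES 3.00. The sampler `uAlbedo`, the varying `vUV`, and the uniform `uGain` are hypothetical names chosen for illustration. The idea is that a clamp placed directly on a texture sample is not expected to be free, so where the shader only needs the final value clamped, the clamp can be moved to the output of the arithmetic that follows. The two forms are not equivalent in general, and no cycle counts were measured for this sketch.

```glsl
#version 300 es
precision mediump float;

// Hypothetical inputs, named for illustration only.
uniform sampler2D uAlbedo;
uniform float uGain;
in vec2 vUV;
out vec4 fragColor;

void main()
{
    vec4 texel = texture(uAlbedo, vUV);

    // Clamp applied directly to the sampling output: per the note above,
    // this is not expected to become a free modifier.
    // vec3 rgb = clamp(texel.rgb, 0.0, 1.0) * uGain;

    // Clamp applied to the output of the multiply instead, where it can
    // become a saturate modifier. Only valid if clamping the final value
    // is what the shader actually needs.
    vec3 rgb = clamp(texel.rgb * uGain, 0.0, 1.0);

    fragColor = vec4(rgb, texel.a);
}
```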
It is also beneficial to use clamp(…, 0.0, 1.0) instead of min(…, 1.0) and max(…, 0.0). This changes a test instruction into a saturate modifier:
fragColor.x = abs(t.x * t.y); // two cycles
{sop, sop}
{mov, mov, mov}
-->
fragColor.x = abs(t.x) * abs(t.y); // one cycle
{sop, sop}
fragColor.x = -dot(t.xyz, t.yzx); // three cycles
{sop, sop, sopmov}
{sop, sop}
{mov, mov, mov}
-->
fragColor.x = dot(-t.xyz, t.yzx); // two cycles
{sop, sop, sopmov}
{sop, sop}
fragColor.x = 1.0 - clamp(t.x, 0.0, 1.0); // two cycles
{sop, sop, sopmov}
{sop, sop}
-->
fragColor.x = clamp(1.0 - t.x, 0.0, 1.0); // one cycle
{sop, sop}
fragColor.x = min(dot(t, t), 1.0) > 0.5 ? t.x : t.y; // five cycles
{sop, sop, sopmov}
{sop, sop}
{mov, fmad, tstg, mov}
{mov, mov, pck, tstg, mov}
{mov, mov, tstz, mov}
-->
fragColor.x = clamp(dot(t, t), 0.0, 1.0) > 0.5 ? t.x : t.y; // four cycles
{sop, sop, sopmov}
{sop, sop}
{fmad, mov, pck, tstg, mov}
{mov, mov, tstz, mov}
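The snippets above omit their declarations. As a minimal sketch, assuming GLSL ES 3.00 and a hypothetical `vec4` varying named `t`, the preferred form of each pair can be placed in one compilable fragment shader, which may be convenient for checking the generated instructions in a shader profiling tool. Writing each result to a separate output channel is an arbitrary choice made only to keep all four expressions live.

```glsl
#version 300 es
precision mediump float;

// Hypothetical varying standing in for `t` in the snippets above.
in vec4 t;
out vec4 fragColor;

void main()
{
    // abs() on each input rather than on the product.
    fragColor.x = abs(t.x) * abs(t.y);

    // Negation on an input of dot() rather than on its result.
    fragColor.y = dot(-t.xyz, t.yzx);

    // Saturate on the output of the subtraction.
    fragColor.z = clamp(1.0 - t.x, 0.0, 1.0);

    // clamp(x, 0.0, 1.0) in place of min(x, 1.0) before the comparison.
    fragColor.w = clamp(dot(t, t), 0.0, 1.0) > 0.5 ? t.x : t.y;
}
```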
However, be wary of complex functions, as they are translated into multiple operations. In these cases, it matters where the modifiers are placed.
For example, normalize() is broken down into:
vec3 normalize( vec3 v )
{
return v * inversesqrt( dot( v, v ) );
}
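In this case it is best to negate one of the inputs of the final multiplication, rather than negating the input everywhere it is used or creating a temporary negated input. In the comparison below, negating the result of normalize() takes six cycles, whereas negating its argument adds a cycle of MOV instructions: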
fragColor.xyz = -normalize(t.xyz); // six cycles
{fmul, mov}
{fmad, mov}
{fmad, mov}
{frsq}
{fmul, fmul, mov, mov}
{fmul, mov}
-->
fragColor.xyz = normalize(-t.xyz); // seven cycles
{mov, mov, mov}
{fmul, mov}
{fmad, mov}
{fmad, mov}
{frsq}
{fmul, fmul, mov, mov}
{fmul, mov}
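The placement described above can also be written out by hand. The following is a minimal sketch, assuming GLSL ES 3.00 and a hypothetical `vec4` varying named `t`. It expands normalize() manually and negates only one input of the final multiplication. Whether the compiler turns this negation into a free modifier was not measured here, so the sketch only shows where the negation sits, and makes no claim about cycle counts.

```glsl
#version 300 es
precision mediump float;

// Hypothetical varying standing in for `t` in the listings above.
in vec4 t;
out vec4 fragColor;

void main()
{
    vec3 v = t.xyz;

    // Hand-expanded normalize(): v * inversesqrt(dot(v, v)).
    float invLen = inversesqrt(dot(v, v));

    // Negate one input of the final multiplication only; the dot product
    // still reads the original, un-negated v.
    fragColor.xyz = v * -invLen;
    fragColor.w = 1.0;
}
```

The result is mathematically the same as `-normalize(t.xyz)`, since the dot product is unaffected by the sign of its input.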