PowerVR Volcanic

Exp and Log

On the PowerVR Volcanic architecture, the 2^n operation is supported directly by a single instruction (EXP):

fragColor.x = exp2(t.x);
{exp2}

The same is true of the log2() function (LOG):

fragColor.x = log2(t.x);
{log2f}

exp is implemented as:

float exp( float x )
{
    return exp2(x * 1.442695); // two cycles
    {mul}
    {exp2}
}

log is implemented as:

float log( float x )
{
    return log2(x) * 0.693147; // two cycles
    {log2f}
    {mul}
}

pow(x, y) is implemented as:

float pow( float x, float y )
{
    return exp2(log2(x) * y); // three cycles
    {log2f}
    {mul}
    {exp2}
}
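
Because pow() expands to the log2, mul, exp2 sequence above, a small constant integer exponent can be cheaper when written as plain multiplies. The following is a minimal sketch, assuming that each mul costs one cycle as in the other listings in this section. The helper names square and cube are hypothetical, and the compiler may already perform this rewrite, so it is worth checking the disassembly first:

```glsl
// Hypothetical helpers (not part of GLSL or the PowerVR SDK).

// x^2 as a single multiply instead of exp2(log2(x) * 2.0).
float square(float x)
{
    return x * x;
}

// x^3 as two multiplies.
float cube(float x)
{
    float x2 = x * x;
    return x2 * x;
}
```

Unlike the log2-based expansion, these forms are also well defined for negative x; GLSL leaves the result of pow(x, y) undefined when x < 0.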

Sin, Cos, Sinh, and Cosh

Sin, cos, sinh, and cosh on PowerVR Volcanic have a reasonably low cost of three cycles on average.

This is broken down as:

  • Two cycles of reduction

  • sinc

  • One conditional

fragColor.x = sin(t.x); // one cycle
{sinf}
fragColor.x = cos(t.x); // one cycle
{cosf}
fragColor.x = cosh(t.x); // five cycles
{mul}
{exp2}
{exp2}
{add}
{mul}
fragColor.x = sinh(t.x); // five cycles
{mul}
{exp2}
{exp2}
{add}
{mul}
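
The cosh and sinh listings each contain two exp2 instructions, which is consistent with the exponential definitions cosh(x) = (e^x + e^-x) / 2 and sinh(x) = (e^x - e^-x) / 2. When a shader needs both values for the same input, the two exponentials can be computed once and shared. The following is a sketch under that assumption; sinhCosh is a hypothetical helper, and the compiler may already merge the common subexpressions:

```glsl
// Hypothetical helper: returns vec2(sinh(x), cosh(x)).
vec2 sinhCosh(float x)
{
    const float LOG2_E = 1.442695;   // log2(e), the same constant used for exp()
    float s  = x * LOG2_E;
    float ep = exp2(s);              // e^x
    float en = exp2(-s);             // e^-x
    return vec2(ep - en, ep + en) * 0.5;
}
```

Note that the subtraction ep - en loses precision when x is close to zero, so this form suits inputs that are not tiny.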

Asin, Acos, Atan, Degrees, and Radians

Once the math expressions in a shader have been simplified, these functions are usually not needed, so they do not map directly to hardware instructions. As a result, they have a very high cost and should be avoided wherever possible.
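
As an illustration of the kind of simplification meant here, an angle test written with acos() can often be replaced by a comparison of cosines. The sketch below assumes unit-length input vectors and a hypothetical uniform uCosCutoff holding cos(cutoffAngle), precomputed on the CPU; neither name comes from the original text. It is a fragment intended for a shader that already declares its version and default precision:

```glsl
// Hypothetical uniform: cosine of the cutoff angle, computed once on the CPU.
uniform float uCosCutoff;

// Returns true when the angle between unit vectors a and b is below the cutoff.
// Equivalent to acos(dot(a, b)) < cutoffAngle for cutoffAngle in [0, pi],
// because cos is decreasing on that interval.
bool insideCone(vec3 a, vec3 b)
{
    return dot(a, b) > uCosCutoff;
}
```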

asin() costs a massive 67 cycles:

fragColor.x = asin(t.x);
p0 = 1f (sc1) <min r0
{movc}
p0 = -1f (sc58) >=max i0
{movc}
{mov}
{cndst}
{br_rel_imm}
p0 = i1.abs < sh14
p0 = i1.abs > sh15
{mul}
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{mul}
{cndef}
{br_rel_imm}
p0 = i1.abs < sh18
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{mul}
{cndef}
{br_rel_imm}
p0 = i1.abs < sh21
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{fma}
{mul}
{cndef}
p0 = i1.abs < sh26
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{fma}
{fma}
{fma}
{mul}
{cndef}
{br_rel_imm}
{fma}
{rsq}
{fma}
{fma}
{fma}
{rcp}
{fma}
{fma}
{byp}
{byp}
LUT = (MSK & s2) | SH (LUT_PROG=0x0000ECEC)
{byp}
{cndend}

acos() costs a massive 79 cycles:

fragColor.x = acos(t.x);
p0 = 1f (sc1) <min r0
{movc}
p0 = -1f (sc58) >=max i0
{movc}
p0 = i0 == 1f (sc1)
{mov}
{cndst}
{br_rel_imm}
p0 = r4.abs < sh12
{mul}
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{cndef}
{br_rel_imm}
p0 = r4.abs < sh15
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{fma}
{cndef}
{br_rel_imm}
p0 = r4.abs < sh19
{cndst}
{br_rel_imm}
{fma}
{fma}
{fma}
{fma}
{fma}
{fma}
{fma}
{cndef}
{br_rel_imm}
{fma}
{rsq}
{fma}
{fma}
{fma}
{rcp}
{fma}
p0 = i1 < 0f (sc0)
{mul}
{add}
{movc}
{cndend}

atan() is still costly, but can be used if needed:

fragColor.x = atan(t.x);
{atan}

While degrees() and radians() take only one cycle each, they can usually be avoided by working in radians throughout:

fragColor.x = degrees(t.x);
{mul}
fragColor.x = radians(t.x);
{mul}
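
One way to keep everything in radians is to convert on the CPU and pass the result in as a uniform, so the shader never calls radians(). The following is a sketch, assuming a hypothetical uniform uAngleRad that the application fills with an angle already expressed in radians:

```glsl
// Hypothetical uniform: rotation angle in radians, converted once on the CPU.
uniform float uAngleRad;

// Rotates p about the origin without any degrees()/radians() call in the shader.
vec2 rotate(vec2 p)
{
    float s = sin(uAngleRad);
    float c = cos(uAngleRad);
    return vec2(c * p.x - s * p.y, s * p.x + c * p.y);
}
```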