Main Instruction Group
The Main ALU performs all floating point, integer, and packing/unpacking operations. These operations are split into three phases, each of which is fully exposed as a separate entity when it is not being used for another function.
The Main ALU may not be used at the same time as the Bitwise ALU.
Example
0 : if (p0)
    mul ft0, s0, s1              # phase 0 instruction
    mad ft1, s3, s4, s5          # ft1 drives w1 by default, second to last phase present
    mul ft2, is0, ft0            # ft2 drives w0 by default, last phase present
    tstz.f32 ftt, p0, is1        # test instruction always drives feedthrough ftt
                                 # p0 write optional
    movc w0,w1, ftt, ft0, ft1    # ft1 or ft0 output to w0 (overrides default)
    uvsw w0                      # backend instruction
Opcodes
Phase 0, phase 1 and backend can each take one instruction. Phase 2 can take PACK, TEST and/or MOV instructions.
Instructions should appear in the code in phase order (phase 0 to phase 2). If a TEST instruction is present, it should appear before the MOV instruction.
Instructions do not need to be present for all phases, but there are some restrictions on which combinations of phases can be present. Phase 1 can only be present if phase 0 is also present. Backend instructions can be present on their own or in combination with phase 0 and phase 2.
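The ordering and presence rules above can be sketched as a small checker. This is an illustrative model only, not part of any real assembler: the function name `check_group`, the phase labels `"p0"`, `"p1"`, `"p2"` and `"backend"`, and the placement of the backend instruction last (as in the example above) are assumptions made for this sketch. The rules for combining backend instructions with other phases are not modelled.

```python
# Hypothetical model of the phase rules described above (names are illustrative).
# Assumption: the backend instruction is written last, as in the example.
PHASE_ORDER = ["p0", "p1", "p2", "backend"]


def check_group(instrs):
    """Check a list of (phase, kind) tuples given in source order.

    Returns a list of human-readable rule violations (empty if none found).
    """
    errors = []
    phases = [phase for phase, _ in instrs]

    # Instructions should appear in phase order.
    ranks = [PHASE_ORDER.index(phase) for phase in phases]
    if ranks != sorted(ranks):
        errors.append("instructions are not in phase order")

    # Phase 0, phase 1 and backend can each take one instruction.
    for phase in ("p0", "p1", "backend"):
        if phases.count(phase) > 1:
            errors.append(f"{phase} takes at most one instruction")

    # Phase 1 can only be present if phase 0 is also present.
    if "p1" in phases and "p0" not in phases:
        errors.append("phase 1 requires phase 0")

    # A TEST instruction should appear before the MOV instruction.
    p2_kinds = [kind for phase, kind in instrs if phase == "p2"]
    if "TEST" in p2_kinds and "MOV" in p2_kinds:
        if p2_kinds.index("TEST") > p2_kinds.index("MOV"):
            errors.append("TEST must appear before MOV")

    return errors


print(check_group([("p0", "MUL"), ("p1", "MAD"), ("p2", "TEST"),
                   ("p2", "MOV"), ("backend", "UVSW")]))  # []
print(check_group([("p1", "MAD")]))  # ['phase 1 requires phase 0']
```

Returning a list of violations rather than raising on the first one makes it easier to report every problem in a group at once.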
Source Arguments
There are two groups of ALU inputs:
S0, S1, S2
S3, S4, S5
Zero, one, two or three inputs may be used from each input group, but the lowest-numbered inputs within each group must be used first. For example, S4 can be used only if S3 is also used.
In the reference section, source arguments denote the possible input registers.
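The lowest-numbered-first rule for the two input groups can be expressed as a prefix check. The following is a minimal sketch; the function name `sources_valid` and the choice to reject unknown names with `ValueError` are assumptions for illustration, not part of any real toolchain.

```python
# Hypothetical checker for the "lowest-numbered inputs first" rule.
GROUPS = (("S0", "S1", "S2"), ("S3", "S4", "S5"))
ALL_SOURCES = {name for group in GROUPS for name in group}


def sources_valid(used):
    """Return True if the used sources form a prefix of each input group."""
    used = {name.upper() for name in used}
    unknown = used - ALL_SOURCES
    if unknown:
        raise ValueError(f"unknown sources: {sorted(unknown)}")

    for group in GROUPS:
        flags = [name in used for name in group]
        count = sum(flags)
        # Valid usage is a prefix: the first `count` inputs used, the rest unused.
        if flags != [True] * count + [False] * (len(group) - count):
            return False
    return True


print(sources_valid({"s0", "s1", "s3"}))  # True
print(sources_valid({"s0", "s2"}))        # False: S2 used without S1
```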
Internal Source
Internal feedthrough sources (FT0, FT1 and FT2) are generated by phase 0, 1 and 2 instructions respectively. These are shown as the destination (first argument) of the phase instructions.
Internal sources for phase 2 may need to be declared if phase 2 instructions are present. In addition to FT0 and FT1, phase 2 instructions can only choose from two pre-selected external sources, named IS0 and IS1.
Destination Arguments
ALU outputs are:
W0
W1
In the reference section, destination arguments denote the possible output registers.
Complex Instructions
Complex and trigonometric instructions use the entire Main ALU and have fixed sources and destinations.
Note
When issued in phase 1, complex and trigonometric instructions can leave phase 0 resources of the Main ALU free for coissue. They can also be coissued with F16SOP operations. FRED instructions can never be coissued, and 32/64-bit integer instructions cannot be used in phase 0.
Texture Address Unit
The texture address unit (TAU) is used to calculate texel memory addresses for a given set of texture coordinates. The texture address unit occupies all phases of the Main ALU, and is used via the GTA (Generate Texel Address) instruction. GTA has fixed sources and destinations.
The TAU processes four instances in parallel, which allows common source data and calculations to be shared. For example, each instance reads only 32 bits of the 128-bit texture state.
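As a rough illustration of how four instances could each read 32 bits of a shared 128-bit state, the sketch below splits a 128-bit integer into four 32-bit words. The mapping of instance number to word (instance 0 reads the least significant word) and the function name `split_texture_state` are assumptions; the source does not specify which instance reads which bits.

```python
# Hypothetical split of a 128-bit texture state into four 32-bit words.
# Assumption: instance i reads bits [32*i, 32*i + 31].
def split_texture_state(state128):
    """Return the four 32-bit words of a 128-bit value, lowest word first."""
    if not 0 <= state128 < (1 << 128):
        raise ValueError("texture state must fit in 128 bits")
    return [(state128 >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


state = 0x00000004_00000003_00000002_00000001
for instance, word in enumerate(split_texture_state(state)):
    print(f"instance {instance} reads 0x{word:08x}")
```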
F16 Sum-of-Products Unit
The F16 Sum-of-Products (SOP) unit is an F16 precision vector ALU, which may be used for operations such as alpha blending. Any F32 inputs are converted to F16 prior to performing calculations. The F16 SOP unit uses the entire Main ALU.
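The effect of converting F32 inputs to F16 before a calculation can be approximated on a host machine with Python's `struct` module, whose `e` format is IEEE 754 binary16. This is only a sketch of the precision loss: the helper names `to_f16` and `blend_f16` are hypothetical, and the hardware's rounding mode and intermediate precision are not stated in the source, so the round-to-nearest behaviour and the final rounding step here are assumptions.

```python
import struct


def to_f16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value.

    Raises OverflowError if x is too large to be represented in binary16.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def blend_f16(src, dst, alpha):
    """Alpha blend with inputs and result quantised to F16 (illustrative only)."""
    s, d, a = to_f16(src), to_f16(dst), to_f16(alpha)
    # Assumption: the result is rounded once at the end; real hardware may differ.
    return to_f16(s * a + d * (1.0 - a))


print(to_f16(0.1))               # 0.0999755859375
print(blend_f16(1.0, 0.0, 0.5))  # 0.5
```

The first output shows that 0.1 is not exactly representable in F16, which is the kind of error to expect whenever F32 inputs are fed to an F16 unit.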