VC4ASM - vc4.qinc include


This file contains several useful macros for VC4 programming.
To use this file, add .include <vc4.qinc> to the source or pass the command line option -i vc4.qinc.

Expression predicates

Constants

isConstant(x)
Evaluates true if and only if x is a constant expression, i.e. no register, no semaphore and no label.
isLdPE(x)
isLdPES(x)
isLdPEU(x)
Evaluates true if x is a per QPU element constant expression. isLdPES is false when the constant contains negative values, isLdPEU is false when the constant contains values greater than 1.
isSmallImmd(x)
Evaluates true if x is a constant expression that fits into a small immediate field.

Registers

isRegister(x)
Evaluates true if x is a register expression, including semaphore registers.
isRegfileA(x)
isRegfileB(x)
Evaluates true if x is a register from register file A/B, including peripheral registers.
isAccu(x)
Evaluates true if x is an accumulator, i.e. r0..r5.
isReadable(x)
isWritable(x)
Evaluates true if x is a readable/writable register.
isRotate(x)
Evaluates true if x is a register with a vector rotation attribute, e.g. r5<<2.
isSemaphore(x)
Evaluates true if x is a semaphore register.

VPM/VCD setup helpers

See the Broadcom VideoCore IV 3D Architecture Reference Guide for further details.

VPM read/write

vpm_setup(num,stride,dma)
Set up VPM read/write for num items starting at dma, incrementing dma by stride after each item.
h_32(y)
Macro for the dma component of vpm_setup. Starts a horizontal 32 bit transfer at position 0,y in VPM, i.e. the QPU elements access 16 consecutive horizontal values in the VPM.
h16p(y,h)
Macro for the dma component of vpm_setup. Starts a horizontal 16 bit packed transfer at position 0,y and half word h in VPM.
h16l(y,h)
Macro for the dma component of vpm_setup. Starts a horizontal 16 bit laned transfer at position 0,y and half word h in VPM.
h8p(y,b)
Macro for the dma component of vpm_setup. Starts a horizontal 8 bit packed transfer at position 0,y and byte b in VPM.
h8l(y,b)
Macro for the dma component of vpm_setup. Starts a horizontal 8 bit laned transfer at position 0,y and byte b in VPM.
v_32(y,x)
Macro for the dma component of vpm_setup. Starts a vertical 32 bit transfer at position x,y in VPM, i.e. the QPU elements access 16 consecutive vertical values in the VPM. y must be a multiple of 16 in this mode.
v16p(y,x,h)
Macro for the dma component of vpm_setup. Starts a vertical 16 bit packed transfer at position x,y and half word h in VPM.
v16l(y,x,h)
Macro for the dma component of vpm_setup. Starts a vertical 16 bit laned transfer at position x,y and half word h in VPM.
v8p(y,x,b)
Macro for the dma component of vpm_setup. Starts a vertical 8 bit packed transfer at position x,y and byte b in VPM.
v8l(y,x,b)
Macro for the dma component of vpm_setup. Starts a vertical 8 bit laned transfer at position x,y and byte b in VPM.
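
The access pattern described for vpm_setup with a horizontal 32 bit transfer can be sketched as a small Python model. This is not part of vc4.qinc and does not compute the actual register value; the function name and the LANES constant are hypothetical, and the sketch assumes that stride advances the row address y by stride after each item.

```python
# Hypothetical model, not part of vc4.qinc: lists the VPM cells touched by a
# horizontal 32 bit setup such as vpm_setup(num, stride, h_32(y)).
LANES = 16  # one access covers 16 consecutive horizontal values


def horizontal_32_accesses(num, stride, y):
    """Return one list of (x, y) VPM cells per access."""
    accesses = []
    for i in range(num):
        row = y + i * stride  # assumption: stride advances the row address
        accesses.append([(x, row) for x in range(LANES)])
    return accesses


# Rows visited by 3 accesses starting at row 4 with stride 2.
rows = [cells[0][1] for cells in horizontal_32_accesses(num=3, stride=2, y=4)]
print(rows)  # [4, 6, 8]
```

With stride 1 the accesses walk through consecutive VPM rows; a larger stride skips rows in between.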

VCD DMA write access (VDW)

vdw_setup_0(units,depth,dma)
Set up VPM DMA write for units rows of length depth starting at dma.
vdw_setup_1(stride)
Set up the VPM DMA write memory stride from the last item of a row to the first item of the next row. This function activates block mode, i.e. one row in memory is one row or column in VPM.
vdw_setup_1p(stride)
Set up the VPM DMA write memory stride from the last item of a row to the first item of the next row. This function activates packed mode.
dma_h32(y,x)
Macro for the dma component of vdw_setup_0. Starts a horizontal 32 bit transfer at position x,y in VPM, i.e. the elements of a row are horizontally aligned in VPM. The y component is incremented after each row.
dma_h16p(y,x,h)
Macro for the dma component of vdw_setup_0. Starts a horizontal, packed 16 bit transfer at position x,y and half word h in VPM, i.e. the elements of a row are horizontally aligned in VPM.
dma_h8p(y,x,b)
Macro for the dma component of vdw_setup_0. Starts a horizontal, packed 8 bit transfer at position x,y and byte b in VPM, i.e. the elements of a row are horizontally aligned in VPM.
dma_v32(y,x)
Macro for the dma component of vdw_setup_0. Starts a vertical 32 bit transfer at position x,y in VPM, i.e. the elements of a row are vertically aligned in VPM. The x component is incremented after each row.
dma_v16p(y,x,h)
Macro for the dma component of vdw_setup_0. Starts a vertical, packed 16 bit transfer at position x,y and half word h in VPM, i.e. the elements of a row are vertically aligned in VPM.
dma_v8p(y,x,b)
Macro for the dma component of vdw_setup_0. Starts a vertical, packed 8 bit transfer at position x,y and byte b in VPM, i.e. the elements of a row are vertically aligned in VPM.
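
How units, depth and stride shape the memory layout of a DMA write can be illustrated with a short Python model. It is only a sketch under assumptions that the text above does not state: stride is taken to be the gap in bytes between the end of one row and the start of the next, and items are 32 bit (4 bytes) wide. The function name and the item_bytes parameter are hypothetical.

```python
# Hypothetical model, not part of vc4.qinc.
def vdw_row_offsets(units, depth, stride, item_bytes=4):
    """Byte offset of the first item of each row, relative to the DMA target address.

    Assumption: stride is the gap in bytes between the end of a row and the
    start of the next row; item_bytes=4 corresponds to a 32 bit transfer.
    """
    row_bytes = depth * item_bytes
    return [r * (row_bytes + stride) for r in range(units)]


print(vdw_row_offsets(units=4, depth=16, stride=0))    # [0, 64, 128, 192]
print(vdw_row_offsets(units=4, depth=16, stride=192))  # [0, 256, 512, 768]
```

Under these assumptions a stride of 0 writes the rows back to back, while a non-zero stride places each row at the start of a wider row of a larger two-dimensional array.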

VCD DMA read access (VDR)

vdr_setup_0(mpitch,rowlen,nrows,dma)
Set up VPM DMA read for nrows rows of length rowlen starting at dma. mpitch is the distance between two rows in memory. It must either be a power of 2 or 0 to indicate that vdr_setup_1 has to be used.
vdr_setup_1(stride)
Set up the VPM DMA read memory stride from the last item of a row to the first item of the next row.
vdr_h32(vpitch,y,x)
Macro for the dma component of vdr_setup_0. Starts a horizontal 32 bit transfer at position x,y in VPM, i.e. the elements of a row are horizontally aligned in VPM. The y component is incremented by vpitch after each row.
vdr_v32(vpitch,y,x)
Macro for the dma component of vdr_setup_0. Starts a vertical 32 bit transfer at position x,y in VPM, i.e. the elements of a row are vertically aligned in VPM. The y component is incremented by vpitch after each row.
Further modes need to be defined...

Bit operations

countBits(x)
Number of set bits in x.
reverseBits(x,n)
Rightmost n bits of x in reversed order.
reverseBits4(x)
reverseBits8(x)
reverseBits16(x)
reverseBits32(x)
reverseBits64(x)
Shorthands for reverseBits(x,n) with n = 4, 8, 16, 32 and 64. They are also faster than the generic function.
ilog2(x)
Index of the highest set bit in x, i.e. 0 → -1, 1 → 0, 2 → 1, 3 → 1, 4 → 2 ...
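
The semantics of these functions can be illustrated with a Python reference model. It is a sketch of the documented behavior, not the implementation used in vc4.qinc; the snake_case names are hypothetical and x is assumed to be a non-negative integer.

```python
# Reference model of the documented semantics; not the vc4.qinc implementation.
def count_bits(x):
    """Number of set bits in x (x assumed non-negative)."""
    return bin(x).count("1")


def reverse_bits(x, n):
    """Rightmost n bits of x in reversed order."""
    result = 0
    for _ in range(n):
        result = (result << 1) | (x & 1)  # move the lowest bit of x to the result
        x >>= 1
    return result


def ilog2(x):
    """Index of the highest set bit in x, -1 for x == 0."""
    return x.bit_length() - 1


print(count_bits(0b1011))                # 3
print(reverse_bits(0b0011, 4))           # 12, i.e. 0b1100
print([ilog2(x) for x in range(5)])      # [-1, 0, 1, 1, 2]
```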

Numerical constants

Name        Value                    Description
M_E         2.7182818284590452354    e (Euler's number)
M_LOG2E     1.4426950408889634074    log2 e
M_LOG10E    0.43429448190325182765   log10 e
M_PI        3.14159265358979323846   π
M_2PI       6.28318530717958647693   2π
M_PI_2      1.57079632679489661923   π/2
M_PI_4      0.78539816339744830962   π/4
M_1_PI      0.31830988618379067154   1/π
M_2_PI      0.63661977236758134308   2/π
M_2_SQRTPI  1.12837916709551257390   2/√π
M_SQRT2     1.41421356237309504880   √2
M_SQRT1_2   0.70710678118654752440   1/√2
M_NAN       NaN                      not a number
M_INF       Inf                      infinity
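
As a cross-check of the Description column, the listed values can be compared with the expressions they stand for. The following Python sketch does this with the standard math module at double precision; the TABLE and mismatches names are hypothetical and the tolerance is an arbitrary choice.

```python
import math

# Listed value and the expression from the Description column.
TABLE = {
    "M_E": (2.7182818284590452354, math.e),
    "M_LOG2E": (1.4426950408889634074, math.log2(math.e)),
    "M_LOG10E": (0.43429448190325182765, math.log10(math.e)),
    "M_PI": (3.14159265358979323846, math.pi),
    "M_2PI": (6.28318530717958647693, 2 * math.pi),
    "M_PI_2": (1.57079632679489661923, math.pi / 2),
    "M_PI_4": (0.78539816339744830962, math.pi / 4),
    "M_1_PI": (0.31830988618379067154, 1 / math.pi),
    "M_2_PI": (0.63661977236758134308, 2 / math.pi),
    "M_2_SQRTPI": (1.12837916709551257390, 2 / math.sqrt(math.pi)),
    "M_SQRT2": (1.41421356237309504880, math.sqrt(2)),
    "M_SQRT1_2": (0.70710678118654752440, 1 / math.sqrt(2)),
}

# Names whose listed value differs from the computed one beyond the tolerance.
mismatches = [
    name
    for name, (listed, computed) in TABLE.items()
    if not math.isclose(listed, computed, rel_tol=1e-12)
]
print(mismatches)  # [] when all listed values agree with the formulas
```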

Broadcom compatibility helpers

sacq(i)
srel(i)
Alternative way to refer to the semaphore registers sacq0..15 and srel0..15.