
FDOT (half-precision to single-precision, by element)

Half-precision dot product to single-precision (vector, by element)

This instruction computes the fused sum-of-products of two pairs of half-precision values, without intermediate rounding: one pair is held in each 32-bit element of the first source vector, and the other pair is held in the indexed 32-bit element of the second source vector. It then destructively adds the single-precision sum-of-products to the corresponding single-precision element of the destination vector.

Advanced SIMD class

(FEAT_F16F32DOT)

Bits    Field    Value
31               0
30      Q
29      U        0
28-24            01111
23-22   size     01
21      L
20      M
19-16   Rm
15-12   opcode   1001
11      H
10               0
9-5     Rn
4-0     Rd

Encoding

FDOT <Vd>.<Ta>, <Vn>.<Tb>, <Vm>.2H[<index>]

Decode

if !IsFeatureImplemented(FEAT_F16F32DOT) then EndOfDecode(Decode_UNDEF);
constant integer n = UInt(Rn);
constant integer m = UInt(M:Rm);
constant integer d = UInt(Rd);
constant integer i = UInt(H:L);
constant integer datasize = 64 << UInt(Q);
constant integer elements = datasize DIV 32;
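The decode step can be sketched in Python as a rough illustration. The function name `decode_fdot_by_element`, the mask and value constants, and the example instruction word are hypothetical: they are derived by hand from the fixed bits in the encoding diagram above and have not been checked against an assembler. The `FEAT_F16F32DOT` check is omitted.

```python
# Fixed bits taken from the encoding diagram: bits 31, 29, 28-24, 23-22,
# 15-12 and 10. These constants are an assumption derived from that diagram.
FDOT_MASK = 0xBFC0F400
FDOT_VALUE = 0x0F409000


def decode_fdot_by_element(word):
    """Return the decoded fields as a dict, or None if the fixed bits differ."""
    if (word & FDOT_MASK) != FDOT_VALUE:
        return None
    q = (word >> 30) & 0x1
    l_bit = (word >> 21) & 0x1
    m_bit = (word >> 20) & 0x1
    rm = (word >> 16) & 0xF
    h_bit = (word >> 11) & 0x1
    rn = (word >> 5) & 0x1F
    rd = word & 0x1F
    datasize = 64 << q
    return {
        "n": rn,
        "m": (m_bit << 4) | rm,        # UInt(M:Rm)
        "d": rd,
        "i": (h_bit << 1) | l_bit,     # UInt(H:L)
        "datasize": datasize,
        "elements": datasize // 32,
    }


# Hand-assembled word for FDOT V0.4S, V1.8H, V2.2H[3] (illustrative only).
fields = decode_fdot_by_element(0x4F629820)
```

Note that the second source register number uses five bits (`M:Rm`), so any of the 32 SIMD&FP registers can supply the indexed pair.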

Assembler Symbols

<Vd>

Is the name of the SIMD&FP destination register, encoded in the "Rd" field.

<Ta>

Is an arrangement specifier, encoded in Q:

Q    <Ta>
0    2S
1    4S

<Vn>

Is the name of the first SIMD&FP source register, encoded in the "Rn" field.

<Tb>

Is an arrangement specifier, encoded in Q:

Q    <Tb>
0    4H
1    8H

<Vm>

Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields.

<index>

Is the immediate index of a pair of 16-bit elements in the range 0 to 3, encoded in the "H:L" fields.
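As an illustration of how the assembler symbols fit together, the following Python sketch renders the assembly syntax from already-decoded values. The helper `format_fdot_by_element` and its parameter names are hypothetical and exist only for this example.

```python
def format_fdot_by_element(q, d, n, m, index):
    """Render the FDOT (by element) syntax from decoded field values."""
    if q not in (0, 1):
        raise ValueError("Q must be 0 or 1")
    for reg in (d, n, m):
        if not 0 <= reg <= 31:
            raise ValueError("register number must be in the range 0 to 31")
    if not 0 <= index <= 3:
        raise ValueError("index must be in the range 0 to 3")
    ta = "4S" if q else "2S"   # <Ta>, selected by Q
    tb = "8H" if q else "4H"   # <Tb>, selected by Q
    return f"FDOT V{d}.{ta}, V{n}.{tb}, V{m}.2H[{index}]"


text = format_fdot_by_element(q=1, d=0, n=1, m=2, index=3)
```

The index selects a pair of half-precision values, which is why the second source operand is always written with the `2H` arrangement regardless of Q.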

Operation

AArch64.CheckFPAdvSIMDEnabled();
constant bits(datasize) operand1 = V[n, datasize];
constant bits(128) operand2 = V[m, 128];
constant bits(datasize) operand3 = V[d, datasize];
bits(datasize) result;

for e = 0 to elements-1
    constant bits(16) elt1_a = Elem[operand1, 2 * e + 0, 16];
    constant bits(16) elt1_b = Elem[operand1, 2 * e + 1, 16];
    constant bits(16) elt2_a = Elem[operand2, 2 * i + 0, 16];
    constant bits(16) elt2_b = Elem[operand2, 2 * i + 1, 16];
    constant bits(32) sum = Elem[operand3, e, 32];
    Elem[result, e, 32] = FPDotAdd(sum, elt1_a, elt1_b, elt2_a, elt2_b, FPCR);

V[d, datasize] = result;


Version 2025.09 — Copyright © 2010-2025 Arm Limited or its affiliates.

This site is provided as a community resource and is NOT affiliated with nor endorsed by Arm Limited.