VP4DPWSSDS — Dot Product of Signed Words with Dword Accumulation and Saturation (4-iterations)

| Opcode/Instruction | Op/En | 64/32 bit Mode Support | CPUID Feature Flag | Description |
|---|---|---|---|---|
| EVEX.512.F2.0F38.W0 53 /r VP4DPWSSDS zmm1{k1}{z}, zmm2+3, m128 | A | V/V | AVX512_4VNNIW | Multiply signed words from source register block indicated by zmm2 by signed words from m128 and accumulate the resulting dword results with signed saturation in zmm1. |

Instruction Operand Encoding

| Op/En | Tuple | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
|---|---|---|---|---|---|
| A | Tuple1_4X | ModRM:reg (r, w) | EVEX.vvvv (r) | ModRM:r/m (r) | NA |

Description

This instruction computes 4 sequential register source-block dot-products of two signed word operands with doubleword accumulation and signed saturation. One dword of the memory operand is selected in sequence for each of the four steps.

In the opcode table above, the “+3” notation denotes that the instruction accesses 4 source registers based on that operand; the sources are consecutive, start on a multiple-of-4 boundary, and include the encoded register operand.

This instruction supports memory fault suppression. The entire memory operand is loaded if any bit of the lowest 16 bits of the mask is set to 1 or if a “no masking” encoding is used.

The tuple type Tuple1_4X implies that four 32-bit elements (16 bytes) are referenced by the memory operation portion of this instruction.

Operation

src_reg_id is the 5-bit index of the vector register specified in the instruction as the src1 register.

    VP4DPWSSDS dest, src1, src2
    (KL,VL) = (16,512)
    N := 4
    ORIGDEST := DEST
    src_base := src_reg_id & ~ (N-1) // for src1 operand
    FOR i := 0 to KL-1:
        IF k1[i] or *no writemask*:
            FOR m := 0 to N-1:
                t := SRC2.dword[m]
                p1dword := reg[src_base+m].word[2*i] * t.word[0]
                p2dword := reg[src_base+m].word[2*i+1] * t.word[1]
                DEST.dword[i] := SIGNED_DWORD_SATURATE(DEST.dword[i] + p1dword + p2dword)
        ELSE IF *zeroing*:
            DEST.dword[i] := 0
        ELSE
            DEST.dword[i] := ORIGDEST.dword[i]
    DEST[MAX_VL-1:VL] := 0
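The Operation pseudocode can be approximated by a scalar model in C, which may be handy for checking expected results on machines without AVX512_4VNNIW. This is a minimal sketch, not Intel's reference code: the type `zmm_words`, the helpers `block_base` and `saturate_dword`, and the function `vp4dpwssds_model` are hypothetical names introduced here for illustration. It assumes that the sum `DEST.dword[i] + p1dword + p2dword` is formed at wider precision before saturating, and that `t.word[0]` is the lower-addressed word of each memory dword (little-endian layout).

```c
#include <assert.h>
#include <stdint.h>

#define KL 16 /* dword lanes in a 512-bit register */
#define N  4  /* iterations, and registers in the source block */

/* Hypothetical model type: one 512-bit register viewed as 32 signed words. */
typedef struct { int16_t word[2 * KL]; } zmm_words;

/* Index of the first register in the aligned block of N registers
 * that contains src_reg_id (the "+3" notation). */
static unsigned block_base(unsigned src_reg_id) {
    return src_reg_id & ~(unsigned)(N - 1);
}

/* Clamp a wide intermediate value to the signed 32-bit range. */
static int32_t saturate_dword(int64_t v) {
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Scalar model of VP4DPWSSDS (assumed semantics, see the text above).
 *   dest        16 dword accumulators, updated in place
 *   k1          writemask; pass 0xFFFF to model "no writemask"
 *   zeroing     nonzero = zeroing-masking, zero = merging-masking
 *   regs        model of the 32 vector registers, viewed as words
 *   src_reg_id  5-bit index of the encoded src1 register
 *   mem         the m128 operand as 8 signed words; dword m is
 *               (mem[2*m], mem[2*m+1])                              */
static void vp4dpwssds_model(int32_t dest[KL], uint16_t k1, int zeroing,
                             const zmm_words regs[32], unsigned src_reg_id,
                             const int16_t mem[2 * N]) {
    assert(src_reg_id < 32);
    unsigned base = block_base(src_reg_id);
    for (int i = 0; i < KL; i++) {
        if ((k1 >> i) & 1) {
            for (int m = 0; m < N; m++) {
                int64_t p1 = (int64_t)regs[base + m].word[2 * i]     * mem[2 * m];
                int64_t p2 = (int64_t)regs[base + m].word[2 * i + 1] * mem[2 * m + 1];
                dest[i] = saturate_dword((int64_t)dest[i] + p1 + p2);
            }
        } else if (zeroing) {
            dest[i] = 0;
        }
        /* Merging-masking: a masked-off lane keeps its original value. */
    }
}
```

With `src_reg_id = 6`, for instance, `block_base` returns 4, so the model reads registers 4 through 7. The final step of the pseudocode, zeroing `DEST[MAX_VL-1:VL]`, has no counterpart here because the model only holds the 512-bit destination.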

Intel C/C++ Compiler Intrinsic Equivalent

    VP4DPWSSDS __m512i _mm512_4dpwssds_epi32(__m512i, __m512ix4, __m128i *);
    VP4DPWSSDS __m512i _mm512_mask_4dpwssds_epi32(__m512i, __mmask16, __m512ix4, __m128i *);
    VP4DPWSSDS __m512i _mm512_maskz_4dpwssds_epi32(__mmask16, __m512i, __m512ix4, __m128i *);

SIMD Floating-Point Exceptions

None.

Other Exceptions

See Type E4; additionally:

| Exception | Condition |
|---|---|
| #UD | If the EVEX broadcast bit is set to 1. |
| #UD | If the MODRM.mod = 0b11. |

This UNOFFICIAL reference was generated from the official Intel® 64 and IA-32 Architectures Software Developer’s Manual by a dumb script. There is no guarantee that some parts aren't mangled or broken. It is distributed WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.