image/svg+xml VPTERNLOGD/VPTERNLOGQ—Bitwise Ternary Logic Instruction Operand Encoding Description VPTERNLOGD/Q takes three bit vectors of 512-bit length (in the first, second and third operand) as input data to form a set of 512 indices, each index is comprised of one bit from each input vector. The imm8 byte specifies a boolean logic table producing a binary value for each 3-bit index value. The final 512-bit boolean result is written to the destination operand (the first operand) using the writemask k1 with the granularity of doubleword element or quadword element into the destination. The destination operand is a ZMM (EVEX.512)/YMM (EVEX.256)/XMM (EVEX.128) register. The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32/64-bit memory location The destination operand is a ZMM register conditionally updated with writemask k1. Opcode/ Instruction Op / En 64/32 bit Mode Support CPUID Feature Flag Description EVEX.128.66.0F3A.W0 25 /r ib VPTERNLOGD xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst, imm8 AV/VAVX512VL AVX512F Bitwise ternary logic taking xmm1, xmm2 and xmm3/m128/m32bcst as source operands and writing the result to xmm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. EVEX.256.66.0F3A.W0 25 /r ib VPTERNLOGD ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst, imm8 AV/VAVX512VL AVX512F Bitwise ternary logic taking ymm1, ymm2 and ymm3/m256/m32bcst as source operands and writing the result to ymm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. EVEX.512.66.0F3A.W0 25 /r ib VPTERNLOGD zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst, imm8 AV/VAVX512FBitwise ternary logic taking zmm1, zmm2 and zmm3/m512/m32bcst as source operands and writing the result to zmm1 under writemask k1 with dword granularity. The immediate value determines the specific binary function being implemented. EVEX.128.66.0F3A.W1 25 /r ib VPTERNLOGQ xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst, imm8 AV/VAVX512VL AVX512F Bitwise ternary logic taking xmm1, xmm2 and xmm3/m128/m64bcst as source operands and writing the result to xmm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. EVEX.256.66.0F3A.W1 25 /r ib VPTERNLOGQ ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst, imm8 AV/VAVX512VL AVX512F Bitwise ternary logic taking ymm1, ymm2 and ymm3/m256/m64bcst as source operands and writing the result to ymm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. EVEX.512.66.0F3A.W1 25 /r ib VPTERNLOGQ zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst, imm8 AV/VAVX512FBitwise ternary logic taking zmm1, zmm2 and zmm3/m512/m64bcst as source operands and writing the result to zmm1 under writemask k1 with qword granularity. The immediate value determines the specific binary function being implemented. Op/EnTuple TypeOperand 1Operand 2Operand 3Operand 4 AFullModRM:reg (r, w)EVEX.vvvv (r)ModRM:r/m (r)Imm8 image/svg+xml Table 5-9 shows two examples of Boolean functions specified by immediate values 0xE2 and 0xE4, with the look up result listed in the fourth column following the three columns containing all possible values of the 3-bit index. Specifying different values in imm8 will allow any arbitrary three-input Boolean functions to be implemented in software using VPTERNLOGD/Q. Table 5-1 and Table 5-2 provide a mapping of all 256 possible imm8 values to various Boolean expressions. Operation VPTERNLOGD (EVEX encoded versions) (KL, VL) = (4, 128), (8, 256), (16, 512) FOR j := 0 TO KL-1 i := j * 32 IF k1[j] OR *no writemask* THEN FOR k := 0 TO 31 IF (EVEX.b = 1) AND (SRC2 *is memory*) THEN DEST[j][k] := imm[(DEST[i+k] << 2) + (SRC1[ i+k ] << 1) + SRC2[ k ]] ELSE DEST[j][k] := imm[(DEST[i+k] << 2) + (SRC1[ i+k ] << 1) + SRC2[ i+k ]] FI; ; table lookup of immediate bellow; ELSE IF *merging-masking*; merging-masking THEN *DEST[31+i:i] remains unchanged* ELSE ; zeroing-masking DEST[31+i:i] := 0 FI; FI; ENDFOR; DEST[MAXVL-1:VL] := 0 Table 5-9. Examples of VPTERNLOGD/Q Imm8 Boolean Function and Input Index Values VPTERNLOGD reg1, reg2, src3, 0xE2Bit Result with Imm8=0xE2 VPTERNLOGD reg1, reg2, src3, 0xE4Bit Result with Imm8=0xE4 Bit(reg1)Bit(reg2)Bit(src3)Bit(reg1)Bit(reg2)Bit(src3) 00000000 00110010 01000101 01100110 10001000 10111011 11011101 11111111 image/svg+xml VPTERNLOGQ (EVEX encoded versions) (KL, VL) = (2, 128), (4, 256), (8, 512) FOR j := 0 TO KL-1 i := j * 64 IF k1[j] OR *no writemask* THEN FOR k := 0 TO 63 IF (EVEX.b = 1) AND (SRC2 *is memory*) THEN DEST[j][k] := imm[(DEST[i+k] << 2) + (SRC1[ i+k ] << 1) + SRC2[ k ]] ELSE DEST[j][k] := imm[(DEST[i+k] << 2) + (SRC1[ i+k ] << 1) + SRC2[ i+k ]] FI;; table lookup of immediate bellow; ELSE IF *merging-masking*; merging-masking THEN *DEST[63+i:i] remains unchanged* ELSE ; zeroing-masking DEST[63+i:i] := 0 FI; FI; ENDFOR; DEST[MAXVL-1:VL] := 0 Intel C/C++ Compiler Intrinsic Equivalents VPTERNLOGD __m512i _mm512_ternarylogic_epi32(__m512i a, __m512i b, int imm); VPTERNLOGD __m512i _mm512_mask_ternarylogic_epi32(__m512i s, __mmask16 m, __m512i a, __m512i b, int imm); VPTERNLOGD __m512i _mm512_maskz_ternarylogic_epi32(__mmask m, __m512i a, __m512i b, int imm); VPTERNLOGD __m256i _mm256_ternarylogic_epi32(__m256i a, __m256i b, int imm); VPTERNLOGD __m256i _mm256_mask_ternarylogic_epi32(__m256i s, __mmask8 m, __m256i a, __m256i b, int imm); VPTERNLOGD __m256i _mm256_maskz_ternarylogic_epi32( __mmask8 m, __m256i a, __m256i b, int imm); VPTERNLOGD __m128i _mm_ternarylogic_epi32(__m128i a, __m128i b, int imm); VPTERNLOGD __m128i _mm_mask_ternarylogic_epi32(__m128i s, __mmask8 m, __m128i a, __m128i b, int imm); VPTERNLOGD __m128i _mm_maskz_ternarylogic_epi32( __mmask8 m, __m128i a, __m128i b, int imm); VPTERNLOGQ __m512i _mm512_ternarylogic_epi64(__m512i a, __m512i b, int imm); VPTERNLOGQ __m512i _mm512_mask_ternarylogic_epi64(__m512i s, __mmask8 m, __m512i a, __m512i b, int imm); VPTERNLOGQ __m512i _mm512_maskz_ternarylogic_epi64( __mmask8 m, __m512i a, __m512i b, int imm); VPTERNLOGQ __m256i _mm256_ternarylogic_epi64(__m256i a, __m256i b, int imm); VPTERNLOGQ __m256i _mm256_mask_ternarylogic_epi64(__m256i s, __mmask8 m, __m256i a, __m256i b, int imm); VPTERNLOGQ __m256i _mm256_maskz_ternarylogic_epi64( __mmask8 m, __m256i a, __m256i b, int imm); VPTERNLOGQ __m128i _mm_ternarylogic_epi64(__m128i a, __m128i b, int imm); VPTERNLOGQ __m128i _mm_mask_ternarylogic_epi64(__m128i s, __mmask8 m, __m128i a, __m128i b, int imm); VPTERNLOGQ __m128i _mm_maskz_ternarylogic_epi64( __mmask8 m, __m128i a, __m128i b, int imm); SIMD Floating-Point Exceptions None Other Exceptions See Table2-49, “Type E4 Class Exception Conditions”. This UNOFFICIAL reference was generated from the official Intel® 64 and IA-32 Architectures Software Developer’s Manual by a dumb script. There is no guarantee that some parts aren't mangled or broken and is distributed WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE .