VRNDSCALEPS—Round Packed Float32 Values To Include A Given Number Of Fraction Bits

Opcode/Instruction: EVEX.128.66.0F3A.W0 08 /r ib VRNDSCALEPS xmm1 {k1}{z}, xmm2/m128/m32bcst, imm8
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512VL AVX512F
Description: Rounds packed single-precision floating-point values in xmm2/m128/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. Under writemask.

Opcode/Instruction: EVEX.256.66.0F3A.W0 08 /r ib VRNDSCALEPS ymm1 {k1}{z}, ymm2/m256/m32bcst, imm8
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512VL AVX512F
Description: Rounds packed single-precision floating-point values in ymm2/m256/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register. Under writemask.

Opcode/Instruction: EVEX.512.66.0F3A.W0 08 /r ib VRNDSCALEPS zmm1 {k1}{z}, zmm2/m512/m32bcst{sae}, imm8
Op/En: A
64/32 bit Mode Support: V/V
CPUID Feature Flag: AVX512F
Description: Rounds packed single-precision floating-point values in zmm2/m512/m32bcst to a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register using writemask.

Instruction Operand Encoding

Op/En   Tuple Type   Operand 1       Operand 2       Operand 3   Operand 4
A       Full         ModRM:reg (w)   ModRM:r/m (r)   Imm8        NA

Description

Round the single-precision floating-point values in the source operand by the rounding mode specified in the immediate operand (see Figure 5-29) and place the result in the destination operand.

The destination operand (the first operand) is a ZMM register conditionally updated according to the writemask. The source operand (the second operand) can be a ZMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 32-bit memory location.

The rounding process rounds the input to an integral value plus a number of fraction bits specified by imm8[7:4] (to be included in the result), and returns the result as a single-precision floating-point value.

Note that no overflow is induced while executing this instruction (although the source is scaled by the imm8[7:4] value).

The immediate operand also specifies control fields for the rounding operation. Three bit fields are defined and shown in the “Immediate Control Description” figure below. Bit 3 of the immediate byte controls the processor behavior for a precision exception, and bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (the immediate control table below lists the encoded values for the rounding-mode field).

The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN, it will be converted to a QNaN. If DAZ is set to '1', denormals will be converted to zero before rounding.

The sign of the result of this instruction is preserved, including the sign of zero.

The formula of the operation on each data element for VRNDSCALEPS is

ROUND(x) = 2^(-M) * Round_to_INT(x * 2^M, round_ctrl),
round_ctrl = imm[3:0];
M = imm[7:4];

The operation x * 2^M is computed as if the exponent range is unlimited (i.e., no overflow ever occurs).

VRNDSCALEPS is a more general form of the VEX-encoded VROUNDPS instruction. In VROUNDPS, the formula of the operation on each element is

ROUND(x) = Round_to_INT(x, round_ctrl),
round_ctrl = imm[3:0];

Note: EVEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.

Handling of special-case input values is listed in Table 5-16.

Operation

RoundToIntegerSP(SRC[31:0], imm8[7:0]) {
    if (imm8[2] = 1)
        rounding_direction := MXCSR:RC      ; get round control from MXCSR
    else
        rounding_direction := imm8[1:0]     ; get round control from imm8[1:0]
    FI
    M := imm8[7:4]                          ; get the scaling factor
    case (rounding_direction)
        00: TMP[31:0] := round_to_nearest_even_integer(2^M * SRC[31:0])
        01: TMP[31:0] := round_to_equal_or_smaller_integer(2^M * SRC[31:0])
        10: TMP[31:0] := round_to_equal_or_larger_integer(2^M * SRC[31:0])
        11: TMP[31:0] := round_to_nearest_smallest_magnitude_integer(2^M * SRC[31:0])
    ESAC;
    Dest[31:0] := 2^(-M) * TMP[31:0]        ; scale down back to 2^(-M)
    if (imm8[3] = 0) Then                   ; check SPE
        if (SRC[31:0] != Dest[31:0]) Then   ; check precision lost
            set_precision()                 ; set #PE
        FI;
    FI;
    return(Dest[31:0])
}

VRNDSCALEPS (EVEX encoded versions)

(KL, VL) = (4, 128), (8, 256), (16, 512)
IF *src is a memory operand*
    THEN TMP_SRC := BROADCAST32(SRC, VL, k1)
    ELSE TMP_SRC := SRC
FI;
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := RoundToIntegerSP(TMP_SRC[i+31:i], imm8[7:0])
        ELSE
            IF *merging-masking*            ; merging-masking
                THEN *DEST[i+31:i] remains unchanged*
                ELSE                        ; zeroing-masking
                    DEST[i+31:i] := 0
            FI;
    FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0

Intel C/C++ Compiler Intrinsic Equivalent

VRNDSCALEPS __m512 _mm512_roundscale_ps( __m512 a, int imm);
VRNDSCALEPS __m512 _mm512_roundscale_round_ps( __m512 a, int imm, int sae);
VRNDSCALEPS __m512 _mm512_mask_roundscale_ps(__m512 s, __mmask16 k, __m512 a, int imm);
VRNDSCALEPS __m512 _mm512_mask_roundscale_round_ps(__m512 s, __mmask16 k, __m512 a, int imm, int sae);
VRNDSCALEPS __m512 _mm512_maskz_roundscale_ps( __mmask16 k, __m512 a, int imm);
VRNDSCALEPS __m512 _mm512_maskz_roundscale_round_ps( __mmask16 k, __m512 a, int imm, int sae);
VRNDSCALEPS __m256 _mm256_roundscale_ps( __m256 a, int imm);
VRNDSCALEPS __m256 _mm256_mask_roundscale_ps(__m256 s, __mmask8 k, __m256 a, int imm);
VRNDSCALEPS __m256 _mm256_maskz_roundscale_ps( __mmask8 k, __m256 a, int imm);
VRNDSCALEPS __m128 _mm_roundscale_ps( __m128 a, int imm);
VRNDSCALEPS __m128 _mm_mask_roundscale_ps(__m128 s, __mmask8 k, __m128 a, int imm);
VRNDSCALEPS __m128 _mm_maskz_roundscale_ps( __mmask8 k, __m128 a, int imm);

SIMD Floating-Point Exceptions

Invalid, Precision

If SPE is enabled, precision exception is not reported (regardless of MXCSR exception mask).

Other Exceptions

See Table 2-46, “Type E2 Class Exception Conditions”.

This UNOFFICIAL reference was generated from the official Intel® 64 and IA-32 Architectures Software Developer’s Manual by a dumb script. There is no guarantee that some parts aren't mangled or broken, and it is distributed WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.