Suboptimal codegen for Vector128.NarrowWithSaturation #116526

@xtqqczze

Description

I expect PackSources0 and PackSources1 to have exactly the same codegen in the following code snippet:

static Vector128<byte> PackSources1(Vector128<ushort> lower, Vector128<ushort> upper)
    => Vector128.NarrowWithSaturation(lower, upper);

static Vector128<byte> PackSources0(Vector128<ushort> lower, Vector128<ushort> upper)
    => Sse2.PackUnsignedSaturate(
        Vector128.Min(lower, Vector128.Create((ushort)255)).AsInt16(),
        Vector128.Min(upper, Vector128.Create((ushort)255)).AsInt16());
// coreclr trunk-20250611+5415b7342d44af9c974905760539f198fad13682

C:PackSources1(System.Runtime.Intrinsics.Vector128`1[ushort],System.Runtime.Intrinsics.Vector128`1[ushort]):System.Runtime.Intrinsics.Vector128`1[byte] (FullOpts):
       vbroadcastss xmm0, dword ptr [reloc @RWD00]
       vpminuw  xmm1, xmm0, xmmword ptr [rsp+0x08]
       vpand    xmm1, xmm1, xmm0
       vpminuw  xmm2, xmm0, xmmword ptr [rsp+0x18]
       vpand    xmm0, xmm2, xmm0
       vpackuswb xmm0, xmm1, xmm0
       vmovups  xmmword ptr [rdi], xmm0
       mov      rax, rdi
       ret      
RWD00  	dd	00FF00FFh		; 2.34184e-38

C:PackSources0(System.Runtime.Intrinsics.Vector128`1[ushort],System.Runtime.Intrinsics.Vector128`1[ushort]):System.Runtime.Intrinsics.Vector128`1[byte] (FullOpts):
       vbroadcastss xmm0, dword ptr [reloc @RWD00]
       vpminuw  xmm1, xmm0, xmmword ptr [rsp+0x08]
       vpminuw  xmm0, xmm0, xmmword ptr [rsp+0x18]
       vpackuswb xmm0, xmm1, xmm0
       vmovups  xmmword ptr [rdi], xmm0
       mov      rax, rdi
       ret      
RWD00  	dd	00FF00FFh		; 2.34184e-38
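
The only difference between the two listings is the pair of `vpand` instructions in PackSources1, and they appear to be redundant: after an unsigned min against 255, each 16-bit lane already fits in its low byte, so masking with 0x00FF should change nothing. Below is a minimal scalar sketch of that argument, not the JIT's actual logic. The `NarrowCheck` class and its methods are hypothetical names for illustration, modelling a single lane and checking every ushort value:

```csharp
using System;

static class NarrowCheck
{
    // Hypothetical helper: scalar model of one 16-bit lane.
    // Returns true when masking with 0x00FF after an unsigned
    // min with 255 leaves the value unchanged.
    public static bool MaskIsRedundant(ushort x)
    {
        ushort clamped = Math.Min(x, (ushort)255);   // models vpminuw against 0x00FF
        ushort masked = (ushort)(clamped & 0x00FF);  // models the extra vpand
        return clamped == masked;
    }

    // Exhaustively counts the lane values for which the mask would matter.
    public static int CountMismatches()
    {
        int mismatches = 0;
        for (int i = 0; i <= ushort.MaxValue; i++)
        {
            if (!MaskIsRedundant((ushort)i))
            {
                mismatches++;
            }
        }
        return mismatches;
    }
}
```

`NarrowCheck.CountMismatches()` returns 0, which is consistent with the expectation that the two methods could share the shorter instruction sequence.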

https://csharp.godbolt.org/z/o39G8GP9T

Related: #115525

cc: @tannergooding

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), help wanted ([up-for-grabs] Good issue for external contributors)