[WebAssembly] Add extra pattern for dot #151775

badumbatish · 2025-08-01T21:45:50Z

Fixes #50154

llvmbot · 2025-08-01T21:46:27Z

@llvm/pr-subscribers-backend-webassembly

Author: Jasmine Tang (badumbatish)

Changes

Fixes #50154

Full diff: https://github.com/llvm/llvm-project/pull/151775.diff

2 Files Affected:

(modified) llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp (+51)
(added) llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll (+21)

diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index cd434f7a331e4..648e3b6b2b440 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -192,6 +192,9 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
   // Combine wide-vector muls, with extend inputs, to extmul_half.
   setTargetDAGCombine(ISD::MUL);
 
+  // Combine add with vector shuffle of muls to dots
+  setTargetDAGCombine(ISD::ADD);
+
   // Combine vector mask reductions into alltrue/anytrue
   setTargetDAGCombine(ISD::SETCC);
 
@@ -3436,6 +3439,52 @@ static SDValue performSETCCCombine(SDNode *N,
   return SDValue();
 }
 
+static SDValue performAddCombine(SDNode *N, SelectionDAG &DAG) {
+  assert(N->getOpcode() == ISD::ADD);
+  EVT VT = N->getValueType(0);
+  SDValue N0 = N->getOperand(0), N1 = N->getOperand(1);
+
+  if (VT != MVT::v4i32)
+    return SDValue();
+
+  auto IsShuffleWithMask = [](SDValue V, ArrayRef<int> ShuffleValue) {
+    if (V.getOpcode() != ISD::VECTOR_SHUFFLE)
+      return SDValue();
+    if (cast<ShuffleVectorSDNode>(V)->getMask() != ShuffleValue)
+      return SDValue();
+    return V;
+  };
+  auto ShuffleA = IsShuffleWithMask(N0, {0, 2, 4, 6});
+  auto ShuffleB = IsShuffleWithMask(N1, {1, 3, 5, 7});
+  // two SDValues must be muls
+  if (!ShuffleA || !ShuffleB)
+    return SDValue();
+
+  if (ShuffleA.getOperand(0) != ShuffleB.getOperand(0) ||
+      ShuffleA.getOperand(1) != ShuffleB.getOperand(1))
+    return SDValue();
+
+  auto IsMulExtend =
+      [](SDValue V, WebAssemblyISD::NodeType I) -> std::pair<SDValue, SDValue> {
+    if (V.getOpcode() != ISD::MUL)
+      return {};
+
+    auto V0 = V.getOperand(0), V1 = V.getOperand(1);
+    if (V0.getOpcode() != I || V1.getOpcode() != I)
+      return {};
+    return {V0.getOperand(0), V1.getOperand(0)};
+  };
+
+  auto [LowA, LowB] =
+      IsMulExtend(ShuffleA.getOperand(0), WebAssemblyISD::EXTEND_LOW_S);
+  auto [HighA, HighB] =
+      IsMulExtend(ShuffleA.getOperand(1), WebAssemblyISD::EXTEND_HIGH_S);
+
+  if (!LowA || !LowB || !HighA || !HighB || LowA != HighA || LowB != HighB)
+    return SDValue();
+
+  return DAG.getNode(WebAssemblyISD::DOT, SDLoc(N), MVT::v4i32, LowA, LowB);
+}
 static SDValue performMulCombine(SDNode *N, SelectionDAG &DAG) {
   assert(N->getOpcode() == ISD::MUL);
   EVT VT = N->getValueType(0);
@@ -3558,5 +3607,7 @@ WebAssemblyTargetLowering::PerformDAGCombine(SDNode *N,
   }
   case ISD::MUL:
     return performMulCombine(N, DCI.DAG);
+  case ISD::ADD:
+    return performAddCombine(N, DCI.DAG);
   }
 }
diff --git a/llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll b/llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll
new file mode 100644
index 0000000000000..7ac49794491a1
--- /dev/null
+++ b/llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mattr=+simd128 | FileCheck %s
+
+target triple = "wasm32-unknown-unknown"
+define <4 x i32> @dot(<8 x i16> %a, <8 x i16> %b) {
+; CHECK-LABEL: dot:
+; CHECK: .functype dot (v128, v128) -> (v128)
+; CHECK-NEXT: # %bb.0:
+; CHECK-NEXT: local.get 0
+; CHECK-NEXT: local.get 1
+; CHECK-NEXT: i32x4.dot_i16x8_s
+; CHECK-NEXT: # fallthrough-return
+  %sext1 = sext <8 x i16> %a to <8 x i32>
+  %sext2 = sext <8 x i16> %b to <8 x i32>
+  %mul = mul nsw <8 x i32> %sext1, %sext2
+  %shuffle1 = shufflevector <8 x i32> %mul, <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
+  %shuffle2 = shufflevector <8 x i32> %mul, <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
+  %res = add <4 x i32> %shuffle1, %shuffle2
+  ret <4 x i32> %res
+}
+
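To see why the combine in the diff above may replace the add-of-shuffles with a single dot, it can help to model both forms on scalars. The following is only a sketch and not part of the patch: `shuffleAddForm` and `dotI16x8S` are hypothetical helper names, and the second function is a hand-written model of `i32x4.dot_i16x8_s` (each output lane is the wrapping sum of two adjacent signed 16-bit products), not LLVM or engine code.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using I16x8 = std::array<int16_t, 8>;
using I32x4 = std::array<int32_t, 4>;

// Hypothetical helper: models the IR from the test. Sign-extend both inputs
// to i32, multiply lane-wise, then add the lanes picked by mask <0,2,4,6> to
// the lanes picked by mask <1,3,5,7>.
I32x4 shuffleAddForm(const I16x8 &a, const I16x8 &b) {
  std::array<int32_t, 8> mul{};
  for (int i = 0; i < 8; ++i)
    mul[i] = int32_t(a[i]) * int32_t(b[i]); // |product| <= 2^30, fits in i32

  const std::array<int, 4> evenMask{0, 2, 4, 6};
  const std::array<int, 4> oddMask{1, 3, 5, 7};
  I32x4 res{};
  for (int i = 0; i < 4; ++i) {
    // Add in uint32_t so the sum wraps like an IR `add` without nsw.
    res[i] = int32_t(uint32_t(mul[evenMask[i]]) + uint32_t(mul[oddMask[i]]));
  }
  return res;
}

// Hypothetical helper: scalar model of i32x4.dot_i16x8_s, where each output
// lane is the wrapping sum of two adjacent signed 16-bit products.
I32x4 dotI16x8S(const I16x8 &a, const I16x8 &b) {
  I32x4 res{};
  for (int i = 0; i < 4; ++i) {
    uint32_t lo = uint32_t(int32_t(a[2 * i]) * int32_t(b[2 * i]));
    uint32_t hi = uint32_t(int32_t(a[2 * i + 1]) * int32_t(b[2 * i + 1]));
    res[i] = int32_t(lo + hi);
  }
  return res;
}
```

For `a = {1, 2, 3, 4, 5, 6, 7, 8}` and `b = {8, 7, 6, 5, 4, 3, 2, 1}`, both functions return `{22, 38, 38, 22}`.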

lukel97

Out of curiosity, is it possible to do this as a tablegen pattern? Not that it's necessarily the right thing to do, just wondering if it's easy to do or not!

llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll

badumbatish · 2025-08-05T02:01:14Z

> Out of curiosity, is it possible to do this as a tablegen pattern? Not that it's necessarily the right thing to do, just wondering if it's easy to do or not!

I actually didn't think that tablegen could handle identical arguments in a pattern. I just tried it out and I think I might be able to make it work.

sparker-arm · 2025-08-05T07:20:28Z

> i think i might be able to make it work

I'm assuming the 'illegal' types will make this more difficult in tablegen? This approach looks good to me.

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll

lukel97

PR title needs to be updated to reflect that this isn't a combine but adds a ~~fold~~ pattern

EDIT: Typo, sorry!

llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll

llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td

lukel97 · 2025-08-06T04:40:03Z

> > i think i might be able to make it work
>
> I'm assuming the 'illegal' types will make this more difficult in tablegen? This approach looks good to me.

Good point. It looks like the old combine only operated on MVT::v8i16s feeding into MVT::v4i32 adds, but I presume we also want to handle i8s that are sext'd to i32, e.g:

define <4 x i32> @f(<8 x i8> %a, <8 x i8> %b) {
  %sext1 = sext <8 x i8> %a to <8 x i32>
  %sext2 = sext <8 x i8> %b to <8 x i32>
  %mul = mul nsw <8 x i32> %sext1, %sext2
  %shuffle1 = shufflevector <8 x i32> %mul, <8 x i32> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  %shuffle2 = shufflevector <8 x i32> %mul, <8 x i32> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
  %res = add <4 x i32> %shuffle1, %shuffle2
  ret <4 x i32> %res
}

And from a quick check it looks like the multiply is hoisted into the narrower type:

Optimized legalized selection DAG: %bb.0 'f:'
SelectionDAG has 46 nodes:
  t2: v16i8 = WebAssemblyISD::ARGUMENT TargetConstant:i32<0>
  t40: v8i16 = WebAssemblyISD::EXTEND_LOW_S t2
  t4: v16i8 = WebAssemblyISD::ARGUMENT TargetConstant:i32<1>
  t39: v8i16 = WebAssemblyISD::EXTEND_LOW_S t4
  t35: v8i16 = mul t40, t39
  t37: v4i32 = WebAssemblyISD::EXTEND_HIGH_S t35
  t36: v4i32 = WebAssemblyISD::EXTEND_LOW_S t35
  t0: ch,glue = EntryToken
  t77: v4i32 = WebAssemblyISD::SHUFFLE t36, t37, Constant:i32<0>, Constant:i32<1>, Constant:i32<2>, Constant:i32<3>, Constant:i32<8>, Constant:i32<9>, Constant:i32<10>, Constant:i32<11>, Constant:i32<16>, Constant:i32<17>, Constant:i32<18>, Constant:i32<19>, Constant:i32<24>, Constant:i32<25>, Constant:i32<26>, Constant:i32<27>
  t60: v4i32 = WebAssemblyISD::SHUFFLE t36, t37, Constant:i32<4>, Constant:i32<5>, Constant:i32<6>, Constant:i32<7>, Constant:i32<12>, Constant:i32<13>, Constant:i32<14>, Constant:i32<15>, Constant:i32<20>, Constant:i32<21>, Constant:i32<22>, Constant:i32<23>, Constant:i32<28>, Constant:i32<29>, Constant:i32<30>, Constant:i32<31>
  t30: v4i32 = add t77, t60
  t31: ch = WebAssemblyISD::RETURN t0, t30
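The DAG above multiplies in `v8i16` and only then extends to `v4i32`, whereas the IR extends to i32 first. A quick way to convince oneself that this hoisting preserves values for sign-extended i8 inputs is an exhaustive scalar check. This is only a sketch with hypothetical helper names; it models one lane and says nothing about how LLVM itself decides to narrow the multiply.

```cpp
#include <cassert>
#include <cstdint>

// Multiply after sign-extending straight to i32, as the IR is written.
int32_t mulWide(int8_t a, int8_t b) { return int32_t(a) * int32_t(b); }

// Multiply in i16 and then sign-extend the product to i32, as in the DAG.
int32_t mulNarrowThenExtend(int8_t a, int8_t b) {
  // The product of two i8 values lies in [-16256, 16384], so it fits in i16.
  int16_t p = int16_t(int16_t(a) * int16_t(b));
  return int32_t(p);
}

// Returns true if both forms agree for every pair of i8 inputs.
bool hoistIsExact() {
  for (int a = -128; a <= 127; ++a)
    for (int b = -128; b <= 127; ++b)
      if (mulWide(int8_t(a), int8_t(b)) !=
          mulNarrowThenExtend(int8_t(a), int8_t(b)))
        return false;
  return true;
}
```

The extreme case is `(-128) * (-128) = 16384`, which still fits in a signed 16-bit lane, so `hoistIsExact()` holds for all 65536 input pairs.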

We could add a second tablegen pattern for (add (shuffle (sext mul)) (shuffle (sext mul))).

Or we could try and do this again as a dagcombine, but it looks like the multiply is already hoisted by the time the types are legalized so we would have to handle the two different patterns anyway:

Type-legalized selection DAG: %bb.0 'f:'
SelectionDAG has 26 nodes:
  t2: v16i8 = WebAssemblyISD::ARGUMENT TargetConstant:i32<0>
  t40: v8i16 = WebAssemblyISD::EXTEND_LOW_S t2
  t4: v16i8 = WebAssemblyISD::ARGUMENT TargetConstant:i32<1>
  t39: v8i16 = WebAssemblyISD::EXTEND_LOW_S t4
  t35: v8i16 = mul t40, t39
  t36: v4i32 = WebAssemblyISD::EXTEND_LOW_S t35
  t37: v4i32 = WebAssemblyISD::EXTEND_HIGH_S t35
  t0: ch,glue = EntryToken
  t12: i32 = extract_vector_elt t36, Constant:i32<0>
  t14: i32 = extract_vector_elt t36, Constant:i32<2>
  t16: i32 = extract_vector_elt t37, Constant:i32<0>
  t18: i32 = extract_vector_elt t37, Constant:i32<2>
  t19: v4i32 = BUILD_VECTOR t12, t14, t16, t18
  t21: i32 = extract_vector_elt t36, Constant:i32<1>
  t23: i32 = extract_vector_elt t36, Constant:i32<3>
  t25: i32 = extract_vector_elt t37, Constant:i32<1>
  t27: i32 = extract_vector_elt t37, Constant:i32<3>
  t28: v4i32 = BUILD_VECTOR t21, t23, t25, t27
  t30: v4i32 = add t19, t28
  t31: ch = WebAssemblyISD::RETURN t0, t30

badumbatish · 2025-08-07T19:10:27Z

godbolt for this https://godbolt.org/z/Y1rrnW5h3. The IR at the isel phase looks a bit different. Would you want me to try that in this PR as well?

lukel97

LGTM with the PR title updated

I think we can handle the zext from i8 case in a separate PR if the pattern ends up being somewhat different anyway, but I'll defer to @sparker-arm on this! (I don't have a strong opinion on whether we do this as a combine or a tablegen pattern.)

llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td

badumbatish · 2025-10-13T15:55:04Z

hi @sparker-arm, i'm looking to add a similar pattern for relaxed simd dot. Can you confirm whether this approach is good to merge and to reuse in the subsequent PR?

sparker-arm

Sorry, I didn't know you were waiting on me, please feel free to shout sooner next time!

badumbatish · 2025-10-13T17:27:04Z

no worries, i'll keep that in mind next time, ty for the reviews!

Fixes llvm#50154

The pattern I added for `relaxed dot` is similar to the one for normal dot in #151775. For `relaxed dot add`, I noticed that in the proposal the dot portion of the implementation is similar to `relaxed dot`, so I think we can add a pattern where, after we do a relaxed dot and an extadd pairwise, we can do `relaxed dot add`. One current obstacle is that I don't think there is any pattern that creates an extadd pairwise on its own from other instructions, so the `relaxed dot add` pattern would not cover a wide range of instructions. Related to #55932.
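The relationship described here can be sketched on scalars. The model below is an assumption-laden illustration, not the proposal's reference semantics: all function names are hypothetical, and it only covers inputs where every lane of `b` is in `[0, 127]`, since the relaxed instructions leave other inputs implementation-defined. Under that assumption the 16-bit pair sums cannot overflow, and a relaxed dot followed by a pairwise extending add and an accumulate matches a direct four-product sum per lane.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using I8x16 = std::array<int8_t, 16>;
using I16x8 = std::array<int16_t, 8>;
using I32x4 = std::array<int32_t, 4>;

// Model of relaxed dot, ASSUMING each lane of b is in [0, 127]. Each i16 lane
// is the sum of two adjacent products, which then lies in [-32512, 32258].
I16x8 relaxedDot(const I8x16 &a, const I8x16 &b) {
  I16x8 res{};
  for (int i = 0; i < 8; ++i)
    res[i] = int16_t(int(a[2 * i]) * int(b[2 * i]) +
                     int(a[2 * i + 1]) * int(b[2 * i + 1]));
  return res;
}

// Model of a signed pairwise extending add from i16x8 to i32x4.
I32x4 extaddPairwiseS(const I16x8 &v) {
  I32x4 res{};
  for (int i = 0; i < 4; ++i)
    res[i] = int32_t(v[2 * i]) + int32_t(v[2 * i + 1]);
  return res;
}

// Composed form: relaxed dot, then extadd pairwise, then add the accumulator.
I32x4 relaxedDotAddComposed(const I8x16 &a, const I8x16 &b, const I32x4 &c) {
  I32x4 p = extaddPairwiseS(relaxedDot(a, b));
  I32x4 res{};
  for (int i = 0; i < 4; ++i)
    res[i] = int32_t(uint32_t(p[i]) + uint32_t(c[i])); // wrapping add
  return res;
}

// Direct form: each i32 lane is four adjacent products plus the accumulator.
I32x4 relaxedDotAddDirect(const I8x16 &a, const I8x16 &b, const I32x4 &c) {
  I32x4 res{};
  for (int i = 0; i < 4; ++i) {
    int32_t sum = 0;
    for (int j = 0; j < 4; ++j)
      sum += int32_t(a[4 * i + j]) * int32_t(b[4 * i + j]);
    res[i] = int32_t(uint32_t(sum) + uint32_t(c[i])); // wrapping add
  }
  return res;
}
```

With `a = {1, -2, 3, -4, ..., 15, -16}`, every lane of `b` equal to 2, and `c = {10, 20, 30, 40}`, both forms give `{6, 16, 26, 36}`.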


badumbatish added 2 commits August 1, 2025 14:41

Precommit test

4d304c8

Added combine support for dot

cb9aac0

llvmbot added the backend:WebAssembly label Aug 1, 2025

Fix merge conflict from main

6553946

badumbatish requested review from lukel97 and sparker-arm August 4, 2025 16:59

lukel97 reviewed Aug 4, 2025

llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll

sparker-arm reviewed Aug 5, 2025

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp

llvm/test/CodeGen/WebAssembly/simd-dot-reductions.ll

badumbatish added 2 commits August 5, 2025 10:38

Transition to tablegen for pattern

86fe99b

Addresses PR reviews

34f58f1

lukel97 reviewed Aug 6, 2025


Fix PR reviews

78965f0

lukel97 approved these changes Aug 8, 2025

llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td

badumbatish changed the title ~~[WebAssembly] Add fold support for dot~~ [WebAssembly] Add extra pattern for dot Aug 11, 2025

Clarify dot pattern comment

75e0c38

sparker-arm approved these changes Oct 13, 2025


badumbatish merged commit 55d4e92 into llvm:main Oct 13, 2025
9 checks passed

badumbatish mentioned this pull request Oct 13, 2025

[WebAssembly] [Codegen] Add patterns for relaxed dot #163266

Merged

akadutta pushed a commit to akadutta/llvm-project that referenced this pull request Oct 14, 2025

[WebAssembly] Add extra pattern for dot (llvm#151775)

77a3db1

Fixes llvm#50154
