@PeddleSpam PeddleSpam commented May 20, 2024

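Fixes an error in the GlobalISel CTLZ_ZERO_UNDEF lowering introduced by #88512: the any-extended source must be shifted left (buildShl), not logically right (buildLShr), before the 32-bit FFBH.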

llvmbot commented May 20, 2024

@llvm/pr-subscribers-llvm-globalisel

@llvm/pr-subscribers-backend-amdgpu

Author: Leon Clark (PeddleSpam)

Changes
  • Reapply "[ctx_profile] Integration test (#92456)"
  • [Github] Revert accidental changes to dependabot config
  • Fix: remove wrongly pushed etime-function.mlir at toplevel (#92634)
  • [MCAsmParser] .macro/.rept/.irp/.irpc: remove excess \n after expansion
  • [flang][OpenMP] Re-enable tests when building OpenMP as a runtime (#89046)
  • [flang][OpenMP] Try to unify induction var privatization for OMP regions. (#91116)
  • [MCAsmParser] Improve .rept/.irp tests
  • [clang][ThreadSafety] Skip past implicit cast in translateAttrExpr
  • [clang][NFC] Further improvements to const-correctness
  • [GlobalIsel] Combine select to integer min max more (#92570)
  • [X86][CodeGen] Support flags copy lowering for CCMP/CTEST (#91849)
  • [mlir] Add operator<< for printing Block (#92550)
  • [flang][cuf] Add attr gen dependency to fix #92635
  • [nfc][ctx_profile] Fix printf - related -Wformat-pedantic
  • [NVPTX] support immediate values in st.param instructions (#91523)
  • [VPlan] Remove unused removeLastOperand (NFC).
  • [dsymutil] Use operator==(StringRef, StringRef) (NFC)
  • [DWARFLinker] Use an implicit conversion of SmallString to StringRef (NFC)
  • [DXIL] Use consistent SmallVector parameters
  • [DAG] Use copysign in frem power-2 fold. (#91751)
  • [VectorCombine] Don't transform single shuffles in shuffleToIdentity
  • update_test_checks: match IR basic block labels (#88979)
  • [ThinLTO]Sort imported GUIDs before cache key update (#92622)
  • [nfc][InstrFDO]Encapsulate header writes in a class member function (#90142)
  • Reformat
  • Quick fix for a waning in clang_rt.ctx_profile [-Wgnu-anonymous-struct]
  • [NewPM][AMDGPU] Add CodeGenPassBuilder (#91040)
  • [gn build] Port b4ba3fe
  • [GISel][RISCV] Legalize G_CONSTANT_FOLD_BARRIER (#89960)
  • [VectorCombine] Additional extend tests for shuffleToIdentity. NFC
  • [DAG] canCreateUndefOrPoison - merge INSERT_VECTOR_ELT/EXTRACT_VECTOR_ELT cases. NFC.
  • [ctx_profile] Pass lib path into test
  • [DAG] canCreateUndefOrPoison - only compute extract/index vector elt index knownbits when not poison
  • [DAG] visitAVG - rewrite "fold (avgfloor x, 0) -> x >> 1" to use SDPatternMatch
  • [DAG] visitABD - rewrite "(abs x, 0)" folds to use SDPatternMatch
  • Revert "[Bounds-Safety] Temporarily relax a counted_by attribute restriction on flexible array members"
  • Revert "[BoundsSafety] Allow 'counted_by' attribute on pointers in structs in C (#90786)"
  • Revert "[Bounds-Safety] Fix pragma-attribute-supported-attributes-list.test"
  • [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (#88182)
  • [CodeGen][SDAG] Skip preferred extend at O0 (#92643)
  • [CodeGen][SDAG] Track returntwice in lowering info (#92640)
  • [llvm] Add KnownBits implementations for avgFloor and avgCeil (#86445)
  • SimplifyLibCalls: Permit pow(2, x) -> ldexp(1, x) fold for vectors (#92532)
  • [VPlan] Simplify (X && Y) || (X && !Y) -> X. (#89386)
  • HLSL availability diagnostics design doc (#92207)
  • [DOCS] ORCv2.rst Typo (#89482)
  • [Clang][HLSL] Add environment parameter to availability attribute (#89809)
  • ValueTracking: Correct undef handling for constant FP vectors (#92557)
  • [BOLT] Fix preserved offset in fixDoubleJumps (#92485)
  • [AMDGPU] Use LSH for lowering ctlz_zero_undef.i8/i16 (#88512)
  • [TableGen] Avoid std::string copy. NFC
  • Update llvm-bugs.yml (#77243)
  • [llvm] Use operator==(StringRef, StringRef) (NFC) (#92705)
  • [clang-format][NFC] Clean up SortIncludesTest.cpp
  • [mlir] Use operator==(StringRef, StringRef) (NFC) (#92706)
  • [CallPromotionUtils]Implement conditional indirect call promotion with vtable-based comparison (#81378)
  • [clang] Use operator==(StringRef, StringRef) (NFC) (#92708)
  • [SDAG][X86] Extend SplitVecOp_VSETCC for STRICT_FSETCC. (#92509)
  • [llvm] Use StringRef::contains (NFC) (#92710)
  • [Serialization] Read the initializer for interesting static variables before consuming it (#92353)
  • [BOLT][NFC] Don't assign YAML profile to functions with no CFG (#92487)
  • [InstCombine] Fold pointer adding in integer to arithmetic add (#91596)
  • [AMDGPU] Use removeFnAttrFromReachable in lower-module-lds pass. (#92686)
  • [AMDGPU] Fix kernarg preloading crash with some types and alignments (#91625)
  • [ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option (#88024)
  • [NFC] Remove unused ASTWriter::getTypeID
  • [SCEV] Don't use non-deterministic constant folding for trip counts (#90942)
  • Revert "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#92715)
  • [llvm] Use SmallString::str (NFC) (#92712)
  • [AMDGPU] Only set Info.memVT when not later overridden (#92670)
  • [MC] Make UseAssemblerInfoForParsing mostly true
  • MIPS: Support '%w' token in inline asm template for MSA (#91920)
  • Clang/MIPS: Add +fp64 if MSA and no explicit -mfp option (#91949)
  • MIPS/Clang: Use FP32 by default if CPU is mips1 (#92122)
  • [ELF] Support high address DW_EH_sdata4 for ELFCLASS32
  • [PowerPC]perform bitcast lowering only at 64 bit
  • [LoongArch] Select {DIV,MOD}.{W,WU} instruction to eliminate explicit sign extension (#92205)
  • [Clang] Fix __is_array returning true for zero-sized arrays (#86652)
  • [OpenCL] Add cl_khr_kernel_clock builtins (#91950)
  • [clang][ExtractAPI] Remove symbols defined in categories to external types unless requested (#92522)
  • [RISCV][CostModel] Remove cost of icmp inst in icmp+select with SFB. (#91158)
  • [DebugInfo][GVNSink] Fix #77415: GVNSink fails to optimize LLVM IR with debug info (#77602)
  • [AArch64] Add PreTest for optimizing MOV to ORR
  • [Driver][PS5] Set visibility option defaults (#92091)
  • [AArch64] Optimize MOV to ORR when load symmetric constants (#86249)
  • [Coverage] Rework !SystemHeadersCoverage (#91446)
  • [lldb][Windows] Fixed LibcxxChronoTimePointSecondsSummaryProvider() (#92701)
  • [ConstantFolding] Canonicalize constexpr GEPs to i8 (#89872)
  • InstSimplify: increase shufflevector test coverage (#92407)
  • [flang][HLFIR] Adapt SimplifyHLFIRIntrinsics to run on all top level ops (#92573)
  • movimm-expand-ldst.mir (d3d6565) requires asserts
  • [SLP] NFC. Use TreeEntry::getOperand if setOperandsInOrder is called (#92727)
  • [MLIR][OpenMP] NFC: Split OpenMP dialect definitions (#91741)
  • [mlir][irdl] Fix missing verifier in irdl.parametric (#92700)
  • [VPlan] Add commutative binary OR matcher, use in transform. (#92539)
  • [CloneFunction] Remove check that is no longer necessary (#92577)
  • [ValueTracking] Fix incorrect inferrence about the signbit of sqrt (#92510)
  • [LAA] Add tests with invariant accesses using vector types.
  • [clang] CTAD alias: Fix missing template arg packs during the transformation (#92535)
  • [TableGen] HasOneUse builtin predicate on PatFrags (#91578)
  • [clang] Make PS template DLL attribute propagation the same as MSVC (#92549)
  • [DebugInfo][NaryReassociate] Fix missing debug location updates (#92545)
  • [clang] Use SmallString::str (NFC) (#92717)
  • [libcxx] locale.cpp: Move build_name helper into unnamed namespace (#92461)
  • [Offload] Remove unused version script for plugins
  • [AMDGPU] Fix error in #88512.

Full diff: https://github.com/llvm/llvm-project/pull/92770.diff

1 File Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp (+1-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 15a4b6796880f..3523fcc7dbd50 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4168,7 +4168,7 @@ bool AMDGPULegalizerInfo::legalizeCTLZ_ZERO_UNDEF(MachineInstr &MI,
   auto ShiftAmt = B.buildConstant(S32, 32u - NumBits);
   auto Extend = B.buildAnyExt(S32, {Src}).getReg(0u);
-  auto Shift = B.buildLShr(S32, {Extend}, ShiftAmt);
+  auto Shift = B.buildShl(S32, {Extend}, ShiftAmt);
   auto Ctlz = B.buildInstr(AMDGPU::G_AMDGPU_FFBH_U32, {S32}, {Shift});
   B.buildTrunc(Dst, Ctlz);
   MI.eraseFromParent();
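To illustrate why the direction of the shift matters, here is a minimal standalone sketch. It is not LLVM code: the function names are hypothetical, and plain `uint32_t` arithmetic stands in for the any-extend, the shift, and the 32-bit leading-zero count. Shifting left by `32 - NumBits` moves the narrow value's top bit to bit 31 and discards whatever the any-extend left in the high bits, so the 32-bit count equals the narrow count. Shifting right instead pushes the value out of the register.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zeros of a 32-bit value, scanning from bit 31 down.
// Returns 32 for zero (the real operation leaves that case undefined).
static unsigned ctlz32(uint32_t v) {
  unsigned n = 0;
  for (uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0; mask >>= 1)
    ++n;
  return n;
}

// Reference: leading zeros of the low numBits bits of v (numBits is 8 or 16).
static unsigned ctlzNarrow(uint32_t v, unsigned numBits) {
  unsigned n = 0;
  for (uint32_t mask = 1u << (numBits - 1); mask != 0 && (v & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Lowering as fixed in this PR: shift left, then count on 32 bits.
static unsigned ctlzViaShl(uint32_t extended, unsigned numBits) {
  return ctlz32(extended << (32u - numBits));
}

// Lowering as it was before the fix: shift right, then count on 32 bits.
static unsigned ctlzViaLShr(uint32_t extended, unsigned numBits) {
  return ctlz32(extended >> (32u - numBits));
}

// Checks every nonzero narrow input, with the high bits filled with ones
// to mimic the unspecified bits an any-extend may leave behind.
static bool shlMatchesReference(unsigned numBits) {
  const uint32_t garbage = ~0u << numBits;
  for (uint32_t v = 1; v < (1u << numBits); ++v)
    if (ctlzViaShl(v | garbage, numBits) != ctlzNarrow(v, numBits))
      return false;
  return true;
}

// Hypothetical self-check a caller could run once.
static void selfCheck() {
  assert(shlMatchesReference(8));
  assert(shlMatchesReference(16));
}
```

For the 8-bit input `0x10`, the left-shift version gives 3, while the right-shift version shifts the value down to zero and reports 32.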
@PeddleSpam PeddleSpam requested review from arsenm and jayfoad May 20, 2024 15:24
jayfoad commented May 20, 2024

This ought to require an update to the tests too.

@PeddleSpam PeddleSpam merged commit e1c06c3 into llvm:main May 20, 2024