- Notifications
You must be signed in to change notification settings - Fork 14.1k
Open
Labels
A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-codegenArea: Code generationArea: Code generationC-enhancementCategory: An issue proposing an enhancement or a PR with one.Category: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchCategory: An issue highlighting optimization opportunities or PRs implementing suchE-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Call for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.Issue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.
Description
Consider this example:
pub fn parse(x: usize) -> usize { match x % 5 { 0 => f1(x), 1 => f2(x), 2 => f3(x), 3 => f4(x), 4 => f5(x), // 5 | 6 | 7 => loop{}, _ => loop{}, } }It currently generates similar LLVM-IR:
start: %_2 = urem i64 %x, 5, !dbg !10 switch i64 %_2, label %bb12 [ i64 0, label %bb2 i64 1, label %bb4 i64 2, label %bb6 i64 3, label %bb8 i64 4, label %bb10 ], !dbg !11 ; ... bb12: br label %bb12, !dbg !32Even though the default branch is unreachable (x % 5 can't be greater than 4) it is still generated.
But, when un-commenting 5 | 6 | 7 => loop{} line (still unreachable branch, that does the same as default) the default branch becomes unreachable:
start: %_2 = urem i64 %x, 5, !dbg !10 switch i64 %_2, label %bb14 [ i64 0, label %bb2 i64 1, label %bb4 i64 2, label %bb6 i64 3, label %bb8 i64 4, label %bb10 ], !dbg !11 ; ... bb14: unreachableSo, adding an unreachable branch that does the same as the default helps optimize the code.
Some notes:
- These differences then propagate to assembler, i.e. it also has an unreachable branch when
5 | 6 | 7is commented-out - godbolt link
- Range patterns (like
5..=7) do not help - Making the default branch
5 | 6 | 7 | _doesn't help either - Similar things happen for any
Ninx % Nthat is not a power of two -- adding unreachable branches until the last branch has a power-of-two-minus-one-value helps optimizing the code loop{}s can be replaced with anything else, for exampleunreachable!()orpanic!(), effect is still the same- in case of panic-like things not removing the panicking branch seems like a big deal
- I've used
loop{}s to make diffs less noisy
if-else-ifchains are identical tomatchhere- clang seems to handle this situation just fine in any case: godbolt link
- it also makes the last reachable branch default in
llvm-ir, instead of generating anunreachableone
- it also makes the last reachable branch default in
Pzixel, mexus, oli-obk, novedevo, Urgau and 13 morecompiler-errors and maximeborges
Metadata
Metadata
Assignees
Labels
A-LLVMArea: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.A-codegenArea: Code generationArea: Code generationC-enhancementCategory: An issue proposing an enhancement or a PR with one.Category: An issue proposing an enhancement or a PR with one.C-optimizationCategory: An issue highlighting optimization opportunities or PRs implementing suchCategory: An issue highlighting optimization opportunities or PRs implementing suchE-needs-testCall for participation: An issue has been fixed and does not reproduce, but no test has been added.Call for participation: An issue has been fixed and does not reproduce, but no test has been added.I-slowIssue: Problems and improvements with respect to performance of generated code.Issue: Problems and improvements with respect to performance of generated code.T-compilerRelevant to the compiler team, which will review and decide on the PR/issue.Relevant to the compiler team, which will review and decide on the PR/issue.