- Notifications
You must be signed in to change notification settings - Fork 15.3k
[AArch64] Separate PNR into its own Register Class #65306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this also add an implicitDef of the corresponding predicate-as-counter register?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it should, I've added it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is for line 921: def PPR_p8to15 : PPRClass<8, 15>;
Is this register class still used somewhere? I thought we added this for instructions that take predicate-as-counter registers (which are passed in pn8-pn15).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of, It was used in c52d950 because there was no alternative PNR class to use at the time. It's probably not very useful now, unless you still want to be able to constrain the upper end of normal predicate when writing asm? Maybe its useful with some of the SVE2p1/SME2 strided/contiguous load instructions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if it's not used at the moment, please remove it from this patch. If it's needed in the future we can always add it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, just ignore my comment above. The code is already in and used in getPredicateRegisterClass for inline-asm constraints. I'm not sure if anyone will ever use this for mask-predicates, but I guess there's no need to remove that functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change needed? I would expect a predicate-as-counter to be printed as a predicate-as-mask, but not the other way around.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it's not needed, I've removed the normal predicate logic and tidied this if block up a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look entirely right because STR_PXI can't store $pn0, but it can store $p0.
I'm not sure if this is a problem though, because the encoding of $pn0 and $p0 is the same. Could you add the verifier to the RUN line, so that we know whether or not the MIR verifier accepts it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some changes to the spill/fill logic and this test is now outputting and asserting STR_PXI $p0, $sp, 7,implicit-def $pn0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some tests for these COPYs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added llvm/test/CodeGen/AArch64/PPRtoPNRCopy.mir and llvm/test/CodeGen/AArch64/PNRtoPPRCopy.mir
david-arm left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not done a full review, but I've left the comments I have so far!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add brackets around the inverted case above too, i.e. PPR->PNR? I think it's more obvious with the brackets
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @sdesmalen-arm's comment about adding implicit def applies here too, i.e. an implicit def of the same PNR register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed lower down you assert with
assert((Subtarget.hasSVE2p1() || Subtarget.hasSME2()) &&
"Unexpected register load without SVE2p1 or SME2");
and I think it makes sense to assert that here for the COPY instructions too, since predicate-as-counter registers were introduced in SVE2.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| @llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-aarch64 ChangesThis patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand. --Patch is 78.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/65306.diff 20 Files Affected:
<pre>
let Namespace = "AArch64" in { //===----- END: v8.7a accelerator extension register operands -------------===// +// SVE predicate-as-counter registers
// SVE predicate registers
// The part of SVE registers that don't overlap Neon registers. -class PPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size,
class ZPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size, @@ -891,7 +911,7 @@ class ZPRRegOp <string Suffix, AsmOperandClass C, ElementSizeEnum Size,
-def PPRAsmOpAny : PPRAsmOperand<"PredicateAny", "PPR", 0>; |
ef2c776 to 0126977 Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, implicit-def $pn8 isn't right, because it's not defined. $p0 is defined and you could add $pn0 as an implicit-def, but not $pn8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the implicit def to be the destination register instead of the source now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected PPRtoPNRCopy.mir to be almost the same as PNRtoPPRCopy.mir, with the PN and P swapped. Was there a reason that you couldn't construct such a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic was they're both from organic examples. For example in sve2p1-intrinsics-predicate-as-counter.ll the pext tests will generate a PPR->PNR copy while the ptrue tests in that file will generate a PNR->PPR copy. If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you'd rather I combine them into a single file and make them near carbon copies of each other that's fine by me
That would have my preference, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've combined and simplified them into PNRtoPPRCopy.mir now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also not implicitly define $pn0, because no registers are defined by a store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the implicit def from store spills.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(same as previous comment: Why are you testing for AArch64::SP here?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you testing for AArch64::SP here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw it done elsewhere in this function and wanted to be conservative, but yes, it is pointless. Removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it do an implicit-def of $pn0 here? It's copying $pn8 -> $p0.
I would expect there only to be an implicit-def for a $pn* register if the destination of the COPY is a predicate-as-counter register.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@david-arm Would you mind chipping in here since you suggested this change here? #65306 (comment) I'm not well versed with implict defs in MIR so I'm not sure if this is necessary or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a chat with @david-arm about this, who was concerned that the register allocator might not 'see' that $pn0 gets clobbered when $p0 gets written. However, with $pn0 being a subregister from $p0 that's not likely to be an issue, as the register allocator will know that writing $p0 also writes $pn0.
However, we concluded there probably is no harm in keeping the implicit-def either, so I'm happy to stick with what you've got now.
sdesmalen-arm left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.
7197fe0 to 56d2802 Compare
This patch separates PNR registers into their own register class instead of sharing a register class with PPR registers. This primarily allows us to return more accurate register classes when applying assembly constraints, but also more protection from supplying an incorrect predicate type to an invalid register operand.