Skip to content

Conversation

@carlos4242
Copy link
Contributor

This patch is needed to add support to clang's AVR ABI for the Swift language. It is a pre-requisite for adding AVR support to the public Swift compiler itself.

I'm open to any suggestions how I might create suitable unit tests for this?

@benshi001 @rjmccall

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Nov 14, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Carl Peto (carlos4242)

Changes

This patch is needed to add support to clang's AVR ABI for the Swift language. It is a pre-requisite for adding AVR support to the public Swift compiler itself.

I'm open to any suggestions how I might create suitable unit tests for this?

@benshi001 @rjmccall


Full diff: https://github.com/llvm/llvm-project/pull/72298.diff

1 Files Affected:

  • (modified) clang/lib/CodeGen/Targets/AVR.cpp (+4-1)
diff --git a/clang/lib/CodeGen/Targets/AVR.cpp b/clang/lib/CodeGen/Targets/AVR.cpp index 50547dd6dec5e7d..bc6418c1e224eb4 100644 --- a/clang/lib/CodeGen/Targets/AVR.cpp +++ b/clang/lib/CodeGen/Targets/AVR.cpp @@ -112,7 +112,10 @@ class AVRABIInfo : public DefaultABIInfo { class AVRTargetCodeGenInfo : public TargetCodeGenInfo { public: AVRTargetCodeGenInfo(CodeGenTypes &CGT, unsigned NPR, unsigned NRR) - : TargetCodeGenInfo(std::make_unique<AVRABIInfo>(CGT, NPR, NRR)) {} + : TargetCodeGenInfo(std::make_unique<AVRABIInfo>(CGT, NPR, NRR)) { + SwiftInfo = + std::make_unique<SwiftABIInfo>(CGT, /*SwiftErrorInRegister=*/true); + } LangAS getGlobalVarAddressSpace(CodeGenModule &CGM, const VarDecl *D) const override { 
@github-actions
Copy link

github-actions bot commented Nov 14, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@carlos4242 carlos4242 force-pushed the avr-swift-abi-support branch from ed8b63c to fe21485 Compare November 14, 2023 23:58
This patch is needed to add support to clang's AVR ABI for the Swift language. It is a pre-requisite for upstreaming AVR patches to the Swift compiler itself. @benshi001 @rjmccall
@carlos4242 carlos4242 force-pushed the avr-swift-abi-support branch from fe21485 to 333916a Compare November 15, 2023 00:26
@benshi001 benshi001 self-requested a review November 15, 2023 05:52
@efriedma-quic
Copy link
Collaborator

Can you not test this by checking an __attribute__((swiftcall)) function works in C?

@efriedma-quic
Copy link
Collaborator

See also #71986

@rjmccall
Copy link
Contributor

Right, that would be the way to test it.

I don't know much about AVR, but you should also look at some of the parameters to the lowering (e.g. how many max values it's okay to break an aggregate into) and make sure you're happy with them.

@carlos4242
Copy link
Contributor Author

@efriedma-quic Cool. So it sounds like it's worth parking this for now, until Kuba's work #71986 is merged?

@rjmccall I'm not 100% sure I understand? The existing code in AVR.cpp handles the standard AVR ABI, which has a few simple rules based on GCC behaviour. Here we are effectively just passing on the handling for that to AVRABIInfo. The error in register thing, I just copied what ARM is doing. I figure that's OK for now. If we need to do some bugfixes later to get AVR working with errors then we'll take a look then probably. Is that reasonable?

@rjmccall
Copy link
Contributor

rjmccall commented Nov 17, 2023

@efriedma-quic Cool. So it sounds like it's worth parking this for now, until Kuba's work #71986 is merged?

@rjmccall I'm not 100% sure I understand? The existing code in AVR.cpp handles the standard AVR ABI, which has a few simple rules based on GCC behaviour. Here we are effectively just passing on the handling for that to AVRABIInfo. The error in register thing, I just copied what ARM is doing. I figure that's OK for now. If we need to do some bugfixes later to get AVR working with errors then we'll take a look then probably. Is that reasonable?

SwiftABIInfo exists so that targets can tweak more details than just whether to use swifterror. For example, you can change the conditions under which we'll try to return things in registers, and if you don't do that, IIRC Swift will try to return quite a few things in registers by default. I'm just saying that you should make sure that that matches with your backend — either your backend needs to make sure that it returns a type like { ptr, ptr, float, float } in registers for swiftcc or you should tweak the frontend so that it knows to return it indirectly. (I would recommend the former — you can make this specific to swiftcc so that you can continue to match the C conventions by default.)

If you don't do that, we can end up in the worst of both worlds. The frontend decides that a struct should be broken down into scalars, so it constructs a return type like the above in IR. Note that this doesn't necessarily match the in-memory layout of the original struct, because if it's just representing a series of scalar values that'll get assigned to registers, it doesn't need to. If the backend thinks that's too big to return directly, it'll force it to be returned indirectly, which means we'll be paying all the code-size costs of breaking down and reassembling the value twice on each side of the return.

@carlos4242
Copy link
Contributor Author

Thanks, that sounds like it's worth looking into and might avoid issues with AVR.

I'm still nowhere near enough of an LLVM expert to follow all the aspects of the discussion. Although from our perspective, I've never seen an issue that I know, using the above patch for the last few years in our LLVM back end for the swift compile (not in our build of clang). The code produced by the back end has always followed the AVR GCC ABI calling convention (https://gcc.gnu.org/wiki/avr-gcc ...roughly... registers assigned to parameters starting at r24 and working backward r22, r20, etc. to r8 then the rest on stack) when I have examined the generated assembly. But maybe that is because, until now, only our swift code has been emitted with the swift calling convention, until #71986 is merged, clang can't emit code with that calling convention and that's the issue?

Stupid question... Is it possible that our AVR LLVM back end is basically ignoring the swift calling convention when it actually emits our code? Doesn't usually the front end (clang/swift) just specify parameter types and the back end decides whether to put something in registers or on the stack? Sorry if I'm being dumb!


It might be that I just don't understand enough about LLVM and clang to really understand what all the surrounding code is doing properly, and maybe the right answer is for me to close this PR and wait for someone more experienced to come along when the time is right. We can always maintain this as a local patch in our builds instead, as we have for the past few years? I don't want to lay down issues for anyone using this in the future.

Or as an alternative, we can merge this PR as-is and deal with AVR problems later as an optimisation. I am neutral about what is the best approach and happy to take the advice of all the experts on here. (At the very least, the discussion has been enlightening for me!)

@rjmccall
Copy link
Contributor

This is less about the implementation weeds of LLVM and more about the actual details of your calling convention — the decisions about how arguments and results are passed that are ultimately downstream of the choices made here. Mostly, I'm encouraging you as a platform maintainer who's adding Swift CC support for your platform to take a few moments to make sure you're happy with the CC. It's likely that everything will work regardless — Swift will generate both declarations and calls according to the rules of the CC you specify here, which should be enough to make things functional — but you might find yourself looking at generated code in two years and realize you've been doing something ridiculous for that entire time, and maybe it'll be too awkward to change.

@carlos4242
Copy link
Contributor Author

This is less about the implementation weeds of LLVM and more about the actual details of your calling convention — the decisions about how arguments and results are passed that are ultimately downstream of the choices made here. Mostly, I'm encouraging you as a platform maintainer who's adding Swift CC support for your platform to take a few moments to make sure you're happy with the CC. It's likely that everything will work regardless — Swift will generate both declarations and calls according to the rules of the CC you specify here, which should be enough to make things functional — but you might find yourself looking at generated code in two years and realize you've been doing something ridiculous for that entire time, and maybe it'll be too awkward to change.

That makes a lot of sense. Thank you John. I guess here are my thoughts. As I understand it, the SwiftABIInfo by default does something like "if something can be passed in 4 registers or fewer then pass by register, otherwise pass indirectly"? I think that sweet spot (also sort of reflected in Existential Containers I believe?) makes sense for 32 bit or 64 bit registers (and presumably the sorts of caching you'd expect in modern intel/arm larger machine architectures). We have 8 bit registers, which is quite different...

Thinking aloud...

In our case, one complication is stack manipulation is fairly painful, we may be able to improve the AVR back end a bit but at the moment, just moving the stack pointer down for something like an alloca is 8 instructions (and then again 8 bytes in the function epilog).

But, equally, the default C ABI (avr gcc abi) allows a struct to be split over as many as 18 registers, and we are often producing pretty inefficient code like this when we have large structs, moving registers around a lot either side of a call site. Which is something I really wanted to find a way to solve "one day" with Swift for Arduino/Microswift/Swift for AVR.

Probably it's reasonable to say that ideally, when lowering to AVR assembly from Swift, any struct larger than 8 bytes should be passed on the stack in our case. Do you think we can implement that?

I'm happy to do some AVR back end work to change the ABI when the swift calling convention is in place on a function. (Although it might have to wait until I've had time to get the new Embedded mode working on our Swift compiler first, as that's taking up most of my time for the moment.)

@carlos4242
Copy link
Contributor Author

@benshi001 Have you got any thoughts on this as the AVR maintainer? I've been using various versions of this patch in my own branches for years. Should we merge?

I think ultimately it's your call as you're the AVR backend owner?

@benshi001
Copy link
Member

benshi001 commented Dec 30, 2023

As I have suggested, any functional change needs tests. So I think you need to add some tests to show what your change affects.

@benshi001
Copy link
Member

benshi001 commented Dec 30, 2023

Please refer to my previous commit of AVR ABI, how tests are provided for a functional change.

3fd9a32
51585aa

@rjmccall
Copy link
Contributor

rjmccall commented Jan 1, 2024

That makes a lot of sense. Thank you John. I guess here are my thoughts. As I understand it, the SwiftABIInfo by default does something like "if something can be passed in 4 registers or fewer then pass by register, otherwise pass indirectly"? I think that sweet spot (also sort of reflected in Existential Containers I believe?) makes sense for 32 bit or 64 bit registers (and presumably the sorts of caching you'd expect in modern intel/arm larger machine architectures). We have 8 bit registers, which is quite different...

Yes, I see. Pointers are 16-bit and would be passed in two registers, right? You need to do more accurate counting than the default implementation. The default implementation breaks up "large" integers (also specified by the ABI) when determining the scalar sequence, and then it assumes that all scalars count the same towards the limit. You probably want to keep e.g. i16s and ptrs in the scalar sequence instead of splitting them into i8s, but you want to count them properly as two registers rather than one. And then, yeah, maybe it makes sense to put the cap at something like 6 or even 4 registers.

In our case, one complication is stack manipulation is fairly painful, we may be able to improve the AVR back end a bit but at the moment, just moving the stack pointer down for something like an alloca is 8 instructions (and then again 8 bytes in the function epilog).

On most architectures, functions that include calls will need to perform a stack adjustment on entry anyway — at the very least, for the function's own frame — but you don't need an extra adjustment for individual call sites because you include space for the maximum stack argument space usage of all the calls in the function in that initial adjustment. Is that not how it's done on AVR?

But, equally, the default C ABI (avr gcc abi) allows a struct to be split over as many as 18 registers, and we are often producing pretty inefficient code like this when we have large structs, moving registers around a lot either side of a call site. Which is something I really wanted to find a way to solve "one day" with Swift for Arduino/Microswift/Swift for AVR.

Yeah, 18 registers seems like it'd be way higher than you want. I usually approach this from a code-size optimization perspective. Passing in registers is great in the ideal situation: the caller is able to efficiently produce the argument into whatever argument register is convenient (e.g. loading, or loading an immediate, or taking the address of a local variable), and the callee is going to immediately use the component values of argument in whatever register it came in. When that isn't true, you don't want passing in registers to be a highly punitive mistake. That's why the cap is normally expressed in terms of the number of scalars involved rather than the size of the data: the assumption is that each scalar will require a separate load/store if we're not in the ideal case. And for ISAs like AVR with non-orthogonal register use, I feel like the chances that taking complex data in registers will be useful to the callee are somewhat lower; and if spilling an argument to the stack retroactively is expensive, that also changes the balance. So having a relatively low cap seems like the right thing to do.

Probably it's reasonable to say that ideally, when lowering to AVR assembly from Swift, any struct larger than 8 bytes should be passed on the stack in our case. Do you think we can implement that?

It's important to distinguish the three ways of passing an argument: you can pass it in registers, you can pass it in the stack argument area, and you can pass a pointer to it. I am generally of the opinion that passing arguments in the stack argument area is a waste and should only be used if you don't have a better choice available, e.g. because you have too many arguments and you've run out of registers. This is because you usually can't forward such an argument efficiently: you have to copy it into a new stack argument area for the next call. So the Swift CC usually says that if an argument is too big for the cap, it gets passed by pointer.

So I guess it depends on what you mean by "passed on the stack". I would not recommend passing in the stack argument area. But triggering pass-by-pointer on relatively small structures makes sense to me.

@benshi001 benshi001 requested a review from rjmccall January 2, 2024 01:27
@benshi001
Copy link
Member

I am not familiar with swift. So is there an official document for the SWIFT specific ABI on AVR? I think this is necessary, just like C++: https://gcc.gnu.org/wiki/avr-gcc.

And what's more, tests are required to be committed with functional changes, this is a requirement for all llvm commits with few exception.

@carlos4242
Copy link
Contributor Author

Sure thing @benshi001 ... I'll create tests. The Swift ABI is documented here: https://github.com/apple/swift/blob/main/docs/ABI/CallingConvention.rst

I don't think this change will actually change any ABI. All my code uses the normal avr-gcc ABI you referenced.

I'll work out the details with the Apple engineers on a PR in their LLVM fork. Once we've got it in a state everyone is happy with, I'll bring it back here. That way nothing is blocked and we can take the time we need. Thank you.

@benshi001
Copy link
Member

Sure thing @benshi001 ... I'll create tests. The Swift ABI is documented here: https://github.com/apple/swift/blob/main/docs/ABI/CallingConvention.rst

I don't think this change will actually change any ABI. All my code uses the normal avr-gcc ABI you referenced.

I'll work out the details with the Apple engineers on a PR in their LLVM fork. Once we've got it in a state everyone is happy with, I'll bring it back here. That way nothing is blocked and we can take the time we need. Thank you.

It would be better to mention this ABI document link in the source code.

@carlos4242
Copy link
Contributor Author

That makes sense. I think I’ll close this PR for now and work on the apple fork to get these issues resolved with them. Then I’ll come back once we have squared off the edges. Cheers Ben.

@carlos4242 carlos4242 closed this Jan 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category

5 participants