- Notifications
You must be signed in to change notification settings - Fork 15.3k
[C++20] [Modules] Don't set modules owner ship information for builtin declarations #102115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
declarations Close llvm#101939 As the issue said, the builtin declarations shouldn't be in any modules.
| @llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-modules Author: Chuanqi Xu (ChuanqiXu9) ChangesClose #101939 As the issue said, the builtin declarations shouldn't be in any modules. Full diff: https://github.com/llvm/llvm-project/pull/102115.diff 2 Files Affected:
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp index 1f4bfa247b544..a82c66ced817e 100644 --- a/clang/lib/Sema/SemaDecl.cpp +++ b/clang/lib/Sema/SemaDecl.cpp @@ -2376,6 +2376,12 @@ NamedDecl *Sema::LazilyCreateBuiltin(IdentifierInfo *II, unsigned ID, FunctionDecl *New = CreateBuiltin(II, R, ID, Loc); RegisterLocallyScopedExternCDecl(New, S); + // Builtin functions shouldn't be owned by any module. + if (New->hasOwningModule()) { + New->setLocalOwningModule(nullptr); + New->setModuleOwnershipKind(Decl::ModuleOwnershipKind::Unowned); + } + // TUScope is the translation-unit scope to insert this function into. // FIXME: This is hideous. We need to teach PushOnScopeChains to // relate Scopes to DeclContexts, and probably eliminate CurContext diff --git a/clang/test/Modules/pr101939.cppm b/clang/test/Modules/pr101939.cppm new file mode 100644 index 0000000000000..35c243ed0f1bf --- /dev/null +++ b/clang/test/Modules/pr101939.cppm @@ -0,0 +1,6 @@ +// RUN: %clang_cc1 -std=c++20 %s -ast-dump | FileCheck %s + +export module mod; +export auto a = __builtin_expect(true, true); + +// CHECK-NOT: FunctionDecl{{.*}} in mod {{.*}} __builtin_expect |
| | ||
| // Builtin functions shouldn't be owned by any module. | ||
| if (New->hasOwningModule()) { | ||
| New->setLocalOwningModule(nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is the owning module set?
It would be cleaner to check whether a function is a builtin and never set a module rather than adding it and removing it later, i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is the owning module set?
The owning module is set when we new the decl:
llvm-project/clang/lib/AST/DeclBase.cpp
Lines 106 to 108 in 47bf996
| auto *ParentModule = | |
| Parent ? cast<Decl>(Parent)->getOwningModule() : nullptr; | |
| return new (Buffer) Module*(ParentModule) + 1; |
And if we want to avoid setting the incorrect modules ownership in the first place, we need to change the new function. It smells bad.
I feel the current position is good as long as we can make sure we'll only create builtin functions here and it looks true now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug report is unclear to me.
Are you saying this is a bug, or are you just saying the AST representation is not clean from a user understanding point of view?
The "external C++" linkage affects the VarDecl, but not the builtin FunctionDecl, so again this is unclear.
Also, would this really only apply to builtin functions?
What about builtin TypedefDecls?
It's also strange that they would belong to the TU, and the TU would belong to the module, but the builtin itself would not.
Can this situation currently arise for anything else?
If this is a cleanliness issue, can we instead create a builtin module, add all those builtin declarations to a TU belonging to that module, and then implicitly import it in every module?
Is that idea workable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bug report is unclear to me.
Are you saying this is a bug, or are you just saying the AST representation is not clean from a user understanding point of view?
This is a reduced root cause of another problem I saw. http://eel.is/c++draft/basic.def.odr#15.3 says the declarations attached to named modules are not allowed to be duplicated in different TUs. But currently the builtin declarations violate this. And I believe this is only an example and we may meet other problems.
What about builtin TypedefDecls?
This is somewhat coincidence. We would create the builtin TypedefDecls in Sema::Initialize(). But we haven't assign a module for the TU that time. So it works more or less surprisingly. The difference between the builtin typedef declaration and the current builtin function decl is that, the builtin typedef declaration are created eagerly and the builtin function decls is created lazily.
Can this situation currently arise for anything else?
IIUC, the above builtin TypedefDecl is an example.
It's also strange that they would belong to the TU, and the TU would belong to the module, but the builtin itself would not.
Yes, from the high level, the abstract is more or less not so straight forward. Maybe we want to save some typing works initially so we tight the process of assigning modules with the process of assigning decl contexts. But I admit, the mechanism works pretty well except the few exceptions we raise above.
If this is a cleanliness issue, can we instead create a builtin module, add all those builtin declarations to a TU belonging to that module, and then implicitly import it in every module?
Is that idea workable?
First this may not be a cleanliness issue. And I feel the idea is more or less not good or match the design. From the perspective of the specification, there are only two kinds of module, named module and the unnamed, aka, the global module. And all declarations should live in modules, either named module and the global module. So technically the compiler should assign all the declaration not in the named module into the global module. But due to historical reasons, we didn't implement the model directly but treat the declarations not in a clang::Module as if they are in the global module.
So I feel the idea to introduce implicit module may make the model more complex and not match the design.
BTW, another idea may be, attach the builtin function decl to the global module. But now, the implementation may only create the global module (in implementation, it is actually the global module fragment in the specification) conditionally, in another word, we will only create the global module fragment after we see module;. So the idea may be more complex.
So I feel the current one may be the most simple fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat coincidence. We would create the builtin TypedefDecls in
Sema::Initialize(). But we haven't assign a module for the TU that time. So it works more or less surprisingly. The difference between the builtin typedef declaration and the current builtin function decl is that, the builtin typedef declaration are created eagerly and the builtin function decls is created lazily.
I see. This eagerness of creating these TypedefDecls is something that annoys me a little as well.
I think whatever outcome we end up with, should be the same for both.
BTW, another idea may be, attach the builtin function decl to the global module. But now, the implementation may only create the global module (in implementation, it is actually the global module fragment in the specification) conditionally, in another word, we will only create the global module fragment after we see
module;. So the idea may be more complex.
Yeah, attaching these things to the global module seems like a clean approach as well, I like that :)
What are the difficulties involved in always creating the GMF in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, attaching these things to the global module seems like a clean approach as well, I like that :)
What are the difficulties involved in always creating the GMF in this case?
It is easy for module interfaces, since we know we are compiling a module interface after the driver phase (by -x c++-module flag or .cppm suffix). But the GMF is also possible in module implementation units, where we can only know if the current TU is module unit after we see the module declaration. So the state may be slightly more complex.
And also, even if we did that, if we don't intend to change the implementation of Decl::operator new, where the modules are assigned to decls, we still need the current patch to reset modules ownership after we create the builtin function, where its contents becomes to:
if (New->hasOwningModule()) New->setLocalOwningModule(getGlobalModule()); I think the key of the problem we're discussing here, if we want the best abstraction, is to review the design of relationships between DeclContext and Modules Ownership. But maybe we don't have to do that in this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted #102290. Let's talk about this there. And I think I don't have time to do that in the near future and I feel the status quo is not really bad. So I'd like to propose to land this patch first and remove this after we make the implementation design better someday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since as you pointed out we already have this oddness for eagerly created implicit declarations, it seems like it would be an incremental improvement to have the same non-ideal outcome for both.
But I think the means to achieve it as implemented in the current patch looks fragile.
As Corentin suggested, I'd rather change the context around when we lazily create the builtin, than not having to forget to change it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to handle the eagerly created implicit declarations as we did in the current patch. Since they actually work fine now.
As Corentin suggested, I'd rather change the context around when we lazily create the builtin, than not having to forget to change it later.
But the modules owner ship are attached immediately when we new the builtin declarations. If we want to do that in the first place, we can only clear the modules ownership for the TU before the new operation and reset it for the TU again after the new operation. It looks pretty odd. Or do I misunderstand your point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to handle the eagerly created implicit declarations as we did in the current patch. Since they actually work fine now.
That's what I mean, both the eager and lazy situations are odd, but at least the eager one works now, so if we handle the lazy ones the same way, then we are also being more consistent and less wrong at the same time,
which might justify this as incremental improvement :)
But the modules owner ship are attached immediately when we new the builtin declarations. If we want to do that in the first place, we can only clear the modules ownership for the TU before the new operation and reset it for the TU again after the new operation. It looks pretty odd. Or do I misunderstand your point?
I was hoping there would be a clean way to do this, even if it meant rewriting this bit about how we set module ownership from the new operator.
I'll take a look at the implementation tomorrow and see if I can come up with any concrete suggestions.
Close #101939
As the issue said, the builtin declarations shouldn't be in any modules.