- Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang] Support header shadowing diagnostics in Clang header search #162491
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[clang] Support header shadowing diagnostics in Clang header search #162491
Conversation
| @llvm/pr-subscribers-clang Author: Jinjie Huang (Jinjie-Huang) ChangesWhen including a header file, multiple files with the same name may exist across different search paths. Therefore, this patch tries to provide a diagnostic message without changing the outcome of the header selection. It does this by performing an additional search for duplicate filenames across all search paths (both MSVC rules and standard paths). This informs the user about a potential "header shadowing" issue and clarifies which header path was actually used. Full diff: https://github.com/llvm/llvm-project/pull/162491.diff 6 Files Affected:
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td index 0c994e0b5ca4d..b08adc59c9a8a 100644 --- a/clang/include/clang/Basic/DiagnosticGroups.td +++ b/clang/include/clang/Basic/DiagnosticGroups.td @@ -789,6 +789,8 @@ def ShadowFieldInConstructor : DiagGroup<"shadow-field-in-constructor", def ShadowIvar : DiagGroup<"shadow-ivar">; def ShadowUncapturedLocal : DiagGroup<"shadow-uncaptured-local">; +def HeaderShadowing : DiagGroup<"header-shadowing">; + // -Wshadow-all is a catch-all for all shadowing. -Wshadow is just the // shadowing that we think is unsafe. def Shadow : DiagGroup<"shadow", [ShadowFieldInConstructorModified, diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td index c7fe6e1db6d1f..8057e0e95e187 100644 --- a/clang/include/clang/Basic/DiagnosticLexKinds.td +++ b/clang/include/clang/Basic/DiagnosticLexKinds.td @@ -951,6 +951,10 @@ def warn_quoted_include_in_framework_header : Warning< def warn_framework_include_private_from_public : Warning< "public framework header includes private framework header '%0'" >, InGroup<FrameworkIncludePrivateFromPublic>; +def warn_header_shadowed + : Warning<"multiple candidates for header '%0' found; " + "using the one from '%1', shadowed by '%2'">, + InGroup<HeaderShadowing>, DefaultIgnore; def warn_deprecated_module_dot_map : Warning< "'%0' as a module map name is deprecated, rename it to %select{module.modulemap|module.private.modulemap}1%select{| in the 'Modules' directory of the framework}2">, InGroup<DeprecatedModuleDotMap>; diff --git a/clang/include/clang/Lex/DirectoryLookup.h b/clang/include/clang/Lex/DirectoryLookup.h index bb703dfad2b28..45c7360385a73 100644 --- a/clang/include/clang/Lex/DirectoryLookup.h +++ b/clang/include/clang/Lex/DirectoryLookup.h @@ -181,7 +181,7 @@ class DirectoryLookup { ModuleMap::KnownHeader *SuggestedModule, bool &InUserSpecifiedSystemFramework, bool &IsFrameworkFound, bool &IsInHeaderMap, SmallVectorImpl<char> &MappedName, - bool OpenFile = true) const; + bool NeedSuggest, bool OpenFile = true) const; private: OptionalFileEntryRef DoFrameworkLookup( diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 850aea41c4c3b..83f6e11cd6ddc 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -809,12 +809,11 @@ class HeaderSearch { /// Look up the file with the specified name and determine its owning /// module. - OptionalFileEntryRef - getFileAndSuggestModule(StringRef FileName, SourceLocation IncludeLoc, - const DirectoryEntry *Dir, bool IsSystemHeaderDir, - Module *RequestingModule, - ModuleMap::KnownHeader *SuggestedModule, - bool OpenFile = true, bool CacheFailures = true); + OptionalFileEntryRef getFileAndSuggestModule( + StringRef FileName, SourceLocation IncludeLoc, const DirectoryEntry *Dir, + bool IsSystemHeaderDir, Module *RequestingModule, + ModuleMap::KnownHeader *SuggestedModule, bool NeedSuggest, + bool OpenFile = true, bool CacheFailures = true); /// Cache the result of a successful lookup at the given include location /// using the search path at \c HitIt. diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp index 238c5e2f2d9a5..fe11be5555977 100644 --- a/clang/lib/Lex/HeaderSearch.cpp +++ b/clang/lib/Lex/HeaderSearch.cpp @@ -444,8 +444,8 @@ StringRef DirectoryLookup::getName() const { OptionalFileEntryRef HeaderSearch::getFileAndSuggestModule( StringRef FileName, SourceLocation IncludeLoc, const DirectoryEntry *Dir, bool IsSystemHeaderDir, Module *RequestingModule, - ModuleMap::KnownHeader *SuggestedModule, bool OpenFile /*=true*/, - bool CacheFailures /*=true*/) { + ModuleMap::KnownHeader *SuggestedModule, bool NeedSuggest, + bool OpenFile /*=true*/, bool CacheFailures /*=true*/) { // If we have a module map that might map this header, load it and // check whether we'll have a suggestion for a module. auto File = getFileMgr().getFileRef(FileName, OpenFile, CacheFailures); @@ -462,6 +462,9 @@ OptionalFileEntryRef HeaderSearch::getFileAndSuggestModule( return std::nullopt; } + if (!NeedSuggest) + return *File; + // If there is a module that corresponds to this header, suggest it. if (!findUsableModuleForHeader( *File, Dir ? Dir : File->getFileEntry().getDir(), RequestingModule, @@ -479,7 +482,7 @@ OptionalFileEntryRef DirectoryLookup::LookupFile( Module *RequestingModule, ModuleMap::KnownHeader *SuggestedModule, bool &InUserSpecifiedSystemFramework, bool &IsFrameworkFound, bool &IsInHeaderMap, SmallVectorImpl<char> &MappedName, - bool OpenFile) const { + bool NeedSuggest, bool OpenFile) const { InUserSpecifiedSystemFramework = false; IsInHeaderMap = false; MappedName.clear(); @@ -501,7 +504,7 @@ OptionalFileEntryRef DirectoryLookup::LookupFile( return HS.getFileAndSuggestModule( TmpDir, IncludeLoc, getDir(), isSystemHeaderDirectory(), - RequestingModule, SuggestedModule, OpenFile); + RequestingModule, SuggestedModule, NeedSuggest, OpenFile); } if (isFramework()) @@ -881,6 +884,37 @@ diagnoseFrameworkInclude(DiagnosticsEngine &Diags, SourceLocation IncludeLoc, << IncludeFilename; } + +/// Return true if a shadow has been detected and the caller should +/// stop and return the first-found file and module, false otherwise. +static bool checkAndStoreCandidate( + ModuleMap::KnownHeader *SuggestedModule, + OptionalFileEntryRef CandidateFile, StringRef CandidateDir, + DiagnosticsEngine &Diags, StringRef Filename, SourceLocation IncludeLoc, + ModuleMap::KnownHeader &FirstModule, OptionalFileEntryRef &FirstHeader, + SmallString<1024> &FirstDir) { + if (!FirstHeader) { + // Found the first candidate + FirstHeader = CandidateFile; + FirstDir = CandidateDir; + if (SuggestedModule) + FirstModule = *SuggestedModule; + return false; + } + + if (FirstDir != CandidateDir) { + // Found a second candidate from a different directory + Diags.Report(IncludeLoc, diag::warn_header_shadowed) + << Filename << FirstDir << CandidateDir; + if (SuggestedModule) + *SuggestedModule = FirstModule; + return true; + } + + // Found a candidate from the same directory as the first one + return false; +} + /// LookupFile - Given a "foo" or \<foo> reference, look up the indicated file, /// return null on failure. isAngled indicates whether the file reference is /// for system \#include's or not (i.e. using <> instead of ""). Includers, if @@ -923,13 +957,18 @@ OptionalFileEntryRef HeaderSearch::LookupFile( // Otherwise, just return the file. return getFileAndSuggestModule(Filename, IncludeLoc, nullptr, /*IsSystemHeaderDir*/ false, - RequestingModule, SuggestedModule, OpenFile, - CacheFailures); + RequestingModule, SuggestedModule, true, + OpenFile, CacheFailures); } // This is the header that MSVC's header search would have found. + bool First = true; + bool NeedSuggest = true; ModuleMap::KnownHeader MSSuggestedModule; OptionalFileEntryRef MSFE; + ModuleMap::KnownHeader FirstModule; + OptionalFileEntryRef FirstHeader; + SmallString<1024> FirstDir; // Check to see if the file is in the #includer's directory. This cannot be // based on CurDir, because each includer could be a #include of a @@ -938,7 +977,6 @@ OptionalFileEntryRef HeaderSearch::LookupFile( // headers. if (!Includers.empty() && !isAngled) { SmallString<1024> TmpDir; - bool First = true; for (const auto &IncluderAndDir : Includers) { OptionalFileEntryRef Includer = IncluderAndDir.first; @@ -962,10 +1000,15 @@ OptionalFileEntryRef HeaderSearch::LookupFile( }(); if (OptionalFileEntryRef FE = getFileAndSuggestModule( TmpDir, IncludeLoc, IncluderAndDir.second, IncluderIsSystemHeader, - RequestingModule, SuggestedModule)) { + RequestingModule, SuggestedModule, NeedSuggest)) { if (!Includer) { assert(First && "only first includer can have no file"); - return FE; + checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, + Filename, IncludeLoc, FirstModule, + FirstHeader, FirstDir); + NeedSuggest = false; + break; } // Leave CurDir unset. @@ -994,21 +1037,30 @@ OptionalFileEntryRef HeaderSearch::LookupFile( diagnoseFrameworkInclude(Diags, IncludeLoc, IncluderAndDir.second.getName(), Filename, *FE); - return FE; + checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, Filename, IncludeLoc, + FirstModule, FirstHeader, FirstDir); + NeedSuggest = false; + break; } // Otherwise, we found the path via MSVC header search rules. If // -Wmsvc-include is enabled, we have to keep searching to see if we // would've found this header in -I or -isystem directories. - if (Diags.isIgnored(diag::ext_pp_include_search_ms, IncludeLoc)) { - return FE; - } else { - MSFE = FE; - if (SuggestedModule) { - MSSuggestedModule = *SuggestedModule; - *SuggestedModule = ModuleMap::KnownHeader(); - } - break; + if (checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, Filename, IncludeLoc, + FirstModule, FirstHeader, FirstDir)) { + // Found mutiple candidates via MSVC rules + if (Diags.isIgnored(diag::ext_pp_include_search_ms, IncludeLoc)) + return FirstHeader; + else + NeedSuggest = false; + break; + } + MSFE = FE; + if (SuggestedModule) { + MSSuggestedModule = *SuggestedModule; + *SuggestedModule = ModuleMap::KnownHeader(); } } First = false; @@ -1069,6 +1121,7 @@ OptionalFileEntryRef HeaderSearch::LookupFile( } SmallString<64> MappedName; + First = true; // Check each directory in sequence to see if it contains this file. for (; It != search_dir_end(); ++It) { @@ -1078,7 +1131,7 @@ OptionalFileEntryRef HeaderSearch::LookupFile( OptionalFileEntryRef File = It->LookupFile( Filename, *this, IncludeLoc, SearchPath, RelativePath, RequestingModule, SuggestedModule, InUserSpecifiedSystemFramework, IsFrameworkFoundInDir, - IsInHeaderMap, MappedName, OpenFile); + IsInHeaderMap, MappedName, NeedSuggest, OpenFile); if (!MappedName.empty()) { assert(IsInHeaderMap && "MappedName should come from a header map"); CacheLookup.MappedName = @@ -1097,53 +1150,60 @@ OptionalFileEntryRef HeaderSearch::LookupFile( if (!File) continue; - CurDir = It; - - IncludeNames[*File] = Filename; - - // This file is a system header or C++ unfriendly if the dir is. - HeaderFileInfo &HFI = getFileInfo(*File); - HFI.DirInfo = CurDir->getDirCharacteristic(); - - // If the directory characteristic is User but this framework was - // user-specified to be treated as a system framework, promote the - // characteristic. - if (HFI.DirInfo == SrcMgr::C_User && InUserSpecifiedSystemFramework) - HFI.DirInfo = SrcMgr::C_System; - - // If the filename matches a known system header prefix, override - // whether the file is a system header. - for (unsigned j = SystemHeaderPrefixes.size(); j; --j) { - if (Filename.starts_with(SystemHeaderPrefixes[j - 1].first)) { - HFI.DirInfo = SystemHeaderPrefixes[j-1].second ? SrcMgr::C_System - : SrcMgr::C_User; - break; + if (First) { + First = false; + NeedSuggest = false; + CurDir = It; + IncludeNames[*File] = Filename; + + // This file is a system header or C++ unfriendly if the dir is. + HeaderFileInfo &HFI = getFileInfo(*File); + HFI.DirInfo = CurDir->getDirCharacteristic(); + + // If the directory characteristic is User but this framework was + // user-specified to be treated as a system framework, promote the + // characteristic. + if (HFI.DirInfo == SrcMgr::C_User && InUserSpecifiedSystemFramework) + HFI.DirInfo = SrcMgr::C_System; + + // If the filename matches a known system header prefix, override + // whether the file is a system header. + for (unsigned j = SystemHeaderPrefixes.size(); j; --j) { + if (Filename.starts_with(SystemHeaderPrefixes[j - 1].first)) { + HFI.DirInfo = SystemHeaderPrefixes[j - 1].second ? SrcMgr::C_System + : SrcMgr::C_User; + break; + } } - } - if (checkMSVCHeaderSearch(Diags, MSFE, &File->getFileEntry(), IncludeLoc)) { - if (SuggestedModule) + if (checkMSVCHeaderSearch(Diags, MSFE, &File->getFileEntry(), IncludeLoc) && + SuggestedModule) *SuggestedModule = MSSuggestedModule; - return MSFE; - } - bool FoundByHeaderMap = !IsMapped ? false : *IsMapped; - if (!Includers.empty()) - diagnoseFrameworkInclude(Diags, IncludeLoc, - Includers.front().second.getName(), Filename, - *File, isAngled, FoundByHeaderMap); + bool FoundByHeaderMap = !IsMapped ? false : *IsMapped; + if (!Includers.empty()) + diagnoseFrameworkInclude(Diags, IncludeLoc, + Includers.front().second.getName(), Filename, + *File, isAngled, FoundByHeaderMap); - // Remember this location for the next lookup we do. - cacheLookupSuccess(CacheLookup, It, IncludeLoc); - return File; + // Remember this location for the next lookup we do. + cacheLookupSuccess(CacheLookup, It, IncludeLoc); + } + + if (checkAndStoreCandidate(SuggestedModule, File, It->getName(), Diags, + Filename, IncludeLoc, FirstModule, FirstHeader, FirstDir)) + return FirstHeader; } - if (checkMSVCHeaderSearch(Diags, MSFE, nullptr, IncludeLoc)) { + if (First && checkMSVCHeaderSearch(Diags, MSFE, nullptr, IncludeLoc)) { if (SuggestedModule) *SuggestedModule = MSSuggestedModule; return MSFE; } + if (FirstHeader) + return FirstHeader; + // Otherwise, didn't find it. Remember we didn't find this. CacheLookup.HitIt = search_dir_end(); return std::nullopt; diff --git a/clang/test/Preprocessor/header-shadowing.c b/clang/test/Preprocessor/header-shadowing.c new file mode 100644 index 0000000000000..08b3c3e3f8db7 --- /dev/null +++ b/clang/test/Preprocessor/header-shadowing.c @@ -0,0 +1,14 @@ +// RUN: rm -rf %t +// RUN: split-file %s %t + +// RUN: %clang_cc1 -Wheader-shadowing -Eonly %t/main.c -I %t/include 2>&1 | FileCheck %s --check-prefix=SHADOWING +// SHADOWING: {{.*}} warning: multiple candidates for header 'header.h' found; + +//--- main.c +#include "header.h" + +//--- include/header.h +#pragma once + +//--- header.h +#pragma once |
| ✅ With the latest revision this PR passed the C/C++ code formatter. |
92f3422 to 2663a53 Compare 2663a53 to 5e51efd Compare | The current implementation directly reuses the search logic from HeaderSearch::LookupFile() and avoids side effects caused by performing additional searches. Alternatively, we could recalculate the default search paths after locating the corresponding header file and then perform the extra search. This could make the implementation more consistent and better isolate functionalities. Please let me know if the alternative seems preferable, I’m happy to adjust the implementation. |
| @erichkeane @cor3ntin @AaronBallman @shafik Can you help to review this at your convenience? Not sure if this is a good idea. |
| @tahonermann might have opinions |
tahonermann left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang currently fails to resolve header files the same way that MSVC does when operating in its cl-compatible driver mode. I have a PR that corrects this (#105738), but I still need to add some docs before taking that PR out of draft status. The code in that PR will ship with the next Intel oneAPI compiler release; I hope to land it soon-ish.
I am worried that this diagnostic will be quite noisy for some projects. I think a more useful improvement would be more diagnostics for cases where clang and clang-cl would resolve a header file differently. I see you already found the one existing case of such a diagnostic here.
Is there special handling for #include_next? If not, wouldn't effectively every use of #include_next be accompanied by this new warning? Or am I misunderstanding something?
| @tahonermann Thank you for review and sharing the helpful context.
Yes, I agree that it could potentially generate noise, especially if a project intentionally uses header files with the same name and carefully designs the '-I' options. The consideration here is: currently, this new diagnostic is disabled by default. Perhaps we could enable it in debugging scenarios, such as when sanitizers are enabled, to help users discover subtle API mismatch problems. Alternatively, it could be grouped and enabled alongside something like the "ShadowAll DiagGroup", or we could consider enabling it by default at a more appropriate time in the future? After all, having numerous header files with the same name in a project does not seem to be a recommended practice, it could indeed lead to some difficult-to-trace problems.
According to my observations, the first-level #include header.h indeed will triger this warning, and the second-level #include_next header.h does not(naturally compatible). It seems there isn’t a particularly elegant way to filter out the combined #include + #include_next scenario for now? |
| I haven't taken the time to fully comprehend the source code changes and I'm not fully clear on the intent for this diagnostic. Additional test cases might help to make it clear. For example, is the intent that a diagnostic would be issued for a case like this? I would not be in favor of that example being diagnosed because I think it would interfere with too much intentional use of If the intent is to only issue a diagnostic if a header file is included via double quotes and is found relative to the including file and is also found in a different location via the header search path, I think that might make sense (this is what your test case exercises). Use of This change should include a release note in I wonder if the existing |
| I also haven't taken the time to fully comprehend the source code changes. Both libcxx and glibc ship |
| Thanks for replies!
Yes, the initial goal was quite simple: when user #include "header.h" or <header.h>, if multiple copies of header.h exist in the search paths (e.g., include1 and include2), we want to notify the user that the compiler selected include1/header.h, and the first-found one may have shadowed the second one from include2. The motivation is that we encountered this situation in a large internal project: when the main project implicitly introduces multiple versions of a third-party library, it can lead to dependencies on identical header files in different paths, which may cause ODR issues and a bit hard to debug. So we also want to see what the community's opinion about this diagnostics. When using "#include header.h" followed by "#include_next header.h", the header duplication is managed and expected, but currently this new warning is being issued at the #include header.h location. So for users of include_next, this warning would indeed be noise. Therefore, the current idea is to disable this warning by default, requiring the user to manually enable it, or to enable it automatically when a sanitizer is turned on, in order to avoid daily disruption. And based on observations, our internal uses of #include_next are quite rare, so the original idea was mainly to help identify issues in the standard #include scenarios. Does this sound reasonable?
Thanks for suggestion, it might be possible to incorporate the logic of the new warning into 'ext_pp_include_search_ms', I’ll give it a try. |
For current approach, this diagnostics is also performed for system headers like <stdio.h>. Because the Clang driver is theoretically supposed to automatically specify the system header search paths, so there seems not supposed to have multiple duplications(and this search priority is lower than those manually specified by the user)? If the user specifies additional search paths, and a stdio.h file happens to exist under the paths, this will cause the compiler to select the stdio.h from the user-specified path. So I think this case should be diagnosed and produce a warning as current behavior.
I observed that the currently use: "getFileMgr().getFileRef(FileName)" have an internal caching mechanism, which might be helpful here. My initial thought was that the cost of file searching is much lower than file loading, so if this diagnostics is valuable, the cost might be acceptable. Perhaps I can consider introducing this search cost only when the user explicitly enables it with -Wheader-shadowing. |
ecnelises left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change works directly in the header search facilities under PP level, so I think it can handle various cases properly. It would be nice to add more test cases including:
#include <...>#include_next#import-isystem-cxx-isystem-iframework-include- system headers (say C++ stdlib has a private some.h, and we have custom some.h)
to ensure the behavior (unluckily path related options are really complex 🙃).
I also worry if warnings explode when the CU includes a number of files:
// a.cpp #include "a.h" #include "b.h" #include "c.h" // a.h #pragma once #include "b.h" // b.h #pragma once #include "c.h" // c.h #pragma once The warning repeats three times if c.h has something shadowed.
| I'm also not sure if this handles system headers specially. As typical cases, |
c506ef1 to 4199d38 Compare | Sorry for the delay in this update. I have refactored the implementation to perform the diagnostic logic within a separate function that replicates the header search paths. The should simplify the implementation and make the source code changes easier to comprehend. I also observed that many system headers use #include_next and have multiple copies, which could lead to some helpless warnings. So this diagnostic is now performed only for quoted searches to avoid the noise. As for the existing |
375efdf to 7bd0535 Compare 7bd0535 to 0312677 Compare | A few additional comments/questions:
BTW, In compilers, warnings usually indicate an unexpected or potentially dangerous behavior. Here, header shadowing is the expected behavior. But since header searching is worded as 'implementation-defined' in standard (cpp.include), it's also fine as an extension to me. Basically, this change looks helpful for large codebases. I think this can proceed after addressing the nits in tests and release notes, unless there are further concerns about the design. |
| Thanks for the review.
Regarding the execution time, I tested with some internal files (ranging from average to small inputs) and didn't see any significant difference: |
🐧 Linux x64 Test Results
|
ecnelises left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this patch with LLVM trunk bootstrap on Debian 12 and macOS 26, no false positive found.
This looks good to me now, thanks. Please hold on a few days in case other reviewers have more comments.
| Thanks, everyone. If there are no further comments, I plan to merge this patch sometime next week. |
When including a header file, multiple files with the same name may exist across different search paths, like:
|-- main.cpp
|-- header.h
|-- include
| └── header.h
The compiler usually picks the first match it finds (typically following MSVC rules for current/include-chain paths first, then regular -I paths), which may not be the user’s intended header.
This silent behavior can lead to subtle runtime API mismatches or increase the cost of resolving errors such as “error: use of undeclared identifier”, especially in large projects.
Therefore, this patch tries to provide a diagnostic message without changing the current header selection. It does this by performing an additional search for duplicate filenames across all search paths (both MSVC rules and standard paths). This informs the user about a potential "header shadowing" issue and clarifies which header path was actually used.
Since header searching is much cheaper than file loading, the added overhead should be within an acceptable range -- assuming the diagnostic message is valuable.