Skip to content

Conversation

@Jinjie-Huang
Copy link
Contributor

@Jinjie-Huang Jinjie-Huang commented Oct 8, 2025

When including a header file, multiple files with the same name may exist across different search paths, like:
  |-- main.cpp
  |-- header.h
  |-- include
  |  └── header.h
The compiler usually picks the first match it finds (typically following MSVC rules for current/include-chain paths first, then regular -I paths), which may not be the user’s intended header.
This silent behavior can lead to subtle runtime API mismatches or increase the cost of resolving errors such as “error: use of undeclared identifier”, especially in large projects.

Therefore, this patch tries to provide a diagnostic message without changing the current header selection. It does this by performing an additional search for duplicate filenames across all search paths (both MSVC rules and standard paths). This informs the user about a potential "header shadowing" issue and clarifies which header path was actually used.

Since header searching is much cheaper than file loading, the added overhead should be within an acceptable range -- assuming the diagnostic message is valuable.

@Jinjie-Huang Jinjie-Huang added the clang Clang issues not falling into any other category label Oct 8, 2025
@llvmbot llvmbot added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Oct 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 8, 2025

@llvm/pr-subscribers-clang

Author: Jinjie Huang (Jinjie-Huang)

Changes

When including a header file, multiple files with the same name may exist across different search paths.
The compiler usually picks the first match it finds (typically following MSVC rules for current/include-chain paths first, then regular -I paths), which may not be the user’s intended header.
This silent behavior can lead to subtle runtime API mismatches or “use of undeclared identifier” errors, especially in large projects.

Therefore, this patch tries to provide a diagnostic message without changing the outcome of the header selection. It does this by performing an additional search for duplicate filenames across all search paths (both MSVC rules and standard paths). This informs the user about a potential "header shadowing" issue and clarifies which header path was actually used.
Since header searching is much cheaper than file loading, the added overhead should be minimal.


Full diff: https://github.com/llvm/llvm-project/pull/162491.diff

6 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticGroups.td (+2)
  • (modified) clang/include/clang/Basic/DiagnosticLexKinds.td (+4)
  • (modified) clang/include/clang/Lex/DirectoryLookup.h (+1-1)
  • (modified) clang/include/clang/Lex/HeaderSearch.h (+5-6)
  • (modified) clang/lib/Lex/HeaderSearch.cpp (+115-55)
  • (added) clang/test/Preprocessor/header-shadowing.c (+14)
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td index 0c994e0b5ca4d..b08adc59c9a8a 100644 --- a/clang/include/clang/Basic/DiagnosticGroups.td +++ b/clang/include/clang/Basic/DiagnosticGroups.td @@ -789,6 +789,8 @@ def ShadowFieldInConstructor : DiagGroup<"shadow-field-in-constructor", def ShadowIvar : DiagGroup<"shadow-ivar">; def ShadowUncapturedLocal : DiagGroup<"shadow-uncaptured-local">; +def HeaderShadowing : DiagGroup<"header-shadowing">; + // -Wshadow-all is a catch-all for all shadowing. -Wshadow is just the // shadowing that we think is unsafe. def Shadow : DiagGroup<"shadow", [ShadowFieldInConstructorModified, diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td index c7fe6e1db6d1f..8057e0e95e187 100644 --- a/clang/include/clang/Basic/DiagnosticLexKinds.td +++ b/clang/include/clang/Basic/DiagnosticLexKinds.td @@ -951,6 +951,10 @@ def warn_quoted_include_in_framework_header : Warning< def warn_framework_include_private_from_public : Warning< "public framework header includes private framework header '%0'" >, InGroup<FrameworkIncludePrivateFromPublic>; +def warn_header_shadowed + : Warning<"multiple candidates for header '%0' found; " + "using the one from '%1', shadowed by '%2'">, + InGroup<HeaderShadowing>, DefaultIgnore; def warn_deprecated_module_dot_map : Warning< "'%0' as a module map name is deprecated, rename it to %select{module.modulemap|module.private.modulemap}1%select{| in the 'Modules' directory of the framework}2">, InGroup<DeprecatedModuleDotMap>; diff --git a/clang/include/clang/Lex/DirectoryLookup.h b/clang/include/clang/Lex/DirectoryLookup.h index bb703dfad2b28..45c7360385a73 100644 --- a/clang/include/clang/Lex/DirectoryLookup.h +++ b/clang/include/clang/Lex/DirectoryLookup.h @@ -181,7 +181,7 @@ class DirectoryLookup { ModuleMap::KnownHeader *SuggestedModule, bool &InUserSpecifiedSystemFramework, bool &IsFrameworkFound, bool &IsInHeaderMap, SmallVectorImpl<char> &MappedName, - bool OpenFile = true) const; + bool NeedSuggest, bool OpenFile = true) const; private: OptionalFileEntryRef DoFrameworkLookup( diff --git a/clang/include/clang/Lex/HeaderSearch.h b/clang/include/clang/Lex/HeaderSearch.h index 850aea41c4c3b..83f6e11cd6ddc 100644 --- a/clang/include/clang/Lex/HeaderSearch.h +++ b/clang/include/clang/Lex/HeaderSearch.h @@ -809,12 +809,11 @@ class HeaderSearch { /// Look up the file with the specified name and determine its owning /// module. - OptionalFileEntryRef - getFileAndSuggestModule(StringRef FileName, SourceLocation IncludeLoc, - const DirectoryEntry *Dir, bool IsSystemHeaderDir, - Module *RequestingModule, - ModuleMap::KnownHeader *SuggestedModule, - bool OpenFile = true, bool CacheFailures = true); + OptionalFileEntryRef getFileAndSuggestModule( + StringRef FileName, SourceLocation IncludeLoc, const DirectoryEntry *Dir, + bool IsSystemHeaderDir, Module *RequestingModule, + ModuleMap::KnownHeader *SuggestedModule, bool NeedSuggest, + bool OpenFile = true, bool CacheFailures = true); /// Cache the result of a successful lookup at the given include location /// using the search path at \c HitIt. diff --git a/clang/lib/Lex/HeaderSearch.cpp b/clang/lib/Lex/HeaderSearch.cpp index 238c5e2f2d9a5..fe11be5555977 100644 --- a/clang/lib/Lex/HeaderSearch.cpp +++ b/clang/lib/Lex/HeaderSearch.cpp @@ -444,8 +444,8 @@ StringRef DirectoryLookup::getName() const { OptionalFileEntryRef HeaderSearch::getFileAndSuggestModule( StringRef FileName, SourceLocation IncludeLoc, const DirectoryEntry *Dir, bool IsSystemHeaderDir, Module *RequestingModule, - ModuleMap::KnownHeader *SuggestedModule, bool OpenFile /*=true*/, - bool CacheFailures /*=true*/) { + ModuleMap::KnownHeader *SuggestedModule, bool NeedSuggest, + bool OpenFile /*=true*/, bool CacheFailures /*=true*/) { // If we have a module map that might map this header, load it and // check whether we'll have a suggestion for a module. auto File = getFileMgr().getFileRef(FileName, OpenFile, CacheFailures); @@ -462,6 +462,9 @@ OptionalFileEntryRef HeaderSearch::getFileAndSuggestModule( return std::nullopt; } + if (!NeedSuggest) + return *File; + // If there is a module that corresponds to this header, suggest it. if (!findUsableModuleForHeader( *File, Dir ? Dir : File->getFileEntry().getDir(), RequestingModule, @@ -479,7 +482,7 @@ OptionalFileEntryRef DirectoryLookup::LookupFile( Module *RequestingModule, ModuleMap::KnownHeader *SuggestedModule, bool &InUserSpecifiedSystemFramework, bool &IsFrameworkFound, bool &IsInHeaderMap, SmallVectorImpl<char> &MappedName, - bool OpenFile) const { + bool NeedSuggest, bool OpenFile) const { InUserSpecifiedSystemFramework = false; IsInHeaderMap = false; MappedName.clear(); @@ -501,7 +504,7 @@ OptionalFileEntryRef DirectoryLookup::LookupFile( return HS.getFileAndSuggestModule( TmpDir, IncludeLoc, getDir(), isSystemHeaderDirectory(), - RequestingModule, SuggestedModule, OpenFile); + RequestingModule, SuggestedModule, NeedSuggest, OpenFile); } if (isFramework()) @@ -881,6 +884,37 @@ diagnoseFrameworkInclude(DiagnosticsEngine &Diags, SourceLocation IncludeLoc, << IncludeFilename; } + +/// Return true if a shadow has been detected and the caller should +/// stop and return the first-found file and module, false otherwise. +static bool checkAndStoreCandidate( + ModuleMap::KnownHeader *SuggestedModule, + OptionalFileEntryRef CandidateFile, StringRef CandidateDir, + DiagnosticsEngine &Diags, StringRef Filename, SourceLocation IncludeLoc, + ModuleMap::KnownHeader &FirstModule, OptionalFileEntryRef &FirstHeader, + SmallString<1024> &FirstDir) { + if (!FirstHeader) { + // Found the first candidate + FirstHeader = CandidateFile; + FirstDir = CandidateDir; + if (SuggestedModule) + FirstModule = *SuggestedModule; + return false; + } + + if (FirstDir != CandidateDir) { + // Found a second candidate from a different directory + Diags.Report(IncludeLoc, diag::warn_header_shadowed) + << Filename << FirstDir << CandidateDir; + if (SuggestedModule) + *SuggestedModule = FirstModule; + return true; + } + + // Found a candidate from the same directory as the first one + return false; +} + /// LookupFile - Given a "foo" or \<foo> reference, look up the indicated file, /// return null on failure. isAngled indicates whether the file reference is /// for system \#include's or not (i.e. using <> instead of ""). Includers, if @@ -923,13 +957,18 @@ OptionalFileEntryRef HeaderSearch::LookupFile( // Otherwise, just return the file. return getFileAndSuggestModule(Filename, IncludeLoc, nullptr, /*IsSystemHeaderDir*/ false, - RequestingModule, SuggestedModule, OpenFile, - CacheFailures); + RequestingModule, SuggestedModule, true, + OpenFile, CacheFailures); } // This is the header that MSVC's header search would have found. + bool First = true; + bool NeedSuggest = true; ModuleMap::KnownHeader MSSuggestedModule; OptionalFileEntryRef MSFE; + ModuleMap::KnownHeader FirstModule; + OptionalFileEntryRef FirstHeader; + SmallString<1024> FirstDir; // Check to see if the file is in the #includer's directory. This cannot be // based on CurDir, because each includer could be a #include of a @@ -938,7 +977,6 @@ OptionalFileEntryRef HeaderSearch::LookupFile( // headers. if (!Includers.empty() && !isAngled) { SmallString<1024> TmpDir; - bool First = true; for (const auto &IncluderAndDir : Includers) { OptionalFileEntryRef Includer = IncluderAndDir.first; @@ -962,10 +1000,15 @@ OptionalFileEntryRef HeaderSearch::LookupFile( }(); if (OptionalFileEntryRef FE = getFileAndSuggestModule( TmpDir, IncludeLoc, IncluderAndDir.second, IncluderIsSystemHeader, - RequestingModule, SuggestedModule)) { + RequestingModule, SuggestedModule, NeedSuggest)) { if (!Includer) { assert(First && "only first includer can have no file"); - return FE; + checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, + Filename, IncludeLoc, FirstModule, + FirstHeader, FirstDir); + NeedSuggest = false; + break; } // Leave CurDir unset. @@ -994,21 +1037,30 @@ OptionalFileEntryRef HeaderSearch::LookupFile( diagnoseFrameworkInclude(Diags, IncludeLoc, IncluderAndDir.second.getName(), Filename, *FE); - return FE; + checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, Filename, IncludeLoc, + FirstModule, FirstHeader, FirstDir); + NeedSuggest = false; + break; } // Otherwise, we found the path via MSVC header search rules. If // -Wmsvc-include is enabled, we have to keep searching to see if we // would've found this header in -I or -isystem directories. - if (Diags.isIgnored(diag::ext_pp_include_search_ms, IncludeLoc)) { - return FE; - } else { - MSFE = FE; - if (SuggestedModule) { - MSSuggestedModule = *SuggestedModule; - *SuggestedModule = ModuleMap::KnownHeader(); - } - break; + if (checkAndStoreCandidate(SuggestedModule, FE, + IncluderAndDir.second.getName(), Diags, Filename, IncludeLoc, + FirstModule, FirstHeader, FirstDir)) { + // Found mutiple candidates via MSVC rules + if (Diags.isIgnored(diag::ext_pp_include_search_ms, IncludeLoc)) + return FirstHeader; + else + NeedSuggest = false; + break; + } + MSFE = FE; + if (SuggestedModule) { + MSSuggestedModule = *SuggestedModule; + *SuggestedModule = ModuleMap::KnownHeader(); } } First = false; @@ -1069,6 +1121,7 @@ OptionalFileEntryRef HeaderSearch::LookupFile( } SmallString<64> MappedName; + First = true; // Check each directory in sequence to see if it contains this file. for (; It != search_dir_end(); ++It) { @@ -1078,7 +1131,7 @@ OptionalFileEntryRef HeaderSearch::LookupFile( OptionalFileEntryRef File = It->LookupFile( Filename, *this, IncludeLoc, SearchPath, RelativePath, RequestingModule, SuggestedModule, InUserSpecifiedSystemFramework, IsFrameworkFoundInDir, - IsInHeaderMap, MappedName, OpenFile); + IsInHeaderMap, MappedName, NeedSuggest, OpenFile); if (!MappedName.empty()) { assert(IsInHeaderMap && "MappedName should come from a header map"); CacheLookup.MappedName = @@ -1097,53 +1150,60 @@ OptionalFileEntryRef HeaderSearch::LookupFile( if (!File) continue; - CurDir = It; - - IncludeNames[*File] = Filename; - - // This file is a system header or C++ unfriendly if the dir is. - HeaderFileInfo &HFI = getFileInfo(*File); - HFI.DirInfo = CurDir->getDirCharacteristic(); - - // If the directory characteristic is User but this framework was - // user-specified to be treated as a system framework, promote the - // characteristic. - if (HFI.DirInfo == SrcMgr::C_User && InUserSpecifiedSystemFramework) - HFI.DirInfo = SrcMgr::C_System; - - // If the filename matches a known system header prefix, override - // whether the file is a system header. - for (unsigned j = SystemHeaderPrefixes.size(); j; --j) { - if (Filename.starts_with(SystemHeaderPrefixes[j - 1].first)) { - HFI.DirInfo = SystemHeaderPrefixes[j-1].second ? SrcMgr::C_System - : SrcMgr::C_User; - break; + if (First) { + First = false; + NeedSuggest = false; + CurDir = It; + IncludeNames[*File] = Filename; + + // This file is a system header or C++ unfriendly if the dir is. + HeaderFileInfo &HFI = getFileInfo(*File); + HFI.DirInfo = CurDir->getDirCharacteristic(); + + // If the directory characteristic is User but this framework was + // user-specified to be treated as a system framework, promote the + // characteristic. + if (HFI.DirInfo == SrcMgr::C_User && InUserSpecifiedSystemFramework) + HFI.DirInfo = SrcMgr::C_System; + + // If the filename matches a known system header prefix, override + // whether the file is a system header. + for (unsigned j = SystemHeaderPrefixes.size(); j; --j) { + if (Filename.starts_with(SystemHeaderPrefixes[j - 1].first)) { + HFI.DirInfo = SystemHeaderPrefixes[j - 1].second ? SrcMgr::C_System + : SrcMgr::C_User; + break; + } } - } - if (checkMSVCHeaderSearch(Diags, MSFE, &File->getFileEntry(), IncludeLoc)) { - if (SuggestedModule) + if (checkMSVCHeaderSearch(Diags, MSFE, &File->getFileEntry(), IncludeLoc) && + SuggestedModule) *SuggestedModule = MSSuggestedModule; - return MSFE; - } - bool FoundByHeaderMap = !IsMapped ? false : *IsMapped; - if (!Includers.empty()) - diagnoseFrameworkInclude(Diags, IncludeLoc, - Includers.front().second.getName(), Filename, - *File, isAngled, FoundByHeaderMap); + bool FoundByHeaderMap = !IsMapped ? false : *IsMapped; + if (!Includers.empty()) + diagnoseFrameworkInclude(Diags, IncludeLoc, + Includers.front().second.getName(), Filename, + *File, isAngled, FoundByHeaderMap); - // Remember this location for the next lookup we do. - cacheLookupSuccess(CacheLookup, It, IncludeLoc); - return File; + // Remember this location for the next lookup we do. + cacheLookupSuccess(CacheLookup, It, IncludeLoc); + } + + if (checkAndStoreCandidate(SuggestedModule, File, It->getName(), Diags, + Filename, IncludeLoc, FirstModule, FirstHeader, FirstDir)) + return FirstHeader; } - if (checkMSVCHeaderSearch(Diags, MSFE, nullptr, IncludeLoc)) { + if (First && checkMSVCHeaderSearch(Diags, MSFE, nullptr, IncludeLoc)) { if (SuggestedModule) *SuggestedModule = MSSuggestedModule; return MSFE; } + if (FirstHeader) + return FirstHeader; + // Otherwise, didn't find it. Remember we didn't find this. CacheLookup.HitIt = search_dir_end(); return std::nullopt; diff --git a/clang/test/Preprocessor/header-shadowing.c b/clang/test/Preprocessor/header-shadowing.c new file mode 100644 index 0000000000000..08b3c3e3f8db7 --- /dev/null +++ b/clang/test/Preprocessor/header-shadowing.c @@ -0,0 +1,14 @@ +// RUN: rm -rf %t +// RUN: split-file %s %t + +// RUN: %clang_cc1 -Wheader-shadowing -Eonly %t/main.c -I %t/include 2>&1 | FileCheck %s --check-prefix=SHADOWING +// SHADOWING: {{.*}} warning: multiple candidates for header 'header.h' found; + +//--- main.c +#include "header.h" + +//--- include/header.h +#pragma once + +//--- header.h +#pragma once 
@github-actions
Copy link

github-actions bot commented Oct 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@Jinjie-Huang Jinjie-Huang force-pushed the users/huangjinjie/clang_include_multi_candidates branch 2 times, most recently from 92f3422 to 2663a53 Compare October 8, 2025 15:04
@Jinjie-Huang Jinjie-Huang marked this pull request as draft October 8, 2025 15:35
@Jinjie-Huang Jinjie-Huang force-pushed the users/huangjinjie/clang_include_multi_candidates branch from 2663a53 to 5e51efd Compare October 10, 2025 08:13
@Jinjie-Huang Jinjie-Huang marked this pull request as ready for review October 10, 2025 09:08
@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Oct 10, 2025

The current implementation directly reuses the search logic from HeaderSearch::LookupFile() and avoids side effects caused by performing additional searches.

Alternatively, we could recalculate the default search paths after locating the corresponding header file and then perform the extra search. This could make the implementation more consistent and better isolate functionalities.

Please let me know if the alternative seems preferable, I’m happy to adjust the implementation.

@Jinjie-Huang
Copy link
Contributor Author

@erichkeane @cor3ntin @AaronBallman @shafik Can you help to review this at your convenience? Not sure if this is a good idea.

@cor3ntin
Copy link
Contributor

@tahonermann might have opinions

Copy link
Contributor

@tahonermann tahonermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clang currently fails to resolve header files the same way that MSVC does when operating in its cl-compatible driver mode. I have a PR that corrects this (#105738), but I still need to add some docs before taking that PR out of draft status. The code in that PR will ship with the next Intel oneAPI compiler release; I hope to land it soon-ish.

I am worried that this diagnostic will be quite noisy for some projects. I think a more useful improvement would be more diagnostics for cases where clang and clang-cl would resolve a header file differently. I see you already found the one existing case of such a diagnostic here.

Is there special handling for #include_next? If not, wouldn't effectively every use of #include_next be accompanied by this new warning? Or am I misunderstanding something?

@Jinjie-Huang Jinjie-Huang requested a review from MaskRay October 22, 2025 17:39
@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Oct 22, 2025

@tahonermann Thank you for review and sharing the helpful context.​

I am worried that this diagnostic will be quite noisy for some projects. I think a more useful improvement would be more diagnostics for cases where clang and clang-cl would resolve a header file differently. I see you already found the one existing case of such a diagnostic here.

Yes, I agree that it could potentially generate noise, especially if a project intentionally uses header files with the same name and carefully designs the '-I' options.

The consideration here is: currently, this new diagnostic is disabled by default. Perhaps we could enable it in debugging scenarios, such as when sanitizers are enabled, to help users discover subtle API mismatch problems. Alternatively, it could be grouped and enabled alongside something like the "ShadowAll DiagGroup", or we could consider enabling it by default at a more appropriate time in the future?

After all, having numerous header files with the same name in a project does not seem to be a recommended practice, it could indeed lead to some difficult-to-trace problems.

Is there special handling for #include_next? If not, wouldn't effectively every use of #include_next be accompanied by this new warning? Or am I misunderstanding something?

According to my observations, the first-level #include header.h indeed will triger this warning, and the second-level #include_next header.h does not(naturally compatible). It seems there isn’t a particularly elegant way to filter out the combined #include + #include_next scenario for now?

@Jinjie-Huang Jinjie-Huang self-assigned this Oct 23, 2025
@tahonermann
Copy link
Contributor

I haven't taken the time to fully comprehend the source code changes and I'm not fully clear on the intent for this diagnostic. Additional test cases might help to make it clear. For example, is the intent that a diagnostic would be issued for a case like this?

$ cat t.c #include "t.h" $ cat include1/t.h #warning include1/t.h included! $ cat include2/t.h #warning include2/t.h included! $ clang -c -Iinclude1 -Iinclude2 t.c 

I would not be in favor of that example being diagnosed because I think it would interfere with too much intentional use of #include_next.

If the intent is to only issue a diagnostic if a header file is included via double quotes and is found relative to the including file and is also found in a different location via the header search path, I think that might make sense (this is what your test case exercises). Use of #include_next in such cases seems somewhat unlikely (if you are using #include_next, you probably want to be relying completely on header search path ordering). I'm not sure if there is a good way to validate how much existing code might be impacted though.

This change should include a release note in clang/docs/ReleaseNotes.rst that explains the circumstances under which the diagnostic is issued and how to disable it.

I wonder if the existing ext_pp_include_search_ms diagnostic should be incorporated into this one. Technically, that shouldn't be an extension warning since header file resolution isn't defined by the C or C++ standards.

@MaskRay
Copy link
Member

MaskRay commented Oct 28, 2025

I also haven't taken the time to fully comprehend the source code changes. Both libcxx and glibc ship stdio.h. How do you handle #include <stdio.h>? Probing every search path also has a potential syscall performance issue.

@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Oct 28, 2025

Thanks for replies!

I would not be in favor of that example being diagnosed because I think it would interfere with too much intentional use of #include_next.

If the intent is to only issue a diagnostic if a header file is included via double quotes and is found relative to the including file and is also found in a different location via the header search path, I think that might make sense (this is what your test case exercises). Use of #include_next in such cases seems somewhat unlikely (if you are using #include_next, you probably want to be relying completely on header search path ordering). I'm not sure if there is a good way to validate how much existing code might be impacted though.

Yes, the initial goal was quite simple: when user #include "header.h" or <header.h>, if multiple copies of header.h exist in the search paths (e.g., include1 and include2), we want to notify the user that the compiler selected include1/header.h, and the first-found one may have shadowed the second one from include2.

The motivation is that we encountered this situation in a large internal project: when the main project implicitly introduces multiple versions of a third-party library, it can lead to dependencies on identical header files in different paths, which may cause ODR issues and a bit hard to debug. So we also want to see what the community's opinion about this diagnostics.

When using "#include header.h" followed by "#include_next header.h", the header duplication is managed and expected, but currently this new warning is being issued at the #include header.h location. So for users of include_next, this warning would indeed be noise. Therefore, the current idea is to disable this warning by default, requiring the user to manually enable it, or to enable it automatically when a sanitizer is turned on, in order to avoid daily disruption. And based on observations, our internal uses of #include_next are quite rare, so the original idea was mainly to help identify issues in the standard #include scenarios. Does this sound reasonable?

I wonder if the existing ext_pp_include_search_ms diagnostic should be incorporated into this one. Technically, that shouldn't be an extension warning since header file resolution isn't defined by the C or C++ standards.

Thanks for suggestion, it might be possible to incorporate the logic of the new warning into 'ext_pp_include_search_ms', I’ll give it a try.

@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Oct 28, 2025

Both libcxx and glibc ship stdio.h. How do you handle #include <stdio.h>?

For current approach, this diagnostics is also performed for system headers like <stdio.h>. Because the Clang driver is theoretically supposed to automatically specify the system header search paths, so there seems not supposed to have multiple duplications(and this search priority is lower than those manually specified by the user)? If the user specifies additional search paths, and a stdio.h file happens to exist under the paths, this will cause the compiler to select the stdio.h from the user-specified path. So I think this case should be diagnosed and produce a warning as current behavior.

Probing every search path also has a potential syscall performance issue.

I observed that the currently use: "getFileMgr().getFileRef(FileName)" have an internal caching mechanism, which might be helpful here. My initial thought was that the cost of file searching is much lower than file loading, so if this diagnostics is valuable, the cost might be acceptable.

Perhaps I can consider introducing this search cost only when the user explicitly enables it with -Wheader-shadowing.

Copy link
Member

@ecnelises ecnelises left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change works directly in the header search facilities under PP level, so I think it can handle various cases properly. It would be nice to add more test cases including:

  • #include <...>
  • #include_next
  • #import
  • -isystem
  • -cxx-isystem
  • -iframework
  • -include
  • system headers (say C++ stdlib has a private some.h, and we have custom some.h)

to ensure the behavior (unluckily path related options are really complex 🙃).

I also worry if warnings explode when the CU includes a number of files:

// a.cpp #include "a.h" #include "b.h" #include "c.h" // a.h #pragma once #include "b.h" // b.h #pragma once #include "c.h" // c.h #pragma once 

The warning repeats three times if c.h has something shadowed.

@ecnelises
Copy link
Member

ecnelises commented Nov 5, 2025

I'm also not sure if this handles system headers specially. As typical cases, stdatomic.h and math.h exist in both C++ stdlib headers and C stdlib headers (due to some standard incompatibility). (maybe also for stdio.h as @MaskRay raises) Should these warnings be also treated as expected?

$ find . -name math.h ./sysroot/usr/include/math.h ./include/c++/xxx/tr1/math.h ./include/c++/xxx/math.h ./lib/clang/xxx/include/openmp_wrappers/math.h $ find . -name stdatomic.h ./include/c++/xxx/stdatomic.h ./lib/clang/xx/include/stdatomic.h ./lib/gcc/x86_64-linux-gnu/xxx/include/stdatomic.h 
@Jinjie-Huang Jinjie-Huang force-pushed the users/huangjinjie/clang_include_multi_candidates branch 3 times, most recently from c506ef1 to 4199d38 Compare November 6, 2025 11:39
@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Nov 6, 2025

Sorry for the delay in this update.

I have refactored the implementation to perform the diagnostic logic within a separate function that replicates the header search paths. The should simplify the implementation and make the source code changes easier to comprehend.

I also observed that many system headers use #include_next and have multiple copies, which could lead to some helpless warnings. So this diagnostic is now performed only for quoted searches to avoid the noise.

As for the existing ext_pp_include_search_ms diagnostic, I believe it could be merged into this separate function as well. This would be a nice simplification for HeaderSearch::LookupFile. I'm happy to tackle that in a separate PR if this sounds reasonable.

@Jinjie-Huang Jinjie-Huang force-pushed the users/huangjinjie/clang_include_multi_candidates branch 2 times, most recently from 375efdf to 7bd0535 Compare November 7, 2025 13:08
@Jinjie-Huang Jinjie-Huang force-pushed the users/huangjinjie/clang_include_multi_candidates branch from 7bd0535 to 0312677 Compare November 7, 2025 15:30
@ecnelises
Copy link
Member

A few additional comments/questions:

  1. While I know this is disabled by default, have we considered the compilation time impact when it is enabled? Since getFileRef has caching mechanisms (IIUC it only calls stat), the overhead should theoretically be light. However, for projects with deep search paths, the linear scan for every quoted include might still add up. It might be worth verifying this on a larger build.

  2. Clang has a complex set of options to adjust include paths. I think it would be valuable to add test cases for flags like -idirafter, -iquote, and -isystem-after. This ensures the logic behaves consistently.

  3. Please add a release note to Improvements to Clang's diagnostics section.

BTW, In compilers, warnings usually indicate an unexpected or potentially dangerous behavior. Here, header shadowing is the expected behavior. But since header searching is worded as 'implementation-defined' in standard (cpp.include), it's also fine as an extension to me.

Basically, this change looks helpful for large codebases. I think this can proceed after addressing the nits in tests and release notes, unless there are further concerns about the design.

@Jinjie-Huang
Copy link
Contributor Author

Jinjie-Huang commented Nov 20, 2025

Thanks for the review.

While I know this is disabled by default, have we considered the compilation time impact when it is enabled?

Regarding the execution time, I tested with some internal files (ranging from average to small inputs) and didn't see any significant difference:

Files Baseline(s) With Diagnostic(s) Gap 1	58.647 59.69 1.78% 2	42.968 43.437 1.09% 3	20.14 20.195 0.27% 4	5.057 5.057 -0.02% 5	0.031 0.031 0.00% 
@github-actions
Copy link

github-actions bot commented Nov 20, 2025

🐧 Linux x64 Test Results

  • 111398 tests passed
  • 4440 tests skipped
Copy link
Member

@ecnelises ecnelises left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this patch with LLVM trunk bootstrap on Debian 12 and macOS 26, no false positive found.

This looks good to me now, thanks. Please hold on a few days in case other reviewers have more comments.

@Jinjie-Huang
Copy link
Contributor Author

Thanks, everyone. If there are no further comments, I plan to merge this patch sometime next week.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

6 participants