Skip to content

Conversation

@yshui
Copy link
Contributor

@yshui yshui commented Jul 18, 2024

Unsure if this is the correct fix. But let me explain what I know so far about this problem:

  1. Include cleaner goes over macro refs, and calculates their source locations:
    for (const auto &[_, Refs] : AST.getMacros().MacroRefs) {
    for (const auto &Ref : Refs) {
    auto Loc = SM.getComposedLoc(SM.getMainFileID(), Ref.StartOffset);
  2. It assumes the macro ref is in the main file. I think ParsedAST::Macros is meant to only contain macro usages in the main file.
  3. ParsedAST::Macros is filled in by CollectMainFileMacros:
    void CollectMainFileMacros::add(const Token &MacroNameTok, const MacroInfo *MI,
    bool IsDefinition, bool InIfCondition) {
    if (!InMainFile)
    return;
    auto Loc = MacroNameTok.getLocation();
    if (Loc.isInvalid() || Loc.isMacroID())
    return;
    auto Name = MacroNameTok.getIdentifierInfo()->getName();
    Out.Names.insert(Name);
    size_t Start = SM.getFileOffset(Loc);
    size_t End = SM.getFileOffset(MacroNameTok.getEndLoc());
    if (auto SID = getSymbolID(Name, MI, SM))
    Out.MacroRefs[SID].push_back({Start, End, IsDefinition, InIfCondition});
    else
    Out.UnknownMacros.push_back({Start, End, IsDefinition, InIfCondition});
    }
  4. CollectMainFileMacros checks if a macro usage is in main file with if (InMainFile), which is updated in CollectMainFileMacros::FileChange.
  5. I think the intention of PPCallbacks::FileChange is that it is called whenever the lexer moves to a new file.

Above is what I am sure about, after that I am not quite sure what's happening.

I think macro expansions (HandleIdentifier -> HandleMacroExpansion) are done out of the normal file change logic? Because I observed CollectMainFileMacros::add being called with source locations that are clearly not in the same file as reported by the most recent FileChange.

The consequence of that is that CollectMainFileMacros will get offsets related to one file, and those offsets will later be combined with another file (the main file) into a source location in the include cleaner. Those source locations will then be used to get a spelling, which at best will be wrong, and at worst will trigger an assertion failure at:

assert(It != Files.end());


What I did here is instead of using InMainFile/PPCallbacks::FileChange to make sure the macro is in the main file, I check the FileID directly

@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2024

@llvm/pr-subscribers-clangd

@llvm/pr-subscribers-clang-tools-extra

Author: Yuxuan Shui (yshui)

Changes

Unsure if this is the correct fix. But let me explain what I know so far about this problem:

  1. Include cleaner goes over macro refs, and calculates their source locations:
    for (const auto &[_, Refs] : AST.getMacros().MacroRefs) {
    for (const auto &Ref : Refs) {
    auto Loc = SM.getComposedLoc(SM.getMainFileID(), Ref.StartOffset);
  2. It assumes the macro ref is in the main file. I think the intention is that ParsedAST::Macros only contains macro usages in the main file.
  3. ParsedAST::Macros is filled in by CollectMainFileMacros:
    void CollectMainFileMacros::add(const Token &MacroNameTok, const MacroInfo *MI,
    bool IsDefinition, bool InIfCondition) {
    if (!InMainFile)
    return;
    auto Loc = MacroNameTok.getLocation();
    if (Loc.isInvalid() || Loc.isMacroID())
    return;
    auto Name = MacroNameTok.getIdentifierInfo()->getName();
    Out.Names.insert(Name);
    size_t Start = SM.getFileOffset(Loc);
    size_t End = SM.getFileOffset(MacroNameTok.getEndLoc());
    if (auto SID = getSymbolID(Name, MI, SM))
    Out.MacroRefs[SID].push_back({Start, End, IsDefinition, InIfCondition});
    else
    Out.UnknownMacros.push_back({Start, End, IsDefinition, InIfCondition});
    }
  4. CollectMainFileMacros checks if the macro usage is in main file with if (InMainFile), and InMainFile is updated in CollectMainFileMacros::FileChange.
  5. I think the intention of PPCallbacks::FileChange is that it is called whenever the lexer moves to a new file.

Above is what I am sure about, after that I am not quite sure what's happening.

I think macro expansions (HandleIdentifier -> HandleMacroExpansion) are done out of the normal file change logic? Because I observed CollectMainFileMacros::add being calling with source locations that are clearly not in the same file as reported by the most recent FileChange.

The consequence of that is that CollectMainFileMacros will get offsets related to one file, and those offsets will later be combined with another file (the main file) into a source location in the include cleaner. Those source locations will then be used to get a spelling, which triggers a assertion failure at:

assert(It != Files.end());


What I did here is instead of using InMainFile/PPCallbacks::FileChange to make sure the macro is in the main file, I check the FileID directly


Full diff: https://github.com/llvm/llvm-project/pull/99514.diff

1 Files Affected:

  • (modified) clang-tools-extra/clangd/CollectMacros.cpp (+7-7)
diff --git a/clang-tools-extra/clangd/CollectMacros.cpp b/clang-tools-extra/clangd/CollectMacros.cpp index c5ba8d903ba48..bda5616447193 100644 --- a/clang-tools-extra/clangd/CollectMacros.cpp +++ b/clang-tools-extra/clangd/CollectMacros.cpp @@ -26,11 +26,12 @@ Range MacroOccurrence::toRange(const SourceManager &SM) const { void CollectMainFileMacros::add(const Token &MacroNameTok, const MacroInfo *MI, bool IsDefinition, bool InIfCondition) { - if (!InMainFile) - return; auto Loc = MacroNameTok.getLocation(); if (Loc.isInvalid() || Loc.isMacroID()) return; + auto FID = SM.getFileID(Loc); + if (FID != SM.getMainFileID()) + return; auto Name = MacroNameTok.getIdentifierInfo()->getName(); Out.Names.insert(Name); @@ -42,10 +43,8 @@ void CollectMainFileMacros::add(const Token &MacroNameTok, const MacroInfo *MI, Out.UnknownMacros.push_back({Start, End, IsDefinition, InIfCondition}); } -void CollectMainFileMacros::FileChanged(SourceLocation Loc, FileChangeReason, - SrcMgr::CharacteristicKind, FileID) { - InMainFile = isInsideMainFile(Loc, SM); -} +void CollectMainFileMacros::FileChanged(SourceLocation, FileChangeReason, + SrcMgr::CharacteristicKind, FileID) {} void CollectMainFileMacros::MacroExpands(const Token &MacroName, const MacroDefinition &MD, @@ -93,7 +92,8 @@ void CollectMainFileMacros::Defined(const Token &MacroName, void CollectMainFileMacros::SourceRangeSkipped(SourceRange R, SourceLocation EndifLoc) { - if (!InMainFile) + auto FID = SM.getFileID(R.getBegin()); + if (FID != SM.getMainFileID()) return; Position Begin = sourceLocToPosition(SM, R.getBegin()); Position End = sourceLocToPosition(SM, R.getEnd()); 
@yshui
Copy link
Contributor Author

yshui commented Jul 18, 2024

I can provide a repro if anyone is interested

@yshui
Copy link
Contributor Author

yshui commented Jul 18, 2024

Looks like I definitely don't understand what's going on. Instead let me try a more conservative fix. This should preserve what the current code is doing and just fix the assertion failure.

@github-actions
Copy link

github-actions bot commented Jul 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@HighCommander4
Copy link
Collaborator

Is there a bug report this is fixing? Or a test case?

@yshui
Copy link
Contributor Author

yshui commented Jul 18, 2024

Good question. I will try to add one once I figure out how this test system works

@HighCommander4
Copy link
Collaborator

Looks like I definitely don't understand what's going on. Instead let me try a more conservative fix. This should preserve what the current code is doing and just fix the assertion failure.

Could you say more about what problem you ran into with the first fix approach?

@yshui
Copy link
Contributor Author

yshui commented Aug 23, 2024

@HighCommander4 It's been a while. I think I was trying to ignore everything but the main file in CollectMacros, but the naive approach i used wasn't working

Copy link
Collaborator

@HighCommander4 HighCommander4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having debugged the failing scenario a bit (see this comment), my sense is that we do not want to accept MacroRefs entries which are not in the main file; instead, I have a proposal for an alternative solution at the end of that comment.

Let me know if you're interested in implementing it, otherwise I can do it.

@yshui
Copy link
Contributor Author

yshui commented Aug 26, 2024

@HighCommander4 thanks for looking into this! I think this can be closed, right?

@yshui yshui closed this Aug 26, 2024
@yshui yshui deleted the macros-offset branch August 26, 2024 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment