Skip to content

Conversation

@neoncube2
Copy link
Contributor

Improve clang-tidy's container-data-pointer check to use c_str(), when available.

Fixes #55026

This is my first Clang PR! :)

@llvmbot
Copy link
Member

llvmbot commented Nov 5, 2023

@llvm/pr-subscribers-clang-tidy

Author: Eli Black (neoncube2)

Changes

Improve clang-tidy's container-data-pointer check to use c_str(), when available.

Fixes #55026

This is my first Clang PR! :)


Full diff: https://github.com/llvm/llvm-project/pull/71304.diff

4 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp (+16-10)
  • (modified) clang-tools-extra/docs/ReleaseNotes.rst (+4)
  • (modified) clang-tools-extra/docs/clang-tidy/checks/readability/container-data-pointer.rst (+2-6)
  • (modified) clang-tools-extra/test/clang-tidy/checkers/readability/container-data-pointer.cpp (+6-6)
diff --git a/clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp b/clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp index a05e228520c9ef1..c20050063338281 100644 --- a/clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp +++ b/clang-tools-extra/clang-tidy/readability/ContainerDataPointerCheck.cpp @@ -40,8 +40,10 @@ void ContainerDataPointerCheck::registerMatchers(MatchFinder *Finder) { cxxRecordDecl( unless(matchers::matchesAnyListedName(IgnoredContainers)), isSameOrDerivedFrom( - namedDecl( - has(cxxMethodDecl(isPublic(), hasName("data")).bind("data"))) + namedDecl(anyOf(has(cxxMethodDecl(isPublic(), hasName("c_str")) + .bind("c_str")), + has(cxxMethodDecl(isPublic(), hasName("data")) + .bind("data")))) .bind("container"))) .bind("record"); @@ -93,6 +95,8 @@ void ContainerDataPointerCheck::check(const MatchFinder::MatchResult &Result) { const auto *DCE = Result.Nodes.getNodeAs<Expr>(DerefContainerExprName); const auto *ACE = Result.Nodes.getNodeAs<Expr>(AddrOfContainerExprName); + const auto *CStrMethod = Result.Nodes.getNodeAs<CXXMethodDecl>("c_str"); + if (!UO || !CE) return; @@ -111,16 +115,18 @@ void ContainerDataPointerCheck::check(const MatchFinder::MatchResult &Result) { MemberExpr>(CE)) ReplacementText = "(" + ReplacementText + ")"; - if (CE->getType()->isPointerType()) - ReplacementText += "->data()"; - else - ReplacementText += ".data()"; + ReplacementText += CE->getType()->isPointerType() ? "->" : "."; + ReplacementText += CStrMethod != NULL ? "c_str()" : "data()"; + + std::string Description = + CStrMethod != NULL + ? "'c_str' should be used instead of taking the address of the 0-th " + "element" + : "'data' should be used for accessing the data pointer instead of " + "taking the address of the 0-th element"; FixItHint Hint = FixItHint::CreateReplacement(UO->getSourceRange(), ReplacementText); - diag(UO->getBeginLoc(), - "'data' should be used for accessing the data pointer instead of taking " - "the address of the 0-th element") - << Hint; + diag(UO->getBeginLoc(), Description) << Hint; } } // namespace clang::tidy::readability diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst index ecfb3aa9267f140..6477f2243d31e18 100644 --- a/clang-tools-extra/docs/ReleaseNotes.rst +++ b/clang-tools-extra/docs/ReleaseNotes.rst @@ -238,6 +238,10 @@ Changes in existing checks Casting to ``void`` no longer suppresses issues by default, control this behavior with the new `AllowCastToVoid` option. +- Improved :doc:`container-data-pointer + <clang-tidy/checks/readability/container-data-pointer>` check + to use ``c_str()`` when it's present on a container. + - Improved :doc:`cppcoreguidelines-avoid-non-const-global-variables <clang-tidy/checks/cppcoreguidelines/avoid-non-const-global-variables>` check to ignore ``static`` variables declared within the scope of diff --git a/clang-tools-extra/docs/clang-tidy/checks/readability/container-data-pointer.rst b/clang-tools-extra/docs/clang-tidy/checks/readability/container-data-pointer.rst index 0d10829ed3c2f9b..8a6b48c58005bf2 100644 --- a/clang-tools-extra/docs/clang-tidy/checks/readability/container-data-pointer.rst +++ b/clang-tools-extra/docs/clang-tidy/checks/readability/container-data-pointer.rst @@ -3,13 +3,9 @@ readability-container-data-pointer ================================== -Finds cases where code could use ``data()`` rather than the address of the -element at index 0 in a container. This pattern is commonly used to materialize -a pointer to the backing data of a container. ``std::vector`` and -``std::string`` provide a ``data()`` accessor to retrieve the data pointer which -should be preferred. +Finds cases where code references the address of the element at index 0 in a container and replaces them with calls to ``data()`` or ``c_str()``. -This also ensures that in the case that the container is empty, the data pointer +Using ``data()`` or ``c_str()`` is more readable and ensures that if the container is empty, the data pointer access does not perform an errant memory access. Options diff --git a/clang-tools-extra/test/clang-tidy/checkers/readability/container-data-pointer.cpp b/clang-tools-extra/test/clang-tidy/checkers/readability/container-data-pointer.cpp index 7d8500ed4affe36..a53f303eba1f5d3 100644 --- a/clang-tools-extra/test/clang-tidy/checkers/readability/container-data-pointer.cpp +++ b/clang-tools-extra/test/clang-tidy/checkers/readability/container-data-pointer.cpp @@ -52,15 +52,15 @@ void g(size_t s) { void h() { std::string s; f(&((s).operator[]((z)))); - // CHECK-MESSAGES-CLASSIC: :[[@LINE-1]]:5: warning: 'data' should be used for accessing the data pointer instead of taking the address of the 0-th element [readability-container-data-pointer] - // CHECK-FIXES-CLASSIC: {{^ }}f(s.data());{{$}} - // CHECK-MESSAGES-WITH-CONFIG-NOT: :[[@LINE-3]]:5: warning: 'data' should be used for accessing the data pointer instead of taking the address of the 0-th element [readability-container-data-pointer] + // CHECK-MESSAGES-CLASSIC: :[[@LINE-1]]:5: warning: 'c_str' should be used instead of taking the address of the 0-th element [readability-container-data-pointer] + // CHECK-FIXES-CLASSIC: {{^ }}f(s.c_str());{{$}} + // CHECK-MESSAGES-WITH-CONFIG-NOT: :[[@LINE-3]]:5: warning: 'c_str' should be used instead of taking the address of the 0-th element [readability-container-data-pointer] std::wstring w; f(&((&(w))->operator[]((z)))); - // CHECK-MESSAGES-CLASSIC: :[[@LINE-1]]:5: warning: 'data' should be used for accessing the data pointer instead of taking the address of the 0-th element [readability-container-data-pointer] - // CHECK-FIXES-CLASSIC: {{^ }}f(w.data());{{$}} - // CHECK-MESSAGES-WITH-CONFIG-NOT: :[[@LINE-3]]:5: warning: 'data' should be used for accessing the data pointer instead of taking the address of the 0-th element [readability-container-data-pointer] + // CHECK-MESSAGES-CLASSIC: :[[@LINE-1]]:5: warning: 'c_str' should be used instead of taking the address of the 0-th element [readability-container-data-pointer] + // CHECK-FIXES-CLASSIC: {{^ }}f(w.c_str());{{$}} + // CHECK-MESSAGES-WITH-CONFIG-NOT: :[[@LINE-3]]:5: warning: 'c_str' should be used instead of taking the address of the 0-th element [readability-container-data-pointer] } template <typename T, typename U, 
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wrap on 80 column,
And first sentence of documentation should be in sync with class description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not wrapped on 80 collumn

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not so easy, because if result of &[] is used as non-const, then using c_str will make code non-compilable. Some additional checks wuold be required, like checking if object is const, or checking if there is implicit cast to const pointer from an nonconst pointer, only then c_str could be used.

Make "f" function take non-const pointer, then you will see.
Also there can be an issue when for example, container have only-const version of some functions (c_str).

…instead of `data()` when it's available. Fixes #55026
@neoncube2
Copy link
Contributor Author

@PiotrZSL Thanks for the quick review! :)

I think #54076 covers the const issue. Since container-data-pointer is already broken for code such as char *t = str.data();, I think merging this PR wouldn't cause additional issues, but I could be wrong about that! :)

I'd like to take a look at #54076 next, but I think that'd be a bit more complicated to fix, since it involves the left-hand side of the expression (Unless there's already a helper method to determine whether the left-hand side of the expression is const?)

Thanks again for the review! :)

ReplacementText += CE->getType()->isPointerType() ? "->" : ".";
ReplacementText += CStrMethod ? "c_str()" : "data()";

std::string Description =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use llvm::StringRef instead of std::string here

Finds cases where code references the address of the element at index 0 in a container and replaces them with calls to ``data()`` or ``c_str()``.

This also ensures that in the case that the container is empty, the data pointer
Using ``data()`` or ``c_str()`` is more readable and ensures that if the container is empty, the data pointer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow 80 characters limit.

Copy link
Contributor

@5chmidti 5chmidti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neoncube2 do you plan to continue working on this?


Because we are emitting a fix-it, it might be a good idea to check if the return type of c_str is that of data (~possibly unwrapping aliases).

Comment on lines +118 to +119
ReplacementText += CE->getType()->isPointerType() ? "->" : ".";
ReplacementText += CStrMethod ? "c_str()" : "data()";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to use llvm::Twine for this concatenation

namedDecl(anyOf(has(cxxMethodDecl(isPublic(), hasName("c_str"))
.bind("c_str")),
has(cxxMethodDecl(isPublic(), hasName("data"))
.bind("data"))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The node bound to data is never retrieved, so you can remove the .bind("data").

@neoncube2
Copy link
Contributor Author

@5chmidti Ah, no, I no longer plan to work on this, sorry. It turns out that C++ compilers are too far out of the range of my experience, haha.

Should I close this pull request? Or if someone wants to use my branch as a base for a new PR, feel free! :)

@5chmidti
Copy link
Contributor

No worries. Feel free to ask in reviews or in discord though. Feel free to close it if you don't want to work on this anymore. The pr will still be linked in the issue, in case someone wants to look at it.

@neoncube2 neoncube2 closed this by deleting the head repository Mar 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

5 participants