Skip to content

Conversation

@ldionne
Copy link
Member

@ldionne ldionne commented Sep 30, 2024

Since we don't generate a full dependency graph of headers, we can
greatly simplify the script that parses the result of --trace-includes.

At the same time, we also unify the mechanism for detecting whether a
header is a public/C compat/internal/etc header with the existing
mechanism in header_information.py.

As a drive-by this fixes the headers_in_modulemap.sh.py test which had
been disabled by mistake because it used its own way of determining
the list of libc++ headers. By consistently using header_information.py
to get that information, problems like this shouldn't happen anymore.

This should also unblock #110303, which was blocked because of
a brittle implementation of the transitive includes check which broke
when the repository was cloned at a path like /path/__something/more.

@ldionne ldionne requested a review from a team as a code owner September 30, 2024 18:57
@llvmbot llvmbot added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. github:workflow labels Sep 30, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-libcxx

Author: Louis Dionne (ldionne)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/110554.diff

4 Files Affected:

  • (modified) .github/workflows/libcxx-build-and-test.yaml (+3-1)
  • (modified) libcxx/test/libcxx/transitive_includes.gen.py (+3-6)
  • (added) libcxx/test/libcxx/transitive_includes/expected.py (+35)
  • (renamed) libcxx/test/libcxx/transitive_includes/to_csv.py ()
diff --git a/.github/workflows/libcxx-build-and-test.yaml b/.github/workflows/libcxx-build-and-test.yaml index b5e60781e00064..831db85ca8e5c2 100644 --- a/.github/workflows/libcxx-build-and-test.yaml +++ b/.github/workflows/libcxx-build-and-test.yaml @@ -49,7 +49,9 @@ env: jobs: stage1: if: github.repository_owner == 'llvm' - runs-on: libcxx-runners-8-set + runs-on: ubuntu-latest + container: + image: ghcr.io/libcxx/actions-builder:fozzie-live continue-on-error: false strategy: fail-fast: false diff --git a/libcxx/test/libcxx/transitive_includes.gen.py b/libcxx/test/libcxx/transitive_includes.gen.py index 22075364bf1b79..f9a868128c6c83 100644 --- a/libcxx/test/libcxx/transitive_includes.gen.py +++ b/libcxx/test/libcxx/transitive_includes.gen.py @@ -55,7 +55,7 @@ print( f"""\ -// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv """ ) @@ -64,9 +64,6 @@ if header.endswith(".h"): # Skip C compatibility or detail headers continue - # Escape slashes for the awk command below - escaped_header = header.replace("/", "\\/") - print( f"""\ //--- {header}.sh.cpp @@ -92,8 +89,8 @@ // RUN: mkdir %t // RUN: %{{cxx}} %s %{{flags}} %{{compile_flags}} --trace-includes -fshow-skipped-includes --preprocess > /dev/null 2> %t/trace-includes.txt -// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv -// RUN: cat %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv | awk '/^{escaped_header} / {{ print }}' > %t/expected_transitive_includes.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/expected.py %{{cxx_std}} "{header}" > %t/expected_transitive_includes.csv // RUN: diff -w %t/expected_transitive_includes.csv %t/actual_transitive_includes.csv #include <{header}> """ diff --git a/libcxx/test/libcxx/transitive_includes/expected.py b/libcxx/test/libcxx/transitive_includes/expected.py new file mode 100755 index 00000000000000..b2519f36b3974b --- /dev/null +++ b/libcxx/test/libcxx/transitive_includes/expected.py @@ -0,0 +1,35 @@ +#!/usr/bin/env python +# ===----------------------------------------------------------------------===## +# +# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +# See https://llvm.org/LICENSE.txt for license information. +# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +# +# ===----------------------------------------------------------------------===## + +import argparse +import os + + +if __name__ == "__main__": + parser = argparse.ArgumentParser( + description="""Extract the list of expected transitive includes for the given Standard and header.""", + ) + parser.add_argument( + "standard", + default=None, + choices=["cxx03", "cxx11", "cxx14", "cxx20", "cxx23", "cxx26"], + ) + parser.add_argument( + "header", + default=None, + help="The header to extract the expected transitive includes for." + ) + args = parser.parse_args() + + CSV_ROOT = os.path.dirname(__file__) + filename = os.path.join(CSV_ROOT, f"{args.standard}.csv") + with open(filename, 'r') as f: + for line in f.readlines(): + if line.startswith(args.header + ' '): + print(line, end='') # lines already end in newline diff --git a/libcxx/test/libcxx/transitive_includes_to_csv.py b/libcxx/test/libcxx/transitive_includes/to_csv.py similarity index 100% rename from libcxx/test/libcxx/transitive_includes_to_csv.py rename to libcxx/test/libcxx/transitive_includes/to_csv.py 
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-github-workflow

Author: Louis Dionne (ldionne)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/110554.diff

4 Files Affected:

  • (modified) .github/workflows/libcxx-build-and-test.yaml (+3-1)
  • (modified) libcxx/test/libcxx/transitive_includes.gen.py (+3-6)
  • (added) libcxx/test/libcxx/transitive_includes/expected.py (+35)
  • (renamed) libcxx/test/libcxx/transitive_includes/to_csv.py ()
diff --git a/.github/workflows/libcxx-build-and-test.yaml b/.github/workflows/libcxx-build-and-test.yaml index b5e60781e00064..831db85ca8e5c2 100644 --- a/.github/workflows/libcxx-build-and-test.yaml +++ b/.github/workflows/libcxx-build-and-test.yaml @@ -49,7 +49,9 @@ env: jobs: stage1: if: github.repository_owner == 'llvm' - runs-on: libcxx-runners-8-set + runs-on: ubuntu-latest + container: + image: ghcr.io/libcxx/actions-builder:fozzie-live continue-on-error: false strategy: fail-fast: false diff --git a/libcxx/test/libcxx/transitive_includes.gen.py b/libcxx/test/libcxx/transitive_includes.gen.py index 22075364bf1b79..f9a868128c6c83 100644 --- a/libcxx/test/libcxx/transitive_includes.gen.py +++ b/libcxx/test/libcxx/transitive_includes.gen.py @@ -55,7 +55,7 @@ print( f"""\ -// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py {' '.join(all_traces)} > %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv """ ) @@ -64,9 +64,6 @@ if header.endswith(".h"): # Skip C compatibility or detail headers continue - # Escape slashes for the awk command below - escaped_header = header.replace("/", "\\/") - print( f"""\ //--- {header}.sh.cpp @@ -92,8 +89,8 @@ // RUN: mkdir %t // RUN: %{{cxx}} %s %{{flags}} %{{compile_flags}} --trace-includes -fshow-skipped-includes --preprocess > /dev/null 2> %t/trace-includes.txt -// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes_to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv -// RUN: cat %{{libcxx-dir}}/test/libcxx/transitive_includes/%{{cxx_std}}.csv | awk '/^{escaped_header} / {{ print }}' > %t/expected_transitive_includes.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/to_csv.py %t/trace-includes.txt > %t/actual_transitive_includes.csv +// RUN: %{{python}} %{{libcxx-dir}}/test/libcxx/transitive_includes/expected.py %{{cxx_std}} "{header}" > %t/expected_transitive_includes.csv // RUN: diff -w %t/expected_transitive_includes.csv %t/actual_transitive_includes.csv #include <{header}> """ diff --git a/libcxx/test/libcxx/transitive_includes/expected.py b/libcxx/test/libcxx/transitive_includes/expected.py new file mode 100755 index 00000000000000..b2519f36b3974b --- /dev/null +++ b/libcxx/test/libcxx/transitive_includes/expected.py @@ -0,0 +1,35 @@ +#!/usr/bin/env python +# ===----------------------------------------------------------------------===## +# +# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +# See https://llvm.org/LICENSE.txt for license information. +# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +# +# ===----------------------------------------------------------------------===## + +import argparse +import os + + +if __name__ == "__main__": + parser = argparse.ArgumentParser( + description="""Extract the list of expected transitive includes for the given Standard and header.""", + ) + parser.add_argument( + "standard", + default=None, + choices=["cxx03", "cxx11", "cxx14", "cxx20", "cxx23", "cxx26"], + ) + parser.add_argument( + "header", + default=None, + help="The header to extract the expected transitive includes for." + ) + args = parser.parse_args() + + CSV_ROOT = os.path.dirname(__file__) + filename = os.path.join(CSV_ROOT, f"{args.standard}.csv") + with open(filename, 'r') as f: + for line in f.readlines(): + if line.startswith(args.header + ' '): + print(line, end='') # lines already end in newline diff --git a/libcxx/test/libcxx/transitive_includes_to_csv.py b/libcxx/test/libcxx/transitive_includes/to_csv.py similarity index 100% rename from libcxx/test/libcxx/transitive_includes_to_csv.py rename to libcxx/test/libcxx/transitive_includes/to_csv.py 
@ldionne ldionne changed the title Review/transitive includes more portable [libc++] Stop using awk in transitive includes test Sep 30, 2024
@github-actions
Copy link

github-actions bot commented Sep 30, 2024

⚠️ Python code formatter, darker found issues in your code. ⚠️

You can test this locally with the following command:
darker --check --diff -r a4367d2d136420f562f64e7731b9393fb609f3fc...c43371183ac3d0aef9f8fc66fdc1a4ec2d16deff libcxx/test/libcxx/transitive_includes/to_csv.py libcxx/test/libcxx/header_inclusions.gen.py libcxx/test/libcxx/headers_in_modulemap.sh.py libcxx/test/libcxx/transitive_includes.gen.py libcxx/utils/generate_iwyu_mapping.py libcxx/utils/generate_libcxx_cppm_in.py libcxx/utils/libcxx/header_information.py
View the diff from darker here.
--- test/libcxx/header_inclusions.gen.py	2024-10-16 23:54:58.000000 +0000 +++ test/libcxx/header_inclusions.gen.py	2024-10-17 00:07:22.691172 +0000 @@ -10,34 +10,43 @@ # prescribed by the Standard. # RUN: %{python} %s %{libcxx-dir}/utils import sys + sys.path.append(sys.argv[1]) -from libcxx.header_information import lit_header_restrictions, public_headers, mandatory_inclusions +from libcxx.header_information import ( + lit_header_restrictions, + public_headers, + mandatory_inclusions, +) for header in public_headers: - header_guard = lambda h: f"_LIBCPP_{str(h).upper().replace('.', '_').replace('/', '_')}" + header_guard = ( + lambda h: f"_LIBCPP_{str(h).upper().replace('.', '_').replace('/', '_')}" + ) - # <cassert> has no header guards - if header == 'cassert': - checks = '' - else: - checks = f''' + # <cassert> has no header guards + if header == "cassert": + checks = "" + else: + checks = f""" #ifndef {header_guard(header)} # error <{header}> was expected to define a header guard {header_guard(header)} #endif -''' - for includee in mandatory_inclusions.get(header, []): - checks += f''' +""" + for includee in mandatory_inclusions.get(header, []): + checks += f""" #ifndef {header_guard(includee)} # error <{header}> was expected to include <{includee}> #endif -''' +""" - print(f"""\ + print( + f"""\ //--- {header}.compile.pass.cpp {lit_header_restrictions.get(header, '')} #include <{header}> {checks} -""") +""" + ) --- test/libcxx/headers_in_modulemap.sh.py	2024-10-16 23:59:35.000000 +0000 +++ test/libcxx/headers_in_modulemap.sh.py	2024-10-17 00:07:22.711775 +0000 @@ -1,8 +1,9 @@ # RUN: %{python} %s %{libcxx-dir}/utils import sys + sys.path.append(sys.argv[1]) from libcxx.header_information import all_headers, libcxx_include with open(libcxx_include / "module.modulemap") as f: modulemap = f.read() --- test/libcxx/transitive_includes/to_csv.py	2024-10-16 23:54:58.000000 +0000 +++ test/libcxx/transitive_includes/to_csv.py	2024-10-17 00:07:22.745144 +0000 @@ -14,13 +14,16 @@ import os import pathlib import re import sys -libcxx_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) +libcxx_root = os.path.dirname( + os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) +) sys.path.append(os.path.join(libcxx_root, "utils")) from libcxx.header_information import Header + def parse_line(line: str) -> Tuple[int, str]: """ Parse a single line of --trace-includes output. @@ -31,10 +34,11 @@ raise ArgumentError(f"Line {line} contains invalid data.") # The number of periods in front of the header name is the nesting level of # that header. return (len(match.group(1)), match.group(2)) + def make_cxx_v1_relative(header: str) -> Optional[str]: """ Returns the path of the header as relative to <whatever>/c++/v1, or None if the path doesn't contain c++/v1. @@ -51,10 +55,11 @@ match = re.match(CXX_V1_REGEX, header) if not match: return None else: return match.group(1) + def parse_file(file: io.TextIOBase) -> List[Tuple[Header, Header]]: """ Parse a file containing --trace-includes output to generate a list of the transitive includes contained in it. @@ -79,25 +84,27 @@ else: assert includer is not None result.append((includer, Header(relative))) return result + def print_csv(includes: List[Tuple[Header, Header]]) -> None: """ Print the transitive includes as space-delimited CSV. This function only prints public libc++ headers that are not C compatibility headers. """ # Sort and group by includer by_includer = lambda t: t[0] includes = itertools.groupby(sorted(includes, key=by_includer), key=by_includer) - for (includer, includees) in includes: + for includer, includees in includes: includees = map(lambda t: t[1], includees) for h in sorted(set(includees)): if h.is_public() and not h.is_C_compatibility(): print(f"{includer} {h}") + def main(argv): parser = argparse.ArgumentParser( description=""" Given a list of headers produced by --trace-includes, produce a list of libc++ headers in that output. @@ -106,15 +113,22 @@ information for this script to run. The output of this script is provided in space-delimited CSV format where each line contains: <header performing inclusion> <header being included> - """) - parser.add_argument("inputs", type=argparse.FileType("r"), nargs='+', default=None, - help="One or more files containing the result of --trace-includes") + """ + ) + parser.add_argument( + "inputs", + type=argparse.FileType("r"), + nargs="+", + default=None, + help="One or more files containing the result of --trace-includes", + ) args = parser.parse_args(argv) includes = [line for file in args.inputs for line in parse_file(file)] print_csv(includes) + if __name__ == "__main__": main(sys.argv[1:]) --- utils/generate_libcxx_cppm_in.py	2024-10-16 23:54:58.000000 +0000 +++ utils/generate_libcxx_cppm_in.py	2024-10-17 00:07:22.841463 +0000 @@ -7,11 +7,17 @@ # ===----------------------------------------------------------------------===## import os.path import sys -from libcxx.header_information import module_c_headers, module_headers, header_restrictions, headers_not_available, libcxx_root +from libcxx.header_information import ( + module_c_headers, + module_headers, + header_restrictions, + headers_not_available, + libcxx_root, +) def write_file(module): with open(libcxx_root / "modules" / f"{module}.cppm.in", "w") as module_cpp_in: module_cpp_in.write( --- utils/libcxx/header_information.py	2024-10-17 00:02:30.000000 +0000 +++ utils/libcxx/header_information.py	2024-10-17 00:07:22.925490 +0000 @@ -6,22 +6,26 @@ # # ===----------------------------------------------------------------------===## import os, pathlib, functools -libcxx_root = pathlib.Path(os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))) +libcxx_root = pathlib.Path( + os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) +) libcxx_include = libcxx_root / "include" assert libcxx_root.exists() + def _is_header_file(file): """Returns whether the given file is a header file, i.e. not a directory or the modulemap file.""" return not file.is_dir() and not file.name in [ "module.modulemap", "CMakeLists.txt", "libcxx.imp", "__config_site.in", ] + @functools.total_ordering class Header: _name: str """Relative path from the root of libcxx/include""" @@ -91,11 +95,16 @@ These headers are all C++23-and-later headers, excluding C compatibility headers and experimental headers. """ # These headers have been removed in C++20 so are never part of a module. removed_in_20 = ["ccomplex", "ciso646", "cstdbool", "ctgmath"] - return self.is_public() and not self.is_experimental() and not self.is_C_compatibility() and not self._name in removed_in_20 + return ( + self.is_public() + and not self.is_experimental() + and not self.is_C_compatibility() + and not self._name in removed_in_20 + ) def is_in_modulemap(self) -> bool: """Returns whether a header should be listed in the modulemap.""" # TODO: Should `__config_site` be in the modulemap? if self._name == "__config_site": @@ -139,35 +148,47 @@ def __hash__(self) -> int: return hash(self._name) # Commonly-used sets of headers -all_headers = [Header(p.relative_to(libcxx_include).as_posix()) for p in libcxx_include.rglob("[_a-z]*") if _is_header_file(p)] -all_headers += [Header("__config_site"), Header("__assertion_handler")] # Headers generated during the build process +all_headers = [ + Header(p.relative_to(libcxx_include).as_posix()) + for p in libcxx_include.rglob("[_a-z]*") + if _is_header_file(p) +] +all_headers += [ + Header("__config_site"), + Header("__assertion_handler"), +] # Headers generated during the build process public_headers = [h for h in all_headers if h.is_public()] module_headers = [h for h in all_headers if h.has_cxx20_module()] module_c_headers = [h for h in all_headers if h.has_cxx20_module() and h.is_cstd()] # These headers are not yet implemented in libc++ # # These headers are required by the latest (draft) Standard but have not been # implemented yet. They are used in the generated module input. The C++23 standard # modules will fail to build if a header is added but this list is not updated. -headers_not_available = list(map(Header, [ - "debugging", - "flat_map", - "flat_set", - "generator", - "hazard_pointer", - "inplace_vector", - "linalg", - "rcu", - "spanstream", - "stacktrace", - "stdfloat", - "text_encoding", -])) +headers_not_available = list( + map( + Header, + [ + "debugging", + "flat_map", + "flat_set", + "generator", + "hazard_pointer", + "inplace_vector", + "linalg", + "rcu", + "spanstream", + "stacktrace", + "stdfloat", + "text_encoding", + ], + ) +) header_restrictions = { # headers with #error directives "atomic": "_LIBCPP_HAS_ATOMIC_HEADER", "stdatomic.h": "_LIBCPP_HAS_ATOMIC_HEADER", 
@ldionne ldionne force-pushed the review/transitive-includes-more-portable branch 2 times, most recently from bbeecef to 4e981c9 Compare September 30, 2024 21:20
@ldionne
Copy link
Member Author

ldionne commented Sep 30, 2024

It looks like the issue has to do with Lit's builtin diff command. Will have to keep investigating tomorrow.

@ldionne ldionne force-pushed the review/transitive-includes-more-portable branch from 4e981c9 to 1f80ab9 Compare October 1, 2024 12:52
@ldionne ldionne changed the title [libc++] Stop using awk in transitive includes test [libc++] Fix issue with running transitive includes tests on Docker-in-Docker Oct 1, 2024
@EricWF
Copy link
Member

EricWF commented Oct 2, 2024

So I don't think the diff tool is at fault here, because using /usr/bin/diff fails as well.

On my local machine, I reproduced the issue by creating a /__w/foo/llvm-project directory and building using that.
I suspect there's something wrong with our escaping or regular expressions.

EDIT: And /w/foo/llvm-project works fine. So it's the /__w/ bit that's causing the problem.

Since we don't generate a full dependency graph of headers, we can greatly simplify the script that parses the result of --trace-includes. At the same time, we also unify the mechanism for detecting whether a header is a public/C compat/internal/etc header with the existing mechanism in header_information.py. As a drive-by this fixes the headers_in_modulemap.sh.py test which had been disabled by mistake because it used its own way of determining the list of libc++ headers. By consistently using header_information.py to get that information, problems like this shouldn't happen anymore.
@ldionne ldionne force-pushed the review/transitive-includes-more-portable branch from 1f80ab9 to 162bef9 Compare October 16, 2024 23:56
@ldionne ldionne changed the title [libc++] Fix issue with running transitive includes tests on Docker-in-Docker [libc++] Rewrite the transitive header checking machinery Oct 16, 2024
@ldionne
Copy link
Member Author

ldionne commented Oct 22, 2024

I'll merge this to unblock a chain of dependencies getting longer and longer. I'm willing to take any post-review feedback and I can create a follow-up PR, but for now this fixes some concrete issues preventing us (amongst other things) to move to Docker-in-Docker and update the Docker images, which is blocking the clang-tidy tests re-enablement.

@ldionne ldionne merged commit d6e714b into llvm:main Oct 22, 2024
58 of 59 checks passed
@ldionne ldionne deleted the review/transitive-includes-more-portable branch October 22, 2024 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

github:workflow libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

3 participants