I'm building TensorFlow 2.20.0 from source from the official repo (release branch v2.20.0)
with ROCm and -march=native.

I use:

export HERMETIC_PYTHON_VERSION=3.13
export ROCM_PATH=/opt/rocm/
export PATH="$PATH:/opt/rocm/bin"

Tools:
Python 3.13
Clang 18.1.8
Bazel 7.4.1 (via bazelisk)

These are the versions stated on the official page.

The system is Linux 6.19.6-arch1-1 x86_64.

Command I use:

bazel build //tensorflow/tools/pip_package:wheel --repo_env=USE_PYWRAP_RULES=1 --repo_env=WHEEL_NAME=tensorflow_cpu --copt=-w --config=nonccl -j 16 

But the build keeps failing when compiling the FFT code (ducc), with multiple errors like this:

call of overloaded ‘raw(ptrdiff_t)’ is ambiguous
if (src == &dst.raw(it.oofs(0))) return; // in-place
bazel-out/k8-opt/bin/external/ducc/_virtual_includes/fft/ducc/src/ducc0/fft/fftnd_impl.h:486:22: note: there are 2 candidates
external/ducc/src/ducc0/infra/mav.h:118:35: note: candidate 1: ‘const T& ducc0::detail_mav::cmembuf<T>::raw(I) const [with I = long int; T = double]’
118 | template<typename I> const T &raw(I i) const

The configuration generated by ./configure is:

build --action_env PYTHON_BIN_PATH="/usr/bin/python3.13"

build --action_env PYTHON_LIB_PATH="/usr/lib/python3.13/site-packages"

build --python_path="/usr/bin/python3.13"

build --config=rocm

build --action_env ROCM_PATH="/opt/rocm/"

build --action_env CLANG_COMPILER_PATH="/usr/lib/llvm18/bin/clang-18"

build --repo_env=CC=/usr/lib/llvm18/bin/clang-18

build --repo_env=BAZEL_COMPILER=/usr/lib/llvm18/bin/clang-18

build:opt --copt=-march=native

build:opt --host_copt=-march=native

test --test_size_filters=small,medium

test --test_env=LD_LIBRARY_PATH

test:v1 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial

test:v1 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu

test:v2 --test_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-oss_serial,-v1only

test:v2 --build_tag_filters=-benchmark-test,-no_oss,-oss_excluded,-no_gpu,-v1only

1 Answer

I managed to build it with the following parameters:

bazel build //tensorflow/tools/pip_package:wheel \
  --repo_env=USE_PYWRAP_RULES=1 \
  --repo_env=WHEEL_NAME=tensorflow_cpu \
  --copt=-w \
  --copt=-O3 \
  --copt=-march=znver4 \
  --copt=-mtune=znver4 \
  --copt=-flto=thin \
  --copt=-fomit-frame-pointer \
  --copt=-fstrict-aliasing \
  --copt=-DEIGEN_MAX_ALIGN_BYTES=64 \
  --copt=-DEIGEN_VECTORIZE_AVX2 \
  --copt=-fno-math-errno \
  --copt=-fno-trapping-math \
  --copt=-fassociative-math \
  --copt=-fprofile-generate \
  --cxxopt=-std=c++17 \
  --cxxopt=-include \
  --cxxopt=cstdint \
  --linkopt=-flto=thin \
  --linkopt=-Wl,-O3 \
  --linkopt=-Wl,--as-needed \
  --linkopt=-fprofile-generate \
  --strip=always \
  --experimental_strict_action_env \
  --jobs=16 \
  --local_resources=memory=32768

What actually solved the ambiguous-overload errors was

--cxxopt=-std=c++17

The next error was then fixed by

--cxxopt=-include --cxxopt=cstdint

which force-includes the <cstdint> header into every C++ translation unit.
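As a minimal sketch, the two fixes can be kept in a small bazelrc fragment instead of being retyped on every invocation. The file name cxx17.bazelrc and the scratch directory are hypothetical, chosen only for illustration; nothing here calls Bazel itself.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: a scratch directory stands in for the TensorFlow checkout.
workspace="$(mktemp -d)"
# Hypothetical file name; any path works.
rc_file="${workspace}/cxx17.bazelrc"

# Same three options as in the answer. "-include" and "cstdint" stay in
# two separate --cxxopt entries, exactly as they were passed there.
cat > "${rc_file}" <<'EOF'
build --cxxopt=-std=c++17
build --cxxopt=-include
build --cxxopt=cstdint
EOF

# Show what was written.
cat "${rc_file}"
```

Such a file can then be handed to Bazel with the --bazelrc startup option, for example bazel --bazelrc=/path/to/cxx17.bazelrc build followed by the target and the remaining flags.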

With these flags the build worked for a nonccl, non-MKL configuration without CUDA and without ROCm.
I haven't tried CUDA (I have no suitable hardware), but enabling ROCm or MKL makes the build fail again.

MKL can be enabled later via an environment variable (according to ChatGPT).

The lack of ROCm is not a big deal for me (hardware limitations), but if you need it, you are out of luck for now.

If you figure out how to build it with ROCm, please post it as an answer here.

Meanwhile, TF 2.21.0 has been released, and it cannot be built with the options I specified here.

Note that

--copt=-march=znver4 --copt=-mtune=znver4

are specific to the AMD Zen 4 architecture; in the general case

--copt=-march=native --copt=-mtune=native

is enough.
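The choice between the two variants can be sketched as a small helper that prints the matching pair of options. The function name arch_flags is made up for this example; it only formats strings and does not invoke a compiler or Bazel.

```shell
#!/usr/bin/env bash

# arch_flags [ARCH]
# Prints the --copt pair for the given -march/-mtune value.
# Falls back to "native" when no architecture is given.
arch_flags() {
  local arch="${1:-native}"
  printf -- '--copt=-march=%s --copt=-mtune=%s\n' "${arch}" "${arch}"
}

arch_flags          # general case: the compiler targets the build machine
arch_flags znver4   # pinned to AMD Zen 4, as in the answer
```

Keep in mind that a wheel built with -march=native is tuned to the build machine and may not run on CPUs that lack the same instruction set extensions.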

Also note that

--copt=-fprofile-generate and --linkopt=-fprofile-generate

produce an instrumented build that gathers profiling data, so you need either to remove these flags entirely or to rebuild TensorFlow once more with

--copt=-fprofile-use and --linkopt=-fprofile-use

instead.
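The three options described above (instrumented build, profile-guided rebuild, or no profiling at all) can be sketched as a helper that returns the flag pair for each phase. The function name pgo_flags and the phase names are assumptions made for this example; the flags themselves are the ones from the answer.

```shell
#!/usr/bin/env bash

# pgo_flags PHASE
# Prints the Bazel options for one phase of a profile-guided build:
#   generate  first build, instrumented to collect profile data
#   use       second build, optimized with the collected profile
#   none      plain build without any profiling flags
pgo_flags() {
  case "${1:-}" in
    generate) printf '%s\n' "--copt=-fprofile-generate --linkopt=-fprofile-generate" ;;
    use)      printf '%s\n' "--copt=-fprofile-use --linkopt=-fprofile-use" ;;
    none)     printf '\n' ;;
    *)        echo "usage: pgo_flags generate|use|none" >&2; return 1 ;;
  esac
}

pgo_flags generate
pgo_flags use
```

Between the two builds, the instrumented binary has to run a representative workload so that profile data actually exists. Depending on the compiler, the data may also need post-processing: with Clang, raw profiles are typically merged with llvm-profdata and the result is passed as -fprofile-use=<path>.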
