Integrate libbacktrace for enhanced stack trace resolution #7721

Open
eddyashton wants to merge 7 commits into microsoft:main from eddyashton:try_libbacktrace

@eddyashton
Member

Closes #7714.

BEFORE (Release):

2026-03-06T15:57:58.537393Z 0 [fatal] CCF/src/tasks/worker.cpp:104 | BasicTask task failed with exception: I went boom on purpose
2026-03-06T15:57:58.537544Z 0 [fatal] CCF/src/tasks/worker.cpp:109 | Stack trace:
  #0: ./js_generic(__cxa_throw+0x31) [0x592b5265ffa1]
  #1: ./js_generic(+0x398c99) [0x592b52430c99]
  #2: ./js_generic(+0x5bf1fe) [0x592b526571fe]
  #3: ./js_generic(+0x3e6cd1) [0x592b5247ecd1]
  #4: ./js_generic(+0x1ffac5) [0x592b52297ac5]
  #5: ./js_generic(+0x1ca502) [0x592b52262502]
  #6: ./js_generic(+0x634cb) [0x592b520fb4cb]
  #7: /usr/lib/libstdc++.so.6(+0xec013) [0x79c7a4465013]
  #8: /usr/lib/libc.so.6(+0x8bca7) [0x79c7a40b2ca7]
  #9: /usr/lib/libc.so.6(+0x10fb1c) [0x79c7a4136b1c]

AFTER (Release):

2026-03-06T16:06:31.506310Z 0 [fatal] CCF/src/tasks/worker.cpp:166 | BasicTask task failed with exception: I went boom on purpose
2026-03-06T16:06:31.569938Z 0 [fatal] CCF/src/tasks/worker.cpp:171 | Stack trace:
  #0: __cxa_throw
  #1: ccf::JwtKeyAutoRefresh::start()::{lambda()#1}::operator()() const
  #2: ccf::tasks::BaseTask::do_task()
  #3: ccf::tasks::try_do_task(ccf::tasks::BaseTask&, bool)
  #4: ccf::Enclave::run_main()
  #5: ccf::enclave_run()
  #6: std::thread::_State_impl<std::thread::_Invoker<std::tuple<ccf::run_enclave_threads(host::CCHostConfig const&)::$_0, unsigned int> > >::_M_run()
  #7: execute_native_thread_routine
  #8: start_thread
  #9: clone
  #10: 0xffffffffffffffff

AFTER (RelWithDebInfo):

2026-03-06T16:28:41.220970Z 0 [fatal] CCF/src/tasks/worker.cpp:169 | BasicTask task failed with exception: I went boom on purpose
2026-03-06T16:28:41.810765Z 0 [fatal] CCF/src/tasks/worker.cpp:174 | Stack trace:
  #0: __cxa_throw at CCF/build/CCF/src/tasks/worker.cpp:208
  #1: ccf::JwtKeyAutoRefresh::start()::{lambda()#1}::operator()() const at CCF/src/node/jwt_key_auto_refresh.h:65
  #2: ccf::tasks::BaseTask::do_task() at CCF/build/CCF/src/tasks/task_system.cpp:32
  #3: ccf::tasks::try_do_task(ccf::tasks::BaseTask&, bool) at CCF/src/tasks/worker.h:32
  #4: ccf::Enclave::run_main() at CCF/src/enclave/enclave.h:394
  #5: ccf::enclave_run() at CCF/build/CCF/src/enclave/main.cpp:179
  #6: std::thread::_State_impl<std::thread::_Invoker<std::tuple<ccf::run_enclave_threads(host::CCHostConfig const&)::$_0, unsigned int> > >::_M_run() at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/13.2.0/../../../../include/c++/13.2.0/bits/std_thread.h:244
  #7: execute_native_thread_routine
  #8: start_thread
  #9: clone
  #10: 0xffffffffffffffff
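
The per-frame layout in the AFTER outputs above (function name, then an optional "at file:line", with unresolved frames printed as a raw address) could be produced by something like the following. This is only a sketch: `ResolvedFrame` and `format_frame` are hypothetical names for illustration and are not taken from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical per-frame record, for illustration only.
struct ResolvedFrame
{
  bool resolved;
  std::string function;
  std::string filename;
  int lineno;
  std::uintptr_t pc;
};

// Formats one frame as "#N: function at file:line".
// Falls back to the raw address when nothing was resolved, and omits
// the "at ..." suffix when no file name is known (as in Release builds).
std::string format_frame(std::size_t index, const ResolvedFrame& f)
{
  std::ostringstream oss;
  oss << "#" << index << ": ";
  if (!f.resolved)
  {
    oss << "0x" << std::hex << f.pc;
    return oss.str();
  }
  oss << f.function;
  if (!f.filename.empty())
  {
    oss << " at " << f.filename << ":" << f.lineno;
  }
  return oss.str();
}
```

With no file name, the same function yields the shorter Release-style line, so a single formatter can serve both build types.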
@eddyashton eddyashton requested a review from a team as a code owner March 6, 2026 16:51
Copilot AI review requested due to automatic review settings March 6, 2026 16:51
Contributor

Copilot AI left a comment

Pull request overview

Integrates libbacktrace into CCF’s task worker exception handling to produce higher-fidelity stack traces (function names, and optionally file/line with debug info) compared to the previous execinfo/backtrace_symbols approach.

Changes:

  • Replace execinfo-based stack trace formatting with libbacktrace-based DWARF-aware resolution in src/tasks/worker.cpp.
  • Update task-system unit test expectations and prevent inlining of call-chain helpers to make stack frames stable in optimised builds.
  • Add CI package installation and CMake linkage for libbacktrace.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| src/tasks/worker.cpp | Uses libbacktrace to capture/resolve throw-point stack traces and demangle symbols. |
| src/tasks/test/basic_tasks.cpp | Adjusts stack-trace assertions and marks helper functions noinline to preserve frames. |
| scripts/setup-ci.sh | Installs libbacktrace-static in CI images. |
| CMakeLists.txt | Removes Debug-only -rdynamic/-fno-omit-frame-pointer flags and links ccf_tasks to libbacktrace. |
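
The demangling step mentioned for src/tasks/worker.cpp might look roughly like this. It is a minimal sketch built on `abi::__cxa_demangle` from `<cxxabi.h>`; the helper name `demangle` and the `FreeDeleter` type echo identifiers visible in the diff, but the bodies here are assumptions rather than the PR's code. The `_Z` prefix check is also an assumption (Itanium-ABI mangling on Linux) that avoids handing plain C symbols such as `clone` to the demangler.

```cpp
#include <cxxabi.h>

#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>

// __cxa_demangle returns a malloc'd buffer, so release it with free().
struct FreeDeleter
{
  void operator()(char* p) const
  {
    std::free(p);
  }
};

// Returns the demangled form of name, or name unchanged if it does not
// look like an Itanium-ABI mangled symbol or cannot be demangled.
std::string demangle(const char* name)
{
  if (name == nullptr)
  {
    return "";
  }
  std::string raw(name);
  if (raw.rfind("_Z", 0) != 0)
  {
    return raw; // not a mangled C++ symbol (assumption: Linux-style "_Z")
  }
  int status = 0;
  std::unique_ptr<char, FreeDeleter> demangled(
    abi::__cxa_demangle(name, nullptr, nullptr, &status));
  if (status == 0 && demangled != nullptr)
  {
    return demangled.get();
  }
  return raw;
}
```

For example, `demangle("_ZN3ccf5tasks8BaseTask7do_taskEv")` gives `ccf::tasks::BaseTask::do_task()`, while `demangle("start_thread")` is returned as-is.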
Comment on lines +81 to 95
+int pcinfo_callback(
+  void* data,
+  uintptr_t /*pc*/,
+  const char* filename,
+  int lineno,
+  const char* function)
 {
-  // backtrace_symbols format: "binary(mangled+0xoffset) [0xaddr]"
-  // Try to extract and demangle the symbol name between '(' and '+'/')'
-  std::string entry(raw);
-  auto open = entry.find('(');
-  auto plus = entry.find('+', open != std::string::npos ? open : 0);
-  auto close = entry.find(')', open != std::string::npos ? open : 0);
-
-  if (
-    open != std::string::npos && close != std::string::npos &&
-    close > open + 1)
+  auto* result = static_cast<PcinfoResult*>(data);
+  if (function != nullptr)
   {
-    auto end = (plus != std::string::npos && plus < close) ? plus : close;
-    std::string mangled = entry.substr(open + 1, end - open - 1);
-
-    if (!mangled.empty())
-    {
-      int status = 0;
-      std::unique_ptr<char, FreeDeleter> demangled(
-        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status));
-      if (status == 0 && demangled != nullptr)
-      {
-        std::string rest = entry.substr(end);
-        entry = entry.substr(0, open + 1) + demangled.get() + rest;
-      }
-    }
+    result->resolved = true;
+    result->function = demangle(function);
+    result->filename = (filename != nullptr) ? filename : "";
+    result->lineno = lineno;
   }

Copilot AI Mar 6, 2026

pcinfo_callback only marks the frame as resolved and records file/line when function != nullptr. libbacktrace can provide filename/lineno even when function is null; with the current logic, those frames will be treated as unresolved and printed as raw addresses. Consider setting resolved (and recording filename/lineno) whenever any of {function, filename, lineno} is present, and using a placeholder like <unknown> when the function name is missing.
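
One way to act on this suggestion is sketched below. The parameter list matches libbacktrace's `backtrace_full_callback`, so the function can be exercised without linking the library. `PcinfoResult` is redeclared here with assumed fields based on the hunk above, and demangling is left out to keep the sketch self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Assumed shape of the PR's result struct; field names follow the diff.
struct PcinfoResult
{
  bool resolved = false;
  std::string function;
  std::string filename;
  int lineno = 0;
};

// Marks the frame as resolved when any of function, filename or lineno
// is available, using "<unknown>" when only the location is known.
int pcinfo_callback(
  void* data,
  std::uintptr_t /*pc*/,
  const char* filename,
  int lineno,
  const char* function)
{
  auto* result = static_cast<PcinfoResult*>(data);
  const bool has_function = function != nullptr;
  const bool has_location = filename != nullptr || lineno > 0;
  if (has_function || has_location)
  {
    result->resolved = true;
    // The PR passes the name through demangle(); omitted in this sketch.
    result->function = has_function ? function : "<unknown>";
    result->filename = (filename != nullptr) ? filename : "";
    result->lineno = lineno;
  }
  return 0; // zero asks libbacktrace to continue
}
```

Frames for which all three inputs are missing stay unresolved and can still be printed as raw addresses.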

CMakeLists.txt Outdated
Comment on lines +282 to +283
+find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
+target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})

Copilot AI Mar 6, 2026

This changes user-visible behaviour (stack traces in fatal logs) and adds a new dependency. Per repo guidance, user-facing behaviour changes should be recorded in CHANGELOG.md. Please add an entry describing the improved stack trace resolution and the new libbacktrace dependency.

Suggested change
-find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
-target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})
+find_library(BACKTRACE_LIBRARY backtrace)
+if(BACKTRACE_LIBRARY)
+  target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS} ${BACKTRACE_LIBRARY})
+else()
+  target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS})
+endif()
CMakeLists.txt Outdated
target_link_libraries(ccf_tasks PRIVATE ${CMAKE_DL_LIBS})
# libbacktrace reads DWARF debug info directly, providing file/line/function
# resolution in stack traces without requiring -rdynamic.
find_library(BACKTRACE_LIBRARY backtrace REQUIRED)

Copilot AI Mar 6, 2026

This introduces a hard build dependency on libbacktrace (find_library(... REQUIRED) + include <backtrace.h>). If this is intended to be mandatory, it should be surfaced with a clearer configure-time message and documented as a new build prerequisite (and/or made optional with a fallback to the previous implementation when libbacktrace isn't available).

Suggested change
-find_library(BACKTRACE_LIBRARY backtrace REQUIRED)
+find_library(BACKTRACE_LIBRARY backtrace)
+if(NOT BACKTRACE_LIBRARY)
+  message(
+    FATAL_ERROR
+    "libbacktrace (library 'backtrace') is required to build the CCF task "
+    "system (target ccf_tasks). Please install libbacktrace and retry "
+    "configuration."
+  )
+endif()
Comment on lines 201 to +219
 auto& trace = ccf::tasks::current_throw_trace;
-trace.num_frames =
-  backtrace(trace.frames, ccf::tasks::throw_trace_max_frames);
+trace.num_frames = 0;
+auto* bt_state = ccf::tasks::get_backtrace_state();
+if (bt_state != nullptr)
+{
+  backtrace_simple(
+    bt_state,
+    0, // skip = 0, capture from here
+    [](void* data, uintptr_t pc) -> int {
+      auto* t = static_cast<ccf::tasks::ThrowTrace*>(data);
+      if (t->num_frames < ccf::tasks::throw_trace_max_frames)
+      {
+        t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
+      }
+      return 0;
+    },
+    nullptr, // ignore errors
+    &trace);
+}

Copilot AI Mar 6, 2026

If backtrace_create_state() fails (so bt_state == nullptr), this code records trace.num_frames = 0 and captures nothing. Downstream, this means exceptions may log with no stack trace at all. Consider adding a fallback capture path when bt_state is null (for example, a glibc backtrace() fallback, or a secondary libbacktrace state initialised with an error callback so failures are visible).
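
The glibc fallback mentioned here could be as small as the sketch below, called from the branch where `bt_state` is null. It relies on `backtrace()` from `<execinfo.h>`, which is glibc-specific. `ThrowTrace`, its field types and the 64-frame capacity are assumptions for illustration, and `capture_with_execinfo` is a hypothetical helper name.

```cpp
#include <execinfo.h> // glibc-specific: backtrace()

#include <cassert>

// Assumed capacity and layout; the PR's real definitions may differ.
constexpr int throw_trace_max_frames = 64;

struct ThrowTrace
{
  void* frames[throw_trace_max_frames] = {};
  int num_frames = 0;
};

// Fallback capture for when no libbacktrace state is available.
// Stores raw return addresses only; symbol resolution happens later.
int capture_with_execinfo(ThrowTrace& trace)
{
  const int n = backtrace(trace.frames, throw_trace_max_frames);
  trace.num_frames = (n > 0) ? n : 0;
  return trace.num_frames;
}
```

The addresses captured this way have the same `void*` form as those stored by the `backtrace_simple` callback, so the later resolution step would not need to know which path produced them.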

Comment on lines +206 to +216
+  backtrace_simple(
+    bt_state,
+    0, // skip = 0, capture from here
+    [](void* data, uintptr_t pc) -> int {
+      auto* t = static_cast<ccf::tasks::ThrowTrace*>(data);
+      if (t->num_frames < ccf::tasks::throw_trace_max_frames)
+      {
+        t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
+      }
+      return 0;
+    },

Copilot AI Mar 6, 2026

The backtrace_simple callback stores every pc value it receives and always returns 0 (continue). In the example output, this results in a bogus trailing frame 0xffffffffffffffff. Consider stopping/ignoring sentinel PCs (eg pc == 0 or pc == std::numeric_limits<uintptr_t>::max()) by returning non-zero and/or skipping these values so they don’t appear as frames.
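
A callback along these lines would drop the sentinel frame. It is written as a free function with the `backtrace_simple_callback` parameter list so it can be tested without libbacktrace; `ThrowTrace` and the capacity constant are assumed shapes, and `capture_pc` is a hypothetical name. This sketch stops the walk at a sentinel rather than skipping it, which is one of the two options suggested.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Assumed capacity and layout; the PR's real definitions may differ.
constexpr std::size_t throw_trace_max_frames = 64;

struct ThrowTrace
{
  void* frames[throw_trace_max_frames] = {};
  std::size_t num_frames = 0;
};

// Records pc unless it is a sentinel value or the buffer is full.
// A non-zero return makes backtrace_simple stop walking the stack.
int capture_pc(void* data, std::uintptr_t pc)
{
  auto* t = static_cast<ThrowTrace*>(data);
  if (pc == 0 || pc == std::numeric_limits<std::uintptr_t>::max())
  {
    return 1; // sentinel: end of the useful part of the stack
  }
  if (t->num_frames >= throw_trace_max_frames)
  {
    return 1; // buffer full, no point continuing
  }
  t->frames[t->num_frames++] = reinterpret_cast<void*>(pc); // NOLINT
  return 0;
}
```

Because `backtrace_simple` returns whatever non-zero value the callback returned, a caller that checks its result would need to treat 1 as an early stop rather than a failure.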

# backtrace_symbols can resolve function names in stacktraces, and preserve
# frame pointers so that backtrace() can walk the full call stack.
add_link_options($<$<CONFIG:Debug>:-rdynamic>)
add_compile_options($<$<CONFIG:Debug>:-fno-omit-frame-pointer>)

Member

Not omitting frame pointers still seems useful?

auto* bt_state = ccf::tasks::get_backtrace_state();
if (bt_state != nullptr)
{
backtrace_simple(

Member

Can we call backtrace_full() depending on NDEBUG? It's not obvious from the docs to what extent full is nicer than simple, but it sounds like it might be?

for (int i = 0; i < num_frames; ++i)
{
oss << " #" << i << ": " << demangle_symbol(symbols.get()[i]) << "\n";
auto pc = reinterpret_cast<uintptr_t>(frames[i]);

Member

Do we need this?

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/tasks/test/basic_tasks.cpp:402

  • The stack-trace assertions are still compiled out under #ifndef NDEBUG. However, this PR’s goal is improved symbol resolution in Release/RelWithDebInfo, and the call-chain helpers are now marked noinline specifically to keep frames stable under optimisation. Consider enabling these assertions in non-Debug builds too (or gating on the presence of backtrace support rather than NDEBUG) so the new Release behaviour is covered by CI/release workflows.
  // Verify demangled function names appear in the stack traces
#ifndef NDEBUG
  // ThrowsException call chain
  REQUIRE(logger_ptr->contains("level_3_throws_runtime_error"));
  REQUIRE(logger_ptr->contains("level_2_calls_level_3"));
  REQUIRE(logger_ptr->contains("level_1_calls_level_2"));
  // ThrowsUnknown call chain
  REQUIRE(logger_ptr->contains("level_2_calls_level_3_int"));
  REQUIRE(logger_ptr->contains("level_1_calls_level_2_int"));
#endif


Labels

None yet

3 participants