rust dependencies spoil debug symbol loading in gdb #21836

Open
rpoyner-tri opened this issue Aug 21, 2024 · 14 comments
Labels
component: build system Bazel, CMake, dependencies, memory checkers, linters type: bug

Comments

@rpoyner-tri
Contributor

What happened?

  • Do a debug build.
  • Try to debug any unit test that has clarabel_solver in its dependency tree (almost anything now that geometry/proximity has the dependency).
  • Notice that source line viewing and line numbers in general are not available.

Version

master circa 1.32.0

What operating system are you using?

Ubuntu 22.04

What installation option are you using?

compiled from source code using Bazel

Relevant log output

rico@PUGET-255560:~/checkout/drake$ bazel test -c dbg //geometry/proximity/...
INFO: Analyzed 730 targets (0 packages loaded, 0 targets configured).
INFO: Found 235 targets and 495 test targets...
INFO: Elapsed time: 0.216s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action

Executed 0 out of 495 tests: 495 tests pass.
rico@PUGET-255560:~/checkout/drake$ gdb --args ./bazel-bin//geometry/proximity/inflate_mesh_test
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./bazel-bin//geometry/proximity/inflate_mesh_test...
warning: Missing auto-load script at offset 0 in section .debug_gdb_scripts
of file /home/rico/.cache/bazel/_bazel_rico/7f8997f28c9253517a55d673c67a6c74/execroot/drake/bazel-out/k8-dbg/bin/geometry/proximity/inflate_mesh_test.
Use `info auto-load python-scripts [REGEXP]' to list them.
(gdb) b inflate_mesh_test.cc:91
No source file named inflate_mesh_test.cc.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) quit
@rpoyner-tri rpoyner-tri added type: bug component: build system Bazel, CMake, dependencies, memory checkers, linters labels Aug 21, 2024
@rpoyner-tri
Contributor Author

More detail: debugging a non-clarabel-infected program still works. Hacking out the clarabel dependency also restores full debug symbols.

@RussTedrake
Contributor

Nice! I had noticed this, too, but had not investigated the root cause.

@rpoyner-tri
Contributor Author

I suspect some cure can be found by reading up on Rust debugging. I've seen evidence of Rust users hitting some version of this problem.

@rpoyner-tri
Contributor Author

Well, here are some completely unsatisfying partial answers.

  • get any source file line number symbols at all: add rustc_flags = ["--emit=obj"], to the Rust static library rule in clarabel_cpp_internal. This yields line info for Rust modules and for some Drake dependencies, but not for Drake's own modules.
  • get all source file line number symbols: switch to the mold linker (https://github.com/rui314/mold). This yields line number symbols from all modules, but (currently) requires a local build of mold and some Bazel configuration I haven't yet discovered. I forced the issue by redirecting the symlink at /usr/bin/ld.gold, which is clearly unacceptable as a real fix.

@rpoyner-tri
Contributor Author

rpoyner-tri commented Aug 25, 2024

It appears that noble (binutils 2.42) has at least some of the same problems. Forcing bazel to use mold (via apt-get on noble) results in an executable with all line numbers.

@jwnimmer-tri
Collaborator

FYI workaround: add --define=NO_CLARABEL=ON to your bazel command line to opt-out of Clarabel.

@sherm1

This comment was marked as off-topic.

@jwnimmer-tri

This comment was marked as off-topic.

@rpoyner-tri
Contributor Author

Now that #21961 has landed, using mold as a replacement linker should be a usable workaround for debug builds.

$ apt install mold  # The version in jammy is good enough
$ bazel test -c dbg --linkopt=-fuse-ld=mold //geometry/proximity:inflate_mesh_test
$ gdb --args ./bazel-bin/geometry/proximity/inflate_mesh_test
(gdb) b inflate_mesh_test.cc:50
(gdb) r
# breakpoint is hit, and `list` command works
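
The manual check above can also be scripted. The following is a minimal sketch, not part of Drake: the helper name gdb_can_resolve is made up for illustration, and it assumes that gdb's `info line` command prints a line beginning with `Line N of "file"` when it can map the location, which is the behavior I would expect from the gdb 12.1 shown earlier in this thread.

```shell
# gdb_can_resolve BINARY FILE:LINE   (hypothetical helper name)
# Succeeds only if gdb can map FILE:LINE to line info in BINARY.
gdb_can_resolve() {
  binary="$1"
  location="$2"
  # -batch runs the -ex command and exits without prompting.
  # On the broken binaries gdb prints "No source file named ..." instead
  # of a "Line N of ..." record, so the grep fails.
  gdb -batch -ex "info line ${location}" "${binary}" 2>&1 |
    grep -Eq '^Line [0-9]+ of "'
}

# Example (not run here):
#   gdb_can_resolve ./bazel-bin/geometry/proximity/inflate_mesh_test \
#       inflate_mesh_test.cc:50 && echo "line info present"
```

Matching on gdb's output rather than its exit status keeps the check independent of how a given gdb version reports errors in batch mode.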

@jwnimmer-tri
Collaborator

In that case, can we automate the fix (and thus close the issue)? We can easily add mold to setup/ubuntu/source_distribution/packages-*-test-only.txt. The question is how to add linkopt to tools/ubuntu.bazelrc in some fashion, ideally only for debug builds?

@rpoyner-tri
Contributor Author

I've been puzzling over the "how to add flags to bazel only for debug" for some time. Still got nothin'. But, I haven't gone dumpster-diving in bazel source code.
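
For reference, one partial alternative that does not solve the "automatically for -c dbg" question: a bazelrc config group can bundle the two flags so that a single opt-in switch selects both. The sketch below is an assumption-laden illustration; the config name mold_dbg and the helper add_mold_debug_config are hypothetical, and the choice of rc file is left to the caller.

```shell
# add_mold_debug_config RC_FILE   (hypothetical helper name)
# Appends an opt-in config group named "mold_dbg" (also hypothetical)
# to the given bazelrc file.
add_mold_debug_config() {
  rc_file="$1"
  cat >> "${rc_file}" <<'EOF'
# Debug build linked with mold; enable with --config=mold_dbg.
build:mold_dbg --compilation_mode=dbg
build:mold_dbg --linkopt=-fuse-ld=mold
EOF
}

# Example (not run here):
#   add_mold_debug_config "$HOME/.bazelrc"
#   bazel test --config=mold_dbg //geometry/proximity:inflate_mesh_test
```

Because `bazel test` inherits `build:` options, the group applies to test invocations too. A plain `-c dbg` without `--config=mold_dbg` would still use the default linker, so this is a convenience, not the automation being asked for.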

@jwnimmer-tri
Collaborator

If we're willing to do it in Starlark, we could imagine using select({"//tools/cc_toolchain:debug" ... }) like we do to toggle some tests currently. If we only need to change test program linking, maybe that's not too bad.

@rpoyner-tri
Contributor Author

rpoyner-tri commented Sep 30, 2024

Probably adding some starlark to drake_cc*test and drake_cc*binary would cover it.

@rpoyner-tri
Contributor Author

A minor impediment to automating mold usage: the Jammy package is old enough that builds will fail the exported_symbols_test, but pass with a late-model built-from-source mold.
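
If automation ends up depending on the mold version, a guard along these lines could decide whether the installed mold is new enough. This is only a sketch: the thread does not say which version is sufficient, so the minimum is a parameter rather than a known value, both helper names are hypothetical, and the parser assumes `mold --version` prints a line of the form `mold <version> ...`.

```shell
# parse_mold_version: reads `mold --version` output on stdin and prints
# the version number (assumes the line starts with "mold <version>").
parse_mold_version() {
  awk '$1 == "mold" { print $2; exit }'
}

# version_at_least HAVE WANT: succeeds if HAVE >= WANT.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (not run here); MIN_MOLD is a placeholder you must choose:
#   have="$(mold --version | parse_mold_version)"
#   version_at_least "$have" "$MIN_MOLD" || echo "mold $have is too old"
```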


4 participants