Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate removing spdlog from all public APIs #104

Open
vyasr opened this issue Sep 25, 2024 · 6 comments
Open

Investigate removing spdlog from all public APIs #104

vyasr opened this issue Sep 25, 2024 · 6 comments
Assignees

Comments

@vyasr
Copy link
Contributor

vyasr commented Sep 25, 2024

Currently RAPIDS makes use of spdlog and fmt in a few places that I am aware of. rmm, cudf, and cuml (please add to this list if there are more) all have their own loggers that expose public APIs to the user for capturing relevant information. This makes spdlog (and transitively, fmt) a build-time requirement of the packages (cudf, rmm, cuml). This is expected and unavoidable. However, it turns out that spdlog and fmt are actually also runtime requirements of these packages. At first glance this may seem surprising since rmm constrains its consumers to use the header-only versions of these libraries, so at runtime spdlog/fmt libraries do not actually have to exist on the system. The sticking point is that in addition to exposing their own logging APIs, rmm and cudf also allow the user to get a reference to the global logger directly (cudf, rmm). That means that even if they use the header-only spdlog, they need to constrain the ABIs of any other libraries within the same environment to ensure that compatible versions of spdlog are being used so that spdlog objects can be safely passed across library boundaries. In practice, in conda this runtime requirement is enforced via spdlog's (and fmt's) run exports. Unfortunately, in a conda-forge world this constraint becomes quite problematic because spdlog (and fmt) is such a central dependency that many libraries require it, and as a result the preferred version is globally pinned and managed for all conda-forge package builds. Even though RAPIDS publishes packages to a different channel, we aim for compatibility with conda-forge since it is the source for most community packages that our users want to use with RAPIDS libraries. As a result, whenever conda-forge changes its global pinning RAPIDS needs to also upgrade in lockstep to ensure that conda environments will solve. This results in meaningful pain for our consumers, who are forced to wait for us to upgrade, and adds work every few months when RAPIDS decides to upgrade. Ideally, we would find a way to avoid this problem.

cuml's logger implementation is a good example of how we might avoid this problem. cuml hides spdlog behind forward declarations and does not directly expose any symbols involving spdlog. In theory, it would be safe for cuml to ignore the run exports from spdlog entirely. However, there a few caveats: 1) cuml is currently implicitly relying on rmm to bring in spdlog in its build system, and is therefore at the mercy of upstream changes to how spdlog is used; 2) despite the way the code is designed, the cuml conda recipe is not configured to ignore the run exports, so it still has a direct runtime dependence on spdlog; and 3) even if the run exports were set up correctly, cuml would still inherit a transitive runtime dependence on spdlog from rmm.

To resolve this problem I propose that:

  1. We deprecate and remove the functions from cudf and rmm that allow users to interact directly with spdlog objects.
  2. Ensure that all references to spdlog are removed from the public headers of rmm, cudf, and cuml
  3. We remove spdlog and fmt from the requirements of these packages and rely on CPM to fetch them instead. rapids-cmake will still manage a central version, FWIW.

This proposal is along the lines of two similar proposals I made within the past year around removing protobuf and arrow as runtime dependencies for similar reasons.

Critical note
Note that since rmm is header-only, the logger implementation is still going to exist directly in rmm's public headers, and therefore it is technically still part of the public API. Therefore, removing the fmt/spdlog constraint from rmm shifts the responsibility to the consuming library not to expose spdlog symbols in their own public headers. One alternative would be to have the librmm conda package ignore the run exports from spdlog/fmt but then to have librmm's own run export list include spdlog/fmt. That would mean that while spdlog/fmt would not be runtime requirements of librmm, they would become runtime requirements of any consumer of librmm. The benefit of that approach is that it chooses the safest path by default while allowing consumers to ignore the run exports from librmm if they wished to. The downside is that consumers would also have to be careful to pin_compatible librmm itself when ignoring the run exports since presumably they wish to retain ABI-compatibility guarantees with librmm itself (since e.g. all RAPIDS repos expose rmm symbols as part of their public interface in order to support stream-ordering and memory resource management).

P.S.
This issue is proposing removing a current feature of rmm and cudf, namely the direct access to the underlying global spdlog loggers managed by those libraries. Based on offline discussions, removing this feature is acceptable, and if there are specific bits of functionality that we need to expose we can always enrich the rmm/cudf logger APIs to provide that by wrapping spdlog calls.

@bdice
Copy link
Contributor

bdice commented Sep 25, 2024

From what I know, this sounds like a good plan. I support it.

@bdice
Copy link
Contributor

bdice commented Sep 27, 2024

xref: rapidsai/cudf#16951

@ava6969
Copy link

ava6969 commented Sep 28, 2024

Please very needed, I use the cudf and cuml c++ library, and currently I have a forked repo for all 3 of them were i just comment out all rapid call to find fmt and spdlog. also

@jameslamb
Copy link
Member

Unfortunately, in a conda-forge world this constraint becomes quite problematic because spdlog (and fmt) is such a central dependency that many libraries require it...

Brand new example of that: rapidsai/docker#712 (comment)

rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Oct 4, 2024
This PR removes support for accessing cudf's underlying spdlog logger directly.

Contributes to rapidsai/build-planning#104

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Nghia Truong (https://github.com/ttnghia)
  - David Wendt (https://github.com/davidwendt)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #16964
@harrism
Copy link
Member

harrism commented Oct 9, 2024

RMM has two forms of logging, as described in the docs. The logging_resource_adaptor, and debug logging (the built-in logger).

The former is useful for capturing logs in failing cases to better understand their allocation behavior. It needs to be compiled into RMM Python so that it can be dynamically enabled.

The latter is only used when debugging logic problems inside of RMM. It is the RMM_LOG_INFO, RMM_LOG_TRACE etc. macros. I have not actively used it in a long time, since I was developing the RMM pool.

I don't know if it would help, but we could disabled the latter (debug logging) completely by default, so that using in any form would require a rebuild with a CMake flag turned on. Currently that rebuild is required to change the logging level to something other than the default anyway.

rapids-bot bot pushed a commit to rapidsai/rmm that referenced this issue Oct 16, 2024
Contributes to rapidsai/build-planning#104




This PR removes support for accessing rmm's underlying spdlog logger directly.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Lawrence Mitchell (https://github.com/wence-)
  - Mark Harris (https://github.com/harrism)

URL: #1690
@vyasr
Copy link
Contributor Author

vyasr commented Oct 21, 2024

I'm going to work on the rmm piece in rapidsai/rmm#1709. Once rmm is working I'll generalize that for rollout to other repos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants