gpud-v0.1.0

github-actions released this 27 Oct 09:39
a9d8b90

GPUd release notes (2024-10-27T09:38:10Z)

Welcome to this new release!

What's Changed

  • nits(server): debug level log for redundant register attempts by @gyuho in #126
  • fix(nvidia-smi/parse): do not parse remapped rows N/A by @gyuho in #128
  • feat(component/network): latency checks to global edge/DERP servers (using tailscale) by @gyuho in #125
  • fix(containerd): readable query failure error message (When CRI is not set up) by @gyuho in #129
  • fix(components): do not panic when there's no data collected yet by @gyuho in #130
  • feat(nvidia): exposing SM core and tensor core metrics in GPUd by @photoszzt in #132
  • fix(nvidia/query/metrics): remove duplicate metric register call by @gyuho in #133
  • feat(charts): add gpud run helm chart by @gyuho in #123
  • fix(infiniband): simplify ibstat existence when evaluating healthy by @gyuho in #124
  • feat(network/latency): track latency in metrics per region by @gyuho in #134
  • Update mothership endpoint by @cardyok in #82
  • fix(nvidia): use NVML + lspci to detect NVIDIA GPUs (without running nvidia-smi) by @gyuho in #127
  • fix(server): handle "components" URL query, return 404 not found on unknown component queries by @gyuho in #131
  • nits(nvidia/query): make detect logs debug level by @gyuho in #135
  • fix(status): fix divide by zero by @cardyok in #136
  • fix(nvidia/xid): do not error log when no xid happened yet by @gyuho in #138
  • fix(nvidia): persistence mode check based on NVML, do not rely on "nvidia-persistenced" binary by @gyuho in #137

New Contributors

Full Changelog: v0.0.5...v0.1.0