bug: node stuck in restart loop when rln-relay-eth-client-address unavailable #3126

Open
jakubgs opened this issue Oct 17, 2024 · 0 comments
Labels
bug Something isn't working

Problem

When the RPC endpoint specified in rln-relay-eth-client-address is unavailable for any reason, the node gets stuck in a restart loop.

Impact

This behavior makes the node fragile: a problem with a single endpoint needed by a single protocol prevents the whole node from starting. It can easily take down whole fleets because of an issue with just one protocol.

To reproduce

  1. Run a node with rln-relay-eth-client-address pointing at an unavailable RPC endpoint.
  2. Observe the restart loop.

Expected behavior

I would expect the node to start, provide the functionality of all protocols other than rln-relay, and simply report that rln-relay is broken.

  • If nodes are already running and the rln-relay RPC endpoint becomes unavailable, only rln-relay will have issues.
  • If nodes are restarted while the rln-relay RPC endpoint is unavailable, all nodes will fail to start.
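
To illustrate the expected behavior, here is a minimal Python sketch of per-protocol failure isolation. It is not nwaku code (nwaku is written in Nim); the `Node` class, the `mount` method and the `start_*` functions are hypothetical names made up for this example. The idea is only that a failure while starting one protocol is recorded and reported instead of aborting the whole node.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Node:
    """Hypothetical node that mounts each protocol independently."""

    status: Dict[str, str] = field(default_factory=dict)

    def mount(self, name: str, start: Callable[[], None]) -> None:
        # A failure in one protocol is recorded, not propagated,
        # so the remaining protocols can still be mounted.
        try:
            start()
            self.status[name] = "ok"
        except Exception as exc:
            self.status[name] = f"failed: {exc}"


def start_rln_relay() -> None:
    # Stand-in for the RPC call that times out in the report.
    raise ConnectionError("Connection timed out")


def start_relay() -> None:
    # Stand-in for a protocol with no external dependency.
    pass


node = Node()
node.mount("rln-relay", start_rln_relay)  # fails, but startup continues
node.mount("relay", start_relay)
print(node.status)
```

With this structure the node ends up with rln-relay marked as failed and relay marked as ok, which matches the behavior requested above. Whether a failed protocol should also be retried in the background is a separate design decision that this sketch does not cover.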

Screenshots/logs

DBG 2024-10-17 08:54:16.527+00:00 Sending message to RPC server              topics="JSONRPC-HTTP-CLIENT" tid=1 file=httpclient.nim:79 address="ok((id: \"linux-01.ih-eu-mda1.nimbus.sepolia.wg:8556\", scheme: NonSecure, hostname: \"linux-01.ih-eu-mda1.nimbus.sepolia.wg\", port: 8556, path: \"\", query: \"\", anchor: \"\", username: \"\", password: \"\", addresses: @[10.14.0.131:8556]))" msg_len=59 name=eth_chainId
DBG 2024-10-17 08:54:28.536+00:00 Failed to send POST Request with JSON-RPC  topics="JSONRPC-HTTP-CLIENT" tid=1 file=httpclient.nim:95 e="Connection timed out"
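
The log shows an eth_chainId request that times out. For anyone who wants to check an endpoint independently of the node, a small Python sketch of the same probe follows. The `rpc_reachable` helper and its default timeout are assumptions made for this example, not part of nwaku; it uses only the Python standard library and the standard Ethereum JSON-RPC `eth_chainId` method.

```python
import json
import urllib.request


def rpc_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers an eth_chainId request."""
    payload = json.dumps(
        {"jsonrpc": "2.0", "method": "eth_chainId", "params": [], "id": 1}
    ).encode()
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return "result" in json.loads(response.read())
    except (OSError, ValueError, TypeError):
        # Covers refused connections, timeouts, HTTP errors and
        # responses that are not the expected JSON object.
        return False
```

Calling `rpc_reachable("http://<host>:<port>")` with the address from rln-relay-eth-client-address should return False in the situation described here, since the connection times out before any response arrives.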

nwaku version/commit hash

v0.33.1

Additional context

Discovered due to firewall issues on node-01.gc-us-central1-a.waku.sandbox.

@jakubgs jakubgs added the bug Something isn't working label Oct 17, 2024