Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure joining nodes are immediately trusted #89

Closed
wants to merge 9 commits into from

Conversation

masnax
Copy link
Contributor

@masnax masnax commented Mar 8, 2024

This cleans up some of the handling around authentication and trust when a system joins microcluster, or restarts.

Previously, we were relying on the dqlite heartbeat to carry trust information to existing nodes when a new node has joined the cluster. This meant that immediately after a node joins, the only systems that trust it are the one that handled its join request first, and potentially also the leader if that node wasn't the leader.

To get around this, there is now a PUT /cluster/1.0/cluster method which forwards a joining node's truststore information to all other nodes. Since we have this in place now, we can also ensure that all nodes fully trust the new node before any of the PreJoin, OnNewMember and PostJoin hooks run.

For the reason above, the POST /cluster/1.0/cluster endpoint was actually untrusted by default, and authenticated with the join token. However, this endpoint is reused to send a cluster notification for the OnNewMember hook, which would bypass this authentication entirely as there's no join token. To fix that, the authenticate function is now exported, so that we can manage authentication a bit better, and make these endpoints require authentication except for the specific case of handling the join token.

Additionally cleans up some errors and makes the default /cluster/1.0 endpoint untrusted.

…r/1.0

Untrusted systems should still be able to view this non-sensitive
information.

Signed-off-by: Max Asnaashari <[email protected]>
Returns a copy of the config so that the upcoming non-cluster member
support can utilize it, even though it's set to _ at the moment.

Signed-off-by: Max Asnaashari <[email protected]>
This moves some of the logic in StartAPI out into other helpers to make
it easier to read, and prevent duplication of those actions.

This includes detecting when a cluster is finished comparing its schema
version to other members, and obtaining clients for the whole cluster to
send a notification.

Signed-off-by: Max Asnaashari <[email protected]>
@masnax
Copy link
Contributor Author

masnax commented Mar 8, 2024

Might be a good idea to wait until #86 is merged, and then add an API extension for this, as it does affect the API.

Moves the authentication handlers out into their own function that is
exported and can be used to facilitate finer control of authentication
per-endpoint.

Additionally adds a new RestrictNotification AccessHandler so that
untrusted endpoints that are re-used for cluster notifications can
restrict those communications only to trusted systems.

Signed-off-by: Max Asnaashari <[email protected]>
…ecords on join

Previously, local records weren't updated until the next heartbeat,
except on the system that handled the join request. This adds a new
method PUT to /cluster which will forward a request to every system,
instructing it to add the newly added node to its store.

As a result, the node forwarding a join request to a leader will no
longer also implicitly trust whoever sent the request, as the token
won't have been validated, and the endpoint does not have
authentication. Join requests now have the RestrictNotification
AccessHandler set so a request sent with the cluster notification flag
will go through authentication.

Lastly, as none of the methods on this endpoint should be called while
the node is uninitialied, `AllowedBeforeInit` is set to false.

Signed-off-by: Max Asnaashari <[email protected]>
…nNewMember

Ensures that each cluster member has actually recorded the joining node
in their local trust store before executing the new-member hooks.

Signed-off-by: Max Asnaashari <[email protected]>
@tomponline
Copy link
Member

Might be a good idea to wait until #86 is merged, and then add an API extension for this, as it does affect the API.

Agreed good idea.

@tomponline
Copy link
Member

PUT /cluster/1.0/cluster

This seems rather stammering - can we just use PUT /cluster/1.0?

@masnax
Copy link
Contributor Author

masnax commented Apr 18, 2024

PUT /cluster/1.0/cluster

This seems rather stammering - can we just use PUT /cluster/1.0?

/cluster represents pre-packaged microcluster endpoints

/cluster/1.0 is publicly available endpoints provided by microcluster. (There's also /cluster/internal and /cluster/control for the fully internal and control socket endpoints respectively).

/cluster/1.0/cluster is the subset of endpoints relating to management of the cluster itself.

I'm not sure if making those top-level public endpoints is appropriate.

@tomponline
Copy link
Member

top-level public endpoints

What does that mean, what are the implications you're thinking of?

@masnax masnax closed this Apr 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants