-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix/Child objects ACL #283
Conversation
Judging from nspcc-dev/neofs-sdk-go#543 we're missing at least offset data for #264. And I'm worried about #263 as well. |
Yes, I separate them to make every change smaller but merge them with a few PRs. Do you think we need to change them at once? |
Let's make a complete spec for the new format first, then implement it step by step fixing along the way if needed. |
4b26e9e
to
ae6f915
Compare
A few intermediate questions:
My answers:
|
ae6f915
to
da4aa5f
Compare
|
If we make it from scratch -- yes. But currently a lot of places think that link object is a regular object. I can not even imagine how many. Moreover, backward compatibility is a question here: there are places that are not ready to get anything except REGULAR, TOMB, LOCK. But I do support you.
No ID, no payload length (in a general case), no hashes. That is not a header I would say. Previously, "parent" was a ready term in NeoFS. If you saw "parent" somewhere, you know what it was and were able to get anything you want about it. But not finished header solves our problem for now.
So you think |
// IDs. IDs MUST be ordered according to the original payload split, meaning the | ||
// first payload part holder MUST be placed at the first place in the corresponding | ||
// link object. | ||
message Link { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
really deserves a separate package? to me its overhead. object
fits well, or at least in refs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
depends on the final solution about a separate object type. if it is a separate object type, i do not see any difference with tombstone
, storagegroup
and lock
pkgs. but if you ask me, should it be like that, my answer is no, but we have what we have, do not want this object type to be different in any way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo the previously used approach only complicates the structuring and, as a consequence, the knowledge of the protocol concepts. All these entities are inextricably linked with objects - they are about objects, they are in the payload of objects. Сurrent packetization is extremely redundant to me
we have what we have
obviously, but we are not burdened with creating a new package (). Keep the existing ones, and this message can be left in object
to comply with the system architecture hierarchy
being a newbie, i'll see acl
, container
, object
and other large sections of the system. And then link
- specific detail elevated to the top level of the system with a non-descript name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to comply with the system architecture hierarchy
so do i. type of the objects are placed where they always have been
the issue was created #284
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for issue, link
is still under the question
the concept is very simple: if payload format is strict, then separate type is required. Separate type also grants efficient
it cannot do without logical features expressed in structural differences. For example, there should be no incompatible collisions with the current split schema. About the structure itself, i think it should be as efficient as possible to solve exact problems regardless of the server implementation |
da4aa5f
to
4dc7ec0
Compare
// ID. It is NOT required for the original object assembling. It MUST have ALL | ||
// the "child objects" IDs. Child objects MUST be ordered according to the | ||
// original payload split, meaning the first payload part holder MUST be placed | ||
// at the first place in the corresponding link object. Sizes MUST NOT be omitted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we wanna use it, we should be able to rely on it so there is MUST word. but then, we need to validate it and that is N head requests per link object PUT (N is num of objects in the chain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
N head requests per link object
the problem is more specific - when assembling a split object, the list of descriptors of child objects can diverge with the stored children (not only the sizes btw, IDs too). The proposed validation is correct, but since we cannot guarantee it at the protocol level (this is a server implementation detail), the assembler - one who operates with split chain - must be prepared for the divergence in any case
protocol-level solution is to additionally store such metadata in the child object headers themselves (mentioned here). Then the link object will remain a pure stand-alone helper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but since we cannot guarantee it at the protocol level (this is a server implementation detail)
what do you mean? nodes may not accept such link objects
must be prepared for the divergence in any case
there is no solution for #264 then. if you cannot trust the link object, you cannot be sure you are answering with a correct range
protocol-level solution is to additionally store such metadata in the child object headers themselves
should also be validated somehow. a node can get Nth object with some offset in it. what does it do? just believes it is true? searches for the previous ones and calculates their sizes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean? nodes may not accept such link objects
nodes may do whatever they want. They may accept linking objects w/o any validation
there is no solution for #264 then
i dont deny current solution, im just highligting that linking object can be unavailable or diverge with the split chain and any network node must be ready for this. In these cases, extra metadata in child objects may be a painkiller in some cases
what does it do?
its own implementation detail, each node can only control what to do itself and what it receives from other nodes
or maybe i misunderstood ur original comment. I dont see any problem with current structure and requirements. If u didnt mean that N head requests per link object PUT
is a problem, then everything is ok to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nodes may do whatever they want
what does it do?
its own implementation detail, each node can only control what to do itself and what it receives from other nodes
i meant we need to find a balance to implement our "perfect" node that may exist in the world of "worst" nodes. if we may not rely on link object, i think the whole work is useless. but if we want to rely on it, we should validate it somehow
extra metadata in child objects may be a painkiller in some cases
here i want to say that ensuring child object data is ok is the same to ensuring similar info about the link object
If u didnt mean that N head requests per link object PUT is a problem, then everything is ok to me
yes. this and the consequences
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation should be implemented. Then I'd be optimistic wrt the link object contents. If it's not correct --- either you get an error (trying to access inexisting object) or reply with a wrong result, but it's not node's fault, garbage in -- garbage out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation should be implemented.
Yes, sure. Just not sure about some link object replication: imagine you have a 1к parts object and receive a link to it for the first time: you need to do 1k head objects before you finish with this object. And you may be an attacker that sends 1k of such 1k-parts objects. 64mb data for a HEAD chaos in the NeoFS network.
// ID. It is NOT required for the original object assembling. It MUST have ALL | ||
// the "child objects" IDs. Child objects MUST be ordered according to the | ||
// original payload split, meaning the first payload part holder MUST be placed | ||
// at the first place in the corresponding link object. Sizes MUST NOT be omitted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
N head requests per link object
the problem is more specific - when assembling a split object, the list of descriptors of child objects can diverge with the stored children (not only the sizes btw, IDs too). The proposed validation is correct, but since we cannot guarantee it at the protocol level (this is a server implementation detail), the assembler - one who operates with split chain - must be prepared for the divergence in any case
protocol-level solution is to additionally store such metadata in the child object headers themselves (mentioned here). Then the link object will remain a pure stand-alone helper
// IDs. IDs MUST be ordered according to the original payload split, meaning the | ||
// first payload part holder MUST be placed at the first place in the corresponding | ||
// link object. | ||
message Link { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo the previously used approach only complicates the structuring and, as a consequence, the knowledge of the protocol concepts. All these entities are inextricably linked with objects - they are about objects, they are in the payload of objects. Сurrent packetization is extremely redundant to me
we have what we have
obviously, but we are not burdened with creating a new package (). Keep the existing ones, and this message can be left in object
to comply with the system architecture hierarchy
being a newbie, i'll see acl
, container
, object
and other large sections of the system. And then link
- specific detail elevated to the top level of the system with a non-descript name
4dc7ec0
to
7b98b6a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
format is ok, left some discussions about docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a behavior spec in https://github.com/nspcc-dev/neofs-spec/? I guess most of mechanics stays the same, except for INIT.
I'm also not sure how split_id/init interact, do we need split_id if init is present?
// ID. It is NOT required for the original object assembling. It MUST have ALL | ||
// the "child objects" IDs. Child objects MUST be ordered according to the | ||
// original payload split, meaning the first payload part holder MUST be placed | ||
// at the first place in the corresponding link object. Sizes MUST NOT be omitted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validation should be implemented. Then I'd be optimistic wrt the link object contents. If it's not correct --- either you get an error (trying to access inexisting object) or reply with a wrong result, but it's not node's fault, garbage in -- garbage out.
You mean before this PR merge? Let's have at least approve+draft to ensure we agree about this PR. I can't continue before we are done here, it has already changed 3 times.
I'm not sure init object changes something here: if the link and the init objects are lost and there is no SplitID, there is no ability to assembly the original object (except you find somehow the latest part and compare its parent header). Also, you cannot update any indexes related to the original object if you meet some object part before the init or the link object (possible situation, why not?). cc @cthulhu-rider |
To me, |
e5b2df3
7b98b6a
to
e5b2df3
Compare
e5b2df3
to
bbeae5c
Compare
…#283 Signed-off-by: Pavel Karpy <[email protected]>
…#283 Signed-off-by: Pavel Karpy <[email protected]>
…#283 Keep both versions of the split objects and mark them as v1 and v2. Signed-off-by: Pavel Karpy <[email protected]>
bbeae5c
to
9df9a1b
Compare
It describes future protocol version's link object payload. Child objects list will be moved from the header to the payload. This is done due to the header size restrictions. Closes #263. Signed-off-by: Pavel Karpy <[email protected]>
9df9a1b
to
fcbf8f8
Compare
|
@carpawell closes #264? |
It allows faster seeking through a split object without fetching the whole chain. Closes #264. Signed-off-by: Pavel Karpy <[email protected]>
This commit makes it easier to differ link objects from the other types. Object split hierarchy rework increases the link object's structure and makes it more strictly formatted, so now it plays a more important role in the split chains (and the split rules became more complex too). Signed-off-by: Pavel Karpy <[email protected]>
There is no need to generate some UUID with non-specified rules if the first object part allows the same identification routines but uses hashes widely accepted in the protocol. Signed-off-by: Pavel Karpy <[email protected]>
fcbf8f8
to
a825a7e
Compare
Yes, added it to one of the commits. |
…#283 Keep both versions of the split objects and mark them as v1 and v2. Signed-off-by: Pavel Karpy <[email protected]>
…#283 Keep both versions of the split objects and mark them as v1 and v2. Signed-off-by: Pavel Karpy <[email protected]>
No description provided.