Missing triples after a fragmentation #51

Open
woutslabbinck opened this issue Nov 10, 2021 · 5 comments · Fixed by #55 or #56
@woutslabbinck
Collaborator

woutslabbinck commented Nov 10, 2021

Missing type in the view

When running npm run test with the substring fragmenter (INPUT_FRAGMENTATION_STRATEGY="substring"), the view <begin_of_IRI/root.ttl> in the root.ttl file is a tree:Node, as it contains the relations to the other nodes.
However, this node does not have its type declared in the file.

The expected triple that is missing is <begin_of_IRI/root.ttl> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://w3id.org/tree#Node>.

Note: While this was tested with the substring fragmenter, I assume it would be good to add the type when using any bucketizer in the LDES Action.
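A minimal sketch of the requested behaviour, assuming the page is available as a list of N-Triples lines. The helper names `nodeTypeTriple` and `ensureNodeType` are hypothetical and are not taken from the LDES Action code base.

```typescript
// Hypothetical helpers; not part of the LDES Action.
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const TREE_NODE = "https://w3id.org/tree#Node";

// Build the N-Triples line that types a page as tree:Node.
function nodeTypeTriple(pageIri: string): string {
  return `<${pageIri}> <${RDF_TYPE}> <${TREE_NODE}> .`;
}

// Append the type triple only when the page does not declare it yet,
// so the function can be called for every bucketizer without creating duplicates.
function ensureNodeType(lines: string[], pageIri: string): string[] {
  const triple = nodeTypeTriple(pageIri);
  return lines.includes(triple) ? lines : [...lines, triple];
}
```

Calling `ensureNodeType` once per written page, regardless of the fragmentation strategy, would cover the case described in the note above.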

Missing property path in relation

Furthermore, when it is known (which is the case for the substring fragmenter), could the tree:path be added to each relation? It is not strictly required by the TREE specification: when no property path is given, every triple of a member is evaluated. Still, it gives more information and can improve performance when building applications on top of TREE/LDES.
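To illustrate why the path helps a client, here is a simplified sketch. It assumes a hypothetical flat member model (`MemberTriple`), treats the path as a single predicate, and uses a case-insensitive `includes` as a stand-in for the real substring matching rules, which are more nuanced.

```typescript
// Hypothetical, simplified member model: a flat list of predicate/object pairs.
interface MemberTriple {
  predicate: string;
  object: string;
}

// Return the values a substring relation is compared against.
// With a path (here: a single predicate) only matching triples are considered;
// without one, every triple of the member has to be evaluated.
function candidateValues(member: MemberTriple[], path?: string): string[] {
  const triples = path ? member.filter((t) => t.predicate === path) : member;
  return triples.map((t) => t.object);
}

// Simplified matching: case-insensitive containment (an assumption of this sketch).
function matchesSubstring(member: MemberTriple[], value: string, path?: string): boolean {
  const needle = value.toLowerCase();
  return candidateValues(member, path).some((v) => v.toLowerCase().includes(needle));
}
```

With a path, the client inspects fewer values and avoids accidental matches on unrelated properties.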

@ddvlanck ddvlanck self-assigned this Nov 12, 2021
@woutslabbinck woutslabbinck changed the title from "Missing triples during after a fragmentation" to "Missing triples after a fragmentation" Nov 16, 2021
@ddvlanck ddvlanck linked a pull request Nov 17, 2021 that will close this issue
@ddvlanck
Contributor

#55 contains the following fixes

  • Every page contains a type, tree:Node: <https://ddvlanck.github.io/Republish-LDES/output/a.ttl> a <https://w3id.org/tree#Node>.
  • For every relation a page has, tree:path is added when relevant:

Substring fragmentation

<https://ddvlanck.github.io/Republish-LDES/output/a.ttl> <https://w3id.org/tree#relation> _:df_1_52.
_:df_1_52 a <https://w3id.org/tree#SubstringRelation>;
    <https://w3id.org/tree#node> <https://ddvlanck.github.io/Republish-LDES/output/ar.ttl>;
    <https://w3id.org/tree#path> _:n3-0;
    <https://w3id.org/tree#value> "ar".
_:n3-0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://www.w3.org/2000/01/rdf-schema#label>;
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil>.

Subject pages → Does not have relations to other pages
Basic fragmentation → Does contain relations to other pages, but does not use the property path (so no tree:path)

<https://ddvlanck.github.io/Republish-LDES/output/0.ttl> <https://w3id.org/tree#relation> _:df_1_0.
_:df_1_0 a <https://w3id.org/tree#Relation>;
    <https://w3id.org/tree#node> <https://ddvlanck.github.io/Republish-LDES/output/1.ttl>.

@ddvlanck ddvlanck linked a pull request Nov 17, 2021 that will close this issue
@pietercolpaert
Member

For the basic fragmentation: can’t we make sure we add one based on a time property? Or is this too complicated?

@KasperZutterman
Collaborator

If we want different relations for every fragmentation/bucketizer strategy, we should look into moving this into the bucketizers themselves maybe?

@pietercolpaert
Member

Can’t this then be fixed with this idea? TREEcg/bucketizers#3

@ddvlanck
Contributor

For a basic fragmentation it is not possible to add tree:path or a more specific relation, because we do not know the sorting of the original LDES. However, if we load the entire dataset into memory, we could apply a sorting based on the property path, which is then preferably set to a timestamp predicate, but I don't know if that is the most sustainable option.
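The in-memory option could look roughly like the sketch below. It is only an illustration under several assumptions: the `Member` and `Page` shapes are hypothetical, timestamps are ISO 8601 strings, page names follow the `0.ttl`, `1.ttl` pattern of the example output, and the relation type chosen is tree:GreaterThanOrEqualToRelation.

```typescript
// Hypothetical member shape; the timestamp is an ISO 8601 string.
interface Member {
  id: string;
  timestamp: string;
}

interface Page {
  members: Member[];
  // Relation to the next page; absent on the last page.
  next?: { type: string; path: string; value: string; node: string };
}

// Sort all members in memory, cut them into pages, and link each page to the
// next one with a relation on the timestamp predicate.
function fragmentByTime(members: Member[], pageSize: number, timestampPath: string): Page[] {
  if (pageSize < 1) throw new Error("pageSize must be at least 1");
  const sorted = [...members].sort(
    (a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp),
  );
  const pages: Page[] = [];
  for (let i = 0; i < sorted.length; i += pageSize) {
    pages.push({ members: sorted.slice(i, i + pageSize) });
  }
  pages.forEach((page, index) => {
    const following = pages[index + 1];
    if (following) {
      page.next = {
        type: "https://w3id.org/tree#GreaterThanOrEqualToRelation",
        path: timestampPath,
        // Every member on the following pages has a timestamp >= this value.
        value: following.members[0].timestamp,
        node: `${index + 1}.ttl`,
      };
    }
  });
  return pages;
}
```

The cost is visible in the first statement of the function: the whole dataset is copied and sorted before a single page can be written, which is the sustainability concern raised above.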

The core relation information is defined in RelationParameters, which allows different relation types and the properties tree:node, tree:value and tree:remainingItems. The property tree:path is added in the LDES Action itself.
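As a rough sketch of that split, the interface below is a hypothetical stand-in loosely modelled on the properties named above; the real RelationParameters type may differ. The path is passed separately and written as a plain predicate IRI, whereas the output shown earlier in this thread wraps it in an RDF list.

```typescript
// Hypothetical shape; the real RelationParameters type may differ.
interface RelationSketch {
  type: string; // e.g. https://w3id.org/tree#SubstringRelation
  node: string; // tree:node
  value?: string; // tree:value
  remainingItems?: number; // tree:remainingItems
}

const TREE = "https://w3id.org/tree#";

// Serialise one relation as Turtle. tree:path is only written when the
// caller (the action, in this sketch) knows which predicate was used.
function relationToTurtle(page: string, rel: RelationSketch, pathPredicate?: string): string {
  const props = [`a <${rel.type}>`, `<${TREE}node> <${rel.node}>`];
  if (pathPredicate) props.push(`<${TREE}path> <${pathPredicate}>`);
  // JSON.stringify is used as a simple string-literal escaper for plain values.
  if (rel.value !== undefined) props.push(`<${TREE}value> ${JSON.stringify(rel.value)}`);
  if (rel.remainingItems !== undefined) {
    props.push(`<${TREE}remainingItems> ${rel.remainingItems}`);
  }
  return `<${page}> <${TREE}relation> [ ${props.join("; ")} ] .`;
}
```

Keeping the path out of the relation object mirrors the description above: the bucketizer supplies the relation, and the caller decides whether a path is known.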
