LODKit is a collection of Linked Open Data related Python functionalities.
lodkit.RDFImporter is a custom importer that allows RDF files to be imported as if they were Python modules.
Assuming 'graphs/some_graph.ttl' exists in the import path, lodkit.RDFImporter makes it possible to do the following:
import lodkit
from graphs import some_graph
type(some_graph) # <class 'rdflib.graph.Graph'>
Note that lodkit.RDFImporter is available as soon as lodkit has been imported.
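To illustrate the general mechanism behind such an importer, here is a minimal sketch of a meta path finder that makes files with a custom suffix importable. This is not LODKit's implementation: the class name SuffixImporter, the package name demo_graphs and the `text` attribute are hypothetical, and the file content is exposed as raw text instead of being parsed into an rdflib.Graph so that the sketch runs with the standard library alone.

```python
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class SuffixImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder/loader that maps `import name` to `name.ttl`."""

    suffix = ".ttl"

    def find_spec(self, fullname, path=None, target=None):
        # For `pkg.name`, `path` is the package's search path; otherwise use sys.path.
        name = fullname.rpartition(".")[2]
        for entry in path or sys.path:
            candidate = Path(entry) / f"{name}{self.suffix}"
            if candidate.is_file():
                return importlib.util.spec_from_loader(
                    fullname, self, origin=str(candidate)
                )
        return None  # not ours; let the import system continue

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        # Stand-in for RDF parsing (assumption): expose the raw file content.
        module.text = Path(module.__spec__.origin).read_text(encoding="utf-8")


# Demo: a throwaway directory acts as the import path.
demo_root = tempfile.mkdtemp()
package_dir = Path(demo_root) / "demo_graphs"
package_dir.mkdir()
(package_dir / "some_graph.ttl").write_text(
    "<https://subject> a <https://some_type> .", encoding="utf-8"
)

sys.path.insert(0, demo_root)
sys.meta_path.append(SuffixImporter())

from demo_graphs import some_graph

print(some_graph.text)
```

Appending the finder to the end of sys.meta_path means the regular import machinery is consulted first, so ordinary Python modules are unaffected.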
lodkit.lod_types defines several useful typing.TypeAlias and typing.Literal definitions for working with RDFLib-based Python functionalities.
uriclass and make_uriclass provide dataclass-inspired URI constructor functionality.
With uriclass, class-level attributes are converted to URIs according to uri_constructor.
For class attributes with just type information, URIs are constructed using UUIDs; for class attributes with string values, URIs are constructed by hashing that string.
from lodkit import uriclass
from rdflib import Namespace

@uriclass(Namespace("https://test.org/test/"))
class uris:
    x1: str
    y1 = "hash value 1"
    y2 = "hash value 1"

print(uris.x1)             # Namespace("https://test.org/test/<UUID>")
print(uris.y1 == uris.y2)  # True
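The attribute conversion described above can be sketched with the standard library alone. The following is an illustration under stated assumptions, not LODKit's code: the decorator name uri_attributes is hypothetical, URIs are plain strings rather than rdflib terms, and SHA-256 is an assumed hash function (the source does not say which one LODKit uses).

```python
import hashlib
import uuid


def uri_attributes(namespace: str):
    """Hypothetical stand-in for a uriclass-style decorator (plain strings)."""

    def decorator(cls):
        # Inspect the class before mutating it.
        valued = {
            name: value
            for name, value in vars(cls).items()
            if not name.startswith("_") and isinstance(value, str)
        }
        annotated_only = [
            name
            for name in getattr(cls, "__annotations__", {})
            if name not in vars(cls)
        ]

        # String-valued attributes: deterministic, hash-based URIs.
        for name, value in valued.items():
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            setattr(cls, name, f"{namespace}{digest}")

        # Annotation-only attributes: fresh UUID-based URIs.
        for name in annotated_only:
            setattr(cls, name, f"{namespace}{uuid.uuid4()}")
        return cls

    return decorator


@uri_attributes("https://test.org/test/")
class uris:
    x1: str
    y1 = "hash value 1"
    y2 = "hash value 1"


print(uris.x1)
print(uris.y1 == uris.y2)  # True: same input string, same digest
```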
make_uriclass provides equivalent functionality but is better suited to dynamic use.
from lodkit import make_uriclass

uris = make_uriclass(
    cls_name="TestURIFun",
    namespace="https://test.org/test/",
    fields=("x", ("y1", "hash value 1"), ("y2", "hash value 1")),
)

print(uris.x)              # Namespace("https://test.org/test/<UUID>")
print(uris.y1 == uris.y2)  # True
uritools.utils defines base functionality for generating UUID-based and hashed URIs.
URIConstructorFactory (alias of mkuri_factory) constructs a callable for generating URIs.
The returned callable takes an optional str argument 'hash_value'. If a hash value is given, the final URI segment is generated using a hash function; otherwise it is generated using a UUID.
from lodkit import URIConstructorFactory
mkuri = URIConstructorFactory("https://test.namespace/")
print(mkuri()) # URIRef("https://test.namespace/<UUID>")
print(mkuri("test") == mkuri("test")) # True
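The factory pattern itself is a closure over the namespace. Below is a minimal standard-library sketch of that idea; make_uri_constructor is a hypothetical name, the result is a plain string rather than a URIRef, and the choice of SHA-256 is an assumption, since the hash function used by LODKit is not stated here.

```python
import hashlib
import uuid
from typing import Callable, Optional


def make_uri_constructor(namespace: str) -> Callable[..., str]:
    """Return a callable that builds URIs inside `namespace` (hypothetical sketch)."""

    def mkuri(hash_value: Optional[str] = None) -> str:
        if hash_value is None:
            # No input: every call yields a new, random segment.
            return f"{namespace}{uuid.uuid4()}"
        # With input: the segment depends only on the input string.
        digest = hashlib.sha256(hash_value.encode("utf-8")).hexdigest()
        return f"{namespace}{digest}"

    return mkuri


mkuri = make_uri_constructor("https://test.namespace/")

print(mkuri() == mkuri())              # False: two distinct UUIDs
print(mkuri("test") == mkuri("test"))  # True: hashing is deterministic
```

Hashed URIs are reproducible across runs, which makes them suitable for entities that should resolve to the same identifier whenever the same key is seen again.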
Triple tools (so far) only defines lodkit.ttl, a triple constructor implementing a Turtle-like interface.
lodkit.ttl aims to implement Turtle predicate list notation by taking a triple subject and predicate-object pairs;
objects in a predicate-object pair can be
- objects of type lodkit._TripleObject (strings are also permissible and are interpreted as rdflib.Literal),
- tuples of lodkit._TripleObject (see Turtle object lists),
- lists of predicate-object pairs, emulating Turtle blank node notation,
- lodkit.ttl objects.
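The expansion rules in the list above can be sketched as a small recursive generator. This is a simplified illustration and not LODKit's implementation: the function name expand is hypothetical, terms are plain strings instead of rdflib terms, blank nodes are represented by generated labels, and nested lodkit.ttl objects are left out.

```python
import itertools
from collections.abc import Iterator

# Counter for generating blank node labels (assumption: labels stand in for BNodes).
_bnode_ids = itertools.count()


def expand(subject: str, *pairs: tuple) -> Iterator[tuple]:
    """Yield (subject, predicate, object) triples for predicate-object pairs."""
    for predicate, obj in pairs:
        if isinstance(obj, tuple):
            # Object list: one triple per item, same subject and predicate.
            for item in obj:
                yield from expand(subject, (predicate, item))
        elif isinstance(obj, list):
            # Blank node: link to a fresh node, then describe that node.
            bnode = f"_:b{next(_bnode_ids)}"
            yield (subject, predicate, bnode)
            yield from expand(bnode, *obj)
        else:
            yield (subject, predicate, obj)


triples = list(
    expand(
        "https://subject",
        ("rdf:type", "https://some_type"),
        ("rdfs:label", ("label 1", "label 2")),
        ("rdfs:seeAlso", [("rdfs:label", "label 3")]),
    )
)

for triple in triples:
    print(triple)
```

Because the function is a generator, triples are produced lazily and can be fed into a graph one at a time.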
from collections.abc import Iterator

from lodkit import _Triple, ttl
from rdflib import Graph, Literal, RDF, RDFS, URIRef

triples: Iterator[_Triple] = ttl(
    URIRef("https://subject"),
    (RDF.type, URIRef("https://some_type")),
    (RDFS.label, (Literal("label 1"), "label 2")),
    (RDFS.seeAlso, [(RDFS.label, "label 3")]),
    (
        RDFS.isDefinedBy,
        ttl(URIRef("https://subject_2"), (RDF.type, URIRef("https://another_type"))),
    ),
)

graph: Graph = triples.to_graph()
The above graph serialized to turtle:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<https://subject> a <https://some_type> ;
    rdfs:label "label 1",
        "label 2" ;
    rdfs:isDefinedBy <https://subject_2> ;
    rdfs:seeAlso [ rdfs:label "label 3" ] .

<https://subject_2> a <https://another_type> .
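lodkit.NamespaceGraph is a simple rdflib.Graph subclass for easy and convenient namespace binding.
Namespaces defined as class attributes are bound to the graph, with the attribute name serving as the prefix.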
from lodkit import NamespaceGraph
from rdflib import Namespace

class CLSGraph(NamespaceGraph):
    crm = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
    crmcls = Namespace("https://clscor.io/ontologies/CRMcls/")
    clscore = Namespace("https://clscor.io/entity/")

graph = CLSGraph()

ns_check: bool = all(
    ns in map(lambda x: x[0], graph.namespaces())
    for ns in ("crm", "crmcls", "clscore")
)

print(ns_check)  # True
lodkit.ClosedOntologyNamespace and lodkit.DefinedOntologyNamespace are rdflib.ClosedNamespace and rdflib.DefinedNamespace subclasses that are able to load namespace members based on an ontology.
from lodkit import ClosedOntologyNamespace, DefinedOntologyNamespace

crm = ClosedOntologyNamespace(ontology="./CIDOC_CRM_v7.1.3.ttl")

crm.E39_Actor   # URIRef('http://www.cidoc-crm.org/cidoc-crm/E39_Actor')
crm.E39_Author  # AttributeError

class crm(DefinedOntologyNamespace):
    ontology = "./CIDOC_CRM_v7.1.3.ttl"

crm.E39_Actor   # URIRef('http://www.cidoc-crm.org/cidoc-crm/E39_Actor')
crm.E39_Author  # URIRef('http://www.cidoc-crm.org/cidoc-crm/E39_Author') + UserWarning
Note that rdflib.ClosedNamespaces are meant to be instantiated and rdflib.DefinedNamespaces are meant to be extended (subclassed), which is reflected in lodkit.ClosedOntologyNamespace and lodkit.DefinedOntologyNamespace.
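The difference in behaviour for unknown members (AttributeError versus UserWarning) can be illustrated with a small standard-library sketch. The classes ClosedTerms and LenientTerms are hypothetical stand-ins and not part of rdflib or LODKit; members are listed by hand instead of being loaded from an ontology, and both classes are instantiated for brevity even though the defined variant is subclassed in the real API.

```python
import warnings


class ClosedTerms:
    """Hypothetical closed vocabulary: unknown members raise AttributeError."""

    def __init__(self, base: str, members):
        self._base = base
        self._members = frozenset(members)

    def __getattr__(self, name: str) -> str:
        # Only reached when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._members:
            raise AttributeError(f"{name!r} is not a known member")
        return f"{self._base}{name}"


class LenientTerms(ClosedTerms):
    """Hypothetical lenient vocabulary: unknown members warn but still resolve."""

    def __getattr__(self, name: str) -> str:
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._members:
            warnings.warn(f"{name!r} is not a known member", UserWarning, stacklevel=2)
        return f"{self._base}{name}"


base = "http://www.cidoc-crm.org/cidoc-crm/"
closed = ClosedTerms(base, {"E39_Actor"})
lenient = LenientTerms(base, {"E39_Actor"})

print(closed.E39_Actor)

try:
    closed.E39_Author
except AttributeError as error:
    print("AttributeError:", error)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    author = lenient.E39_Author  # resolves, but records a UserWarning

print(author, [str(w.message) for w in caught])
```

The closed variant catches misspelled terms immediately, while the lenient variant keeps code running and only flags terms that are missing from the vocabulary.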
lodkit.testing_tools aims to provide general definitions (e.g. Graph format options) and Hypothesis strategies useful for testing RDFLib-based Python code.
For example, the TripleStrategies.triples strategy generates random triples utilizing all permissible subject, predicate and object types, including lang-tagged and xsd-typed literals.
The following uses the triples strategy together with the Hypothesis lists strategy to create random graphs:
from hypothesis import given, strategies as st
from lodkit import tst
from rdflib import Graph

@given(triples=st.lists(tst.triples, min_size=1, max_size=10))
def test_some_function(triples):
    graph = Graph()
    for triple in triples:
        graph.add(triple)
    # A graph is a set of triples, so duplicates in the list count only once.
    assert len(graph) == len(set(triples))
The strategy generates up to 100 lists (by default, see the Hypothesis settings) of 1-10 tuple[_TripleSubject, URIRef, _TripleObject] and passes them to the test function.
Warning: The API of lodkit.testing_tools is very likely to change soon! Strategies should be module-level callables and not properties of a Singleton.