[RFC] Arches model QuerySets #11595 #11596

Draft
wants to merge 85 commits into
base: dev/8.0.x
4236af2
Group ResourceInstance fields together
jacobtylerwalls Oct 8, 2024
3feeb47
Initial commit of PythonicModelQuerySet
jacobtylerwalls Oct 8, 2024
68f1f83
Handle cardinality N tiledata
jacobtylerwalls Oct 9, 2024
fd0c206
Stub out save/clean/refresh_from_db
jacobtylerwalls Oct 9, 2024
68c27fc
Initial commit of updating and deleting on pythonic models
jacobtylerwalls Oct 9, 2024
f58c9b4
Orient around nodegroups to help with jagged data, blank tiles
jacobtylerwalls Oct 10, 2024
2cb2b40
Fix tile sortorder calculation
jacobtylerwalls Oct 10, 2024
ccca92d
Fix refreshing
jacobtylerwalls Oct 10, 2024
75f3dc0
Add datatype validation
jacobtylerwalls Oct 10, 2024
06a89de
Stub out function triggers
jacobtylerwalls Oct 10, 2024
965c397
Move ORM lookup to datatype
jacobtylerwalls Oct 11, 2024
696ea0a
Unwrap resource instance datatypes to string id
jacobtylerwalls Oct 11, 2024
c2d139d
Straighten out cardinality N interaction with list datatypes
jacobtylerwalls Oct 11, 2024
9d9f97a
Check for invalid defer/only values
jacobtylerwalls Oct 12, 2024
5e0dd2f
Handle concept-list datatype
jacobtylerwalls Oct 12, 2024
24fc868
Handle JSON null in resource instance list dt transform
jacobtylerwalls Oct 12, 2024
2377b16
Factor out _get_orm_lookup_cardinality_n()
jacobtylerwalls Oct 12, 2024
b6d210e
Remove run_functions switch, fix set union, respect name mangling
jacobtylerwalls Oct 16, 2024
295478b
Fetch only relevant tiles
jacobtylerwalls Oct 16, 2024
dd628a9
Remove typos in ResourceInstanceDataType validation
jacobtylerwalls Oct 16, 2024
ea06023
Remove unwanted outer join
jacobtylerwalls Oct 16, 2024
e0095ee
Return model instances for RI datatypes
jacobtylerwalls Oct 17, 2024
edb0534
Make `request` optional in Tile.__preDelete
jacobtylerwalls Oct 17, 2024
c868549
Skip no-op tile updates
jacobtylerwalls Oct 18, 2024
9ecd550
Improve prefetching
jacobtylerwalls Oct 18, 2024
5864d83
Implement RI datatype values_match
jacobtylerwalls Oct 18, 2024
2a327bb
Implement datatype post save actions
jacobtylerwalls Oct 18, 2024
f6f7ca1
Fix fallback value for RI ontology properties
jacobtylerwalls Oct 21, 2024
d88f175
Make post_tile_save request kwarg optional
jacobtylerwalls Oct 21, 2024
0fb9b4a
Implement indexing
jacobtylerwalls Oct 21, 2024
e7fba7a
Implement edit log saves
jacobtylerwalls Oct 21, 2024
5c4360f
Improve None handling
jacobtylerwalls Oct 21, 2024
6ae7477
Add name to ResourceInstance.__repr__()
jacobtylerwalls Oct 21, 2024
2f6b48e
Harden ConceptListDataType.transform_value_for_tile against lists
jacobtylerwalls Oct 21, 2024
0d483b2
Remove orm_array_transform in favor of to_python()
jacobtylerwalls Oct 21, 2024
a6a47de
Move principal user fallback logic
jacobtylerwalls Oct 22, 2024
f1a6313
Improve cardinality n vs. n-squared stuff
jacobtylerwalls Oct 22, 2024
eaa288b
Remove as_resource()
jacobtylerwalls Oct 22, 2024
dde980e
Rename queryset
jacobtylerwalls Oct 22, 2024
75ed701
Simplify error reporting
jacobtylerwalls Oct 22, 2024
0da60b2
Add TileQuerySet
jacobtylerwalls Oct 23, 2024
ab2df3c
Remove eager materialization of RI instances for now
jacobtylerwalls Oct 23, 2024
b74b44c
Add as_nodegroup transform
jacobtylerwalls Oct 24, 2024
883aab4
Attach nodegroups to resources
jacobtylerwalls Oct 24, 2024
65f6ff8
Attach child tiles
jacobtylerwalls Oct 24, 2024
fa8f752
Improve child tile attachment
jacobtylerwalls Oct 24, 2024
4bb5b0c
Shave off some data, improve performance
jacobtylerwalls Oct 24, 2024
8d79942
Finish reorienting around nodegroups, add minimal docs
jacobtylerwalls Oct 25, 2024
5119dbc
Fix subquery bugs
jacobtylerwalls Oct 28, 2024
30ee0f2
Initial commit of ArchesModelSerializer
jacobtylerwalls Oct 28, 2024
5843b4b
Move as_nodegroup()
jacobtylerwalls Oct 29, 2024
c119b4c
Continue fleshing out tile/instance serializers
jacobtylerwalls Oct 29, 2024
33f128b
Move some helpers to utils
jacobtylerwalls Oct 29, 2024
bc0fbea
Make further fields blank
jacobtylerwalls Oct 29, 2024
ad67513
Fix additional subquery bugs
jacobtylerwalls Oct 29, 2024
2fe25c4
Harden concept{list} dt validation against UUIDs
jacobtylerwalls Oct 29, 2024
a3e4d8d
Update save machinery
jacobtylerwalls Oct 29, 2024
b172ba6
Move ersatz model fields to datatype classes
jacobtylerwalls Oct 30, 2024
f5185db
Implement single tile updates
jacobtylerwalls Oct 30, 2024
ab0b4bf
Make StringDataType a JSONField
jacobtylerwalls Oct 30, 2024
9107842
Fix example query
jacobtylerwalls Oct 30, 2024
1190fda
Fix tests
jacobtylerwalls Oct 31, 2024
e0e6689
Quiet output from tests that expect transactions to fail
jacobtylerwalls Oct 31, 2024
bc698d6
Avoid saving resource update edit log entries
jacobtylerwalls Oct 31, 2024
30451da
Fix parent/child tile attachment
jacobtylerwalls Oct 31, 2024
cc397df
Update changelog
jacobtylerwalls Oct 31, 2024
3a5e589
Add documentation, reduce queries
jacobtylerwalls Oct 31, 2024
4f54a86
Point to other docstring
jacobtylerwalls Oct 31, 2024
260e4a0
Remove N+1 queries in build_unknown_field()
jacobtylerwalls Oct 31, 2024
849d648
fixup!
jacobtylerwalls Oct 31, 2024
0e455c7
Improve ResourceInstance deserialization
jacobtylerwalls Oct 31, 2024
52d3306
fixup! Add documentation, reduce queries
jacobtylerwalls Oct 31, 2024
293d8a8
Improve names
jacobtylerwalls Nov 1, 2024
548b1c3
fixup! Improve names
jacobtylerwalls Nov 3, 2024
713ddd5
Support nodegroups = "__all__" in serializers
jacobtylerwalls Nov 5, 2024
10a4429
Allow creation with tile data
jacobtylerwalls Nov 5, 2024
5f871e2
Check correct private attribute
jacobtylerwalls Nov 5, 2024
68fe9e0
Harden RI transform_value_for_tile
jacobtylerwalls Nov 5, 2024
a3c43ec
Implement provisional edits
jacobtylerwalls Nov 6, 2024
475bb37
Temporarily workaround tile deserialization issues
jacobtylerwalls Nov 6, 2024
202e766
Override TileModel.refresh_from_db()
jacobtylerwalls Nov 6, 2024
e4cc140
fixup! Implement provisional edits
jacobtylerwalls Nov 6, 2024
e27e0f1
fixup! Override TileModel.refresh_from_db
jacobtylerwalls Nov 6, 2024
48b3e7a
Disallow empty strings for legacyid
jacobtylerwalls Nov 6, 2024
6e855bb
Proof of concept of db_default uuid4() #10958
jacobtylerwalls Nov 6, 2024
25 changes: 20 additions & 5 deletions arches/app/datatypes/base.py
@@ -1,15 +1,20 @@
-import json, urllib
+import json
+import logging
+import urllib
+
from django.urls import reverse
+from django.utils.translation import gettext as _
+
from arches.app.models import models
from arches.app.models.system_settings import settings
from arches.app.search.elasticsearch_dsl_builder import Dsl, Bool, Terms, Exists, Nested
-from django.utils.translation import gettext as _
-import logging

logger = logging.getLogger(__name__)


class BaseDataType(object):
+rest_framework_model_field = None
+"""Django model field if the datatype were to be a real table column."""
+
def __init__(self, model=None):
self.datatype_model = model
self.datatype_name = model.datatype if model else None
@@ -336,7 +341,7 @@ def get_default_language_value_from_localized_node(self, tile, nodeid):
"""
return tile.data[str(nodeid)]

-def post_tile_save(self, tile, nodeid, request):
+def post_tile_save(self, tile, nodeid, request=None):
"""
Called after the tile is saved to the database

@@ -532,3 +537,13 @@ def validate_node(self, node):
a GraphValidationError
"""
pass
+
+def get_base_orm_lookup(self, node):
+"""This expression gets the tile data for a specific node. It can be
+overridden to extract something more specific, especially where the
+node value is JSON and only certain k/v pairs are useful to query.
+"""
+return f"data__{node.pk}"
+
+def to_python(self, tile_val):
+return tile_val
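
Reader's note on the lookup string: `get_base_orm_lookup()` returns a Django-style double-underscore path, so `data__<node pk>` addresses one key inside the tile's JSON `data` column, and the resource-instance override later in this diff (spelled `_get_base_orm_lookup` there) narrows it to `data__<node pk>__0__resourceId`. The following is a minimal pure-Python sketch of how such a path resolves, assuming a tile represented as a plain dict. `resolve_lookup` and the node id are hypothetical and exist only for illustration; in Django the traversal is done by JSONField key transforms in SQL, not by code like this.

```python
import uuid


def resolve_lookup(row, lookup):
    """Walk a Django-style ``a__b__0`` path through nested dicts and lists.

    Hypothetical helper: it imitates in Python what a JSONField key
    transform does in the database. Anything missing resolves to None.
    """
    current = row
    for part in lookup.split("__"):
        if isinstance(current, list):
            try:
                current = current[int(part)]
            except (ValueError, IndexError):
                return None
        elif isinstance(current, dict):
            current = current.get(part)
        else:
            return None
    return current


node_pk = uuid.UUID("11111111-1111-1111-1111-111111111111")  # made-up node id
tile = {"data": {str(node_pk): [{"resourceId": "abc"}]}}

base_lookup = f"data__{node_pk}"                 # shape returned by the base class
narrow_lookup = f"{base_lookup}__0__resourceId"  # shape returned by the RI override

print(resolve_lookup(tile, base_lookup))
print(resolve_lookup(tile, narrow_lookup))
```

The narrower path only reaches the first list element, which is presumably why the list datatype later reverts to the base lookup.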
25 changes: 20 additions & 5 deletions arches/app/datatypes/concept_types.py
@@ -2,11 +2,14 @@
import uuid
import csv
import logging
+
+from django.contrib.postgres.fields import ArrayField
from django.core.exceptions import ObjectDoesNotExist
+from django.db.models import fields
from django.utils.translation import gettext as _
-from arches.app.models import models
-from arches.app.models import concept
from django.core.cache import cache
+
+from arches.app.models import models
from arches.app.models.system_settings import settings
from arches.app.datatypes.base import BaseDataType
from arches.app.datatypes.datatypes import DataTypeFactory, get_value_from_jsonld
@@ -32,7 +35,6 @@
from rdflib.namespace import RDF, RDFS, XSD, DC, DCTERMS, SKOS
from arches.app.models.concept import ConceptValue
from arches.app.models.concept import Concept
-from io import StringIO

archesproject = Namespace(settings.ARCHES_NAMESPACE_FOR_DATA_EXPORT)
cidoc_nm = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
@@ -41,6 +43,8 @@


class BaseConceptDataType(BaseDataType):
+rest_framework_model_field = fields.UUIDField(null=True)
+
def __init__(self, model=None):
super(BaseConceptDataType, self).__init__(model=model)
self.value_lookup = {}
@@ -253,6 +257,8 @@ def validate(
return errors

def transform_value_for_tile(self, value, **kwargs):
+if isinstance(value, uuid.UUID):
+return str(value)
try:
stripped = value.strip()
uuid.UUID(stripped)
@@ -409,6 +415,8 @@ def ignore_keys(self):


class ConceptListDataType(BaseConceptDataType):
+rest_framework_model_field = ArrayField(base_field=fields.UUIDField(), null=True)
+
def validate(
self,
value,
@@ -425,13 +433,20 @@ def validate(
if value is not None:
validate_concept = DataTypeFactory().get_instance("concept")
for v in value:
-val = v.strip()
+if isinstance(v, uuid.UUID):
+val = str(v)
+else:
+val = v.strip()
errors += validate_concept.validate(val, row_number)
return errors

def transform_value_for_tile(self, value, **kwargs):
ret = []
-for val in csv.reader([value], delimiter=",", quotechar='"'):
+if not isinstance(value, list):
+value = [value]
+if all(isinstance(inner, uuid.UUID) for inner in value):
+return [str(inner) for inner in value]
+for val in csv.reader(value, delimiter=",", quotechar='"'):
lines = [line for line in val]
for v in lines:
try:
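
Reader's note on the hardened `ConceptListDataType.transform_value_for_tile`: the hunk above wraps a scalar in a list, short-circuits when every element is already a `uuid.UUID`, and otherwise hands the list of strings to `csv.reader`, which treats each element as one row. The sketch below reproduces only that visible input handling. The per-token work after the `try:` is cut off in the hunk, so the stripping and empty-token skipping here is an assumption, and `normalize_concept_list` is a hypothetical name rather than an Arches function.

```python
import csv
import uuid


def normalize_concept_list(value):
    """Sketch of the visible input handling; not the Arches method itself."""
    if not isinstance(value, list):
        value = [value]
    # A list made up only of UUID objects becomes a list of strings.
    if all(isinstance(inner, uuid.UUID) for inner in value):
        return [str(inner) for inner in value]
    tokens = []
    # csv.reader accepts any iterable of strings, one "row" per element.
    for row in csv.reader(value, delimiter=",", quotechar='"'):
        for token in row:
            # Assumption: the real per-token handling is elided in the hunk.
            stripped = token.strip()
            if stripped:
                tokens.append(stripped)
    return tokens


concept_id = uuid.uuid4()
print(normalize_concept_list([concept_id]))
print(normalize_concept_list('a,"b,c",d'))
print(normalize_concept_list(["a,b", "c"]))
```

Passing the list straight to `csv.reader` instead of wrapping it again in `[value]` is what lets both a single comma-separated string and a list of strings go through the same loop.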
5 changes: 4 additions & 1 deletion arches/app/datatypes/core/non_localized_string.py
@@ -1,10 +1,11 @@
+from django.conf import settings
+from django.db.models import fields
from django.utils.translation import gettext as _
from rdflib import URIRef, Literal, ConjunctiveGraph as Graph
from rdflib.namespace import RDF

from arches.app.datatypes.base import BaseDataType
from arches.app.datatypes.core.util import get_value_from_jsonld
-from django.conf import settings
from arches.app.search.elasticsearch_dsl_builder import (
Bool,
Exists,
@@ -18,6 +19,8 @@


class NonLocalizedStringDataType(BaseDataType):
+rest_framework_model_field = fields.CharField(null=True)
+
def validate(
self,
value,
88 changes: 73 additions & 15 deletions arches/app/datatypes/datatypes.py
@@ -1,4 +1,5 @@
import copy
+import itertools
import uuid
import json
import decimal
@@ -15,9 +16,6 @@
from datetime import datetime
from mimetypes import MimeTypes

-from django.core.files.images import get_image_dimensions
-from django.db.models import fields
-
from arches.app.const import ExtensionType
from arches.app.datatypes.base import BaseDataType
from arches.app.models import models
@@ -47,18 +45,18 @@
from arches.app.search.search_engine_factory import SearchEngineInstance as se
from arches.app.search.search_term import SearchTerm
from arches.app.search.mappings import RESOURCES_INDEX

+from django.contrib.postgres.fields import ArrayField
from django.core.cache import cache
from django.core.files import File
from django.core.files.base import ContentFile
-from django.core.files.storage import FileSystemStorage, default_storage
+from django.core.files.images import get_image_dimensions
+from django.core.files.storage import default_storage
from django.core.exceptions import ObjectDoesNotExist
from django.core.exceptions import ValidationError
-from django.db import connection, transaction
+from django.db import connection
+from django.db.models import fields
+from django.db.models.fields.json import JSONField
from django.utils.translation import get_language, gettext as _

from elasticsearch import Elasticsearch
from elasticsearch.exceptions import NotFoundError

# One benefit of shifting to python3.x would be to use
# importlib.util.LazyLoader to load rdflib (and other lesser
# used but memory soaking libs)
@@ -118,6 +116,8 @@ def get_instance(self, datatype):


class StringDataType(BaseDataType):
+rest_framework_model_field = JSONField(null=True)
+
def validate(
self,
value,
@@ -458,6 +458,8 @@ def pre_structure_tile_data(self, tile, nodeid, **kwargs):


class NumberDataType(BaseDataType):
+rest_framework_model_field = fields.FloatField(null=True)
+
def validate(
self,
value,
@@ -574,6 +576,8 @@ def default_es_mapping(self):


class BooleanDataType(BaseDataType):
+rest_framework_model_field = fields.BooleanField(null=True)
+
def validate(
self,
value,
@@ -675,6 +679,8 @@ def default_es_mapping(self):


class DateDataType(BaseDataType):
+rest_framework_model_field = fields.DateField(null=True)
+
def validate(
self,
value,
@@ -886,13 +892,17 @@ def get_display_value(self, tile, node, **kwargs):


class EDTFDataType(BaseDataType):
+rest_framework_model_field = fields.CharField(null=True)
+
def transform_value_for_tile(self, value, **kwargs):
transformed_value = ExtendedDateFormat(value)
if transformed_value.edtf is None:
return value
return str(transformed_value.edtf)

def pre_tile_save(self, tile, nodeid):
+# TODO: This is likely to be duplicative once we clean this up:
+# https://github.com/archesproject/arches/issues/10851#issuecomment-2427305853
tile.data[nodeid] = self.transform_value_for_tile(tile.data[nodeid])

def validate(
@@ -1057,6 +1067,8 @@ def default_es_mapping(self):


class FileListDataType(BaseDataType):
+rest_framework_model_field = ArrayField(base_field=fields.CharField(), null=True)
+
def __init__(self, model=None):
super(FileListDataType, self).__init__(model=model)
self.node_lookup = {}
@@ -1258,7 +1270,7 @@ def to_json(self, tile, node):
if data:
return self.compile_json(tile, node, file_details=data[str(node.nodeid)])

-def post_tile_save(self, tile, nodeid, request):
+def post_tile_save(self, tile, nodeid, request=None):
if request is not None:
# this does not get called when saving data from the mobile app
previously_saved_tile = models.TileModel.objects.filter(pk=tile.tileid)
@@ -2013,6 +2025,8 @@ class ResourceInstanceDataType(BaseDataType):

"""

+rest_framework_model_field = fields.UUIDField(null=True)
+
def validate(
self,
value,
@@ -2060,14 +2074,14 @@ def validate(
raise ObjectDoesNotExist()
except ObjectDoesNotExist:
message = _(
-"The related resource with id '{0}' is not in the system.".format(
+"The related resource with id '{0}' is not in the system".format(
resourceid
)
)
errors.append({"type": "ERROR", "message": message})
except (ValueError, TypeError):
message = _(
-"The related resource with id '{0}' is not a valid uuid.".format(
+"The related resource with id '{0}' is not a valid uuid".format(
str(value)
)
)
@@ -2090,7 +2104,7 @@ def pre_tile_save(self, tile, nodeid):
for relationship in relationships:
relationship["resourceXresourceId"] = str(uuid.uuid4())

-def post_tile_save(self, tile, nodeid, request):
+def post_tile_save(self, tile, nodeid, request=None):
ret = False
sql = """
SELECT * FROM __arches_create_resource_x_resource_relationships('%s') as t;
@@ -2217,6 +2231,14 @@ def get_search_terms(self, nodevalue, nodeid=None):
return terms

def transform_value_for_tile(self, value, **kwargs):
+def from_id_string(uuid_string):
+nonlocal kwargs
+return {
+"resourceId": uuid_string,
+"inverseOntology": kwargs.get("inverseOntology", ""),
+"inverseOntologyProperty": kwargs.get("inverseOntologyProperty", ""),
+}
+
try:
return json.loads(value)
except ValueError:
@@ -2228,7 +2250,16 @@ def transform_value_for_tile(self, value, **kwargs):
except TypeError:
# data should come in as json but python list is accepted as well
if isinstance(value, list):
-return value
+if all(isinstance(inner, models.ResourceInstance) for inner in value):
+return [from_id_string(str(instance.pk)) for instance in value]
+elif all(isinstance(inner, uuid.UUID) for inner in value):
+return [from_id_string(str(uid)) for uid in value]
+elif all(isinstance(inner, str) for inner in value):
+return [from_id_string(uid) for uid in value]
+else:
+return value
+if isinstance(value, models.ResourceInstance):
+return [from_id_string(str(value.pk))]

def transform_export_values(self, value, *args, **kwargs):
return json.dumps(value)
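
Reader's note on the new branches in `ResourceInstanceDataType.transform_value_for_tile`: model instances, `uuid.UUID` objects, and plain id strings are all normalized into the same list-of-dicts tile shape through `from_id_string`, while lists of anything else (for example dicts that are already in tile shape) pass through unchanged. The sketch below covers only those list and instance branches, not the JSON-string path. `FakeResourceInstance` is a hypothetical stand-in for `models.ResourceInstance` (only `.pk` is used), and `normalize` is an illustrative name.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class FakeResourceInstance:
    """Hypothetical stand-in for models.ResourceInstance (only .pk is used)."""

    pk: uuid.UUID = field(default_factory=uuid.uuid4)


def from_id_string(uuid_string, **kwargs):
    # Same keys as the nested helper in the hunk above.
    return {
        "resourceId": uuid_string,
        "inverseOntology": kwargs.get("inverseOntology", ""),
        "inverseOntologyProperty": kwargs.get("inverseOntologyProperty", ""),
    }


def normalize(value, **kwargs):
    """Sketch of the list/instance branches only; JSON strings are not covered."""
    if isinstance(value, list):
        if all(isinstance(inner, FakeResourceInstance) for inner in value):
            return [from_id_string(str(inner.pk), **kwargs) for inner in value]
        if all(isinstance(inner, uuid.UUID) for inner in value):
            return [from_id_string(str(inner), **kwargs) for inner in value]
        if all(isinstance(inner, str) for inner in value):
            return [from_id_string(inner, **kwargs) for inner in value]
        return value
    if isinstance(value, FakeResourceInstance):
        return [from_id_string(str(value.pk), **kwargs)]
    return value


resource = FakeResourceInstance()
print(normalize(resource))
print(normalize([uuid.uuid4()], inverseOntologyProperty="example:property"))
```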
@@ -2354,8 +2385,23 @@ def default_es_mapping(self):
}
return mapping
+
+def _get_base_orm_lookup(self, node):
+"""Filter down to the resourceId."""
+return f"data__{node.pk}__0__resourceId"
+
+def values_match(self, value1, value2):
+if not isinstance(value1, list) or not isinstance(value2, list):
+return value1 == value2
+copy1 = [{**inner_val} for inner_val in value1]
+copy2 = [{**inner_val} for inner_val in value2]
+for inner_val in itertools.chain(copy1, copy2):
+inner_val.pop("resourceXresourceId", None)
+return copy1 == copy2
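
Reader's note on `values_match`: the method compares two resource-instance tile values while ignoring `resourceXresourceId`, which `pre_tile_save` regenerates with `uuid.uuid4()` and which would otherwise make otherwise-identical values look different. Below is a standalone sketch of the same logic as a module-level function, with made-up sample values, to show that the comparison works on shallow copies and leaves its inputs untouched.

```python
import itertools


def values_match(value1, value2):
    # Non-list values (None, scalars) fall back to plain equality.
    if not isinstance(value1, list) or not isinstance(value2, list):
        return value1 == value2
    # Shallow-copy each dict so popping the key does not mutate the caller's data.
    copy1 = [{**inner_val} for inner_val in value1]
    copy2 = [{**inner_val} for inner_val in value2]
    for inner_val in itertools.chain(copy1, copy2):
        inner_val.pop("resourceXresourceId", None)
    return copy1 == copy2


old = [{"resourceId": "abc", "resourceXresourceId": "first-generated-id"}]
new = [{"resourceId": "abc", "resourceXresourceId": "second-generated-id"}]

print(values_match(old, new))
print(values_match(old, [{"resourceId": "xyz"}]))
```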


class ResourceInstanceListDataType(ResourceInstanceDataType):
+rest_framework_model_field = ArrayField(base_field=fields.UUIDField(), null=True)
+
def to_json(self, tile, node):
from arches.app.models.resource import (
Resource,
@@ -2383,6 +2429,18 @@ def to_json(self, tile, node):
def collects_multiple_values(self):
return True
+
+def _get_base_orm_lookup(self, node):
+"""Undo the override in ResourceInstanceDataType. TODO: write a better lookup.
+Currently the unpacking into UUID[] is done in to_python(), but this isn't
+useful for querying."""
+return f"data__{node.pk}"
+
+def to_python(self, tile_val):
+if tile_val is None:
+return tile_val
+resource_ids = [inner["resourceId"] if inner else None for inner in tile_val]
+return resource_ids
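
Reader's note on `ResourceInstanceListDataType.to_python`: it flattens the stored list of relationship dicts into a list of resource ids, passing `None` through both for the whole value and for individual JSON-null entries. A standalone sketch with made-up ids:

```python
def to_python(tile_val):
    # A missing value stays None rather than becoming an empty list.
    if tile_val is None:
        return tile_val
    # Falsy entries (JSON null, empty dict) are kept as None placeholders.
    return [inner["resourceId"] if inner else None for inner in tile_val]


stored = [{"resourceId": "abc", "resourceXresourceId": "r1"}, None]
print(to_python(stored))
print(to_python(None))
```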


class NodeValueDataType(BaseDataType):
def validate(
3 changes: 3 additions & 0 deletions arches/app/datatypes/url.py
@@ -27,6 +27,7 @@
from rdflib import ConjunctiveGraph as Graph
from rdflib import URIRef, Literal, Namespace
from rdflib.namespace import RDF, RDFS, XSD, DC, DCTERMS
+from django.db.models import fields
from django.utils.translation import gettext as _

archesproject = Namespace(settings.ARCHES_NAMESPACE_FOR_DATA_EXPORT)
@@ -70,6 +71,8 @@ class URLDataType(BaseDataType):
URL Datatype to store an optionally labelled hyperlink to a (typically) external resource
"""

+rest_framework_model_field = fields.URLField(null=True)
+
URL_REGEX = re.compile(
r"https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)"
)
7 changes: 7 additions & 0 deletions arches/app/models/functions.py
@@ -0,0 +1,7 @@
+from django.db import models
+
+
+class UUID4(models.Func):
+function = "uuid_generate_v4"
+arity = 0
+output_field = models.UUIDField()