To force the use of the correct index when searching the inputs table. The cost of ordering is actually negligible thanks to how relational databases and B-tree indexes work.
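This is easy to see with SQLite's `EXPLAIN QUERY PLAN`: when the ordering column is indexed, the engine walks the B-tree in order instead of sorting. A minimal sketch (the table and index names here are made up for illustration, not kupo's actual schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE inputs (address TEXT, created_at INTEGER)")
con.execute("CREATE INDEX inputs_by_created_at ON inputs(created_at)")

# With a B-tree index on the ordering column, SQLite scans the index
# in order; the plan contains no 'USE TEMP B-TREE FOR ORDER BY' step.
plan = "\n".join(
    row[3] for row in con.execute(
        "EXPLAIN QUERY PLAN SELECT created_at FROM inputs ORDER BY created_at"
    )
)
print(plan)  # e.g. SCAN inputs USING COVERING INDEX inputs_by_created_at
```

Without the index, the same query would need an explicit sort step (a temporary B-tree built per query), which is where the cost would come from.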
There should be one! Which might explain all your problems. From a quick look at the code, this seems not to be the case (anymore?) 🤦 ... But that index should actually exist.
I don't think this query scales well: the result of `SELECT datum_hash FROM inputs` will get larger and larger over time, so performing a deletion against that list is expensive, especially since the query needs to check for non-inclusion.
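The shape being discussed can be sketched on a toy schema. To be clear, the table and column names below are assumptions for illustration, not necessarily kupo's actual schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE binary_data (binary_data_hash TEXT PRIMARY KEY);
    CREATE TABLE inputs (datum_hash TEXT);
""")
con.executemany("INSERT INTO binary_data VALUES (?)", [("a",), ("b",), ("c",)])
con.executemany("INSERT INTO inputs VALUES (?)", [("a",), ("b",)])

# Non-inclusion check: every candidate row is tested against the whole
# (ever-growing) list of datum hashes referenced by inputs.
con.execute("""
    DELETE FROM binary_data
    WHERE binary_data_hash NOT IN (SELECT datum_hash FROM inputs)
""")
remaining = [r[0] for r in con.execute("SELECT binary_data_hash FROM binary_data")]
print(remaining)  # ['a', 'b'] — only 'c' was unreferenced and got pruned
```

The semantics are right, but the `NOT IN` subquery materializes (or repeatedly probes) the full set of referenced hashes, which is the scaling concern raised above.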
Thanks, I know :). I've read this manual from top to bottom several times actually.
This wouldn't work well with the incremental approach, unfortunately, as the indexes would need to be created over and over again; even setting that aside, it's expensive to create an index on the fly over such a large collection. The index ought to exist anyway, so that's a non-problem.
Are there any experiments or thoughts yet on how to prune binary data? From the logs of kupo I infer that this is the query currently used for pruning:
I have several questions about this:
This is the query plan:
I would also propose this alternative query, which requires only one additional index on inputs by datum hash:
See the meaning of LIMIT for DELETE (stock SQLite only accepts `DELETE ... LIMIT` when compiled with `SQLITE_ENABLE_UPDATE_DELETE_LIMIT`).
This has the following plan (using the proposed index):
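Since the `DELETE ... LIMIT` syntax depends on a compile-time option, a portable way to get the same batched behaviour is to route the LIMIT through a rowid subquery. A sketch, again on hypothetical table names rather than kupo's real schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE binary_data (binary_data_hash TEXT);
    CREATE TABLE inputs (datum_hash TEXT);
    CREATE INDEX inputs_by_datum_hash ON inputs(datum_hash);
""")
con.executemany("INSERT INTO binary_data VALUES (?)",
                [(h,) for h in "abcdef"])
con.executemany("INSERT INTO inputs VALUES (?)", [("a",), ("b",)])

# Delete unreferenced rows in bounded batches so each statement stays
# cheap; loop until a batch deletes nothing.
while True:
    cur = con.execute("""
        DELETE FROM binary_data WHERE rowid IN (
            SELECT bd.rowid FROM binary_data bd
            LEFT JOIN inputs i ON i.datum_hash = bd.binary_data_hash
            WHERE i.datum_hash IS NULL
            LIMIT 2
        )
    """)
    if cur.rowcount == 0:
        break

remaining = sorted(r[0] for r in con.execute("SELECT binary_data_hash FROM binary_data"))
print(remaining)  # ['a', 'b']
```

The `LEFT JOIN ... IS NULL` anti-join is what lets the datum-hash index do the non-inclusion check row by row, instead of comparing against one big materialized list.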
The index on inputs by datum hash is created automatically anyway according to this Stack Overflow answer, so it might make sense to just persist it.
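The difference is visible in the query plan: SQLite may synthesize a transient automatic index for a statement, but that index is rebuilt every time, whereas a persisted index is maintained incrementally and shows up by name in the plan. A sketch with made-up table names:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE binary_data (binary_data_hash TEXT);
    CREATE TABLE inputs (datum_hash TEXT);
""")

def plan(sql):
    # EXPLAIN QUERY PLAN rows carry the human-readable step in column 3.
    return "\n".join(row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql))

query = """
    SELECT bd.binary_data_hash FROM binary_data bd
    LEFT JOIN inputs i ON i.datum_hash = bd.binary_data_hash
    WHERE i.datum_hash IS NULL
"""

print(plan(query))  # without an index, SQLite may build an AUTOMATIC one per query

# Persisting the index means it is maintained on insert instead of
# being rebuilt from scratch whenever the pruning query runs.
con.execute("CREATE INDEX inputs_by_datum_hash ON inputs(datum_hash)")
print(plan(query))  # now: SEARCH i USING COVERING INDEX inputs_by_datum_hash
```

So persisting the index trades a little write overhead on every insert for not paying the full index-construction cost on each pruning run.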