Triplestore Benchmark


Welcome to the Triplestore Benchmark repository! Here, you can explore SPARQL queries extracted from the Swiss government SPARQL endpoint, LINDAS, to benchmark and compare other triplestores.

Baseline Comparison

Currently, we are in the process of preparing a baseline comparison of the LINDAS endpoint for future reference. Stay tuned for updates on this section.

Requirements

Ensure you have the following prerequisites ready:

  • k6
  • A snapshot of the LINDAS dataset, which you can download here. The dataset is approximately 2.3 GB compressed and 60 GB uncompressed.
  • A triplestore that you wish to test against.
  • The above referenced dataset uploaded into the triplestore.

We use k6 to benchmark the queries. We also provide a quick way to check if the triplestore is compliant with the queries that are run against LINDAS.

Quick Start

All SPARQL queries used in the different tests are stored in the queries folder.

If you add, remove, or rename a query, you need to regenerate the query-files.json file by running the following command:

./scripts/update-query-list.sh
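As a rough sketch of what such a regeneration step involves (hypothetical only; the actual script in ./scripts may differ, and the folder and file names below are invented for the demo), a JSON list of query files can be built with jq:

```shell
# Hypothetical sketch -- the real ./scripts/update-query-list.sh may differ.
# Build a JSON array listing the files in a queries folder.
mkdir -p /tmp/demo-queries
touch /tmp/demo-queries/a.rq /tmp/demo-queries/b.rq
# jq -R turns each filename into a JSON string, jq -s collects them into an array.
ls /tmp/demo-queries | jq -R . | jq -s . > /tmp/demo-query-files.json
cat /tmp/demo-query-files.json
```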

To find out which query file corresponds to an ID shown in the results, you can run the following command:

jq -r '.[x,y,z]' query-files.json

replacing x, y, and z with the IDs whose query files you want to look up. You can specify as many IDs as you want.
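For instance, using a mock query-files.json so the example is self-contained (the file paths below are invented; in the repository you would run jq against the real file):

```shell
# Self-contained demo with a mock query-files.json.
# In the repository, point jq at the real file instead.
cat > /tmp/demo-qf.json <<'EOF'
["queries/q0.rq", "queries/q1.rq", "queries/q2.rq"]
EOF
# Look up the query files for IDs 0 and 2:
jq -r '.[0,2]' /tmp/demo-qf.json
```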

Conformity Test

Check that your triplestore is able to support some common queries against the LINDAS dataset:

k6 run \
  -e SPARQL_ENDPOINT=https://example.com/query \
  -e SPARQL_USERNAME=user \
  -e SPARQL_PASSWORD=secret \
  lindas-conformity.js

The query timeout is set to 5 minutes, and the script as a whole is limited to 1 day of runtime.

This will run the queries against the triplestore and check whether each one returns a result. The results are stored in ./results/summary-conformity.json.

To inspect the results in a human-readable format, run the following command (jq is required):

./scripts/summary-conformity-simple.sh

Benchmark

Run the benchmark against your triplestore:

k6 run \
  -e SPARQL_ENDPOINT=https://example.com/query \
  -e SPARQL_USERNAME=user \
  -e SPARQL_PASSWORD=secret \
  lindas-benchmark.js

If you want to run the benchmark on a subset of the queries (useful for checking that the benchmark can reach your endpoint as expected), you can add these parameters:

  • -e START=0: The index of the first query to run (default: 0)
  • -e END=801: The index of the last query to run (default: 801)

The index starts at 0.
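As a sketch of how such a START/END range selects a slice of the query list (an assumption here: END is treated as inclusive, which matches the 0 and 801 defaults; the mock query-files.json below is invented for the demo), you can preview the covered files with jq before launching a run:

```shell
# Hypothetical preview of a START/END range, using a mock query-files.json.
# Assumption: END is inclusive, hence END + 1 in the jq slice.
cat > /tmp/demo-range.json <<'EOF'
["q0.rq", "q1.rq", "q2.rq", "q3.rq", "q4.rq"]
EOF
START=1
END=3
jq -r ".[$START:$((END + 1))][]" /tmp/demo-range.json
```

A subset run itself would then just add the corresponding flags, e.g. `-e START=1 -e END=3`, to the k6 command above.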

The query timeout is set to 2 minutes 30 seconds.

For each query, 10 virtual users run it against the triplestore as many times as they can within 120 seconds.

The results are stored in a file ./results/summary-benchmark-YYYY-MM-DDTHH-MM-SS.json.

To inspect the results in a human-readable format, run the following command (jq is required):

./scripts/summary-benchmark.sh ./results/summary-benchmark-YYYY-MM-DDTHH-MM-SS.json

updating the path to point at the JSON file you want to inspect.

You can analyse the results in more detail by running the Jupyter notebook benchmark-analysis.ipynb. Update the path to the results file at the beginning of the notebook, then run it.