- `docdb_create()` and `docdb_update()` for SQLite and PostgreSQL (only if on localhost) now import directly and quickly from `ndjson` files, in analogy to DuckDB (needs RSQLite >= 2.3.7.9014)
- Refactored `docdb_update()` for `src_couchdb()`
- Add message from `docdb_create()` if a data frame has column names with dots, since dots in `nodbi` are used for `JSON` dot paths
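
Because dots in field names are read as JSON dot paths, one way to avoid the message is to rename such columns before creating a collection. The following is a minimal sketch, not the package's prescribed approach: the data frame `df` and the container name `prices_demo` are hypothetical, and the `nodbi` part only runs if `nodbi` and `RSQLite` happen to be installed.

```r
# Hypothetical data frame whose column names contain dots
df <- data.frame(
  `unit.price` = c(1.5, 2.0),
  `item.name` = c("a", "b"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Replace dots so that column names cannot be mistaken for dot paths
names(df) <- gsub(".", "_", names(df), fixed = TRUE)

# Optional part, assuming nodbi and RSQLite are available
if (requireNamespace("nodbi", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  src <- nodbi::src_sqlite()                    # in-memory SQLite database
  nodbi::docdb_create(src, "prices_demo", df)   # no dots left in the names
  nodbi::docdb_delete(src, "prices_demo")       # clean up the demo container
}
```

Using underscores is only one possible convention; any separator other than a dot keeps the field at the top level of the stored document.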
- uses new features of `duckdb` 1.11.0 for refactoring of `docdb_query()`, accelerating queries
- accelerated creating and updating from file
- partial refactoring of `docdb_query()`, accelerating queries up to 20-fold for SQLite and DuckDB, and accelerating `listfields = TRUE` several times for DuckDB
- address `docdb_query()` not working for cases when dot paths had no counts between fields
- address wrong database size printing
- stop if query is invalid even though JSON is valid
- print information also for MongoDB connection object
- code cleaning, parameters checking
- document that `$regex` in `docdb_query()` is case-sensitive
- re-adding field formatting for `docdb_query(src, key, query, listfields = TRUE, limit = <integer>)`
- minor fixes to `limit` in `docdb_query(src, key, query, listfields = TRUE, limit = <integer>)` and speed up
- added vignette
- added tests internal functions, verbose option
- added caching to GitHub action workflow
- added missing fields validity check for duckdb
- more robust parameter checks in `docdb_query` and `docdb_update`
- ensure `NULL` also for all MongoDB returns
- docTyp'ed src.R
- minify `JSON` with Elasticsearch in `docdb_update`
- moved local variable out of UseMethod in `docdb_query`
- make `docdb_get()` work again for `src_sqlite()` by casting `JSONB` back to `JSON`
- empty parameter `query` now triggers a warning as it should be a valid JSON string; change `query = ""` into `query = "{}"`
- adapted to use new, faster `JSONB` functions in `SQLite` 3.45.0 (`RSQLite` >= 2.3.4.9005)
- refactored parts of `docdb_create()` to speed up handling large data frames and lists
- made Elasticsearch immediately refresh the index after `docdb_create()` and other functions
- `docdb_update()` now reports which records failed to update and then continues
- `docdb_delete()` now returns harmonised success logical value across backends
- `docdb_query()` reimplementation to have the same functionality across all databases (DuckDB, SQLite, PostgreSQL, MongoDB, Elasticsearch, CouchDB); even though the API and unit tests remained, user provisions may break, e.g. to handle return values of databases that previously were incompletely implemented (in particular Elasticsearch and CouchDB). Details:
  - `query` can now be complex (nested, various operators) and is digested with a JavaScript helper
  - `fields` can now be nested fields (e.g., `friends.name`) to directly return values lifted from the nested field
  - `listfields` parameter newly implemented to return dot paths for all fields of all or selected documents in collection
  - expanded use of `jq` via `jqr` for mangling parameters, selecting documents, filtering fields and lifting nested field values
  - if no data are found, returns `NULL` (previously some backends returned an empty data frame)
  - `docdb_query(src, key, query = "{}", fields = "{}")` now delegates to `docdb_get(src, key)`
  - `_id` is always returned, unless specified with `"_id": 0` in parameter `fields`
  - for `src_postgres`, only fewer than 50 fields, if any, can be specified in `fields`
  - for `src_sqlite`, minimise the use of the time-costly `json_tree`
- workaround for path collisions of MongoDB
- some acceleration of `docdb_query()`
- factored out common code
- expanded testing
- updated docs
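
The reimplemented `docdb_query()` described above (complex `query`, nested `fields`, `listfields`, and suppressing `_id`) can be illustrated roughly as follows. This is a sketch under assumptions: it uses an in-memory SQLite backend and the `contacts` dataset shipped with the package, the container name `contacts_demo` is hypothetical, and the code only runs if `nodbi` and `RSQLite` are installed. Results are not shown because they depend on the installed versions.

```r
# Sketch only; assumes nodbi and RSQLite are installed
if (requireNamespace("nodbi", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {

  src <- nodbi::src_sqlite()            # in-memory SQLite database
  key <- "contacts_demo"                # hypothetical container name
  nodbi::docdb_create(src, key, nodbi::contacts)

  # Complex query with several operators; a nested field is
  # requested with a dot path in parameter fields
  res <- nodbi::docdb_query(
    src, key,
    query  = '{"$and": [{"age": {"$gte": 20}}, {"isActive": true}]}',
    fields = '{"name": 1, "friends.name": 1}'
  )

  # Dot paths of all fields of all documents in the collection
  paths <- nodbi::docdb_query(src, key, query = "{}", listfields = TRUE)

  # _id is returned by default; exclude it explicitly with "_id": 0
  no_id <- nodbi::docdb_query(
    src, key,
    query  = "{}",
    fields = '{"name": 1, "_id": 0}'
  )

  nodbi::docdb_delete(src, key)         # clean up the demo container
}
```

Since a query without matches returns `NULL` rather than an empty data frame, calling code may want to check `is.null(res)` before working with the result.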
- escaping newline character within a JSON value, in `docdb_*()` functions
- changed `docdb_update()` to directly use NDJSON from file for duckdb
- cleaned up unnecessary code in `docdb_create()`
- no more using transactions with `src_duckdb()`
- regression error from not specifying top-level jq script
- corrected and improved field selection in `docdb_query()`
- corrected test exceptions for mongodb, updated GitHub Actions, expanded tests
- corrected marginal case in `docdb_query.src_duckdb()`
- corrected minimum R version
- replaced in tests `httpbin` with `webfakes`
- removed explicit UTF-8 encoding reference
- speed up in `docdb_query()`
- switched to v2 GitHub r-lib/action for R CMD check
- replaced a dependency, gained speed
- fix initialisation in `docdb_query()` with `src_duckdb()`
- `docdb_update()` now can do bulk updates when `_id`s are in `value` (for SQLite, DuckDB, PostgreSQL, MongoDB; not yet for CouchDB and Elastic)
- fix tests for `value` parameter to be a file or a URL
- `src_duckdb()` handles when json_type returns NULL for non-existing path
- `src_sqlite()` handles when text includes double quotation marks
- added warning if DuckDB's JSON extension is not available; improve instructions; see also issue #45
- minor simplification of `docdb_exists()` for `src_mongo()`, and of `docdb_query()` for SQL databases
- corrected closing connections to SQL database backends upon session restart
- improved provisions for parallel write access and corresponding tests
- capture marginal case of no rows in `docdb_query()`
- adding support for duckdb (R package version 0.6.0 or higher) as database backend
- suppressed warnings when checking if a string points to a file
- replaced `isa()` as not available with R version 3.x
- refactored `docdb_update.src_couchdb()` to use `jqr`
- adapted `docdb_create` to accept `jsonlite`, `jsonify`, `jqr` JSON
- added details to README
- testing (unset LANG, relocate open code, better cleaning up)
- fixed `docdb_query()` to account for change in SQLite 3.38.3 adding quotation of labels (closes issue #44), test added
- made `docdb_query()` work for PostgreSQL when a string used with the `$in` operator has a comma(s), test added
- `docdb_create()` now supports file names and http urls as argument `value` for importing data
- `docdb_create()` (and thus `docdb_update()`) now supports quantifiers (e.g., '[a-z]{2,3}') in regular expressions
- for SQLite, return `FALSE` like other backends when using `docdb_delete()` for a non-existing container (table, in the case of SQLite)
- better handle special characters and encodings under Windows
- full support for PostgreSQL (using jsonb)
- for SQLite add closing file references also on exit
- for SQLite under Windows ensure handling of special characters (avoiding encoding conversions with file operations that stream out / in NDJSON)
- identical API for `docdb_*()` functions so that `query` and `fields` parameters can be used across database backends
- identical return values across database backends
- re-factored recently added functions for RSQLite
- re-factored most functions to provide identical API
- performance (timing and memory use) profiled and optimised as far as possible
- testing now uses the same test file across databases
- currently, no more support for redis (no way was found to query and update specific documents in a container)
- `docdb_list()` added as function to list containers in database
- Support for complex queries not yet implemented for Elasticsearch
- Only root fields (no subitems) returned by Elasticsearch and CouchDB
- made remaining `docdb_*()` functions return a logical indicating the success of the function (`docdb_create`, `docdb_delete`), or a data frame (`docdb_get`, `docdb_query`), or the number of documents affected by the function (`docdb_update`)
- change testing approach
- `docdb_get()` to not return '_id' field for `src_{sqlite,mongo}` since already used for row names
- `docdb_query.src_sqlite()` now handles JSON objects, returning nested lists (#40)
- `src_sqlite()` now uses transactions for relevant functions (#39)
- `docdb_update.src_mongo()` now returns the number of upserted or matched documents, irrespective of whether they were updated or not
- `docdb_get()` to not return '_id' field for `src_{sqlite,mongo}` since already used for row names
- change of maintainer agreed
- fix for `src_couchdb()`: we were not setting user and password correctly internally, which was causing issues in CouchDB v3 (#35); thanks to @drtagkim for the pull request
- in `docdb_query()` and `docdb_get()`, for sqlite source, use a connection instead of a regular file path to avoid certain errors on Windows (#33), work by @rfhb
- in `docdb_query()` and `docdb_create()`, for sqlite source, fix to handle mixed values of different types (#34), work by @rfhb
- some `Sys.sleep()` calls added to Elasticsearch examples to make sure data is available after creation, and before a data request
- new author Ralf Herold, with contribution of new functions for working with SQLite/json1. New functions: `src_sqlite`, `print.src_sqlite`, `docdb_create.src_sqlite`, `docdb_delete.src_sqlite`, `docdb_exists.src_sqlite`, `docdb_get.src_sqlite`, `docdb_query.src_sqlite`, and `docdb_update.src_sqlite`. Includes new dataset `contacts` (#25) (#27) (#28) (#29) (#30) (#31)
- `docdb_update` gains method for working with MongoDB, via (#27)
- added `.github` files in the source repository to facilitate contributions
- `src_mongo` changes, improved behavior, via (#27)
- `etcd` (via the `etseed` package) integration has been removed from this package as etcd doesn't really fit the main goal of the pkg. Functions now defunct are: `src_etcd`, `docdb_create.src_etcd`, `docdb_delete.src_etcd`, `docdb_exists.src_etcd`, `docdb_get.src_etcd`, and `print.src_etcd` (#26)
- `docdb_get()` gains `limit` parameter to do pagination, for CouchDB, Elasticsearch and MongoDB only (#17) (#23)
- gains function `docdb_query()` to send queries to each backend (#18) (#22)
- gains function `docdb_exists()` to check if a database or equivalent exists (#21) (#22)
- Updated package for new version of `elastic`, which has slightly different setup for connecting to the Elasticsearch instance (#20)
- released to CRAN