Installation

run:

pip3 install . --user

esmarc.py

esmarc is a python3 tool to read line-delimited MARC21 JSON from an elasticSearch index, perform a mapping and writes the output in a directory with a file for each mapping type.

dependencies: python3-elasticsearch efre-lod-elasticsearch-tools

run:

$ esmarc.py <OPTARG>
	-h, --help            show this help message and exit
	-host HOST            hostname or IP-Address of the ElasticSearch-node to use. If None we try to read ldj from stdin.
	-port PORT            Port of the ElasticSearch-node to use, default is 9200.
	-type TYPE            ElasticSearch Type to use
	-index INDEX          ElasticSearch Index to use
	-id ID                map single document, given by id
	-help                 print this help
	-prefix PREFIX        Prefix to use for output data
	-debug                Dump processed Records to stdout (mostly used for debug-purposes)
	-server SERVER        use http://host:port/index/type/id?pretty syntax. overwrites host/port/index/id/pretty.
	-pretty               output tabbed json
	-w W                  how many processes to use
	-idfile IDFILE        path to a file with IDs to process
	-query QUERY          prefilter the data based on an elasticsearch-query

entityfacts-bot.py

entityfacts-bot.py is a Python3 program that enrichs ("links") your data with more identifiers from entitiyfacts. Prerequisits is that you have a field containing your GND-Identifier.

It connects to an elasticsearch node and outputs the enriched data, which can be put back to the index using esbulk.

Usage

./entityfacts-bot.py
    -h, --help            show this help message and exit
    -host HOST            hostname or IP-Address of the ElasticSearch-node to use, default is localhost.
    -port PORT            Port of the ElasticSearch-node to use, default is 9200.
    -index INDEX          ElasticSearch Search Index to use
    -type TYPE            ElasticSearch Search Index Type to use
    -id ID                retrieve single document (optional)
    -searchserver SEARCHSERVER use http://host:port/index/type/id?pretty. overwrites host/port/index/id/pretty
    -stdin                get data from stdin
    -pipeline             output every record (even if not enriched) to put this script into a pipeline

Requirements

python3-elasticsearch

e.g. (ubuntu)

sudo apt-get install python3-elasticsearch

wikidata.py

wikidata.py is a Python3 program that enrichs ("links") your data with the wikidata-identifier from wikidata. Prerequisits is that you have a field containing your GND-Identifier. Other identifiers are planned to be used in future.

It connects to an elasticsearch node and outputs the enriched data, which can be put back to the index using esbulk.

Usage

./wikidata.py
    -h, --help      show this help message and exit
    -host HOST      hostname or IP-Address of the ElasticSearch-node to use, default is localhost.
    -port PORT      Port of the ElasticSearch-node to use, default is 9200.
    -index INDEX    ElasticSearch Search Index to use
    -type TYPE      ElasticSearch Search Index Type to use
    -id ID          retrieve single document (optional)
    -stdin          get data from stdin
    -pipeline       output every record (even if not enriched) to put this script into a pipeline
    -server SERVER  use http://host:port/index/type/id?pretty. overwrites host/port/index/id/pretty

Name		Name	Last commit message	Last commit date
Latest commit History 263 Commits
conf		conf
enrichment		enrichment
esmarc		esmarc
mapping		mapping
tests		tests
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Installation

esmarc.py

entityfacts-bot.py

Usage

Requirements

wikidata.py

Usage

About

Releases

Packages

Contributors 3

Languages

License

slub/esmarc

Folders and files

Latest commit

History

Repository files navigation

Installation

esmarc.py

entityfacts-bot.py

Usage

Requirements

wikidata.py

Usage

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages