# ZBW - digital archive data fetcher and converter

This program downloads data from a variety of CBS databases and converts it to a proprietary CSV schema.

## Getting started and prerequisites

1. Install curl.
2. Install cpanminus.
3. Install Catmandu (sample commands for steps 1-3 are sketched after this list). The necessary dependencies are automatically resolved and installed.
4. Install the Catmandu MARC importer:

   ```
   $ sudo cpanm Catmandu::Importer::MARC
   ```

5. Install the Catmandu CSV exporter:

   ```
   $ sudo cpanm Catmandu::Exporter::CSV
   ```
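Steps 1-3 depend on your platform's package manager. A minimal sketch for a Debian-based system (the apt package names and the use of sudo are assumptions; adjust for your distribution):

```
# Install curl and cpanminus from the system repositories
# (assumed Debian/Ubuntu package names).
$ sudo apt-get install curl cpanminus

# Install Catmandu from CPAN; cpanm resolves and installs
# its dependencies automatically.
$ sudo cpanm Catmandu
```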

## Usage

First, create a file containing a list of PPNs, one per line. A PPN (PICA production number) is a unique record identifier for bibliographic records in a CBS database.
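Such a file might look like this (the filename ppns.txt and the PPN values are illustrative placeholders, not real records):

```
$ cat ppns.txt
896208737
1042309507
726169218
```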

The program expects three arguments when invoked:

```
./da_fetch.sh [filename] [target] [ISIL]
```

If no target is specified, the data is extracted from owc-de-206.

The following targets are available:

- owc-de-206 (the ZBW working catalogue)
- k10plus (database 1.1)
- ebooks (database 1.2)
- nl-monographien (database 1.50)
- nl-zeitschriften (database 1.55)

If no ISIL is specified, the `identifier.packageid` column remains empty.
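A typical invocation might look like this (the input file ppns.txt and the ISIL DE-206 are illustrative values):

```
$ ./da_fetch.sh ppns.txt k10plus DE-206
```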

The records are then downloaded and converted using Catmandu's ETL engine.

The PPN file is automatically moved to a separate directory, archive/ppns.

A CSV file named records-[filename].csv is created that stores the converted records; the column separator is a tab (\t). This file is moved to archive/records.
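After a run with the illustrative ppns.txt from above, the archive directory would look roughly like this (a sketch, assuming [filename] keeps the full original file name):

```
archive/
├── ppns/ppns.txt                  # the processed PPN list
└── records/records-ppns.txt.csv   # the converted, tab-separated records
```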

To adjust the mapping, simply edit the mapping file da_fetch_mapping.fix, which is written in Catmandu's Fix language.
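As an illustration of what a mapping rule can look like (a generic Catmandu Fix sketch, not the actual contents of da_fetch_mapping.fix):

```
# Copy MARC field 245, subfield a (title proper) to a "title" key.
marc_map('245a', title)

# Drop the internal record identifier from the output.
remove_field(_id)
```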

## Authors

- Felix Hemme - Initial work - ZBW
- Luisa Kramer-Ibig - Extensions within the scope of the SAVE project - ZBW

## License

This project is licensed under the MIT License - see the LICENSE file for details.