Overview

SolrMarc reads MARC records and extracts data from them to build an Apache Solr index. It relies on the Marc4j library to read the MARC records, then uses a user-provided indexing specification to determine which fields to create in the Solr input document and where the data for each field should be extracted from. Finally, it uses the SolrJ library to send the Solr input documents to the Solr index.

As of version 3.0, the program has been completely rewritten, based on code written by Oliver Obenland (see https://github.com/oobenland/SolrMarc-Indexer-Tests).
The key design improvement Oliver created is to essentially compile the indexing specification once, and then apply that "compiled" version to each of the records that need indexing. I have taken his code and added handling of the basic SolrMarc field specification (such as: title_display = 245abnp) via a parser specification (CUP and JFlex), which makes defining and handling more complex specifications simpler.
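
The "compile once, apply to every record" idea can be illustrated with a small, self-contained sketch. This is not SolrMarc's actual implementation: the real program parses specifications with a CUP/JFlex grammar and reads records through Marc4j. The class names below (`CompiledSpecSketch`, `Field`, `Subfield`, `CompiledSpec`) are hypothetical stand-ins, and joining the selected subfields with a single space is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; all types here are hypothetical, not SolrMarc or Marc4j classes.
class CompiledSpecSketch {

    /** Hypothetical stand-in for one MARC subfield (code plus value). */
    static final class Subfield {
        final char code;
        final String value;
        Subfield(char code, String value) { this.code = code; this.value = value; }
    }

    /** Hypothetical stand-in for one MARC data field (tag plus ordered subfields). */
    static final class Field {
        final String tag;
        final List<Subfield> subfields;
        Field(String tag, List<Subfield> subfields) { this.tag = tag; this.subfields = subfields; }
    }

    /** The "compiled" form of one specification line, built once and reused per record. */
    static final class CompiledSpec {
        final String solrField; // name of the Solr field to populate
        final String tag;       // three-character MARC tag, e.g. "245"
        final String codes;     // subfield codes to keep, e.g. "abnp"

        CompiledSpec(String solrField, String tag, String codes) {
            this.solrField = solrField;
            this.tag = tag;
            this.codes = codes;
        }

        /** Extracts one value per matching field; no spec parsing happens here. */
        List<String> apply(List<Field> record) {
            List<String> values = new ArrayList<>();
            for (Field field : record) {
                if (!field.tag.equals(tag)) continue;
                StringBuilder sb = new StringBuilder();
                for (Subfield sf : field.subfields) {
                    // Assumption: an empty code list means "keep every subfield".
                    if (codes.isEmpty() || codes.indexOf(sf.code) >= 0) {
                        if (sb.length() > 0) sb.append(' ');
                        sb.append(sf.value);
                    }
                }
                if (sb.length() > 0) values.add(sb.toString());
            }
            return values;
        }
    }

    /** Parses a line such as "title_display = 245abnp" exactly once. */
    static CompiledSpec compile(String line) {
        String[] parts = line.split("=", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected 'name = spec': " + line);
        }
        String spec = parts[1].trim();
        if (spec.length() < 3) {
            throw new IllegalArgumentException("Expected a three-character tag: " + line);
        }
        return new CompiledSpec(parts[0].trim(), spec.substring(0, 3), spec.substring(3));
    }

    public static void main(String[] args) {
        CompiledSpec spec = compile("title_display = 245abnp"); // parsed once

        List<Field> record = Arrays.asList(
            new Field("100", Arrays.asList(new Subfield('a', "Melville, Herman"))),
            new Field("245", Arrays.asList(
                new Subfield('a', "Moby Dick"),
                new Subfield('b', "or, The whale"),
                new Subfield('c', "Herman Melville"))));

        // In a real run this loop would cover every record in the input file.
        for (List<Field> r : Arrays.asList(record, record)) {
            System.out.println(spec.solrField + " -> " + spec.apply(r));
        }
    }
}
```

The cost of interpreting the specification text is paid in `compile`, so the per-record work in `apply` is limited to matching tags and subfield codes.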

The goal of the design is a program that operates much the same as earlier versions of SolrMarc, including being able to process index specifications that worked with previous versions and produce substantially the same Solr records, while also operating much faster and supporting a richer superset of features in the index specification language.

Included with this project is a Swing-based interactive interface that could eventually be used to develop, modify, extend and debug a set of indexing specifications, but for now it can be used to see how some of the new features will work.

A more in-depth description of the differences in this new version can be found in the Wiki, along with information on how to install the program, how to create an index specification, and how to run the program with that specification.

The Wiki also has some information about the code and design of the program for those who might be interested in contributing to the project.
