API

OCRD XML API

This document describes an application programming interface (API) to the input and output format used by processes within the OCR-D project. The format uses METS as the container and for descriptive metadata, and PAGE XML for the content.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Conventions
API
Glossary
- Processor

Input can be either a single METS XML file or a ZIP container holding a single mets.xml plus the files it references.
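The two input forms can be told apart mechanically. The following is a minimal sketch in Python, not part of the API: the function name `read_mets_bytes` is hypothetical, and accepting mets.xml at any depth inside the ZIP is an assumption, since the text only requires that there be exactly one.

```python
import zipfile
from pathlib import Path


def read_mets_bytes(input_path):
    """Return the raw bytes of the METS document behind `input_path`.

    `input_path` is either a METS XML file or a ZIP container that
    holds exactly one entry named mets.xml.
    """
    path = Path(input_path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as container:
            # Assumption: mets.xml may sit at any depth in the container.
            candidates = [
                name for name in container.namelist()
                if name.rsplit("/", 1)[-1] == "mets.xml"
            ]
            if len(candidates) != 1:
                raise ValueError(
                    f"expected exactly one mets.xml, found {len(candidates)}"
                )
            return container.read(candidates[0])
    # Not a ZIP: treat the path as the METS XML file itself.
    return path.read_bytes()
```

A caller would pass the returned bytes on to whatever parses the METS document.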

Conventions

fileGrp `USE` attribute

#9 #7

A METS file can have 1..n <fileGrp> elements. Their USE attribute MUST be one of

`@USE`	Type of use for OCR-D
`OCR-D-IMG`	The unmanipulated source images
`OCR-D-IMG-BIN`	Black-and-White images
`OCR-D-IMG-GRAY`	Gray images
`OCR-D-IMG-CROP`	Cropped images
`OCR-D-IMG-DESKEW`	Deskewed images
`OCR-D-IMG-DESPECK`	Despeckled images
`OCR-D-IMG-DEWARP`	Dewarped images
`OCR-D-SEG-PAGE`	Page segmentation
`OCR-D-SEG-BLOCK`	Block segmentation
`OCR-D-SEG-LINE`	Line segmentation
`OCR-D-OCR-TESS3`	Tesseract 3.04 OCR
`OCR-D-OCR-TESS4`	Tesseract 4.00 OCR
`OCR-D-OCR-ANY`	AnyOCR
`OCR-D-COR-CIS`	CIS post-correction
`OCR-D-COR-ASV`	ASV post-correction
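A check of this convention could look like the sketch below. It assumes the standard METS namespace `http://www.loc.gov/METS/`; the function name `invalid_file_group_uses` is hypothetical and not part of the API described here.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# The USE values from the table above.
ALLOWED_USE = frozenset({
    "OCR-D-IMG", "OCR-D-IMG-BIN", "OCR-D-IMG-GRAY", "OCR-D-IMG-CROP",
    "OCR-D-IMG-DESKEW", "OCR-D-IMG-DESPECK", "OCR-D-IMG-DEWARP",
    "OCR-D-SEG-PAGE", "OCR-D-SEG-BLOCK", "OCR-D-SEG-LINE",
    "OCR-D-OCR-TESS3", "OCR-D-OCR-TESS4", "OCR-D-OCR-ANY",
    "OCR-D-COR-CIS", "OCR-D-COR-ASV",
})


def invalid_file_group_uses(mets_xml):
    """Return the USE values of all fileGrp elements that are not allowed.

    A fileGrp without a USE attribute is reported as None.
    """
    root = ET.fromstring(mets_xml)
    uses = [grp.get("USE") for grp in root.iter(f"{{{METS_NS}}}fileGrp")]
    return [use for use in uses if use not in ALLOWED_USE]
```

An empty result means every file group in the document follows the convention.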

Generated file `ID` attributes

The ID of each file produced SHOULD be <USE>_<INDEX>, where <USE> is the USE of the surrounding <mets:fileGrp> and <INDEX> is the zero-padded four-digit index of the file within the group. This way, file IDs are unique within the document.

Example:

<mets:fileGrp USE="OCR-D-SEG-LINE">
  <mets:file ID="OCR-D-SEG-LINE_0001">[...]</mets:file>
</mets:fileGrp>
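Generating such an ID is a one-line formatting step. The sketch below uses a hypothetical helper name, `make_file_id`. Whether counting starts at 0 or 1 is not stated above, so the helper takes the index as given.

```python
def make_file_id(use, index):
    """Build a file ID of the form <USE>_<INDEX> with a four-digit index."""
    if not 0 <= index <= 9999:
        raise ValueError("index must fit into four digits")
    # :04d pads with leading zeros to a width of four.
    return f"{use}_{index:04d}"
```

For instance, `make_file_id("OCR-D-SEG-LINE", 1)` yields the ID used in the example above.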

One PAGE XML document per document page

A single PAGE XML file represents one page in the original document.

Every <pc:Page> element MUST have an attribute image which MUST always be the source image.

The PAGE XML root element <pc:PcGts> MUST have exactly one <pc:Page>.
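The two requirements above can be checked without committing to a particular PAGE schema version. The sketch below compares local element names only. `check_page_document` is a hypothetical name, and the attribute name defaults to `image` as written above.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def check_page_document(page_xml, image_attr="image"):
    """Return a list of violations of the PAGE conventions above."""
    root = ET.fromstring(page_xml)
    problems = []
    if local_name(root.tag) != "PcGts":
        problems.append("root element is not PcGts")
    # Only direct children of the root count as pages of this document.
    pages = [child for child in root if local_name(child.tag) == "Page"]
    if len(pages) != 1:
        problems.append(f"expected exactly one Page, found {len(pages)}")
    for page in pages:
        if not page.get(image_attr):
            problems.append(f"Page lacks the {image_attr} attribute")
    return problems
```

An empty list means the document satisfies both rules.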

Images and coordinates

Coordinates are always absolute, i.e. relative to the extent defined by the imageWidth/imageHeight attributes of the nearest <pc:Page>.

When a processor wants to access the image of a layout element such as a TextRegion or TextLine, it should proceed as follows:

- If the element in question has an attribute imageFilename, resolve this value.
- Otherwise, if the element has a <pc:Coords> subelement, resolve by passing the attribute imageFilename of the nearest <pc:Page> and the points attribute of the <pc:Coords> element.
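The lookup order above can be sketched as a small decision function. It assumes the usual PAGE `points` syntax of space-separated `x,y` pairs and reduces the polygon to its bounding box, which is a simplification. All function names are hypothetical.

```python
def parse_points(points):
    """Parse a points string such as "10,20 110,20 110,60 10,60"."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def bounding_box(points):
    """Return (left, top, right, bottom) enclosing all points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def locate_element_image(element_image, coords_points, page_image):
    """Decide which image to load for a layout element.

    Returns (image reference, bounding box or None). A box of None means
    the referenced image already shows just the element.
    """
    if element_image:
        # Step 1: the element carries its own imageFilename.
        return element_image, None
    if coords_points:
        # Step 2: fall back to the page image plus the element's coordinates.
        return page_image, bounding_box(parse_points(coords_points))
    raise ValueError("element has neither imageFilename nor Coords")
```

The returned pair corresponds to the arguments of the one-argument and two-argument forms of `resolveImage`, respectively.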

API

📦TODO📦 https://github.com/PRImA-Research-Lab/prima-core-libs and its apidocs.

`Resolver`

📦TODO📦 Describe

- Data Repository
- backend for the transparency in handling input and output
- cutting out images
- etc.

`new Ocrd.Resolver()`

Creates a resolver and configures it, e.g. with the ZIP container in which file URLs should be resolved.

`OcrdPage resolvePage(String url)`

Resolve a URL to an OcrdPage.

`OcrdMets resolveMets(String url)`

Resolve a URL to an OcrdMets.

`OcrdImage resolveImage(String url)`

Resolve a URL to an OcrdImage.

`OcrdImage resolveImage(String url, OcrdCoords coords)`

Resolve a URL to an image, then crop it to the coordinates provided.
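The relationship between the two `resolveImage` forms can be illustrated as follows. This is only a sketch under stated assumptions: the image is modelled as a plain list of pixel rows, the loader is injected by the caller, and the crop box is `(left, top, right, bottom)` with exclusive right and bottom edges. None of this is prescribed by the API.

```python
def crop(pixels, box):
    """Crop a row-major raster (a list of rows) to the given box."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


def resolve_image(url, load, coords=None):
    """Load the image behind `url`; crop it if `coords` is given.

    `load` is a caller-supplied function mapping a URL to a raster,
    standing in for whatever the resolver does internally.
    """
    image = load(url)
    if coords is None:
        return image
    return crop(image, coords)
```

Passing the loader in keeps the sketch independent of any imaging library. A real resolver would more likely hide the loading step.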

`OcrdMets`

Represents the METS file as used for input and output of the processors.

`List<OcrdPage> listInputPages()`

If the fileGrp with USE="INPUT" contains files with mimetype="text/xml", parse each of them as an OcrdPage and list them.

Otherwise, if the fileGrp with USE="INPUT" contains files with mimetype="image/*", generate an empty PAGE XML document for each image by

- creating a <pc:PcGts> element and, inside it,
- an empty <pc:Page> element with image="<URL>".
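Generating the empty PAGE XML for one image could look like the sketch below. The PAGE namespace version (2013-07-15) is an assumption, since the text does not name a schema version. `empty_page_for_image` is a hypothetical helper, and the attribute is called `image` as in the text above.

```python
import xml.etree.ElementTree as ET

# Assumption: the 2013-07-15 PAGE content schema; adjust as needed.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"


def empty_page_for_image(image_url):
    """Return PAGE XML with one empty Page that points at `image_url`."""
    # Serialize with the pc: prefix used throughout this document.
    ET.register_namespace("pc", PAGE_NS)
    root = ET.Element(f"{{{PAGE_NS}}}PcGts")
    ET.SubElement(root, f"{{{PAGE_NS}}}Page", {"image": image_url})
    return ET.tostring(root, encoding="unicode")
```

The result contains exactly one `pc:Page`, in line with the one-page-per-document convention.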

`listVariants`

📦TODO📦 Wrong here

Lists all variants, i.e. nested METS files used as INPUT. In the common case that there is no nesting, this will return just one variant with all the files listed in INPUT.

`OcrdPage getInputPage(i)`

`List<OcrdPage> listOutputs()`

`addOutput(OcrdPage page)`

`OcrdPage`

Instances should be generated by the resolver.

`Image getImage()`

`Image getAlternativeImage(type)`

`TextRegion`

`Image getImage()`

`TextLine`

`Image getAlternativeImage(type)`

Glossary

Processor

A processor is a tool that accepts METS/PAGE input and produces METS/PAGE output according to this spec.
