parallel-corpus

Parallel corpus as a graph.

Ported from Graph in spraakbanken/swell-editor.

Install

To install parallel-corpus in the current environment:

pip install parallel-corpus

To add parallel-corpus to a PDM project:

pdm add parallel-corpus

To add parallel-corpus manually to pyproject.toml:

[project]
dependencies = ["parallel-corpus>=0.1.2"]

Usage

from parallel_corpus import graph

first = "Jonathan saknades ."

# Initialize graph with source and target equal.
g = graph.init(first)

second = "Jonat han saknades ."

# Update target with new text.
gm = graph.set_target(g, second)

# The graph will now contain an edge from 'Jonathan' to both 'Jonat' and 'han'.
print(f"{gm.edges=}")
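
To make the idea of an edge concrete, here is a minimal standalone sketch of token alignment. It does not use parallel-corpus and is not the library's algorithm or data model: it leans on the standard library's difflib, and the function name align_tokens and the ids "s0", "t0", and so on are hypothetical names chosen for illustration. Unchanged tokens get a one-to-one edge, while a changed region is grouped into a single edge.

```python
from difflib import SequenceMatcher


def align_tokens(source, target):
    """Return a list of edges; each edge is a list of token ids.

    Illustrative only: ids are "s<index>" for source tokens and
    "t<index>" for target tokens (hypothetical naming).
    """
    src = source.split()
    tgt = target.split()
    edges = []
    matcher = SequenceMatcher(None, src, tgt, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged tokens are linked one-to-one.
            for i, j in zip(range(i1, i2), range(j1, j2)):
                edges.append([f"s{i}", f"t{j}"])
        else:
            # Replaced, inserted, or deleted tokens share a single edge.
            ids = [f"s{i}" for i in range(i1, i2)]
            ids += [f"t{j}" for j in range(j1, j2)]
            edges.append(ids)
    return edges


edges = align_tokens("Jonathan saknades .", "Jonat han saknades .")
print(edges)
# [['s0', 't0', 't1'], ['s1', 't2'], ['s2', 't3']]
```

The first edge links 'Jonathan' (s0) to both 'Jonat' (t0) and 'han' (t1), which mirrors the situation described in the usage example above.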

Changelog

This project keeps a changelog.

Supported Python Versions

This library strives to support the following versions:

  • v0.2: Python 3.9
  • v0.1: Python 3.8

Development

This project uses conventional commits.

Tools used:

  • uv for project management.
  • pre-commit for pre-commit checking
    • runs ruff linter
    • runs ruff formatter
    • checks that the commit message follows conventional commits.
    • install hooks with pre-commit install.
  • git-cliff for changelog updates.
  • bump-my-version for version bumping.
  • syrupy for snapshot testing.