Commit

Refactoring
gremid committed Sep 27, 2024
1 parent feac542 commit 9e0457e
Showing 19 changed files with 678 additions and 479 deletions.
6 changes: 6 additions & 0 deletions .flake8
@@ -0,0 +1,6 @@
+[flake8]
+max-line-length = 80
+extend-select = B950
+extend-ignore = E203,E501,E701
+per-file-ignores =
+    quaxa/__init__.py:F401
1 change: 0 additions & 1 deletion .github/FUNDING.yml

This file was deleted.

33 changes: 0 additions & 33 deletions .github/workflows/syntax-and-unit-tests.yml

This file was deleted.

25 changes: 25 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,25 @@
+name: Python application
+
+on: [push]
+
+jobs:
+  build:
+    strategy:
+      matrix:
+        platform: [windows-latest, macos-latest, ubuntu-latest]
+
+    runs-on: ${{ matrix.platform }}
+
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.x"
+      - name: Install dependencies
+        run: |
+          pip install -U pip
+          pip install -r requirements-dev.txt
+          pip install .
+      - name: Run unit tests
+        run: |
+          python -m unittest
1 change: 1 addition & 0 deletions .gitignore
@@ -108,3 +108,4 @@ venv.bak/
 .vscode
 profile/data*
 .theia
+*.temp
27 changes: 27 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,27 @@
+# See https://pre-commit.com for more information
+# See https://pre-commit.com/hooks.html for more hooks
+repos:
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v3.2.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+  - repo: https://github.com/psf/black
+    rev: 22.10.0
+    hooks:
+      - id: black
+  - repo: https://github.com/PyCQA/flake8
+    rev: 7.0.0
+    hooks:
+      - id: flake8
+        additional_dependencies: [flake8-bugbear]
+  - repo: https://github.com/pycqa/isort
+    rev: 5.12.0
+    hooks:
+      - id: isort
+  - repo: https://github.com/pre-commit/mirrors-mypy
+    rev: v1.8.0
+    hooks:
+      - id: mypy
2 changes: 1 addition & 1 deletion .zenodo.json
@@ -18,4 +18,4 @@
 "good example extractor",
 "German"
 ]
-}
+}
2 changes: 1 addition & 1 deletion LICENSE
@@ -198,4 +198,4 @@
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
-limitations under the License.
+limitations under the License.
3 changes: 0 additions & 3 deletions MANIFEST.in

This file was deleted.

8 changes: 4 additions & 4 deletions README.md
@@ -25,7 +25,7 @@ Wenn 1 Knock-out Kriterium identifiziert wird, dann wird direkt der Score direkt
 | `has_blacklist_words` | bool | Satzbeleg enthält Wörter, sodass in keinem Fall der Satzbeleg als Wörterbuchbeispiel in Betracht gezogen wird; ausgenommen das Blacklist-Wort ist selbt der Wörterbucheintrag. (dt. Blacklist ist voreingestellt) | [1] GDEX blacklist |

 ### Diskontierungsfakoren
-Je Kriterium wird ein Faktor berechnet, und alle Faktoren miteinander multipliziert.
+Je Kriterium wird ein Faktor berechnet, und alle Faktoren miteinander multipliziert.
 Wenn bspw. ein Faktor eine Penality von 0.1 bekommt, dann ist der Faktor 0.9.
 Für den Gesamtscore wird der Gesamtfaktor mit 0.5 multipliziert.
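
The README lines in this hunk describe the discounting rule: one factor is computed per criterion, a penalty of 0.1 gives a factor of 0.9, all factors are multiplied together, and the combined factor is multiplied by 0.5 for the total score. A minimal sketch of that arithmetic follows. It is an illustration only and not the code in `quaxa/quaxa.py`; the function names `combined_factor` and `discounted_score` are hypothetical, and how knock-out criteria enter the score is not covered because the hunk does not show it.

```python
from typing import Iterable


def combined_factor(penalties: Iterable[float]) -> float:
    """Multiply the per-criterion factors, where each factor is 1 - penalty."""
    total = 1.0
    for penalty in penalties:
        total *= 1.0 - penalty
    return total


def discounted_score(penalties: Iterable[float], weight: float = 0.5) -> float:
    """Scale the combined factor by the weight named in the README (0.5)."""
    return weight * combined_factor(penalties)


if __name__ == "__main__":
    # A single criterion with a penalty of 0.1 yields a factor of about 0.9.
    print(combined_factor([0.1]))
    # Two criteria with penalties 0.1 and 0.2, then scaled by 0.5.
    print(discounted_score([0.1, 0.2]))
```

Because the factors are multiplied, a single large penalty pulls the whole score down, whereas several small penalties compound gradually.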

@@ -79,11 +79,11 @@ pip install -r requirements-dev.txt --no-cache-dir
 Publish

 ```sh
-python setup.py sdist
+python setup.py sdist
 twine upload -r pypi dist/*
 ```

-### Clean up
+### Clean up

 ```sh
 find . -type f -name "*.pyc" | xargs rm
@@ -106,4 +106,4 @@ The "Evidence" project was funded by the Deutsche Forschungsgemeinschaft (DFG, G

 ### Maintenance
 - till 31.Aug.2023 (v0.1.0) the code repository was maintained within the DFG project [433249742](https://gepris.dfg.de/gepris/projekt/433249742)
-- since 01.Sep.2023 (v0.1.0) the code repository is maintained by Ulf Hamster.
+- since 01.Sep.2023 (v0.1.0) the code repository is maintained by Ulf Hamster.
1 change: 1 addition & 0 deletions VERSION
@@ -0,0 +1 @@
+0.1.1
22 changes: 11 additions & 11 deletions demo/demo_quaxa.py
@@ -1,5 +1,7 @@
-import conllu
 import random
+
+import conllu
+
 import quaxa
 import quaxa.reader

@@ -8,21 +10,19 @@

 def demo():
     # read conllu file
-    corpus = conllu.parse(open('demo.conllu', 'r').read())
+    corpus = conllu.parse(open("demo.conllu", "r").read())
     # compute scores for example sentences
     for annot in corpus:
         lemmas_content = [
-            tok.get('lemma') for tok in annot
-            if tok.get('upos') in {'NOUN', 'VERB', 'ADJ'}
+            tok.get("lemma")
+            for tok in annot
+            if tok.get("upos") in {"NOUN", "VERB", "ADJ"}
         ]
-        sent = annot.metadata['text']
+        sent = annot.metadata["text"]
         for headword in lemmas_content:
-            factor = quaxa.total_score(
-                headword=headword, txt=sent, annotation=annot)
-            print((
-                "total_score:"
-                f"{factor: 7.4f} | {headword} | {sent[:50]} ..."))
+            factor = quaxa.total_score(headword=headword, txt=sent, annotation=annot)
+            print(("total_score:" f"{factor: 7.4f} | {headword} | {sent[:50]} ..."))


-if __name__ == '__main__':
+if __name__ == "__main__":
     demo()
34 changes: 34 additions & 0 deletions pyproject.toml
@@ -0,0 +1,34 @@
+[build-system]
+requires = ["setuptools"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "quaxa"
+description = "QUAlity of sentence eXAmples scoring"
+authors = [{name = "Ulf Hamster", email = "[email protected]"}]
+classifiers = [
+    "Development Status :: 1 - Planning",
+    "Intended Audience :: Developers",
+    "Intended Audience :: Science/Research",
+    "License :: OSI Approved :: Apache Software License",
+    "Topic :: Education",
+    "Topic :: Scientific/Engineering",
+    "Topic :: Text Processing :: Linguistic"
+]
+requires-python = ">=3.7"
+dynamic = ["dependencies", "version", "readme"]
+
+[project.urls]
+Homepage = "https://github.com/ulf1/quaxa"
+
+[tool.isort]
+profile = "black"
+
+[tool.setuptools.dynamic]
+dependencies = {file = ["requirements.txt"]}
+version = {file = ["VERSION"]}
+readme = {file = ["README.md"], content-type = "text/markdown"}
+
+[tool.setuptools.packages.find]
+include = ["quaxa*"]
+exclude = ["test*"]
35 changes: 22 additions & 13 deletions quaxa/__init__.py
@@ -1,19 +1,28 @@
-__version__ = '0.1.1'
+from pathlib import Path

+__version__ = (Path(__file__) / ".." / ".." / "VERSION").resolve().read_text().strip()
+
 from .quaxa import (
-    total_score,
-    isa_knockout_criteria,
+    BLACKLIST_WORDS_DE,
+    DEFAULT_SPACE_DEIXIS_TERMS,
+    DEFAULT_TIME_DEIXIS_TERMS,
+    ORD_RARE_CHARS_DE,
+    ORDS_QWERTZ_DE,
+    QWERTZ_DE,
+    RARE_CHARS_DE,
+    deixis_person,
+    deixis_space,
+    deixis_time,
     factor_gradual_criteria,
-    has_finite_verb_and_subject,
-    is_misparsed,
-    has_illegal_chars,
-    has_blacklist_words, BLACKLIST_WORDS_DE,
-    factor_rarechars, RARE_CHARS_DE, ORD_RARE_CHARS_DE,
-    factor_notkeyboardchar, QWERTZ_DE, ORDS_QWERTZ_DE,
     factor_graylist_words,
     factor_named_entity,
-    deixis_space, DEFAULT_SPACE_DEIXIS_TERMS,
-    deixis_time, DEFAULT_TIME_DEIXIS_TERMS,
-    deixis_person,
-    optimal_interval
+    factor_notkeyboardchar,
+    factor_rarechars,
+    has_blacklist_words,
+    has_finite_verb_and_subject,
+    has_illegal_chars,
+    is_misparsed,
+    isa_knockout_criteria,
+    optimal_interval,
+    total_score,
 )