Adapt to PYPI rendering
rexruan committed Nov 15, 2023
1 parent 9cac8e2 commit 696918f
Showing 4 changed files with 36 additions and 40 deletions.
2 changes: 1 addition & 1 deletion LICENSE → LICENSE.md
@@ -1,4 +1,4 @@
-MIT License
+# MIT License

Copyright (c) 2021 bmegyesi

70 changes: 33 additions & 37 deletions README.md
@@ -15,66 +15,64 @@ export SWEGRAM_WORKSPACE=$(pwd)

Before installation, it is strongly recommended to use a virtual environment
```bash
# Create virtual environment (Highly recommended)
python3 -m venv venv
source venv/bin/activate
```

```bash
# Install swegram package
pip install swegram --upgrade

# Build dependencies
swegram-build

# Export pythonpath
export PYTHONPATH="$PYTHONPATH:$(pwd):$(pwd)/tools/efselab"
```
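
The `export PYTHONPATH` step above adds the working directory and its `tools/efselab` subdirectory to Python's module search path. As a rough illustration only (this is not part of swegram), a similar effect inside a running interpreter could be sketched as follows. The helper name `extend_search_path` is hypothetical, and the directory layout is assumed from the export line.

```python
import sys
from pathlib import Path
from typing import List


def extend_search_path(workspace: Path) -> List[str]:
    """Append the workspace and workspace/tools/efselab to sys.path.

    Hypothetical helper for illustration; returns the entries that were added.
    """
    added = []
    for path in (workspace, workspace / "tools" / "efselab"):
        entry = str(path)
        # Skip entries that are already on the search path.
        if entry not in sys.path:
            sys.path.append(entry)
            added.append(entry)
    return added


if __name__ == "__main__":
    # Mirrors "$(pwd)" and "$(pwd)/tools/efselab" from the export line.
    print(extend_search_path(Path.cwd()))
```

One difference worth noting: entries from `PYTHONPATH` are placed ahead of the default site directories, whereas this sketch appends to the end of `sys.path`.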

Check the usage of swegram cli

* swegram -h

```console
(venv) ➜ swegram -h
usage: SWGRAM 1.0 [-h] -l {en,sv} -i INPUT_PATH [-o OUTPUT_DIR] [--output-format {txt,xlsx,json,csv}] {annotate,statistic} ...

Swegram command line interface description

positional arguments:
{annotate,statistic} Swegram subparser
annotate Annotation parser help
statistic Statistic parser help
{annotate,statistic} Swegram subparser
annotate Annotation parser help
statistic Statistic parser help
```

```
optional arguments:
-h, --help show this help message and exit
-l {en,sv}, --language {en,sv}
choose the language for annotation
-i INPUT_PATH, --input-path INPUT_PATH
The input path to files/directory where working files are stored
-o OUTPUT_DIR, --output-dir OUTPUT_DIR
The output directory where working files are stored
--save-as {txt,xlsx,json}
The output format
-h, --help show this help message and exit
-l {en,sv}, --language {en,sv} choose the language for annotation
-i INPUT_PATH, --input-path INPUT_PATH The input path to files/directory where working files are stored
-o OUTPUT_DIR, --output-dir OUTPUT_DIR The output directory where working files are stored
--save-as {txt,xlsx,json} The output format
```

swegram annotate -h
--normalize Process spelling checker after tokenization and normalized tokens will be used for upcoming annotation actions.
--tokenize Process sentence segmentation and tokenization.
--tag Process part-of-speech tagging.
--parse Process syntactic dependency parsing.
--aggregate Aggregate all annotated texts into one file.

```
--normalize Process spelling checker after tokenization and normalized tokens will be used for upcoming annotation actions.
--tokenize Process sentence segmentation and tokenization.
--tag Process part-of-speech tagging.
--parse Process syntactic dependency parsing.
--aggregate Aggregate all annotated texts into one file.
```

swegram statistic -h
--include-metadata Include certain texts by selecting metadata. For instance, "--include-metadata key1 key2:value2" only selects the texts that contain key1 or key2:value2 in the metadata
-- exclude-metadata Exclude certain texts by deselecting metadata
-u --units Checking statistics of features given certain linguistic unit(s). The following units are valid to be chosen: corpus, text, paragraph, sentence
--aspects Checking statistics on the basis of selection of certain aspect(s). The following aspects are valid to be chosen: general, readability, morph, lexical, syntactic
--include-features Only certain features will be included
--exclude-features Certain features will be excluded
--print Flag to print the result on console
```console
--include-metadata Include certain texts by selecting metadata. For instance, "--include-metadata key1 key2:value2" only selects the texts that contain key1 or key2:value2 in the metadata
-- exclude-metadata Exclude certain texts by deselecting metadata
-u --units Checking statistics of features given certain linguistic unit(s). The following units are valid to be chosen: corpus, text, paragraph, sentence
--aspects Checking statistics on the basis of selection of certain aspect(s). The following aspects are valid to be chosen: general, readability, morph, lexical, syntactic
--include-features Only certain features will be included
--exclude-features Certain features will be excluded
--print Flag to print the result on console
```
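
The `--include-metadata` description above can be read as an OR over selectors: a bare key matches any text that carries that key, and `key:value` matches only that exact pair. The sketch below illustrates that reading. It is not swegram's implementation; `matches_metadata` and the dictionary-shaped metadata are assumptions made for the example.

```python
from typing import Dict, List


def matches_metadata(metadata: Dict[str, str], selectors: List[str]) -> bool:
    """Return True if any selector matches the text's metadata (hypothetical helper)."""
    for selector in selectors:
        # "key2:value2" -> ("key2", ":", "value2"); "key1" -> ("key1", "", "")
        key, sep, value = selector.partition(":")
        if key not in metadata:
            continue
        # A bare key matches on presence; key:value also requires an equal value.
        if not sep or metadata[key] == value:
            return True
    return False


if __name__ == "__main__":
    # Made-up metadata for three texts, used only to exercise the helper.
    texts = {
        "a.txt": {"key1": "anything"},
        "b.txt": {"key2": "value2"},
        "c.txt": {"key2": "other"},
    }
    selectors = ["key1", "key2:value2"]
    print([name for name, meta in texts.items() if matches_metadata(meta, selectors)])
```

Under the same reading, `--exclude-metadata` would drop the texts for which this check returns true.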

## Run annotate and statistic actions with swegram

* For example, if you want to annotate one text file called "10-sv.txt" in the existing Resource folder named "resources/corpus/raw", the final conll file will be generated in a folder called output-folder, type the following command
```

```bash
swegram --language sv --input-path resources/corpus/raw/10-sv.txt --output-dir output-folder annotate
```
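
For scripted runs, the same invocation can be assembled as an argument list. This is a minimal sketch that uses only the options shown in the help text above. `build_swegram_command` is a hypothetical helper, and the block only constructs the command; it does not run it.

```python
import shlex
from typing import List, Optional


def build_swegram_command(
    language: str,
    input_path: str,
    action: str,
    output_dir: Optional[str] = None,
) -> List[str]:
    """Build a swegram argument list from the documented options (hypothetical helper)."""
    if language not in ("en", "sv"):
        raise ValueError("language must be 'en' or 'sv'")
    if action not in ("annotate", "statistic"):
        raise ValueError("action must be 'annotate' or 'statistic'")
    command = ["swegram", "--language", language, "--input-path", input_path]
    if output_dir is not None:
        command += ["--output-dir", output_dir]
    # The subcommand comes last, after the top-level options.
    command.append(action)
    return command


if __name__ == "__main__":
    cmd = build_swegram_command(
        "sv", "resources/corpus/raw/10-sv.txt", "annotate", "output-folder"
    )
    # Print a shell-quoted form of the command for inspection.
    print(shlex.join(cmd))
```

Once swegram is installed, the returned list could be passed to `subprocess.run(cmd, check=True)`.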

@@ -86,11 +84,10 @@ rm output/*.tok output/*.tag output/*.txt
```

Now, type the following command:
```
```bash
swegram --language sv --input-path output statistic
```


## Dependencies

* [udpipe](https://ufal.mff.cuni.cz/udpipe/1/install)
@@ -103,4 +100,3 @@ SWIG 3.0.8 or newer for language bindings other than C++

* [efselab](https://github.com/robertostling/efselab)
* [pandoc](https://pandoc.org)

2 changes: 1 addition & 1 deletion setup.py
@@ -19,7 +19,7 @@ def get_requirements() -> List[str]:
description="CLI library for Swegram",
long_description=(Base / "README.md").read_text(encoding="utf-8"),
packages=find_packages(exclude=["tools*", "test*", "swegram/*", "swegram_django*"]),
-license=(Base / "LICENSE").read_text(encoding="utf-8"),
+# license=(Base / "LICENSE").read_text(encoding="utf-8"),
url="https://github.com/bmegyesi/swegram-v2",
install_requires=get_requirements(),
package_data={
2 changes: 1 addition & 1 deletion swegram_main/version.py
@@ -1,3 +1,3 @@
"""version module"""

-VERSION = "1.0.3"
+VERSION = "1.0.5"
