New exported format "tabular data" on JSON, use case Wikimedia Commons #26

Open
fititnt opened this issue Mar 18, 2022 · 2 comments
Labels
archiva-farmatis: archīva fōrmātīs; /formats of files/@eng-Latn; About (new) data formats to package dictionaries
praeparatio-ex-codex: praeparātiō ex cōdex; (related to) preparation of book (of a collection of dictionaries)

Comments

fititnt added the praeparatio-ex-codex label Mar 18, 2022
fititnt commented Mar 18, 2022

./999999999/0/1603_1.py --codex-de 1603_25_1 --codex-in-tabulam-json | jq

[Screenshot from 2022-03-18 04-41-32]

Not real data yet, but it is a start.

1603_1.py vs hxltmcli

In theory, the ideal would be to move this exporter to the HXLTM tools (alongside the other formats we document at https://hxltm.etica.ai/), but it would take a lot of time to move (that is, generalize) the numerordinatio logic directly into HXLTM. HXLTM is more general purpose, and numerordinatio is a very strict subset. On top of that, to generate the data packages that work on Wikimedia Commons, the exporter would need to be aware of 1603:1:51 (which has the language mappings). So, at least for now, this exporter works much like the way we generate the book formats.
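For context, here is a rough sketch of what a tabular-data package for Wikimedia Commons might look like when built in Python. It is not the code in 1603_1.py. The function name `build_tabular_package`, the field names `codicem` and `rem`, the `CC0-1.0` license value, and the row contents are all placeholders chosen for illustration. The top-level keys (`license`, `description`, `schema.fields`, `data`) and the `localized` field type follow the Commons tabular data format as I understand it.

```python
import json


def build_tabular_package(rows, languages, description):
    """Build a dict shaped like a Commons tabular-data page (illustrative only).

    rows: iterable of (code, {language_code: label}) pairs.
    languages: language codes to keep in the localized column.
    """
    # Hypothetical schema: one plain string column, one localized column.
    fields = [
        {"name": "codicem", "type": "string", "title": {"en": "Code"}},
        {"name": "rem", "type": "localized", "title": {"en": "Term"}},
    ]
    data = []
    for code, labels in rows:
        # Keep only the requested languages that actually have a label.
        localized = {lang: labels[lang] for lang in languages if lang in labels}
        data.append([code, localized])
    return {
        "license": "CC0-1.0",  # placeholder, not confirmed by the project
        "description": {"en": description},
        "schema": {"fields": fields},
        "data": data,
    }


if __name__ == "__main__":
    demo_rows = [("1", {"en": "placeholder-en", "pt": "placeholder-pt"})]
    package = build_tabular_package(demo_rows, ["en", "pt"], "Demo table")
    print(json.dumps(package, ensure_ascii=False, indent=2))
```

In the real exporter, the list of languages would presumably come from the 1603:1:51 mappings rather than being passed in by hand.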

fititnt added a commit that referenced this issue Mar 18, 2022
fititnt commented Mar 18, 2022

Better output (but still fake data).

[Screenshot from 2022-03-18 05-52-23]

Anyway, I think we can have an MVP of this format soon. However, it seems that the default way MediaWiki shows data tables with localized fields gives no hint to the user that it is showing a fallback language (not the target language), and this can lead to confusion.

We may even need to simulate a language that would never be a target language for the others in order to force this behavior. But it would still be better to handle it programmatically, not in the data.
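To illustrate what handling it programmatically could mean, here is a minimal sketch that resolves a localized cell and reports which language was actually used, so a renderer can flag fallbacks. The names `resolve_localized` and `render_cell`, the `[lang]` marker, and the default fallback chain of just `en` are assumptions for the example. This does not reproduce MediaWiki's real fallback rules.

```python
def resolve_localized(cell, target, fallback_chain=("en",)):
    """Return (value, language_used) for a localized cell.

    cell: dict mapping language codes to text.
    Returns (None, None) when neither the target nor any fallback is present.
    """
    if target in cell:
        return cell[target], target
    for lang in fallback_chain:
        if lang in cell:
            return cell[lang], lang
    return None, None


def render_cell(cell, target, fallback_chain=("en",)):
    """Render a cell, marking the text when a fallback language was used."""
    value, used = resolve_localized(cell, target, fallback_chain)
    if value is None:
        return ""
    if used != target:
        # Make the fallback visible instead of silently showing another language.
        return f"{value} [{used}]"
    return value


if __name__ == "__main__":
    print(render_cell({"en": "placeholder-en"}, "pt"))
    print(render_cell({"pt": "placeholder-pt"}, "pt"))
```

With something like this, the data itself stays clean and the hint about a fallback is added only at display time.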
