Proofread: Mayrhofer's Kurzgefasstes etymologisches Wörterbuch des Altindischen #137

Open
suhasm opened this issue Mar 28, 2022 · 5 comments

suhasm commented Mar 28, 2022

http://samskrtam.ru/sanskrit-lexicon/KEWA/

Can a stardict file be made that shows the screenshots for each headword?

vvasuki commented Mar 28, 2022

Note from closed issue:

http://samskrtam.ru/sanskrit-lexicon/KEWA/ by marcis @gasyoun merits being OCR-ed and stardictified (headwords are already cleaned - so it should be a simple scripting task). BVP discussion https://groups.google.com/d/msgid/bvparishat/d4a94a28-55e4-4c88-af12-fc19fe9b9304n%40googlegroups.com .
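
As a rough illustration of the "simple scripting task" mentioned above, here is a minimal sketch that pairs cleaned headwords with OCR text and emits one tab-separated line per entry. It assumes the layout that tab-file converters for StarDict commonly accept (headword, a tab, then the definition with newlines escaped as `\n`). The function names `escape_definition` and `to_tab_lines` and the sample entries are hypothetical placeholders, not part of any existing script or of the dictionary itself.

```python
from typing import Iterable, List, Tuple


def escape_definition(text: str) -> str:
    """Escape backslashes, newlines and tabs so a definition fits on one line."""
    # Backslashes must be escaped first, or the later escapes would be doubled.
    return text.replace("\\", "\\\\").replace("\n", "\\n").replace("\t", "\\t")


def to_tab_lines(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Turn (headword, definition) pairs into 'headword<TAB>definition' lines."""
    lines = []
    for headword, definition in entries:
        headword = headword.strip()
        if not headword:
            # Skip rows whose headword is empty after trimming whitespace.
            continue
        lines.append(f"{headword}\t{escape_definition(definition.strip())}")
    return lines


# Hypothetical sample data; real entries would come from the OCR output.
sample = [
    ("headword-1", "first line of OCR text\nsecond line"),
    ("   ", "row with a blank headword, skipped"),
    ("headword-2", "single-line OCR text"),
]
tab_lines = to_tab_lines(sample)
```

Writing `"\n".join(tab_lines)` to a UTF-8 file would give a source file that a converter could then turn into dictionary files; which converter to use, and whether it expects exactly this escaping, is left as an assumption to check.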

vvasuki commented Mar 28, 2022

It contains only 9587 entries until आयुः though. Wonder if there's a way to get the rest. @gasyoun - do you have the rest?

vvasuki commented Mar 30, 2022

Done. OCR mostly succeeded, barring Greek and some accents, which require manual correction.

vvasuki changed the title from "Dict Request: Mayrhofer's Kurzgefasstes etymologisches Wörterbuch des Altindischen" to "Fix OCR: Mayrhofer's Kurzgefasstes etymologisches Wörterbuch des Altindischen" on Mar 30, 2022
vvasuki changed the title from "Fix OCR: Mayrhofer's Kurzgefasstes etymologisches Wörterbuch des Altindischen" to "Proofread: Mayrhofer's Kurzgefasstes etymologisches Wörterbuch des Altindischen" on Mar 30, 2022


vvasuki commented Jun 14, 2022

@Andhrabharati is working on proofreading this.

Andhrabharati commented Jun 14, 2022

> It contains only 9587 entries until आयुः though. Wonder if there's a way to get the rest. @gasyoun - do you have the rest?

> Done. OCR mostly succeeded, barring Greek and some accents, which require manual correction.

> @Andhrabharati is working on proofreading this.

@vvasuki
I found that your OCR needs more corrections (accents etc.), so I redid all 3 volumes completely, including the front pages and the final 180 pages of Vol. 3, which are missing from @gasyoun's PDF set (and hence from your OCR).
[There is a combined scan of the 4 volumes at archive.org]

Summary of entries (total: 12007):
base entries: 9112
new entries: 378
entry revisions: 2517

I have finished about 30% of the overall work (planned in phases). I am taking a short break now and will resume the project soon.
