utils/image: add polygon_for_parent w/ a test #577

kba · 2020-08-25T13:37:23Z

No description provided.

codecov-commenter · 2020-08-25T13:41:26Z

Codecov Report

Merging #577 into master will decrease coverage by 0.15%.
The diff coverage is 37.50%.

@@            Coverage Diff             @@
##           master     #577      +/-   ##
==========================================
- Coverage   84.14%   83.98%   -0.16%     
==========================================
  Files          49       49              
  Lines        2776     2792      +16     
  Branches      541      546       +5     
==========================================
+ Hits         2336     2345       +9     
- Misses        342      344       +2     
- Partials       98      103       +5

Impacted Files	Coverage Δ
ocrd_utils/ocrd_utils/__init__.py	`100.00% <ø> (ø)`
ocrd_utils/ocrd_utils/image.py	`54.97% <37.50%> (-1.60%)`	⬇️
ocrd/ocrd/resolver.py	`96.55% <0.00%> (+3.44%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e41ba75...d03e2cd. Read the comment docs.

bertsky

I don't see the point. I have already stalled #492 (which does the same thing), because like I said this should be addressed at the PAGE model implicitly, rather than by requiring extra calls in every processor. (I thought we agreed on that?)

kba · 2020-08-25T15:57:23Z

We have been discussing GT validation of coordinates again today, as well as possible applications for ocrd-sanitize here, that's why I wanted to understand how to use that function. I see this as a step towards the eventual goal of PAGE-level validation/cl;ipping, the adaptions to enable better error reporting in GdsCollector is in the same vein. I just momentarily wasn't aware of #492. Can't we not try to bring the coordinate logic from tesserocr into OCR-D/core first, while working towards PAGE-level methods?

bertsky · 2020-08-26T08:11:57Z

Can't we ~~not~~ try to bring the coordinate logic from tesserocr into OCR-D/core first, while working towards PAGE-level methods?

But the exception was merely an experiment. We don't want that in production code. So not only would we have to add the call to nearly every processor, it would also have to catch the exception each time. That's just too much effort IMO.

OCR-D/ocrd_tesserocr#145 already demonstrates that ocrd_tesserocr correctly would have to catch the exception, ignoring the candidate. (I just rewrote with return type None.)

Thinking about this from the PAGE model is a different picture, though. Functions like add_TextRegion would have to throw some CoordsNotInParentException which needs to be handled differently in the caller/processor.

I think you might draw 2 different conclusions from that:

Moving the check into the PAGE model implicitly is not worth it, because the processors still need to be updated (catching the exception), so one might as well keep it a separate function and opt-in (tolerating processors which don't use it and thus may produce inconsistent coords).
Moving the check into the PAGE model implicitly allows us to enter a (non-production) phase where users will see exceptions as long as the respective processors have not been fixed.

bertsky · 2020-08-27T13:21:37Z

Thinking about this from the PAGE model is a different picture, though. Functions like add_TextRegion would have to throw some CoordsNotInParentException which needs to be handled differently in the caller/processor.

I think you might draw 2 different conclusions from that:

Moving the check into the PAGE model implicitly is not worth it, because the processors still need to be updated (catching the exception), so one might as well keep it a separate function and opt-in (tolerating processors which don't use it and thus may produce inconsistent coords).

Moving the check into the PAGE model implicitly allows us to enter a (non-production) phase where users will see exceptions as long as the respective processors have not been fixed.

After today's tech call, the picture has become even larger. #576 offers a new alley for making implementors aware of these problems – by feeding PAGE validation failures (along with call stack context if necessary) into dedicated loggers. But most of these failures very likely would result in subsequent crashes of the follow-up processors anyway (because we cannot expect to make all of them robust against all sorts of input invalidities, even in the long term). Plus we already face the problem that processors can fail (due to exception or exit) on single pages (often only after computing a long series of pages correctly). This is not tolerable for a system aiming at productive use. So is option 2 above. Therefore, we already need a mechanism of catching a processor's exceptions and falling back to (say) the input page as output page. That very mechanism could then be used to let exceptions fall through when requested. This could be controlled on the command line with a new option (say --debug). The single place to handle this would be ocrd.Processor.process. (But for the page-wise fallback we would also need #322.) Processor implementors could then run (and make test) their code with --debug and start catching the exceptions from the PAGE model themselves accordingly.

utils/image: add polygon_for_parent w/ a test

d03e2cd

kba requested a review from bertsky August 25, 2020 13:37

kba mentioned this pull request Aug 25, 2020

use polygon_for_parent from ocrd_utils OCR-D/ocrd_tesserocr#146

Open

bertsky reviewed Aug 25, 2020

View reviewed changes

bertsky mentioned this pull request Aug 27, 2020

Processor.process: catch exceptions by default, CLI: add --debug to let them explode #579

Closed

bertsky mentioned this pull request Sep 14, 2020

Repair fix coords OCR-D/ocrd_segment#43

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

utils/image: add polygon_for_parent w/ a test #577

utils/image: add polygon_for_parent w/ a test #577

kba commented Aug 25, 2020

codecov-commenter commented Aug 25, 2020 •

edited

Loading

bertsky left a comment

kba commented Aug 25, 2020

bertsky commented Aug 26, 2020

bertsky commented Aug 27, 2020

utils/image: add polygon_for_parent w/ a test #577

Are you sure you want to change the base?

utils/image: add polygon_for_parent w/ a test #577

Conversation

kba commented Aug 25, 2020

codecov-commenter commented Aug 25, 2020 • edited Loading

Codecov Report

bertsky left a comment

Choose a reason for hiding this comment

kba commented Aug 25, 2020

bertsky commented Aug 26, 2020

bertsky commented Aug 27, 2020

codecov-commenter commented Aug 25, 2020 •

edited

Loading