Modernize encoding conversion for FPDF #260

kohlerdominik · 2024-09-03T10:34:24Z

Fixes #245

I'm not entirely sure what the method toUtf8 did, but surely not convert to UTF-8. It converted away from UTF-8; ~~my best guess is that it's to replace characters that are not allowed in the payment slip?~~

~~ISO-8859-1 should be correct here, as the specs require Latin Character Set. But maybe there was a specific reason that the previous characterset was chosen?~~

Also, iconv is not able to handle multibyte characters (🦓🦓🦓), so maybe we can fix that by using mb_* instead. I explicitly added the Symfony mb_* polyfill, as it's an indirect dependency anyway, so it will not bloat the package.

~~This PR should be considered as breaking change.~~

sprain · 2024-09-04T06:09:42Z

Thanks for the PR 🙌

Some background:
The initial functionality stems from this PR. The approach seems to originate here.

The method name toUtf8() is definitely wrong – as you said, it does the opposite. However, I think substituteUnsupportedCharacters() is also not the best, as it could be confused with what we did here.

So maybe we could call it setEncoding()? It really is about encoding.

Regarding the encoding itself, I think your approach here should be fine. But I am no expert on this topic, so it would be great if @supercosh could give it a test run, as mentioned in #245 (comment).

> This PR should be considered as breaking change.

I am not sure if it's really needed. Is there another reason besides unsupported horse emojis? 😁🦓🦓

kohlerdominik · 2024-09-04T07:30:42Z

Hi @sprain

Thanks for your information, that cleared up a lot.

Your link explains why this was introduced to FpdfOutput, so my guess was not accurate. The real problem is that FPDF only supports ISO-8859-1 and its extension Windows-1252. This should be easily fixable with mb_convert_encoding: it supports Windows-1252 without relying on the environment (OS). The only problem is that if the mbstring extension is not available, the Symfony polyfill will fall back to iconv again.

For TcPdfOutput, the conversion seems completely useless (at least according to the tests). Maybe you just copied it over from FpdfOutput in 65f5b6a? So I suggest removing it there.

So I just updated my PR. This should be 100% backwards compatible, while fixing #245:

- If ext-mbstring is installed, it should work perfectly.
- If the fallback symfony/polyfill-mbstring is used, it should not throw an error for @supercosh, as I changed the encoding from Windows-1252 to CP1252. In mbstring, they are aliases anyway.

It might change output in edge-cases though, so the release should be tagged accordingly.
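
For illustration, here is a minimal sketch of the conversion described above. It assumes the source file is saved as UTF-8 and that ext-mbstring is loaded. The helper name `toFpdfEncoding` is hypothetical and not taken from the PR; only `mb_convert_encoding` and the `CP1252` alias come from the discussion.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (name is an assumption, not the PR's method):
 * converts UTF-8 text to the single-byte encoding FPDF can render.
 */
function toFpdfEncoding(string $text): string
{
    // 'CP1252' is the alias the PR settled on instead of 'Windows-1252'.
    return mb_convert_encoding($text, 'CP1252', 'UTF-8');
}

// In Windows-1252, "ü" is the single byte 0xFC and "€" is the single byte 0x80.
echo bin2hex(toFpdfEncoding('Zürich')), PHP_EOL;
echo bin2hex(toFpdfEncoding('€')), PHP_EOL;

// Characters outside Windows-1252 (such as 🦓) cannot be represented.
// What ends up in the result depends on whether ext-mbstring or the
// polyfill's iconv fallback performs the conversion.
echo bin2hex(toFpdfEncoding('🦓')), PHP_EOL;
```

The last call is only there to inspect the edge case by hand. The result for unsupported characters may differ between environments, which is why no expected output is given for it.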

supercosh · 2024-09-06T07:28:04Z

> Regarding the encoding itself, I think your approach here should be fine. But I am no expert on this topic, so it would be great if @supercosh could give it a test run, as mentioned in #245 (comment).

My answer is a bit late, but I can confirm the fix is working. I see that the fix is already in the master and released. Thank you all for the help!

kohlerdominik force-pushed the pdf-unsupported-characters branch from fce387b to fec65e4 Compare September 3, 2024 11:24

kohlerdominik mentioned this pull request Sep 3, 2024

Character set windows-1252 should be CP1252 under Unix #245

Closed

kohlerdominik force-pushed the pdf-unsupported-characters branch from ab5bd7d to d85d48b Compare September 4, 2024 06:51

kohlerdominik added 3 commits September 4, 2024 09:06

explicit require mb-string polyfill (already implicit required)

cb5cfa4

Remove unnecessary encoding-conversion

572180b

fixed wrong method name; make conversion use mbstring module

65ea1c2

kohlerdominik force-pushed the pdf-unsupported-characters branch from d85d48b to 65ea1c2 Compare September 4, 2024 07:08

Use CP1252-alias instead of Windows-1252 for better compatibility with mbstring-polyfill

1c3f909

kohlerdominik changed the title ~~Fixed bad support for substitute of unsupported characters in PDF output~~ Modernize encoding conversion for FPDF Sep 4, 2024

sprain merged commit 3805597 into sprain:master Sep 5, 2024
9 checks passed
