Strengthen check for UTF-8 conformity in formatContent() #704
Type of pull request
About
In some cases a binary string may pass as valid UTF-8 when checked with the `mb_check_encoding(..., 'UTF-8')` function. Use a comprehensive regexp from the W3C instead, to be certain we aren't trying to parse binary content in `formatContent()`. In addition to `(strings)`, also check for the beginning of `ID` inline image content sections, which may also contain binary data. Resolves #668.

Reference: https://www.w3.org/International/questions/qa-forms-utf-8.en
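
As a rough illustration of the stricter check, the pattern published on the referenced W3C page can be wrapped in a small PHP helper. This is only a sketch, not the code from this pull request: the function name `looksLikeUtf8Text()` is hypothetical, and the possessive quantifier (`*+`) is an added assumption meant to keep PCRE from backtracking on large inputs. One reason the pattern is stricter than a plain well-formedness check is that it accepts only tab, LF and CR among the ASCII control characters, so byte runs such as `"\x00\x01\x02"`, which are technically valid UTF-8, are rejected.

```php
<?php
// Hypothetical helper (not part of the library): returns true only if
// $bytes is well-formed UTF-8 and contains no ASCII control characters
// other than tab, LF and CR, following the pattern from the W3C page.
function looksLikeUtf8Text(string $bytes): bool
{
    $pattern = '/\A(?:
          [\x09\x0A\x0D\x20-\x7E]            # printable ASCII plus tab, LF, CR
        | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte sequences
        | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, excluding overlongs
        | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte sequences
        | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, excluding surrogates
        | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*+\z/x';

    // preg_match() returns 1 on a match, 0 on no match, false on error;
    // anything other than 1 is treated as "not text" here.
    return preg_match($pattern, $bytes) === 1;
}

var_dump(looksLikeUtf8Text("caf\xC3\xA9"));   // bool(true): valid 2-byte sequence
var_dump(looksLikeUtf8Text("\x00\x01\x02"));  // bool(false): control bytes
var_dump(looksLikeUtf8Text("\xC0\xAF"));      // bool(false): overlong encoding
```

Treating a `preg_match()` error as a failed check is a conservative choice for this use case: content that cannot be verified as text is simply not parsed as text.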
Checklist for code / configuration changes
In case you changed the code/configuration, please read each of the following checkboxes as they contain valuable information:
`fixes #1234` to outline that you are providing a fix for the issue `#1234`.