APItest/t/utf8_warn_base: Add tests
One UTF-8 malformation is when the string has a start byte in it before
the expected end of the character.  This test file tested the case where
the unexpected byte came in the final position.  GH #22597 found bugs
where the unexpected byte came immediately after the first byte.

This commit adds tests for unexpected bytes in all possible positions.
If the fix for GH #22597 is reverted, this new revised file has 1400
failures.
khwilliamson committed Oct 11, 2024
1 parent 7020845 commit f108033
Showing 1 changed file with 14 additions and 6 deletions.
20 changes: 14 additions & 6 deletions ext/XS-APItest/t/utf8_warn_base.pl
@@ -1190,7 +1190,12 @@ ($)
 # We try various combinations of malformations that can occur
 foreach my $short (0, 1) {
 next if $skip_most_tests && $short;
-foreach my $unexpected_noncont (0, 1) {
+# Insert an unexpected non-continuation in every possible position
+my $unexpected_noncont;
+for ($unexpected_noncont = $length - $short - 1;
+$unexpected_noncont > 0;
+$unexpected_noncont--)
+{
 next if $skip_most_tests && $unexpected_noncont;
 foreach my $overlong (0, 1) {
 next if $overlong && $skip_most_tests;
@@ -1318,11 +1323,14 @@ ($)

 if ($unexpected_noncont) {

-# To force this malformation, change the final continuation
-# byte into a start byte.
-my $pos = ($short) ? -2 : -1;
-substr($this_bytes, $pos, 1) = $known_start_byte;
-$this_expected_len--;
+# The overlong tweaking above changes the first bytes to
+# specified values; we better not override those.
+next if $overlong;
+
+# To force this malformation, change a continuation byte into a
+# start byte.
+substr($this_bytes, $unexpected_noncont, 1) = $known_start_byte;
+$this_expected_len = $unexpected_noncont;
 }

 # The whole point of a test that is malformed from the beginning
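
The effect of the second hunk can be illustrated outside the test harness. The following is a minimal standalone sketch, not code from utf8_warn_base.pl: the helper `len_before_noncont` and the sample bytes are hypothetical, and it assumes an ASCII platform (the real test file also handles EBCDIC). It overwrites each continuation byte in turn with a start byte and reports how many bytes precede the unexpected byte, which is the value the commit assigns to `$this_expected_len`.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of utf8_warn_base.pl: returns the number of
# bytes at the start of $bytes that precede the first byte after position 0
# that is not a continuation byte (continuation bytes look like 10xxxxxx).
# If every byte after the first is a continuation, returns the full length.
sub len_before_noncont {
    my ($bytes) = @_;
    my $len = length $bytes;
    for my $i (1 .. $len - 1) {
        my $byte = ord substr($bytes, $i, 1);
        return $i if ($byte & 0xC0) != 0x80;
    }
    return $len;
}

# Assumed sample data: U+1F600 encoded as four UTF-8 bytes, and the start
# byte of a two-byte sequence standing in for $known_start_byte.
my $well_formed      = "\xF0\x9F\x98\x80";
my $known_start_byte = "\xC2";

# Mirror the new loop: walk from the last byte down to position 1.
for my $pos (reverse 1 .. length($well_formed) - 1) {
    my $this_bytes = $well_formed;
    substr($this_bytes, $pos, 1) = $known_start_byte;
    printf "pos=%d expected_len=%d\n", $pos, len_before_noncont($this_bytes);
}
```

For every position the reported length equals the position itself, which is why the new code can set `$this_expected_len = $unexpected_noncont` directly instead of decrementing the length by one as the old code did.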