TTY::File::CompareFiles#call seems to read each file in block-sized chunks.
When a multibyte character (CJK character, emoji, etc.) straddles the boundary between two blocks, the character is broken.
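The effect can be illustrated without tty-file at all. The following is a minimal sketch, assuming a block size of 4096 bytes (the value is a guess based on the file names in this report, not taken from the library source). It reads a string in fixed-size byte blocks and shows that neither block is valid UTF-8 on its own, although the bytes themselves are intact.

```ruby
require "stringio"

# Assumed block size, for illustration only.
BLOCK_SIZE = 4096

# 4095 ASCII bytes followed by "あ" (3 bytes in UTF-8) and "い":
# the first byte of "あ" lands in block 1, the other two in block 2.
data = ("a" * (BLOCK_SIZE - 1)) + "あい"
io = StringIO.new(data)

# IO#read(length) counts bytes, not characters, and returns binary strings.
first  = io.read(BLOCK_SIZE).force_encoding(Encoding::UTF_8)
second = io.read(BLOCK_SIZE).force_encoding(Encoding::UTF_8)

first_valid  = first.valid_encoding?
second_valid = second.valid_encoding?

# Joining the raw bytes again restores the original text.
rejoined = (first.b + second.b).force_encoding(Encoding::UTF_8)
rejoined_valid = rejoined.valid_encoding?

puts first_valid     # false: block 1 ends with a lone lead byte
puts second_valid    # false: block 2 starts with orphaned continuation bytes
puts rejoined_valid  # true
```

Any comparison or printing done per block therefore sees invalid byte sequences, which terminals render as the replacement character.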
Steps to reproduce the problem
./diff-j.rb
diff 4096-a.txt and 4096-aj.txt
--- 4096-a.txt
+++ 4096-aj.txt
@@ -1 +1 @@
-aaa(repeats 4096 times )aaa�
@@ -1 +1 @@
-A
+��い
4096-a.txt
aaa(repeats 4096 times)aaaA
4096-aj.txt
aaa(repeats 4096 times)aaaあい
check
puts TTY::File.diff("4096-a.txt", "4096-aj.txt")
Actual behaviour
The multibyte character あ is split at a byte boundary and broken.
�
��い
Expected behaviour
./diff-j.rb
diff 4096-a.txt and 4096-aj.txt
--- 4096-a.txt
+++ 4096-aj.txt
@@ -1 +1 @@
-aaa(repeats 4096 times )aaa
@@ -1 +1 @@
-A
+あい
This looks hard to solve with the current implementation, which uses block reads.
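One possible direction, sketched here only as an idea: keep the block reads, but hold back any incomplete trailing character and prepend it to the next block. The helpers `each_utf8_chunk` and `complete_prefix_length` below are hypothetical names and are not part of tty-file. The sketch assumes the input is valid UTF-8 and again assumes a 4096-byte block size.

```ruby
require "stringio"

# Hypothetical helper: number of leading bytes of +bytes+ that end on a
# UTF-8 character boundary. Assumes the input is valid UTF-8 overall.
def complete_prefix_length(bytes)
  size = bytes.bytesize
  # The lead byte of an incomplete trailing character is at most 3 bytes back.
  1.upto([3, size].min) do |back|
    byte = bytes.getbyte(size - back)
    next if (byte & 0xC0) == 0x80 # continuation byte, keep looking back

    needed = if byte >= 0xF0 then 4
             elsif byte >= 0xE0 then 3
             elsif byte >= 0xC0 then 2
             else 1
             end
    # If the character needs more bytes than are present, cut before it.
    return needed > back ? size - back : size
  end
  size
end

# Hypothetical helper: yields UTF-8 strings that never end mid-character.
def each_utf8_chunk(io, block_size = 4096)
  carry = "".b
  while (chunk = io.read(block_size))
    buffer = carry + chunk
    cut = complete_prefix_length(buffer)
    # Bytes after the cut are held back for the next block.
    carry = buffer.byteslice(cut, buffer.bytesize - cut)
    yield buffer.byteslice(0, cut).force_encoding(Encoding::UTF_8)
  end
  yield carry.force_encoding(Encoding::UTF_8) unless carry.empty?
end

data = ("a" * 4095) + "あい"
chunks = []
each_utf8_chunk(StringIO.new(data)) { |chunk| chunks << chunk }

puts chunks.all?(&:valid_encoding?)
puts chunks.join == data
```

The chunks are no longer all exactly one block long, so any logic in the diff that depends on equal block sizes would need to be revisited. This sketch only shows that block reads and intact characters are not mutually exclusive.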
Describe your environment
diff-j.zip