ToJson() cannot correctly escape special characters #107

Open
mnonmcn opened this issue Aug 21, 2019 · 7 comments

mnonmcn commented Aug 21, 2019

            JsonData jd = new JsonData();
            jd["name"] = "\u0003";
            string json = jd.ToJson();
            File.WriteAllText("c:\\test.json", json);

ToJson() does not escape special characters correctly, so JSON.parse(json) fails on the ToJson() output in browsers such as Chrome.
Expected result: {"name":"\u0003"}
Actual result: {"name": "�"}

Sorry, this was translated automatically.

@devlead devlead added bug up-for-grabs Help wanted labels Aug 21, 2019

devlead commented Aug 21, 2019

Which version of LitJson are you using?

I did a quick .NET Fiddle ( https://dotnetfiddle.net/3cGvQN ) and it outputs {"name":"\u0003"}, which is what you list as the expected result. Isn't this the expected result?

Sending that JSON to JSONLint says it's valid JSON ( https://jsonlint.com/?json={%22name%22:%22\u0003%22} )

I also created a JSFiddle ( https://jsfiddle.net/nmecws5x/ ) and it parses the output JSON fine.

So I'm unsure what's expected and what's unexpected.


mnonmcn commented Aug 23, 2019

            JsonData jd = new JsonData();
            jd["name"] = "\u0003";
            jd["key"] = "中文";
            string json = jd.ToJson();
            Console.WriteLine(json);

The output is: {"name":"\u0003","key":"\u4E2D\u6587"}

Sorry, I was using a very old version. After upgrading to the latest version the original problem is gone, but it now converts all Chinese characters into Unicode escapes, which seriously hurts readability. Chinese characters do not actually need to be escaped.


devlead commented Aug 23, 2019

@mnonmcn OK, so there's no breaking bug then, but rather a feature request to not escape certain characters?


mnonmcn commented Aug 23, 2019

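                // Support for Chinese characters
                // https://www.qqxiuzi.cn/zh/hanzi-unicode-bianma.php
                // For now, only the commonly used basic Han characters (20,902 characters) are supported
                if (str[i] >= 0x4E00 && str[i] <= 0x9FA5)
                {
                    // Write the character as-is instead of a \uXXXX escape
                    writer.Write(str[i]);
                    continue;
                }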

I added support for Chinese characters with the code above, placed after line 269 of JsonWriter.cs, to improve readability. This also solves another issue: https://github.com/LitJSON/litjson/issues/78
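
To see which characters the range test in the patch lets through, here is a minimal standalone sketch. The helper name `IsBasicHanzi` is made up for illustration and does not exist in LitJson; only the bounds 0x4E00 and 0x9FA5 come from the patch.

```csharp
using System;

// Hypothetical helper (not part of LitJson): the same range test as the patch,
// covering the basic Han characters U+4E00..U+9FA5.
static bool IsBasicHanzi(char c) => c >= 0x4E00 && c <= 0x9FA5;

// A Chinese character, a Japanese kana and an accented Latin letter.
foreach (char c in "中あé")
{
    Console.WriteLine($"U+{(int)c:X4} written unescaped: {IsBasicHanzi(c)}");
}
```

Only the first character falls inside the range, so with this patch other non-ASCII text would still be written as \uXXXX escapes.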


devlead commented Aug 23, 2019

I see, would it be possible for you to send a pull request with this change?

nchetan-zz commented

Is this still open? I found it via https://up-for-grabs.net/#/filters?labels=88&tags=c%23 and am considering it as something to contribute to (assuming it's a good first issue).


salyu9 commented Mar 15, 2020

LitJson's current strategy of escaping all non-ASCII characters is actually fine. The problem is that people may want graphic characters in general to be kept, not only Chinese characters, so IMHO excluding the range [0x4E00, 0x9FA5] (more precisely, CJK Unified Ideographs) is not the solution.
According to RFC 4627: "All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F)." Maybe a switch could be added to escape only these characters, like StringEscapeHandling.Default/EscapeNonAscii in Newtonsoft.Json. I also think this would be better treated as a feature request.
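
A rough sketch of what such a switch could look like, written as a standalone function, not as LitJson's actual writer. The name `EscapeJsonString` and the `escapeNonAscii` parameter are assumptions made for illustration; only the set of characters that must be escaped comes from the RFC text quoted above.

```csharp
using System;
using System.Text;

// Hypothetical helper (not LitJson API): quotes and escapes a string for JSON.
// With escapeNonAscii = false, only the quotation mark, the reverse solidus and
// the control characters U+0000..U+001F are escaped.
// With escapeNonAscii = true, every character above U+007E is escaped as well.
static string EscapeJsonString(string value, bool escapeNonAscii)
{
    var sb = new StringBuilder(value.Length + 2);
    sb.Append('"');
    foreach (char c in value)
    {
        switch (c)
        {
            case '"': sb.Append("\\\""); break;
            case '\\': sb.Append("\\\\"); break;
            case '\b': sb.Append("\\b"); break;
            case '\f': sb.Append("\\f"); break;
            case '\n': sb.Append("\\n"); break;
            case '\r': sb.Append("\\r"); break;
            case '\t': sb.Append("\\t"); break;
            default:
                if (c < 0x20 || (escapeNonAscii && c > 0x7E))
                    sb.Append("\\u").Append(((int)c).ToString("X4"));
                else
                    sb.Append(c);
                break;
        }
    }
    sb.Append('"');
    return sb.ToString();
}

string sample = "\u0003中文";
Console.WriteLine(EscapeJsonString(sample, escapeNonAscii: false));
Console.WriteLine(EscapeJsonString(sample, escapeNonAscii: true));
```

In both modes the control character U+0003 is escaped, so the output stays parseable; the flag only decides whether the Chinese characters are written as-is or as \u4E2D\u6587.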
