Add Support for UTF-8, detect other wide-char sequences, and cope with BOMs #30

duncanmac99 · 2018-09-17T11:10:23Z

As it stands, it seems that this program can almost handle UTF-8 already. The main task would be reworking one particular function and (possibly) adding command-line arguments to handle certain peculiar (and often undesirable) situations.

logological · 2018-09-20T18:28:44Z

Further details on the proposed solution, or better yet, a pull request, would be most welcome.

duncanmac99 · 2018-09-26T01:45:57Z

However, the rest of the program expects regular (byte-size) characters, not wide characters. It would be possible to assemble it and not send back a wide character, but that would require more buffering in the function itself, which would be Messy.
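To make the buffering concern concrete, here is a minimal sketch of how a reader could tell, from the first byte alone, how many bytes a UTF-8 sequence occupies. The function name `utf8_seq_len` is hypothetical and is not part of GPP; the byte ranges follow the UTF-8 definition.

```c
/* Hypothetical helper, not taken from GPP.
 * Returns the total length in bytes of the UTF-8 sequence that begins
 * with `lead`, or 0 if `lead` cannot start a sequence. */
int utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;               /* plain ASCII, one byte */
    if (lead >= 0xC2 && lead <= 0xDF)
        return 2;               /* two-byte sequence */
    if (lead >= 0xE0 && lead <= 0xEF)
        return 3;               /* three-byte sequence */
    if (lead >= 0xF0 && lead <= 0xF4)
        return 4;               /* four-byte sequence */
    /* 0x80-0xBF are continuation bytes; 0xC0, 0xC1 and 0xF5-0xFF
     * never appear in valid UTF-8. */
    return 0;
}
```

A byte-oriented reader that wanted to hand back whole characters would have to hold up to three further bytes after a lead byte before returning, which is the extra buffering mentioned above.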

As for BOMs (byte order marks), Windows now expects one at the beginning of every UTF-8 and UTF-16 file. For more on that (for UTF-8), see:

https://social.msdn.microsoft.com/Forums/windowsapps/en-US/dd352270-8790-4b48-8492-17a4a6875e99/why-the-utf8-with-bom-marker-requirement?forum=winappswithhtml5

Also (for UTF-16):

https://docs.microsoft.com/en-us/windows/desktop/intl/using-byte-order-marks
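For illustration, a rough sketch of BOM detection on the first bytes of an input buffer. The names `bom_kind` and `detect_bom` are hypothetical and do not exist in GPP; the byte values are the standard BOM encodings (EF BB BF for UTF-8, FF FE for UTF-16 little-endian, FE FF for UTF-16 big-endian). The sketch assumes the caller has already read the start of the file into memory.

```c
#include <stddef.h>

/* Hypothetical names, not taken from GPP. */
enum bom_kind { BOM_NONE, BOM_UTF8, BOM_UTF16_LE, BOM_UTF16_BE };

/* Inspect the start of `buf` and report which BOM, if any, is present.
 * On return, *bom_len holds the number of bytes the BOM occupies
 * (0 when there is none), so the caller can skip past it.
 * Note: this sketch does not distinguish UTF-32, whose little-endian
 * BOM (FF FE 00 00) also begins with FF FE. */
enum bom_kind detect_bom(const unsigned char *buf, size_t len, size_t *bom_len)
{
    *bom_len = 0;
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        *bom_len = 3;
        return BOM_UTF8;
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        *bom_len = 2;
        return BOM_UTF16_LE;
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        *bom_len = 2;
        return BOM_UTF16_BE;
    }
    return BOM_NONE;
}
```

With something like this, a UTF-8 BOM could be skipped before processing starts, and a UTF-16 BOM could be reported as unsupported input rather than being processed as ordinary bytes.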

logological · 2019-06-11T09:50:59Z

I'm afraid I still don't understand the problem. Can you post a minimal example of a UTF-8 or UTF-16 file that GPP doesn't handle correctly, along with the expected and observed output?
