This is the mail archive of the cygwin-developers mailing list for the Cygwin project.


Re: More about charsets


Corinna Vinschen:
> while looking into the GB18030 issue once again, I found that we still
> may have two holes which might be important to support.
>
> - GB2312 aka EUC-CN
>
>  We already support GBK, codepage 936.  GB2312/EUC-CN is a subset
>  of GBK and apparently GBK is often used while still labeled as
>  GB2312.  See the discussion here:
>  http://www.mail-archive.com/unicode@unicode.org/msg03516.html
>
>  So the question is, should we just allow GB2312 and EUC-CN as
>  codeset names, but use the GBK conversion functions for them?

Might as well. As you saw, mintty already does that. Thomas Wolff's
mined goes even further and handles both GB2312 and GBK with its
GB18030 codec, because GBK is a subset of GB18030.
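
For illustration only, the aliasing idea might be sketched like this. It is written in Python for brevity rather than in Cygwin's C, and the table and the helper names `resolve_codec` and `decode` are made up for this sketch; it is not how Cygwin or mintty actually implement it. It relies on Python's built-in "gbk" codec and simply routes the GB2312 and EUC-CN names to it:

```python
# Hypothetical alias table: the names on the left are codeset names a user
# might set; all of them are routed to the GBK conversion functions.
CODESET_ALIASES = {
    "GBK": "gbk",
    "CP936": "gbk",
    "GB2312": "gbk",   # assumption: treated as GBK, as proposed above
    "EUC-CN": "gbk",   # assumption: likewise
}


def resolve_codec(name):
    """Return the codec used for a codeset name, or None if unknown."""
    return CODESET_ALIASES.get(name.upper())


def decode(data, codeset):
    """Decode bytes labelled with `codeset` using the aliased codec."""
    codec = resolve_codec(codeset)
    if codec is None:
        raise LookupError("unsupported codeset: " + codeset)
    return data.decode(codec)


if __name__ == "__main__":
    # Text containing a character that, as far as I know, is in GBK but
    # not in GB2312, yet is commonly labelled as GB2312 anyway.
    text = "朱镕基"
    raw = text.encode("gbk")
    print(decode(raw, "GB2312") == text)
```

The point of the design is that text labelled GB2312 but really containing GBK characters still decodes, at the cost of not rejecting bytes that strict GB2312 would.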


>  Otherwise, there's also a codepage 51936, which is called EUC-CN
>  in the list at
>  http://msdn.microsoft.com/en-us/library/dd317756%28VS.85%29.aspx
>  I didn't test it, but it appears to be the real GB2312.  I don't
>  know if it really makes sense to make the difference, though.

Also, it isn't available on any Windows I've tried.


> - EUC-TW
>
>  There's a codepage 51950 which appears to be something like EUC-TW.
>  I just found this, though:
>  http://code.google.com/p/mintty/source/detail?r=738
>
>  Andy, is that a general rule?  Or did you test on XP and the codepage
>  was just not installed, by any chance?

It doesn't show up as an option on XP, and I've just tried it again on
Windows 7, where codepages are no longer optional. Doesn't work. I
think I'd read somewhere that 51950 is only available for .Net
programs, but unfortunately I can't find that again. I guess it's
possible that Chinese Windows versions do support it anyway, although
Wikipedia describes EUC-TW as "rarely used".
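
One way such a check could be scripted is sketched below. This is an assumption about a possible test, not a record of how the results above were obtained. It calls the Win32 function `IsValidCodePage` from kernel32 through `ctypes`; the helper name `codepage_available` is made up for this sketch, and it returns None when not run under native Windows Python:

```python
import ctypes
import sys


def codepage_available(codepage):
    """Ask Windows whether `codepage` is valid on this system.

    Returns True or False on native Windows Python, and None elsewhere,
    where the Win32 API cannot be reached this way.
    """
    if sys.platform != "win32":
        return None
    # IsValidCodePage returns nonzero if the code page is valid.
    return bool(ctypes.windll.kernel32.IsValidCodePage(codepage))


if __name__ == "__main__":
    # 936 is GBK; 51936 and 51950 are the EUC-CN and EUC-TW candidates.
    for cp in (936, 51936, 51950):
        print(cp, codepage_available(cp))
```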


> We certainly have other holes as well, but for OS usage I don't see
> any other codeset which would be that important.

I agree.

Andy

