This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: [Fwd: [1.7] wcwidth failing configure tests]


On Jun  5 18:25, Thomas Wolff wrote:
> IWAMURO Motonori wrote:
> > 2009/5/21 Thomas Wolff <towo@towo.net>:
> > >> > Therefore, I propose to use *_cjk() when the language part of LC_CTYPE
> > >> > is 'ja', 'ko', 'vi' or 'zh'.
> > > The problem with this is
> > > 1. As you say, there is no standard.
> 
> > But,
> > - I think that my proposal doesn't violate any specification.
> I think it does. Part of the locale information is the "charmap" 
> (called "codepage" on DOS/Windows). It may be implicit like 
> with LC_CTYPE=zh_CN which defines "GB2312" as its charmap, but it 
> is typically explicit like in en_US.UTF-8 - the intention is 
> that the "codepage" information should be the same for all locales 
> having the "UTF-8" (or any other) charmap. So you cannot freely 
> change width information among locales with the same charmap.
> Also, if ja_JP.UTF-8 meant "CJK width", how would you specify 
> a working locale setting for a terminal that does not run a CJK width 
> font but should still use other Japanese settings? E.g. with rxvt, which 
> does not support CJK width.
> 
> However, there is one resort within the locale mechanism that can be used;
> the locale syntax allows for an optional "modifier" which can be used to 
> specify deviations, e.g.
> 	de_DE           has charmap ISO-8859-1
> 	de_DE@euro      has charmap ISO-8859-15
> 	uz_UZ           has charmap ISO-8859-1
> 	uz_UZ@cyrillic  has charmap UTF-8
> 	aa_ER and aa_ER@saaho both have charmap UTF-8 (with some other difference).
> Thus you could define e.g.
> 	ja_JP.UTF-8@cjk
> or
> 	ja_JP.UTF-8@cjkwidth
> to indicate CJK width properties. I guess this is the most compliant way to go.

I like this approach.  It's also more flexible than using the language
specifier.

<nit-picking>
Thomas, couldn't you have discussed this in the two weeks I was on
vacation?  Why did you wait until I implemented the language-based
approach?
</nit-picking>

Now, we just have to agree on the modifier and somebody has to implement
this in newlib/libc/locale/locale.c.  So far the modifier is ignored
entirely (de_DE@euro will still use ISO-8859-1).

I vote for @cjkwide, regardless of Andy's objection.  People using CJK
will know what it means, and it has the additional advantage of being
an identifier that is easy to remember.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


