This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line
- From: IWAMURO Motonori <deenheart at gmail dot com>
- To: cygwin at cygwin dot com
- Date: Thu, 4 Jun 2009 00:03:29 +0900
- Subject: Re: 1.7.0-48: [BUG] Passing characters above 128 from bash command line
- References: <3f0ad08d0905290813m39999f81q918e94e3c960eb3f@mail.gmail.com> <3f0ad08d0905290852xe41338alfda89c622f92f677@mail.gmail.com> <4A200BC0.9010704@sidefx.com> <e2480c70905291142o2bcc65ccw2287d175dbd09dd5@mail.gmail.com> <4A204149.2050009@sidefx.com> <e2480c70905291337g6c8bcca7xd0baba79c84629db@mail.gmail.com> <4A2051E5.6060600@sidefx.com> <20090602205440.GF23519@calimero.vinschen.de> <4A26782C.9040207@sidefx.com> <20090603142755.GM23519@calimero.vinschen.de>
Hi.
How about having the Cygwin installer set a locale environment
variable (such as LANG)?
2009/6/3 Corinna Vinschen <corinna-cygwin@cygwin.com>:
> On Jun  3 09:18, Edward Lam wrote:
>> Corinna Vinschen wrote:
>>> The question is, what do you expect? [...]
>> [...]
>> Wikipedia has several suggestions on how to handle invalid UTF-8 byte
>> sequences (http://en.wikipedia.org/wiki/UTF-8). Personally, I favor the
>> rule that uses the replacement character.
>
> Chris implemented using the invalid code point solution. The discussion
> in http://www.mail-archive.com/linux-utf8@nl.linux.org/msg00080.html
> supports this solution. What's missing so far is the way back, from
> an invalid single second half of a surrogate pair in the 0xDCxx range
> back to the correct byte value. I'm just looking into that.
>
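As an illustration only: the sketch below uses Python's `surrogateescape` error handler, which is not the Cygwin code under discussion but follows the same idea. An invalid byte is mapped to a lone low surrogate in the 0xDCxx range, and the reverse mapping restores the original byte. The variable names are made up for the example.

```python
raw = b"\xa9"  # a single byte that is not valid UTF-8 on its own

# Decoding: the invalid byte 0xA9 becomes the lone surrogate U+DCA9.
text = raw.decode("utf-8", errors="surrogateescape")
escaped_code_point = ord(text)

# The way back: the lone surrogate is turned into the original byte again.
round_tripped = text.encode("utf-8", errors="surrogateescape")

print(hex(escaped_code_point))  # 0xdca9
print(round_tripped == raw)     # True
```

The point of the escape is that the round trip is lossless, which a plain replacement character (U+FFFD) cannot offer.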
>> > How is anybody supposed to know that the file which consists
>> > of the single byte 0xa9 has *any* meaning at all? Why should it be
>> > the copyright sign, of all things?
>>
>> What I was attempting to do was to have NO conversion. In the
>> real case where I ran into this, the "bug.exe" was the one to properly
>> interpret what the byte 0xA9 meant from the command line. Yes, I know
>> there are several workarounds.
>
> The command line is always converted to UTF-16 when calling a native
> Win32 application. If we don't do it (because we call CreateProcessA),
> Windows would do it. As matters stand, we have to convert ourselves,
> because we must call CreateProcessW. Either way, the problem persists.
> We just don't know what the correct conversion is for the given input.
> We have to rely on a correct setting of $LC_ALL/$LANG/$LC_CTYPE.
>
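To make the ambiguity concrete, here is a small sketch (Python, chosen only for illustration; the helper `try_decode` is hypothetical) that decodes the single byte 0xA9 under a few encodings. It only demonstrates that the byte has no meaning without knowing the intended encoding; it does not reproduce what Cygwin or Windows does internally.

```python
raw = b"\xa9"  # the byte from the example in this thread


def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a marker string if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<invalid>"


as_cp1252 = try_decode(raw, "cp1252")  # Western European ANSI codepage
as_cp437 = try_decode(raw, "cp437")    # US OEM codepage
as_utf8 = try_decode(raw, "utf-8")     # a lone 0xA9 is not valid UTF-8

for name, value in [("cp1252", as_cp1252), ("cp437", as_cp437), ("utf-8", as_utf8)]:
    print(name, ascii(value))
```

Only the cp1252 reading yields the copyright sign (U+00A9); cp437 yields U+2310, and strict UTF-8 rejects the byte.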
>>> If we default to the ANSI codepage, you will have the same problem,
>>> just upside down. In both cases you will have even more problems if
>>> you start using characters not available in your default codepage.
>>
>> This is where I disagreed with Alexey. What we're really arguing here is
>> which default will run into the fewest problems for the most
>> common usage. This is subjective of course.
>
> Definitely. The "right" solution is always only right for a given value
> of right. What if the user has set LANG to, say, ja_JP.eucJP? That
> user of course expects that the stuff on the command line is converted
> to UTF-16 using the eucJP encoding. Everything else would just be very
> surprising.
>
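A rough sketch of why the eucJP user would be surprised by anything else (Python for illustration; the two-byte input is an assumed example, not taken from the thread): the same bytes produce different UTF-16 data depending on which encoding is used for the conversion.

```python
raw = b"\xa4\xa2"  # two bytes as they might arrive on the command line

# With LANG=ja_JP.eucJP the user expects the EUC-JP interpretation.
as_eucjp = raw.decode("euc_jp")

# Under a Western European ANSI codepage the same bytes mean something else.
as_cp1252 = raw.decode("cp1252")

# UTF-16 (little endian) data that would result in each case.
utf16_from_eucjp = as_eucjp.encode("utf-16-le")
utf16_from_cp1252 = as_cp1252.encode("utf-16-le")

print(ascii(as_eucjp), utf16_from_eucjp.hex())
print(ascii(as_cp1252), utf16_from_cp1252.hex())
```

EUC-JP gives a single hiragana character (U+3042), while cp1252 gives two unrelated symbols (U+00A4, U+00A2).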
> What's left as questionable is the LANG=C default case. Due to the
> discussion from the last month we now use UTF-8 as default encoding,
> because it's the only encoding which covers all (valid) characters.
> Sure, we could also convert the command line using the current ANSI
> codepage as Windows does it when calling CreateProcessA in this case.
>
> Maybe we should do that for testing? Anybody having a strong opinion
> here?
>
>
> Corinna
>
> --
> Corinna Vinschen                  Please, send mails regarding Cygwin to
> Cygwin Project Co-Leader          cygwin AT cygwin DOT com
> Red Hat
>
> --
> Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
> Problem reports:       http://cygwin.com/problems.html
> Documentation:         http://cygwin.com/docs.html
> FAQ:                   http://cygwin.com/faq/
>
>
--
IWAMURO Motonori <http://vmi.jp/>