This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: [1.7][python] File operation API to multibyte filenames fails.
- From: Corinna Vinschen <corinna-cygwin at cygwin dot com>
- To: cygwin at cygwin dot com
- Date: Fri, 8 May 2009 15:09:01 +0200
- Subject: Re: [1.7][python] File operation API to multibyte filenames fails.
- References: <3f0ad08d0905080602s36a9eddg852eaa3ea3a2a69f@mail.gmail.com>
- Reply-to: cygwin at cygwin dot com
On May 8 22:02, IWAMURO Motonori wrote:
> Hi.
>
> File operation API to multibyte filenames fails on Python and Cygwin-1.7.
> Which Python or Cygwin-1.7 should be fixed?
>
> My environment: Windows XP SP3, Cygwin-1.7.0-46, and LANG=ja_JP.UTF-8
>
> The following code fails on the directory which has multibyte filenames:
>
> >>> import os
> >>> os.listdir(".")
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> OSError: [Errno 138] Invalid or incomplete multibyte or wide character: '.'
>
> The following code works correctly:
>
> >>> import os
> >>> import locale
> >>> locale.setlocale(locale.LC_CTYPE, '')
> 'ja_JP.UTF-8'
> >>> os.listdir(".")
> [(snip), '\xe3\x82\xb9\xe3\x82\xbf\xe3\x83\xbc\xe3\x83\x88
> \xe3\x83\xa1\xe3\x83\x8b\xe3\x83\xa5\xe3\x83\xbc',
> '\xe3\x83\x87\xe3\x82\xb9\xe3\x82\xaf\xe3\x83\x88\xe3\x83\x83\xe3\x83\x97']
>
> However, it is impossible to fix all the python scripts.
>
> There are two causes.
>
> - Python has intentionally evaded the execution of setlocale(LC_ALL,
> "") and/or setlocale(LC_CTYPE, "").
> - When locale is not appropriately set, Cygwin-1.7 converts non-ASCII
> character into a special sequence. (see "Convert chars invalid in the
> current codepage to a sequence ASCII SO" part of sys_cp_wcstombs in
> winsup/cygwin/strfuncs.cc)
>
> Which Python or Cygwin-1.7 should be fixed?
Your scripts. Python correctly doesn't call setlocale itself because it's
the responsibility of the application to set the locale if it uses
non-ASCII chars. And Cygwin simply has no chance to convert UTF-8
to UTF-16 if the application doesn't ask for UTF-8.
Corinna
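
The advice above can be sketched as a script that sets its own locale before
touching any filenames. This is only a minimal illustration, not code from
the thread: the helper names init_locale and list_names are made up for the
example, and it assumes LANG (or LC_ALL / LC_CTYPE) names the locale the
filenames are encoded in, e.g. ja_JP.UTF-8 as in the report.

```python
import locale
import os


def init_locale():
    """Adopt the locale named by the environment for LC_CTYPE.

    Hypothetical helper: returns the name of the LC_CTYPE locale in effect
    after the call.
    """
    try:
        # An empty string means "use LANG / LC_ALL / LC_CTYPE from the environment".
        return locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale this system does not provide.
        # Leave the locale unchanged and just report the current setting.
        return locale.setlocale(locale.LC_CTYPE)


def list_names(path="."):
    """Hypothetical helper: list directory entries once the locale is set."""
    return os.listdir(path)


if __name__ == "__main__":
    print(init_locale())
    for name in list_names("."):
        print(name)
```

Calling init_locale() once at the top of each script is the same step as the
locale.setlocale(locale.LC_CTYPE, '') call in the working example quoted
above, just done before the first file operation.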
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat