This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: UTF8 support in Cygwin


> >I am working on a patch which would add UTF8 support to Cygwin.
> >i.e. Unicode filenames would be encoded as UTF8 before being returned by,
> >e.g., readdir and then converted back to Unicode before being passed to
> >the Windows API.
> >This would solve Ville Herva's problem where he/she wanted to back up a
> >filesystem containing Unicode filenames using Cygwin, but found that the
> >Unicode characters were converted to question marks. Also, with an
> >appropriate terminal, it is actually possible to view the Unicode
> >characters (although at the moment, it is not possible to input them
> >correctly AFAIK).
> >The code is currently guarded by a CYGWIN environment variable flag,
> >'utf8'.
>
> A long awaited feature!
>
> This would really help for "star" and "mkisofs".
>
> Star needs to archive UTF-8 coded names in the POSIX.1-2001 filenames
> and "mkisofs" needs to deal with UNICODE names in Joliet and UDF.
>
> How about using the LC_* locale setup to force UTF-8 coding?

The utf8 flag turns on conversion of Windows filenames from Unicode to UTF-8
and back again. This is completely unrelated to the LC_* stuff. You could
use UTF-8 filenames without the utf8 flag; they would just be stored as
UTF-8 bytes on the Windows filesystem, and your application would see exactly
the same filename. The only reason to use this flag is if you wish to access
existing files which have Unicode filenames under Windows NT.
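
To make the difference concrete, here is a rough sketch of the two paths a
Unicode filename can take on its way to an application. It is only an
illustration in Python, not the patch itself: the function names, the sample
filename and the cp1252 code page are all assumptions, and Python's codecs
stand in for the Windows conversion calls.

```python
# Illustration only: Python codecs stand in for the Windows conversion calls.
# All names here are hypothetical; none of them come from the Cygwin sources.

def windows_to_codepage(wide: bytes, codepage: str = "cp1252") -> bytes:
    """Without the flag: convert to a code page (cp1252 is an assumption).

    Characters the code page cannot represent are replaced with '?'.
    """
    return wide.decode("utf-16-le").encode(codepage, errors="replace")


def windows_to_utf8(wide: bytes) -> bytes:
    """With the flag: the Unicode name reaches the application as UTF-8."""
    return wide.decode("utf-16-le").encode("utf-8")


def utf8_to_windows(narrow: bytes) -> bytes:
    """On the way back: UTF-8 from the application becomes Unicode again."""
    return narrow.decode("utf-8").encode("utf-16-le")


# Hypothetical Cyrillic filename, held as UTF-16LE the way NT stores it.
wide_name = "файл.txt".encode("utf-16-le")

lossy = windows_to_codepage(wide_name)
utf8_name = windows_to_utf8(wide_name)

print(lossy)                                   # b'????.txt'
print(utf8_name)                               # UTF-8 bytes, nothing lost
print(utf8_to_windows(utf8_name) == wide_name)  # True
```

The first path loses the name for good, which is the question-mark problem
described above. The second path round-trips, so the name an application gets
back from readdir can be handed straight back to open the same file.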

For the record, I've now changed all the calls to MultiByteToWideChar, etc.
to sys_utf8towcs, to make it more consistent with the other conversion
functions.

Chris


