This is the mail archive of the cygwin-apps mailing list for the Cygwin project.



Re: setup for 1.7 fails with Japanese characters in download


On Oct  3 08:51, Andy Koppe wrote:
> wynfield:
> > My base os is Japanese OEM Windows XP.
> >
> > I've only used the standard c:\cygwin-packages directory, so have not had any problems. But, to confirm Gernot's report, I created a C:\japanese_dirname???????????? directory and tried to download a package into it. It fails.
> >
> > setup reports the following type of message:
> >
> > No such file or directory: C:\japanese_dirname????????????/http%3...../release-2/..bzip_filename
> 
> The problem is that setup.exe's GUI uses the default ANSI codepage
> (932 in this case), whereas MSVCRT functions such as fopen() use the
> "C" locale by default. [...]
> 
> Now theoretically it should be possible to address this with a
> 'setlocale(LC_ALL, "")' call. However, after changing my Win7 system's
> default codepage to Japanese, I found that GetACP() would indeed
> return 932, but that 'setlocale(LC_ALL, "")' still yielded
> "English_United Kingdom.1252".
> 
> Hence the slightly more circuitous route in the patch below. Seems to
> do the job.
> [...]

This leads to another question.  When unpacking distro tar archives,
all archives hopefully only contain filenames with ASCII chars in them.
However, there's no reason to keep it this way in the future, if we
make sure that everybody uses the same charset.

Therefore I'd like to propose that distro tar archives are in future
*always* generated in the "C" locale, so setup can be sure all tar
archives are UTF-8 encoded.  

For setup it should be sufficient for now to make sure the installation
directory gets converted to UTF-8, and to change the mbstowcs calls in
mklongpath() in filemanip.cc to MultiByteToWideChar (65001, ...) calls
(65001 is the UTF-8 codepage, CP_UTF8).
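
As a rough illustration of the conversion proposed above, the sketch below
turns a UTF-8 path into UTF-16.  It is not setup's actual code:
utf8_to_utf16() is a hypothetical helper name, and the real mklongpath()
may be structured quite differently.  On Windows it calls
MultiByteToWideChar with CP_UTF8 (65001); the non-Windows branch is a
hand-written decoder added here only so the example can be tried elsewhere.
Both branches are assumed to return an empty string on malformed input.

```cpp
// Sketch only: utf8_to_utf16() is a hypothetical helper, not part of setup.
#include <cassert>
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Convert a UTF-8 encoded path to UTF-16.
// Returns an empty string if the input is empty or not valid UTF-8.
std::u16string utf8_to_utf16(const std::string &in)
{
#ifdef _WIN32
  if (in.empty())
    return std::u16string();
  // First call: ask how many UTF-16 code units are needed.
  int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                              in.data(), (int) in.size(), NULL, 0);
  if (n <= 0)
    return std::u16string();
  // Second call: do the actual conversion into the buffer.
  std::wstring w(n, L'\0');
  MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                      in.data(), (int) in.size(), &w[0], n);
  return std::u16string(w.begin(), w.end());
#else
  // Portable fallback: decode UTF-8 by hand, emit UTF-16 code units.
  static const char32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size())
    {
      unsigned char b = static_cast<unsigned char>(in[i]);
      char32_t cp;
      std::size_t extra;      // number of continuation bytes expected
      if (b < 0x80)                { cp = b;        extra = 0; }
      else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
      else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
      else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
      else
        return std::u16string();          // invalid lead byte
      assert(extra <= 3);
      if (i + extra >= in.size())
        return std::u16string();          // truncated sequence
      for (std::size_t k = 1; k <= extra; ++k)
        {
          unsigned char c = static_cast<unsigned char>(in[i + k]);
          if ((c & 0xC0) != 0x80)
            return std::u16string();      // not a continuation byte
          cp = (cp << 6) | (c & 0x3F);
        }
      i += extra + 1;
      // Reject overlong forms, surrogates and out-of-range code points.
      if (cp < min_cp[extra] || cp > 0x10FFFF
          || (cp >= 0xD800 && cp <= 0xDFFF))
        return std::u16string();
      if (cp < 0x10000)
        out.push_back(static_cast<char16_t>(cp));
      else
        {
          // Code points above the BMP become a surrogate pair.
          cp -= 0x10000;
          out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
          out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
  return out;
#endif
}
```

The point of the sketch is that the result no longer depends on the ANSI
codepage or the C runtime locale, unlike an mbstowcs-based conversion: the
same UTF-8 bytes always yield the same UTF-16 string.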

In the long run it would be better if setup.exe used widechar functions
as much as possible, even for reading strings from the GUI.  This removes
the dreaded codepage problem and leaves just one conversion, namely
the conversion from UTF-8 filenames in tar archives to UTF-16.


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

