This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: [Q] Use of UTF-8 in cygwin bash shell scripts
- From: Igor Pechtchanski <pechtcha at cs dot nyu dot edu>
- To: Arifi Koseoglu <arifi at tnn dot net>
- Cc: cygwin at cygwin dot com
- Date: Fri, 2 Apr 2004 16:54:35 -0500 (EST)
- Subject: Re: [Q] Use of UTF-8 in cygwin bash shell scripts
- References: <CKEEILAKADKCNPNMDCJPIEAECAAA.arifi@tnn.net>
- Reply-to: cygwin at cygwin dot com
On Sat, 3 Apr 2004, Arifi Koseoglu wrote:
> Hello everyone.
>
> I have a question regarding the use of UTF-8 in a Cygwin bash shell script
> under Windows XP and 2000 (does the behavior differ between 2000 and XP?).
>
> I have a bash script automatically generated with a Perl program, which is
> supposed to copy files from one disk to another and at the same time replace
> all international characters in the filename and path with English
> counterparts (for example c with cedilla becomes c).
>
> The lines in the shell script are all of the form:
>
> cp "source path with international chars in it" "target with no
> international chars"
>
> The shell script is generated and saved in UTF-8 encoding (since it has to
> properly contain the international chars). By the way, by "international" I
> mean the additional characters in the Turkish alphabet, but the same
> question should apply to all non-English alphabets.
>
> Now, I cannot get the script to work. I can 'ls' the files using
>
> $ ls "source path with international chars in it"
>
> and the listing displays the Turkish characters properly. However, whenever
> I execute the script, bash complains that "source path with
> international chars in it" cannot be found.
>
> What am I missing? Does bash not support scripts encoded in UTF-8? Should I
> use another Unicode encoding (and how)? Or should I trash this method and try
> something else (what)? There are thousands of files to be renamed.
>
> I would deeply appreciate any pointers. Many thanks in advance.
> Best,
> Arifi
Bash doesn't support UTF-8. You might get away with using the appropriate
8-bit encoding based on your codepage. Alternatively, just iterate over
the directory contents and rename each file from Perl.
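
The second suggestion (walk the directory and rename from a program
instead of generating a shell script) might look roughly like the sketch
below. It is written in Python rather than Perl, and everything in it is
an assumption made for illustration: the mapping table, the function names
ascii_name and copy_tree_ascii, and the idea that only the extra Turkish
letters need replacing. It also assumes the runtime hands file names to the
program as properly decoded Unicode strings.

```python
import os
import shutil

# Hypothetical mapping for the extra letters of the Turkish alphabet.
# Extend it for other alphabets as needed.
TURKISH_TO_ASCII = str.maketrans({
    "ç": "c", "Ç": "C",
    "ğ": "g", "Ğ": "G",
    "ı": "i", "İ": "I",
    "ö": "o", "Ö": "O",
    "ş": "s", "Ş": "S",
    "ü": "u", "Ü": "U",
})


def ascii_name(name: str) -> str:
    """Replace Turkish letters in one file or directory name."""
    return name.translate(TURKISH_TO_ASCII)


def copy_tree_ascii(src_root: str, dst_root: str) -> None:
    """Copy src_root into dst_root, transliterating every path component."""
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        # Transliterate each component of the relative path separately.
        parts = [] if rel == os.curdir else rel.split(os.sep)
        target_dir = os.path.join(dst_root, *[ascii_name(p) for p in parts])
        os.makedirs(target_dir, exist_ok=True)
        for filename in filenames:
            shutil.copy2(
                os.path.join(dirpath, filename),
                os.path.join(target_dir, ascii_name(filename)),
            )
```

A call such as copy_tree_ascii("source", "target") (both paths are
placeholders) would then replace the thousands of generated cp lines. Note
that two source names that differ only in the replaced letters end up with
the same target name, and the later copy silently overwrites the earlier
one, so a real run would probably want a collision check first.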
Igor
--
http://cs.nyu.edu/~pechtcha/
|\ _,,,---,,_ pechtcha@cs.nyu.edu
ZZZzz /,`.-'`' -. ;-;;,_ igor@watson.ibm.com
|,4- ) )-,_. ,\ ( `'-' Igor Pechtchanski, Ph.D.
'---''(_/--' `-'\_) fL a.k.a JaguaR-R-R-r-r-r-.-.-. Meow!
"I have since come to realize that being between your mentor and his route
to the bathroom is a major career booster." -- Patrick Naughton