This is the mail archive of the cygwin mailing list for the Cygwin project.



filenames with characters that have the high bit set


I've read http://cygwin.com/faq/faq-nochunks.html#faq.using.unicode and
http://cygwin.com/cygwin-ug-net/setup-locale.html but I'm still stumped.

My cygwin.bat now contains:

@echo off

C:
chdir C:\utils\cygwin\bin
set LANG=en_US.UTF-8
bash --login -i

And my ~/.inputrc contains:

set meta-flag on
set convert-meta off
set input-meta on
set output-meta on

$ echo $LC_ALL
en_US

$ echo $LANG
en_US.UTF-8
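
One thing worth checking in the settings above: under the POSIX locale rules, a non-empty LC_ALL takes precedence over LC_CTYPE and LANG, so with LC_ALL=en_US the LANG=en_US.UTF-8 setting is not the one programs consult for character handling. Whether that explains the behaviour described below is not established. The function here is a hypothetical helper (effective_ctype is not a Cygwin or bash command) that only spells out that precedence:

```shell
# Hypothetical helper: given the values of LC_ALL, LC_CTYPE and LANG,
# print the one that wins for character classification.
# POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
effective_ctype() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$3"
    fi
}

# The values shown in this post: LC_ALL=en_US, LC_CTYPE unset, LANG=en_US.UTF-8.
effective_ctype "en_US" "" "en_US.UTF-8"
```

On a live system, running "locale" with no arguments reports the values actually in effect.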

For the rest of this post, assume <special_filename> is "foo" with U+00E9 (e
with acute accent) at the end.

$ test -f <special_filename>; echo $?

prints 1 even though <special_filename> really does exist, depending on how
I try to represent U+00E9 on the command line.

$ ls foo<tab>

adds the actual accented character to the command line (whether set
show-all-if-ambiguous on is in ~/.inputrc or not).  Then I press return and
ls prints the filename.  Then if I go through command history and change
"ls" to "test -f" and add the "; echo $?" I get the right answer from test.
So far so good.

But if I try to do what
http://cygwin.com/cygwin-ug-net/using-specialnames.html#pathnames-unusual
says, the test command always fails, and ls doesn't print the filename.  I'm
not really sure how to get hex code 0x18 through bash to
ls/test/whatever properly.  This is what I tried:

$ ls "foo\x18<tab>"
$ ls "foo\x18\xc3\xa9<tab>"
$ ls "foo\x18\xc3\xa9*"

Note that 0xC3A9 is the UTF-8 encoding of U+00E9.
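
As a quick sanity check of that byte sequence, here is a small sketch that emits the two bytes and dumps them in hex. It uses octal escapes (\303\251 is the same as hex c3 a9) because octal is the form every POSIX printf accepts; the variable name "bytes" is just for illustration.

```shell
# Emit the two bytes 0xC3 0xA9 and show them in hex.
# \303 and \251 are the octal spellings of 0xC3 and 0xA9.
bytes=$(printf '\303\251' | od -An -tx1 | tr -d ' \n')
echo "$bytes"
```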

But all of these get me nothing.  Replacing "ls" with "test -f" gives me the
same nothing.  Replacing \x with \X doesn't change anything either.

Perhaps interesting is that if I pipe the ls command built with tab
completion that actually prints the filename to "od -c" I see

Then for kicks I tried:

$ touch "\x18"; echo $?
0

but I didn't see any new file created.

$ touch "\x18\xc3\xa9"; echo $?
touch: cannot touch `\\x18\\xc3\\xa9': Not a directory
1

Neither of these seems quite right.
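
A possible explanation for the attempts above, offered as a sketch rather than a diagnosis: inside double quotes the shell does not expand \x escapes, so the command receives a literal backslash, an "x", and the hex digits. The doubled backslashes in the touch error message are consistent with that. The snippet below compares the bytes of a double-quoted string with the bytes produced by printf, which does expand escapes in its format string. The variable names are illustrative only.

```shell
# Double quotes leave \x alone: the string holds backslash, 'x', digits.
literal="foo\xc3\xa9"
literal_hex=$(printf '%s' "$literal" | od -An -tx1 | tr -d ' \n')

# printf expands escapes in its format string; octal \303\251 is the
# portable spelling of bytes 0xC3 0xA9.
expanded=$(printf 'foo\303\251')
expanded_hex=$(printf '%s' "$expanded" | od -An -tx1 | tr -d ' \n')

echo "double-quoted: $literal_hex"
echo "via printf:    $expanded_hex"
```

In bash specifically, ANSI-C quoting such as $'foo\xc3\xa9' should yield the same bytes as the printf form, so something like test -f "$expanded" or ls $'foo\xc3\xa9' may be worth trying. Whether the 0x18 prefix from the User's Guide is needed at all in a UTF-8 locale is a separate question that this sketch does not answer.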

Can someone give me a hand coming up with a command line where I can build
up filenames that contain characters that have the high bit set (or really
any non-ASCII character)?

Thanks much.

-DB

Attachment: cygcheck.out
Description: Binary data

