This is the mail archive of the cygwin mailing list for the Cygwin project.



RE: [BUG REPORT]sed -e 's/[B-D]/_/g' replaces unexpected characters


> Your locale is zh_CN.UTF-8.  What you're expecting is only guaranteed
> in the C locale:

I'm not quite sure it applies here.  I'm using US English Windows 7.

LANG = 'en_US.UTF-8'

I get the same result:

$ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
ab__eA___E

BUT:

$ echo abcdeABCDE | LANG=C sed 's/[B-D]/_/g'
abcdeA___E

This is very weird, indeed.

OTOH, on Linux I have the same LANG setting, yet it works as
expected:

> echo $LANG
en_US.UTF-8
> echo abcdeABCDE | sed -e 's/[B-D]/_/g'
abcdeA___E

I believe that the UTF-8 encoding of "abcdeABCDE" under en_US.UTF-8
is byte-for-byte identical to its ASCII encoding.
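
A quick way to sanity-check that claim is sketched below. The helper
name is_ascii is made up for illustration. It only tests whether a
string contains any byte outside the 7-bit ASCII range; it says nothing
about how sed orders characters within a range in a given locale.

```shell
# Hypothetical helper: succeeds only if the argument contains no byte
# outside 7-bit ASCII (0x00-0x7F).
is_ascii() {
    # In the C locale tr operates on single bytes. Delete every ASCII
    # byte and check that nothing is left over.
    [ -z "$(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177')" ]
}

is_ascii 'abcdeABCDE' && echo "pure ASCII"
```

If this prints "pure ASCII", the bytes fed to sed are the same in both
locales, so any difference in the result would have to come from how
the range [B-D] is interpreted rather than from the encoding of the
input.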

Anton Lavrentiev
Contractor NIH/NLM/NCBI

