This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Internal echo of shell behaves (sometimes) different to external echo


On 20 July 2012 11:46, Ralf wrote:
> My problem is not that the script is in ISO-8859-1, nor that the strings
> or ttt.txt are in ISO-8859-1. They have to be in ISO-8859-1 because all my
> scripts are in ISO-8859-1 and they are used together with Windows programs
> (in the DOS box) which read and write only ISO-8859-1.
>
> My problem is to handle, in shell scripts, strings which are coded in
> ISO-8859-1 (and line endings which depend on relative/absolute filenames,
> mounting and so on) without rewriting all the stuff.
>
> So what's the best setting in Cygwin to echo ISO-8859-1? I still don't
> understand why the internal echo behaves differently from the external
> echo.

It's because setting LC_ALL in a bash script is too late for the bash
process itself, which will be using the default C.UTF-8 locale unless
something else is set when bash is invoked.
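
A minimal sketch of setting the locale at invocation time rather than inside the script. The locale name en_US.ISO-8859-1 is an assumption about what you want; the point is only that the variable is already in the environment when the new bash process starts:

```sh
# Assumption: en_US.ISO-8859-1 is the desired locale name.
# An assignment placed in front of the command puts LC_ALL into the
# environment of the new bash process from the moment it starts.
# 2>/dev/null hides the setlocale warning bash prints if that locale
# is not installed on the machine running this sketch.
startup_locale=$(LC_ALL=en_US.ISO-8859-1 bash -c 'printf "%s" "$LC_ALL"' 2>/dev/null)
echo "bash started with LC_ALL=$startup_locale"
```

For a real script the same pattern would look like `LC_ALL=en_US.ISO-8859-1 bash ./myscript.sh` (the script name is hypothetical), or LC_ALL could be exported in whatever environment launches bash.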

When stuff is written to a console (but not a pty-based terminal), the
Cygwin DLL converts it from the process charset (UTF-8 in this case)
to UTF-16 to pass it to the relevant Windows API function. Your
ISO-8859-1 encoded 'Ã' is an invalid byte when interpreted as UTF-8,
hence the error character.
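
To see why that byte is rejected, here is a rough illustration using iconv, which is only a stand-in for the conversion the Cygwin DLL does internally (it is not the code path the DLL uses). The byte 0xC3 is 'Ã' in ISO-8859-1, but in UTF-8 it is a lead byte that must be followed by a continuation byte:

```sh
# Byte 0xC3 (octal 303) is 'Ã' in ISO-8859-1.
# Read as UTF-8 it is a lead byte with nothing after it, so the
# conversion to UTF-16 is expected to fail.
if printf '\303' | iconv -f UTF-8 -t UTF-16LE >/dev/null 2>&1; then
    utf8_status=0
else
    utf8_status=1
fi

# Declared as ISO-8859-1, the same byte converts cleanly; here it is
# re-encoded to UTF-8 and dumped as hex.
latin1_hex=$(printf '\303' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "read as UTF-8: iconv status $utf8_status"
echo "read as ISO-8859-1, re-encoded as UTF-8: $latin1_hex"
```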

/usr/bin/echo, on the other hand, is invoked as a separate process with
LC_ALL already set appropriately in its environment, hence you're getting
the expected output.
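
The builtin/external distinction can be checked with something like the following sketch. It uses env in place of /usr/bin/echo simply because env prints the environment the child process received; the locale name is again an assumption:

```sh
# 'type -t' is a bash builtin that reports how a name would be resolved.
echo_kind=$(bash -c 'type -t echo')
echo "echo resolves to: $echo_kind"

# A variable exported inside a running script is inherited by every child
# process started afterwards, so an external program sees it from startup.
# Assumption: en_US.ISO-8859-1 is the locale name the script sets.
child_env=$(bash -c 'LC_ALL=en_US.ISO-8859-1; export LC_ALL; env' 2>/dev/null | grep '^LC_ALL=')
echo "child process sees: $child_env"
```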

Andy


