This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: zsh 4.3.9-1: text-mode stdin problem (breaking base64)


2010/04/24 10:03 Peter A. Castro wrote:
Could you give me a simple test case that fails without
cygwin_premain0()? I set my filesystems as text-mode and tried to find
such cases, but I couldn't.

It's been a while since I've looked at this, but the problem was mostly with binary-mode mounts, not text-mode mounts. Say you had your root mounted as text-mode, but your /tmp mounted as binary-mode. Zsh (and other utilities) create temp files fairly often and feed those as input to themselves or to other programs. Or reverse the case (root mounted binary and /tmp mounted text).

{f}open() in Cygwin is context sensitive to the filesystem mount mode.
This leads to such situations as calling fopen("/tmp/foo","r") and
expecting it to read "text" lines, but "/tmp" is mounted binary and file
"foo" contains CRLF's because it was created by a Windows program or
editor. So, when you read the lines you will get the CR as well as the
LF, when you really only want the LF. Whereas if "/tmp" were mounted
text, the CR would be stripped off as part of text processing.
Thank you. Indeed, even someone who mounts root (or some other filesystems) as text-mode might still mount /tmp as binary-mode. So I see that we need to handle such cases.
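
For illustration only, here is a minimal C sketch of what a program has to do by hand when a line from a CRLF file arrives with the CR still attached, as it does on a binary-mode mount. The helper name strip_cr_before_lf() is hypothetical and is not taken from Zsh, Bash, or Cygwin; it only mimics, for one line, the CR removal that a text-mode mount would do automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if the line ends in "\r\n", rewrite the ending
 * to "\n" in place. Returns the resulting length of the line.
 * Lines that already end in a bare "\n" are left untouched. */
size_t strip_cr_before_lf(char *line)
{
    assert(line != NULL);

    size_t len = strlen(line);
    if (len >= 2 && line[len - 2] == '\r' && line[len - 1] == '\n') {
        line[len - 2] = '\n';   /* overwrite the CR with the LF */
        line[len - 1] = '\0';   /* and shorten the string by one */
        len--;
    }
    return len;
}
```

A caller would typically apply this to each buffer returned by fgets() when it cannot rely on the mount mode to do the conversion.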

I thought about two cases:
* If you don't use CRLF scripts at all and mount all your filesystems as
binary-mode, there should be no problem (without premain hack).

In a pure Cygwin ecosystem that might work. However, many Cygwin users have to interact with non-Cygwin created data and files. If you ask the good users on this mailing list you might find that people have any combination of file systems mounted for their particular needs.

* If you use CRLF scripts and mount all your filesystem as text-mode,
there should be no problem (without premain hack).

But, now, you won't get binary data from the files using a naked "open()" as so many typically coded apps do.

Is it right?

If you could keep things strictly black-and-white like that, yes, in theory these could work. Well, the first one would be preferable as opposed to the second one. But the problem is that most Cygwin users don't operate in such a strict environment.
I may have been shortsighted. In particular, I hadn't given much thought to using text-mode and binary-mode mounts at the same time.

I don't know the zsh code well, but I think it will be hard to do the
hack without cygwin_premain0(), as you said. But how about bash? bash
does not seem to have such hacks, yet it seems to work well. And I think
it's confusing that bash and zsh treat stdin in different modes.

Have a look at Bash code some time. I recall seeing some O_TEXT options being set in the various {f}open()'s that it does. Again, I looked at doing the same in Zsh code, but after some initial experiments it proved that there were too many dependencies and assumptions about the carriage-control of "text" files to make it work quickly.
I took a look at Bash code and found it sometimes opens filehandles in text-mode, although I didn't read in detail. Anyway, Bash is also apparently not perfect (for example, it can't read CRLF scripts on binary-mode filesystems), so I see that we can't say which is right.
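
As a rough sketch of the explicit-mode approach mentioned above, a program can pass O_BINARY (or O_TEXT) to open() instead of using a "naked" open() that inherits the mount mode. The helper read_raw() below is hypothetical and is not code from Bash or Zsh. It assumes that O_BINARY comes from <fcntl.h> on Cygwin, and defines it as 0 elsewhere so the same source builds on systems with no text/binary distinction.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* On platforms without a text/binary distinction the flag is a no-op. */
#ifndef O_BINARY
#define O_BINARY 0
#endif

/* Hypothetical helper: read up to `count` raw bytes from `path`,
 * requesting binary mode explicitly rather than relying on how the
 * containing filesystem is mounted. Returns the number of bytes read,
 * or -1 if the file cannot be opened or read. */
ssize_t read_raw(const char *path, void *buf, size_t count)
{
    assert(path != NULL && buf != NULL);

    int fd = open(path, O_RDONLY | O_BINARY);
    if (fd == -1)
        return -1;

    ssize_t n = read(fd, buf, count);
    close(fd);
    return n;
}
```

Replacing O_BINARY with O_TEXT would give the opposite behaviour (CR stripping on read) for files that are known to be text.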

Indeed, it's theoretically right that any program which performs binary
I/O should set stdin/stdout to binary mode for portability. But
in practice, it would be a lot of work to check that all programs on our
system follow the rule, and I think the check can't be perfect. I'd

Regardless of how much work it might be, it's a matter of "due diligence". When you find something that doesn't behave appropriately, report it to the maintainers.

And, in that vein, yes, I acknowledge there are issues with Zsh in this
area. The premain is one "solution" that works for most cases. You
appear to have found one case that doesn't work as expected
(congratulations!). But, as I said, that particular case appears to be
more a matter of how the stdin handle should be treated so that it works
appropriately.
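
To make the stdin point concrete, here is a minimal sketch of how a program that does binary I/O (base64, for example) might put a standard stream into binary mode itself, regardless of what the shell left it in. The function name force_binary() is hypothetical. The sketch assumes that Cygwin declares setmode() in <io.h> and that it returns -1 on failure; on other systems the function does nothing.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#ifdef __CYGWIN__
#include <io.h>     /* assumed to declare setmode() on Cygwin */
#endif

/* Hypothetical helper: switch `stream` to binary mode.
 * Returns 0 on success (or where no mode distinction exists),
 * -1 if the mode could not be changed. */
int force_binary(FILE *stream)
{
    assert(stream != NULL);
#if defined(__CYGWIN__) && defined(O_BINARY)
    return setmode(fileno(stream), O_BINARY) == -1 ? -1 : 0;
#else
    (void)stream;   /* nothing to do on this platform */
    return 0;
#endif
}
```

A filter would call force_binary(stdin) and force_binary(stdout) once at startup, before reading or writing any data.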

This problem is still under consideration. Having more than one type of
filesystem mode is part of the equation and attempting to treat that
correctly is somewhat difficult in Zsh.
Yes, this is not as simple a problem as I thought. For now, if I find another problem like the base64 one, I will report it.

rather keep all my scripts as LF than break my data by some programs
like base64, so I will continue to use the customized zsh.

If that works for you, great. That's why the source is available. I do hope to get back to this issue at some point. Thanks for pointing it out.
My pleasure. I understand that the public package needs the premain hack for now. Thank you for considering my post, and I appreciate your maintaining the Cygwin Zsh package.
