This is the mail archive of the cygwin mailing list for the Cygwin project.



RE: Bogus assumption prevents d2u/u2d/conv/etal working on mixed files.


> From: David Fritz
> Sent: Sunday, April 04, 2004 6:46 AM

> Charles Wilson wrote:
> [...]
> >   (2) it's an attempt to prevent users from permanently scrogging binary
> > files.  See: d2u, on a binary file, is an irreversible operation.  So,
> > if you do "d2u *" you'll probably kill something deep inside some binary
> > file, and you can't fix it -- unless some minimal safeguards are in place.
> >
> >   u2d MAY be reversible -- IF there were no pre-existing \r\n
> > combinations in the file to begin with -- so when (OMG-fixit-)d2u is
> > run, obviously the first '\n' is preceded by a (newly-added) '\r', so
> > the prog merrily replaces ALL '\r\n' with '\n'...which MAY fix your
> > oops, but maybe not.
> >
> >
> > So, with the current code, if you snarf the first "line" -- all chars
> > until the first '\n' -- if it's a binary file the odds are pretty low
> > that the immediately-preceding character is a '\r' -- so d2u as
> > currently coded will bail out, and no harm is done.
> >
> > It doesn't work so well in the other direction -- by the same logic
> > above, you'll almost never bail out early if you run 'u2d' on a binary
> > file -- but if you immediately do a 'd2u' you MIGHT be able to recover.)
> >
> [...]
>
> If detection of binary files is desirable, why not use an explicit test
> with a more robust methodology?  GNU grep detects binary files by looking
> for a '\0' byte.  Such a test could be used by both d2u and u2d; they could
> bail out with a message like "skipping binary file".
>
> Cheers
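
The behaviour discussed in the quoted text can be sketched in a few lines of
Python. This is only an illustration, not the actual d2u/u2d source: the
function names are made up, u2d is assumed to leave an existing '\r\n' alone
(which is what the "MAY be reversible" remark implies), and the 4096-byte
probe size for the NUL test is an arbitrary choice.

```python
import re


def d2u(data: bytes) -> bytes:
    """DOS -> Unix: replace every CRLF pair with a bare LF."""
    return data.replace(b"\r\n", b"\n")


def u2d(data: bytes) -> bytes:
    """Unix -> DOS: add a CR before each LF that does not already have one.

    Assumption: existing CRLF pairs are left untouched.
    """
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def first_line_is_dos(data: bytes) -> bool:
    """The first-line heuristic: is the first LF immediately preceded by a CR?"""
    i = data.find(b"\n")
    return i > 0 and data[i - 1:i] == b"\r"


def looks_binary(data: bytes, probe: int = 4096) -> bool:
    """Explicit test in the style attributed to GNU grep: look for a NUL byte."""
    return b"\0" in data[:probe]


if __name__ == "__main__":
    unix_text = b"one\ntwo\n"
    mixed_text = b"one\ntwo\r\n"          # first line Unix, second line DOS
    blob = b"\x00\x01\r\n\x02\n"          # stand-in for a binary file

    # Pure Unix text survives a u2d/d2u round trip.
    print(d2u(u2d(unix_text)) == unix_text)

    # A file that already held a CRLF does not: the original CRLF is lost.
    print(d2u(u2d(mixed_text)) == mixed_text)

    # d2u on the blob cannot be undone: u2d cannot know which LF had a CR.
    print(u2d(d2u(blob)) == blob)

    # The first-line heuristic refuses the mixed file even though it has a CRLF.
    print(first_line_is_dos(mixed_text))

    # The NUL test flags the blob and passes the text files.
    print(looks_binary(blob), looks_binary(mixed_text))
```

The mixed-file case is where the first-line heuristic and the explicit NUL
test disagree: the heuristic skips a file that does contain '\r\n', while the
NUL test lets it through.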

A more "foolproof" test (does such a thing exist?) would be to refuse to run
d2u/u2d on anything in a directory listed in $PATH. That one has its
disadvantages too, though fewer, IMO.

 I find all this "safety"-related stuff to be a PITA at times. Any kind of test
is bound to fail in some cases, and in others it is just a source of
confusion: a lot of bug-hunting for very little gain.

 How about running d2u/u2d on, say, a regedit 5 file (i.e. mostly ASCII, but
because of the encoding every other byte is a NUL)?
 Would that be considered "legal"? IMO it should be: a fast and easy way to
strip the garbage and create a file that can be used with normal tools.
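
To make the regedit 5 case concrete, here is a small sketch, assuming the file
is UTF-16LE with a byte-order mark, as regedit version 5 exports normally are.
The registry key and value are invented for the example, and the helper names
are hypothetical. It shows that a NUL-byte test would classify such a file as
binary, and that a purely byte-wise '\r\n' replacement finds nothing to
replace, because each CR and LF is followed by a NUL. Whether the real d2u
does anything useful on such a file depends on its implementation.

```python
import codecs


def make_sample_reg() -> bytes:
    """Build a tiny regedit-5-style export: BOM plus UTF-16LE text with CRLF."""
    text = (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[HKEY_CURRENT_USER\\Software\\Example]\r\n"   # hypothetical key
        "\"Name\"=\"value\"\r\n"
    )
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def reg_to_unix_text(raw: bytes) -> str:
    """Decode first (the BOM selects the byte order), then convert line endings."""
    return raw.decode("utf-16").replace("\r\n", "\n")


if __name__ == "__main__":
    raw = make_sample_reg()

    # Every ASCII character is followed by a NUL byte, so a NUL test says "binary".
    print(b"\0" in raw)

    # On disk a line ending is the 4 bytes 0D 00 0A 00, never the pair 0D 0A,
    # so a byte-wise CRLF -> LF replacement leaves the file unchanged.
    print(raw.replace(b"\r\n", b"\n") == raw)

    # Decoding before converting gives text that ordinary tools can read.
    print(reg_to_unix_text(raw))
```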

 IMO, stay away from all these safety thingies, or at _LEAST_ allow them to
be bypassed, e.g. with --force. I will be using that switch all the time.

 There are a lot of these foolhardy "traps" one can fall into, e.g.:
$ cd /;rm -rf *
Are you gonna find a "safety" hatch for that too?


 Noo... Please, remove all of these safety checks.
There has to be some presumption of user sanity, or else the tools will
soon be crippled to the point where they are unusable for normal work.

 Make Backups, Not War!  -> MBNW!  ;-P


/Hannu E K Nevalainen, B.Sc. EE - 59+16.37'N, 17+12.60'E

** on a mailing list; please keep replies on that particular list **

-- printf("LocalTime: UTC+%02d\n",(DST)? 2:1); --
--END OF MESSAGE--



