This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Extra spaces in text files in cygwin


On 2008-06-10, gmarsha11 wrote:
> Ok, I have saved the file with Windows notepad as ANSI, Unicode, Unicode big
> endian, and UTF-8.
> 
> Both Unicode options give me the output with the extra spaces.  ANSI and
> UTF-8 allow me to see the files as I would expect to see them.
> 
> Does this mean it's necessary to change the encoding for any files I might
> need to cat, grep, awk, etc.?

I'm no expert on any of this, but as far as I know, all traditional 
Unix tools that deal with strings consider a string to be a sequence 
of 8-bit characters.  Notepad's two "Unicode" options write UTF-16, 
which stores each of your characters as two bytes, so those tools 
won't see the text the way you expect.  So the simple answer is yes.  
The more complete answer is that it depends on what you're using 
those files for and what other programs need to read and/or write 
those files.
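
To make the byte-level difference concrete, here is a minimal Python 
sketch (not something from the original exchange).  It encodes the 
test string both ways and converts Notepad-style UTF-16 data back to 
UTF-8.  The helper name utf16_to_utf8 is made up for illustration, 
and it assumes the data starts with a BOM, as Notepad's "Unicode" 
files do.

```python
import codecs

text = "This is abc file"

# One byte per character for plain ASCII text in UTF-8.
utf8_bytes = text.encode("utf-8")
# Two bytes per character in UTF-16; for ASCII text, every other byte is NUL.
utf16_bytes = text.encode("utf-16-le")

print(len(utf8_bytes))    # number of bytes in the UTF-8 form
print(len(utf16_bytes))   # number of bytes in the UTF-16 form
print(utf16_bytes[:8])    # the first four characters, with NUL bytes between them


def utf16_to_utf8(data: bytes) -> bytes:
    """Decode UTF-16 data that starts with a BOM and re-encode it as UTF-8.

    Hypothetical helper: the "utf-16" codec uses the BOM to pick the byte
    order and drops it from the decoded text.
    """
    return data.decode("utf-16").encode("utf-8")


# Simulate what Notepad's "Unicode" option writes: a BOM, then UTF-16 LE.
notepad_style = codecs.BOM_UTF16_LE + utf16_bytes
print(utf16_to_utf8(notepad_style))
```

A byte-oriented tool such as grep sees those NUL bytes as part of the 
data, which is one plausible reason a pattern that matches the UTF-8 
file fails on the UTF-16 one.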

FWIW, I used Notepad on my Windows XP system to create a file 
containing your string, "This is abc file".  When I went to save it, 
the Encoding was already set to ANSI.  In other words, you shouldn't 
have to do anything special to save your files in a format already 
compatible with grep, etc.

That being said, you really shouldn't use Notepad to edit any files 
you expect to use with Cygwin, because Cygwin tools expect lines to 
end with LF, not a CR-LF pair.  Many tools will consider that CR to 
be part of the line.  In particular, bash will give odd results if 
you ask it to execute a shell script written with Notepad.
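
As a rough illustration of the line-ending issue, the Python sketch 
below rewrites CR-LF pairs as bare LF.  The function name crlf_to_lf 
and the sample script are assumptions made up for this example, and 
the replacement is done on raw bytes, so it only makes sense for 
8-bit encodings such as ANSI or UTF-8, not for UTF-16.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CR-LF pair with a single LF (hypothetical helper)."""
    return data.replace(b"\r\n", b"\n")


# A tiny shell script as Notepad would save it, with CR-LF line endings.
notepad_script = b"#!/bin/bash\r\necho hello\r\n"

# With CR-LF, each line carries a trailing CR that a tool may treat as text.
lines_before = notepad_script.split(b"\n")
print(lines_before[1])  # the echo line, still ending in a CR

unix_script = crlf_to_lf(notepad_script)
lines_after = unix_script.split(b"\n")
print(lines_after[1])   # the echo line without the CR
```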

I got different results than you did when I cat'd abc.txt.  When I 
saved it as Unicode, the output of cat was:

   ÿþThis is abc file

When I saved it as Unicode Big Endian, the output of cat was:

   þÿThis is abc file

The only visible difference between the two was the order of the two 
bytes in the BOM (Byte Order Mark) at the beginning of each file.  In 
both cases, there were no extra spaces.  I was running bash in an rxvt 
window, if that matters.
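
The two odd characters at the start of each line of output are the 
BOM bytes themselves.  The Python sketch below shows what they are 
and how a file's leading bytes could be checked for one.  It assumes 
the terminal renders each byte as a Latin-1 character; detect_bom and 
as_latin1 are hypothetical names used only for this illustration.

```python
import codecs

# BOMs to check, paired with the encoding each one signals.
# (UTF-32 LE also begins with FF FE; it is ignored in this sketch.)
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def as_latin1(data: bytes) -> str:
    """Show bytes the way a Latin-1 terminal would: one character per byte."""
    return data.decode("latin-1")


print(as_latin1(codecs.BOM_UTF16_LE))  # bytes FF FE, Notepad's "Unicode"
print(as_latin1(codecs.BOM_UTF16_BE))  # bytes FE FF, "Unicode big endian"
print(detect_bom(codecs.BOM_UTF16_LE + "abc".encode("utf-16-le")))
print(detect_bom(b"This is abc file"))  # plain ANSI text has no BOM
```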

Regards,
Gary


--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

