This is the mail archive of the cygwin@cygwin.com mailing list for the Cygwin project.



Re: 1.3.10 memcmp() bug


On Tuesday 23 April 2002 23:41, Sami Korhonen wrote:
> On Tue, 23 Apr 2002, Tim Prince wrote:
> > On Tuesday 23 April 2002 22:04, Sami Korhonen wrote:
> > >  I wasn't sure whether I should post about this on the gcc bug report
> > > list or here. Anyway, it seems that using the -O2 flag with gcc causes a
> > > huge slowdown in memcmp(). However, I don't see a performance drop under
> > > linux, so I suppose it is a cygwin issue.
> > >
> > > $ gcc memtest.c -O2 -o memtest ; ./memtest.exe
> > > Amount of memory to scan (mbytes)? 100
> > > Memory block size (default 1024)? 1024
> > > Allocating memory
> > > Testing memory - read (1 byte at time)
> > > Complete: 889.73MB/sec
> > > Testing memory - read (4 bytes at time)
> > > Complete: 3313.07MB/sec
> > > Freeing memory
> > >
> > > $ gcc memtest.c -o memtest ; ./memtest.exe
> > > Amount of memory to scan (mbytes)? 100
> > > Memory block size (default 1024)? 1024
> > > Allocating memory
> > > Testing memory - read (1 byte at time)
> > > Complete: 2517.94MB/sec
> > > Testing memory - read (4 bytes at time)
> > > Complete: 2933.50MB/sec
> > > Freeing memory
> > >
> > >
> > > '1 byte at time' is using memcmp() to compare two blocks.
> >
> > You leave so many relevant considerations unspecified, that anything I
> > say must be a stab in the dark.  I assume you have a standard cygwin
> > installation, where binutils is built to honor only 4-byte alignments,
> > while recent linux configurations provide for 16-byte alignments.  The
> > significance of that is different on various CPU families, with code
> > > alignment being quite important on certain CPUs, and data alignment on
> > others.  Do we assume that you are running on a 486, since you have not
> > told gcc otherwise?  You may have fallen accidentally into good alignment
> > in one case and bad in the other.  You might or might not be using
> > similar versions of gcc in cygwin and linux.  If you would provide a test
> > case, and mention some hardware parameters, some of the mystery could be
> > eliminated; for example, we could find out whether memcmp() is code
> > generated by gcc or from a library.  cygwin is not generally considered
> > an important target for performance optimization, as you can see from the
> > alignment considerations and the differences in the libraries.
> > --
> > Tim Prince
>
>  Sorry that I wasn't specific enough about my system configuration. I'm
> running a standard installation of cygwin on x86 (P4) and WinXP. Both
> tests were run under the same setup; the only difference was the use of
> the -O2 flag. I find it odd that the performance difference is that huge.
> Source is available at:
> http://kotisivu.raketti.net/darkone/memtest/memtest.c

AFAICT there's no reason this should behave differently on linux or cygwin.
You're comparing the speed of memcmp() against the speed of comparing ints in
a loop.  When you don't ask the compiler to in-line memcmp(), you get a
library function which is written with enough smarts to compare 4 bytes at a
time.  Various versions of gcc interpret the request for "optimized" in-line
code as a rep cmpsb, which is slower than the newlib memcmp() function, even
on my P-III.
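
To make the difference concrete, here is a minimal sketch of the two
strategies.  The function names are hypothetical, and this is not newlib's
actual memcmp() source; it only tests for equality, since ordering from
word loads would depend on byte order.  The byte loop does roughly the work
of an in-line rep cmpsb, while the word loop handles 4 bytes per step and
finishes any tail bytewise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time equality test: one byte compared per step. */
int blocks_equal_bytes(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a, *pb = b;
    size_t i;

    assert(pa != NULL && pb != NULL);
    for (i = 0; i < n; i++)
        if (pa[i] != pb[i])
            return 0;
    return 1;
}

/* Word-at-a-time equality test (illustrative helper, an assumption of
   this sketch): 4 bytes compared per step, then the leftover tail. */
int blocks_equal_words(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a, *pb = b;
    size_t i = 0;

    assert(pa != NULL && pb != NULL);
    for (; i + 4 <= n; i += 4) {
        uint32_t wa, wb;
        /* memcpy keeps the loads valid for unaligned buffers */
        memcpy(&wa, pa + i, sizeof wa);
        memcpy(&wb, pb + i, sizeof wb);
        if (wa != wb)
            return 0;
    }
    for (; i < n; i++)          /* remaining 0..3 bytes */
        if (pa[i] != pb[i])
            return 0;
    return 1;
}
```

Both functions return 1 for identical blocks and 0 otherwise, so either can
stand in for a memcmp(a, b, n) == 0 check when timing the two approaches.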

P4s, particularly early versions, are notorious for various performance
glitches when using rep cmpsb on long strings.  gcc isn't smart enough to
look at the lengths of your strings and second-guess your request for
in-line code, nor does it have a crystal ball to second-guess your request
to generate 486 code, even if you were running a version with P4
optimizations.

In time-critical applications, it can be quite important to learn the
particular tricks of your compiler and when to choose a separately compiled
string function, or when to ask for in-line, as well as to acquire a library
of such functions built for the processor of your choice.  On the P4, you
would have available 64-bit integer comparisons if you chose to use them to
speed this up.
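
As a rough sketch of both options, assuming a C99 compiler: the first
function calls memcmp() through a volatile function pointer, which is one
portable way to keep the compiler from expanding it in-line (gcc's
-fno-builtin-memcmp option is another route).  The second is a hypothetical
equality test using 64-bit loads.  The names library_memcmp and
blocks_equal_u64 are made up for illustration, and whether the 64-bit
version actually wins depends on the compiler and CPU, so measure it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Force a call to the separately compiled library memcmp(): the
   volatile pointer must be loaded at run time, so the compiler cannot
   substitute its own in-line expansion. */
int library_memcmp(const void *a, const void *b, size_t n)
{
    int (*volatile cmp)(const void *, const void *, size_t) = memcmp;

    assert(a != NULL && b != NULL);
    return cmp(a, b, n);
}

/* Equality test using 64-bit integer comparisons, 8 bytes per step
   (illustrative helper, an assumption of this sketch). */
int blocks_equal_u64(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a, *pb = b;
    size_t i = 0;

    assert(pa != NULL && pb != NULL);
    for (; i + 8 <= n; i += 8) {
        uint64_t wa, wb;
        /* memcpy keeps the loads valid for unaligned buffers */
        memcpy(&wa, pa + i, sizeof wa);
        memcpy(&wb, pb + i, sizeof wb);
        if (wa != wb)
            return 0;
    }
    for (; i < n; i++)          /* remaining 0..7 bytes */
        if (pa[i] != pb[i])
            return 0;
    return 1;
}
```

library_memcmp() keeps the usual memcmp() return convention (negative,
zero, or positive), while blocks_equal_u64() returns 1 for identical blocks
and 0 otherwise.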
-- 
Tim Prince

--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Bug reporting:         http://cygwin.com/bugs.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/

