This is the mail archive of the cygwin-apps mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [patch/rebase] Add a rebase database to keep track of DLL addresses


On 7/23/2011 6:11 AM, Corinna Vinschen wrote:
> There's definitely a bug in the Mingw code, though.  I reviewed the
> patch, comments inline.

:-(

>> I don't know why, but when I added O_BINARY everything was copacetic.
> 
> LF->CRLF conversion?

Well, I guess -- but there aren't (or shouldn't be) any LFs *or* CRs in
the filenames of DLLs.  That's what has me stumped.

   read(fd, array[i].name, array[i].name_size)

returned success (or, at least, a non-negative value), but nothing was
stored in array[i].name.  That's one reason I switched to calloc for the
name strings; otherwise I was getting random junk (and possible segvs)
when printing the names.

>> When I say quick-n-dirty, I mean: lots of duplicated and only slightly
>> modified code from rebase.c.  There's room for code consolidation, so
>> this bit could be put off until later.
> 
> Right.  Especially all the db file information should go into a new
> header file.

Ack -- but that's for another patch. the Q is, add quick-n-dirty now,
then consolidate and reorganize later -- or reorganize first, and then
(or simultaneous with) adding the new developer's utility.  Either way,
doesn't matter to me.

>> +override CXX_LDFLAGS+=@EXTRA_CXX_LDFLAG_OVERRIDES@
> 
> Why is CXX_LDFLAGS necessary?  I see what you do but I can't imagine the
> msys compiler doesn't know -static-libstdc++.

That's precisely why. msys compiler is 3.4.x which supports only
-static-libgcc.

> I'll ignore the rebase-dump stuff for now.

Ack.

>> +  mingw|msys )
>> +    # Also exclude the commands we're using below, which is not ideal
>> +    /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
>> +      /bin/sed -e '/\/bin\/ps$/d'   -e '/\/bin\/ps\.exe$/d'  \
>> +               -e '/\/bin\/sed$/d'  -e '/\/bin\/sed\.exe$/d' \
>> +	       -e '/\/bin\/awk$/d'  -e '/\/bin\/awk\.exe$/d' \
>> +	       -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
>> +	       -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
>> +               -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
> 
> What's the reason to do that?  I don't see how that should be necessary.
> All the tools are used before peflags is called, so there's no problem
> to change them as well, just as on Cygwin.

At one point, rebaseall had similar code, on cygwin, but then was later
changed to
	
    grep -E -q -i -v '/d?ash(.exe)?$' /proc/[0-9]*/exename

This "new" procedure omitted all the awk/sort/uniq stuff was because it
wasn't USING those tools anymore.  Then, the "new" code was copied into
peflagsall.

Since, on mingw|msys, I needed the "old" code in rebaseall (msys doesn't
have proc/*/exename), I simply duplicated the "old" code in peflagsall, too.

However, I think this big sed expression *IS* needed, in both cases.
Here's why:

The point here is to cause the script (rebaseall OR peflagsall) to give
up if ANY cygwin/msys process is active other than [d]ash.  The reason
is, in each case, we're going to try to write to every DLL (or every
EXE) in the installation tree (except [d]ash, and
cygwin-1.dll/msys-1.0.dll, and a few others). Since we can't write to
in-use objects, we have to make sure that there AREN'T any in-use
objects -- e.g. no other cygwin/msys processes *except* those that we
"know" about.

Since the procedure, on msys, to detect processes requires at minimum
ps.exe, awk, and dash, we have to 'sed-out' those three applications --
and sed.exe itself -- to avoid 'false positives'.  (uniq and sort are
cosmetic, but if we use 'em, then we need to sed them out too).

>> Index: rebase.c
>> ===================================================================
>> RCS file: /cvs/cygwin-apps/rebase/rebase.c,v
>> retrieving revision 1.6
>> diff -u -p -r1.6 rebase.c
>> --- rebase.c	21 Jul 2011 19:10:04 -0000	1.6
>> +++ rebase.c	22 Jul 2011 23:59:09 -0000
>> @@ -50,6 +56,10 @@ FILE *file_list_fopen (const char *file_
>>  char *file_list_fgets (char *buf, int size, FILE *file);
>>  int file_list_fclose (FILE *file);
>>  void version ();
>> +#if defined(__MSYS__)
>> +/* MSYS has no strtoull */
>> +unsigned long long strtoull(const char *, char **, int);
>> +#endif
> 
> Does it have strtoll?  It should have since the function is available
> in Cygwin since October 2001, which means it was available in Cygwin
> 1.3.4 already.  Msys has been forked after that, afaics.
> 
> So, if we have strtoll, you could simply use that and cast the result to
> uint64_t, rather than to paste some external strtoull implementation.

No, it doesn't have strtoll:

objdump -p /c/MinGW/msys/1.0/bin/msys-1.0.dll | grep strtoll

Anyway, here's the fork date for msys:

Wed Sep 12 01:03:36 2001  Christopher Faylor <...>

The strto[u]ll stuff was exported from cygwin two weeks later:
Mon Oct 1 14:25:00 2001  Charles Wilson  <...>
having been added to newlib the same day:
2001-10-01  Charles Wilson  <...>

I'll look into adding that patch to msys-$next, but given how long it
takes for new msys to propagate, we'll still need a workaround for some
time.

>> -#ifdef __CYGWIN__
>> +#if defined(__MSYS__)
>> +  if (machine == IMAGE_FILE_MACHINE_I386)
>> +    {
>> +      GetImageInfos64 ("/bin/msys-1.0.dll", NULL,
>> +	               &cygwin_dll_image_base, &cygwin_dll_image_size);
>> +      /* See cygwin code, below */
>> +      cygwin_dll_image_base -= 3 * ALLOCATION_SLOT;
>> +      cygwin_dll_image_size += 3 * ALLOCATION_SLOT + 8 * ALLOCATION_SLOT;
> 
> This is not correct for msys.  Msys has been forked off from Cygwin 1.3.
> Back in 1.3 days, the shared memory areas were not allocated in front of
> the DLL.  Rather, the default address for the shared memory areas was
> 0xa000000, bottom up.  I also doubt that it's really necessary to add the
> slop factor to the end.

OK, that's easy enough to correct.

>> @@ -308,7 +331,8 @@ save_image_info ()
>>    hdr.offset = offset;
>>    hdr.down_flag = down_flag;
>>    hdr.count = img_info_size;
>> -  if (write (fd, &hdr, sizeof hdr) < 0)
>> +  errno = 0;
>> +  if (write (fd, &hdr, sizeof (hdr)) < 0)
> 
> Why are you setting errno to 0, if it's only printed if write fails,
> which sets errno anyway?

Leftover from debugging. I was getting a failure and was trying to
figure out why (there was other debugging code mixed in there, which got
deleted).  I'll take it out.

>> @@ -480,7 +505,7 @@ load_image_info ()
>>        for (i = 0; i < img_info_size; ++i)
>>  	{
>>  	  img_info_list[i].name = (char *)
>> -				  malloc (img_info_list[i].name_size);
>> +				  calloc (img_info_list[i].name_size, sizeof(char));
> 
> There's no reason to use calloc, it's overwritten by the subsequent
> read() anyway.  This is differnt from the calloc for img_info_list,
> which will only get partially filled by the read call.

No, that's not what happened on mingw.

I was getting a "successful" return (e.g. non-negative return value)
from read(), but *nothing* was written into the .name buffer. So, later
on, when I tried to print the value, I just got random garbage until a
'\0' happened to be found.  Sometimes this was just a few chars, but
other times it was beyond the .name_size.  If this happened at the last
entry in the array, then it was possible to get a segv.

> What about defining the name of the DLL like this:
> 
>   #if defined (__MSYS__)
>   #define CYGWIN_DLL "/usr/bin/msys-1.0.dll"
>   #elif defined (__CYGWIN__)
>   #define CYGWIN_DLL "/usr/bin/cygwin1.dll"
>   #endif
> 
> and just use it subsequently, rather than adding more #if's?

OK.

>> +#else
>> +  {
>> +    /* Borrow cygwin code for extracting module path, but use ANSI */
>> +    char exepath[PATH_MAX];
> 
> PATH_MAX?  How big is that in native Win32?  If it's equivalent to
> MAX_PATH, you don't have to worry about long path prefixes.

limits.h:
#define PATH_MAX        259

stdlib.h:
#define       MAX_PATH        (260)

>> +    char* p = NULL;
>> +    char* p2 = NULL;
>> +    size_t sz = 0;
>> +
>> +    if (!GetModuleFileNameA (NULL, exepath, PATH_MAX))
>> +      fprintf (stderr, "%s: can't determine rebase installation path\n",
>> +	       progname);
>> +    p = exepath;
>> +    if (strncmp (p, "\\\\?\\", 4))   /* No long path prefix. */
>> +      {
>> +        if (!strncasecmp (p, "\\\\", 2))     /* UNC */
>> +          {
>> +            p = strcpy (p, "\\??\\UN");
> 
> Hang on.  You're adding the native NT path prefix for DOS devices?
> What's that supposed to accomplish, given that the subsequent code
> uses msvcrt functions, which work with Win32 paths?
> 
>> +            GetModuleFileNameA (NULL, p, PATH_MAX - 6);
>> +            *p = 'C';
> 
> That...
> 
> 
>> +          }
>> +        else
>> +          {
>> +            p = strcpy (p, "\\??\\");
> 
> Same NT path prefix weirdness.
> 
>> +            GetModuleFileNameA (NULL, p, PATH_MAX - 4);
> 
> ...and that won't work.  You replaced calls to wcpcpy with calls to
> strcpy.  Either use stpcpy, or add strlen(p) to p.
> 
> Fortunately the strcpy's are wrong, so the path is just what
> GetModuleFileNameA returned, a plain Win32 path, which is what you need.
> So, in fact you should just use the path returned by GetModuleFileNameA
> and...
> 
>> +    /* strip off exename and trailing slash */
> 
> ..yes, exactly.

OK.  All that stuff with NT native paths is so much black magic to me,
which is why I copied it with minimal changes (only what I *thought*
were necessary to switch from wchar to char).  Shoulda looked up the
defn for wcpcpy...

>> +#if defined(__MSYS__)
>> +/* implementation adapted from google andriod bionic libc's
> 
> s/andriod/android
> 
> However, as mentioned above, I'd remove this code and just create a
> small wrapper around the strtoll function.

....except we don't yet have that on msys, either.

>> +ProcessResult=0
>> +case $Platform in
>> +  mingw|msys )
>> +    # Also exclude the commands we're using below, which is not ideal
>> +    /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
>> +      /bin/sed -e '/\/bin\/ps$/d'   -e '/\/bin\/ps\.exe$/d'  \
>> +               -e '/\/bin\/sed$/d'  -e '/\/bin\/sed\.exe$/d' \
>> +               -e '/\/bin\/awk$/d'  -e '/\/bin\/awk\.exe$/d' \
>> +               -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
>> +               -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
>> +               -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
>> +               sort | uniq | grep -E '.'
> 
> Same as in peflagsall.in, I don't see why this should be necessary.

As explained above, it /is/ necessary -- we want to bail if we detect
ANY active cygwin/msys processes, but avoid false positives due to the
commands used to DETECT those processes.

the 'grep /proc/[0-9]*/exename' approach avoids these complications
because it only uses ONE command (grep), and "grep" doesn't yet show up
in the /proc/ area when it is invoked.  (Is this a race condition we are
exploiting?)

$ grep -E -i -v '/d?ash(.exe)?$' /proc/[0-9]*/exename
/proc/1044/exename:/usr/bin/cygrunsrv
/proc/224/exename:/usr/bin/ssh-agent
/proc/2804/exename:/usr/bin/cygrunsrv
/proc/2840/exename:/usr/sbin/sshd
/proc/2848/exename:/usr/bin/cygrunsrv
/proc/2876/exename:/usr/sbin/syslogd
/proc/484/exename:/usr/sbin/cygserver
/proc/5212/exename:/usr/bin/mintty
/proc/5452/exename:/usr/bin/bash

See? no 'grep'

Therefore, we don't need to 'sed' out grep itself. So the (new-ish)
cygwin method is much simpler than the (old) cygwin/current msys method.
 But the ugliness /is/ necessary.

Thanks for the review.

--
Chuck


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]