This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: fun? with libsigsegv


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Corinna Vinschen on 7/18/2009 3:45 AM:
>> AFAICT, the libsigsegv handler is overriding the "myfault" stuff in
>> Cygwin and is causing a fault for something that should be ignored.  The
>> "fault" in this case is coming from our old friend
>> verifyable_object_isvalid.  We expect faults there and they are supposed
>> to be ignored.
>>
>> The regression from 1.5 is because this didn't cause a fault in 1.5 so
>> it looks like something changed in newlib.

I'm still hoping to find something and provide a newlib patch for that.

>>
>> Looking at libsigsegv's code, I don't think it is being smart enough
>> about Cygwin.

Indeed; at any rate, if I can find a way to patch libsigsegv, I'm sure it
will be accepted upstream (I've made other patches for this library for
bugs it had on other platforms).

>>
>> It isn't clear to me why sigaltstack is needed here since I don't see
>> anything special happening with the stack.  It looks like you could
>> modify libsigsegv to just avoid trying to do anything with the signal
>> stack and you could use Cygwin's signal handlers directly.

The libsigsegv library provides two distinct features - robust stack
overflow detection (allowing last-ditch recovery to give a nicer error
message to the user when they have caused infinite recursion) and
user-space memory mapping (catching SIGSEGV to decide when to map in pages
on demand, but if this is all you are doing, you can catch the SIGSEGV on
the primary stack).  Of the two features, the robust stack overflow is
more commonly used in clients of the library.

Per POSIX (and as implemented by Solaris), stack overflow can be reliably
detected by registering with sigaltstack and inspecting siginfo_t which
should describe the primary stack that just overflowed (since if you
overflowed, by definition you can't run any last-ditch cleanup using the
primary stack that died).  But since so few platforms out there provide
this (even Linux gets it wrong, because its siginfo_t contains information
about the alternate stack instead of the primary stack), the bulk of the
libsigsegv library is figuring out other means of deciphering what address
caused (or will cause) the SIGSEGV fault and what memory regions are
currently mapped to the process.  On Linux, the library still gets by with
a sigaltstack handler and then probing the process's current mapped pages,
to see which chunk of mapped memory the fault is closest to (if it is
close to the primary stack, then declare stack overflow; otherwise,
reraise the SIGSEGV to be handled normally).
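
To make the Linux-style approach above concrete, here is a minimal sketch in C. It is not libsigsegv's code: the names classify_fault, install_altstack_segv_handler and on_segv, the 64 KiB alternate stack, and the "slack" distance are all assumptions made for illustration. It registers a SIGSEGV handler that runs on an alternate stack, and shows the kind of proximity test that decides whether a fault address sits just below the primary stack.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for sigaltstack, SA_ONSTACK, siginfo_t */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical verdicts; these are not libsigsegv's types. */
enum fault_kind { FAULT_STACK_OVERFLOW, FAULT_OTHER };

/* Assumes a downward-growing primary stack mapped at [stack_lo, stack_hi).
   A fault within `slack` bytes below stack_lo is treated as an overflow;
   anything else is left to normal SIGSEGV handling. */
static enum fault_kind
classify_fault(uintptr_t fault, uintptr_t stack_lo, uintptr_t stack_hi,
               size_t slack)
{
    uintptr_t floor = stack_lo > slack ? stack_lo - slack : 0;

    (void)stack_hi;  /* kept to document the region; unused by this test */
    if (fault >= floor && fault < stack_lo)
        return FAULT_STACK_OVERFLOW;
    return FAULT_OTHER;
}

/* Runs on the alternate stack, so it still works after the primary stack
   is exhausted. A real handler would look up the stack bounds and pass
   info->si_addr to classify_fault(); this one only reports and exits. */
static void
on_segv(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "SIGSEGV caught on alternate stack\n";
    ssize_t n = write(2, msg, sizeof msg - 1);  /* async-signal-safe */

    (void)n; (void)sig; (void)info; (void)ctx;
    _exit(1);
}

/* Returns 0 on success, -1 on failure. */
static int
install_altstack_segv_handler(void)
{
    static char altstack[64 * 1024];  /* assumed size, not a tuned value */
    stack_t ss;
    struct sigaction sa;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = altstack;
    ss.ss_size = sizeof altstack;
    if (sigaltstack(&ss, NULL) != 0)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO | SA_ONSTACK;  /* deliver on the alternate stack */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A program would call install_altstack_segv_handler() once at startup. The sketch leaves out the hard part, which is finding out the bounds of the primary stack on each platform.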

But on cygwin, since there is no way to run user code in either an
alternate stack or when the guard page exception is first encountered,
libsigsegv chose instead to inject itself prior to the cygwin SEH handler,
before SIGSEGV would be raised.  Obviously both cygwin and libsigsegv's
SEH handler are able to do their work within the guard page, as opposed to
an alternate stack.  This isn't quite as robust as sigaltstack (it is
capped to a single page, and is not as exposed to user code).  And
libsigsegv is set up to act as a filter - if it does not recognize the
fault, it can pass it on to the cygwin handler.  The problem is that when
it is installed as a stack overflow handler, it is usually configured to
recognize all faults, so it is preemptively assuming that a
non-stack-overflow fault would have been raised as a SIGSEGV anyway, and
tries to skip a step.

So I see several possibilities:

We could teach libsigsegv how to recognize if cygwin is in an efault block
(so that libsigsegv does nothing with the address but forwards onto the
cygwin handler, as if libsigsegv had not been installed).  Any ideas on
how to recognize whether cygwin is in an efault block and is going to
ignore a fault?

We could improve cygwin's SEH handler to add a hook function (defaulting
to NULL, but with an API that libsigsegv can use to add a callback) then
teach libsigsegv to install itself via that hook rather than as a
full-blown SEH handler (thus cygwin, rather than libsigsegv, gets first
shot at the fault).

Or we could try improving the libsigsegv SEH handler to ONLY react to
stack overflow, and handle all other faults via SIGSEGV, rather than its
current approach of preemptively assuming all non-overflow faults will
become SIGSEGV.  After all, SIGSEGV can be handled without an alternate
stack for all situations except for stack overflow.
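
As a rough illustration of the hook idea (the second possibility) combined with a callback that only claims stack overflow (the spirit of the third), here is a sketch in C. Every name in it is hypothetical: set_fault_hook, dispatch_fault, struct fault_info and its fields are not part of any existing cygwin or libsigsegv API, and the in_efault_block flag simply stands in for whatever cygwin uses internally to know that a fault is expected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome of handling one fault. */
enum fault_result {
    FAULT_IGNORED,          /* expected fault, swallowed by the runtime */
    FAULT_CLAIMED_BY_HOOK,  /* the registered callback took it */
    FAULT_RAISE_SIGSEGV     /* nobody claimed it; raise SIGSEGV as usual */
};

/* Hypothetical description of a fault, as the runtime might see it. */
struct fault_info {
    uintptr_t addr;
    int in_efault_block;    /* nonzero while faults are expected and ignored */
    int is_stack_overflow;  /* nonzero if the guard page was hit */
};

/* Callback type: return nonzero to claim the fault. */
typedef int (*fault_hook_fn)(const struct fault_info *);

static fault_hook_fn fault_hook = NULL;  /* defaults to no hook */

/* Register a callback; returns the previous one so it can be restored. */
static fault_hook_fn
set_fault_hook(fault_hook_fn fn)
{
    fault_hook_fn old = fault_hook;
    fault_hook = fn;
    return old;
}

/* The runtime gets first shot: expected faults never reach the hook. */
static enum fault_result
dispatch_fault(const struct fault_info *fi)
{
    if (fi->in_efault_block)
        return FAULT_IGNORED;
    if (fault_hook != NULL && fault_hook(fi))
        return FAULT_CLAIMED_BY_HOOK;
    return FAULT_RAISE_SIGSEGV;
}

/* A callback that reacts ONLY to stack overflow and declines the rest. */
static int
overflow_only_hook(const struct fault_info *fi)
{
    return fi->is_stack_overflow;
}
```

With this ordering, a fault inside an efault block is ignored exactly as if no hook were installed, and anything the callback declines still turns into an ordinary SIGSEGV.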

> As Eric mentioned already, the next gawk version comes with libsigsegv
> as part of the package.  IIUC, it's not behaving correctly, so I guess
> I should rather build the next gawk version with --disable-libsigsegv,
> right?  I just don't know what other effect this has on gawk.  I'll
> ask the maintainer.

The difference is reliable stack overflow detection.  Any program that
takes arbitrary user input and can cause stack recursion has to make a
choice - either artificially limit how much recursion the user can
perform, or install a stack overflow handler to let the user push their
system to the limit but warn them nicely when they exceed it.  In m4's
case, here's a sample difference:

on cygwin 1.5 (m4 1.4.10b), with no overflow detection:

$ echo 'define(a,a(a))a' | m4
m4:stdin:1: recursion limit of 1024 exceeded, use -L<N> to change it
$ echo 'define(a,a(a))a' | m4 -L 5000
$ echo $?
0

The recursion limit is artificially low (at one point, older versions of
libtool had macros that wanted to recurse as deep as 1500 times if you had
a couple hundred .o files to link, although this has since been fixed in
libtool to use a more efficient algorithm).  But requesting to remove the
limit led to a silent status of 0 - the worst possible case (the stack
overflow exceeded the OS guard page, and so there is NO indication that
the user's program failed to complete)!

But repeated on cygwin 1.7, with m4 1.4.13:

$ echo 'define(a,a(a))a' | m4
m4: stack overflow
$ echo $?
1
$ echo 'changequote([,])define(b,0)dnl
define(a,[define([b],incr(b))errprint(b
)a(a)])a' | m4 -L 1024 2>&1 | tail -n2
1022
m4: stack overflow
$ echo 'changequote([,])define(b,0)dnl
define(a,[define([b],incr(b))errprint(b
)a(a)])a' | m4 2>&1 | tail -n2
11750
m4: stack overflow

where you can be assured that m4 chugged on for as long as it had the room.
(Note that m4 1.4.13 also added some improvements to use less stack in the
first place, getting past 11700 rather than dying before 5000 iterations).
 On Linux, where the stack dynamically grows rather than being fixed in
length at compile time, this example can chug on for quite some time if
you don't use ulimit first, and with older kernels, it can even cause the
OOM manager to kill the wrong arbitrary process, so be careful when
playing with it.

So, disabling libsigsegv in gawk is not a show-stopper (as most code is
well-written and doesn't recurse that deep); rather it is just a QoI
question of detecting bad input gracefully without imposing an arbitrary
limit.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkphryMACgkQ84KuGfSFAYBHEgCguDADHpgEWqiyK1eukU/cTCwk
cCsAn019+Qpi7XvVd54/dEQ+Emaf09jf
=FHEn
-----END PGP SIGNATURE-----

