This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: How to make child of failed fork exit cleanly?


On 03/05/2011 11:46 AM, Ryan Johnson wrote:
2. When the child does exit, how to prevent finalizers from running for dlls which did not load properly?
Context for the second question: exiting the child tends to trigger access violations, often in a pthread_mutex destructor call (la-la land). Some of these can be avoided by disabling stack dumping from api_fatal (see separate email about alloca and stack walking), but the others continue to mystify.


Overall, AFAICT, the cygwin dll design assumes that all dlls have loaded properly, and a failed fork breaks that invariant. I worry that some "properly-loaded" dll accesses state of a "not-properly-loaded" dependency.
The plot thickens... single-stepping through dll finalization, the crash occurs because of a call to __gcc_deregister_frame, which is inserted automatically by gcc (to deal with C++ exception handling unwind info?). Single-stepping into the call is a descent into chaos, with the end result that the process exits from a kernel32.dll call with an error code that suggests an access violation occurred (0x000005a).

The cygwin dll in question is statically-linked, loaded at the desired address, and depends only on cygwin1.dll, cyggcc_s-1.dll, and cygstdc++-6.dll (all of which are still loaded, their finalizers did not run yet). It had just executed its own global destructors. No global initializers had run, because in_forkee was set.
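The situation described here (finalizers running for a dll whose initializers were skipped) can be modelled with a toy registry. This is only a sketch of the pairing invariant, not libgcc's implementation: FrameRegistry, register_frame, deregister_frame and load_and_unload are hypothetical names invented for illustration, and whether a violated pairing is what actually causes the crash is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for an unwinder's list of registered frame tables.
// It models only the invariant "every deregister is preceded by a register".
class FrameRegistry {
public:
    // Would be called from a module's initializer.
    void register_frame(const void* eh_frame) { frames_.insert(eh_frame); }

    // Would be called from a module's finalizer. Returns false when the
    // table was never registered, i.e. the initializer was skipped.
    bool deregister_frame(const void* eh_frame) {
        return frames_.erase(eh_frame) == 1;
    }

    std::size_t size() const { return frames_.size(); }

private:
    std::set<const void*> frames_;
};

// Models one dll's lifetime. run_init == false corresponds to the forkee
// case where global initializers did not run but finalizers still do.
inline bool load_and_unload(FrameRegistry& reg, const void* eh_frame,
                            bool run_init) {
    if (run_init) {
        reg.register_frame(eh_frame);
    }
    return reg.deregister_frame(eh_frame);
}
```

In this model, load_and_unload(reg, p, true) succeeds, while load_and_unload(reg, p, false) reports an unpaired deregistration. A real unwinder may react to that case far less gracefully than returning false.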

Very strangely, when every child dies (including those automatically respawned by Windows), the parent also segfaults when calling __gcc_deregister_frame on the same dll! If even one child survives (even if many had previously crashed), then no error arises. Even more strangely, if I break into a first child which has a good layout (no previous failures, current fork will succeed) and delay it long enough that the parent times out, the parent still segfaults! What shared state is there that could cause this to happen?

Disabling dll finalization completely when in_forkee==1 gets rid of the above problem, but occasionally I'll get a new error in the child:

CloseHandle(pinfo_shared_handle<0x610031BF>) failed void pinfo::release():1040, Win32 error 6
110356 [main] fork 10556 fork: child -1 - died waiting for longjmp before initialization, retry 0, exit code 0x100, errno 11
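For reference, the workaround mentioned above (skipping dll finalization while in_forkee is set) has roughly the following shape. This is a minimal sketch under stated assumptions, not the actual change to dll_init.cc: Module, ProcessState and this run_dtors are hypothetical stand-ins, and only the name in_forkee comes from the message.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical model of one loaded module and its destructor list.
struct Module {
    std::vector<std::function<void()>> dtors;
    bool dtors_ran = false;
};

// Stand-in for process-wide state; in_forkee mirrors the flag named in
// the message, but this struct itself is invented for illustration.
struct ProcessState {
    bool in_forkee = false;
};

// Runs the module's destructors in reverse order, at most once, and not at
// all in a forkee whose initializers never ran. Returns how many ran.
inline int run_dtors(Module& m, const ProcessState& st) {
    if (st.in_forkee || m.dtors_ran) {
        return 0;
    }
    m.dtors_ran = true;
    int count = 0;
    for (auto it = m.dtors.rbegin(); it != m.dtors.rend(); ++it) {
        (*it)();
        ++count;
    }
    return count;
}
```

The guard only suppresses the module's own finalizers. It does nothing about handles or other shared state that the half-initialized child may still release on exit, which is consistent with a different error showing up afterwards.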


Sometimes, when the child dies as above, the parent will again seg fault while deregistering a dll (but not always).

At this point I'm thoroughly confused. Does anyone have some enlightenment to offer?

Gory details below...
Ryan

Single-instruction stepping yields the following stack trace (sort of -- it doesn't reflect any one stack trace reported by gdb, because the stack kept changing). Stack frames marked with '*' are those which I suspect are due to a jump into la-la land; those marked with '+' correspond to a longjmp call which unwound the stack back to _sigfe an unknown number of times (at least twice).

*0x75a81136 in KERNEL32!GetPrivateProfileStructA () from /cygdrive/c/Windows/syswow64/kernel32.dll
*0x6115e228 in WaitForSingleObject@8 () from /usr/bin/cygwin1.dll
*0x610d63e5 in muto::acquire (this=0x611700c0, ms=4294967295) at /home/Ryan/apps/cygwin-src/winsup/cygwin/sync.cc:91
*0x61077dbf in calloc (nmemb=1, size=44) at /home/Ryan/apps/cygwin-src/winsup/cygwin/malloc_wrapper.cc:106
*0x61003129 in operator new (s=44) at /home/Ryan/apps/cygwin-src/winsup/cygwin/cxx.cc:23
*0x610ecece in pthread_mutex::init (mutex=0x67f0900c, attr=0x0, initializer=0x14) at /home/Ryan/apps/cygwin-src/winsup/cygwin/thread.cc:2746
+0x610c68b5 in __sjfault () from /usr/bin/cygwin1.dll
+0x610eeb63 in pthread_mutex_lock (mutex=0x67f0900c) at /home/Ryan/apps/cygwin-src/winsup/cygwin/cygtls.h:279
*0x610c6675 in _sigfe () from /usr/bin/cygwin1.dll
*0x610eeb00 in pthread_spinlock::init () at /home/Ryan/apps/cygwin-src/winsup/cygwin/thread.cc:2869
*0x610c7dc7 in _sigfe_pthread_mutex_lock () from /usr/bin/cygwin1.dll
*0x67f08a40 in cyggcc_s-1!__gthread_mutex_unlock () from /usr/bin/cyggcc_s-1.dll
0x67f054ad in cyggcc_s-1!__deregister_frame_info_bases () from /usr/bin/cyggcc_s-1.dll
0x660010d9 in __gcc_deregister_frame () from /cygdrive/c/cygwin/home/Ryan/experiments/fork-tests/cygbar.dll
0x61021d1e in per_module::run_dtors (this=0x61251050) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.cc:89
0x61161716 in dll::run_dtors (this=0x61251048) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.h:68
0x61022b36 in dll_list::detach (this=0x611e3440, retaddr=0x6600124d) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.cc:343
#3 0x61022bea in cygwin_detach_dll () at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.cc:954
#4 0x610c6665 in _sigfe () from /usr/bin/cygwin1.dll


Very oddly, the parent process segfaults as well, in the same location as the child, when it tries to exit. This only occurs when the child crashes often enough that Windows fails to restart it. If the child crashes once, but the next child succeeds, the parent does not fault:
#0 0x67f054bc in cyggcc_s-1!__deregister_frame_info_bases () from /usr/bin/cyggcc_s-1.dll
#1 0x660010d9 in __gcc_deregister_frame () from /cygdrive/c/cygwin/home/Ryan/experiments/fork-tests/cygbar.dll
#2 0x61021d1e in per_module::run_dtors (this=0x61251050) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.cc:89
#3 0x61161766 in dll::run_dtors (this=0x61251048) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.h:68
#4 0x61021d70 in dll_global_dtors () at /home/Ryan/apps/cygwin-src/winsup/cygwin/dll_init.cc:61
#5 0x611492b7 in __call_exitprocs (code=0, d=0x0) at ../../../.././newlib/libc/stdlib/__call_atexit.c:116
#6 0x6112152a in exit (code=0) at ../../../.././newlib/libc/stdlib/exit.c:61
#7 0x61005fcb in cygwin_exit (n=0) at /home/Ryan/apps/cygwin-src/winsup/cygwin/dcrt0.cc:1111
#8 0x610081c0 in _cygwin_exit_return () at /home/Ryan/apps/cygwin-src/winsup/cygwin/dcrt0.cc:928
#9 0x61005b36 in _cygtls::call2 (this=0x28ce64, func=0x61007a50 <dll_crt0_1(void*)>, arg=0x0, buf=0x28cda4)
at /home/Ryan/apps/cygwin-src/winsup/cygwin/cygtls.cc:69
#10 0x61005bdb in _cygtls::call (func=0x61007a50 <dll_crt0_1(void*)>, arg=0x0) at /home/Ryan/apps/cygwin-src/winsup/cygwin/cygtls.cc:62
#11 0x610079bf in _dll_crt0@0 () at /home/Ryan/apps/cygwin-src/winsup/cygwin/dcrt0.cc:948
#12 0x004013c2 in cygwin_crt0 ()
#13 0x00401015 in mainCRTStartup ()



