This is the mail archive of the cygwin mailing list for the Cygwin project.


Intermittent failures retrieving process exit codes


I've witnessed intermittent failures in multiple build systems, at multiple companies, that use Cygwin bash and make together with non-Cygwin compilers and other tools. In each failure, a process appears to complete successfully, but the process retrieving its exit code receives an unexpected value. I have seen this on many different Cygwin versions over several years.

Several reports of similar-sounding issues can be found online:
- http://cygwin.1069669.n5.nabble.com/Cygwin-1-7-x-on-Windows-7-Exit-statuses-of-Win32-executables-are-sometimes-wrong-td20186.html
- http://stackoverflow.com/questions/9769256/intermittent-failures-under-cygwin-possibly-related-to-candle-and-or-make


I recently was able to produce a very small test case that reproduces this issue reliably on some machines:

$ cat test.sh
#!/bin/sh

while [ 1 ]; do
  echo "test..."
  if cmd /c "false"; then
    echo "exiting..."
    exit 1
  fi
done

Because false always exits with a non-zero status, the 'if' branch should never be taken and test.sh should run indefinitely. On one of my machines, however, it exits within a few iterations:

$ ./test.sh
test...
test...
exiting...

$ ./test.sh
test...
test...
test...
test...
exiting...

$ ./test.sh
test...
exiting...
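
The script above only reveals that some status was read as success. A variant that reports the run number and the raw status might make failures easier to compare across machines. The following is a minimal sketch, not the script that was actually run; the function name run_until_mismatch and its argument order are made up for illustration:

```sh
#!/bin/sh
# run_until_mismatch EXPECTED MAX CMD [ARGS...]
# (hypothetical helper, not part of the original test case)
# Runs CMD up to MAX times and stops at the first run whose exit status,
# as seen by this shell, differs from EXPECTED.
run_until_mismatch() {
  expected=$1
  max=$2
  shift 2
  n=0
  while [ "$n" -lt "$max" ]; do
    n=$((n + 1))
    "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -ne "$expected" ]; then
      echo "run $n: status $status (expected $expected)"
      return 1
    fi
  done
  echo "no mismatch in $max runs"
  return 0
}
```

On an affected machine it could be invoked as 'run_until_mismatch 1 1000 cmd /c "false"'; the limit of 1000 is arbitrary.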

There are several high-level possibilities for what is going wrong:

1) cmd.exe is failing to retrieve the correct exit code for the invocation of false.exe (a Cygwin process).

2) cmd.exe is failing to return the (correct) exit code it received for the invocation of false.exe.

3) bash.exe (a Cygwin process) is failing to retrieve the correct exit code for the invocation of cmd.exe.

It is possible that other software installed on the affected machines is contributing to the problem (see http://cygwin.com/faq/faq.using.html#faq.using.bloda). If so, that software would be a contributing factor in one of the explanations above, but its involvement would not rule out a defect in Cygwin (or in CreateProcess, WaitForSingleObject, or GetExitCodeProcess). I have not yet seen a similar case that does not involve Cygwin, so at present I suspect a defect in Cygwin, though possibly one that produces no negative symptoms in isolation.

I've reproduced this issue with both the 32-bit and 64-bit versions of cmd.exe. I've also reproduced it by replacing cmd.exe with a C program that itself calls CreateProcess for Cygwin's false.exe. The issue reproduces whether that program is compiled with Cygwin gcc, MinGW gcc (32-bit and 64-bit), or MSVC (32-bit and 64-bit). So, substitute what you like for 'cmd.exe' in the above.

Likewise, I've reproduced this issue by replacing false.exe in the test above with a custom myfalse.exe (a C program that just returns 1). The issue reproduces whether myfalse.exe is compiled with Cygwin gcc, MinGW gcc (32-bit and 64-bit), or MSVC (32-bit and 64-bit). So, substitute what you like for 'false.exe' in the above.

I am not able to reproduce the problem if I elide the invocation of false.exe (i.e., if the cmd.exe invocation is 'cmd /c "exit /B 1"' or if my replacement for cmd.exe just returns 1).
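
To compare the substitutions described above (cmd.exe versus a replacement, false.exe versus 'exit /B 1'), one option is to tally the statuses seen over a fixed number of runs of each variant. A sketch, assuming a POSIX sh with sort, uniq and awk available; tally_statuses is a hypothetical helper name:

```sh
#!/bin/sh
# tally_statuses RUNS CMD [ARGS...]
# (hypothetical helper for comparing variants of the test case)
# Runs CMD RUNS times and prints one "STATUS COUNT" line for each
# distinct exit status this shell observed.
tally_statuses() {
  runs=$1
  shift
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1
    echo "$?"
    i=$((i + 1))
  done | sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

For example, 'tally_statuses 1000 cmd /c "false"' could be compared against 'tally_statuses 1000 cmd /c "exit /B 1"'. If the behaviour described in this message holds, only the first would be expected to show any line other than status 1.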

The problem feels like a race condition in retrieving process exit codes. Further, it seems that it may only occur when two related processes exit in quick succession.
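
One way to probe the quick-succession hypothesis would be to put a delay between the inner process exiting and its parent exiting, and see whether the failure rate changes. This is only a sketch of such an experiment and has not been tried against the actual problem; delayed_exit is a hypothetical helper, and it uses a shell rather than cmd.exe as the intermediate process:

```sh
#!/bin/sh
# delayed_exit DELAY CMD [ARGS...]
# (hypothetical helper for the timing experiment)
# Runs CMD, waits DELAY seconds, then returns CMD's exit status, so the
# child and the calling process no longer exit back to back.
delayed_exit() {
  delay=$1
  shift
  "$@"
  status=$?
  sleep "$delay"
  return "$status"
}
```

Saved as, say, delayed_exit.sh (an assumed file name), it could stand in for the cmd.exe step as 'sh -c ". ./delayed_exit.sh; delayed_exit 1 false"'. The child shell then exits with the status of false one second after false itself has exited.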

I've been granted several weeks in the near future to work exclusively on this issue. Before I start, I'd like to hear from other community members who have experienced this and tried to debug it. What is and is not known about the issue? What workarounds have been tried (especially any that were found to be successful)? Are there specific parts of the Cygwin (or bash) code that you recommend starting with?

The machine that I've been running the above script on is 64-bit Windows 7 Professional SP1 running under VMware Workstation 8, which is itself running on Kubuntu 12.04.

Relevant parts of 'cygcheck -s' are:

Windows 7 Professional N Ver 6.1 Build 7601 Service Pack 1

Running under WOW64 on AMD64

    Cygwin DLL version info:
        DLL version: 1.7.16
        DLL epoch: 19
        DLL old termios: 5
        DLL malloc env: 28
        Cygwin conv: 181
        API major: 0
        API minor: 262
        Shared data: 5
        DLL identifier: cygwin1
        Mount registry: 3
        Cygwin registry name: Cygwin
        Program options name: Program Options
        Installations name: Installations
        Cygdrive default prefix:
        Build date:
        Shared id: cygwin1S5


Potential app conflicts:


ByteMobile laptop optimization client.

No Cygwin services found.

Cygwin Package Information
Package                    Version              Status
bash                       4.1.10-4             OK
cygwin                     1.7.16-1             OK


Tom.


