This is the mail archive of the cygwin mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

RE: Possible race in SYSV IPC (semaphores)


Hello again,

I can now positively confirm the race condition in cygserver w.r.t. the named
pipe used to serialize SYSV requests through the server.  The race is due to
that transport_layer_pipes::accept() (bool *const recoverable) (file: transport_pipes.cc) does actually _create_ the pipe when pipe_instance == 0
(ironically, transport_layer_pipes::listen() does not create any OS primitives
at all!).

This means that under heavy load, cygserver threads may all end up processing
their requests and closing all instances of the pipe (bringing pipe_instance == 0)
yet not being able to get to the point of accepting new request (that is, to
re-create the pipe).  For the client (user process), this looks like the pipe
does not exist (during that very tiny period of time), and the following message
gets printed:

Iteration 3016
      1 [main] a 4872 transport_layer_pipes::connect: lost connection to cygserver, error = 2

Note "error 2" for ERROR_FILE_NOT_FOUND.

The error is so subtle it is very difficult to reproduce with the default number of
threads in cygserver, 10 (although, not impossible, esp. on a moderately-to-heavily
loaded system).

The error is not due to deadlocking (reported earlier today) -- the server remains
afloat coming to CreateNamedPipe() just a jiffy too late, it's the client that suffers
from the pipe disappearance.  Deadlocking of the logging cygserver seems to remain a
yet another issue.

Anton Lavrentiev
Contractor NIH/NLM/NCBI

P.S. The code to reveal the race is the very same test program posted today
(stripped of the BUGx conditionals):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/sem.h>


#define SEMKEY 1


union semun {
    int val;                    /* value for SETVAL */
    struct semid_ds *buf;       /* buffer for IPC_STAT, IPC_SET */
    unsigned short  *array;     /* array for GETALL, SETALL */
};


static void doCriticalWork(void)
{
    return;
}


int main(void)
{
    struct sembuf lock[2];
    int n, semid;

    if ((semid = semget(SEMKEY, 2, IPC_CREAT | 0666)) < 0) {
        perror("semget(IPC_CREATE)");
        return 1;
    }

    lock[0].sem_num = 0;
    lock[0].sem_op  = 0;
    lock[0].sem_flg = IPC_NOWAIT;
    lock[1].sem_num = 0;
    lock[1].sem_op  = 1;

    lock[1].sem_flg = SEM_UNDO;

    if (semop(semid, lock, 2) != 0) {
        perror("semop(LOCK[0])");
        return 1;
    }

    for (n = 0; ; ++n) {
        static const union semun arg = { 0 };
        int error;

        printf("Iteration %d\n", n);

        lock[0].sem_num = 1;
        lock[0].sem_op  = 0; /* precondition:  [1] == 0 */
        lock[0].sem_flg = 0;
        lock[1].sem_num = 1;
        lock[1].sem_op  = 1; /* postcondition: [1] == 1 */
        lock[1].sem_flg = SEM_UNDO;

        if (semop(semid, lock, 2) < 0) {
            error = errno;
            fprintf(stderr, "semop(LOCK[1]): %d, %s\n",
                    error, strerror(error));
            break;
        }

        doCriticalWork();

        lock[0].sem_num =  1;
        lock[0].sem_op  = -1;
        lock[0].sem_flg =  SEM_UNDO | IPC_NOWAIT;

        if (semop(semid, lock, 1) < 0) {
            error = errno;
            fprintf(stderr, "semop(UNLOCK[1]): %d, %s\n",
                    error, strerror(error));
            break;
        }
    }

    return 1;
}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]