This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: VM and non-blocking writes


On Dec 15 12:29, Robert Pendell wrote:
> Corinna Vinschen wrote:
> > Obviously I searched wrong.  There are reports about this behaviour
> > since at least 1998 and it has never been fixed.  These two links
> > might be interesting:
> > 
> >   http://support.microsoft.com/kb/q201213/
> >   http://tinyurl.com/2brokp
> 
> Do you have the test case you used for the pure win32 mode?

Sure, but before we start with this, a note:

  I'm contemplating working around this problem in Cygwin (not
  for 1.5.25, but in the main trunk) by capping the number of bytes in a
  single send call, according to the patch Lev sent in
  http://www.cygwin.com/ml/cygwin-patches/2006-q2/msg00031.html.

  Lev, are you interested in reworking your patch (minus the pipe stuff)
  to match current CVS?  Is there any gain in raising SO_SNDBUF/SO_RCVBUF
  to a value > 8K, especially in light of my experiences described in the
  comments in net.cc, function fdsock()?
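
The capping idea mentioned above can be sketched roughly as follows. This is only an illustration of the general technique, not Lev's patch and not Cygwin's code: the names clamp_len and capped_send and the cap value SEND_CAP are hypothetical, and the sketch assumes plain POSIX sockets.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical cap.  The thread does not settle on a value; 64K is an
   assumption made purely for illustration. */
#define SEND_CAP ((size_t) 65536)

/* Return len, limited to at most cap bytes. */
size_t
clamp_len (size_t len, size_t cap)
{
  return len > cap ? cap : len;
}

/* Like send(), but never hands more than cap bytes to the network stack
   in a single call.  The return value is whatever send() returns, so the
   caller still has to loop until the whole buffer has been written. */
ssize_t
capped_send (int fd, const void *buf, size_t len, int flags, size_t cap)
{
  return send (fd, buf, clamp_len (len, cap), flags);
}
```

A caller would use capped_send (fd, data + pos, remaining, 0, SEND_CAP) in place of a plain send and advance pos by the return value, so that a single huge request is never passed to the stack in one piece.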

Back to the testcase.  Source attached.  I created it so that it can be
built as a Cygwin or Linux executable

  $ gcc -g -o nbcheck nbcheck.c

as well as a native Windows application using mingw:

  $ gcc -g -mno-cygwin -o nbcheck-nat nbcheck.c -lws2_32

It takes the size of the user data buffer as optional argument, defaulting
to 100,000,000 bytes.

> If you do
> then maybe I can try and push to get this fixed for the next service
> pack release for both XP and Vista as well as Server 2008.  This will
> especially be the case if it can be easily reproduced.

Reproducing the issue is as easy as Wayne described.  Just start a
client application which connects but never reads, for instance by using
the python sequence Wayne used in his mail:

  $ python
  import socket
  s = socket.socket()
  s.connect(("name-of-windows-box", 12345))

If you add a second arbitrary argument, the testcase always writes in
10,000-byte chunks.  This shows how select starts to block at some
point, in my case on XP SP2 after writing 190,000 bytes.

Result on Linux:

  $ ./nbcheck 500000000
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  65536 bytes written
  trying to write 99934464 bytes
  147456 bytes written
  [HANG in select]

  $ ./nbcheck 100000000
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  65536 bytes written
  trying to write 99934464 bytes
  147456 bytes written
  [HANG in select]

  $ ./nbcheck 100000000 x
  listening to port 12345 host linux-box (10.0.0.1)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  [HANG in select]

Result on Windows:

  $ ./nbcheck-nat 500000000
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 500000000 bytes
  trying to write 500000000 bytes
  Err: 10055
  hit return to exit 

  $ ./nbcheck-nat 100000000
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 100000000 bytes
  100000000 bytes written
  hit return to exit 

  $ ./nbcheck-nat 100000000 x
  listening to port 12345 host windows-box (10.0.0.2)
  got connection from 10.0.0.3
  accepted socket is nonblocking now
  buffer size is 100000000 bytes
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for 5 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for 14 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for about 60 seconds]
  trying to write 10000 bytes
  10000 bytes written
  [WAIT in select for about 60 seconds]
  [a couple of times, but not always the same]
  trying to write 10000 bytes
  10000 bytes written
  [HANG in select]

The hang occurred after 160,000 bytes in one test run and after 190,000
bytes in another.  I have no idea if there's some sort of rule behind that.
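
To collect the stall point over many runs without hanging, one option is to give select a timeout and count the bytes accepted before it first expires. The following is a minimal sketch under stated assumptions: it targets POSIX sockets only, the helper names set_nonblocking, wait_writable and bytes_until_stall are hypothetical, and it is not part of the attached testcase.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Switch fd to nonblocking mode while preserving its other flags. */
int
set_nonblocking (int fd)
{
  int flags = fcntl (fd, F_GETFL, 0);
  if (flags < 0)
    return -1;
  return fcntl (fd, F_SETFL, flags | O_NONBLOCK);
}

/* Wait until fd is writable or timeout_sec seconds have passed.
   Returns 1 if writable, 0 on timeout, -1 on error. */
int
wait_writable (int fd, int timeout_sec)
{
  fd_set wfds;
  struct timeval tv;

  FD_ZERO (&wfds);
  FD_SET (fd, &wfds);
  tv.tv_sec = timeout_sec;
  tv.tv_usec = 0;
  return select (fd + 1, NULL, &wfds, NULL, &tv);
}

/* Send pieces of at most chunk bytes on a nonblocking socket until
   select times out or send reports EAGAIN/EWOULDBLOCK.  Returns the
   number of bytes accepted up to that point, or -1 on any other error. */
long
bytes_until_stall (int fd, size_t chunk, int timeout_sec)
{
  static char buf[10000];	/* zero-filled payload */
  long total = 0;

  if (chunk > sizeof buf)
    chunk = sizeof buf;
  for (;;)
    {
      ssize_t n;
      int r = wait_writable (fd, timeout_sec);

      if (r < 0)
	return -1;
      if (r == 0)
	return total;		/* select timed out: treat as a stall */
      n = send (fd, buf, chunk, 0);
      if (n < 0)
	return (errno == EAGAIN || errno == EWOULDBLOCK) ? total : -1;
      total += n;
    }
}
```

Calling bytes_until_stall (fd2, 10000, 120) on the accepted socket would report a figure comparable to the 160,000 and 190,000 bytes above, provided the timeout is chosen longer than the 60 second waits seen on Windows. The 120 second value is an assumption, not a measured bound.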

> A source and binary version will be useful for this.

Creating a binary is easy, see above.

> I am in the tech beta group for
> Vista SP1, XP SP3, and Server 2008 so I can at least remind them of this
> bug and show them a test case.  No guarantees that it will be fixed.

Actually, given that this behaviour has been known for at least 10 years,
I doubt that it will even be accepted as a bug.  But you should never give
up hope, right? :)


Thanks for your offer,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat

Attachment: nbcheck.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>

#ifdef _WIN32

/* winsock2.h has to come before windows.h, otherwise windows.h pulls in
   the conflicting winsock.h declarations. */
#include <winsock2.h>
#include <windows.h>

WSADATA wsadata;

#define SOCKLEN_T int

#else	// Assume Unix-like system

#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <errno.h>

/* Map the Winsock names used below onto their BSD socket equivalents. */
#define SOCKET int
#define WSADATA int
#define WSAStartup(a,b)
#define SOCKET_ERROR -1
#define SOCKLEN_T socklen_t
#define WSAGetLastError()	(errno)
#define SD_BOTH SHUT_RDWR
#define closesocket close
#define WSACleanup()

#endif

int
main(int argc, char **argv)
{
  int i;
  SOCKET fd, fd2;
  struct hostent *hp;
  struct protoent *pp;
  char hostname[64];
  struct sockaddr_in lAddr, rAddr;
  char* data;
  size_t datalen, datapos;
  
  WSAStartup (MAKEWORD(2,2), &wsadata);
  gethostname(hostname, 64);
  pp = getprotobyname("tcp");
  hp = gethostbyname(hostname);
  
  setbuf (stdout, NULL);
  assert(pp && hp);
  
  fd = socket(AF_INET, SOCK_STREAM, pp->p_proto);
  assert(fd != SOCKET_ERROR);
  
  lAddr.sin_family = hp->h_addrtype;
  memcpy(&lAddr.sin_addr.s_addr, hp->h_addr, sizeof(lAddr.sin_addr.s_addr));
  lAddr.sin_port = htons(12345);
  
  i = bind(fd, (struct sockaddr *)&lAddr, sizeof(lAddr));
  assert(i != SOCKET_ERROR);
  
  printf("listening to port %d host %s (%s)\n", ntohs(lAddr.sin_port),
	 hostname, inet_ntoa(lAddr.sin_addr));
  i = listen(fd, 5);
  assert(i != SOCKET_ERROR);
  
  i = sizeof(rAddr);
  memset(&rAddr, 0, sizeof(rAddr));
  fd2 = accept(fd, (struct sockaddr *)&rAddr, (SOCKLEN_T *) &i);
  assert(fd2 != SOCKET_ERROR);
  
  printf("got connection from %s\n", inet_ntoa(rAddr.sin_addr));
  
#ifdef _WIN32
  {
    u_long on = 1;
    i = ioctlsocket (fd2, FIONBIO, &on);
  }
#else
  i = fcntl(fd2, F_SETFL, O_NONBLOCK);
#endif
  assert(i != SOCKET_ERROR);

  printf("accepted socket is nonblocking now\n");
  
  datalen = argc > 1 ? strtol (argv[1], NULL, 0) : 100000000;
  data = (char *) malloc(datalen);
  assert(data);
  printf("buffer size is %lu bytes\n", (unsigned long) datalen);

  datapos = 0;

  while (datapos < datalen)
    {
      fd_set wfds;
      /* Bytes to hand to the next send call: the whole remainder, or at
	 most 10000 bytes if a second argument was given.  Clamping to the
	 remainder keeps chunked mode from reading past the end of data. */
      size_t len = datalen - datapos;

      if (argc > 2 && len > 10000)
	len = 10000;

      FD_ZERO(&wfds);
      FD_SET(fd2, &wfds);
      
      i = select(fd2 + 1, NULL, &wfds, NULL, NULL);
      assert(i == 1);
      
      printf("trying to write %d bytes\n", (int) len);

#if 0 // Same effect as send() on Windows, not available on Unix
      {
	DWORD ret;
	WSABUF iov[1];
	iov[0].buf = data + datapos;
	iov[0].len = len;
	i = WSASendTo (fd2, iov, 1, &ret, 0, NULL, 0, NULL, NULL);
	if (i != SOCKET_ERROR)
	  i = ret;
      }
#else
      i = send (fd2, data + datapos, len, 0);
#endif

      if (i == SOCKET_ERROR)
	{
	  printf ("Err: %d\n", WSAGetLastError ());
	  break;
      	}
      else
	printf("%d bytes written\n", i);
      
      
      datapos += i;
      assert(datapos <= datalen);
    }
  shutdown (fd2, SD_BOTH);
  closesocket (fd2);
  printf("hit return to exit ");
  getchar();
  WSACleanup ();
  return 0;
}



--
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple
Problem reports:       http://cygwin.com/problems.html
Documentation:         http://cygwin.com/docs.html
FAQ:                   http://cygwin.com/faq/
