This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Slow stat(2) performance on ClearCase MVFS


For quite a while now I've seen noticeably poor cygwin performance
on ClearCase MVFS drives with recursive commands like:

o grep -r
o find .
o rm -r

For example, executing 'find . -name "*.exe"' on a particular
MVFS directory tree here takes 8 mins (480 secs), but using
the strategy outlined in result 6 below reduces the time to 32 secs.


I did some digging on 1.5.25-15 and narrowed the issue down to the performance of stat(2).



Some questions:

o Does it make sense to replace GetFileAttributes() with
  FindFirstFile() in all cases ?

o Is it possible for fhandler_base::fstat_fs() to always
  use fstat_by_name() only, and avoid using open_fs() and
  fstat_by_handle() ?





Here are timings using some simple benchmarking programs. Each
program has a simple 10000 iteration loop:


GetFileAttributes   Perform GetFileAttributes(argv[1])
FindFirstFile       Perform FindFirstFile(argv[1]), FindClose()
stat                Perform stat(argv[1])
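
The benchmark sources themselves were not posted. As a rough sketch only, the loop in the stat variant might look like the following. The function name stat_loop and the returned success count are assumptions made for illustration; timing is left to time(1), as in the measurements.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Call stat() on `path` `iterations` times, as each benchmark loop does.
   Returns how many calls succeeded, so a bad path is noticed instead of
   silently timing the failure case.
   (stat_loop is a hypothetical name, not taken from the original post.) */
int stat_loop(const char *path, int iterations)
{
    struct stat sb;
    int ok = 0;

    for (int i = 0; i < iterations; ++i)
        if (stat(path, &sb) == 0)
            ++ok;

    return ok;
}
```

Calling stat_loop(argv[1], 10000) from main() and running the program under time(1) would give the same shape of measurement as the stat rows in the results.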


The results are measured in elapsed seconds using cygwin time(1) on the following files:

NTFS     c:/WINDOWS/system32/drivers/etc/hosts
MVFS     v:/cerberus/daytona/lib/Makefile.mk


                              NTFS     MVFS
1. GetFileAttributes          0.66     10.5
2. FindFirstFile              0.33      1.2
3. stat(MSVC)                 0.37      1.2
4. stat(CYGWIN-1.5.25)        1.47     20.3
5. stat(no open)              2.4      11.5
6. stat(no attr, open)        2.0       2.3


Results 2 and 3 show that the Win32 and MSVC functions perform well, but that ClearCase MVFS can be expected to be three to four times slower than native NTFS.

Result 1 shows that GetFileAttributes is nearly nine times slower
than FindFirstFile on MVFS (10.5 secs vs 1.2 secs), and twice as slow on NTFS.
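
To put the figures on a common footing, the arithmetic behind these comparisons can be sketched with two small helpers. The names slowdown and per_call_ms are illustrative only; the inputs are the elapsed seconds from the results and the 10000 iteration count.

```c
/* Slowdown factor of one timing relative to another (both in seconds).
   Hypothetical helper for illustration. */
double slowdown(double slow_secs, double fast_secs)
{
    return slow_secs / fast_secs;
}

/* Average cost of a single call in milliseconds, given the total
   elapsed seconds for `iterations` calls. Hypothetical helper. */
double per_call_ms(double total_secs, int iterations)
{
    return total_secs * 1000.0 / iterations;
}
```

With the MVFS figures, slowdown(10.5, 1.2) is 8.75 for GetFileAttributes against FindFirstFile, and per_call_ms(20.3, 10000) is about 2.03 ms per stat(2) call on the vanilla build.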

Result 4 gives a baseline performance for stat(2) on a vanilla
1.5.25-15 system.

Result 5 shows a doubling of MVFS performance over result 4 by forcing
fstat_by_name() instead of fstat_by_handle():

--- fhandler_disk_file.cc.orig 2009-04-18 10:26:34.937500000 -0700
+++ fhandler_disk_file.cc 2009-04-18 10:27:04.484375000 -0700
@@ -356,7 +356,7 @@
return fstat_by_name (buf);
query_open (query_stat_control);
}
- if (!(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)
+ if ((oret = 0) && !(oret = open_fs (open_flags, 0)) && get_errno () == EACCES)
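
The one-line change above leans on short-circuit evaluation: (oret = 0) assigns zero and evaluates to false, so nothing to the right of the first && runs and the open call is skipped. A minimal stand-alone sketch of that behaviour follows; fake_open, open_calls, and branch_taken are hypothetical stand-ins, not Cygwin code.

```c
int open_calls = 0;    /* how many times the stand-in was invoked */
int branch_taken = 0;  /* set if the body of the if statement runs */

/* Hypothetical stand-in for open_fs(); it only records that it ran. */
int fake_open(void)
{
    ++open_calls;
    return 1;
}

/* Same shape as the patched condition. Returns the final value of oret. */
int patched_condition(void)
{
    int oret;

    /* (oret = 0) is false, so && never evaluates fake_open(). */
    if ((oret = 0) && !(oret = fake_open()))
        branch_taken = 1;

    return oret;
}
```

After patched_condition() returns, oret is 0, the stand-in was never called, and the if body did not run. In the patched stat path this presumably leaves oret at zero so that the code falls through to fstat_by_name(), which matches the description of result 5.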



Result 6 shows a roughly ninefold improvement in MVFS performance over result 4 (20.3 secs down to 2.3 secs) by forcing fstat_by_name() and also overriding GetFileAttributes() with a version built on FindFirstFile():

--- path.cc.orig        2009-04-18 11:18:49.812500000 -0700
+++ path.cc     2009-04-18 11:19:01.625000000 -0700
@@ -4299,3 +4299,24 @@
     strcpy (bs, ".");
   return buf;
 }
+
+extern "C"
+DWORD GetFileAttributes (const TCHAR* path)
+{
+  for (const TCHAR* p = path; *p; ++p)
+    if (*p == '*' || *p == '?')
+       return INVALID_FILE_ATTRIBUTES;
+
+  WIN32_FIND_DATA findbuf;
+
+  HANDLE findhandle = FindFirstFile(path, &findbuf);
+
+  if (findhandle != INVALID_HANDLE_VALUE)
+    {
+      FindClose(findhandle);
+
+      return findbuf.dwFileAttributes;
+    }
+
+  return INVALID_FILE_ATTRIBUTES;
+}




