Replies: 9 comments
-
Apologies for the late response; I have been inactive for a couple months due to a family matter.
WinFuse is an effort to bring the low-level FUSE API to Windows. It works on top of WinFsp and requires a modified version of libfuse. I originally created WinFuse primarily as a method to add FUSE capability to WSL1. WinFuse exposes the FUSE protocol and makes it accessible from the native Windows API ([…]
Unfortunately, when Microsoft moved from WSL1 (which is based on Windows kernel emulation of the Linux API) to WSL2 (which is based on a virtualized Linux kernel), it removed my major motivation to continue with the WinFuse project.
Ideally such tests should be performed in an identical environment. But I understand that this may not always be possible if your primary environment is Linux and you are just trying to get a feel for Windows performance via virtualization. Let us discuss the broad differences between file system implementation on Windows and Linux, which will help us better understand what you are seeing and suggest potential improvements.
File system operation classes and their performance
First let us consider file system operations in two broad classes: operations that act on the names of files (e.g. […]
In general "I/O" operations are quite similar between Linux and Windows. For this reason you will often see very similar performance when using Windows vs Linux. This is also true with WinFsp file systems, if you enable the kernel cache manager (on FUSE […]
OTOH "namespace" operations follow a different model between Linux and Windows. The first important difference is that on Linux paths are broken down into path components represented by […]
A second important difference is that on Linux many namespace operations are treated as primitive operations, whereas on Windows they are not. As an example consider […]
As a rule of thumb you should expect "namespace" operations on Linux to be faster than on Windows. OTOH, you should expect "I/O" operations on Linux to be on par with those on Windows.
Listing directories
An important "namespace" operation is that of listing files in a directory. This operation is very different between Linux and Windows: On Linux a […]
WinFsp and FUSE
WinFsp provides a native API that matches the operations that a Windows kernel mode FSD would have to implement. There are two FUSE layers on top of this native API: the high-level FUSE API provided with the core WinFsp project and the low-level FUSE API provided by the WinFuse project.
Both FUSE wrappers have to do considerable work to match the Windows file system model to the Linux one, work that goes far beyond translating strings to wstrings. For some insight into this see the SSHFS Port Case Study. The low-level FUSE wrapper has to go to even greater lengths as it has to translate every Windows path it sees into […]
How to improve performance
Finally, if you have a read-only file system you do not have to worry about cache coherency and you can always use […]
-
Hi again and thank you so much for the detailed reply! So it looks like there isn't actually much left for me to do to improve performance, at least for as long as I want to stick to the FUSE API. I've already been using […]
As another quick test, I made a simple file system containing only a single small file in a subdirectory. I'm running this with […]
-
Does your file system support […]
-
Ohhh, it does indeed! And I presume from what I'm seeing that those calls are not cached? I'll check later if the behaviour changes when I disable symlinks. As I know beforehand whether a file system contains symlinks or not, I can choose to disable […]
-
According to Windows file system rules, when a file is opened via a path that contains symlinks, the symlinks must first be resolved and the real path must be returned to Windows (via the special […]
This means that the user mode DLL contains a version of […]
These extra […]
Over the lifetime of the WinFsp project the complexity of the user mode DLL has increased, and even more so in the FUSE layer. It may be worthwhile to add some caching and especially a […]
-
So with the […]
-
Recall that on Windows every namespace operation requires opening the file, performing the operation and then closing the file. For example, consider an imaginary Windows equivalent of […]
A lot of applications are coded against the Win32 API, which follows the above pattern and does not reuse the same HANDLE across operations, so you get a lot of extra open and close calls. Furthermore, Windows has a lot of components such as filter drivers that see file system traffic and like to open files for their own purposes (e.g. antivirus software).
Because in the WinFsp design access control happens in user mode (to allow flexibility and custom access control policies), all open requests have to be forwarded to user mode. So you get to see most opens in your user mode file system (some may still be optimized away by the FSD in some circumstances).
If you want to see where your file system traffic originates from, I recommend the FileSpy tool (original website seems to be gone, but the tool here seems to be the right one: https://community.chocolatey.org/packages/filespy).
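The open/operate/close pattern described above can be illustrated with a toy model. This is only a sketch in Python, not WinFsp or FUSE code: ToyFileSystem, stat_primitive and stat_open_query_close are hypothetical names, and the model only counts how many requests a user mode file system would see under each pattern.

```python
from collections import Counter


class ToyFileSystem:
    """Hypothetical user mode file system that only counts the requests it sees."""

    def __init__(self):
        self.calls = Counter()

    def open(self, path):
        self.calls["open"] += 1
        return path  # the "handle" is simply the path in this toy model

    def getattr(self, handle):
        self.calls["getattr"] += 1
        return {"size": 0}

    def close(self, handle):
        self.calls["close"] += 1


def stat_primitive(fs, path):
    """Linux-like model: querying attributes is one primitive request."""
    return fs.getattr(path)


def stat_open_query_close(fs, path):
    """Windows-like model: open the file, query it, then close it again."""
    handle = fs.open(path)
    try:
        return fs.getattr(handle)
    finally:
        fs.close(handle)


paths = [f"/dir/file{i}" for i in range(1000)]

linux_like = ToyFileSystem()
for p in paths:
    stat_primitive(linux_like, p)

windows_like = ToyFileSystem()
for p in paths:
    stat_open_query_close(windows_like, p)

print(dict(linux_like.calls))
print(dict(windows_like.calls))
```

In this model every attribute query costs three requests instead of one, before counting any additional opens issued by other components.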
-
Correct link for FileSpy (don’t know why my search engine could not find it at first): http://www.zezula.net/en/tools/filespy.html
-
I've added some code that checks if there are any symlinks present in a file system image, and if not, it will simply not register the […]
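The idea of registering a callback only when the image needs it can be sketched roughly as follows. This is a minimal Python model, not the actual DwarFS code: the entry dictionaries, build_operations and the do_* handlers are hypothetical. The sketch uses readlink because that is the FUSE callback for resolving symlinks; that it is the callback meant above is an assumption.

```python
def build_operations(entries):
    """Return a callback table, adding readlink only if the image has symlinks."""
    by_path = {entry["path"]: entry for entry in entries}

    def do_getattr(path):
        return {"type": by_path[path]["type"]}

    def do_readlink(path):
        return by_path[path]["target"]

    operations = {"getattr": do_getattr}
    # Scan the image once up front; skip the callback if no entry is a symlink.
    if any(entry["type"] == "symlink" for entry in entries):
        operations["readlink"] = do_readlink
    return operations


plain_image = [
    {"path": "/a.txt", "type": "file"},
    {"path": "/dir", "type": "directory"},
]
linked_image = plain_image + [
    {"path": "/link", "type": "symlink", "target": "/a.txt"},
]

print(sorted(build_operations(plain_image)))
print(sorted(build_operations(linked_image)))
```

The scan happens once when the image is loaded, so the decision costs nothing per request.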
-
I did a couple of performance measurements with the DwarFS FUSE driver that I'm just porting to Windows with the help of WinFsp.
On Linux, I've so far been using the low-level FUSE API exclusively. It seems there has been work on making the low-level API available on Windows as well as part of the winfuse project, but it doesn't look like there's much happening these days. Do you reckon there will be progress on supporting the FUSE low-level API in the future?
As an example, I've got a DwarFS file system with ~80,000 files and overall 800 MB of data (the full Boost 1.82.0 sources). On a Linux machine, doing […]
I can scan and read the whole file system in just about 3 seconds (and that includes the time it takes to decompress the data; the compressed file system is only 100 MiB in size).
Doing the exact same thing on Windows on the same machine (albeit in a VirtualBox instance), using WinFsp with the high-level FUSE API, creating the tar archive takes around 30 seconds, so it's an order of magnitude slower.
So I wondered where the bottleneck is and did some measurements. On Linux, I see:
These measurements look reasonable. Reading/decompressing all regular file data takes about one second. lookup/getattr are called approximately once for each inode. open is called once per regular file.
Let's look at the Windows version. I've made sure that real-time protection in Windows Defender was off during the test:
First of all, despite the fact that it's doing slightly more read calls, the overall read latency is exactly the same, just under one second.
However, it's spending 4.4 seconds just doing getattr calls. It's doing almost 30x as many getattr calls as the Linux version. That's a surprise.
It's also doing a lot more open calls. Again, a bit surprising.
Still, the overall time spent inside the FUSE driver is just over 6 seconds, so more than 23 seconds are spent elsewhere. I presume that a lot of time is spent mangling paths from wstring to UTF-8, but I can't believe that's the only culprit.
I'm wondering what the best way would be to get the performance of the Windows driver closer to the Linux performance.
What's the source of all the getattr calls? Is there something I could do to reduce their number? DwarFS file systems are read-only, so the attributes for an inode never actually change.
Would a low-level FUSE API, if it was available, improve the situation dramatically? At least, there would have to be much less path mangling, but I wonder if this would help e.g. reduce the number of getattr calls.
Would switching to the "native" WinFsp API improve things? DwarFS has been designed to be usable without FUSE, so the FUSE driver itself is just a thin layer making calls into the DwarFS API.
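Since the attributes of a read-only file system never change, one option inside the driver is to memoize getattr results per path. The following is a minimal Python sketch of that idea, not DwarFS or WinFsp code: AttrCache and backend_getattr are hypothetical names. Note the limitation: this can only make each getattr call cheaper inside the driver, it does not reduce the number of calls the driver receives.

```python
class AttrCache:
    """Memoize attribute lookups; only safe because the file system is read-only."""

    def __init__(self, backend_getattr):
        self._backend_getattr = backend_getattr
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def getattr(self, path):
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        # First request for this path: ask the backend and remember the answer.
        attrs = self._backend_getattr(path)
        self._cache[path] = attrs
        self.misses += 1
        return attrs


backend_calls = []


def backend_getattr(path):
    """Stand-in for the expensive lookup in the real file system."""
    backend_calls.append(path)
    return {"path": path, "size": 1234, "mode": 0o100444}


cache = AttrCache(backend_getattr)
for _ in range(30):
    cache.getattr("/boost/version.hpp")

print(len(backend_calls), cache.hits, cache.misses)
```

The cache here is unbounded, which seems acceptable for an image with a known, fixed number of inodes; a writable file system would need invalidation instead.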
As another example that the bottleneck is not DwarFS itself, this is dwarfsextract using libarchive to build a pax archive from the file system, completely bypassing the FUSE driver:
The whole operation takes about one second.
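Per-operation call counts and latencies of the kind discussed in this post can be collected by wrapping each callback. The following Python sketch shows one way to do it. It is not how the DwarFS driver gathers its numbers: the measured decorator, the stats table and the op_* handlers are all hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-operation totals: number of calls and accumulated wall-clock time.
stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def measured(name):
    """Decorator that records call count and elapsed time under the given name."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                entry = stats[name]
                entry["calls"] += 1
                entry["seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@measured("getattr")
def op_getattr(path):
    return {"path": path, "size": 0}


@measured("read")
def op_read(path, size, offset):
    return b"\0" * size


for i in range(100):
    op_getattr(f"/file{i}")
op_read("/file0", 4096, 0)

for name, entry in sorted(stats.items()):
    print(f"{name:8s} calls={entry['calls']:6d} total={entry['seconds']:.6f}s")
```

Timing only covers the body of each callback, so anything spent outside the driver (path conversion in a wrapper layer, kernel transitions) would not show up in these totals.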