
Limiting number of active workers #129

Open · iwanbk opened this issue Mar 4, 2016 · 16 comments

iwanbk commented Mar 4, 2016

I have a memory issue in our FUSE implementation; in this case I am doing cp -a from a FUSE directory to the native FS.
There are around 20K files, with a total size of 640 MB.
I found that memory usage reaches 90% on my VM, which has 1497 MB of RAM.

After profiling, I found that:

  • the memory was allocated by fuse.allocMessage, which is called by fuse.(*Conn).ReadRequest
  • there is no limit on the number of active workers

So I changed fuse.Serve() to be something like this, and so far it solves our problem:

 func (s *Server) Serve(fs FS) error {
    defer s.wg.Wait()                // Wait for worker goroutines to complete before return
    guardChan := make(chan bool, 10) // limit number of active workers to 10

    s.fs = fs
    if dyn, ok := fs.(FSInodeGenerator); ok {
        s.dynamicInode = dyn.GenerateInode
    }

    root, err := fs.Root()
    if err != nil {
        return fmt.Errorf("cannot obtain root node: %v", err)
    }
    // Recognize the root node if it's ever returned from Lookup,
    // passed to Invalidate, etc.
    s.nodeRef[root] = 1
    s.node = append(s.node, nil, &serveNode{
        inode:      1,
        generation: s.nodeGen,
        node:       root,
        refs:       1,
    })
    s.handle = append(s.handle, nil)

    for {
        guardChan <- true
        req, err := s.conn.ReadRequest()
        if err != nil {
            <-guardChan
            if err == io.EOF {
                break
            }
            return err
        }

        s.wg.Add(1)
        go func() {
            defer func() {
                <-guardChan
                s.wg.Done()
            }()
            s.serve(req)
        }()
    }
    return nil
}

It basically limits the number of active workers to 10.

What do you think about it? I can send a PR if that is OK with you.

I think we should introduce another MountOption for this feature.
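
For comparison, the same 10-worker cap could also be expressed with a weighted semaphore from golang.org/x/sync/semaphore instead of a buffered channel. This is only a sketch of the same read loop, not a proposed API; it drops into the Serve method above and additionally needs the "context", "io", and "golang.org/x/sync/semaphore" imports:

    // Sketch: same limit of 10 active workers, using a weighted
    // semaphore instead of the guardChan buffered channel above.
    sem := semaphore.NewWeighted(10)
    ctx := context.Background()

    for {
        if err := sem.Acquire(ctx, 1); err != nil {
            return err
        }
        req, err := s.conn.ReadRequest()
        if err != nil {
            sem.Release(1)
            if err == io.EOF {
                break
            }
            return err
        }

        s.wg.Add(1)
        go func() {
            defer func() {
                sem.Release(1)
                s.wg.Done()
            }()
            s.serve(req)
        }()
    }
    return nil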

tv42 (Member) commented Mar 4, 2016

In general, I think limiting workers might be a very good idea.

However, I don't see how cp -a would be able to trigger unlimited handlers. Can you share more about that scenario?

iwanbk (Author) commented Mar 5, 2016

@tv42
I found that when doing cp -a, the flow is not always in this order: open file 1 -> process -> close file 1 -> open file 2 -> process -> close file 2 ...

But sometimes it is: open file 1 -> process -> open file 2 -> process ... -> close file 1 -> close file 2 ...
As you can see in our log below, "Closing file descriptor" is called many times in a row.

There might be an issue in our implementation, but in this case I think it is still better for bazil/fuse to limit the number of active workers.

2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_raise.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_raise.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_raise.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_raise.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_remove_old__future__imports.py' 
2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_remove_old__future__imports.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_remove_old__future__imports.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_remove_old__future__imports.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_remove_old__future__imports.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_keep_u.py' 
2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_keep_u.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_keep_u.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_keep_u.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_keep_u.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_literals_import.py' 
2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_literals_import.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_literals_import.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_literals_import.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_unicode_literals_import.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_xrange_with_import.py' 
2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_xrange_with_import.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_xrange_with_import.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_xrange_with_import.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/fixes/fix_xrange_with_import.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libfuturize/main.py' 
2016/03/05 01:30:18 rw D > Creating a file instance for: /root/aysfs_main/jumpscale8/lib/libfuturize/main.py 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/main.py 
2016/03/05 01:30:18 rw D > Opening file '/root/aysfs_main/jumpscale8/lib/libfuturize/main.py' (OpenReadOnly+0x20000) 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libfuturize/main.py 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libpasteurize' 
2016/03/05 01:30:18 rw D > Creating a dir instance for: /root/aysfs_main/jumpscale8/lib/libpasteurize 
2016/03/05 01:30:18 rw D > Attr /root/aysfs_main/jumpscale8/lib/libpasteurize 
2016/03/05 01:30:18 rw D > Listing dir: '/root/aysfs_main/jumpscale8/lib/libpasteurize' 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Closing file descriptor 
2016/03/05 01:30:18 rw D > Looking up '/root/aysfs_main/jumpscale8/lib/libpasteurize/__init__.py' 


jmmv commented Mar 17, 2016

Limiting the number of workers as shown in the original report is insufficient to obtain good performance.

The real problem is that the code spawns a single goroutine for each FUSE operation. When a file system is under load (think thousands of requests per second), the overhead of starting and terminating goroutines is very significant. Several of my profiling attempts have shown contention in the underlying system mutex/semaphore operations used to coordinate the goroutines, and also in the Go scheduler.

What we should do is spawn a fixed (and small) set of goroutines and dispatch the incoming FUSE requests to them. I have a form of this in jmmv@1337908, though I haven't had time to prepare it for a PR.
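
For illustration only, a minimal sketch of that shape: a fixed pool of goroutines draining a shared request channel instead of one goroutine per request. This is not the patch in jmmv@1337908; workerCount and the handle callback are placeholders.

    package sketch

    import (
        "io"
        "sync"

        "bazil.org/fuse"
    )

    const workerCount = 8 // placeholder; a real version would make this configurable

    // serveWithPool reads FUSE requests and hands them to a fixed set of
    // workers instead of spawning one goroutine per request.
    func serveWithPool(conn *fuse.Conn, handle func(fuse.Request)) error {
        reqs := make(chan fuse.Request)
        var wg sync.WaitGroup

        for i := 0; i < workerCount; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for req := range reqs {
                    handle(req)
                }
            }()
        }

        for {
            req, err := conn.ReadRequest()
            if err != nil {
                close(reqs) // let the workers drain and exit
                wg.Wait()
                if err == io.EOF {
                    return nil
                }
                return err
            }
            reqs <- req
        }
    }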

@EtiennePerot

jmmv, any progress on this? I'm experiencing the same symptom in my filesystem whenever I run a program like find that walks over the file tree (it's a very deep tree).

tv42 (Member) commented Aug 18, 2016

I have a largish rewrite in progress that does this, among many other things. Currently figuring out the last bits of the new API.

@EtiennePerot

Ping! It's been 4 months. Any updates?


omani commented Jan 18, 2017

any news here?

@yonderblue

Would be interested too :)

@EtiennePerot

Ping! It's been 8 months. Is the project dead?


jmmv commented Apr 24, 2017

To clear up confusion: I'm not currently working on this. (I abandoned the project that required it, and the solution was not fully clear.)


jmmv commented May 15, 2018

Getting back to this issue: I've recently reimplemented a file system of mine in Rust to compare the performance of the Go and Rust implementations. I was surprised to see a huge difference, though perhaps I shouldn't have been: all previous CPU traces I took from my Go implementation pointed at the Go runtime, and the tight loop here was suspicious.

Upon further investigation, I noticed that the Rust implementation of the FUSE bindings is purely sequential: it doesn't try to handle requests in parallel at all. So I applied the same change to the Go version: I removed the wait group in the Serve method shown above and made the s.serve(req) calls happen inline, not in a goroutine. This improved the performance of the Go implementation significantly and brought it on par with the Rust one. Any attempt at handling requests concurrently (even limiting the number of outstanding goroutines to 2) caused a big performance degradation.
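
For reference, a minimal sketch of such a sequential loop (illustrative only, not the exact change described; it would replace the for loop in the Serve method quoted earlier in this thread):

    // Purely sequential serve loop: each request is handled inline, and
    // the next ReadRequest happens only after the previous one returns.
    for {
        req, err := s.conn.ReadRequest()
        if err != nil {
            if err == io.EOF {
                break
            }
            return err
        }
        s.serve(req) // no goroutine, no WaitGroup
    }
    return nil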

So maybe just do the simple thing for now and remove the goroutines from the serve loop? That's what other FUSE bindings do. But I fear this project is dead...


omani commented May 15, 2018

@jmmv thank you for your input. I read this comment as just another normal comment at first, but then I had to reply:

What makes you think this project is dead? The latest commit was 24 days ago. @tv42 and all the contributors are doing a great job. This project is far from dead; I just wanted to mention that.

tv42 (Member) commented May 17, 2018

The current version has pretty miserable memory usage; it'll lose benchmarks pretty easily. There's a bigger refactor underway, but I haven't gotten back to it in a long while.

@jmmv Please do not ever try to declare someone else's project dead. That is really rude and entitled behavior.


jmmv commented May 17, 2018

Apologies for the claim that the project was dead. I was just echoing the sentiment of the April 2017 comment above, which got no answer and claimed the same thing. I failed to check for recent activity before saying anything.

Even if a big refactoring is coming, is there a trivial fix that could be applied here to make things better for everyone? I can run some large builds on a file system that uses this library with and without goroutines in the serve loop and post the results. These builds tend to be very aggressive on the file system and can provide significant data, but I'm not sure what you'd need to make the decision.


steeve commented Nov 16, 2018

Ping too!

tv42 (Member) commented Dec 9, 2022

Serving requests "inline" would eliminate a huge class of useful filesystems: you couldn't do any long-running processing to construct a response. My primary use case fetches data over the network; serving that inline would make the whole filesystem hang any time a single outgoing request was slow. (Whereas the current system just causes overhead, but functions well.) Multithreading is the obvious answer here for Go.

I'm wary of naively limiting the number of workers, because you might end up blocking fast operations behind a smallish number of slow requests. E.g., you might block statfs/readdir/stat because you have 100 slow HTTP fetches running. I'm still searching for an API shape that would let the application control this better.
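
To make that concern concrete, a purely illustrative sketch (not a proposed bazil/fuse API): only requests classified as slow are gated behind a bounded semaphore, so cheap metadata operations are never queued behind them. isSlow and maxSlow are made-up placeholders.

    package sketch

    import "bazil.org/fuse"

    const maxSlow = 100 // made-up limit on concurrently running slow requests

    var slowSlots = make(chan struct{}, maxSlow)

    // isSlow is a hypothetical classifier; which kinds are actually slow
    // depends entirely on the filesystem (here: data-carrying requests).
    func isSlow(req fuse.Request) bool {
        switch req.(type) {
        case *fuse.ReadRequest, *fuse.WriteRequest:
            return true
        }
        return false
    }

    // dispatch gates only slow requests behind the semaphore; fast
    // metadata ops (statfs/stat/readdir/...) run unbounded.
    func dispatch(req fuse.Request, handle func(fuse.Request)) {
        if isSlow(req) {
            slowSlots <- struct{}{} // blocks only other slow requests
            go func() {
                defer func() { <-slowSlots }()
                handle(req)
            }()
            return
        }
        go handle(req)
    }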

The current design largely imitates net/http. That can't be a horribly wrong answer, or the whole premise of Go is wrong!

It's highly likely that the high memory usage is more about how wasteful we are when allocating things, and not really about the goroutine usage. The code as-is would cause GC churn even if it were single-threaded.
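
As a generic illustration of reducing that kind of churn (not the library's actual allocMessage path), a sync.Pool can recycle read buffers between requests; the buffer size here is an arbitrary example:

    package sketch

    import "sync"

    // bufPool recycles fixed-size read buffers so each request does not
    // allocate a fresh one. 128 KiB is an illustrative size only.
    var bufPool = sync.Pool{
        New: func() interface{} {
            buf := make([]byte, 128*1024)
            return &buf
        },
    }

    // withBuffer borrows a buffer for the duration of fn and returns it
    // to the pool afterwards.
    func withBuffer(fn func(buf []byte)) {
        bp := bufPool.Get().(*[]byte)
        defer bufPool.Put(bp)
        fn(*bp)
    }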
