Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

removeItem performance #2933

Closed
pepicrft opened this issue Oct 17, 2024 · 5 comments · Fixed by #3008
Closed

removeItem performance #2933

pepicrft opened this issue Oct 17, 2024 · 5 comments · Fixed by #3008
Labels
area/performance Improvements to performance. kind/enhancement Improvements to existing feature. size/S Small task. (A couple of hours of work.)

Comments

@pepicrft
Copy link

👋🏼 Hi,

First of all, thanks for this amazing work.

We are using the file-system IO components at Tuist, and I'd like to bring up something that we notice regarding the performance of the NIOFileSystem.FileSystem.removeItem function.

When we run it against a large directory with many nested directories and and files, we noticed the performance is much worse than using FileManager or just rm -rf in a Linux environment. We benchmarked the performance with hyperfine and the results are below.

Looking at the implementation, it seems that we are not parallelising at all, so I was wondering if this is something intentional, or if it's ok to refactor that code to run as much as possible in parallel.

Bechmarks

With NIOFileSystem

Benchmark

  • Time (mean ± σ): 67.939 s ± 4.450 s [User: 61.906 s, System: 3.205 s]
  • Range (min … max): 63.684 s … 77.757 s 10 runs

With Foundation's FileManager

Benchmark:

  • Time (mean ± σ): 5.857 s ± 0.251 s [User: 1.430 s, System: 2.067 s]
  • Range (min … max): 5.419 s … 6.190 s 10 runs

With rm -rf

Benchmark:
Time (mean ± σ): 3.150 s ± 0.386 s [User: 0.022 s, System: 1.666 s]
Range (min … max): 2.225 s … 3.585 s 10 runs

@FranzBusch
Copy link
Member

Thanks for filling this! This is definitely something that we would take a patch for and parallelise the removal similar to how we did for copying in #2806.

@FranzBusch FranzBusch added kind/enhancement Improvements to existing feature. area/performance Improvements to performance. size/S Small task. (A couple of hours of work.) labels Oct 17, 2024
mimischi added a commit to mimischi/swift-nio that referenced this issue Nov 29, 2024
This PR refactors the logic to remove items. Instead of sequentially
discovering and deleting items one by one, we concurrently traverse all
directories that we find and then sequentially issue file deletions.

This new deletion logic is currently unbounded, and will spawn a bunch
of tasks no matter the underlying system.

Benchmark using `--configration=release` for both the `main` and
`parallel-removal` branches:
```
$ rm -rf .build && git switch main && swift build --configuration=release && cp .build/release/NIOTestCLI NIOTestCLI_main
[...]
Building for production...
[134/134] Linking NIOPerformanceTester
Build complete! (32.65s)
$ rm -rf .build && git switch parallel-removal && swift build --configuration=release
[...]
Building for production...
[134/134] Linking NIOPerformanceTester
Build complete! (32.19s)
$ hyperfine --max-runs 10 --prepare "rm -rf /tmp/swift-nio && cp -r $(pwd) /tmp/swift-nio" "./NIOTestCLI_main /tmp/swift-nio" ".build/release/NIOTestCLI /tmp/swift-nio" "rm -rf /tmp/swift-nio"
Benchmark 1: ./NIOTestCLI_main /tmp/swift-nio
  Time (mean ± σ):      3.019 s ±  0.032 s    [User: 0.601 s, System: 11.123 s]
  Range (min … max):    2.985 s …  3.072 s    10 runs

Benchmark 2: .build/release/NIOTestCLI /tmp/swift-nio
  Time (mean ± σ):      2.295 s ±  0.065 s    [User: 0.673 s, System: 7.677 s]
  Range (min … max):    2.212 s …  2.370 s    10 runs

Benchmark 3: rm -rf /tmp/swift-nio
  Time (mean ± σ):      1.517 s ±  0.037 s    [User: 0.015 s, System: 1.442 s]
  Range (min … max):    1.480 s …  1.590 s    10 runs

Summary
  rm -rf /tmp/swift-nio ran
    1.51 ± 0.06 times faster than .build/release/NIOTestCLI /tmp/swift-nio
    1.99 ± 0.05 times faster than ./NIOTestCLI_main /tmp/swift-nio
```

Resolves apple#2933
@mimischi
Copy link
Member

@pepicrft I've raised a PR (#3008) to address the issue of handling removal of items sequentially. I seem to be unable to reproduce your results though. The existing logic in main appears to be performing ok, when compared to rm -rf (albeit slower).

If you still have your test setup, would you mind sharing that, so I can try and reproduce that?

@pepicrft
Copy link
Author

pepicrft commented Dec 4, 2024

@pepicrft I've raised a PR (#3008) to address the issue of handling removal of items sequentially. I seem to be unable to reproduce your results though. The existing logic in main appears to be performing ok, when compared to rm -rf (albeit slower).

If you still have your test setup, would you mind sharing that, so I can try and reproduce that?

Thanks a lot, @mimischi 🙏🏼. I unfortunately don't have the setup anymore, but I remember it was a directory containing all the dependencies of a project that had been resolved by the SPM. Perhaps you can try with a project that has many dependencies

mimischi added a commit to mimischi/swift-nio that referenced this issue Dec 4, 2024
This PR refactors the logic to remove items. Instead of sequentially
discovering and deleting items one by one, we concurrently traverse all
directories that we find and then sequentially issue file deletions.

This new deletion logic is currently unbounded, and will spawn a bunch
of tasks no matter the underlying system.

Benchmark using `--configration=release` for both the `main` and
`parallel-removal` branches:
```
$ rm -rf .build && git switch main && swift build --configuration=release && cp .build/release/NIOTestCLI NIOTestCLI_main
[...]
Building for production...
[134/134] Linking NIOPerformanceTester
Build complete! (32.65s)
$ rm -rf .build && git switch parallel-removal && swift build --configuration=release
[...]
Building for production...
[134/134] Linking NIOPerformanceTester
Build complete! (32.19s)
$ hyperfine --max-runs 10 --prepare "rm -rf /tmp/swift-nio && cp -r $(pwd) /tmp/swift-nio" "./NIOTestCLI_main /tmp/swift-nio" ".build/release/NIOTestCLI /tmp/swift-nio" "rm -rf /tmp/swift-nio"
Benchmark 1: ./NIOTestCLI_main /tmp/swift-nio
  Time (mean ± σ):      3.019 s ±  0.032 s    [User: 0.601 s, System: 11.123 s]
  Range (min … max):    2.985 s …  3.072 s    10 runs

Benchmark 2: .build/release/NIOTestCLI /tmp/swift-nio
  Time (mean ± σ):      2.295 s ±  0.065 s    [User: 0.673 s, System: 7.677 s]
  Range (min … max):    2.212 s …  2.370 s    10 runs

Benchmark 3: rm -rf /tmp/swift-nio
  Time (mean ± σ):      1.517 s ±  0.037 s    [User: 0.015 s, System: 1.442 s]
  Range (min … max):    1.480 s …  1.590 s    10 runs

Summary
  rm -rf /tmp/swift-nio ran
    1.51 ± 0.06 times faster than .build/release/NIOTestCLI /tmp/swift-nio
    1.99 ± 0.05 times faster than ./NIOTestCLI_main /tmp/swift-nio
```

Resolves apple#2933
@mimischi
Copy link
Member

I've never followed-up here. I've done a few more tests on my PR, but never increased the number of files/directories above five-digits. The benchmarks showed that the new swift-nio is faster than both rm -rf and the FileManager implementation.

@pepicrft
Copy link
Author

I've never followed-up here. I've done a few more tests on my PR, but never increased the number of files/directories above five-digits. The benchmarks showed that the new swift-nio is faster than both rm -rf and the FileManager implementation.

This is fantastic news. Thanks a lot @mimischi 🙏🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/performance Improvements to performance. kind/enhancement Improvements to existing feature. size/S Small task. (A couple of hours of work.)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants