OSError(16, 'Device or resource busy') in Parameters of a DIRAC job #6993
Hello, I observe occasional job failures on one particular computing resource; they are characterized by the following features:
Does anybody know what the reason for this behavior might be?
Sometimes there is no error in the pilot output at all, and the log ends with the following lines:
It looks like the issue is with the psutil library. While scaling_max_frequency is being changed, the file can be inaccessible for a short period of time. If psutil tries to read it at that moment, an exception is raised. I am talking with the psutil developers, and I have also asked my colleagues who are responsible for the servers to make scaling_max_frequency static.
@fstagni, what do you think: would it be possible to add a "try/except" around that part of the code?
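For illustration, here is a minimal sketch of what such a guard could look like. It is not the actual DIRAC code: the helper name `read_with_retry` and its parameters are hypothetical, and I am assuming the failing call is something like `psutil.cpu_freq()`. The idea is to retry a few times when the read fails with errno 16 (EBUSY) and to fall back to a default value instead of failing the job.

```python
import errno
import time


def read_with_retry(reader, retries=3, delay=0.1, default=None):
    """Call reader() and return its result.

    Hypothetical helper: if reader() raises OSError with errno EBUSY
    ("Device or resource busy"), wait `delay` seconds and try again,
    up to `retries` attempts in total. If every attempt is busy,
    return `default`. Any other OSError is re-raised unchanged.
    """
    for attempt in range(retries):
        try:
            return reader()
        except OSError as exc:
            if exc.errno != errno.EBUSY:
                # Not the transient "busy" case, so do not hide it.
                raise
            if attempt < retries - 1:
                time.sleep(delay)
    return default


# Possible usage (assumption: the value comes from psutil.cpu_freq()):
#   import psutil
#   freq = read_with_retry(psutil.cpu_freq)
#   if freq is None:
#       ...  # skip the CPU frequency parameter for this job
```

Re-raising everything except EBUSY keeps unrelated problems (missing file, permissions) visible, while the short delay gives the frequency file time to become readable again.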
Well, I created an issue to refer to in the PR: #7109. But while working on that issue, it turned out that this line of code was removed completely in DIRAC 8, so I closed the issue.
The solution for me was to switch to a newer Pilot version. I used 8.0.23, which works with server 7.3.29. The plan is to move to a DIRAC 8 server.