[BUG] Binding non-existent file #281
Comments
Thank you for your detailed report (including attempts to mitigate this on your side, appreciated!), Kyle! Yes, the SLURM environment tends to be problematic for PGAP. We will assess this situation ASAP.
Question: Does
work like setting the environment variable SLURM_CPUS_PER_TASK?
Note that the file singularity complains about does exist, according to your listing. Is it possible that you have some kind of local singularity settings that disfavor your directory as a mount source? Also, I just found out, in our FAQ, that:
Hi Azat,
As it turns out, the file exists only temporarily. I cleaned up the directory and re-ran the batch job. That listing was grabbed while PGAP was trying to run. Once the run fails, a second listing shows that the file is no longer there. PGAP must be deleting the file after the run fails. With that revelation, I am not sure what the actual problem is.
From my understanding, SLURM_CPUS_PER_TASK gets set and is usable while the job is running (for instance, to pass to a program so that it knows the actual core count it has to deal with), whereas the sbatch flag --cpus-per-task controls how many cores are requested during scheduling.
I understand about not being able to offer support. On that FAQ I do see that --no-internet may help, so I will try that as well. For what it is worth, this was working with PGAP version 2023-05-17.build6771. I wish I had known about the --no-self-update flag, as my woes started when PGAP updated itself.
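To illustrate the distinction between the two, here is a minimal Python sketch of how a program running inside the job might pick up the core count from SLURM_CPUS_PER_TASK. The helper name `slurm_cpu_count` and the fallback default are hypothetical and are not part of PGAP; as far as I know, SLURM only exports this variable when --cpus-per-task was requested, so the sketch falls back to a default when it is unset or malformed.

```python
import os


def slurm_cpu_count(env=None, default=1):
    """Return the core count exported by SLURM for this task.

    Hypothetical helper for illustration only. Falls back to `default`
    when SLURM_CPUS_PER_TASK is unset, non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK", "")
    try:
        count = int(raw)
    except ValueError:
        return default
    return count if count > 0 else default


if __name__ == "__main__":
    # Inside a job submitted with --cpus-per-task=N this should print N;
    # outside SLURM it prints the default.
    print(slurm_cpu_count())
```

The value read this way could then be passed on to whatever option the program uses to limit its own CPU usage.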
Kyle, you can still run the May version, by using the use-version parameter and, as you discovered yourself,
I opened an internal investigation (code PGAPX-1229) for this, Kyle.
Describe the bug
PGAP fails to start a singularity container because it is attempting to bind a file that does not yet exist.
To Reproduce
Using PGAP version 2023-10-03.build7061. Followed steps from quick start.
Starting directory structure:
Submitting a SLURM job with the following sbatch script, named pgap.slurm:
Results in the following directory structure:
slurm-18657.out:
cwltool.log:
I attempted to run the docker command directly and got the following error:
So, it appears to be failing because the file pgap_input_1zzxtcbo.yaml does not exist.
Expected behavior
PGAP should run successfully.
Software versions (please complete the following information):
Log Files
Ran with --debug but the debug and debug/log directories were empty.
Additional context
I had read some troubleshooting from other reported issues and tried this for the sbatch script, same results: