I am working on a REST service that will execute some simple(!) data / file conversion tools, so not any real workflows. In my first prototype, I manually assembled a command line that performs the conversion. In my second prototype, I had three runners (local, localdocker and kubernetes). For K8S I (also) need a ReadWriteMany shared volume between the REST server (which places input files into said volume) and the K8S jobs.
So after the first prototypes (and before allowing more functionality) we'd like to get the architecture right and improve maintainability :-) Hence we are going to 1) use CWL to describe the conversion tools and 2) consider cwl-runner and calrissian as job runners.
calrissian takes the hints:DockerRequirement:dockerPull: whateverimage:latest and 1) puts that into the pod definition and 2) removes that from the cwl-runner inside that pod to avoid confusing cwl-runner
it maintains a simple JobResourceQueue
There is some convenient usage reporting.
I was wondering:
How do I know that my input job is finished ? Do I need to keep the K8S job id of my CalrissianJob-revsort and poll its status ? Or did I miss an easier way ?
Why not use K8S jobs instead of the JobResourceQueue and building another scheduler/queue into calrissian ? I found https://de.slideshare.net/DanLeehr/cwl-on-kubernetes-183727221
=> what is missing, and is that still missing today ? Is it the maximum memory and max CPU ? Are jobs still tenacious ?
How do I access the usage reports ?
Thanks in advance, Yours, Steffen
Hi @sneumann. See below for my thoughts on your questions.
How do I know that my input job is finished ? Do I need to keep the K8S job id of my CalrissianJob-revsort and poll its status ? Or did I miss an easier way ?
We attached a label and watched for K8S events for the jobs with the attached label. Here is the code we used that watched for job status changes: wait_for_job_events.
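As a rough illustration of the label-and-watch approach (this is not the linked wait_for_job_events code), the sketch below uses the official Kubernetes Python client. The function names and the way the label selector is passed in are assumptions made for this example, and it assumes the watcher runs inside the cluster.

```python
from typing import Iterable, Iterator, Tuple


def job_state(conditions: Iterable[Tuple[str, str]]) -> str:
    """Map Job status conditions, given as (type, status) pairs, to a coarse state."""
    for cond_type, cond_status in conditions:
        if cond_status == "True" and cond_type == "Complete":
            return "succeeded"
        if cond_status == "True" and cond_type == "Failed":
            return "failed"
    return "running"


def wait_for_labeled_jobs(namespace: str, label_selector: str) -> Iterator[Tuple[str, str]]:
    """Yield (job name, state) whenever a job matching the label has finished."""
    # Imported lazily so job_state() is usable without the client installed.
    from kubernetes import client, config, watch

    # Assumption: the watcher (e.g. the REST server) runs inside the cluster.
    config.load_incluster_config()
    batch = client.BatchV1Api()
    stream = watch.Watch().stream(
        batch.list_namespaced_job,
        namespace=namespace,
        label_selector=label_selector,
    )
    for event in stream:
        job = event["object"]
        conditions = [(c.type, c.status) for c in (job.status.conditions or [])]
        state = job_state(conditions)
        if state != "running":
            yield job.metadata.name, state
```

A finished job can show up in more than one event, so the caller should be prepared to see the same job name twice.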
Why not use K8S jobs instead of the JobResourceQueue and building another scheduler/queue into calrissian ? I found https://de.slideshare.net/DanLeehr/cwl-on-kubernetes-183727221
=> what is missing, and is that still missing today ? Is it the maximum memory and max CPU ? Are jobs still tenacious ?
We found that K8S jobs would retry jobs that failed after running for quite some time, wasting resources. For example, if there is a problem with a job's data and the job fails after 3 hours, a K8S job will retry it some number of times. We did still need to retry when the problem was temporary (which we found rather common in K8S).
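To make the distinction concrete, here is a minimal sketch of retrying only temporary failures. It is not calrissian's actual code; `TransientError` and `run_with_retry` are hypothetical names for this example. A plain K8S Job, by contrast, retries any pod failure up to the `backoffLimit` in its spec.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. an API timeout)."""


def run_with_retry(task: Callable[[], T], max_attempts: int = 3,
                   delay_seconds: float = 0.0) -> T:
    """Run task, retrying only when it raises TransientError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts, surface the temporary failure
            time.sleep(delay_seconds)
        # Any other exception (e.g. bad input data) propagates immediately,
        # so a job that fails after 3 hours is not run again.
    raise ValueError("max_attempts must be at least 1")
```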
How do I access the usage reports ?
I assume you are referring to the --usage-report command line option. This should write a JSON file in the location you specify once the calrissian process completes.
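For a REST service that wants to pick the report up afterwards, a small sketch could look like the following. The helper name and the path are assumptions for this example, and no particular layout of the JSON is assumed; the file is simply parsed if it is there.

```python
import json
from pathlib import Path
from typing import Any, Optional


def load_usage_report(path: str) -> Optional[Any]:
    """Return the parsed usage report, or None if it has not been written yet."""
    report = Path(path)
    if not report.is_file():
        return None  # calrissian has not completed (or the path is wrong)
    with report.open(encoding="utf-8") as handle:
        return json.load(handle)
```

With a shared volume, the path would be whatever location was passed to --usage-report, as seen from the REST server's mount.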
Hi team calrissian,
Currently a calrissian CWL job is submitted as K8S job by crafting the K8S job definition
https://github.com/Duke-GCB/calrissian/blob/master/examples/CalrissianJob-revsort.yaml#L3
using the dukegcb/calrissian:latest image as master pod and passing arguments to the calrissian python stuff, which in turn builds a pod to execute the actual cwl-runner.
The main benefits I get are