If you deploy your FlinkCluster as a Job Cluster / Application Cluster, cancelling the job from the Flink console will cancel the cluster too. (That is the intended behaviour.)
Try a Session Cluster (sample) if you want to reuse your FlinkCluster for different jobs. (You'd have to submit your jobs to the JobManager yourself using the Flink CLI.)
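Before cancelling anything, it may help to check which jobs the JobManager itself reports as active. Here is a minimal sketch, assuming the JobManager's REST endpoint is reachable (for example through a port-forward; the `http://localhost:8081` address is an assumption) and that the `/jobs/overview` response has a `jobs` list whose entries carry `jid`, `name` and `state`. The helper names `fetch_overview` and `running_jobs` are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_overview(base_url):
    """Fetch the job overview from the JobManager REST API.

    base_url is an assumption, e.g. "http://localhost:8081" for a
    port-forwarded JobManager.
    """
    with urlopen(f"{base_url}/jobs/overview") as resp:
        return json.load(resp)


def running_jobs(overview):
    """Return (jid, name) pairs for jobs that are RUNNING or RESTARTING.

    RESTARTING is included because a crash-looping job spends most of
    its time in that state rather than in RUNNING.
    """
    active = {"RUNNING", "RESTARTING"}
    return [
        (job["jid"], job["name"])
        for job in overview.get("jobs", [])
        if job.get("state") in active
    ]
```

Calling `running_jobs(fetch_overview("http://localhost:8081"))` and getting back more than one entry would confirm the duplicate job described in the issue, and shows which job ID belongs to which submission.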
Hey,
We're hitting an issue where a job is executed multiple times unintentionally, even though a comment in the reconciler code describes this as an exceptional situation. (https://github.com/spotify/flink-on-k8s-operator/blob/v0.4.0-beta.7/controllers/flinkcluster/flinkcluster_reconciler.go#:~:text=//%20This%20is%20an%20exceptional%20situation.)
The scenario:
1 - A new job version introduced a logic bug in the job code. This version was deployed to the Flink cluster, causing some of the TaskManagers to crash with an exception after a few seconds of runtime. The job then entered a restart loop, repeatedly trying to re-run and crashing after a few seconds.
2 - The bug was identified and fixed, and we wanted to update the running job with a new JAR containing the fix.
Expected: only one job runs, without errors.
Actual: two jobs are up.
After that, when we tried to cancel the unexpected job, the Flink cluster was cancelled as well.
Thanks,
Gil