SlidingInvisibilityTimeout behavior #209

AntonZiminSaritasa · 2021-09-23T21:55:46Z

The SQL Server package supports SlidingInvisibilityTimeout property:
https://www.hangfire.io/blog/2017/06/16/hangfire-1.6.14.html

Can I simulate a similar config using Hangfire.PostgreSql?

My config:

InvisibilityTimeout = TimeSpan.FromDays(1)

Some jobs take more than 30 minutes to complete, that's why I need the large InvisibilityTimeout value. One worker crashes and the job stays in the queue for many hours.

Ideally, some watchdog should update the jobs with "Processing" state if the server is dead (ServerId does not exist in hangfire.server table anymore).

https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlConnection.cs#L401
https://github.com/frankhommers/Hangfire.PostgreSql/blob/master/src/Hangfire.PostgreSql/PostgreSqlConnection.cs#L427

The text was updated successfully, but these errors were encountered:

frankhommers · 2022-02-11T23:12:19Z

@azygis did you make this happen in the last PRs or am I wrong?

azygis · 2022-02-12T05:18:51Z

Nope, none of the timeouts in there. Though I'd rather use transactions and obsolete these altogether.

Frazi1 · 2022-02-13T22:20:09Z

I'm willing to work on this, but don't have much time. I'll see if I can make something next weekend.

However, I'm still not sure how to better approach this.

the simplest option is to have a watchdog monitor dead servers and requeue jobs (that's what @AntonZiminSaritasa suggested above). On my previous job, we had this solution working for a while in production without any noticable trouble.
But Hangfire.SqlServer doesn't use a watchdog. Instead, they have a dedicated timer per job which periodically updates invisibility timeout on the job to keep it "alive".

This approach might be a little more difficult to implement, but it's more granural and might be a little more performant because it won't require scanning through the database to find stuck jobs. I don't have any numbers though.

My personal preference is to follow SqlServer's approach and not reinvent the wheel. Do you guys have any thoughts on this?

Nope, none of the timeouts in there. Though I'd rather use transactions and obsolete these altogether.

@azygis
How do transactions help with long running jobs?

AntonZiminSaritasa · 2022-02-14T05:50:44Z

Personally, I migrated my Hangfire DB to SQL Server, it works fine. I prefer Option 2.

azygis · 2022-02-14T06:03:42Z

@Frazi1
Sorry, I did not see yet how the SqlServer provider does this. Had a quick google search when I saw this issue, their 1.5 release notes mention that they're now using transactions. Since the version is that old, they might have changed the implementation substantially.

All in all, my previous message can be translated as "I'd prefer to do it the way Hangfire.SqlServer does". As you say, not reinvent the wheel.

timpikelmg · 2024-04-13T11:37:03Z

Hi all, I had a look at this and for the use cases we want to use it for (which will be a mixture of short and long running jobs) we would really like to have a better solution that the current invisibility timeouts.

To that end, I have ported the sliding invisibility timeouts from SQL Server across as the first step in this PR #353.

Sliding invisibility is useful especially for very long running jobs and also good to have as an option since apparently the transaction approach, which holds a transaction open for long periods of time, has been seen to cause issues with backups on SQL Server (see #864) so makes sense to have both options here as well.

I'm still getting up to speed on the internals of Hangfire and storage libraries so hopefully the code is suitable. It's turned off by default.

I'm hoping I can find some time soon to see about porting the transaction approach across as well.

azygis · 2024-04-14T08:30:06Z

I might have some time to see this today or tomorrow.

IIRC SQL Server has sp_applock or something similar equivalent of which is not in PostgreSQL. Maybe advisory locks, but these were discussed in this repo somewhere a while back as being somewhat negative for this package. Might be worth reinvestigating.

timpikelmg · 2024-04-14T12:53:52Z

I might have some time to see this today or tomorrow.

Appreciate any time you can spend having a look. We are still building out our distributed job management service internally and not in production yet but I had some time free so decided to give this a look (and this was a very useful learning exercise digging into the libraries).

IIRC SQL Server has sp_applock or something similar equivalent of which is not in PostgreSQL. Maybe advisory locks, but these were discussed in this repo somewhere a while back as being somewhat negative for this package. Might be worth reinvestigating.

For the transaction dequeuing it basically just holds the transaction open (and the row lock) until the job is finished. This works the same way in Postgres since the rowlock is per transaction. I don't know of any way to do a row lock at a session level, not at least out of the box.
I saw some discussion around advisory locks for distributed locks, but I believe that is separate functionality, this doesn't use distributed locking at all (unless I missed something else in the code).

This draft PR is it ported from SQL Server storage to Postgres. It still needs some tests written/ported for dequeuing and more testing but shows how it would work - timpikelmg#1

robjakedorsett · 2024-06-13T19:47:20Z

Hey,
Hangfire SQL Server now doesn't use InvisibilityTimeout, are there plans to follow suit?

azygis · 2024-06-13T20:06:25Z

@timpikelmg sorry, life's happening quite a lot therefore there's really little personal time for reviewing stuff. Not going to make any specific promises (yeah, "today or tomorrow" hadn't quite happened yet).

@robjakedorsett not at this point in time, no, unless someone is willing to create a PR for it.

timpikelmg · 2024-06-16T23:08:35Z

Hey, Hangfire SQL Server now doesn't use InvisibilityTimeout, are there plans to follow suit?

The end goal of these PRs (in my view) is to enable turning off InvisibilityTimeout eventually since this will replace it. I wouldn't expect that for a while, not until sliding invisibility timeout has been the default for a release or two.

…ibility-timeouts #209 - Add Support for Sliding Invisibility Timeouts

azygis · 2024-06-27T07:16:05Z

Implemented with #353.

Frazi1 mentioned this issue Feb 14, 2022

Are there plans to support transactions in the future? #241

Closed

timpikelmg mentioned this issue Apr 13, 2024

#209 - Add Support for Sliding Invisibility Timeouts #353

Merged

timpikelmg mentioned this issue Apr 14, 2024

#209 - Use Transactional row locks for job timeouts / Instant re-queue for background jobs even on ungraceful shutdown timpikelmg/Hangfire.PostgreSql#1

Draft

KorsG mentioned this issue May 14, 2024

Add Support for Sliding Invisibility Timeouts raisedapp/Hangfire.Storage.SQLite#75

Open

azygis added a commit that referenced this issue Jun 27, 2024

Merge pull request #353 from timpikelmg/feature/209-add-sliding-invis…

942fccb

…ibility-timeouts #209 - Add Support for Sliding Invisibility Timeouts

azygis closed this as completed Jun 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

SlidingInvisibilityTimeout behavior #209

SlidingInvisibilityTimeout behavior #209

AntonZiminSaritasa commented Sep 23, 2021 •

edited

Loading

frankhommers commented Feb 11, 2022

azygis commented Feb 12, 2022

Frazi1 commented Feb 13, 2022

AntonZiminSaritasa commented Feb 14, 2022 •

edited

Loading

azygis commented Feb 14, 2022

timpikelmg commented Apr 13, 2024

azygis commented Apr 14, 2024

timpikelmg commented Apr 14, 2024

robjakedorsett commented Jun 13, 2024

azygis commented Jun 13, 2024

timpikelmg commented Jun 16, 2024

azygis commented Jun 27, 2024

SlidingInvisibilityTimeout behavior #209

SlidingInvisibilityTimeout behavior #209

Comments

AntonZiminSaritasa commented Sep 23, 2021 • edited Loading

frankhommers commented Feb 11, 2022

azygis commented Feb 12, 2022

Frazi1 commented Feb 13, 2022

AntonZiminSaritasa commented Feb 14, 2022 • edited Loading

azygis commented Feb 14, 2022

timpikelmg commented Apr 13, 2024

azygis commented Apr 14, 2024

timpikelmg commented Apr 14, 2024

robjakedorsett commented Jun 13, 2024

azygis commented Jun 13, 2024

timpikelmg commented Jun 16, 2024

azygis commented Jun 27, 2024

AntonZiminSaritasa commented Sep 23, 2021 •

edited

Loading

AntonZiminSaritasa commented Feb 14, 2022 •

edited

Loading