Asynchronous store protocol #3672
`store`, or more specifically the `massStoreRun` API call, blocks until the result of the store is returned to the client. Processing a store action takes a non-trivial amount of time on the server side (and the operation is executed on only one thread!), so returning from `massStoreRun` itself takes a non-trivial amount of time. The problem surfaces when the connection between the client and the server coughs, chokes, or otherwise misbehaves, because only the networking stack in the kernel is keeping the door open for the reply to arrive. A disappearing client is no problem from the server's side and data won't be lost, but CI jobs can hang indefinitely, and scripts that expect data to be available for a `cmd` query after `store` returns will break.
The proposal is to replace blocking that relies on externalia like "the TCP stack" with a softer, but more local, blocking mechanism, while also making the API itself asynchronous. This proposal is backwards compatible.
Database changes
We already have information in `RunLock` about which runs are undergoing a store. However, this is not enough: we need to keep some semi-temporary information about store "attempts" or "sessions". This could go into its own table, per product, because it needs to be kept for a time even after the run lock is released. The table would contain the run name, a unique session token/identifier, and a status flag. The identifier might be auto-incremented, or a hash of the time when the lock was initialised; it is not a "secret" resource.
These identifiers should be garbage collected in the usual process.
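To make the table idea concrete, here is a minimal sketch using Python's standard `sqlite3` module. Everything named here is an assumption for illustration only: the `store_sessions` table, its columns, the status strings, the random hex token, and the helper functions are hypothetical, and a real implementation would presumably go through the project's existing ORM and migration tooling instead.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: one row per store "session".
SCHEMA = """
CREATE TABLE IF NOT EXISTS store_sessions (
    token      TEXT PRIMARY KEY,
    run_name   TEXT NOT NULL,
    status     TEXT NOT NULL,
    created_at REAL NOT NULL
)
"""


def open_db(path=":memory:"):
    """Open the (per-product) database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def create_session(conn, run_name, now=None):
    """Register a new store session and return its token."""
    token = uuid.uuid4().hex  # not a secret, only needs to be unique
    created = time.time() if now is None else now
    conn.execute(
        "INSERT INTO store_sessions VALUES (?, ?, ?, ?)",
        (token, run_name, "in_progress", created),
    )
    conn.commit()
    return token


def set_status(conn, token, status):
    """Update the status flag of a session."""
    conn.execute(
        "UPDATE store_sessions SET status = ? WHERE token = ?", (status, token)
    )
    conn.commit()


def get_status(conn, token):
    """Return the status of a session, or None if it is unknown."""
    row = conn.execute(
        "SELECT status FROM store_sessions WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None


def collect_garbage(conn, max_age_seconds, now=None):
    """Delete finished sessions older than max_age_seconds; return the count."""
    current = time.time() if now is None else now
    cursor = conn.execute(
        "DELETE FROM store_sessions "
        "WHERE status != 'in_progress' AND created_at < ?",
        (current - max_age_seconds,),
    )
    conn.commit()
    return cursor.rowcount
```

In this sketch, sessions that are still in progress are never collected, so a row outlives the run lock until its result has been recorded and it has aged out.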
CLI changes
There are no changes needed on the CLI. Optionally, the `store` command might be extended with a `--no-block` argument which makes it exit immediately and return to the shell once the server has started processing the data, in case the user does not care when the operation finishes.
API changes
A new endpoint, hereby referred to as `massStoreRunAsync`, shall be created. This function should return the aforementioned "store session token", or throw. Semantically, it should return once the server can confirm that processing of the results can reasonably continue, that is, after cheap early checks (permissions, whether the data is validly encoded before unpacking it, etc.) have been performed.
To query whether the store operation has succeeded, a new function should be added which returns status information (from the database) about the store. The information needed here is malleable, but it should include at least a boolean: "Is the operation still in progress?". (Consuming a successful result might also remove the related information from the database, to reduce garbage collection time at startup.)
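The intended semantics of the two calls could be sketched roughly as follows. This is an in-memory illustration, not the server's actual code: the `StoreService` class, the `get_store_status` name, the shape of the status dictionary, the use of a background thread, and the assumption that the payload arrives base64-encoded are all hypothetical. A real implementation would keep the session state in the database rather than in a dictionary.

```python
import base64
import binascii
import threading
import uuid


class StoreService:
    """Hypothetical server-side handler for the asynchronous store."""

    def __init__(self, process_payload):
        # process_payload stands in for the slow, existing store logic.
        self._process = process_payload
        self._sessions = {}
        self._lock = threading.Lock()

    def mass_store_run_async(self, run_name, b64_payload, has_permission=True):
        """Run the cheap early checks, then return a session token at once."""
        if not has_permission:
            raise PermissionError("no permission to store to this product")
        try:
            payload = base64.b64decode(b64_payload, validate=True)
        except (binascii.Error, ValueError) as err:
            raise ValueError("payload is not validly encoded") from err

        token = uuid.uuid4().hex
        with self._lock:
            self._sessions[token] = {
                "run_name": run_name,
                "in_progress": True,
                "error": None,
            }
        # The expensive part happens after the call has already returned.
        threading.Thread(
            target=self._work, args=(token, payload), daemon=True
        ).start()
        return token

    def _work(self, token, payload):
        error = None
        try:
            self._process(payload)
        except Exception as exc:  # record the failure for the status query
            error = str(exc)
        with self._lock:
            self._sessions[token]["in_progress"] = False
            self._sessions[token]["error"] = error

    def get_store_status(self, token):
        """Return a copy of the status record; raises KeyError if unknown."""
        with self._lock:
            return dict(self._sessions[token])
```

The point of the split is that the first call only ever waits for the cheap checks, so a flaky connection during the long-running part can no longer leave the client hanging on a reply.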
Implementation changes
The `store` command should, once it has received the token from the server, close the connection and use the token to poll the server periodically for the status of the operation. Deciding on a good interval could be tough, but trivial choices like "every 10 sec" or "every 30 sec" should be fine for a prototype. As far as I gathered, we already count the reports during store (which is weird!), and if this information is available, the initial wait time and the requery interval could be estimated from it.
In between queries, the `store` binary should sleep using OS primitives for sleeping a process, without relying on the network stack. Every query is its own connection, like `cmd ...`.
Obsoletes #4039.
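On the client side, the polling behaviour described in this section might look like the sketch below. The `wait_for_store` function, its parameters, and the `in_progress` key of the status dictionary are assumptions made for illustration. `query_status` stands for whatever call ends up opening a fresh connection and asking the server about the token.

```python
import time


def wait_for_store(token, query_status, interval=10.0, timeout=None,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll the server until the store session identified by token is done.

    query_status(token) is assumed to open its own short-lived connection
    and return a dict with at least an "in_progress" boolean.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = query_status(token)
        if not status["in_progress"]:
            return status
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"store session {token} is still in progress")
        # Plain process sleep: no socket is held open between queries.
        sleep(interval)
```

A `--no-block` mode would simply skip this loop and exit right after the token has been received. The `sleep` and `clock` parameters exist only so that the loop can be exercised without real waiting.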