Working out the best way to do request/reply for distributing tasks
So this turned out to be more than just nats and python by introducing Rethinkdb.
- Requests for new workloads go into the database
- A worker continuously polls the distributor for new tasks
- For every poll the distributor takes a new task in the
ready
state - The distributor marks the task as
active
and forwards it to the worker - The worker processes the task
- if the task fails it gets marked as
ready
and attempts increases by 1 and the worker request a new job - if the task succeeds it gets marked as
completed
and the worker requests a new job - When a task has been attempted X amount of times it gets marked as
failed
This setup works as microservice setup where every component can be scaled horizontally, meaning Rethinkdb, NATS, the Distributor and the Worker can all have multiple instances running simultaneously.
The Distributor listens for polls on a loadbalanced queue, a feature of NATS that works without configuration.
- python 3.8+
- docker-compose
pip install poetry
poetry install
This starts NATS and Rethinkdb 2.4.4
docker compose build
docker compose up -d rethinkdb nats
Put some fake tasks in the database
make generate
Run the worker and distributor
make distributor
make worker
You can open the rethinkdb webinterface with http://localhost:8080
In the data explorer try the query
r.db("work").table("tasks").changes()
and press the run
button to watch work happening in realtime