Investigate alternatives for task queue #127
Reference
clevermicro/amq-adapter-python#127
Currently the Python adapter provides a way to send messages over RabbitMQ. However, this is not enough for a task queue, which comes with extra guarantees and requirements for distributing tasks (units of work that need processing) rather than general messages.
The current options are:
Celery
This comment contains all the findings related to Celery, including a list of TODOs for making the current cleverswarm use it.
Architecture overview
Celery requires the following components:
Frontend
For cleverswarm, the frontend is relatively simple.
Assuming the backend module has a method annotated with @app.task, the frontend can import this method and call its delay method to enqueue a task. If importing the backend module is not ideal, there are other ways to enqueue the task from the frontend.
Backend
Celery is pretty invasive for the backend side.
First, one must run celery -A module_name worker -l info to bring up the workers. This starts a Celery worker: a set of processes that connect to the broker, listen for incoming messages, and call the registered methods (those annotated with @app.task). There are several problems:
Component initialization
To initialize each worker process, for example to initialize the Python library for backpressure reporting, we need to define a method like this:
Backpressure syncing
This is the biggest problem with Celery. A Celery worker relies on processes (i.e. fork), not threads. This means the backpressure service inside each task cannot be shared across processes. For the cleverswarm backend, it means we will have multiple processes reporting backpressure for the same pod, confusing the management service.
There is a way to create a shared counter. Using bootsteps, we can register a Value (from the multiprocessing module) with the worker. When the worker forks into multiple processes, they still share this value and update it under a lock to keep it consistent. For reporting, multiple processes can report for the same pod as long as they agree on the value.
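The shared-counter idea can be sketched with the standard library alone (the Celery bootstep wiring is omitted, and the names are illustrative):

```python
import multiprocessing

def report_busy(counter, delta):
    # Each worker process updates the shared value under its lock, so
    # concurrent updates from sibling processes stay consistent.
    with counter.get_lock():
        counter.value += delta

def run_demo(workers=4):
    # "i" = signed int; the Value lives in shared memory, so child
    # processes created afterwards all see the same counter.
    busy_tasks = multiprocessing.Value("i", 0)
    procs = [
        multiprocessing.Process(target=report_busy, args=(busy_tasks, 1))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return busy_tasks.value  # every process updated the same counter

if __name__ == "__main__":
    print(run_demo())  # → 4
```

In the Celery case, a bootstep would create the Value before the worker forks and hand it to each task, so all processes in one pod report one agreed-upon number.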
However, even with this multiprocessing.Value solution, it still doesn't fit into the existing backpressure-management code in the Python library, so we will likely need to write a dedicated library specifically for Celery.
Hatchet
Hatchet is an execution engine. Compared to Celery, Hatchet is more like a client-side library (though you also need a Hatchet server). However, Hatchet lets you bring it up at your own pace and under your own control.
It looks promising and provides stages, so progress can be tracked, but it still requires a lot of changes to cleverswarm.
Hatchet uses asyncio, which is much, much better than the forked processes Celery uses.
Hatchet also supports a concurrency limit, which caps how many tasks a worker can pull at once.