Nextgrid 4y ago

Any chance the entire thing can be offloaded to a task queue (Celery, etc.)? This would decouple the HTTP request processing from the actual ML task.

The memory errors you're seeing suggest that you may not be able to run multiple instances of the model at once, and even if you could, it might not give you more performance than processing requests sequentially.

Ultimately, it seems your current design can't gracefully handle too many concurrent requests, whether legitimate or malicious. I recommend you address this regardless of whether you manage to ban the malicious users.
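
One way to make overload graceful is to bound the queue and reject new work once it is full, rather than letting requests pile up until memory runs out. A rough sketch, assuming a hypothetical `try_enqueue` helper and an arbitrary limit of 100 pending jobs; in an HTTP API the rejection would typically map to a 429 or 503 response:

```python
import queue

MAX_PENDING = 100  # assumed limit; tune to what the GPU worker can drain

# Bounded queue: put_nowait() raises queue.Full once MAX_PENDING jobs wait.
pending = queue.Queue(maxsize=MAX_PENDING)


def try_enqueue(job):
    """Return True if the job was accepted, False if the queue is full."""
    try:
        pending.put_nowait(job)
        return True
    except queue.Full:
        return False
```

With this in place, a burst of requests costs a bounded amount of memory, and the excess is turned away quickly instead of starving the model.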

pjgalbraith 4y ago

Yeah this is the way.

@headlessvictim2 search for "Asynchronous Request-Reply pattern" if you want more information about this kind of architecture. You will remove any bottleneck from the API server and can easily scale out from the task queue.

headlessvictim2 4y ago

Thanks for the suggestion.

How would this work with GPU-bound machine learning models?

The model processing takes more than 30 seconds. Wouldn't it still be the bottleneck?

pjgalbraith 4y ago

You would still have the same bottleneck, but the API request would return straight away with some sort of correlation ID. The workers that handle the GPU-bound tasks would then pull jobs when they are ready. If you get a lot of jobs, all that happens is that the queue fills up, and clients wait longer and hit the status endpoint a few more times.
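
A minimal in-process sketch of that flow, using only the Python standard library. The names `submit_job`, `get_status`, and `run_model` are hypothetical, and the in-memory `queue.Queue` stands in for a real broker such as Celery; a real deployment would expose `submit_job` and `get_status` as HTTP endpoints.

```python
import queue
import threading
import uuid

jobs = queue.Queue()           # pending work, consumed by the GPU worker
status = {}                    # correlation ID -> {"state": ..., "result": ...}
status_lock = threading.Lock()


def run_model(payload):
    """Placeholder for the slow GPU-bound inference call (hypothetical)."""
    return payload.upper()


def submit_job(payload):
    """What the API handler would do: enqueue and return immediately."""
    job_id = str(uuid.uuid4())
    with status_lock:
        status[job_id] = {"state": "queued", "result": None}
    jobs.put((job_id, payload))
    return job_id


def get_status(job_id):
    """What the status endpoint would return when the client polls."""
    with status_lock:
        return dict(status[job_id])


def worker():
    """Pull one job at a time, so the model never runs concurrently."""
    while True:
        job_id, payload = jobs.get()
        with status_lock:
            status[job_id]["state"] = "running"
        result = run_model(payload)
        with status_lock:
            status[job_id] = {"state": "done", "result": result}
        jobs.task_done()


# A single daemon thread plays the role of the GPU worker process.
threading.Thread(target=worker, daemon=True).start()
```

The client calls `submit_job`, keeps the returned ID, and polls `get_status` until the state is `"done"`. The 30-second model run still happens, but it no longer ties up a request handler.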

Here is an example of what it could look like: https://docs.microsoft.com/en-us/azure/architecture/patterns...
