Understanding Celery's Prefetch Mechanism


To be honest, Python offers fewer good options and libraries for building distributed systems than Go or Node.js. Today, we will look at a problem with Celery, the popular and de facto default choice for task queues in Python.

You really need to understand distributed systems before using Celery; otherwise, you will run into serious problems, because Celery's default settings are not very good, in my opinion. Why focus on the defaults? Because most projects ship with them unchanged.

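  1. Celery's prefetching can cause local congestion. Each worker reserves a batch of tasks from the broker in advance (by default, worker_prefetch_multiplier is 4, so up to 4 tasks per concurrent process), and a reserved task cannot be picked up by any other worker. A 1-second task might therefore wait behind a 1-hour task on the same worker before it gets a chance to run, even though the two tasks are unrelated and many other Celery workers are idle.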

So, if you see some tasks taking a very long time to be picked up by the workers while the system load is still very low, prefetching might be the root cause.
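The congestion described above can be reproduced with a small toy model. This is only a sketch in plain Python, not Celery's actual scheduler: the function name start_times, the round-robin hand-out, and the reserve_limit parameter (the number of tasks a worker may hold, counting the one it is running) are all assumptions made for illustration.

```python
import heapq
from collections import deque


def start_times(durations, n_workers, reserve_limit):
    """Toy model (not Celery itself): return the start time of each task.

    Each worker may hold up to `reserve_limit` tasks at once, counting the
    task it is running plus the tasks waiting in its private buffer.
    """
    pending = deque(enumerate(durations))          # shared broker queue (FIFO)
    buffers = [deque() for _ in range(n_workers)]  # reserved but not started
    running = [False] * n_workers
    finishes = []                                  # heap of (finish_time, worker)
    starts = [None] * len(durations)
    now = 0

    while True:
        # The broker hands out tasks round-robin to workers with spare capacity.
        handed_out = True
        while pending and handed_out:
            handed_out = False
            for w in range(n_workers):
                if pending and len(buffers[w]) + running[w] < reserve_limit:
                    buffers[w].append(pending.popleft())
                    handed_out = True

        # An idle worker can only start a task from its OWN buffer.
        for w in range(n_workers):
            if not running[w] and buffers[w]:
                task_id, duration = buffers[w].popleft()
                starts[task_id] = now
                running[w] = True
                heapq.heappush(finishes, (now + duration, w))

        if not finishes:
            return starts

        # Jump to the next task completion.
        now, w = heapq.heappop(finishes)
        running[w] = False


tasks = [3600, 1, 1, 1]  # one 1-hour task followed by three 1-second tasks

print(start_times(tasks, n_workers=2, reserve_limit=2))  # [0, 0, 3600, 1]
print(start_times(tasks, n_workers=2, reserve_limit=1))  # [0, 0, 1, 2]
```

With a reserve limit of 2, the third task is stuck in the buffer of the worker running the 1-hour task and starts only at t=3600, while the other worker sits idle from t=2. With a limit of 1, every short task starts within two seconds.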

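How to fix: Turn prefetching down as far as it will go! Set worker_prefetch_multiplier to 1 (setting it to 0 does not disable prefetching; it removes the limit) and enable task_acks_late, so that each worker process reserves only the task it is currently running.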
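As a minimal sketch, the fix could look like this in a Celery configuration module. The file name celeryconfig.py is only an assumption; the two setting names are Celery's own, and the values should be adapted to your project.

```python
# celeryconfig.py (assumed module name); load it with
# app.config_from_object("celeryconfig") on your Celery app.

# Reserve at most one message per worker process (the default is 4).
# Note: 0 does not switch prefetching off, it removes the limit.
worker_prefetch_multiplier = 1

# Acknowledge a message only after its task has finished. With early
# acknowledgement (the default), a process running one task could still
# hold one more reserved task even with a multiplier of 1.
task_acks_late = True
```

Keep in mind that with late acknowledgement a task may be executed again if the worker dies in the middle of it, so this setup is best suited to tasks that are idempotent.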

hai nguyen's blog

Experienced software engineer with a strong problem-solving ability and a passion for building software solutions. Specializing in web development and data engineering.