For some time now, we have been working on a custom app that runs scheduled daily and monthly tasks, runs for long periods, and processes numerous transactions. Development has constantly been hit by recurring issues: MariaDB crashing unexpectedly, “Request Timed Out” errors, and even outright server failures. We try to fix them, but they keep resurfacing, ultimately because the server runs out of memory quite quickly.
I did some digging and found that memory keeps building up in Redis whether or not tasks are run, which is quite unusual.
I also noticed that frappe.utils.background_jobs.enqueue(), which calls redis-queue’s enqueue_call(), doesn’t pass the result_ttl argument, so it falls back to the default of 500 seconds. This means the results of all jobs executed through frappe.enqueue() are retained in Redis (in memory) for about 8 minutes.
I added the argument and set it to 5 seconds. My tasks now run with very little dent on memory and complete as expected, and the system has been fine for about two days now.
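For anyone who wants to see the shape of the change, here is a rough sketch, not the actual Frappe code. The helper name enqueue_with_short_ttl is made up for illustration, and it assumes `queue` is an rq.Queue (or anything exposing the same enqueue_call() method). The only point is that result_ttl is passed explicitly instead of being left at the 500-second default.

```python
# Hypothetical helper for illustration; this is NOT part of Frappe or RQ.
DEFAULT_RESULT_TTL = 5  # seconds; the value I tried, tune it for your workload


def enqueue_with_short_ttl(queue, func, result_ttl=DEFAULT_RESULT_TTL, **job_kwargs):
    """Enqueue `func` on an rq.Queue-like object with an explicit result_ttl.

    `queue` is assumed to expose RQ's enqueue_call(func=..., kwargs=..., result_ttl=...).
    `job_kwargs` are forwarded to the job function as keyword arguments.
    """
    return queue.enqueue_call(
        func=func,
        kwargs=job_kwargs,
        result_ttl=result_ttl,  # how long Redis keeps the job result, in seconds
    )
```

With a real queue this would be called along the lines of enqueue_with_short_ttl(q, my_task, docname="X"), where my_task and docname are placeholders.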
So, I guess my questions are:
1. Will adding and altering this argument in the frappe.enqueue() function have any drastic effect on the system (maybe something I am not aware of)?
2. If “No” to question 1, can it be exposed as an argument of the enqueue function for development purposes and defaulted to a reasonable minimum value?
I understand, but I don’t think this is safe… I can afford to put a maxmemory option on the cache, which simply means “don’t cache if it’s full”, but not on the queue (all my requests should be queued or else they’ll be thrown away)… I think that’s the reason the Frappe team left it out in the first place.
If you’ve already made the change and haven’t found any issues, maybe send a PR. There’s a better chance the people who can answer the two questions you posed will respond. On the surface, there shouldn’t be an issue.
However, it seems strange that the default 500s is too high. You must be running a lot of jobs! Or you’re running a server without much memory.
With no data to back this up whatsoever, I think 120s could be a good baseline.
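To put rough numbers on the TTL values mentioned in this thread: in steady state, the number of job results sitting in Redis is approximately the enqueue rate multiplied by result_ttl. The sketch below is only back-of-the-envelope arithmetic; the rate of 2 jobs per second is a made-up figure, not a measurement from anyone's server.

```python
def retained_results(jobs_per_second, result_ttl_seconds):
    """Approximate number of job results held in Redis at steady state."""
    return jobs_per_second * result_ttl_seconds


HYPOTHETICAL_RATE = 2  # jobs per second; an assumption for illustration only

for ttl in (500, 120, 5):
    count = retained_results(HYPOTHETICAL_RATE, ttl)
    print(f"result_ttl={ttl}s -> about {count} results retained")
```

How much memory that means depends on the size of each stored result, which this sketch does not try to guess.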