I have an ERPNext 15.2 setup running on Docker with a custom image using Supervisor, and another running on Kubernetes with the official image frappe/erpnext:15.3.8. Both setups have similar VM configurations, and each uses 8 workers. When I run scheduled jobs on the long queue in Docker, they execute quickly, but in Kubernetes they run slowly, even with similar data.
I'm assuming the same MariaDB service is serving both the Docker and Kubernetes environments. Both need to use the same MariaDB to rule out the database as the bottleneck.
Kubernetes uses NFS volumes for shared sites. Anything that needs to read or write files on the sites (NFS) mount will be slower than on a volume located on the same VM.
- Use the fastest NAS storage possible.
- Design the app so there is no disk usage; do the processing in Redis, the DB, or pod/container memory.
Are you processing files in the background jobs?
Thanks @revant_one for the response.
No, we don't process any files in background jobs; we just use SELECT queries to fetch data from the database.
How is the DB set up for both environments?
The DB is running on a VM with a 24-core CPU and 128 GB of RAM.
An external SSD is also attached for storage.
One strange thing I've noticed: when we run approximately 100 scheduled jobs in a queue, the first few batches (2-3) finish very fast, taking around 30 minutes each. However, the completion time gradually increases, first to 35 minutes, and then continues to rise with subsequent batches (8 workers per batch).
This behavior occurs in both Kubernetes and Docker.
Is there a specific reason for this?
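One way to pin down the slowdown described above is to time each batch explicitly, so the trend is measured rather than eyeballed. This is a generic sketch (the job callables and batch size of 8 are assumptions), not Frappe's own scheduler code:

```python
import time

def run_batches(jobs, batch_size=8):
    """Run jobs in fixed-size batches and record each batch's wall-clock
    duration; a gradual slowdown shows up as a growing trend in the list."""
    durations = []
    for i in range(0, len(jobs), batch_size):
        start = time.perf_counter()
        for job in jobs[i:i + batch_size]:
            job()
        durations.append(time.perf_counter() - start)
    return durations
```

If `durations[-1]` is consistently larger than `durations[0]` on both platforms, that points to accumulating state (growing tables, queue depth, memory pressure) rather than to Docker or Kubernetes itself.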
- Do not scale the scheduler: FAQ | ERPNext Helm Chart
- Do not scale socketio: 400 Errors with Multiple socketio replicas - affinity? · Issue #229 · frappe/helm · GitHub
Configure Sentry, OpenTelemetry, or any other trace monitoring and try to understand the queries made by the application. Opentelemetry with Frappe framework? - #5 by revant_one
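If setting up full tracing takes time, a poor man's version of the same idea can be had with a timing decorator around suspect functions. This is only an illustrative stand-in for real Sentry/OpenTelemetry spans; the `fetch_report` function is a made-up example:

```python
import functools
import time

def traced(func):
    """Minimal stand-in for a tracing span: logs how long each call took.
    Real tracing (OpenTelemetry/Sentry) would also capture the DB queries
    issued inside the call, which is what you want to compare across
    the Docker and Kubernetes environments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.3f}s")
    return wrapper

@traced
def fetch_report():
    time.sleep(0.01)  # placeholder for the real SELECT-heavy job
    return "ok"

fetch_report()
```

Comparing per-call timings from both environments for the same job is usually enough to tell whether the extra latency is in the DB round-trips or elsewhere.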