How to deploy ERPNext as a Multi-Module on Kubernetes

Khalid · June 13, 2020, 12:32pm

We are running an E-Learning system built on ERPNext.
The solution is developed completely with ERPNext (CMS website, ERP, E-learning system, vendors portal, registrations and video streaming).
Heavy requests (like video streaming, student exams, browsing the CMS, or even applying to the system) use more resources and affect other operations and eventually the whole system, so scaling is needed.

The question is:
How do we deploy ERPNext as a multi-module system on Kubernetes, and would it help? Also, what is the recommended hardware specification for 4000 users with high availability?

vrms · June 13, 2020, 2:36pm

Ever tried the forum search? I guess this may bring up quite a few interesting topics: https://discuss.frappe.io/search?q=kubernetes

revant_one · June 13, 2020, 2:48pm

Do you need multiple custom apps?

Build a layered image containing all your Frappe Framework apps. Refer to the custom apps section to build your own image.
That image will have all your modules/apps and can be used instead of the vanilla image when deploying on Kubernetes.

Scaling ERPNext for Frappe Framework apps:

Whatever communicates as HTTP request/response is relayed to the gunicorn web server. Most things happening here are synchronous unless the code uses frappe.utils.background_jobs.enqueue(*args, **kwargs).
If all your code is synchronous, you need more gunicorn workers. You can add as many as you wish.
If the code is async, the appropriate background worker will handle it. In that case you need more background workers, e.g. if Data Import is heavy, you need the default-worker scaled.
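
To make the sync/async split concrete, here is a minimal stdlib-only sketch. It is an analogy, not Frappe's implementation: `queue.Queue` stands in for the background job queue, a thread stands in for a background worker, and `heavy_import`, `handle_request_sync` and `handle_request_async` are hypothetical names. In a real app the hand-off would go through frappe.utils.background_jobs.enqueue.

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for the background job queue (assumption)
results = []           # records work finished by the background worker

def heavy_import(rows):
    # Placeholder for slow work such as a large data import.
    return sum(rows)

def handle_request_sync(rows):
    # Synchronous path: the web worker stays busy until the work is done.
    return {"status": "done", "total": heavy_import(rows)}

def handle_request_async(rows):
    # Async path: the web worker only queues the job and replies at once.
    jobs.put(rows)
    return {"status": "queued"}

def background_worker():
    # Drains the queue; running more of these scales the async path.
    while True:
        rows = jobs.get()
        if rows is None:          # sentinel value: stop the worker
            jobs.task_done()
            break
        results.append(heavy_import(rows))
        jobs.task_done()

worker = threading.Thread(target=background_worker, daemon=True)
worker.start()

sync_reply = handle_request_sync([1, 2, 3])
async_reply = handle_request_async([4, 5, 6])

jobs.join()      # wait until the queued job has been processed
jobs.put(None)   # ask the worker to stop
worker.join()
```

In the sync path the bottleneck is the number of web workers; in the async path it is the number of workers draining the queue, which is why the two cases scale differently.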

Scaling ERPNext for communicating with non frappe apps:

We’ve faced problems with microservices after scaling gunicorn workers.
- If an async pattern like an event loop or coroutines is used and the microservice is programmed to retry the REST call after failure, it will cause unpredictable behavior.
- Microservice makes a request > gunicorn accepts, validates and then times out > app retries > another gunicorn worker serves this retry (this happens in milliseconds).
- It passes validations twice this way. It delivered the same Serial No twice, on non-customized ERPNext!
To solve this we used async Data Import + REST API instead of sync CRUD + REST API.
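
The retry race above can be modelled in a few lines of plain Python. This is a simplified sketch, not ERPNext code: `validate`, `deliver` and the serial number "SN-001" are hypothetical, and it assumes the queued approach helps because a single worker handles the queued requests one after another, which is only one plausible reading of why the switch worked.

```python
from collections import deque

def validate(stock, serial_no):
    # Validation passes only while the serial number is still in stock.
    return serial_no in stock

def deliver(stock, deliveries, serial_no):
    # Commit step: remove the serial number from stock and record the delivery.
    stock.discard(serial_no)
    deliveries.append(serial_no)

# Sync path: the original request and its retry hit two different web workers.
stock, sync_deliveries = {"SN-001"}, []
first_ok = validate(stock, "SN-001")   # worker 1 validates, then times out
retry_ok = validate(stock, "SN-001")   # worker 2 validates the retry before worker 1 commits
if first_ok:
    deliver(stock, sync_deliveries, "SN-001")
if retry_ok:
    deliver(stock, sync_deliveries, "SN-001")

# Queued path: both requests are queued and one worker processes them in order,
# so each validation sees the result of the previous commit.
stock, queued_deliveries = {"SN-001"}, []
pending = deque(["SN-001", "SN-001"])
while pending:
    serial_no = pending.popleft()
    if validate(stock, serial_no):
        deliver(stock, queued_deliveries, serial_no)

print(sync_deliveries)     # the same serial number is delivered twice
print(queued_deliveries)   # the retry fails validation, so one delivery only
```

With several workers draining the same queue the race could reappear, so this holds only as long as the conflicting jobs are serialized.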

(Screenshot of the double-delivered Serial No.)

For scale, things have to be designed case by case in both approaches.

Shameless plug: Get in touch with castlecraft.in for consulting, training and magic!