We are running an e-learning system built on ERPNext.
The whole solution is developed in ERPNext: CMS website, ERP, e-learning system, vendor portal, registrations, and video streaming.
Heavy requests (such as video streaming, student exams, browsing the CMS, or even applying to the system) use a lot of resources, slow down other operations, and eventually affect the whole system, so scaling is needed.
The question is:
How can ERPNext be deployed as a multi-module system on Kubernetes, and would that help? Also, what is the recommended hardware specification for 4000 users with high availability?
Build a layered image containing all your Frappe Framework apps. Refer to the custom apps section to build your own image.
That image will contain all your modules/apps, and it can be used instead of the vanilla image when deploying on Kubernetes.
Scaling ERPNext for Frappe Framework apps:
Anything that communicates as an HTTP request/response is relayed to the gunicorn web server. Most of what happens there is synchronous unless the code uses frappe.utils.background_jobs.enqueue(*args, **kwargs).
If all your code is synchronous, you need more gunicorn workers. You can add as many as you wish.
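As a rough starting point for the worker count, the gunicorn documentation suggests (2 x number of cores) + 1. The helper below is only a sketch of that arithmetic; the function name is made up for illustration, and the right number for your site should come from load testing.

```python
import os


def suggested_gunicorn_workers(cpu_cores=None):
    """Return the (2 x cores) + 1 starting point from the gunicorn docs.

    suggested_gunicorn_workers is a hypothetical helper, not part of
    gunicorn, Frappe, or ERPNext.
    """
    # Fall back to the cores visible to this process when none are given.
    cores = cpu_cores if cpu_cores is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


if __name__ == "__main__":
    print(suggested_gunicorn_workers(4))
```

On Kubernetes the same idea applies per pod: size the workers to the CPU given to each gunicorn pod, then add replicas.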
If the code is async, the appropriate background worker handles it, so you need more background workers. For example, if Data Import is heavy, scale the default worker.
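A minimal sketch of moving heavy work off the gunicorn worker and onto the default queue. The names import_rows and handle_request are hypothetical. The enqueue callable is passed in so the snippet runs without a bench; on a real site it would be frappe.utils.background_jobs.enqueue.

```python
def import_rows(rows):
    """Hypothetical heavy job; stands in for real work such as a bulk import."""
    return len(rows)


def handle_request(rows, enqueue):
    """Queue the heavy work and return right away, freeing the gunicorn worker.

    `enqueue` is assumed to behave like frappe.utils.background_jobs.enqueue:
    it takes the job function, a queue name, and keyword arguments for the job.
    """
    enqueue(import_rows, queue="default", rows=rows)
    return {"status": "queued"}


# On a bench site (assumption, not run here):
#   from frappe.utils.background_jobs import enqueue
#   handle_request(rows, enqueue)
```

With this pattern the request returns quickly, and the load lands on the default worker, which is the component to scale.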
Scaling ERPNext for communicating with non-Frappe apps:
We've faced problems with microservices after scaling gunicorn workers.
If the microservice uses an async pattern such as an event loop or coroutines and is programmed to retry a REST call after a failure, the result is unpredictable behavior.
Microservice makes a request > gunicorn accepts it, validates, and then times out > the microservice retries > another gunicorn worker serves the retry. (This happens within milliseconds.)
The request passes validation twice this way. It delivered the same Serial No twice, on non-customized ERPNext!
To solve this, we used async Data Import + REST API instead of sync CRUD + REST API.
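The sequence above can be reproduced with a toy model. This is not ERPNext code: Stock, validate, and commit are hypothetical stand-ins, and the queued variant assumes the jobs are consumed one at a time by a single worker.

```python
from collections import deque


class Stock:
    """Toy stand-in for serial numbers that are available for delivery."""

    def __init__(self, serials):
        self.available = set(serials)
        self.delivered = []


def validate(stock, serial):
    return serial in stock.available


def commit(stock, serial):
    stock.available.discard(serial)
    stock.delivered.append(serial)


def sync_with_retry(stock, serial):
    """Two workers both validate before either one commits."""
    first_ok = validate(stock, serial)   # worker A validates, then stalls
    retry_ok = validate(stock, serial)   # client retried; worker B validates
    if first_ok:
        commit(stock, serial)
    if retry_ok:
        commit(stock, serial)
    return stock.delivered


def queued(stock, serial):
    """The original request and the retry are queued and processed in order."""
    jobs = deque([serial, serial])
    while jobs:
        job = jobs.popleft()
        if validate(stock, job):
            commit(stock, job)
    return stock.delivered


if __name__ == "__main__":
    print(sync_with_retry(Stock(["SN-001"]), "SN-001"))
    print(queued(Stock(["SN-001"]), "SN-001"))
```

In the synchronous variant the retry passes validation because the first request has not committed yet. In the queued variant the second job sees the committed state and is rejected.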