ERPNext High Availability Reference Architecture

I haven’t seen any reference documentation about ERPNext high availability. Is there any documentation on separating the components (Redis, MariaDB, web app, etc.) onto different nodes?

Please let me know if any of you are running multi-node setups. Thanks.

This reply might be too simplistic, but the options are: a primary/secondary setup for the DB, a load-balanced VM group, or Docker orchestration.

I personally prefer the load-balanced VM option (since I can’t get Docker orchestration to work with my preferred cloud provider). I’ll switch the day @revant writes a manual for Kubernetes on Azure.

I am experimenting with Docker locally, but I still feel it is flaky. I will try Kubernetes. I think MariaDB is the only challenge with respect to container implementation; maybe taking it out will make it more manageable.

It’s actually not a challenge with most providers, but with Azure it is: with Azure you need to change the ERPNext code base a bit, which I wasn’t able to figure out.

Docker itself works wonderfully well on localhost, but when you go for orchestration it’s a bit tricky, and the documentation seems to be outdated. One of the key requirements (as per the docs) has been deprecated and, like I said, I hope @revant_one can find some time to update them.

Kudos to @revant_one, because this is the first time I tried Docker and it works so well for me.

You need to understand how Frappe/ERPNext containers work in case debugging is required.

The container can use any DB host that is mentioned in common_site_config.json or site_config.json.
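
For example, pointing the containers at external services is just a matter of setting the hosts in the site config. A minimal sketch of common_site_config.json (the hostnames and ports below are placeholders, not from any actual deployment):

```json
{
  "db_host": "mariadb.internal.example.com",
  "db_port": 3306,
  "redis_cache": "redis://redis-cache.internal.example.com:6379",
  "redis_queue": "redis://redis-queue.internal.example.com:6379",
  "redis_socketio": "redis://redis-queue.internal.example.com:6379"
}
```

With this in place, the web, worker, and scheduler containers all resolve the DB and Redis endpoints from the shared config rather than expecting them on localhost.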

The frappe_docker/docker-compose.yml used with docker-compose up -d puts all the containers in a single file for ease of use. It is not mandatory to use that docker-compose.yml for advanced cases.
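
For instance, for an external managed database you could drop the bundled mariadb service and point the app containers elsewhere. A rough sketch of an override file — the service name and environment variable names here are assumptions, not the actual frappe_docker contents, so check them against your compose file:

```yaml
# docker-compose.override.yml (sketch): keep the app-facing services,
# point them at an external managed MariaDB instead of a local container.
# Hostname and variable names are placeholders.
version: "3"
services:
  erpnext-python:
    environment:
      - DB_HOST=my-managed-db.example.com
      - DB_PORT=3306
```

The same idea applies to Redis: any service in the single-file compose setup can be replaced by an external endpoint in the site config.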

For AWS Aurora (MySQL) it’s the same case; check this gist. I execute a sed command to replace ERPNext code as part of the job that does the site creation.
After site creation, everything seems to work with the existing code on AWS Aurora.

I’ll update the docs soon. Currently they mention the nfs-server helm chart, which is deprecated.

Instead you need to use its replacement and deploy it manually; the Helm chart listing will redirect you back to the deprecated chart.

Check these resources from the tests: helm/tests at main · frappe/helm · GitHub.

It uses all the non-deprecated services. The tests run using k3d.

Updated the docs:

https://github.com/frappe/helm/pull/80

Awesome. Checking out the new docs

But with the Azure managed db, there is another issue that pops up.

Azure requires the username to be in the format username@hostname in the connection string, while Frappe only uses the username part to create the default databases and for further communication. I think the corrections need to be made in database.py, and then Azure DB for MariaDB could be used seamlessly.

Did you check the create-site.yaml from the gist?

bench new-site $SITE_NAME --no-mariadb-socket --db-name $DB_NAME --db-password $DB_PASSWORD --mariadb-root-username $DB_ROOT_USER --mariadb-root-password $MYSQL_ROOT_PASSWORD --admin-password $ADMIN_PASSWORD --install-app erpnext;

Try substituting the environment variables with a custom username, root user, etc.

Note that --db-name $DB_NAME is also the database username.
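
As an illustration of that substitution (all credential values and hostnames below are made-up placeholders; Azure expects the user@shortname login format), the Azure-style root user can be passed straight in through the environment variables:

```shell
#!/bin/sh
# Sketch: substitute Azure-style credentials into the new-site invocation.
# Every value here is a placeholder. bench itself is not invoked; we only
# build and print the command that the site-creation job would run.
SITE_NAME="erp.example.com"
DB_NAME="erpnextdb"                            # also becomes the DB username
DB_PASSWORD="site-db-secret"
DB_ROOT_USER="rootadmin@dumbhostnameyougave"   # Azure login format: user@shortname
MYSQL_ROOT_PASSWORD="root-secret"
ADMIN_PASSWORD="admin-secret"

echo "bench new-site $SITE_NAME --no-mariadb-socket --db-name $DB_NAME \
--db-password $DB_PASSWORD --mariadb-root-username $DB_ROOT_USER \
--mariadb-root-password $MYSQL_ROOT_PASSWORD --admin-password $ADMIN_PASSWORD \
--install-app erpnext"
```

The point is that only the root login needs the @hostname suffix; the site’s own DB name/user stays a plain identifier.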

I haven’t tried this on AWS, but I did try it with a simple Azure DB. I also hope I have understood your code correctly.

It throws an error, when Frappe or ERPNext tries to create a DB with the same name, saying that a DB with the name xyz@abc cannot be created.

The Azure DB hostname is usually like dumbhostnameyougave.microsoft.azure.something.
The user ID required is username@dumbhostnameyougave.

So the DB connection succeeds, but Frappe proceeds to create a new database with the same name as the username, and this causes an error because of the @. I tried escaping it, but no go.

I’m interested in HA as well, including multiple front-end servers.

Getting DBs to work when not on the local machine does take some manual configuration; you have to be careful with “GRANTS”. We have that all worked out, but we recently had a server crash through a mistake, and would like to have a clustered front end (“multiple runtimes”) so that doesn’t happen again.
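
For the record, the manual grants for a remote site database look roughly like this (the database name is a placeholder, and bench normally does this for you on localhost; note that Frappe uses the DB name as the username too):

```sql
-- Sketch: minimal grants for a site DB reachable from remote app servers.
-- '_abc123' and the password are placeholders. '%' allows any client host;
-- tighten it to the app servers' addresses/subnet in a real deployment.
CREATE DATABASE `_abc123`;
CREATE USER '_abc123'@'%' IDENTIFIED BY 'site-db-secret';
GRANT ALL PRIVILEGES ON `_abc123`.* TO '_abc123'@'%';
FLUSH PRIVILEGES;
```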

== John ==

A general update on the topic:

The updated helm chart has all the processes decoupled into separate pods; even nginx and gunicorn are two separate pods now.

With the updated helm chart you can schedule the different deployments/pods on different node pools using affinity in values.yaml.
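
As a sketch, pinning worker pods to a dedicated node pool could look like the following in values.yaml. The affinity/nodeSelector structure is standard Kubernetes, but the top-level key names and labels here are assumptions that depend on the chart version, so check them against the chart’s actual values:

```yaml
# values.yaml (sketch): schedule web and background pods on separate pools.
# 'nginx'/'worker' key names and the 'workload' label are placeholders.
nginx:
  nodeSelector:
    workload: web
worker:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: workload
                operator: In
                values:
                  - background
```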
