Frappe Docker Single Server or Docker Swarm or Kubernetes? Need Expert guidance

So far we have been using the Frappe Docker single-server setup, with multiple projects that each depend on their own custom apps.
We create regional servers when a customer asks for one (for GDPR compliance) and deploy the sites there.

So far we don’t see much of a problem with this kind of setup.
However, we are worried that we might hit a roadblock as our customer base grows.

I have explored the other two options, Docker Swarm and Kubernetes.

Now I need to make a decision on which path we should choose for our future infrastructure.

Clarifications needed from experts for both Docker Swarm and K8s:

  1. Can we keep using our existing custom containers? Different projects currently use different custom containers.
  2. Can our existing regional single servers be clustered?
  3. Can we reuse the existing volumes from the single servers?
  4. Can we use Traefik instead of nginx on K8s?
  5. What is the migration path…

I have a few more queries, which I will post based on the guidance I get.



Docker Swarm? In brief, simple to get started and understand.

Kubernetes? It is the go-to orchestrator for cloud providers and large enterprises.

Yes, you can run the same images that are used on the single-server setup on any orchestrator. Read this section: frappe_docker/ at main · frappe/frappe_docker · GitHub

You can use the existing servers for a Docker Swarm cluster or a self-hosted Kubernetes cluster, provided the servers are stopped and restarted with the new setup.

You cannot use existing servers with a managed Kubernetes offering. Generally you’ll have to provision nodes through the provider’s API to attach them to the managed cluster. There may be providers who only provide the Kubernetes control plane and allow you to attach servers from anywhere.
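For the Docker Swarm route, a rough sketch of the usual commands is below. The IP address, stack file name and stack name in the usage comment are placeholders I made up, and the helper function names are just for illustration. It assumes the old single-server compose setup has already been stopped on each server.

```shell
# Sketch only: function names, IP, file and stack names are placeholders.

# Run on the first server; it becomes the swarm manager.
init_manager() {
  docker swarm init --advertise-addr "$1"
}

# Run on the manager; prints the "docker swarm join ..." command
# (including the token) that each additional server has to run.
print_worker_join() {
  docker swarm join-token worker
}

# Run on the manager; deploys a compose-format stack file to the swarm.
deploy_stack() {
  docker stack deploy --compose-file "$1" "$2"
}

# Usage (on the manager):
#   init_manager 10.0.0.1
#   print_worker_join          # run the printed command on the other servers
#   deploy_stack erpnext.yml erpnext
```

A single-VM swarm only needs the first and last step; the join step matters once more servers are added.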

Yes, you can. It is easy to replace ingress-nginx and use any ingress controller. It works with an Istio VirtualService as well.

Backup and restore is the best path for data migration:

  • Take snapshots and keep them in a shared volume/PVC, or push them to S3 to restore later.
  • Pull the snapshots from S3 or the shared volume and restore them.
  • Use mariabackup for fast DB backups and restores. If it doesn’t work for you, use the standard bench backup, which internally uses mysqldump.
  • Use restic for file snapshots and restores.
  • Run containers/pods with the shared volumes mounted and with access to S3 to execute the restore commands.
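The backup side of the list above could look roughly like this. It is a sketch, not a tested job: the function names are mine, and the site name, target directory and DB credentials in the usage comment are placeholders. It assumes the commands run inside a container that has the bench and the sites volume mounted, with the restic repository settings already in the environment.

```shell
# Sketch only: site name, paths and credentials are placeholders.

# Logical backup: bench backup uses mysqldump internally.
# --with-files also archives the public and private files.
backup_site() {
  bench --site "$1" backup --with-files
}

# Physical backup: mariabackup copies the data files, which is usually
# faster for large databases. It needs access to the MariaDB data directory.
backup_db_physical() {
  mariabackup --backup --target-dir="$1" --user="$2" --password="$3"
}

# File snapshot: expects RESTIC_REPOSITORY, RESTIC_PASSWORD and the
# storage credentials to be set in the environment.
snapshot_files() {
  restic snapshots || restic init   # create the repository on the first run
  restic backup "$1"
}

# Usage:
#   backup_site erp.example.com
#   snapshot_files sites
```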

Infra progression

  • Start with Single Server (1 VM)
  • Move to docker swarm (1 VM), gives you nice portainer ui and web hooks for gitops/ci/automated pipelines.
  • Upgrade Docker Swarm to multiple VMs. Data moves to a managed shared file system and the database moves to a managed DB (MariaDB conf for Frappe · frappe/bench Wiki · GitHub, DBaaS).
  • Kubernetes (multi VM), with managed FS and DB, and a load balancer for ingress.

If you are just using ERPNext, no customization, small setup, use single server.

If you are using development pipelines with staging, uat servers being auto-deployed use docker swarm + portainer.

If you have built SaaS / PaaS app, or you need to scale any of the workers horizontally, use any Kubernetes.

If you have enterprise/compliance/checklists, use managed Kubernetes.


Thanks for the detailed clarification, @revant_one. It was much needed at this time, most importantly to know where to invest our time, as we don’t want to redo everything again.

Now I understand what we should focus on:

  • Move to docker swarm (1 VM), gives you nice portainer ui and web hooks for gitops/ci/automated pipelines.

Once we are familiar with the concepts, we will move on to Docker Swarm multi VM.

We wanted to move to Kubernetes at the earliest.

Thanks for your expert advice.


@revant_one you are right. Docker Swarm is simple and easy to get started.

I should have done this before. I was assuming it to be hard to implement. But your docs helped me a lot.

To everyone using a single server without Portainer and Swarm: I would sincerely advise moving to this setup immediately. Otherwise you will regret all the ops effort spent on docker compose commands.

We have decided to use Portainer and Swarm.
A couple more questions:

Portainer also helps with GitOps: GitOps with Portainer - Deploy to Docker using simple compose file - YouTube

For ease and a declarative setup, I add stacks as YAML. You can also add simple single containers as “Tasks”, but then the Portainer interface has to be used instead of YAML, which makes it less declarative. Another way is to exec into the running container from the Portainer UI and execute any bench commands, without any tracking in YAML or a Portainer task. In any case, make sure you only use bench commands that don’t change the application code.

Check Interoperability with other storage providers | Cloud Storage | Google Cloud. You’ll need to generate HMAC keys and use the bucket as an S3 endpoint (HMAC keys | Cloud Storage | Google Cloud). After making it S3 compatible it should work as S3; I have used it with py/boto3 before. See the restic docs for Google Cloud (Preparing a new repository — restic 0.15.1 documentation), and check whether the other types of repos can be used as an alternative.

There is no YAML for it. It depends on the case:

  • If you back up using mariabackup snapshots, restore the snapshots.
  • If you back up using mysqldump, restore the SQL file backups.
  • If you back up using the bench command, you can restore using the bench command.
  • To restore files from restic, use restic restore latest --target . (check the restic docs for more).
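To make the cases above a bit more concrete, here is a rough sketch of the matching restore commands. The function names and all file names in the usage comment are placeholders, not something from a real setup. Note that bench restore will ask for the MariaDB root password unless it is supplied, and that mariabackup --copy-back expects MariaDB to be stopped and the data directory to be empty.

```shell
# Sketch only: site name and file paths are placeholders.

# Restore the latest restic snapshot into the given directory.
restore_files() {
  restic restore latest --target "$1"
}

# Restore a bench backup (SQL dump plus file archives) into a site.
restore_site() {
  bench --site "$1" restore "$2" \
    --with-public-files "$3" \
    --with-private-files "$4"
}

# Restore a mariabackup snapshot: prepare it first, then copy it back.
# MariaDB must be stopped and its data directory empty before --copy-back,
# and file ownership usually has to be fixed afterwards.
restore_db_physical() {
  mariabackup --prepare --target-dir="$1"
  mariabackup --copy-back --target-dir="$1"
}

# Usage:
#   restore_files .
#   restore_site erp.example.com db.sql.gz public-files.tar private-files.tar
```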

Thanks @revant_one, as usual, for the swift response and clear direction.

I will explore your suggestions and get back if I am stuck.

I will use this thread to post further progress so that others with a similar need can benefit.

If you are exploring the docker swarm alternative, then you can create many other posts with specific questions. Link this post there for reference if you wish.


After reading all your links, and as per my understanding:
I created a bucket my-bucket in GCP
Created a service account
Created an HMAC key
In backup.yaml I changed the environment values as below

      - AWS_ACCESS_KEY_ID=HMAC Access key
      - RESTIC_PASSWORD=somePassword

The job failed, citing a site_config.json error and being unable to reach S3. Should I enable public access in GCP?
Where am I going wrong?

If the S3 API doesn’t work, I think restic also has Google Cloud specific config.
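Both options can be sketched as environment settings for restic. This is a sketch under assumptions: the function names are mine and the key values, project ID and key file path are placeholders. As far as I know, restic’s S3 backend needs AWS_SECRET_ACCESS_KEY in addition to AWS_ACCESS_KEY_ID, plus a repository location, and authenticated access with keys should not require making the bucket public.

```shell
# Sketch only: all values passed in are placeholders.

# Option 1: S3-compatible access to Google Cloud Storage using HMAC keys.
# $1 = HMAC access key, $2 = HMAC secret, $3 = bucket name
use_gcs_s3_interop() {
  export AWS_ACCESS_KEY_ID="$1"
  export AWS_SECRET_ACCESS_KEY="$2"
  export RESTIC_REPOSITORY="s3:https://storage.googleapis.com/$3"
}

# Option 2: restic's native Google Cloud Storage backend.
# $1 = project ID, $2 = path to the service account JSON key, $3 = bucket name
use_gcs_native() {
  export GOOGLE_PROJECT_ID="$1"
  export GOOGLE_APPLICATION_CREDENTIALS="$2"
  export RESTIC_REPOSITORY="gs:$3:/"
}

# Usage (RESTIC_PASSWORD must be set as well):
#   use_gcs_s3_interop "<hmac-access-key>" "<hmac-secret>" my-bucket
#   restic snapshots || restic init
```

The site_config.json error is probably a separate issue from the storage credentials, so it is worth checking each one on its own.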

If you are looking for an HA cluster, use Kubernetes.

Choose managed Kubernetes, managed FS, managed DB, managed load balancer. You’ll achieve scale and HA with the help of cloud provider. Target users are rich, large, MNC. Business needs to be proven to fund the cloud resources.

Resources at cost are cheaper if you know how to build things from raw material. If you go with self-managed Kubernetes, be prepared to manage much more infrastructure. Managing the following infrastructure is out of scope for Frappe Framework and ERPNext.

  • or or any such project for storage. Needs 4GB+ RAM per node. Turns out to be expensive (management overhead and redundancy resources). Not as expensive as managed google’s storage.
  • Install MariaDB Galera, on labeled nodes (dedicated part of cluster is galera cluster).
  • For Ingress setup MetalLB or configure cloud lb if cloud vm are used.
  • You may also need control-plane LB for multi-server (multi-master) setup.



Sure. Kubernetes is the way forward for us. If migrating from Docker Swarm to K8s is going to be easy, we would prefer to adopt K8s at a later stage.

Resources at cost are cheaper if you know how to build things from raw material

Rightly said. I am sure there is a lot of learning required.

How about microk8s?

If it is not managed or if you don’t have OEM support, you can go for any distribution. I’ve used k3s in containers for testing the official helm charts. Check the official helm chart tests. I’ve used kind, k3d (k3s in docker), I tried microk8s as well. All are good.
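For trying things out locally, a rough sketch with k3d and the official helm chart is below. The function names, cluster name, release name and namespace are placeholders. The chart will most likely need extra values (for example a storage class that supports shared volumes), which this sketch does not set, so treat it as a starting point only.

```shell
# Sketch only: cluster, release and namespace names are placeholders.

# Create a throwaway k3s cluster running in Docker.
create_test_cluster() {
  k3d cluster create "$1"
}

# Add the Frappe helm repository and install the erpnext chart.
# $1 = release name, $2 = namespace
install_erpnext_chart() {
  helm repo add frappe https://helm.erpnext.com
  helm repo update
  helm upgrade --install "$1" frappe/erpnext --namespace "$2" --create-namespace
}

# Usage:
#   create_test_cluster frappe-test
#   install_erpnext_chart frappe-bench erpnext
```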