Frappe Docker Single Server or Docker Swarm or Kubernetes? Need Expert guidance

Thanks @revant_one, as usual, for the swift response and clear direction.

I will explore your suggestions and revert in case I get stuck.

I will update this thread with further progress so that others with a similar need can benefit.

If you are exploring the docker swarm alternative, then you can create many other posts with specific questions. Link this post there for reference if you wish.

After reading all your links, here is what I did with my understanding:

  • Created a bucket my-bucket in GCP
  • Created a service account
  • Created an HMAC key
  • In backup.yaml, changed the environment values as below:

environment:
  - RESTIC_REPOSITORY=s3:https://storage.googleapis.com/my-bucket
  - AWS_ACCESS_KEY_ID=<HMAC access key>
  - AWS_SECRET_ACCESS_KEY=<HMAC secret>
  - RESTIC_PASSWORD=somePassword

The job failed, citing a site_config.json error and that it was unable to reach s3:https://storage.googleapis.com/my-bucket. Should I enable public access on the bucket in GCP?
Where am I going wrong?

If the S3 API doesn't work, I think restic also has a Google Cloud specific configuration.
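
From the restic docs, I believe the Google Cloud Storage backend would look roughly like this in the same environment block (an untested sketch: the bucket, project ID and key path are placeholders, and the service-account JSON key would have to be mounted into the backup container):

environment:
  # gs: repository instead of the s3: endpoint
  - RESTIC_REPOSITORY=gs:my-bucket:/
  - GOOGLE_PROJECT_ID=my-gcp-project
  # path where the service-account key is mounted inside the container
  - GOOGLE_APPLICATION_CREDENTIALS=/run/secrets/gcs-key.json
  - RESTIC_PASSWORD=somePassword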

If you are looking for an HA cluster, use Kubernetes.

Choose managed Kubernetes, a managed FS, a managed DB and a managed load balancer. You'll achieve scale and HA with the help of the cloud provider. Target users are rich, large MNCs. The business needs to be proven to fund the cloud resources.

Resources at cost are cheaper if you know how to build things from raw material. To go with self-managed Kubernetes, be prepared to manage much more infrastructure. Managing the following infrastructure is out of scope of Frappe Framework and ERPNext:

  • rook.io or openebs.io or any such project for storage. Needs 4GB+ RAM per node. Turns out to be expensive (management overhead and redundancy resources), though not as expensive as Google's managed storage.
  • Install MariaDB Galera on labeled nodes (a dedicated part of the cluster acts as the Galera cluster); see the sketch after this list.
  • For Ingress, set up MetalLB, or configure a cloud load balancer if cloud VMs are used.
  • You may also need a control-plane LB for a multi-server (multi-master) setup.
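
For example, a rough sketch of pinning Galera pods to labeled nodes (illustrative only; it assumes the nodes carry a db=galera label, e.g. added with kubectl label node <node-name> db=galera, and it does not configure Galera replication itself):

# illustrative manifest: only the nodeSelector part matters here
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mariadb-galera
spec:
  serviceName: mariadb-galera
  replicas: 3
  selector:
    matchLabels:
      app: mariadb-galera
  template:
    metadata:
      labels:
        app: mariadb-galera
    spec:
      nodeSelector:
        db: galera            # schedule only on the labeled (Galera) nodes
      containers:
        - name: mariadb
          image: mariadb:10.6 # placeholder image; a real Galera setup needs more config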

Check k3s.rocks

Sure. Kubernetes is the way forward for us. If migrating from Docker Swarm to K8s is going to be easy, we would prefer to adopt K8s at a later stage.

Resources at cost are cheaper if you know how to build things from raw material

Rightly said. I am sure there is a lot of learning required.

How about microk8s?

If it is not managed or you don't have OEM support, you can go for any distribution. I've used k3s in containers for testing the official helm charts; check the official helm chart tests. I've used kind and k3d (k3s in Docker), and I tried microk8s as well. All are good.

In short: Portainer is not listing containers from other nodes.

I have been testing Docker Swarm with the Portainer UI. As per my initial exploration and setup from https://github.com/castlecraft/custom_containers/blob/main/docs/docker-swarm.md, I found Docker Swarm and Portainer to be a good choice.

Over the past few days, I upgraded the swarm with a few more nodes and deployed the MariaDB stack to a specific node with placement constraints. The deployment was successful as expected.
The problem is that Portainer is not listing containers from the other nodes.
I can access stacks, networks and services, but not containers.

Is this a limitation of Portainer, or am I missing something?

Check the firewall for Docker Swarm related ports.

The following ports must be available. On some systems, these ports are open by default.

 - Port 2377 TCP for communication with and between manager nodes
 - Port 7946 TCP/UDP for overlay network node discovery
 - Port 4789 UDP (configurable) for overlay network traffic

Thanks @revant_one. We use GCP. Initially I allowed these ports using firewall rules, but even then we had the same issue. Now, for testing purposes, I have allowed all ports. There is no improvement in Portainer. Is there something I need to do inside the VM?

The manager node IP should not change. It is the one used to join the swarm, the one specified in the command docker swarm init --advertise-addr.

From the VM, check ufw/the firewall. It should not be a problem, as it is disabled by default.

What error do you face? Any logs in the Portainer containers?

The manager node IP hasn't changed. It's a static IP.

Below is the message from the Portainer agent running on the manager node:

2023/07/14 11:32:53 http: TLS handshake error from 10.0.2.3:40724: EOF

The Portainer agent on the worker node has no specific logs.

Are you able to manage the swarm through the CLI from the manager node?

Can you list nodes? docker node ls

Confirm whether it's a problem with the swarm or with Portainer.

From the CLI, I am able to list nodes, services and tasks for the nodes.

I am not sure how to list the containers for a node from the manager and exec bash into them, since docker node ps <node_id> only lists the tasks.

From Portainer, all nodes, services and their tasks are listed. Please note the console option is disabled for tasks running on the swarm2 node (worker node).

One can observe in the image below that the containers of mariadb-amr_db are not listed, as they belong to the worker node.

I faced a similar situation when:

  1. I connected to the wrong IP: during docker swarm init --advertise-addr I used the public static IP and then used the private static IP when I did docker swarm join, or the other way around (private <-> public).
  2. There was a firewall or a restriction on port access between the nodes.

You are right, it works when we use the internal IP, but only partially. The containers are getting identified, but Portainer has difficulty getting the volumes, because of which logs and the console are not accessible.

Further, Portainer becomes very slow as soon as we add a swarm node. With a single manager node it works blazingly fast.

There is some serious configuration issue with respect to networks and the firewall, or even the way I had set up the volumes.

Anyone who was successful in setting up a cluster on VMs in GCP, please help here with steps or articles.

You will not find the docker volumes. Each node has its own docker volumes under /var/lib/docker/volumes; they are separate per node. To have them common, you need to mount them via NFS or managed NFS. See: Best way to deploy new versions of custom app in self hosted docker setup - #13 by revant_one

Your database also needs to be labeled and scheduled on the same node every time to access the same volume. Basically, anything that has a volume either needs to be handled via NFS or locked to the same node.
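
For example, a rough stack sketch combining both approaches (untested; the NFS address, export path, image tags and the db=true label are placeholders, and the label would be added with docker node update --label-add db=true <node-name>):

version: "3.7"

services:
  db:
    image: mariadb:10.6
    volumes:
      - db-data:/var/lib/mysql                 # local volume, stays on the DB node
    deploy:
      placement:
        constraints:
          - node.labels.db == true             # lock the DB to the labeled node

  backend:
    image: frappe/erpnext:v14
    volumes:
      - sites:/home/frappe/frappe-bench/sites  # shared via NFS, usable on any node

volumes:
  db-data:
  sites:
    driver: local
    driver_opts:
      type: nfs
      o: "addr=10.0.0.10,rw,nfsvers=4"         # NFS server / managed NFS endpoint
      device: ":/exports/sites"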

I haven't used multiple managers. Kubernetes can be reconsidered if control-plane HA is needed.

I have achieved 1 manager, 2 workers,
NFS or managed NFS for the sites volume,
DBaaS or a node-locked stack or a separate VM for the DB.
Beyond that I'll keep adding workers.
If I need anything more complex I'll move to K8s.

Thanks @revant_one ,

Looks like I need to do a couple more steps before I can achieve a stable swarm.

I wanted to achieve regional databases like we discussed earlier.

The whole idea is to have containerised databases on the regional nodes.
In that case, how should the configuration of NFS and the DB look?

Have the DB on labelled nodes?
An NFS server per region, configured in the stack?
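
Roughly what I have in mind, as a sketch (the region label, NFS address and names are placeholders; the node would be labelled with docker node update --label-add region=apac <node-name>):

services:
  db-apac:
    image: mariadb:10.6
    volumes:
      - db-apac-data:/var/lib/mysql
    deploy:
      placement:
        constraints:
          - node.labels.region == apac          # regional DB pinned to the APAC node

volumes:
  db-apac-data:                                 # local volume on the regional node
  sites-apac:                                   # mounted by the region's services (not shown)
    driver: local
    driver_opts:
      type: nfs
      o: "addr=<apac-nfs-ip>,rw,nfsvers=4"      # NFS server for this region
      device: ":/exports/sites"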

Let me give this a try over the weekend and share my experience.