Review Custom Compose YAML Before Redeployment

Earlier this year, I had a lot of trouble getting ERPNext+HRMS and other modules of the Frappe framework deployed into a development environment before moving to our servers. After many modifications to our infrastructure, the stack has now been running successfully for three months. However, it appears that the MariaDB environment has become corrupted. I have spent several hours debugging, but the consensus is to back up and redeploy.

Before I redeploy, could some developers and contributors review the YAML file I used and identify any syntax issues that might cause problems in the future?

I appreciate it. Thank you.

version: "3" # informational only; recent Compose versions ignore this key and warn that it is obsolete

services:
  backend:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

  configurator:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: none
    entrypoint:
      - bash
      - -c
    # add redis_socketio for backward compatibility
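    # note: "$$" below escapes to a literal "$", so these variables are
    # expanded by bash inside the container rather than interpolated by Compose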
    command:
      - >
        ls -1 apps > sites/apps.txt;
        bench set-config -g db_host $$DB_HOST;
        bench set-config -gp db_port $$DB_PORT;
        bench set-config -g redis_cache "redis://$$REDIS_CACHE";
        bench set-config -g redis_queue "redis://$$REDIS_QUEUE";
        bench set-config -g redis_socketio "redis://$$REDIS_QUEUE";
        bench set-config -gp socketio_port $$SOCKETIO_PORT;
    environment:
      DB_HOST: db
      DB_PORT: "3306"
      REDIS_CACHE: redis-cache:6379
      REDIS_QUEUE: redis-queue:6379
      SOCKETIO_PORT: "9000"
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

  create-site:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: none
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs
    entrypoint:
      - bash
      - -c
    command:
      - >
        wait-for-it -t 120 db:3306;
        wait-for-it -t 120 redis-cache:6379;
        wait-for-it -t 120 redis-queue:6379;
        export start=`date +%s`;
        until [[ -n `grep -hs ^ sites/common_site_config.json | jq -r ".db_host // empty"` ]] && \
          [[ -n `grep -hs ^ sites/common_site_config.json | jq -r ".redis_cache // empty"` ]] && \
          [[ -n `grep -hs ^ sites/common_site_config.json | jq -r ".redis_queue // empty"` ]];
        do
          echo "Waiting for sites/common_site_config.json to be created";
          sleep 5;
          if (( `date +%s`-start > 120 )); then
            echo "could not find sites/common_site_config.json with required keys";
            exit 1
          fi
        done;
        echo "sites/common_site_config.json found";
        bench new-site --no-mariadb-socket --admin-password=admin --db-root-password=admin --install-app erpnext --set-default frontend;
        bench --site frontend install-app hrms;
        bench --site frontend install-app payments;

  db:
    image: mariadb:11.2
    healthcheck:
      test: mysqladmin ping -h localhost --password=admin
      interval: 1s
      retries: 15
    deploy:
      restart_policy:
        condition: on-failure
    command:
      - --character-set-server=utf8mb4
      - --collation-server=utf8mb4_unicode_ci
      - --skip-character-set-client-handshake
      - --skip-innodb-read-only-compressed # temporary fix for MariaDB 10.6; verify newer releases still accept this option before keeping it
    environment:
      MYSQL_ROOT_PASSWORD: admin
    volumes:
      - db-data:/var/lib/mysql

  frontend:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    command:
      - nginx-entrypoint.sh
    environment:
      BACKEND: backend:8000
      FRAPPE_SITE_NAME_HEADER: frontend
      SOCKETIO: websocket:9000
      UPSTREAM_REAL_IP_ADDRESS: 127.0.0.1
      UPSTREAM_REAL_IP_HEADER: X-Forwarded-For
      UPSTREAM_REAL_IP_RECURSIVE: "off"
      PROXY_READ_TIMEOUT: "120"
      CLIENT_MAX_BODY_SIZE: 50m
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs
    ports:
      - "8080:8080"

  queue-long:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    command:
      - bench
      - worker
      - --queue
      - long,default,short
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

  queue-short:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    depends_on: ["backend"]
    command:
      - bench
      - worker
      - --queue
      - short,default
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

  redis-queue:
    image: redis:6.2-alpine
    deploy:
      restart_policy:
        condition: on-failure
    volumes:
      - redis-queue-data:/data

  redis-cache:
    image: redis:6.2-alpine
    deploy:
      restart_policy:
        condition: on-failure
    volumes:
      - redis-cache-data:/data

  scheduler:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    depends_on: ["backend"]
    command:
      - bench
      - schedule
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

  websocket:
    image: 89neuron/erpnext_hrms_payments:v15
    deploy:
      restart_policy:
        condition: on-failure
    command:
      - node
      - /home/frappe/frappe-bench/apps/frappe/socketio.js
    volumes:
      - sites:/home/frappe/frappe-bench/sites
      - logs:/home/frappe/frappe-bench/logs

volumes:
  db-data:
  redis-queue-data:
  redis-cache-data:
  sites:
  logs:
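
One quick way to catch pure YAML/schema errors locally before redeploying — a minimal sketch, assuming the file above is saved as compose.yml in the current directory:

    docker compose -f compose.yml config --quiet && echo "syntax OK"

This validates structure and known keys, though it won't catch logical issues such as a wrong hostname in an environment variable.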

You don’t need to install apps here; you just need a new site onto which you’ll restore the previous SQL dump. Installing apps only takes more time, and restoring the DB brings the apps’ data back anyway.

The easiest way to restore the DB onto a new site:

  • Create a new site; this configures the DB, user, wildcard host, and DB permissions.
  • Drop the DB created in the previous step and recreate it: drop database <db_name>; create database <db_name>;. This ensures the DB is empty (see the sketch after this list).
  • Restore the dump onto the empty DB from the previous step.
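
A minimal sketch of steps 2 and 3 against the compose setup above; the database name (_abc123 here) is a placeholder — the real one is in sites/<site-name>/site_config.json — and the dump path is assumed:

    # drop and recreate the site's database so it is empty
    docker compose exec db mariadb -uroot -padmin \
      -e "DROP DATABASE _abc123; CREATE DATABASE _abc123;"
    # restore the dump onto the empty database
    gunzip -c /path/to/backup-database.sql.gz | \
      docker compose exec -T db mariadb -uroot -padmin _abc123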

Thanks for the response.

I will remove those lines from the compose file and follow your instructions.

Wait, maybe I misunderstood you. Are you recommending that, instead of tearing it all down and redeploying, I just dump MariaDB and redeploy the DB? @revant_one

That’s astonishing indeed.

Does pulling the backup into the (emptied) DB also restore all the dependencies?
Or is this meant to be done in an existing image which still has all the apps?

Drop and recreate the database to empty it, then restore the dump onto the blank DB. This happens through the mariadb / mysql shell.

Not to be confused with pruning mariadb containers and volumes and redeploying them.

It is SQL data; the interlinked tables and data will be available. The application code is not part of the database. The apps that created the data need to be part of the bench, at the same or a greater version.

Example:

  • If the data has rows in tabBuilder Page, then frappe/builder must have created it, and the app needs to be part of the image/bench.
  • If Builder version 1.0.0 created the data, then Builder 1.0.0 or newer is expected in the image. With a newer version you can run bench --site all migrate and it’ll patch the changes (see the sketch after this list).
  • If Builder version 1.0.0 created the data, then Builder 0.0.9 in the image will NOT work.
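
A rough sketch of the corresponding check and migration, run from the bench directory inside one of the app containers:

    bench version               # lists the installed apps and their versions
    bench --site all migrate    # applies pending patches after an app upgrade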

Thank you for the clarification.

Knowing about the engineering that happens in dev mode between DB-stored and filesystem-stored DocType metadata, where the separation of data and app is somewhat “blurred” (“on-demand”), I wondered whether something similar exists that would ease backup restoration at the app level as well.

That could probably be built, especially for dev setups. Building immutable images would need another layer, but it’s not impossible, I’d say.

A “one-click” (one-command) restore is certainly desirable. The more fiddling it takes away from the (possibly panicking) user, the more reliable it is, at least if the mechanics work as they should.

I’m becoming more and more convinced that in any project, getting backup and restore right should be the very first thing to build, learn, test, and use: it can mean many work hours (even years) lost … or preserved, once you start using the software for real work.

  • Take a backup using bench --site all backup
  • Push the sites directory as a restic snapshot (see the sketch after this list)
  • Restore the snapshot anywhere
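
A minimal sketch of that flow; the repository URL and password are placeholders, and restic also needs S3 credentials (e.g. AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY) in the environment:

    export RESTIC_REPOSITORY=s3:https://s3.example.com/frappe-backups
    export RESTIC_PASSWORD=change-me
    restic init                                # first time only: create the repository
    bench --site all backup --with-files       # dump DB + files under sites/
    restic backup ~/frappe-bench/sites         # push the sites directory as a snapshot
    restic snapshots                           # list available snapshots
    restic restore latest --target /restore    # restore anywhere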

Ideally, outsource backups to cloud providers and have them snapshot the volumes for both FS and DB, e.g. AWS EFS and AWS RDS backups.

For self-hosters who prefer to avoid cloud providers, read more: How to backup with restic to S3 compatible storage


Thank you, Revant!

Your page about backups (frappe_docker/docs/backup-and-push-cronjob.md at main · frappe/frappe_docker · GitHub) prompts me to ask another general question about the system:

Does it conform to “The 12-Factor-App” methodology (see https://12factor.net/)?

It looks quite like it (although I didn’t check in depth), at least as far as frappe_docker is concerned, but I wonder whether this 12-factor-ness is a stated (explicit) goal or an implicit one of the engineering efforts. (Or maybe the similarity is just the naturally converging logic of building apps for the cloud?)

Frappe framework started in 2005. 12 factor came later.

frappe_docker makes things as close to 12 factor as possible.

What I feel can be improved: reduce the use of the filesystem volume for sites, i.e. use S3 for uploads and push backups to S3 so none are kept in sites/<site-name>/private/backups. There would also be no need to read site_config.json for passwords if the config were overridden by a custom app and picked up from environment variables (see Configuration).
