404 Errors on Frappe v15.38 in Docker Swarm with Traefik: No Logs, Service Appears Running

Hello everyone!

I’ve deployed Frappe Framework v15.38 on Docker Swarm. When I start the containers, everything works correctly; however, after some time (usually by the next day), the site starts returning 404 errors.

The strangest part is that I don’t receive any error message in any container. The docker logs for the frontend don’t show anything; it’s as if the service hasn’t started. But Portainer shows the container as running normally.
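When the container log is silent, the Swarm task history can sometimes say more, since it records restarts and rejected tasks for the service as a whole. A minimal sketch, assuming the service is deployed under a name like mystack_frappe-frontend (hypothetical; the real name is listed by docker service ls). The DOCKER variable is only an illustrative hook for a dry run and is not part of Docker itself:

```shell
# Print the Swarm task history and recent logs for one service.
# The service name passed in is a placeholder; DOCKER is a hypothetical
# override (e.g. DOCKER=echo) to preview the commands without running them.
swarm_task_report() {
  svc="$1"
  docker_cmd="${DOCKER:-docker}"
  # Task history: restarts, rejected tasks, and untruncated error text.
  "$docker_cmd" service ps --no-trunc "$svc"
  # Aggregated logs from every task of the service, not only the current container.
  "$docker_cmd" service logs --tail 200 --timestamps "$svc"
}
```

It would be called on a manager node as swarm_task_report mystack_frappe-frontend.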

Traefik doesn’t return any errors.
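A quiet Traefik log does not necessarily rule Traefik out as the source of the 404: when no router matches a request, Traefik normally replies with the plain-text body "404 page not found", typically without logging it as an error. A rough sketch for telling that case apart from a 404 produced behind Traefik (nginx or Frappe); the function name and the two labels are illustrative only:

```shell
# Guess where a 404 came from by looking at the response body.
# "traefik"  -> body matches Traefik's default no-router reply
# "upstream" -> anything else, i.e. probably nginx or Frappe behind Traefik
classify_404_body() {
  case "$1" in
    "404 page not found"*) echo "traefik" ;;
    *) echo "upstream" ;;
  esac
}
```

For example: classify_404_body "$(curl -sk https://v23.web.com/)". A result of "traefik" would suggest the router for the service is missing at that moment, while "upstream" would suggest the request reaches the container and is rejected there.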

Running bench migrate has sometimes restored access temporarily. The system is in production mode. Out of desperation I also tried various other things, such as rebuilding :rofl:… but without logs, I’m a bit lost.

Any help or suggestions would be greatly appreciated!

Here is an example of my frontend service setup:

frappe-frontend:
  image: frappe/erpnext:v15.38.0
  environment: *default-environment
  networks:
    - tr_pub
  deploy:
    restart_policy:
      condition: on-failure
    mode: replicated
    replicas: 1
    placement:
      constraints:
        - node.labels.worker == apps_frontend
    labels:
      - traefik.enable=true
      - traefik.http.routers.frappe-frontend.rule=Host(`v23.web.com`)
      - traefik.http.routers.frappe-frontend.tls.certresolver=cloudflare
      - traefik.http.routers.frappe-frontend.entrypoints=websecure
      - traefik.http.routers.frappe-frontend.tls=true
      - traefik.http.routers.frappe-frontend.service=frappe-frontend
      - traefik.http.services.frappe-frontend.loadbalancer.server.port=8080
      - traefik.http.middlewares.prod-redirect.redirectscheme.scheme=https
  command:
    - nginx-entrypoint.sh
  volumes:
    - frappe-apps:/home/frappe/frappe-bench/apps
    - frappe-sites:/home/frappe/frappe-bench/sites
    - frappe-logs:/home/frappe/frappe-bench/logs

The bench doctor command shows the following:

~/frappe-bench$ bench doctor
-----Checking scheduler status-----
Workers online: 2
-----frontend Jobs----

If I manually run ./nginx-entrypoint.sh, it says the port is already in use:

/frappe-bench$ /usr/local/bin/nginx-entrypoint.sh
2024/10/29 13:58:06 [emerg] 401#401: bind() to 0.0.0.0:8080 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:8080 failed (98: Address already in use)
...
nginx: [emerg] still could not bind()
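The "Address already in use" error suggests that an nginx process is already bound to port 8080 in this container, so starting a second copy says little about why requests fail. Probing the existing listener directly may be more useful. A minimal sketch, assuming curl is available inside the container and that the site name is frontend (taken from the bench command in this post); http_status is a hypothetical helper name:

```shell
# Print the HTTP status code for a URL, sending an explicit Host header.
# Prints 000 if nothing answers (connection refused or timeout).
http_status() {
  url="$1"
  host="$2"
  timeout="${3:-5}"
  curl -s -o /dev/null --max-time "$timeout" -H "Host: $host" -w '%{http_code}' "$url"
}
```

Inside the frontend container, http_status http://localhost:8080/api/method/ping v23.web.com can be compared with http_status http://localhost:8080/api/method/ping frontend. If the public hostname gives 404 while frontend gives 200, that could point to how the site name is resolved from the request rather than to nginx itself.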

When I run bench --site frontend migrate, I get the following output:

Migrating frontend
Updating DocTypes for frappe        : [========================================] 100%
Updating DocTypes for erpnext       : [========================================] 100%
Updating Dashboard for frappe
Updating Dashboard for erpnext
Updating customizations for Address
Updating customizations for Contact
Queued rebuilding of search index for frontend

After this, the frontend goes back online.

Can’t help you much. You’ve mounted the apps directory as a volume, which it is not supposed to be. The apps are part of the image.

You’re on your own. Try to figure out hacks that work for you and share if they work.

I recommend building a custom image for custom apps.

Since it only creates a virtual volume, and I’m almost certain the problem started before I even created the container with this volume, I hadn’t connected the two. I’m going to remove the volume and observe the behavior. If the problem continues, I’ll create a custom image. Thank you very much for the tip!

I created this volume to export fixtures of the doctypes and scripts that some users created. The intention wasn’t to add a custom app, only to extract the customizations that had been made. In this case, would a custom app also be the ideal approach?