Hello,
I am trying to learn and self-host ERPNext with Gemini's help. I managed to make good progress, upgraded to v16, and even changed from dev to production. I am using a local DNS with HTTPS and self-made certificates; apart from the browser warning, it is working. But now that I have started creating custom reports, I spotted that something is not going right. Apparently the issue was with the socket.io connection:
socketio_client.js:49 WebSocket is closed before the connection is established.
socketio_client.js:69 Error connecting to socket_io: Unauthorized: TypeError: fetch failed
After two days of config experiments and endless "smart advice", Gemini concluded it was because we upgraded the host from dev to prod and broke something.
So I backed up all the data and created a new VM, this time with Docker, "because it is a bulletproof, easy solution" (c) Gemini…
I created a new site (avoiding copying the crazy network config changes from the old install) and then restored from backup. Guess what? I got the socket errors again! Now, after another day of trial and error, I feel completely lost. We managed to change the error to
socketio_client.js:69 Error connecting to socket_io: Invalid origin
and even to make the error disappear, but frappe.realtime.socket.connected is always false, and Gemini's "solutions" started to repeat…
I am sure that the config, after at least 100 edits, is far from usable, so I will probably need to start from scratch again at some point, but I doubt I will be able to AI-solve it. Is there a clear troubleshooting technique for these errors? A kind of checklist, or an explanation of how this is intended to work?
I am usually quite capable of following instructions, but "AI troubleshooting of ERPNext sockets" is impossible to understand rationally, and this situation is really driving me mad.
I would appreciate any help/advice, thanks in advance!
Igor.
Hi, thanks for the reply! I am on a Docker setup, so it is a bit different, but here is the output of
docker ps | grep websocket
386410c1fe39 frappe/erpnext:v16.0.0 “node /home/frappe/f…” Up 1 h frappe_docker-websocket-1
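In case it helps, here are the checks I can run from here; erp.mydomain.com is just my internal name and 9000 is the default socketio_port, so this is specific to my setup:

```shell
# tail the websocket container logs while reproducing the error in the browser
docker logs --tail 50 frappe_docker-websocket-1

# probe the socket.io polling handshake through the reverse proxy
# (-k skips verification because of the self-made certificate)
curl -sk "https://erp.mydomain.com/socket.io/?EIO=4&transport=polling"
```

As far as I understand it, a healthy handshake should return a small payload starting with `0{"sid":…`, while an auth or origin problem shows up here as an HTTP error instead.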
And I woke up my old, original VM, where I have supervisor; it also looks normal:
x@erp:~$ sudo supervisorctl status all
/usr/lib/python3/dist-packages/supervisor/options.py:13: UserWarning: pkg_resources is deprecated
import pkg_resources
frappe-bench-redis:frappe-bench-redis-cache RUNNING pid 864, uptime 0:05:47
frappe-bench-redis:frappe-bench-redis-queue RUNNING pid 865, uptime 0:05:47
frappe-bench-web:frappe-bench-frappe-web RUNNING pid 866, uptime 0:05:47
frappe-bench-web:frappe-bench-node-socketio RUNNING pid 867, uptime 0:05:47
frappe-bench-workers:frappe-bench-frappe-long-worker-0 RUNNING pid 870, uptime 0:05:47
frappe-bench-workers:frappe-bench-frappe-schedule RUNNING pid 868, uptime 0:05:47
frappe-bench-workers:frappe-bench-frappe-short-worker-0 RUNNING pid 869, uptime 0:05:47
x@erp:~$
That would be too easy; I have run all these setups/reboots a few dozen times.
I am learning and self-hosting a production v16 docker/podman build (podman actually; docker is used only for docker-compose), created with the easy-install.py script, fully local with no HTTPS/SSL. I had socket.io issues too, which I spotted in the system health report. I managed to solve them by adding the `extra_hosts` option to the websocket service in the compose file. I also had to expose port 9000 and set a bench config option to deal with a CORS issue; this let me pass the real-time messages check in the Frappe Raven app and fixed a Frappe Builder CORS issue.
websocket/backend service in compose.yaml:
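Roughly like this — a sketch with a placeholder hostname and IP, not my exact file:

```yaml
# compose sketch: let the websocket container resolve the site hostname
# to the LAN address, and publish the socket.io port
websocket:
  extra_hosts:
    - "erp.example.local:192.168.1.10"  # placeholder: your site name -> host IP
  ports:
    - "9000:9000"                       # default socketio_port
```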
Apart from these, I run a vanilla setup. The only other thing I have done, besides the above three changes, is change docker.sock to the rootless podman.sock in the cron-scheduler. I do not modify the generated compose file directly; I change the compose files in my local frappe_docker repo so that the script works without interruption.
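For completeness, the bench side was just writing keys into sites/common_site_config.json (via `bench set-config -g <key> <value>`). A sketch of the relevant keys, with `*` as a deliberately loose placeholder you would want to tighten to your real origin later:

```json
{
  "socketio_port": 9000,
  "allow_cors": "*"
}
```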
Hi Sridou, nice to meet you. Glad that you solved it; all these steps are well noted.
I have a strong feeling that my problems were somehow coming from encryption. Yes, I am on the local LAN now, but I would like to be able to access it from the outside in the near future, as soon as all these showstoppers are solved. So I used "erp.mydomain.com" in all the config and just tricked my PDC DNS into resolving it to the local IP. That way, when I am ready to push it out, I will only need to change the certs and the DNS record.
Gemini and ChatGPT were not able to fix it, but I finally managed to give Copilot (Gemini 3 inside) direct SSH access to my old server, and it actually managed to fix it by adding NODE_TLS_REJECT_UNAUTHORIZED="0" to supervisor.conf.
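If I understand it correctly, that variable tells Node.js to skip TLS certificate verification entirely, which is why it papers over a self-made certificate. Roughly, the line landed in the socket.io program section like this (a sketch of my file, not a general recipe):

```ini
; supervisor.conf (sketch): disable cert verification for the socket.io process
[program:frappe-bench-node-socketio]
environment=NODE_TLS_REJECT_UNAUTHORIZED="0"
```

A cleaner long-term option would probably be `environment=NODE_EXTRA_CA_CERTS="/path/to/my-ca.pem"` (path is a placeholder), so Node trusts my own CA instead of trusting everything.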
The problem is that this supervisor.conf is auto-generated, and it and all the other .conf files are not in the best shape after that random AI "fixing". I am not sure it is worth continuing with this VM; I am thinking of re-creating a new VM from scratch, but this time fully by Copilot without a "man in the middle". I will request a detailed action list from it, so if that works, I will have a working "v16 local HTTPS install guide".