I have a fresh install of Frappe/ERPNext v15 running in a Kubernetes environment. Pages are not auto-refreshing after items are added. I’ve confirmed that Auto Refresh is not disabled in List Settings. My guess is that the upstream (Kubernetes) nginx ingress is interfering with the WebSocket connections.
Thanks! That makes sense. I’ve added the following annotations to my ingress. Unfortunately, they haven’t had any impact on auto-refresh. This is actually a pretty simple deployment: a single-node cluster with a ClusterIP erpnext service. The ingress controller is registry.k8s.io/ingress-nginx/controller:v1.10.
nginx.ingress.kubernetes.io/configuration-snippet: |
  set $forwarded_client_ip "";
  if ($http_x_forwarded_for ~ "^([^,]+)") {
    set $forwarded_client_ip $1;
  }
  set $client_ip $remote_addr;
  if ($forwarded_client_ip != "") {
    set $client_ip $forwarded_client_ip;
  }
nginx.ingress.kubernetes.io/upstream-hash-by: "$client_ip"
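For context, the upstream-hash-by line is there to pin each client to a single backend pod; as I understand it, socket.io needs that kind of stickiness whenever there is more than one gateway replica. One thing I haven’t ruled out yet is the controller’s proxy timeouts: ingress-nginx is supposed to handle the WebSocket upgrade on its own, but its default 60-second proxy timeouts can quietly close long-lived connections, so something like the following may also be needed (the 3600 values are just my guess at sane numbers):

# Hedged sketch: raise ingress-nginx's default 60s proxy timeouts so idle
# socket.io connections aren't closed mid-session.
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"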
Not yet. Unfortunately, I’m still wrestling with it.
Actually, I’m wondering if it might not be a WebSocket problem after all. I set up an SSH tunnel with a port forward to the cluster node and added an entry to my local hosts file so that I could reach the ERPNext instance directly, bypassing the cluster’s ingress controller entirely. The behavior was exactly the same: deleted items disappear as expected, but adding a new Item (or any new object in a list view) fails to show up without a manual refresh.
Edit: Nope, definitely a WebSocket problem. In the Chrome developer console, I can see the WebSocket sessions. They all report a status of 101, but no traffic ever goes across them (Size remains zero).
Over a year later and I’ve run into the exact same issue: completely different environment, different cloud, different Frappe versions, and multiple deployments all exhibiting the same behavior. I seem to be plagued by this. Here are my nginx ingress annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
nginx.ingress.kubernetes.io/proxy-set-header-Upgrade: "$http_upgrade"
nginx.ingress.kubernetes.io/proxy-set-header-Connection: "upgrade"
nginx.ingress.kubernetes.io/proxy-set-header-X-Forwarded-Proto: "https"
nginx.ingress.kubernetes.io/configuration-snippet: |
  set $forwarded_client_ip "";
  if ($http_x_forwarded_for ~ "^([^,]+)") {
    set $forwarded_client_ip $1;
  }
  set $client_ip $remote_addr;
  if ($forwarded_client_ip != "") {
    set $client_ip $forwarded_client_ip;
  }
nginx.ingress.kubernetes.io/upstream-hash-by: "$client_ip"
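One thing I’m not sure about in that list: I can’t find proxy-set-header-Upgrade, proxy-set-header-Connection, or proxy-set-header-X-Forwarded-Proto among the documented ingress-nginx annotations, so those three lines may be getting silently ignored. From what I can tell, the controller adds the Upgrade and Connection headers on its own when a client requests an upgrade, and custom headers would normally go through the configuration snippet instead, roughly like this (a sketch, assuming the X-Forwarded-Proto override is actually wanted). I also wonder whether rewrite-target: / could be stripping the /socket.io prefix before requests reach the backend; that seems worth ruling out too.

nginx.ingress.kubernetes.io/configuration-snippet: |
  # Assumed equivalent of the per-header annotations above; ingress-nginx
  # doesn't document a proxy-set-header-* annotation family.
  proxy_set_header X-Forwarded-Proto https;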
Looking at the error, the wss:// URL being connected to is the site’s public URL. Since this communication is happening within the cluster, shouldn’t it be using the cluster URL rather than trying to go back in through the front door?
What I see:
wss://{Public URL}/socket.io/?EIO=4&transport=websocket&sid=AJ-lSou_f3w9ZP9gAAAN ← This is throwing a 400 (Bad Request)
What I would expect to see:
ws://{cluster/internal socketio service name}/socket.io/?EIO=4&transport=websocket&sid=AJ-lSou_f3w9ZP9gAAAN
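From what I understand of socket.io’s protocol, a 400 on a request that already carries a sid usually means the upgrade request landed on a different backend pod than the one that issued that sid during the initial polling handshake — exactly what sticky sessions are supposed to prevent, and only possible if there is more than one replica in the path. And unless I’m misreading the flow, the public wss:// URL is actually expected here: the browser opens the socket from outside the cluster, so only pod-to-pod traffic would ever use an internal service name. Since IP-based hashing can wobble when X-Forwarded-For values change, cookie-based affinity might be a more deterministic sketch (annotation names are from the ingress-nginx docs; the cookie name is my own invention):

# Hedged alternative to upstream-hash-by: pin each client to one pod
# via a cookie instead of the forwarded client IP.
nginx.ingress.kubernetes.io/affinity: "cookie"
nginx.ingress.kubernetes.io/session-cookie-name: "erpnext-sticky"
nginx.ingress.kubernetes.io/session-cookie-max-age: "172800"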
I’m using the official Helm chart with a custom frappe-docker image. I’ve done nothing special beyond modifying the apps.json file to customize which Frappe apps are loaded.
Edit: Looking closer, in Frappe’s nginx config, I see this:
proxy_pass http://socketio-server;
From within the nginx pod, socketio-server does not resolve.
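Edit 2: From what I can tell, that may be a red herring — socketio-server is an nginx upstream name, not a hostname, so it isn’t supposed to resolve in DNS at all. Only the server line inside the upstream block needs to resolve. If I’m reading the frappe_docker nginx template right, the generated config looks roughly like this (the service name below is a placeholder for whatever the container’s SOCKETIO variable points at):

# Approximate shape of the generated frappe_docker config; only the
# server address needs to resolve in cluster DNS, not the upstream label.
upstream socketio-server {
  server erpnext-socketio:9000 fail_timeout=0;  # placeholder service name
}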
@revant_one Yes, I have an upstream socketio-server defined at the top of the Frappe nginx config, and it’s pointing to port 9000 of the correct local socketio service. So that effectively rules out the proxy_pass entry as an issue.
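For completeness, the other half of that config worth checking is the /socket.io location block: if the Upgrade and Connection headers aren’t forwarded at this hop, the upgrade handshake would die with exactly this kind of 400. Mine matches what I believe the stock frappe_docker template generates, roughly (reproduced from memory, so treat the details as approximate):

location /socket.io {
  proxy_http_version 1.1;
  # Without these two headers the WebSocket upgrade fails at this hop.
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "upgrade";
  # Frappe's socketio service routes multi-tenant traffic by this header.
  proxy_set_header X-Frappe-Site-Name $host;
  proxy_set_header Host $host;
  proxy_pass http://socketio-server;
}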
I can’t figure out whether it’s an issue with the Kubernetes ingress or the Frappe nginx instance; I keep bouncing back and forth. It seems like the internal Frappe socketio calls never make it to the socketio pod. Now that I’ve confirmed the nginx config is pointing to the right service and port for socketio, I’m back to being stumped.