I previously created this post. This is a continuation of, and hopefully an improvement on, that one.
I use…
ERPNext: v10.1.81 (898977b) (v10.x.x)
Frappe Framework: v10.1.71-8 (92dc1e0) (v10.x.x)
on a Debian stretch 9.x manually installed setup - i7 with 16GB RAM on an SSD
The DB parameters are “optimised” as much as possible within my knowledge for performance
The hostnames are in the /etc/hosts file to ensure it isn’t a DNS issue
The “user” PC is a Debian LXDE i5 with 16GB RAM using the Chromium browser (behaviour is the same in Firefox)
Problem:
- When I submit certain (weirdly specific) Stock Entry (STE) documents, the request times out. No error…just a timeout, and the doctype doesn’t get submitted.
Method:
- I changed the timeouts, increased all sorts of parameters and retried lots of times.
Result:
- A workable workaround which (for my scenario) is adequate.
- Not ideal, but OK.
Observations:
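- Although I now have 8x gunicorn workers (the -w setting starts worker processes, not threads), only ONE seems to be doing any work.
- The CPU is an i7, and only 1 of its cores appears to be working (~90% in top, but ~10-13% in the GUI; top reports usage per core, so one busy core out of, say, 8 logical CPUs would show as roughly 12% overall)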
- As I submit more STE-#'s, it does appear to get marginally quicker on each subsequent submit
- possibly it still has cached copies of some of the code or something??
- While the memory usage is relatively low, the responsiveness is also poor.
- eg. when the in-use RAM is ~3GB for the overall system, there are a lot more timeouts.
- When the overall system is >5GB, there are fewer timeouts, and the submits are (quite a lot) quicker?
- If I set up several tabs in the browser, each “ready” to save/submit, and then submit the next one the moment the done “ting” sounds, then there are fewer timeouts too
- this also leads me to believe that there is a cache effect somewhere
- The timeout message appears after 2mins - no matter what I use for the -t parameter??
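One possible explanation for a fixed 2-minute cutoff is that gunicorn's -t is not the only timeout between the browser and the worker. If nginx sits in front of gunicorn (an assumption, the setup above does not say), its proxy_read_timeout directive applies independently of -t. A minimal sketch for listing the timeout settings in whatever config files you point it at; show_timeouts is a hypothetical helper name, and the nginx.conf path in the usage line is only a guess at the bench layout:

```shell
#!/bin/sh
# Hypothetical helper: print every line that sets a gunicorn or nginx timeout.
# Output format is file:line-number:matching-line.
show_timeouts() {
    for f in "$@"; do
        # Skip files that do not exist or cannot be read
        [ -r "$f" ] || { echo "skip: $f (not readable)" >&2; continue; }
        # Match nginx's proxy_read_timeout, or gunicorn's -t / --timeout followed by a number
        grep -nHE -e 'proxy_read_timeout|[[:space:]](-t|--timeout)[[:space:]=]+[0-9]+' -- "$f" || true
    done
    return 0
}
```

Usage would be something like: show_timeouts ~/frappe-bench/config/supervisor.conf ~/frappe-bench/config/nginx.conf. If the smallest number it prints is 120, that would be the first place to look.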
Adjusted parameters via
nano ~/frappe-bench/config/supervisor.conf
- Increased the number of workers to 8 (from default of 2)
- Increased timeout to 360s (from default of 60s). 360s worked on some STE’s, but there is one particular BOM type that still failed, so I upped it to 480 [fail], 600 [fail], and eventually 900, which worked for those.
- Changed listener to 0.0.0.0 (from default of 127.0.0.1) so that it listens on all interfaces, just in case of DNS type annoyances as outlined in one section of the old post…
and updated/reloaded it with
sudo supervisorctl reread
sudo systemctl restart supervisor
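On the reload step: supervisorctl reread by itself only re-parses the config files and reports what changed; it is supervisorctl update that applies those changes to the affected programs. A possible alternative to restarting the whole supervisor service, as a sketch only. apply_supervisor_changes and the DRY_RUN switch are hypothetical names, not part of supervisor or bench:

```shell
#!/bin/sh
# Hypothetical helper: apply an edited supervisor config without restarting supervisord.
# Set DRY_RUN=1 to print the commands instead of running them.
apply_supervisor_changes() {
    # reread: re-parse config files; update: restart programs whose config changed;
    # status: show what is running afterwards
    for step in reread update status; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "sudo supervisorctl $step"
        else
            sudo supervisorctl "$step" || return 1
        fi
    done
}
```

Running DRY_RUN=1 apply_supervisor_changes first shows the three commands without touching anything.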
I now have
ps -ef | grep guni
/home/frappe/frappe-bench/env/bin/python /home/frappe/frappe-bench/env/bin/gunicorn -b 0.0.0.0:8000 -w 8 -t 360 frappe.app:application --preload --limit-request-line 0
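Since edited values do not always reach the running process, it can help to read the flags back from the live command line rather than from the config file. A small sketch, assuming the flag and its value are separated by a space as in the ps output above; flag_value is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read one command line on stdin and print the value
# that follows the given flag (e.g. -t or -w). Prints nothing if the flag is absent.
flag_value() {
    awk -v flag="$1" '{
        for (i = 1; i < NF; i++)
            if ($i == flag) { print $(i + 1); exit }
    }'
}
```

For example, ps -eo args | grep '[g]unicorn' | head -n 1 | flag_value -t prints the timeout the running master was actually started with (360 for the command line shown above).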
With these settings I can (albeit slowly) submit them correctly. So, to summarise, what is working is as follows:
- Save/Submit the STE
- When it times out…just LEAVE IT untouched…it continues in the BG
- If the timeouts are adequate, then it will finish its submit process - as long as you don’t mess with it.
- The CPU utilisation will remain at ~10% (depends on what you have in there: i7 ~10%, i5 ~35% etc) until the process finishes, properly times out, or bombs
- If the CPU drops back to very low (ie <5%) and the screen doesn’t yet show “Submitted”, then the timeout was NOT adequate, and the process is incomplete
- you’ll need to increase the timeouts further, and restart the supervisor processes like this
- nano ~/frappe-bench/config/supervisor.conf and increase the -t 60 to a higher number, like say 900 for the horrendously slow stuff
- sudo systemctl restart supervisor
- I found that using this
⇒ supervisorctl reread
⇒ supervisorctl stop all
⇒ supervisorctl start all
did not load the new/adjusted parameters, while restarting the entire process did
- The “workable” submit - which eventually finished after ~13mins of nothingness
- The 3rd submit attempt took “only” 6mins to complete, and the 4th one took only 5mins, so definitely got much faster
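The "leave it alone and watch the CPU" check above can be scripted instead of eyeballed. This is a rough sketch, Linux only, that samples the CPU time of the gunicorn processes from /proc and reports when they go idle. All function names are hypothetical, the pgrep pattern is taken from the ps output earlier, and the percentage is relative to one core (like top), not the whole system (like the GUI):

```shell
#!/bin/sh
# Sum utime + stime (in clock ticks) for the given PIDs, read from /proc/PID/stat.
cpu_ticks() {
    total=0
    for pid in "$@"; do
        [ -r "/proc/$pid/stat" ] || continue
        # Drop "pid (comm) " so a process name containing spaces cannot shift the fields;
        # utime and stime are then fields 12 and 13 of what remains.
        t="$(sed 's/^.*) //' "/proc/$pid/stat" 2>/dev/null | awk '{ print $12 + $13 }')"
        total=$((total + ${t:-0}))
    done
    echo "$total"
}

# cpu_percent BEFORE AFTER SECONDS TICKS_PER_SECOND -> integer percent of one core
cpu_percent() {
    echo $(( ($2 - $1) * 100 / ($3 * $4) ))
}

# watch_submit [INTERVAL_SECONDS] [IDLE_LIMIT_PERCENT]
watch_submit() {
    interval="${1:-5}"
    limit="${2:-5}"
    hz="$(getconf CLK_TCK)"
    while :; do
        pids="$(pgrep -f 'frappe.app:application')" || {
            echo "no gunicorn processes found" >&2
            return 1
        }
        before="$(cpu_ticks $pids)"
        sleep "$interval"
        after="$(cpu_ticks $pids)"
        pct="$(cpu_percent "$before" "$after" "$interval" "$hz")"
        echo "$(date +%T) gunicorn CPU: ${pct}% of one core"
        if [ "$pct" -lt "$limit" ]; then
            echo "workers look idle: either the submit finished or it was killed"
            return 0
        fi
    done
}
```

Start watch_submit in a terminal on the server right after pressing Submit. If it reports idle before the browser shows “Submitted”, that matches the "timeout was NOT adequate" case described above.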
I do NOT know why this happens, nor how to actually fix it.
- hopefully someone who knows what’s what will see these notes/observations, have a eureka moment and change something in the mainstream code which can address/fix it properly
- I have looked at Submit Stock Entry taking too long time which describes a similar problem, in that the transactions are backdated, but I am not using a script like in that post, and don’t know how to do such things (yet).
- There is an observation made in that same post about the fact that STE is uniquely slow
- The post is from 2016, so I’m guessing that things in the code have changed - but apparently not quite enough just yet
Here are some pretty pictures to show the observations
- CPU on server during a typical submit
- Timeout on submitting PC screen - if this happens, just leave it alone and it’ll hopefully finish the BG submit
- IF the CPU on server during the submit+timeout drops before you get the “Submitted” message with a “ting”, then your timeout is inadequate.
- Submit (eventually/successfully) despite timeout if you left it running anyway
Hopefully this will be helpful. Comments/clues/help welcomed if there is a better way.