For a custom use case, we are trying to change the COA (chart of accounts) of a company by adding and modifying multiple ledgers (on the order of 1,000) and also import tens of thousands of vouchers into the system. We are facing the following issues:
i) COA changes are CPU heavy and do not let any other operations go through. Even on a quad-core machine the CPU is pegged at 200% (2 full cores); although 2 more cores are available, MariaDB is not using them. All other operations, such as statement generation, are timing out.
ii) The voucher import is taking a very long time, on the order of 8-10 hours for 15K vouchers. Again, the CPU pegs at 200% while the vouchers are being inserted.
iii) Our use case calls for a multi-tenant system, i.e., voucher import and COA modification could happen for multiple companies at once. Also, there could be users on the system from different companies doing other routine things. The above 2 points cause one tenant to block other tenants’ operations. Is a separation of queues possible, so that the operations of one tenant go to a separate queue and do not hinder another tenant?
For iii) to be effective, i) and ii) have to be addressed satisfactorily. Please share your experiences and strategies for the above problems.
Seems like you are onto an interesting deployment. Did you stumble upon this page, and have you taken care of the things it covers?
Hope this helps.
trying to change the COA of a company by adding and modifying multiple ledgers (order of 1000) and also import 10000s of vouchers into the system
Can you explain the process you are following to do the above? Is it with a script or using the Data Import tool?
We are trying to do a bulk insert of accounts followed by vouchers in a similar way in a background job. This process takes a very long time (8-10 hours for 15K vouchers, as mentioned by @karanth), and when we try other operations while these background jobs run, the jobs end with the error “pymysql.err.OperationalError: (1213, ‘Deadlock found when trying to get lock; try restarting transaction’)”. Your views on overcoming this deadlock exception and also on speeding up the inserts would be very helpful.
This deadlock issue is most likely due to the naming series. The voucher creation process takes an exclusive lock on the naming series row, and when multiple processes try to access the same row, a deadlock arises.
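As a stopgap while the contention remains, MariaDB error 1213 is explicitly retryable (“try restarting transaction”). Below is a minimal, hypothetical retry helper, not code from this thread; `run_with_deadlock_retry` and its parameters are made up for illustration. With pymysql you would pass in a closure that executes the statement and commits, and roll back before each retry.

```python
import time

DEADLOCK_ERRNO = 1213  # MariaDB: "Deadlock found when trying to get lock"

def run_with_deadlock_retry(execute, retries=3, base_delay=0.01):
    """Call execute(); if it raises an exception whose first argument is
    the MariaDB deadlock error code (1213), back off briefly and retry."""
    for attempt in range(retries):
        try:
            return execute()
        except Exception as exc:
            errno = exc.args[0] if exc.args else None
            if errno != DEADLOCK_ERRNO or attempt == retries - 1:
                raise  # not a deadlock, or retries exhausted
            time.sleep(base_delay * (attempt + 1))  # linear backoff
```

Retrying only papers over the symptom; the real fix is removing the hot lock on the naming series row, as described below.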
The approach we follow is a bit unconventional: we create raw CSV files that are pre-filled with the voucher numbers, and all the validations are done in scripts during CSV creation. These CSV files are then loaded directly into the database using MySQL’s `LOAD DATA INFILE`, which is nearly instant. After this process, the naming series is updated as well to reflect the last count.
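A sketch of what that bulk-load statement could look like, built from Python. The table name and CSV layout here are examples, not the poster’s actual schema; the statement would be executed over a pymysql connection opened with `local_infile=True`.

```python
def load_data_sql(csv_path: str, table: str) -> str:
    """Build a LOAD DATA statement for a pre-validated CSV whose voucher
    numbers are already filled in (so no naming-series lock is needed)."""
    return (
        f"LOAD DATA LOCAL INFILE '{csv_path}' "
        f"INTO TABLE `{table}` "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n' "
        "IGNORE 1 LINES"  # skip the CSV header row
    )

# Example table name only; run via cursor.execute(sql) on a pymysql
# connection created with local_infile=True.
sql = load_data_sql("/tmp/vouchers.csv", "tabSales Invoice")
```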
Another issue in the above process is that other vouchers being created by users can result in duplicate voucher names. To solve this, we have a dedicated background job queue to which all voucher imports are queued; this makes sure voucher creation is always sequential. This approach also lets us stop the queue workers to do bulk inserts via CSV, and resume the queue after the bulk import is done.
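The dedicated-queue idea can be illustrated with the standard library alone. In a real deployment this would be a separate Frappe/RQ worker queue, but the sketch below (all names hypothetical) shows why a single worker draining one queue makes voucher creation strictly sequential, and how a sentinel lets you pause the worker around a bulk CSV load.

```python
import queue
import threading

voucher_jobs = queue.Queue()
created = []

def worker():
    # One worker draining one queue => jobs run strictly one at a time,
    # so voucher numbers are assigned in order with no duplicates.
    while True:
        job = voucher_jobs.get()
        if job is None:   # sentinel: stop the worker (e.g. before a bulk
            break         # CSV load); start a fresh worker afterwards
        created.append(job())

t = threading.Thread(target=worker)
t.start()
for i in range(1, 4):
    voucher_jobs.put(lambda i=i: f"VCH-{i:04d}")  # stand-in for "create voucher"
voucher_jobs.put(None)  # drain and stop
t.join()
```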
Thanks a lot for sharing the details @vkcr
While doing the CSV import, how do you make use of Frappe triggers like on_submit that get called on voucher submission? I understand that validate triggers can be handled by scripts, as you suggested. But how do we update ledger entries? Do you have a post-import script or something that does this? Please share your take on this too.
Thanks for your inputs.