[Performance] 1 Sales Invoice Save makes ~700+ DB calls. ~1000+ DB calls on Submit

~4 times lighter and faster ErpNext and more than 4 times happier Community

This article summarise the code changes done as part of the optimisation sprint we had in Frappe office 2 days back…

To start with, following is short ‘Before’ and ‘After’ summary -

Transaction - Submitting Sales Invoice containing 51 Items.

Bench Console

In [3]: si=frappe.get_last_doc('Sales Invoice')

In [4]: len(si.items)
Out[4]: 51
In [2]: %prun -s cumulative si.submit()

Results

Before optimisations
Time Taken : 4.56s
Number of SQL calls : 3625

After optimisations
Time Taken : 1.511s [First Time] and 0.981s [Subsequent Calls]
Number of SQL calls : 1070

It was evident that in a single transaction, same document was getting read from database again and again. get_doc and get_value method was getting called multiple times for the same document and same field.

General approach was to cache repetitively accessed documents in memory to reduce number of database calls. Following is the list of key changes done.

get_cached_doc instead of get_doc

First optimisation done was to cache documents in the local in-memory request level cache. A new request level cache is added in `frappe/__init__py`
def init(site, sites_path=None, new_site=False):
	"""Initialize frappe for the current site. Reset thread locals `frappe.local`"""
       ....
       ....
       ....
       local.document_cache = {}

NOTE -
frappe.local is a request level cache and gets cleared before each request.
frappe.flags is also request level cache

A new method frappe.get_cached_doc is introduced as follows -

def get_cached_doc(*args, **kwargs):
	if args and len(args) > 1 and isinstance(args[1], text_type):
		key = get_document_cache_key(args[0], args[1])
		# local cache
		doc = local.document_cache.get(key)
		if doc:
			return doc

		# redis cache
		doc = cache().hget('document_cache', key)
		if doc:
			doc = get_doc(doc)
			local.document_cache[key] = doc
			return doc

	# database
	doc = get_doc(*args, **kwargs)

	return doc

The above method implements progressive checks in local request cache then in global redis cache and then falls back to database to fetch a document.

the get_doc method is also modified to put the documents retrieved from database into cache as follows -

def get_doc(*args, **kwargs):
	"""Return a `frappe.model.document.Document` object of the given type and name.
         ...
         ...
         ...
	# set in cache
	if args and len(args) > 1:
		key = get_document_cache_key(args[0], args[1])
		local.document_cache[key] = doc                    <-- set in request level cache
		cache().hset('document_cache', key, doc.as_dict()) <-- set in global redis cache

and then there were selective find and replace based on the %prun output analysis. e.g. One sample %prun record showing lot of time spent in get_item_details method.

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
50    0.002    0.000    **1.809**    0.036 get_item_details.py:16(get_item_details)

All such high time taking methods were identified and get_doc calls were replaced with get_cached_doc in the same.

Following is %prun output after this change

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
50    0.002    0.000    **0.529**    0.011 get_item_details.py:19(get_item_details)

get_cached_value instead of get_value

Similar to ```get_doc```, ```get_value``` was another method which was getting called many time for same doctype and field. This was another candidate to be cache enabled.

Following method is introduced which internally calls get_cahed_doc and avoids a database call altogether.

def get_cached_value(doctype, name, fieldname, as_dict=False):
	doc = get_cached_doc(doctype, name)        <-- Gets cached doc. 1 db call saved per call
	if isinstance(fieldname, text_type):
		if as_dict:
			throw('Cannot make dict for single fieldname')
		return doc.get(fieldname)

	values = [doc.get(f) for f in fieldname]
	if as_dict:
		return _dict(zip(fieldname, values))
	return values

Then another selective find and replaced was done after analysing %prun output. Following number of occurrences of get_value were replaced

➜  apps grep -R 'get_cached_value' . | grep -v pyc | wc -l
     186

Fixed Meta caching

Rushabh found out that doctype meta-data was not actually getting saved in redis cache. It was failing to serialise doctype metadata and silently eating up the exception.

A custom serialisation method was introduced to serialize meta data as follows -

	def as_dict(self, no_nulls = False):
		def serialize(doc):
			out = {}
			for key in doc.__dict__:
				value = doc.__dict__.get(key)

				if isinstance(value, (list, tuple)):
					if len(value) > 0 and hasattr(value[0], '__dict__'):
						value = [serialize(d) for d in value]
					else:
						# non standard list object, skip
						continue

				if (isinstance(value, (frappe.text_type, int, float, datetime, list, tuple))
					or (not no_nulls and value is None)):
					out[key] = value

			return out

		return serialize(self)

After the custom serialization, redis started caching it. This was a big improvement reducing number of database calls significantly as it was getting used very frequently.

Although, these changes would improve performance of most of the transactions because of common codebase, we should %prun analyse every transactions to identify most time taking methods and selectively replace get_doc() and get_value() with get_cached_doc() and get_cached_value().

36 Likes