Frappe - Performance - Naming Series

Frappe allows automatically naming documents. One of the autoname options is ‘Naming Series’ as follows -

When naming_series option is used, a numerical sequence number is maintained in tabSeries table against each series key. e.g SO-, SINV-, PO- etc.

Every time a new document is created, getseries function in naming.py is called to get the next sequence number to be used for the document. The function is as follows -

def getseries(key, digits, doctype=''):
	# series created ?
	current = frappe.db.sql("select `current` from `tabSeries` where name=%s for update", (key,))
	if current and current[0][0] is not None:
		current = current[0][0]
		# yes, update it
		frappe.db.sql("update tabSeries set current = current+1 where name=%s", (key,))
		current = cint(current) + 1
	else:
		# no, create it
		frappe.db.sql("insert into tabSeries (name, current) values (%s, 1)", (key,))
		current = 1
	return ('%0'+str(digits)+'d') % current

The above function locks a row in tabSeries table for update and increments its value by 1. The row remains locked until the entire operation of saving the document is completed. This essentially means we cannot create multiple documents of same type concurrently.

What If, we do following -

At the server start, reserve lets say 100 sequence numbers and allocate the same from memory to new documents getting created. Reserve 100 more when we run out of sequences. We dont have to lock a row in tabSeries for each document and can essentially create multiple documents concurrently.

e.g. consider current naming series values for Sales Order doctype

tabSeries
name            current
SO-               123

At server startup we reserve 100 sequence numebrs and update the value as follows -

tabSeries
name            current
SO-               223

and in the get_series method, keep on returning the value from memory keeping track of last issued value.

We can handle error scenarios like server restart etc by resetting the value at startup based on the last document created of any type.

Any views?

6 Likes

it’s a good suggestion but error handling will be very critical.
there will be different use cases where chances are high to skip any number from sequence.

and it’s right time to address these performance issues with high priority.

1 Like

tabSeries is horrible for mass inserts. We pretty much avoid it on anything that needs to be scaled.

This solution though is interesting. However, resetting the value on startup will be an expensive operation.

What about moving this to Redis?

2 Likes

A cache of sequence will always generate holes in sequencing.
Mass insert should use a mass reservation when the number of insertion is known.

Naming series is turning out to be a bottleneck for all the operations and we need to nail the alternate approach down in order to better scale the product. Approach we plan to take -

  1. Introduce a new auto_name type - fast_naming_series
  2. Wrap the access to the global dict with Semaphore to address concurrency problems
  3. Lazy load current sequence number for each series

Psudeo logic below -

def db_insert(self):
      if naming is fast_naming_series:
             fast_series_sema.aquire()
             new_sequence = get_fast_series(key)
      
     frappe.db.sql (''' insert into ...''')
    
     if naming is fast_naming_series: 
           fast_series_sema.release()           

def get_fast_series (key):
       if not global_series_dict [key]:
          lazy load by finding last document with this key
      return global_series_dict ++[key];

The cached sequence number will be protected by a Semaphore which will lock this critical path only till the insert sql statement executes.

2 Likes

@Sachin_Mane Good Suggestion and Even I was thinking about Naming series bottleneck since quite some time.

Does your proposed algo considers :

  1. Phantom Read and
  2. Data Consistency Issues.

The Lock might have performance issues, but it surely ensures Consistency.

Regards,

Parth Joshi

As of now we started using hash as autoname for common tables like gl entry and stock ledger entry.

We are still working on a in memory /Redis backed naming series POC.

Meanwhile please go through [Performance] 1 Sales Invoice Save makes ~700+ DB calls. ~1000+ DB calls on Submit for other performance optimisation done

3 Likes