docs.frappe.io gives bad advice on item codification, no?

alp · January 17, 2025, 7:49am

The Doc

In this post I’ll be talking about the page on Item codification, from here on referred to as “The Doc”: Item Codification

Not a repost

I searched for “item codification” and got only 21 results. The most recent post was from 2023 and unrelated to this topic, and the rest were all at least several years old. Hence this post.

What I am against

The Doc says:

You should have a simple manual / cheat-sheet to codify your Items instead of just numbering them sequentially. Each letter should mean something.

(Granted, the page is for v14 but there is no v15 page so I assume this is still the official stance of ERPNext on this matter.)

My stance

The Doc gives bad advice, if you ask me. I am very new to ERPNext, and somewhat new to life, so nobody would ask me. Therefore I’ll have to cite other people. I could cite our business partners, but you wouldn’t know any of them. So I’ll cite Martijn Dullaart’s book, Essential Guide to Part Reidentification, and a principle all software engineers know: Avoid unnecessary coupling.

Dullaart’s stance

Chapter 2, rules 1 & 6 basically say “just use numbers and they shouldn’t mean anything.” Pretty much the exact opposite of what docs.frappe.io says. You can take a look at the contents from this webpage and see for yourself. I recommend reading the other rules as well. Rules 2 & 3, about length, are especially relevant.

Later in the book, exceptions to these rules are mentioned. One exception is catalogue parts. But The Doc talks about all items, not just catalogue items. If it were just catalogue items, I wouldn’t be writing this. But “the term Item is also applicable to raw materials or components of products yet to be produced” (from here).

Another exception is when you don’t have access to an information system, i.e. you cannot automatically convert a meaningless integer to meaningful attributes.

One other exception - and this is my addition - is ubiquitous stuff, such as:

Incoterms (FCA, EXW, …)
Currencies (such as EUR, USD…), per ISO 4217
UOMs (such as meter, kg…)

Such things usually have few (1-3) attributes anyway and people know most of the important ones by heart. For instance, a currency will have an abbreviation and a long name, such as EUR and euro. These are basically enums.

Are Items an exception to Dullaart’s rules? I don’t think so. If you have few enough, say, fewer than 50 items, then perhaps it’s okay. But in general, no. The Doc approach is unnecessary coupling. You’d be tightly coupling the attributes to IDs.

Introduce an extra layer

The solution to this coupling problem is introducing meaningless IDs, I believe: Separate the attributes into their own fields. When/if you need to have a codification for your catalogue, build it only then. So the codes will be implicitly stored, not explicitly. Sort of like tables (explicit) vs. views (implicit). I also see this as analogous to dependency inversion.
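To make the table-vs-view analogy concrete, here is a minimal Python sketch. Everything in it is made up for illustration (the `Item` fields, `new_item`, `catalogue_code`, the code format); it is not how ERPNext stores items. The ID is a meaningless counter, the attributes live in their own fields, and the catalogue code is computed only when someone asks for it.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Item:
    item_id: int    # meaningless sequential ID; never changes
    material: str   # attributes live in their own fields
    shape: str
    size_mm: int


_next_id = count(1)  # hypothetical ID source; a database sequence in practice


def new_item(material: str, shape: str, size_mm: int) -> Item:
    """Create an item with the next sequential ID."""
    return Item(next(_next_id), material, shape, size_mm)


def catalogue_code(item: Item, letters: int = 2) -> str:
    """Derive a human-readable code on demand (the 'view'); it is never stored."""
    return f"{item.material[:letters]}-{item.shape[:letters]}-{item.size_mm:03d}".upper()


items = [new_item("steel", "rod", 12), new_item("stainless", "rod", 12)]

for item in items:
    print(item.item_id, catalogue_code(item), catalogue_code(item, letters=3))
```

With two letters per attribute, both items come out as ST-RO-012. Widening the scheme to three letters gives STE-ROD-012 and STA-ROD-012, and nothing that references IDs 1 and 2 has to change.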

Focusing back on ERPNext

As you can see, the issue of codification is not specific to ERPNext. It would apply to any system. So most of my post is not ERPNext specific either. But here’s an ERPNext-specific observation that will further support my point: What is the limit on number of Items in ERPNext? - #10 by adnan

Performance is better with integers. I’m not surprised.

Benefits mentioned on The Doc:

(1) Standard way of naming things.

“Sequential codes” is also a standard way. No difference here.

Since the codification process described in The Doc is more complex than just giving sequential integers, I could argue that it is more prone to errors and therefore less likely to follow a standard way.

(2) Less likely to have duplicates.

“Sequential codes” seems to be worse in this, indeed.

But… Duplicates are still a problem in both approaches. Even the page itself says “less likely”, not impossible. We have been following such a codification for about 10 years. Now we have 400,000 items and lots of duplicates.

I’d rather run nightly checks on the database to see if any two items are similar, based on their attributes, not IDs. I’d use “fuzzy matching”. If two items are close enough, report to an “Item authority”. What is “close enough”? One way to define this is string distance.
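A rough sketch of such a check, using only Python’s standard library. The item data, the `likely_duplicates` name, and the 0.9 threshold are my assumptions for illustration; `difflib.SequenceMatcher` gives a similarity ratio between 0 and 1, which is one of several possible stand-ins for “string distance”.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1]; 1.0 means the strings are identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_duplicates(descriptions: dict[int, str], threshold: float = 0.9):
    """Return (id_a, id_b, score) for every pair at or above the threshold."""
    pairs = []
    for (id_a, desc_a), (id_b, desc_b) in combinations(descriptions.items(), 2):
        score = similarity(desc_a, desc_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical items: meaningless ID -> attribute text
descriptions = {
    1001: "hex bolt m8 x 40 zinc plated",
    1002: "hex bolt m8x40 zinc plated",
    1003: "copper pipe 15 mm",
}

for id_a, id_b, score in likely_duplicates(descriptions):
    print(f"review {id_a} vs {id_b}: similarity {score}")
```

Comparing every pair is quadratic, so with hundreds of thousands of items you would first narrow the candidates (for example, only compare items within the same group) before running the string comparison.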

And then there’s the issue of non-duplicates. Maybe you started out with a codification scheme that uses four letters. You are happily differentiating between ABCD and IJKL and XYZW, until one day, you get a duplicate alarm: ABCD already exists. “But, no!”, you protest, “This ABCD is ABCDF. What we have is ABCDE. I wish we had used five letters.”

If you were to augment your codes, can you replace every affected document, every record in your database? I sure couldn’t. If you have an idea I am missing, I am all ears. (This is one of the reasons I’m posting this.)

(3) Explicit definition.

I don’t understand this. The attributes that the codification scheme uses will be explicitly present in other fields in either case. The difference is that, in The Doc approach, the same information is saved twice: Once in the code, once in attributes.

(4) Helps to quickly find if a similar item exists.

I doubt searching the code is faster than searching in the attributes. It could be in ERPNext, but in general, I don’t see why this must be the case. Precomputation should be possible on whatever attribute you want.
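What I mean by precomputation, as a toy sketch: build a lookup from (attribute, value) to item IDs once, then answer “does something like this exist?” with set intersections. The data and the `build_index` / `find` helpers are hypothetical; a real database would do the same job with indexes on the attribute columns.

```python
from collections import defaultdict

# Hypothetical items: meaningless ID -> attributes
items = {
    1: {"material": "steel", "shape": "rod", "size_mm": 12},
    2: {"material": "steel", "shape": "plate", "size_mm": 3},
    3: {"material": "copper", "shape": "rod", "size_mm": 12},
}


def build_index(items):
    """Precompute (attribute, value) -> set of item IDs."""
    index = defaultdict(set)
    for item_id, attrs in items.items():
        for key, value in attrs.items():
            index[(key, value)].add(item_id)
    return index


def find(index, **wanted):
    """Return IDs of items matching every requested attribute value."""
    sets = [index.get((key, value), set()) for key, value in wanted.items()]
    return set.intersection(*sets) if sets else set()


index = build_index(items)
print(find(index, material="steel", shape="rod"))
print(find(index, shape="rod", size_mm=12))
```

No part of the search depends on what the ID looks like, which is the point: the lookup works on any attribute, not just the ones that happened to make it into a code.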

Maybe if your employees have a certain way of working, maybe if they know regex… Then searching by codes could be beneficial. Not sure.

Speaking from our company’s experience: We cannot always find similar items quickly.

Besides… Will your employees even attempt to find if a similar item exists? There are enough people who don’t have the habit of searching. Plenty enough to make letmegooglethat.com possible.

(5) Item names get longer and longer as more types get introduced. Codes are shorter.

Sequential integers are even shorter!

… the item codes will help you quickly determine if you are using a similar raw material in another product.

(↑ This one is later on The Doc page.)

Can’t I filter by raw material? I suppose what is meant here is that I cannot filter by raw material in the Preview. But I sure can filter in the List View, and through the API, and that should be enough. This is no reason to over-engineer the identification codes.
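For the API route, here is a small sketch that only builds the request URL for the REST endpoint `/api/resource/Item` with a JSON-encoded `filters` parameter. The host name is a placeholder, and filtering on `item_group` = “Raw Material” is my assumption about how raw materials are marked; in your setup the right filter might be on a different field or doctype (a BOM, for instance).

```python
import json
from urllib.parse import urlencode


def item_list_url(base_url, filters, fields=("name", "item_name")):
    """Build a list-query URL for the Item doctype; filters and fields are JSON-encoded."""
    query = urlencode({
        "filters": json.dumps(filters),
        "fields": json.dumps(list(fields)),
    })
    return f"{base_url.rstrip('/')}/api/resource/Item?{query}"


# Placeholder host and an assumed filter; adjust both to your site.
url = item_list_url(
    "https://erp.example.com",
    [["item_group", "=", "Raw Material"]],
)
print(url)
```

Sending the request additionally needs authentication (an API key/secret token header or a logged-in session), which I left out here.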

Now the challenges mentioned on The Doc:

You have to remember the codes!

Harder for new team members to pick up.

Yes from me and yes from Martijn Dullaart. You should keep it stupidly simple (KISS). Having to remember the codes, to “pick up” the codes is the opposite of KISS.

You have to create new codes all the time.

I don’t see how this is a challenge. If your inventory is growing, sure, you need new codes. But the computer will create them in either case. No difference here.

Here’s another challenge, coming from my experience:

If your company has a simple manual / cheat-sheet to codify your items, someone will have to work to maintain that document as well. Unless that someone is above-average nitpicky, people will leverage the areas unspecified in the manuals to create their own schemes. Then after a few years those schemes will be added to the original scheme… While some of the original rules are becoming laxer and laxer… Urgh… Yet another column of impossible-to-analyze garbage.

Why I posted this

We are at the early stages of ERPNext implementation. I’d like to be clear on such details, because it will save me a lot of trouble later.

I am not asking for free consultation. We’ve already had ERPNext consultations, and also a “codification” consultation. We have a budget set aside for this and we will continue to consult. But the consultants are not infallible. Neither can The Doc claim to be. I acknowledge the experience behind ERPNext. I just want to understand.

I believe this discussion will be beneficial not only for me but for the community.