Testing ERPnext 700,000 rows of data

will do pdvyas
Thanks for the hint

Any update? :slight_smile:

Hi All,

Any updates regarding the change to frappe to improve large data sets? I have a project that requires importing 50mill records monthly and wondering if erpnext is a fit.

I would love to see that it is so how can I have to make it so? Funding, did we get enough, how much is required?


1 Like

@rmehta I am highly interested in this observation.I am about to use ERPnext for a client that the customers list will reach 3 million and heavy transactions will be on daily.
Let our amiable team let us now the position now.

Maybe ill share some of my experience.

I Currently have 2 Million+ row Item Master Database
and around 3 Million Transactions on different Modules

VM specs:
4 Cores
8GB (6GB dedicated to maria DB)
Users: 10 Simultaneous Users on Different Modules

When making a transaction it takes 2-5 seconds to load an item from the link field, before you can select it.
But from other masters with just 2000 rows like suppliers it takes only 1-2 secs.

[Generating Reports]
When Generating Reports (Script Reports) it takes 10 secs to load a report ranging from a month.
But When it comes to financial Analytics Reports it takes a min or sometimes it says “Something went wrong”

[Importing Data]
I imported my data via database directly.

I tried burrowing some RAM from other VM’s, noticed a great increase in performance

The whole system runs smooth, but when somebody generates a report for a year, it slows down

So for now i can say that.
Erpnext can handle heavy transactions. as long as you have a good hardware.

For the 50 million data ,assuming that its from a different source, i am more concern on how to convert that data to erpnext readable data plus you have to do it monthly. :scream:


@rmehta Do you have a rough estimate (monetary wise) for such a transition?

The architecture with the varchar primary keys is a good thing under the point of view of making imports to the system. We are working with Talend and create the primary keys according the naming capabilities of ERPnext. This is important, when putting dependent things together when designing imports.

Of course the aspect of speed on the database can be a concern on big databases. It is well known, that MySQL is slow on inserting, but fast on reading.
Also Integer indexes are always faster than varchar indexes.

But i think, with a proper fitted machine for the database, things can be handled. An SAP with Oracle is also very slow, when running Oracle on a 2GB / 1 core Box.
Opensource users are expecting the software to run on cheap boxes. This is not the case, when you need horsepower.
There must be a big iron. Much cores and RAM can only be replaced by MORE cores and RAM… :slight_smile:
Also fast discs and maybe usage of SSDs are a very important thing.

Because of this i would begin testing with running the MySQL database server on a dedicated machine / VM and the ERPnext server also on a separate machine.

I know, that the Frappe team has put it all into one machine because of the simplicity in deployment and hosting. For small companies or companies with a small or medium amount of transactions this is totally enough.

In short terms:

  1. As stated in the posts above the database seems to be the bottleneck. So i would focus on index optimization and / or partitioning of data. Especially transaction data can be partitioned, because old data are used only sometimes. MariaDB supports partitioning.
  2. Then i would place the MySQL on a separate big iron / VM and test again vs the monolithic system.
    Also important is a good connection between the ERPnext server and the MySQL server. GBit-network or better. This can be an issue when hosting in the cloud or datacenter, where you have no influence on the peering.

I hope this helps a little bit… :slight_smile:


We are using ERPnext for what we consider a small/medium scale manufacturing concern, with plans to extend to another much larger concern. Currently we have around half a million entries in 6 months, using a long item names of 20-40 characters. We see transaction speeds of 1-2 minutes per transaction, eg. when doing stock movements or processing work orders.

As this is a very old thread, I’m wondering if this issue was resolved in future editions? Or if there is any solution?

1 Like

It appears that having Interger or Varchar as primary key should not make any difference. See this link:

1 Like

Rather it is the WIDTH (length, or number of characters?) that makes a difference in performance:

VARCHAR vs. INT doesn’t tell much. What matter is the access pattern.

On absolute terms, a wider key will always be worse than a narrow key. The type carries absolutely no importance, is the width that matters. When compared with INT though, few types can beat INT in narrowness, so INT usually wins that argument just by the fact that is only 4 bytes wide.

But what really matters is the choice of clustered key. Often confused with the primary key, the two represent different notions and are not required to overlap. Here is a more detailed discussion Should I design a table with a primary key of varchar or int?

@claily I assume you have dedicated 6GB of RAM by change value of innodb_buffer_pool_size = 6G

which can be found under /etc/mysql/my.cnf ? Right ?

Yes. .

Strange, I have set it to 18G as I have installed 24GB of RAM. But the system is not utlizing allowed RAM, as it should to increase the performance by resulting in lower time execution.

you should try running this query it should show 18G if you have configured it correctly

SELECT @@innodb_buffer_pool_size/1024/1024/1024;

if it does not show 18 maybe your mysql is reading a different .cnf try locating something like frappe.cnf or somewhere in /etc/mysql/conf.d/

1 Like

Yes, you were right.

Settings must be PUT under [mysqld] to take effect.


SELECT @@innodb_buffer_pool_size/1024/1024/1024;

is showing 18 (what I have set) inside my.cnf file.