I need to upload a huge XML file (~200 MB) to import data from another ERP application. I am facing problems such as the browser crashing and the Python back-end becoming unstable.
JavaScript:
frappe.pages['my-data-import'].on_page_show = function(wrapper) {
    $("#btn_read").click(function() {
        var input = $("[type='file']").get(0);
        if (input.files.length) {
            input.filedata = { "files_data": [] }; // will hold the text of each file once its reader finishes
            $.each(input.files, function(key, value) {
                setupReader(value, input);
            });
        }
    });

    $("#btn_import").click(function() {
        // read from the same input element where setupReader stored the text
        var input = $("[type='file']").get(0);
        var filedata = input.filedata;
        if (filedata && filedata.files_data.length) {
            frappe.call({
                method: "~~path~to~method~~",
                args: {
                    "filedata": filedata.files_data.join(""),
                    "open_date": $('[name="exp_start_date"]').val()
                },
                callback: function(r) {
                    if (!r.exc) {
                        frappe.msgprint(__("Files uploaded"));
                    } else {
                        frappe.msgprint(__("Files not uploaded.<br />") + r.exc);
                    }
                }
            });
        }
    });
};

// FileReader is asynchronous: onload fires some time after readAsText()
// returns, so the result must be stored where the import handler can find
// it later. (A flag toggled right after the $.each loop, like the original
// window.file_reading, would be cleared before reading had even finished.)
function setupReader(file, input) {
    var reader = new FileReader();
    reader.onload = function(e) {
        input.filedata.files_data.push(reader.result);
    };
    reader.readAsText(file);
}
Python Method:
import json

import frappe
import xmltodict
from xmltodict import ParsingInterrupted


@frappe.whitelist()
def read_uploaded_file(filedata=None, open_date=None):
    if not filedata:
        return
    # open_date may also arrive JSON-encoded in the request's form_dict
    params = json.loads(frappe.form_dict.get("params") or "{}")
    if params.get("open_date"):
        open_date = params.get("open_date")
    # module-level variable so the item callback can see it (kept from the
    # original; passing it explicitly would be cleaner)
    global opening_date
    opening_date = open_date
    try:
        # `process` is the item_callback (defined elsewhere in this module)
        # that handles each record found at depth 5; xmltodict raises
        # ParsingInterrupted when the callback returns False
        xmltodict.parse(filedata, item_depth=5, item_callback=process)
    except ParsingInterrupted:
        frappe.db.rollback()
        return {"messages": ["There was a problem importing the file"], "error": True}
    frappe.db.commit()
    return {"messages": "Import Successful", "error": False}
Is there any way to achieve the above? I am not an expert in any of the above languages.
ERPNext restricts the amount of data you can upload in one file. The exact limit varies with the type of data, but it is usually safe to upload around 1000 rows of a table in one go. If the system will not accept the file, you will see an error.
Why is this? Uploading a lot of data can cause your system to crash, especially if there are other users working in parallel. Hence ERPNext restricts the number of “writes” you can process in one request.
@ganas I know there are upload restrictions, and I have already gone through the Data Import Tool; I don't want to use it. Rather, I want to process the XML at the back-end and insert the records there.
This data import is a one-time job, done at setup time, so breaking the data into chunks and uploading them seems interesting. Any idea how I can go about it?
I gave breaking up the data in JS a shot, but this would issue a lot of frappe.call requests. And even if it issues a lot of requests, how do we call them in sequence, i.e. start the second call only after the first one completes?
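One way to run the calls strictly one after another is to start the next frappe.call from the previous call's callback. A minimal sketch, assuming `batches` is an array of text chunks and reusing the placeholder method path from the code above:

// Fire the next frappe.call only after the previous one has returned.
function uploadBatches(batches, index) {
    index = index || 0;
    if (index >= batches.length) {
        frappe.msgprint(__("All batches uploaded"));
        return;
    }
    frappe.call({
        method: "~~path~to~method~~", // the same whitelisted method as above
        args: { "filedata": batches[index] },
        callback: function(r) {
            if (!r.exc) {
                uploadBatches(batches, index + 1); // only now start the next batch
            } else {
                frappe.msgprint(__("Batch failed:") + " " + r.exc);
            }
        }
    });
}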
Well, I have somehow managed to upload and process the data in batches (sequentially). I am still having some trouble at the back-end (I guess) and hope that someone can shed some light on it.
I have ~100,000 records in the XML file, consisting of Customers, Sales Invoices, Payment Receipts, Stock Items, etc. When I start importing the XML file in batches, the system stops responding after processing about 3000 records (and inserting about 500 of them, as there are some checks that need to pass before a record is inserted).
@Pawan I can't understand why to split files. As far as I know that has to be done using the ERPNext import tool, but wouldn't it be possible to write a custom app or a custom Python script in order to import big files? Am I wrong?
Split the text so obtained into tokens/batches of about 1000 records per batch (a rough splitting sketch follows this list).
Upload them one after the other (in a loop) to a custom Python function using frappe.call for further processing.
The function processes each batch and calls db.commit() after each batch is successfully processed. (And of course JavaScript then uploads the next batch; the process goes on till EOF.)
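The splitting step could look something like this. A rough sketch only: </Record> is a hypothetical placeholder for the actual closing tag of one record, and the file's outer wrapper element is assumed to have been stripped beforehand:

// Split on a record's closing tag and regroup the records into batches.
function splitIntoBatches(text, batchSize) {
    var records = text.split("</Record>")
        .filter(function(r) { return r.trim().length; }) // drop the empty tail
        .map(function(r) { return r + "</Record>"; });    // split() eats the delimiter
    var batches = [];
    for (var i = 0; i < records.length; i += batchSize) {
        batches.push(records.slice(i, i + batchSize).join(""));
    }
    return batches;
}

Fed into the sequential uploader sketched earlier, that becomes uploadBatches(splitIntoBatches(text, 1000)).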
I think handling anything above 1 MB of text in JavaScript is a killer for the browser, or simply an anti-responsive approach.
I would upload it using the regular Frappe API, e.g. from a button inside the ListView menu:
import frappe
from frappe.utils.file_manager import get_file_path


@frappe.whitelist()
def my_function(file_name):
    # file_name is the ID of a File doctype record, not the name of the uploaded file
    message = {}
    if file_name:
        file_doc = frappe.get_doc("File", file_name)
        file_path = get_file_path(file_doc.file_name)
        # here we can open the file and parse it
    else:
        message['error'] = "No file name given"
    return message
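On the client, only the File document's name needs to be sent once the file has been attached through Frappe's normal upload mechanism. A minimal sketch, with a hypothetical method path:

frappe.call({
    method: "my_app.api.my_function", // placeholder path to the method above
    args: { "file_name": file_doc_name }, // name of the File record, not the file itself
    callback: function(r) {
        if (r.message && r.message.error) {
            frappe.msgprint(r.message.error);
        }
    }
});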
@Marcin_Walkow Thank you for the inputs. Yes, it is not recommended, and also a little difficult, to upload huge files in Frappe.
I used a rather “Direct” method to process the uploaded data.
Quote from one of my previous posts:
1. Read the file in JavaScript using readAsText().
2. Split the text so obtained into tokens/batches (of about 1000 records per batch).
3. Upload them one after the other (in a loop) to a custom Python function using frappe.call for further processing.
4. The function processes each batch and calls db.commit() after each batch is successfully processed. (And of course JavaScript then uploads the next batch; the process goes on till EOF.)
@mehta_akshay:
You didn't get me: instead of splitting the file in JavaScript (where handling even 1 MB may be a killer), I would try to upload the whole 200 MB with a status bar. Of course you mostly have to pay attention to timeouts.
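A minimal sketch of that idea, assuming a hypothetical whitelisted endpoint (the path is a placeholder, and the web server's request-size and timeout limits would still need raising for a 200 MB body):

// Upload the whole file in one request and drive a status bar from the
// browser's native upload progress events.
function uploadWithProgress(file) {
    var form = new FormData();
    form.append("file", file);
    var xhr = new XMLHttpRequest();
    xhr.open("POST", "/api/method/my_app.api.receive_file"); // placeholder endpoint
    xhr.setRequestHeader("X-Frappe-CSRF-Token", frappe.csrf_token);
    xhr.upload.onprogress = function(e) {
        if (e.lengthComputable) {
            console.log(Math.round(e.loaded / e.total * 100) + "% uploaded");
        }
    };
    xhr.onload = function() {
        frappe.msgprint(xhr.status === 200 ? __("Upload complete") : __("Upload failed"));
    };
    xhr.send(form);
}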