This is a question directed mostly at the Frappe team…
One of the important functionalities of an ERP is analytics. Right now, we have basic analytics (stock/financial/sales analytics etc) but most of it is document-driven rather than change-driven. What I mean by this is that we have reports that give us $ booked in a certain time period, but we don’t have anything that gives us, for example, the date that X lead had their status change from “open” to “interested”. I’m not even sure that the data for that is stored anywhere - the row is just updated in the DB.
If one were to go about building something like this - and fitting with the customizable/extensible nature of the framework - would you want it built in Frappe, or as a separate application with doc hooks? The logic for storing information is rather trivial, but the design of a feature like this involves a lot of compromise. Do you have the user select which fields should be tracked? Or do you assume more control and track certain fields? Storing a copy of each iteration of a document is definitely too much data, and storing changes is better, but deciding who decides (for lack of a better phrase) which fields need to be tracked is a rather big decision.
The two ways I am toying with is adding a checkbox in docs similar to “hidden” or “report hide” that is “analytics field”, and run an event on validate to track changes made to that field in a different table. The other is a separate app that hooks into a set of documents (imagine one big table with doctype, docfield) and on any event where that field changes, it writes to a db.
From there, you can generate extremely powerful analytics that really empower business both on an individual level as well as an executive/management level. Check out places like www.insightsquared.com for some examples.
More reports - but they need something deeper to happen (unless the framework does this and I have no idea about it).
We have a custom field that’s used to estimate the close date of an opportunity/quotation. If the date moves back, i.e. the salesperson changes the close date to a later date, the opportunity has been “pushed”. By keeping track of when the field changes, you can generate a report of % of opportunities pushed by salesperson, by deal value, and so on.
This can be extensible to many other fields as well. It unlocks a deeper level of analytics than just the current state of documents.
This would be a great feature. Basically the idea would be to add a new attribute to fields that indicates whether or not changes to them should be tracked. Then there would be a doctype (or two) for doctype changes. I think two is better. The first (DocChanges) could be to track documents that have changed and the 2nd (FieldChanges) would be to track fields in the document in DocChanges that changed.
Change No (PK)
Date of Change
Time of Change
Type of Change (creation, update or deletion?)
Change No (PK)
Doc Types in which the name is the field that changed would update Doc Name in DocChanges as well as the Field Name, name, in FieldChanges.
Updates to the tables would be fired from a hook on update, delete, create etc.
The changes can be tracked without a hook if we want to edit the save method of the document class. This is all-encompassing BUT will result in large amounts of data.
I have a working prototype of a field changes table…it holds doctype, docname, field name, old value, new value, and date of change. It’s just done in the save method of the document class. If we want to do hooks + an app, things can get more complex but also more customizable - lots of pros and cons to discuss here. I think that it should end up in the framework, though, because if you later decide you need analytics, you’ll need the data to support it - and without the data, a newly-installed analytics app is useless.
@Chude_Osiegbu your DocChanges table can be made unnecessary by tracking the ‘modified’ and ‘modified_by’ fields of any document, along with the field changes above - unless I have missed something!
That would work great. When look at the Version List on our production server, it only lists versions of Web Page / Blog Page. Is this intended? Or does more need to be done to track versions of an Opportunity? (Or is it something on my end?)
Storing entire doctype changes as a version in a json blob is not going to scale well for most general purpose analytics.
To actually query things, you will have to do a select * where you can at best filter by ref_doctype, then process in python to find the records you’re looking for. If you have a million rows where only 10% of the rows have information you need, you will still need to pull all million rows. (And a million will seem small once you implement versioning).
It will work, however, if you’re planning to load the blob, process it through ETL, and then load it into a data warehouse of some sort, and then you run your analytics off that data warehouse.Also, if you know you will only have a small number of rows, it will work well.
The logic to only store the changes is trivial - the problem more is whether or not you want to allow the user to select which documents have change history, and more complicated-ly (not a word…) if you want to allow the user to select which fields have changes.
But, if we decide that all fields should have history, except for possibly setup or whichever else, then creating an analytics module with 1:1 doctypes : field history table becomes a lot easier.
The main compromise I see is that, if we give the users no control and store all (or almost all) field histories, they will use more storage, but implementation will be easier and down the road they’ll have a history they can use if they want a certain set of analytics. If we give the users a lot of control, implementation becomes harder but they may see a marginal savings in disk space.
For something like this, I think a happy medium could be module-level or doctype-level control for the user, but not field-level.
This stores the fields that have changed when saving a document all into one big table. If the user wants to get analytics from those, they can install the analytics app, which sorts them all into their own table based on the doctype that has been edited. The after_install method will sort them on installation, and thereafter, the app hooks into changed_field entries, sorts them, and deletes them. (are hooks done as a background task? If not, it would be better to run the event as a background task…)