[RFQ] Frappe in the agentic world

Hello community.

Yes, I am late to the party, but here I am. I accept that agentic coding is the thing and we should adapt.

Broad philosophy

Agentic coding feels magical, but it’s a trick under the hood. When you tell an agent to get you a coffee, it is just going to order it for you. It is not going to grow the beans, roast them and grind them. It only feels that way. The LLM is just a clever binding to something underneath; it is not god-level magic.

If you just ask an agent to manage your bills, it is most likely to sign you up somewhere or just use FOSS ERPNext under the hood. It is not going to code the entire ERP. The LLM is just ONE LAYER; it does not extrapolate the way people imagine. If it did, we would all have been toast by now. So while the LLM seems super-intelligent, the LLM + “harness” (all the apps underneath) is just clever. It is a trick. It does not tell you the entire truth.

So the apps will remain: ERPNext will remain, Framework will remain, etc. None of that changes; the user will just have one less layer of interaction.

Which brings me to my first realisation. The agent is only as good as the harness. The building blocks matter even more, and the best building blocks will win. People have told me this before, but now I realise why Frappe Framework is an ideal building block for agents (because it is an ideal block for people as well). The ability to just configure major actions that would otherwise require code makes it super efficient. Hence every time I open LinkedIn, someone has vibed a new Frappe app.

What changes

Agentic coding is really cool; I am loving it like a new toy. It is very clever and can take away a lot of the tedious work of coding. I think I can now attempt large projects that I never thought possible just because of the “effort” they would take. Now I can do them. Like this new bench I am trying to build.

I am assuming that within a year we can fix all the major things that slow down development, and also improve the quality and security of our apps, if we focus on it.

The front-end will also change, but good apps are still a good harness. I have some ideas on the front-end and will share them later. The important thing in my view is that LLMs should not be embedded in app workflows. It will kill the power of the apps. You can call LLMs, but don’t add LLMs to workflows. LLM extensions should be external and optional.

Things that Agentic coding can help with

First, Framework needs to be a lot lighter. Check this issue: Make framework lighter · Issue #39459 · frappe/frappe · GitHub. Setting up Framework, creating sites and updating apps should be “instant”. SQLite should be the default database, and we should strip out all minor dependencies and rely on only a few lightweight blocks.

We need to refactor all our code and make it more manageable. Please steal my taste file; it is the culmination of 30 years of programming.

After the refactor, we should burn down all quality and security problems.

I am assuming all of this will probably take a year. Maybe less if we all contribute.

How can you contribute

Since all of this is FOSS, and many of you are capable and interested in contributing, I think we can all do this together. Everything I do is usually open, but I have very little patience for lazy work. If you are someone who can do this, start firing up your favourite agent and start contributing.

Send PRs but carefully. Right now you can start helping me with building a new bench so that we can get more people activated in using Frappe.

REMEMBER: Everything must be super lightweight. Don’t add dependencies, don’t add slop. And force the LLM to write as little code as possible.

Comments welcome.

37 Likes

Maybe we can also try what Rails 8 did (at least in development?):

By switching to database-based queues and caches, many applications will no longer require additional in-memory services like Redis.

Ref: Rails 8 Release | makandra.de

Then basically it is just a few SQLite files.
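To make the idea concrete, here is a minimal sketch of a database-backed cache and job queue using only Python’s standard-library `sqlite3`. The class names, table names and schema are made up for illustration; this is not how Rails 8 or Frappe implements it, and a real multi-worker queue would need proper locking (for example `BEGIN IMMEDIATE`).

```python
import json
import sqlite3
import time


class SQLiteCache:
    """Hypothetical key-value cache kept in a SQLite file (or in memory)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at REAL)"
        )

    def set(self, key, value, ttl=None):
        # ttl is in seconds; None means the entry never expires.
        expires_at = time.time() + ttl if ttl is not None else None
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR REPLACE INTO cache (key, value, expires_at) "
                "VALUES (?, ?, ?)",
                (key, json.dumps(value), expires_at),
            )

    def get(self, key, default=None):
        row = self.db.execute(
            "SELECT value, expires_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return default
        value, expires_at = row
        if expires_at is not None and expires_at <= time.time():
            return default  # expired entries are treated as missing
        return json.loads(value)


class SQLiteQueue:
    """Hypothetical FIFO job queue kept in a SQLite file (or in memory)."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )

    def enqueue(self, payload):
        with self.db:
            self.db.execute(
                "INSERT INTO jobs (payload) VALUES (?)", (json.dumps(payload),)
            )

    def dequeue(self):
        # Read the oldest job and delete it; returns None when the queue is empty.
        with self.db:
            row = self.db.execute(
                "SELECT id, payload FROM jobs ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self.db.execute("DELETE FROM jobs WHERE id = ?", (row[0],))
        return json.loads(row[1])
```

Pointing both classes at file paths instead of `:memory:` gives exactly the “few SQLite files” setup described above.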

1 Like

Did you already explore making Framework and apps installable via PyPI? I think that kind of standardization / aligning with the Python ecosystem would be a huge win.

Please do! I really hope your team and contributors are all-in on this.

I’d say that depends a lot on what the app is trying to do. All tasks involving analysis or creation of text or images can benefit from an LLM integration. E.g. Framework could offer LLM-written Custom Scripts, Helpdesk could turn solved tickets into knowledge-base articles, ERPNext could parse Purchase Invoices, etc.

Before asking for contributions, we should solve the review bottleneck. Since everybody can produce as much code as they like, nobody wants to review anymore. If I’m not mistaken, Framework already blocks first-time contributors or non-members/collaborators.

3 Likes

Sounds interesting. Yes, we can use SQLite as well. Mostly, we don’t need persistent caches, just shareable caches.

I think the bottleneck is still MariaDB and the 50+ Python packages we load. Once things are stable, we can publish to PyPI. We just need to remove the unnecessary baggage.

All of these can be built on top of standard apps. Also, all apps should be complete without agentic workflows; otherwise the non-agentic flows will become secondary, and over time the use of the app as a “harness” will reduce.
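One rough way to put a number on that baggage is to count the top-level packages a fresh interpreter pulls in when a module is imported. The sketch below uses only the standard library; the function name `packages_loaded_by` is hypothetical, and the count it reports depends on the installed versions and the environment.

```python
import subprocess
import sys

# Small probe script run in a fresh interpreter so that modules already
# imported by the current process do not hide anything.
_PROBE = (
    "import sys, importlib\n"
    "before = set(sys.modules)\n"
    "importlib.import_module(sys.argv[1])\n"
    "new = set(sys.modules) - before\n"
    "print('\\n'.join(sorted({name.split('.')[0] for name in new})))\n"
)


def packages_loaded_by(module_name):
    """Return the top-level packages newly loaded by importing module_name."""
    result = subprocess.run(
        [sys.executable, "-c", _PROBE, module_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return set(result.stdout.split())


if __name__ == "__main__":
    # Usage: python this_file.py frappe  (any importable module name works)
    name = sys.argv[1] if len(sys.argv) > 1 else "json"
    loaded = packages_loaded_by(name)
    print(f"{name} pulls in {len(loaded)} top-level packages")
```

Running it before and after a dependency cleanup gives a simple, repeatable measure of whether the import footprint is actually shrinking.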

Yes, this is a much bigger problem even without agentic coding. Not sure how to fix this; maybe a smaller group should start this and we should have qualification criteria. At least for the projects I am working on (say the new bench), I promise a faster response on PRs.

For Framework, we should find a way for people to notify us. Let’s explore.

I try my best to influence. I guess one of my things should be to refactor the framework, file by file. Especially the files and functions that are too long, over-engineered, or too clever. Then it will speed up the fixing part. @rohit_w has already started with stock_entry.py

1 Like

I know we may be specifically talking about agentic coding in the context of Frappe, but I think one of the killer applications is migrating legacy codebases to their modern counterparts. A couple of months back, I spent a week looking at the Frappe Books codebase. With Claude’s help I was able to upgrade all dependencies to modern versions, specifically Electron (with all the tests passing).

Would it be a terrible idea to identify other repos/projects that might benefit from such an undertaking?

Yes, refactoring the code and making it lose weight is the first thing on the cards. Unlike @revant_one, I don’t think a full rewrite is required. There is still a lot of value in the years of tiny polish that has gone into the current apps.

Nice, do send a PR (you can start a new thread on discuss for this) - will merge. We are also looking for a maintainer, but the criterion is “good taste” :slight_smile:

2 Likes

I am building Frappe Forge and FAgent. I will keep everyone posted on the progress. It’s really cool.

3 Likes

Sharing my work on an AI harness for 100% auto SDLC

I am currently running 2 Frappe + Vue/React projects through a full auto-SDLC harness.
Now moving a larger project the same way. That is where I keep hitting this -

AI already knows Frappe and other UI code. That part is solved. Open any LLM (Claude / Codex) today and ask it to write a DocType - it will do it. What AI does not know is the context around the code:

  • Who the user is - an SME owner in tier-2 India, low tech literacy, will WhatsApp at 11pm asking why the invoice didn’t print

  • How fast we ship - 2 weeks to go-live, not 6 months. Polish happens in production

  • Domain quirks - GST changes mid-quarter, regional invoice formats, jugaad workflows the customer expects

  • What the customer actually pays for - a running system, not a perfect one

Without this context, even a perfectly-coded DocType is the wrong DocType. The agent ships fast but ships the wrong thing.

One agent alone cannot carry all this. You need AI at every step - requirements, dev, QA, test, deploy. And after deploy too - soak period, canary rollout, feedback loop back into the system. Each step teaches the next. Stack it like this and the output starts matching what the customer actually wanted, not just what the prompt said.

Current stack for this: Claude as the primary agent, BMAD methodology for PRD/story breakdown, Playwright for UI test automation, Langfuse for tracing every step, and a small tool called claude-fuse (or langfuse) for local observability and catching repeated mistakes. All open or self-hostable. The harness is the part I am still iterating on.

Fixing these missing pieces in the harness right now. This is where I think the Frappe community has an edge - because we already understand this user, this speed, this context. The framework is the easy half. Context is the hard half.

3 Likes

This really resonates with me. One of the biggest strengths of Frappe is that everything is already structured in a very standardized and configurable way, which makes it extremely agent-friendly.

From my experience working with Frappe/ERPNext over the last couple of years, it becomes much easier to guide and shape development using agents because the framework already provides strong patterns, metadata, and conventions. Many things that would normally take hours to build manually can now be achieved in minutes with the help of AI agents.

That said, review, standardization, and architecture are still extremely important. AI can accelerate development, but it should work within well-defined standards and boundaries.

Personally, I try to keep proper agent rules/skills in place so the generated code follows the expected architecture and does not go outside the standardization of the project. Edge cases, scalability, security, and maintainability still require strong architectural thinking and real development experience.

3 Likes

Our company comes from a low-code (Microsoft Power Platform) background, and we are basically in the process of switching all our customers who are willing over to Frappe. We have found that building custom apps with agentic engineering is much faster and yields higher quality than low-coding them on, say, the Power Platform.

What we really appreciate is that Frappe provides a solid and stable base for agents to work on custom apps. I believe Frappe being traditionally coded brings a lot of stability and certainty with it, which is the perfect foundation for agents to build on.

I understand that Agent Engineering is the future, and Frappe will have to adapt to it. However, before starting to rewrite large parts of the Framework, I believe it to be absolutely crucial to find a way to avoid making the codebase messy. We have seen time and time again that companies that embrace Agentic Engineering end up with less stable code (GitHub downtime, AWS outages, just look at the mess that is Claude Code, and so on).

I guess what I’m trying to say is that the stability of the Framework must be prioritised over moving quickly. Frappe is already an awesome ecosystem. There’s no need to rush into anything. I guess I’m preaching to the choir here, but in my mind, it is absolutely crucial that, before doing big re-writes with agents, the Quality Assurance processes have to be adjusted, so we don’t end up in slop land :sweat_smile:.

7 Likes

You are right. I can imagine when teams switch their coding styles to agentic it will be messy. I don’t think there is any option. GitHub downtime was mostly scaling issues due to a big surge in engineering activity.

What we will do is not touch v16 with agentic coding. All new clean-up / refactor initiatives will go into develop/v17.

I see agentic as a way to make framework / erpnext even better - leaner, meaner and more secure. That is what I am excited about.

3 Likes

Here is the PR that I was talking about: chore: update dependencies by sidharthshah · Pull Request #1 · sidharthshah/books · GitHub. It needs the following cleanups:

  1. Ensure warnings are not suppressed
  2. Double-check Claude’s work to make sure it makes sense in the context of the existing codebase’s style

Since these might be breaking changes, here is how I’d approach merging them:

  1. Create a `develop` branch; this will ensure we get testing feedback from other community members. Create a PR from this branch and eventually sync with the `master` branch
  2. I tested this set of changes on macOS only. I need to double-check that they work on other OSes as well, including Linux (say Ubuntu) and Windows
  3. Along the way I created some technical notes, which I believe can go into the Wiki or a `DEVNOTES.md` file. It might be a good idea to cover the “fyo” framework, which people otherwise only discover by reading the code
  4. While doing internal testing of Frappe Books, I discovered a few gaps in the official documentation, which need to be updated as well
  5. We possibly need to check the “Sync to Cloud” feature with newer versions of ERPNext

Welcome to the AI world :robot:

LLMs work better for Frappe or any open-source project, as they have better/more training data. :slight_smile:

review bottleneck

We can take some AI help. Looping in @riddhesh237 who has automated a few things across our Frappe practice.

3 Likes

Thanks for the tag Rahul.

Internally we’ve been experimenting with agentic coding and automations on our Frappe work, and what we kept finding was that the generated code often doesn’t follow Frappe conventions and best practices: it skips proper permission handling and introduces N+1 queries and similar anti-patterns that look fine in isolation but compound at scale.

What helped was skills. A well-scoped, Frappe-aware skill turns a generic Python-capable model into something that behaves more like a developer who has actually shipped Frappe apps.

Proposal: a skills repo under the Frappe org, covering the framework itself + ecosystem apps. Once vetted and properly curated, these skills would help give us consistent code quality from agents and meaningfully reduce reviewer load.

A few community starters worth pointing at:

review bottleneck

On our side, we’ve been experimenting in this direction internally with a small in-house skill that flags common anti-patterns during review: permission issues, N+1 queries, unwanted `frappe.db.commit()` calls, and many similar smells that we’ve noticed even smart models still tend to introduce. It’s early-stage and tightly tied to our internal app conventions. But the experimentation has given us a sense of what’s worth automating, what tends to false-positive on Frappe-shaped code, and how to surface findings as review comments that drive fixes rather than noise. Once a skills repo lands under the Frappe org and the maintainers have shaped the direction for a code-review skill, we’d be glad to contribute back what we’ve observed and learned along the way, aligned with whatever scope and conventions the maintainers decide are right for the ecosystem.
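As an illustration of the kind of check described here (not the in-house skill itself, which is not public), a minimal sketch with Python’s standard-library `ast` module could flag explicit `frappe.db.commit()` calls and database calls made inside loops. The `DB_CALLS` selection and the class and function names are assumptions for the example; comprehensions and indirect calls are not covered.

```python
import ast

# Assumed selection of Frappe calls that hit the database on each invocation.
DB_CALLS = {"frappe.get_doc", "frappe.db.get_value", "frappe.get_all", "frappe.db.sql"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class SmellFinder(ast.NodeVisitor):
    def __init__(self):
        self.findings = []   # list of (line number, message)
        self.loop_depth = 0  # how many loops enclose the current node

    def _in_loop(self, nodes):
        self.loop_depth += 1
        for node in nodes:
            self.visit(node)
        self.loop_depth -= 1

    def visit_For(self, node):
        self.visit(node.iter)      # the iterable is evaluated once, outside the loop
        self._in_loop(node.body)
        for child in node.orelse:  # the else branch runs at most once
            self.visit(child)

    def visit_While(self, node):
        self._in_loop([node.test] + node.body)  # the test runs on every iteration
        for child in node.orelse:
            self.visit(child)

    def visit_Call(self, node):
        name = dotted_name(node.func)
        if name == "frappe.db.commit":
            self.findings.append((node.lineno, "explicit frappe.db.commit()"))
        elif name in DB_CALLS and self.loop_depth:
            self.findings.append(
                (node.lineno, f"{name}() inside a loop (possible N+1)")
            )
        self.generic_visit(node)  # also check calls nested in the arguments


def find_smells(source):
    """Parse Python source and return sorted (line, message) findings."""
    finder = SmellFinder()
    finder.visit(ast.parse(source))
    return sorted(finder.findings)
```

A check like this only reads the source, so it can run in CI or as a pre-review step without a working Frappe site; the findings would still need a human or a Frappe-aware skill to decide which are real problems.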

5 Likes

I haven’t used skills.md yet, but my taste.md works well.

Yes, we can use LLMs to triage new pull requests as well. But GitHub really needs to support this natively at this scale of LLM-driven development.

May I share my humble thoughts on agentic coding.

There are millions of pieces of AI slop on YouTube that people are sick of, and soon (in fact, already) there will be millions of copied open-source projects that claim to be better than the original.

Frappe Technologies can see this as both a threat and an opportunity.

In the future, there could be hundreds of Frappe clones, Odoo clones or even SAP clones. Many of them will be used once and will die out if they can’t grow a user base. The biggest challenge for all of us, the users, is to find the real project in the swamp.

I came across the following video. I quite like the idea of a Persona Dashboard; it is something ERPNext already has, but it can be better. I think this resonates well with the upcoming Frappe Studio? I prefer the idea that the new personal page can be built within the framework, secured and standard.

As big ERP vendors evolve toward agentic UX, can you imagine how expensive that is in such a closed environment? This is an opportunity for us.

With the open nature of ERPNext, I predict it will experience even higher growth. When people think about building an agentic ERP, they will need a good ERP black box to start with. Nothing comes close to ERPNext at this moment.

Note: I still think all-in-one ERPNext is better than modular Odoo in this arena.

For me, @rmehta’s idea of a leaner, meaner and more secure ERPNext is a good way forward. If so, I would like to add “prompt compatible” too.

I imagine that in real life, AI (which consumes costly tokens) will be used only as necessary, e.g., for some provisional tasks, or to analyse existing bottlenecks and suggest smart options for a human to choose from. There is no reason to run a report with AI if there is a pre-built report that answers the same question.

Business is not about task distribution, it is about responsibility distribution, so I think humans will always be in the loop, and an optimized UX for humans is needed.

Lastly, a question to @rmehta: does Frappe Cloud plan to provide a good LLM service, more lightweight than the frontier models? I think foundation LLM prices will skyrocket very soon, and this can be a big opportunity for Frappe Cloud.

7 Likes

Here’s my take:

Personally, I have nothing against agentic coding. But it’s important to know what to use it for and when. Agentic is great for new features. Agentic is decent for finding issues with your code. Agentic is a disaster for maintaining code.

My honest opinion: Frappe shouldn’t depend on agents, but it should use them to automate tasks such as documentation and quality control.

But I also think Frappe should put a pause on all new features until the code is organized, slimmed down, and tidy. Take the example of v16: it looks great, and I’m happy with it. But we did spend 5 months where UI elements that used to work stopped working and features that used to exist stopped existing. Granted, they were eventually put back in, but it was such a big change (for which proper documentation didn’t exist) that naturally it broke a few things along the way. If we are serious about making the framework slim and well documented, I agree, let’s stop agentic, but we must also put a pause on all new features.

@rmehta One area that feels important for the agentic development workflow is parallel validation environments.

In my experience, once agents start implementing features, they naturally want to self-validate: run tests, start the app, open the UI, iterate, fix, repeat. The next step is multiple agents working on multiple branches or worktrees in parallel. That is where the current bench model becomes a bottleneck.

Today, a bench can host multiple sites, but all sites resolve apps from the same apps/ checkouts. So if I have two worktrees of the same app on two branches, I cannot cleanly map site A to branch A and site B to branch B inside one bench. For human development this is inconvenient. For agentic development it limits throughput because only one branch is really testable at a time.

A big win would be a first-class workspace concept in bench, maybe something like:

bench workspace create feature-x
bench workspace start feature-x
bench workspace test feature-x
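To sketch what such a workspace could resolve to: each workspace would own its own app checkout (a git worktree), its own site and its own port, so agents on different branches never share `apps/`. Everything below is hypothetical, including the `bench workspace` commands above, the directory layout, the site naming and the port scheme (offsets from 8000, the usual bench development port). The only real command referenced is `git worktree add <path> <branch>`, and it is built as an argument list, not executed.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Workspace:
    """Hypothetical description of one isolated validation environment."""

    name: str   # workspace name, e.g. "feature-x"
    app: str    # app under development, e.g. "erpnext"
    index: int  # used to derive a unique port per workspace

    @property
    def worktree(self) -> Path:
        # Each workspace gets its own checkout instead of the shared apps/ dir.
        return Path("workspaces") / self.name / "apps" / self.app

    @property
    def site(self) -> str:
        return f"{self.name}.localhost"

    @property
    def port(self) -> int:
        return 8000 + self.index

    def git_command(self, branch: str) -> list[str]:
        # Argument list for checking out `branch` into this workspace's worktree.
        return ["git", "worktree", "add", self.worktree.as_posix(), branch]


if __name__ == "__main__":
    for i, branch in enumerate(["feature-x", "feature-y"], start=1):
        ws = Workspace(name=branch, app="erpnext", index=i)
        print(ws.site, ws.port, " ".join(ws.git_command(branch)))
```

With a mapping like this, site A always resolves its app from branch A’s worktree and site B from branch B’s, which is the piece the current single `apps/` layout cannot express.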
1 Like