Questionable use of SQL Parsing as a Security Feature for DB access

James_Houx · June 19, 2022, 9:10pm

On my very first attempt to use Frappe to write a SQL query, I ran into the CTEs not-supported except in very recent version problem. I was surprised to discover that Frappe parses SQL Command in an attempt to add some measure of “read-only” control. Trying to parse a language as a way of detecting and mitigating behavior is notoriously error-prone. I’ve seen reports already of people pointing out Postgres examples of this failing.

Can someone please link to previous discussions for why SQL parsing was considered and chosen as an accepted mechanism for security? If the argument is potent enough, I would like to be persuaded of its merit. For now, I am raising the concern that this mechanism may only provide a false sense of security with the downside that it creates additional code surface – thus the potential for more bugs or vulnerabilities of their own.

James_Houx · June 19, 2022, 9:24pm

Adding an additional note here…
Sane database engines provide mechanisms for Read-Only transactions, which is a much better way to assure the behavior is truly read-only with no need for a SQL parsing hack.

On MySQL/MariaDB: START TRANSACTION READ ONLY
On Postgres: BEGIN READ ONLY

Furthermore, if you execute a transaction ReadOnly mode, database engines can make performance optimizations that would not be available in Read/Write Mode:

Was there some specific reason that the Frappe team did not employ ReadOnly query support?

ankush · June 20, 2022, 4:21pm

Sane database engines provide mechanisms for Read-Only transactions, which is a much better way to assure the behavior is truly read-only with no need for a SQL parsing hack.

This has been attempted before: feat: Read Only transactions via @frappe.read_only by gavindsouza · Pull Request #14870 · frappe/frappe · GitHub

Re: why we do this?

Server script / restricted context is not really read only. You’re allowed to use ORM, even query builder for making changes to Database. All whitelisted mechanisms ensure that changes are done to your site (in multi-tenant system). So we can’t use read-only transactions here.
“read-only” validation is only on raw SQL where you can escape your site and start affecting data on other sites (again only applicable on multi-tenant systems)

If you’re able to break any of the validation you mentioned in meaningful way feel free to report security vulnerability.

I’ve seen reports already of people pointing out Postgres examples of this failing.

Are you referring to this? Allow read only CTEs in query reports · Issue #16329 · frappe/frappe · GitHub

“Ohno I am able to increase a sequence on a table in my own DB”

In general all these crazy parsing / whitelisting is only added to enable extending/customizing without writing apps.

One more example, if someone wants to attempt breaking it : fix: executing non-select qb code from whitelisted methods by ankush · Pull Request #15907 · frappe/frappe · GitHub

ankush · June 20, 2022, 4:27pm

Btw if there’s interest for it we can work on controlling this behaviour as per the user’s choice:

E.g. there are two extremes (we are right now in the middle)

I am paranoid, I want to reduce the surface area of anything potentially unsafe. Extra-hardened mode.
I don’t care, I run a single site on my server so let me do EVERYTHING. exec(...) #yolo

James_Houx · June 21, 2022, 1:07am

@Ankush

Why is it possible to affect alternate site databases on a multi-tenant system? Are the other sites not using isolated databases with their own credentials? If you’re saying its possible for a site to read data from adjacent tenants, that is already a security vulnerability on its own…

James_Houx · June 21, 2022, 1:37am

Also, maybe I’m missing something. But couldn’t someone workaround this on postgres by prefixing an INSERT or UPDATE with a “WITH” cte expression?

Joshua_Restivo · June 21, 2022, 9:30pm

Each frappe site has its own unique set of database credentials. Short of compromising those credentials, you can’t construct arbitrary sql queries on one site against another site’s database. Yes, you’re going through a common front-door (nginx) and, if it’s broken, then you’re going to have a bad day but that’s a layer of abstraction removed from sql communication. If I’m failing to consider something here, I’d like to know. Can you offer an example of a query that could potentially “jump” sites?