I’ve got erpnext deployed via helm. All pods are in running state. I’m trying to add a new site using the ‘new site’ job. mysql is configured per these instructions. However, the site deployment fails with the following…
Looking around, I see some mention of mariadb needing python3-dev or libmysqlclient-dev libraries but I’m not finding anything solid. Does anyone have any insight insight to offer?
Thanks in advance.
Edit: I’ve tried using bitnami mariadb chart (10.3 and 10.5), same issue.
Edit: “grant all privileges on db_name.* TO ‘db_name’@‘%’ identified by ‘db_password’;” works just fine (with proper values substituted). I’d like to add python3-dev to see if that makes a difference but I’m not sure where to do it? The worker image, used by site-creator seems like a reasonable candidate. However, I’d expect the native frappe images to be ironed out.
I think the error is related to db configuration. Check the configuration in the frappe/helm repo tests directory. It is used to run tests and it is successfully completing the site creation job.
“dev” in production images felt odd. I didn’t add.
if you need to add it for custom requirement base your image on official image and add it in your image.
Even if you wish to try it for official image try it like this. Create your image substitute in values.yaml and try, if it works we need to fix it in official.
I use k3d.io as a local kubernetes cluster to develop these things.
If you want to speed it up even faster check the k3d docs for using local container registries.
What I’m doing in case of containerized installations is, I maintain state (NFS Server/Redis/MariaDB) in standard VPS/VM and then deploy rest of the the stateless things in containers. I have setup MariaDB with Frappe specific config on a VM in private / firewalled network, Same VM has NFS and Redis. Gist of Data Server setup.
Thanks. I checked your config against mine. There are no substantial differences. Also, I tried starting over with mariadb-galera. After debugging all of the issues related to the db swap, I’m back to almost the same thing:
Now, the referenced table is different but the same generic error is thrown. This time, I found the following, too:
2021-06-01 18:58:34 12 [Warning] Aborted connection 12 to db: ‘unconnected’ user: ‘unauthenticated’ host: ‘172.31.16.163’ (This connection closed normally without authentication)
and
2021-06-01 18:59:35 122 [Warning] Access denied for user ‘_3e8ea0dcafb368d3’@‘172.31.21.183’ (using password: YES)
It looks like variables aren’t being populated. If I explicitly set the user / passwords (rootUser, db, replication) in the mariadb values, then the erpnext pod doesn’t deploy properly (gets ‘access denied’). If I leave them unset, everything deploys but the db user doesn’t seem to get set properly (when running the ‘add site’ job). I think I’m getting closer here but thought maybe the issue would seem obvious to someone. This is in AWS EKS, btw.
I’ve granted access for all hosts with that username but now the job errors out indicating that the site already exists no matter what SITE_NAME I pass it. I’ve deleted previous instances of the job. Is there some other key that’s used for site creation?
Figured it out. The problem here was tied to the fact that my database lived on an AWS EFS volume. EFS volumes only support 256 concurrent open files. The database would begin to populate and, since each table gets its own file, the limit would be reached part of the way through.
In the process of troubleshooting I switched over to mysql from mariadb. This isn’t necessary but if anyone else should find themselves in this situation, you need to add the following options to the [mysqld] sections for primary and secondary (in addition to the other character-set changes required by erpnext)…
innodb_file_per_table = OFF innodb_strict_mode = off sql_mode=‘’