Hello all,
I’m attempting to deploy Frappe v16 on AKS using Azure managed Redis and an external MariaDB cluster, also hosted on Azure. The site setup command bench new-site stalls for a long time, and when it continues, Frappe reports that tabDefaultValue is missing, even though the table exists when queried manually from the same container.
Problem
- The bench new-site command stalls for a very long time during runtime deployment on AKS.
- When it stops stalling, I get the following message:
Table 'tabDefaultValue' missing in the restored site.
This happens when the backup fails to restore. Please check that the file is valid
Do go through the above output to check the exact error message from MariaDB
Deployment setup
- The application is deployed on Azure AKS.
- Redis is an Azure managed Redis service.
- MariaDB is an external MariaDB cluster.
- The build pipeline runs bench init, which produces a Docker image with all bench requirements installed.
- At runtime, the container runs bench new-site.
- To avoid race conditions while using new-site, I tested with only one pod running.
Command being used
I’m using --no-setup-db because the user and table have already been created on the MariaDB cluster, and the user has all the permissions required to run bench new-site. Before running bench new-site, I mount the app into the apps directory and set the global configs using bench set-config -g.
bench new-site "$SITE_NAME" \
--db-type mariadb \
--db-name "$DB_NAME" \
--db-user "$DB_USER" \
--db-password "$DB_PASSWORD" \
--db-host "$DB_HOST" \
--db-port "$DB_PORT" \
--admin-password "$ADMIN_PASSWORD" \
--set-default \
--no-setup-db
Debugging steps I followed
- Verified MariaDB connectivity from inside the same AKS container using the mysql CLI. The connection works, and I can query the database successfully.
- I also noticed that the database tables have been created.
- Verified MariaDB connectivity using the underlying Python DB library. That also works correctly.
- Queried information_schema.tables manually from the same container and confirmed that tabDefaultValue exists.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'frappe'
AND table_name = 'tabDefaultValue';
This returns:
tabDefaultValue
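The query above hard-codes the schema name 'frappe', while bench new-site receives the schema through $DB_NAME. A minimal sketch for running the same check with the exact values bench is given could look like this. It assumes a DB-API cursor that uses the %s parameter style (as PyMySQL and mysqlclient do); current_schema and table_exists are hypothetical helper names, not part of Frappe.

```python
from typing import Any, Optional

# Same lookup as the manual query, but with schema and table as parameters.
QUERY = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = %s AND table_name = %s"
)


def current_schema(cursor: Any) -> Optional[str]:
    """Return the default schema of this session, or None if none is selected."""
    cursor.execute("SELECT DATABASE()")
    row = cursor.fetchone()
    return row[0] if row else None


def table_exists(cursor: Any, schema: str, table: str) -> bool:
    """Return True if information_schema lists the table in the given schema."""
    cursor.execute(QUERY, (schema, table))
    return cursor.fetchone() is not None
```

Calling table_exists(cursor, os.environ["DB_NAME"], "tabDefaultValue") with a cursor opened using $DB_USER, $DB_HOST and $DB_PORT would show whether the table is visible under the same credentials and schema that bench uses, rather than under the literal name 'frappe'.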
- Verified Redis connectivity using redis-cli. The connection works.
- Verified Redis connectivity using the Python Redis client. That also works.
- Tested Frappe’s Redis client/cache wrapper separately from inside the container. Direct Redis works, but Frappe’s ClientCache().get_value() stalls for around 20 seconds.
source /home/frappe/frappe-bench/env/bin/activate && \
cd /home/frappe/frappe-bench && \
python -c "
import os
import time
import frappe
from frappe.types import _dict
from frappe.utils.redis_wrapper import setup_cache, ClientCache
frappe.local.conf = _dict({
'redis_cache': os.environ['REDIS_CACHE'],
'redis_cache_sentinel_enabled': 0,
})
frappe.local.cache = {}
frappe.cache = setup_cache()
frappe.client_cache = ClientCache()
start = time.time()
print(frappe.cache.get('client_cache:db_tables'))
print('REDIS GET:', time.time() - start)
start = time.time()
print(frappe.client_cache.get_value('db_tables'))
print('FRAPPE CLIENT CACHE GET:', time.time() - start)"
Output:
None
REDIS GET: 0.03649544715881348
None
FRAPPE CLIENT CACHE GET: 19.74853205680847
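Since a plain GET is fast and only the wrapper is slow, one way to narrow this down is to time a few other commands individually with plain redis-py. The sketch below does that. Which commands ClientCache actually issues is an assumption on my part (the list here is a guess and should be checked against frappe/utils/redis_wrapper.py); timed, subscribe_once and probe are hypothetical helper names.

```python
import time
from typing import Any, Callable, Dict


def timed(label: str, fn: Callable[[], Any]) -> float:
    """Run fn, print its duration and outcome, and return the elapsed seconds."""
    start = time.perf_counter()
    try:
        outcome = fn()
    except Exception as exc:  # report the failure and keep probing
        outcome = exc
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f}s -> {outcome!r}")
    return elapsed


def subscribe_once(client: Any, channel: str = "probe:channel") -> Any:
    """Subscribe to one channel and wait up to 5 seconds for the confirmation."""
    pubsub = client.pubsub()
    try:
        pubsub.subscribe(channel)
        return pubsub.get_message(timeout=5)
    finally:
        pubsub.close()


def probe(client: Any) -> Dict[str, float]:
    """Time several commands on a redis-py style client, one at a time."""
    steps = {
        "PING": client.ping,
        "GET": lambda: client.get("client_cache:db_tables"),
        "CLIENT ID": client.client_id,
        "SUBSCRIBE": lambda: subscribe_once(client),
    }
    return {label: timed(label, fn) for label, fn in steps.items()}
```

Inside the container this could be called as probe(redis.Redis.from_url(os.environ["REDIS_CACHE"])). A step that takes seconds or raises an error on the managed service, while PING and GET stay fast, would point at the feature the wrapper depends on.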
- I looked at the bench new-site source code to narrow down where the issue comes from:
  - This is where the message appears, after it stalls for a long time.
  - Inside the get_tables function: I tested both methods: retrieving the tables from Redis, which returns None, and retrieving the tables from information_schema, which returns the actual values.
- I attempted to run bench doctor inside the container, which stalls as well.
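To make the cached versus uncached lookup concrete, here is a simplified cache-first lookup. This is not Frappe's implementation, only a hypothetical model of the pattern: a plain dict stands in for the cache, and query_db stands in for the information_schema query. It shows that a stale cached list can hide a table that exists, while an empty cache falls through to the database.

```python
from typing import Callable, Dict, List


def get_tables(
    cache: Dict[str, List[str]],
    query_db: Callable[[], List[str]],
    cached: bool = True,
    key: str = "db_tables",
) -> List[str]:
    """Return the table list, preferring a non-empty cached value when cached=True."""
    if cached:
        tables = cache.get(key)
        if tables:  # a stale list is returned as-is, without asking the database
            return tables
    tables = query_db()
    cache[key] = tables  # refresh the cache with what the database reports
    return tables


def missing_tables(required: List[str], tables: List[str]) -> List[str]:
    """Return the required tables that are absent from the given table list."""
    return [name for name in required if name not in tables]
```

In this model, a None or empty cache entry (as in my Redis test) would not by itself hide tabDefaultValue, because the lookup falls back to the database. That makes me suspect the stall or the connection used for the fallback query more than the cached value, but that depends on how closely the real get_tables matches this sketch.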
Questions
- Could Frappe’s Redis client wrapper behave differently from direct redis-cli or plain Python Redis, given that I had to patch the code to be able to run it?
- Since bench doctor also stalls, could this indicate an issue during Frappe context initialization rather than only during bench new-site?
- During bootstrap_database, is there any reason get_tables(cached=True) would not see a table that exists when queried manually from the same container?
Any guidance on whether this could be related to Frappe initialization, Redis caching, MariaDB, or something else I should check would be appreciated.
Thank you

