Hello Community!
I recently faced the challenge of fully localizing ERPNext (v15) for my region. The default translations were incomplete, and manual editing was too slow. I developed a workflow that uses AI (LLM) to translate thousands of strings in minutes and a Custom App to ensure these translations survive updates.
I also encountered a specific error in recent Bench versions (AttributeError: 'App' object has no attribute 'tag') when using local apps, and found a workaround.
Here is the complete guide to replicate this setup.
IMPORTANT NOTES:
- Site Name: Replace site.local with your actual site name (e.g., mysite.com).
- Language: You must edit the SYSTEM_PROMPT inside the script to specify your target language (e.g., Spanish, German, Hindi) and your region's specific ERP terminology.
Step 1: Create a Container App
Never modify core files in apps/erpnext. Create a dedicated app to hold your translations.
Bash
cd ~/frappe-bench
# Create a new app (we use --no-git for speed, but we will fix the repo later)
bench new-app custom_translations --no-git
# Install it on your site
bench --site site.local install-app custom_translations
Step 2: Extract Source Strings
To get a full list of translatable strings (including hidden system messages):
Bash
bench --site site.local build-message-files
Locate the generated source file (all_english_strings.txt or similar) and use it as your INPUT_FILE for the script below.
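If the extracted file contains blank lines or duplicates, a small helper can consolidate it before translation. This is a sketch under one assumption: the exact filename produced by build-message-files varies between versions, so the paths in the example call are placeholders you must adjust.

```python
from pathlib import Path

def dedupe_strings(source: Path, target: Path) -> int:
    """Copy unique, non-empty lines from source into target; return the count kept."""
    seen, ordered = set(), []
    for line in source.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and line not in seen:
            seen.add(line)
            ordered.append(line)
    target.write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return len(ordered)

# Placeholder paths -- point these at your actual extracted file and INPUT_FILE:
# dedupe_strings(Path("all_english_strings.txt"), Path("source_strings.txt"))
```

Running this once keeps the progress database smaller and avoids paying the API twice for the same string.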
Step 3: AI Translation and CSV Formatting Script
This script performs the translation, saves progress to a database, and outputs a clean, Frappe-compatible CSV file by correctly handling quotes (quoting=csv.QUOTE_ALL), eliminating the need for a separate cleaning script.
File: translate_erpnext.py
Python
import asyncio
import json
import os
import csv
# Ensure you install the library for your chosen LLM (e.g., `pip install google-generativeai tqdm`)
import google.generativeai as genai
from tqdm.asyncio import tqdm
# --- CONFIGURATION ---
API_KEY = "YOUR_API_KEY_HERE" # Replace with your actual API key
INPUT_FILE = 'source_strings.txt' # File containing English strings (one per line)
PROGRESS_DB_FILE = 'translation_db.json' # Stores progress to allow resuming
FINAL_OUTPUT_FILE = 'ru.csv' # CHANGE THIS: Use your language code (e.g., 'es.csv', 'de.csv')
BATCH_SIZE = 30
CONCURRENCY_LIMIT = 1 # Adjust based on your API plan's concurrency limit
RPM_DELAY = 4.5 # Delay to respect Rate Limits per minute
# --- MODEL SETUP ---
genai.configure(api_key=API_KEY)
model = genai.GenerativeModel('gemini-2.0-flash')
# --- SYSTEM PROMPT (CRITICAL: CUSTOMIZE THIS!) ---
SYSTEM_PROMPT = """
Role: Senior ERPNext Localization Expert.
Context: Accounting, Supply Chain, HR, and System Administration.
# !!! CHANGE THE TARGET LANGUAGE BELOW !!!
Target Language: Russian (RF/KZ context).
RULES:
1. Use professional terminology accepted in your region (e.g., "Submit" -> "Провести").
2. NEVER translate code tags: {{ name }}, %s, {0}, <div>, etc.
3. If input is technical code or numbers only, return it as is.
4. Output strict JSON dictionary/object format where keys match the input indices.
"""
def load_progress() -> dict:
    """Loads previously translated strings to avoid duplicates."""
    if not os.path.exists(PROGRESS_DB_FILE):
        return {}
    try:
        with open(PROGRESS_DB_FILE, 'r', encoding='utf-8') as f:
            return json.load(f)
    except json.JSONDecodeError:
        print("Warning: Progress DB corrupted. Starting fresh.")
        return {}

def save_progress(new_data: dict):
    """Saves progress to the JSON database."""
    current_db = load_progress()
    current_db.update(new_data)
    with open(PROGRESS_DB_FILE, 'w', encoding='utf-8') as f:
        json.dump(current_db, f, ensure_ascii=False, indent=2)

def export_to_csv(translation_db: dict):
    """
    Exports the database to a Frappe-compatible CSV format.
    Crucial: uses csv.QUOTE_ALL to handle commas/quotes inside strings.
    """
    print(f"\nExporting {len(translation_db)} strings to {FINAL_OUTPUT_FILE}...")
    with open(FINAL_OUTPUT_FILE, 'w', encoding='utf-8', newline='') as f:
        # csv.QUOTE_ALL ensures that all strings are quoted, protecting commas inside phrases.
        writer = csv.writer(f, quoting=csv.QUOTE_ALL)
        for source, target in translation_db.items():
            writer.writerow([source, target])
    print("Export complete.")

async def translate_batch(sem, batch_lines: list):
    """Handles the API call for a single batch, with retry logic."""
    async with sem:
        # Create a dictionary payload for the LLM to process
        batch_dict = {str(i): line for i, line in enumerate(batch_lines)}
        text_payload = json.dumps(batch_dict, ensure_ascii=False)
        prompt = f"Translate the following dictionary values: {text_payload}"
        # Retry logic for API stability
        for attempt in range(3):
            try:
                # Use to_thread for synchronous API calls within an async loop
                response = await asyncio.to_thread(
                    model.generate_content,
                    contents=[SYSTEM_PROMPT, prompt],
                    # Request a JSON response object
                    generation_config={"response_mime_type": "application/json"},
                )
                result_json = json.loads(response.text)
                translated_batch = {}
                for idx_str, original_text in batch_dict.items():
                    # Map original text to its translated version from the returned JSON.
                    # Fall back to the original text if the translation is missing.
                    translated_batch[original_text] = result_json.get(idx_str, original_text)
                # Delay to respect RPM limits
                await asyncio.sleep(RPM_DELAY)
                return translated_batch
            except Exception as e:
                print(f"\n[Error on attempt {attempt + 1}] {e}. Retrying...")
                await asyncio.sleep(5 * (attempt + 1))
        # Return original text for all lines if all attempts fail
        return {line: line for line in batch_lines}

async def main():
    if not os.path.exists(INPUT_FILE):
        print(f"Error: Input file '{INPUT_FILE}' not found.")
        return
    with open(INPUT_FILE, 'r', encoding='utf-8') as f:
        all_lines = [line.strip() for line in f if line.strip()]
    progress_db = load_progress()
    remaining_lines = [line for line in all_lines if line not in progress_db]
    print(f"Total strings: {len(all_lines)} | Already translated: {len(all_lines) - len(remaining_lines)} | Remaining: {len(remaining_lines)}")
    if remaining_lines:
        batches = [remaining_lines[i:i + BATCH_SIZE] for i in range(0, len(remaining_lines), BATCH_SIZE)]
        sem = asyncio.Semaphore(CONCURRENCY_LIMIT)
        pbar = tqdm(total=len(batches), desc="Translating Batches")
        for batch in batches:
            result = await translate_batch(sem, batch)
            # Save results to disk immediately after each batch
            save_progress(result)
            pbar.update(1)
        pbar.close()
    # Final export after all batches are processed
    final_db = load_progress()
    export_to_csv(final_db)

if __name__ == "__main__":
    asyncio.run(main())
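To see why QUOTE_ALL matters, here is a minimal round-trip with the standard csv module: every field gets wrapped in quotes and embedded quotes are doubled, so source strings containing commas or quotation marks survive intact instead of being split into extra columns.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, quoting=csv.QUOTE_ALL)
# A source string with a comma and one with a quote -- the classic CSV breakers
writer.writerow(['Total, incl. tax', 'He said "stop"'])

raw = buf.getvalue()
print(raw)  # -> "Total, incl. tax","He said ""stop"""

# Reading it back recovers the original fields exactly
row = next(csv.reader(io.StringIO(raw)))
print(row)  # -> ['Total, incl. tax', 'He said "stop"']
```

Without QUOTE_ALL (or at least QUOTE_MINIMAL), a bare comma inside a phrase would shift every following column in the translation file.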
Step 4: The "Bench Update" Fix
If you created your app with --no-git, running bench update might crash with:
AttributeError: 'App' object has no attribute 'tag'
This happens because Bench expects a git repository. The Fix: Initialize a dummy local git repo inside your app.
Bash
cd ~/frappe-bench/apps/custom_translations
# Initialize git to satisfy Bench requirements
git init
git config user.email "local@admin.com"
git config user.name "Local Admin"
git add .
git commit -m "Initial commit to fix bench update error"
Step 5: Deployment
1. Upload the CSV: Copy your generated CSV (e.g., ru.csv) to your server, into:
~/frappe-bench/apps/custom_translations/custom_translations/translations/

2. Apply Changes (Server-side): Translation files from custom apps take priority over core files, but you must trigger a migration to load them into the database.
Bash
cd ~/frappe-bench
# 1. Force migration to parse the new CSV (replace 'site.local')
bench --site site.local migrate
# 2. Clear Redis cache (critical for translations)
bench clear-cache
# 3. Restart services
sudo supervisorctl restart all
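Before migrating, you can sanity-check the generated CSV offline. This is a sketch that assumes the two-column source,translation layout produced by the script above; validate_translation_csv is a hypothetical helper, not a Frappe API.

```python
import csv

def validate_translation_csv(path: str) -> int:
    """Raise if any row deviates from the two-column (source, translation) shape."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = list(csv.reader(f))
    bad = [i for i, row in enumerate(rows, start=1) if len(row) != 2]
    if bad:
        raise ValueError(f"Malformed rows (expected 2 columns): {bad[:10]}")
    return len(rows)

# Example: validate_translation_csv("ru.csv")
```

Catching a malformed row here is much cheaper than debugging a silent mistranslation after bench migrate.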
Result
Your ERPNext instance is now fully localized. Since the translations live in a custom app, they will not be overwritten when you update ERPNext or the Frappe framework.