Yet another wkhtmltopdf thread

I’m currently getting started with erpnext.
My recent problem is that I can’t get the generation of PDFs to work. It’s the same error as most other people are facing:

“[…] OSError: wkhtmltopdf reported an error:
Exit with code 1 due to network error: ConnectionRefusedError”

I tried every hint I could find here, but nothing helped. I would be glad if someone could point me in the right direction.

My compose.yml

Different host_name configs I tried (in the backend container)

My /etc/hosts file (in the backend container)

wkhtmltopdf --margin-top 15mm --margin-right 15mm --page-size A4 --encoding UTF-8 --background --print-media-type --images --margin-left 15mm --no-outline --margin-bottom 15mm https://en.wikipedia.org /tmp/f73bb79af6ca7dcd93919a11de5303be588ee061ec9d56bc5c0d22fd.pdf

gives this output:

Loading pages (1/6)
Counting pages (2/6)
Resolving links (4/6)
Loading headers and footers (5/6)
Printing pages (6/6)
Done

wkhtmltopdf --version:
wkhtmltopdf 0.12.6.1 (with patched qt)

wkhtmltopdf problems always seem to boil down to two things:

  1. Not using the correct patched version. It appears you are using the “with patched qt” build, so that’s not it.
  2. A missing or wrong "host_name" key in site_config.json. Add one, starting with “https://”. This also assumes you’ve set up TLS.
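
A minimal sketch of the second point, assuming you run it from the bench directory inside the backend container. The site name "frontend" and the URL are placeholders, and the helper names valid_host_name and set_host_name are made up for illustration. Only the final bench set-config call is the real command. The check just guards against forgetting the scheme.

```shell
#!/usr/bin/env bash
# Sketch: refuse to write a host_name that lacks an explicit scheme.
# Helper names are hypothetical; only "bench ... set-config" is a real command.

valid_host_name() {
  # Succeeds only for values that start with http:// or https://
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

set_host_name() {
  # $1 = site name, $2 = URL to store as host_name
  if ! valid_host_name "$2"; then
    echo "host_name must start with http:// or https://: $2" >&2
    return 1
  fi
  bench --site "$1" set-config host_name "$2"
}

# Example (placeholder values):
# set_host_name "frontend" "https://erp.example.com"
```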

I did a complete reinstall and now I’m getting another error.
I would say it’s a success because it seems that it actually tries to generate a PDF.

The frontend now gives me:

Traceback (most recent call last):
  File "apps/frappe/frappe/app.py", line 110, in application
    response = frappe.api.handle(request)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/api/__init__.py", line 49, in handle
    data = endpoint(**arguments)
    ^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/api/v1.py", line 36, in handle_rpc_call
    return frappe.handler.handle()
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/handler.py", line 49, in handle
    data = execute_cmd(cmd)
    ^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/handler.py", line 85, in execute_cmd
    return frappe.call(method, **frappe.form_dict)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/__init__.py", line 1718, in call
    return fn(*args, **newargs)
    ^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/utils/typing_validations.py", line 31, in wrapper
    return func(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/utils/print_format.py", line 234, in download_pdf
    pdf_file = frappe.get_print(
    ^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/__init__.py", line 2134, in get_print
    return get_pdf(html, options=pdf_options, output=output) if as_pdf else html
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "apps/frappe/frappe/utils/pdf.py", line 89, in get_pdf
    filedata = pdfkit.from_string(html, options=options or {}, verbose=True)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "env/lib/python3.11/site-packages/pdfkit/api.py", line 75, in from_string
    return r.to_pdf(output_path)
    ^^^^^^^^^^^^^^^^^^^^^
  File "env/lib/python3.11/site-packages/pdfkit/pdfkit.py", line 201, in to_pdf
    self.handle_error(exit_code, stderr)
  File "env/lib/python3.11/site-packages/pdfkit/pdfkit.py", line 155, in handle_error
    raise IOError('wkhtmltopdf reported an error:\n' + stderr)
OSError: wkhtmltopdf reported an error:
Exit with code 1 due to network error: TimeoutError

And the backend error log shows:

ERROR Property: Invalid value for "CSS Level 2.1" property: var(--gray-900) [14:3: color]
WARNING Property: Unknown Property name. [64:2: word-wrap]
WARNING Property: Unknown Property name. [83:2: columns]
WARNING Property: Unknown Property name. [90:2: background-size]
WARNING Property: Unknown Property name. [98:2: object-fit]
WARNING Property: Unknown Property name. [110:2: -webkit-print-color-adjust]
WARNING Property: Unknown Property name. [158:2: word-break]

Any ideas? I couldn’t find anything.

Update:
I just played around a bit and created a fresh print format. That worked!
The question is now: what’s the problem with the default ones?

Could you explain your solution in more detail? I am also running a fresh install on docker with nginx proxy manager for TLS termination. I also created a fresh print format but that did not help.

frappe.utils.get_url should resolve the site URL for internal (site-relative) images in PDFs, e.g. images from /files need {site_url}/files and images from /assets need {site_url}/assets.

As long as it returns the correct URL (from site_config, the request, or the site directory name) and the images resolve, PDF generation should work.

get_url code: frappe/frappe/utils/data.py at 504aab4f38dd0400ff07b8b22aa74911ccdf340a · frappe/frappe · GitHub
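
A rough way to test this from inside the backend container is to build the same kind of absolute URL by hand and fetch it with curl. This is only a sketch: site_asset_url and check_asset are hypothetical helper names, and the URL and file path in the example are placeholders, not values from this thread.

```shell
#!/usr/bin/env bash
# Sketch: join a site URL and a site-relative path the way a PDF image
# reference would be resolved, then probe it with curl.

site_asset_url() {
  # $1 = site URL (what get_url is expected to return), $2 = site-relative path
  local base path
  base="${1%/}"    # drop one trailing slash, if any
  path="/${2#/}"   # make sure the path has exactly one leading slash
  printf '%s%s\n' "$base" "$path"
}

check_asset() {
  # Prints the HTTP status code for the resolved URL (needs curl)
  curl -sS --max-time 10 -o /dev/null -w '%{http_code}\n' "$(site_asset_url "$1" "$2")"
}

# Example (placeholder values):
# check_asset "https://erp.example.com" "/files/logo.png"
```

If curl is refused or times out here, wkhtmltopdf running in the same container will most likely hit the same error.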


It took a while to find it.
Please try:
Tools → Print → Print Format Builder (New)
That new builder has a preview built in that generates a PDF preview. When that preview works, it should work in reality too.

Would a site_url problem also explain why the PDF works with the new builder but fails with every other print format?

Additionally, could it be responsible for an error that looks more like an SCSS compile error than a URL resolution problem?

I can confirm that printing works when creating a new Print Format using the new builder. Does this mean that all formats will need to be recreated?

EDIT:
I just tried sending an invoice via PDF using the new format but receive this error now:

AttributeError: ‘str’ object has no attribute ‘get’

EDIT2:
What confuses me is that printing works on a PWD instance, yet running the PWD yaml file myself leads to the same errors as a supposed production installation.

EDIT3:
If I enter host_name as https://erp.domain.com then I get a timeout error.

It would be fantastic if we could resolve this.

@revant_one @Sch-Tim @tmatteson

UPDATE:

For anyone struggling with this, for now I got it working like this with the following setup:

  1. Nginx Proxy Manager for TLS termination
  2. Expose port 8080 on the nginx service in the Docker Compose file
  3. Set "host_name": "http://10.10.0.111:8080" in the site-specific config file

I am running Proxmox VE, with ERPNext on a VM and Nginx Proxy Manager routing traffic via internal IPs. 10.10.0.111 is the VM IP address. If we could make sure that ERPNext runs behind a separate Nginx reverse proxy without people stumbling across these things, I think a lot of people would benefit.


Glad you got it working through this workaround. Sounds to me like a DNS problem though. The wkhtmltopdf binary needs to be able to resolve the site. If it can’t, then you have to fetch the site by IP like you’ve done.

Based on what you shared so far, I’d guess that internal DNS resolution for your site isn’t working for your docker containers. You could determine if that’s the case using a nicolaka/netshoot container connected to your erpnext network. I’d also check to make sure docker is configured to use a DNS server that has an authoritative zone that can give a valid response to DNS A queries for your erpnext site.
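
The netshoot check could look roughly like this. It is a sketch, not a recipe: check_site_dns is a made-up helper name, and the network name and hostname in the example are placeholders, so look up your real network with "docker network ls" first.

```shell
#!/usr/bin/env bash
# Sketch: run dig from a throwaway netshoot container attached to the
# same Docker network as the ERPNext services.

check_site_dns() {
  # $1 = Docker network name, $2 = site hostname to resolve
  docker run --rm --network "$1" nicolaka/netshoot dig +short A "$2"
}

# Example (placeholder values):
# check_site_dns "erpnext_default" "erp.example.com"
```

An empty result means containers on that network cannot resolve the site name, which would fit the connection errors above.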

Holy shit now it works here too!

One thought of mine:
Setting a static IP is probably not a good idea in some settings.
If you use docker compose or portainer with stacks the IPs might change after rebuild.

Expose port 8080 on the “frontend” service and set host_name to http://frontend:8080
That way you’re future-proofed against any IP change.

I should add that I always thought that host_name had to point to the backend service and not the frontend, so I had no chance. But exposing the port was needed here too.

Thanks for the nice collaborative problem solving.

For everyone struggling with this. Here are my compose.yml and env files:
compose.yml
env

Please change the env variables as needed.


@danielslyman
I just realized that there is (at least) one caveat with that solution:
When I add things like images to an email, they aren’t shown properly because they are loaded from frontend:8080. Obviously that won’t work.
Probably any other links won’t work either, so that’s a bummer.
Did you find a solution for that?


Hey there,

so for some reason adding https://erp.domain.com into the site specific config file works for me. I removed the DNS entry in /etc/hosts which was seemingly causing the issue. Have you had any progress?

I finally made it. I had to make a small change in the docker build files and a few in the compose file.

I forked the frappe_docker git and implemented my changes.
You can find it here: GitHub - Sch-Tim/frappe_docker: Docker images for production and development setups of the Frappe framework and ERPNext

Change:
I updated the nginx resources from the frappe-docker git:
resources/nginx-template.conf

line 10: listen ${NGINX_PORT};
line 11: server_name ${NGINX_SERVER_NAME};

resources/nginx-entrypoint.sh
add

if [[ -z "$NGINX_SERVER_NAME" ]]; then
  echo 'NGINX_SERVER_NAME defaulting to $host'
  export NGINX_SERVER_NAME='$host'
fi

if [[ -z "$NGINX_PORT" ]]; then
  echo 'NGINX_PORT defaulting to 8080'
  export NGINX_PORT=8080
fi

after the other if blocks
and

${TRAEFIK_DOMAIN}
${NGINX_PORT}

into the envsubst string

reason:
I had the problem that nginx produced gateway errors when I redirected erp.domain traffic directly into the frontend container. I tracked the error down to the server_name part in the config. It equals the site name, which in my case was “frontend”. To uncouple the site name from the domain, I added a second variable.
Additionally, I had to change the listening port to 80 because the default HTTP port is 80. Otherwise, I would have had to set host_name to erp.domain:8080.
By adding an NGINX_PORT variable, I kept it dynamic.
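
For reference, the same defaulting can be written more compactly with shell parameter expansion. This is an equivalent sketch, not what the fork actually contains, and apply_nginx_defaults is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Sketch: default the two variables only when they are unset or empty.

apply_nginx_defaults() {
  # \$host keeps the literal string "$host" so nginx, not the shell, expands it
  : "${NGINX_SERVER_NAME:=\$host}"
  : "${NGINX_PORT:=8080}"
  export NGINX_SERVER_NAME NGINX_PORT
}

apply_nginx_defaults
echo "server_name=${NGINX_SERVER_NAME} listen=${NGINX_PORT}"
```

Values already set in the environment, for example NGINX_PORT=80, are left untouched.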

Change:
Add

hostname: ${TRAEFIK_DOMAIN}

to the frontend service in the compose file
Reason:
One often-heard tip is that you should add the frontend container IP to the hosts file.
While this works, it causes two problems:
The first one is that you need root permissions to alter that file.
The second one is that the container IP may change after rebuild.
I realized that docker seems to resolve hostnames internally first. If there is a container with the hostname erp.domain, it goes that way instead of asking the DNS.

Change:
Add

bench set-config -g host_name "http://$TRAEFIK_DOMAIN";
bench --site frontend set-config host_name "http://$TRAEFIK_DOMAIN";

as additional config commands to the configurator service in the compose file.

Reason:
The reason should be obvious. You want the backend to explicitly call the internal frontend container. I don’t know what host_name defaults to if you don’t set it, but it seems that you need to set it. I chose to set it both globally and per site to be extra sure.


@revant_one would it make sense to integrate above changes into the official frappe_docker repo?

Sure, send a PR and make sure the tests pass and that it’s backward compatible with everything, including the helm charts, or provide a migration guide.

We can completely override the nginx template file by mounting a custom one. No need to do any change to frappe_docker in that case.

Someone else did this trick and used $updated_host instead of $host in custom nginx conf template.

map $host $updated_host {
    default $host;
    "sitehost" erp.example.com;
}

Further you can even change /etc/hosts using extra_hosts for docker or equivalent configuration for other orchestrators.


Changing frappe_docker directly by including those changes would make it possible to use everything out of the box. You only have to add the specific environment variables.

I know about the extra_hosts parameter but I had the problem that I don’t know what IP the container will get after rebuilding the stack. Do you know how to fetch it instead of hardcoding it?
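
One possible direction, purely as a sketch: let Docker’s internal DNS resolve the service name at container start and build the hosts entry from that. hosts_line is a made-up helper, “frontend” and the domain are placeholders, and appending to /etc/hosts would still need root.

```shell
#!/usr/bin/env bash
# Sketch: resolve a name at runtime and print an /etc/hosts style line
# that maps the resolved IP to another hostname.

hosts_line() {
  # $1 = name resolvable inside the container, $2 = hostname to map to its IP
  local ip
  ip="$(getent hosts "$1" | awk '{ print $1; exit }')"
  [ -n "$ip" ] || return 1
  printf '%s %s\n' "$ip" "$2"
}

# Example for an entrypoint running as root (placeholder values):
# hosts_line "frontend" "erp.example.com" >> /etc/hosts
```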

Make sure the images remain backward compatible with different setups. Otherwise people will upgrade and need additional tweaks to get the new image working.

Maybe an entrypoint script? Something like this?

Hi @danielslyman ,
If you set "host_name": "http://10.10.0.111:8080" in the site-specific config file, PDF printing will work.
But then we face another issue with Email Notifications, because the site_url in the email template is taken from host_name.
So I think setting host_name to an IP does not solve the root cause.

Does this solution work for multi-site setups? How can we define TRAEFIK_DOMAIN for each site?