*SOLVED* Install.py now FAILS on Ubuntu 14.04, Debian 8, CentOS 7. What happened? (also same on Ubuntu 16.04) *SOLVED*

@codingCoffee - did you happen to create an erpnext user like I do in the admin guide and then install as that user? All files including ./bench and any site bench directories should end up in /home/erpnext and not in /home/frappe, or /home/root. I will run another test later today when I get a chance. I’ll also look over the merged PR as well.

I am beginning testing again right now. It will probably take an hour or two to complete. I will update here when I am finished. Unfortunately @James_Robertson I will not be able to test your specific user name since I am working with GCP and cannot create additional users aside from my account username. I will however go through your procedure on a Debian 8 server after I complete my regular install tests. It will just be with my own username instead of erpnext.

BKM

@codingCoffee
@James_Robertson

I have used the Google Cloud Platform to spin up all of the servers this morning and here are the results:


Debian 8 -----------> Success (install.py works with no prerequisites)

Ubuntu 14.04 ----> Success (install.py works with no prerequisites)

Ubuntu 16.04 ----> FAILED (1st attempt - see traceback below)

CentOS 7 ---------> FAILED (see traceback below)


The Ubuntu v16.04 install attempt with the new install.py failed for the same reason it has failed for the past month. MariaDB still seems to have a configuration problem. Here is the trace:


TASK [Gathering Facts] **************************************************************************
ok: [localhost]

TASK [Check whether a site exists] **************************************************************
ok: [localhost]

TASK [Create new site] **************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bench", "new-site", "site1.local", "--admin-password", "password", "--mariadb-root-password", "password"], "delta": "0:00:01.951457", "end": "2018-01-19 14:34:16.990444", "failed": true, "rc": 1, "start": "2018-01-19 14:34:15.038987", "stderr": "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)"], "stdout": "Database not installed, this can due to lack of permission, or that the database name exists.\nCheck your mysql root password, or use --force to reinstall", "stdout_lines": ["Database not installed, this can due to lack of permission, or that the database name exists.", "Check your mysql root password, or use --force to reinstall"]}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry

PLAY RECAP **************************************************************************************
localhost : ok=73 changed=45 unreachable=0 failed=1

Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=erp_jmi']' returned non-zero exit status 2
erp_jmi@test-ubu1604:~$


It is possible to get past this error if you use the workaround created by @James_Robertson that involves creating a new .cnf file BEFORE running install.py. On my second attempt, using the workaround of creating a dummy .cnf file, I was able to get ERPNext to install on Ubuntu 16.04.
@codingCoffee New users will not have an easy time figuring this out and it should be fixed in the install script.
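Before retrying, it may help to confirm whether MariaDB has actually created its socket file. The trailing (2) in ERROR 2002 is normally the OS errno for "No such file or directory", which would mean the socket is missing rather than refusing connections. Below is a minimal sketch of such a check; the helper name `socket_ready` is made up for illustration, and the default path is simply the one from the error message above.

```python
import os
import stat


def socket_ready(path="/var/run/mysqld/mysqld.sock"):
    """Return True if `path` exists and is a Unix domain socket.

    Hypothetical helper; the default path is taken from the
    ERROR 2002 message in the trace, not from any config file.
    """
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Path does not exist or cannot be inspected.
        return False
    return stat.S_ISSOCK(mode)


if socket_ready():
    print("MariaDB socket found; the server appears to be up.")
else:
    print("No socket at the expected path; MariaDB is probably not running.")
```

If this reports no socket right after a failed install, the problem is likely that the MariaDB service never started, not that the root password is wrong.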


CENTOS v7

The install script FAILS to complete when run on a CentOS 7 host. The test was run by first setting SELinux to permissive with the following command:

sudo setenforce 0

Then the install script was retrieved and run as normal.
The following is the trace dump from the failed attempt:


TASK [install erpnext to default site] **********************************************************
skipping: [localhost]
PLAY [localhost] ********************************************************************************
TASK [Gathering Facts] **************************************************************************
ok: [localhost]
TASK [insert/update inputrc for history] ********************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was:
OSError: [Errno 2] No such file or directory
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Could not replace file: /tmp/tmpWGA_Qe to /home/root/.inputrc: [Errno 2] No such file or directory"}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP **************************************************************************************
localhost : ok=70 changed=48 unreachable=0 failed=1
Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=erp_jmi']' returned non-zero exit status 2
[erp_jmi@test-centos7 ~]$


At this point it appears we are at least back to being able to use the install.py script to handle setting up ERPNext production servers on Debian v8 and Ubuntu v14.04.

This was the case a few days ago. However, we are still unable to run the install script on a CentOS 7 server. It fails every time, including when I try some of the other workaround prerequisites from other posts. We are also unable to run the install script on Ubuntu 16.04 unless several prerequisites are performed on the server ahead of the install. Those prerequisites involve creating a dummy .cnf file in order to get MariaDB to complete the install.

@codingCoffee Your PRs were supposed to address both of the failures identified above. At this point they do not appear to be working. If you attempt to fix them, please perform regression testing on the rest of the Linux server types before releasing another version of the install script.

BKM

Did you use my instructions to create the /home/root directory first on CentOS?

Yes, I tried that, and it still failed but in a new location in the script. It didn’t seem to make sense to then try to find another workaround for the new failure point. The reality is that I took as much of your install guide as I could and applied it to the CentOS7 attempts. I created the .cnf file, set the selinux to permissive, and created the /home/root directory. Even with all of this, it still fails.
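For what it is worth, the /home/root/.inputrc path in the trace suggests that something is building the home directory as /home/<user>. That is only my reading of the error message; I have not checked the playbook itself. On most Linux systems root's home is /root, so a path built that way will not exist. A minimal sketch of the difference, with both function names made up for illustration:

```python
import os
import pwd


def naive_home(user):
    """Build the home path by convention (assumed behaviour, not
    confirmed against the playbook). Wrong for root on most systems."""
    return os.path.join("/home", user)


def resolve_home(user):
    """Ask the system password database for the real home directory.
    Raises KeyError if the user does not exist."""
    return pwd.getpwnam(user).pw_dir


user = "root"
print("by convention:", naive_home(user))
print("from passwd:  ", resolve_home(user))
```

If the two lines differ for the user the play runs as, creating /home/root by hand only hides the mismatch for that one task.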

I tried this on several server iterations in different server farm locations around the country just to make sure I was not suffering server issues. The results were the same. Since I can spin up multiple servers at multiple locations at the same time, I was using that ability to run the install test as many times as I could fit on my screens. Overall the CentOS 7 server was used 9 different times. 4 of those were with your suggested prerequisites and they all failed.

I think now it is important for someone from the development team to put some real effort into making the “Easy Install” script live up to its name. It is no longer useful as an installer except on the 2 oldest and most stable Linux candidates.

If you are interested in the new fail point, I can spin it up again and post the trace dump here. It doesn’t take a lot of effort to do that.

BKM

@James_Robertson
Well, I didn’t try exactly what you had written in your guide. But on a fresh server these are the exact commands I executed as the root user

Ubuntu 16.04

cd ~
apt-get update
apt-get -y upgrade
apt-get install -y python-minimal build-essential python-setuptools
wget https://raw.githubusercontent.com/frappe/bench/master/playbooks/install.py
sudo python install.py --develop --user frappe --mysql-root-password frappe --admin-password frappe

CentOS 7

cd ~
yum check-update
yum update -y
yum groupinstall -y development
curl "https://raw.githubusercontent.com/frappe/bench/master/playbooks/install.py" -o install.py
sudo python install.py --develop --user frappe --mysql-root-password frappe --admin-password frappe

The install script handles the user creation and creates a user named erpnext and installs frappe-bench in the /home/erpnext/ directory

I have tried this on CentOS7 and Ubuntu16.04 droplets on Digital Ocean and it seems to be working.

@bkm

You are right, my PR was supposed to address these problems. I’ll take a look at the trace dumps you posted and try to figure out the solution.

Till then I would be grateful if you could execute the commands I mentioned above and post the trace dumps here. Use the --production flag instead of --develop

Per your request. This is the trace dump from Google Cloud Platform test on Ubuntu 16.04. I copied and pasted your exact instructions into the server to prepare it and run the install. I also made sure to use the --production switch instead of --develop. Here are the results:


TASK [Gathering Facts] ******************************************************************************
ok: [localhost]
TASK [Check whether a site exists] ******************************************************************
ok: [localhost]
TASK [Create new site] ******************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bench", "new-site", "site1.local", "--admin-password", "frappe", "--mariadb-root-password", "frappe"], "delta": "0:00:02.073400", "end": "2018-01-19 17:46:50.195798", "failed": true, "rc": 1, "start": "2018-01-19 17:46:48.122398", "stderr": "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)"], "stdout": "Database not installed, this can due to lack of permission, or that the database name exists.\nCheck your mysql root password, or use --force to reinstall", "stdout_lines": ["Database not installed, this can due to lack of permission, or that the database name exists.", "Check your mysql root password, or use --force to reinstall"]}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP ******************************************************************************************
localhost : ok=73 changed=45 unreachable=0 failed=1
Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=erp_jmi']' returned non-zero exit status 2


Let me know what you find different. It is possible that the Digital Ocean droplets are more provisioned than the basic server instances at other hosting locations.

BKM

Again… as per your request, I copied and pasted all of your commands into the CentOS 7 server to load all of your prerequisites and install erpnext. Again, it failed. Here is the trace dump:


TASK [create a new default site] *****************************************************************
skipping: [localhost]
TASK [install erpnext to default site] ***********************************************************
skipping: [localhost]
PLAY [localhost] *********************************************************************************
TASK [Gathering Facts] ***************************************************************************
ok: [localhost]
TASK [insert/update inputrc for history] *********************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 2] No such file or directory
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Could not replace file: /tmp/tmpve_Njg to /home/root/.inputrc: [Errno 2] No such file or directory"}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP ***************************************************************************************
localhost : ok=70 changed=48 unreachable=0 failed=1
Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=erp_jmi']' returned non-zero exit status 2


Both times I used your special instructions, on the CentOS 7 and Ubuntu 16.04 servers, they failed.

At this point the only thing I can think of as possibly different ‘might’ be your use of the --develop switch in your install tests. I ONLY use --production servers and never set up developer servers as I am not a developer.

BKM

I tried on Digital Ocean and got the same error. I added --site along with it. I am going to test with the exact script codingCoffee shared.

TASK [Create new site] **********************************************************************************************************************************

fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bench", "new-site", "xxx", "--admin-password", "xxx", "--mariadb-root-password", "xxx"], "delta": "0:00:02.467042", "end": "2018-01-19 21:56:32.215387", "failed": true, "rc": 1, "start": "2018-01-19 21:56:29.748345", "stderr": "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)"], "stdout": "Database not installed, this can due to lack of permission, or that the database name exists.\nCheck your mysql root password, or use --force to reinstall", "stdout_lines": ["Database not installed, this can due to lack of permission, or that the database name exists.", "Check your mysql root password, or use --force to reinstall"]}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry

PLAY RECAP **********************************************************************************************************************************************
localhost : ok=69 changed=41 unreachable=0 failed=1

Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=frappe']' returned non-zero exit status 2

This time I used ALL of your instructions in the install guide you published and this is the resulting (new) error:


TASK [restart mysql] *******************************************************************************************
skipping: [localhost]
TASK [create a new default site] *******************************************************************************
skipping: [localhost]
TASK [install erpnext to default site] *************************************************************************
skipping: [localhost]
PLAY [localhost] ***********************************************************************************************
TASK [Gathering Facts] *****************************************************************************************
ok: [localhost]
TASK [insert/update inputrc for history] ***********************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: OSError: [Errno 2] No such file or directory
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "msg": "Could not replace file: /tmp/tmpAO67He to /home/root/.inputrc: [Errno 2] No such file or directory"}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry
PLAY RECAP *****************************************************************************************************
localhost : ok=70 changed=44 unreachable=0 failed=1
Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib64/python2.7/subprocess.py", line 542, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=erpnext']' returned non-zero exit status 2
[erpnext@test-centos7 ~]$


This error is different from those I had earlier using other work-around processes. So, even using your guide, the system will not install on CentOS 7. Hope this helps you work through your guide.

BKM

Same error with --production and exact same script

TASK [Create new site] **********************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bench", "new-site", "site1.local", "--admin-password", "frappe", "--mariadb-root-password", "frappe"], "delta": "0:00:03.217272", "end": "2018-01-19 22:26:55.814059", "failed": true, "rc": 1, "start": "2018-01-19 22:26:52.596787", "stderr": "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)"], "stdout": "Database not installed, this can due to lack of permission, or that the database name exists.\nCheck your mysql root password, or use --force to reinstall", "stdout_lines": ["Database not installed, this can due to lack of permission, or that the database name exists.", "Check your mysql root password, or use --force to reinstall"]}
    to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry

PLAY RECAP **********************************************************************************************************************************************
localhost                  : ok=69   changed=41   unreachable=0    failed=1   

Traceback (most recent call last):
 File "install.py", line 388, in <module>
install_bench(args)
 File "install.py", line 114, in install_bench
run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
File "install.py", line 326, in run_playbook
success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=frappe']' returned non-zero exit status 2
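The useful part of these dumps is the stderr field buried in the long fatal line. As a convenience, here is a minimal sketch that pulls the JSON payload out of such a line. It assumes the line was pasted with straight quotes, as in the dump above; the function name `parse_fatal_line` is made up, and the sample is an abbreviated copy of the line from this post, not a new log.

```python
import json


def parse_fatal_line(line):
    """Return the JSON payload of an Ansible 'fatal: ... FAILED! => {...}'
    line as a dict, or None if the line is not a fatal line."""
    marker = "FAILED! => "
    idx = line.find(marker)
    if idx == -1:
        return None
    return json.loads(line[idx + len(marker):].strip())


# Abbreviated copy of the fatal line posted above (most fields dropped).
sample = (
    'fatal: [localhost]: FAILED! => {"changed": true, '
    '"cmd": ["bench", "new-site", "site1.local"], "rc": 1, '
    '"stderr": "ERROR 2002 (HY000): Can\'t connect to local MySQL server '
    'through socket \'/var/run/mysqld/mysqld.sock\' (2)"}'
)

payload = parse_fatal_line(sample)
print("command:", " ".join(payload["cmd"]))
print("exit code:", payload["rc"])
print("stderr:", payload["stderr"])
```

Dumps pasted with curly quotes will not parse as JSON, so the quotes would need to be straightened first.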

Yes, but it is still close. Still installing into /home/root. hmmmm.

Thanks to @bkm and @James_Robertson for all their efforts in getting this fixed. You are both a credit to the Community :sunglasses:


I just tried the

sudo python install.py --develop --user frappe --mysql-root-password frappe --admin-password frappe

And it works

When I change it to --production, it breaks:

sudo python install.py --production --user frappe --mysql-root-password frappe --admin-password frappe

TASK [Create new site] **********************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["bench", "new-site", "site1.local", "--admin-password", "frappe", "--mariadb-root-password", "frappe"], "delta": "0:00:02.630226", "end": "2018-01-19 23:56:31.031747", "failed": true, "rc": 1, "start": "2018-01-19 23:56:28.401521", "stderr": "ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)", "stderr_lines": ["ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)"], "stdout": "Database not installed, this can due to lack of permission, or that the database name exists.\nCheck your mysql root password, or use --force to reinstall", "stdout_lines": ["Database not installed, this can due to lack of permission, or that the database name exists.", "Check your mysql root password, or use --force to reinstall"]}
to retry, use: --limit @/tmp/.bench/playbooks/production/install.retry

PLAY RECAP **********************************************************************************************************************************************
localhost : ok=69 changed=41 unreachable=0 failed=1

Traceback (most recent call last):
  File "install.py", line 388, in <module>
    install_bench(args)
  File "install.py", line 114, in install_bench
    run_playbook('production/install.yml', sudo=True, extra_vars=extra_vars)
  File "install.py", line 326, in run_playbook
    success = subprocess.check_call(args, cwd=os.path.join(cwd, 'playbooks'))
  File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['ansible-playbook', '-c', 'local', 'production/install.yml', '-e', '@/tmp/extra_vars.json', '--become', '--become-user=frappe']' returned non-zero exit status 2


:rofl: As I laugh to myself, I suspected this to be the case. I was just so tired from testing almost 20 different mutations of installing PRODUCTION servers today that the thought of doing a developer version was more of an annoyance. Developers need to realize that the project they hold so dear is almost entirely dependent on PRODUCTION servers actually working. :rofl::upside_down_face::rofl::astonished:

BKM


@bkm
Thanks for the traceback. Will look into it.

@bibinqcs
Thanks for mentioning. I think I’ll need to edit some more playbooks. I never tried installing with the --production flag… @bkm you might be right. The production flag might be the issue

Will try to get the issue fixed as soon as possible!
@Julian_Robbins, true indeed. Thanks a ton to everyone in this thread for their help.


Thanks for hanging with us @codingCoffee and suffering our poking fun at developers along the way. We really appreciate your time to make this right. We feel making installs easy is the key to promoting the project overall. Your efforts on this part of that goal are really important to us.

Thanks,

BKM


I am happy to help :smile:


This is huge. The --develop flag does barely anything in the grand scheme of things. The --production flag should actually be the default and --develop removed IMO. You can turn a “production” site into “develop” mode with a quick and easy step after installation, which I link to in my admin guide instructions. Even “development/non-production” sites still need all the same settings and nginx.conf stuff as production sites to run properly, especially if you need/want the development site to be shared.

ALL testing should be done with --production flag.
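One cheap way to make sure both modes get exercised is to generate the two command lines from one place so neither is forgotten. This is only a sketch: the helper name `install_command` is made up, and the flags and the `frappe` values are simply the ones already used in this thread.

```python
def install_command(mode, user="frappe", password="frappe"):
    """Build the install.py command line for one mode.

    `mode` must be "production" or "develop"; the other flags are
    the ones used earlier in this thread.
    """
    if mode not in ("production", "develop"):
        raise ValueError("mode must be 'production' or 'develop'")
    return [
        "sudo", "python", "install.py", "--" + mode,
        "--user", user,
        "--mysql-root-password", password,
        "--admin-password", password,
    ]


# Production first, since that is the mode that was failing.
for mode in ("production", "develop"):
    print(" ".join(install_command(mode)))
```

Each list can be passed as-is to subprocess.check_call on a fresh server.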


Update:
Have fixed the issue in the production script as well.

@bkm the issue of setting SELinux to permissive has also been fixed in the script. Referring to this

@James_Robertson @bkm @bibinqcs could you please test the easy install and give feedback!

Will surely remember that henceforth :stuck_out_tongue:
