
April 22nd, 2016 03:00

Multinode ECS Install fails

Hi all

I am trying to install and set up ECS, so-called free and frictionless :-)

I installed four minimal CentOS machines, updated and installed the packages as per the latest description.

The multinode install step1 fails at the end with the following message:

93930ab2480c: Pull complete

2771923bae9b: Pull complete

05ba09079941: Pull complete

1fb1f09d3a06: Pull complete

0682f71116e2: Pull complete

Digest: sha256:076ab6cf5cb3ee61be62fbdbcd145d209e474c36fe70f76cbb85e169f1acaa3d

Status: Downloaded newer image for docker.io/emccorp/ecs-software-2.2:latest

05efa00c1f0e9f25b70682cb5c4b9c4375826087716fdc138cc9a27f11adc512

Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.

[22/Apr/2016 13:03:40] INFO [root:469] Check the Docker processes.

CONTAINER ID    IMAGE                             COMMAND                  CREATED                  STATUS                  PORTS    NAMES
05efa00c1f0e    emccorp/ecs-software-2.2:latest   "/opt/vipr/boot/boot."   Less than a second ago   Up Less than a second            ecsmultinode

[22/Apr/2016 13:03:40] INFO [root:505] Backup common-object properties file

[22/Apr/2016 13:03:40] INFO [root:509] Copy common-object properties files to host

[22/Apr/2016 13:03:40] INFO [root:513] Modify Directory Table config for multi node

[22/Apr/2016 13:03:40] INFO [root:517] Copy modified files to container

[22/Apr/2016 13:03:40] INFO [root:525] Backup ssm properties file

[22/Apr/2016 13:03:40] INFO [root:529] Copy ssm properties files to host

[22/Apr/2016 13:03:40] INFO [root:533] Modify SSM config for multi node

[22/Apr/2016 13:03:40] INFO [root:537] Copy modified files to container

[22/Apr/2016 13:03:40] INFO [root:541] Adding python setuptools to container

[22/Apr/2016 13:03:40] ERROR [root:571] global name 'DockerCommandLineFlags' is not defined

Traceback (most recent call last):

  File "step1_ecs_multinode_install.py", line 542, in modify_container_conf_func

    os.system("docker "+' '.join(DockerCommandLineFlags)+" exec -t  ecsmultinode wget https://bootstrap.pypa.io/ez_setup.py")

NameError: global name 'DockerCommandLineFlags' is not defined

[22/Apr/2016 13:03:40] CRITICAL [root:572] Aborting program! Please review log.

Unfortunately, I am much more familiar with other installations. Any ideas what to do here?

The single-node install worked, but after setting up a bucket I cannot assign it to a data CAS user as the home bucket. That fails with an API server error, and jcenteraverify reports an authentication failure with the created .pea file.

Thanks a lot, Holger

April 22nd, 2016 09:00

Your first issue is something that's already been patched; you can fix it by downloading the latest release, by executing "git pull" from your repository directory (if you obtained ECSCE via "git clone"), or, if you just want a quick fix, by adding the line "DockerCommandLineFlags=[]" below the two lines beginning with "logging" and "logger" near the top of the script.
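The quick fix amounts to giving the name a module-level default before anything uses it. Below is a minimal sketch, assuming the script only ever joins the list into the docker command line as the traceback shows; `build_wget_command` is a hypothetical helper for illustration, not a function in the real script.

```python
# Module-level default, placed near the top of the script (after the
# logging/logger setup). An empty list means "no extra docker flags".
DockerCommandLineFlags = []


def build_wget_command(container="ecsmultinode"):
    """Hypothetical helper: rebuild the command string from the traceback."""
    return ("docker " + ' '.join(DockerCommandLineFlags) +
            " exec -t  " + container +
            " wget https://bootstrap.pypa.io/ez_setup.py")


if __name__ == "__main__":
    # With the default in place, building the command no longer raises NameError.
    print(build_wget_command())
```

With an empty list the join contributes nothing, so the resulting command is the plain "docker exec" call the script intended.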

The authentication phase may take quite some time to come up, but if the failure is consistent, one of the first things to try (as listed here) is to edit your firewall settings (or remove your firewall entirely if you feel so inclined).

I'm not entirely sure about the CAS/.pea issue at the moment, but I get the feeling that it may similarly be related to your port settings - is 3218 open on both UDP and TCP? If not, you may need to add those (e.g. "firewall-cmd --permanent --add-port=3218/udp") to authenticate properly against CAS.
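A quick way to see whether the TCP side of a port is reachable from another machine is a plain socket connect. This is only a sketch, written for Python 3; `tcp_port_open` is a made-up helper name, and it says nothing about UDP, which cannot be verified with a simple connect.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `tcp_port_open("192.168.1.22", 3218)` would check the CAS port on one of the nodes in this thread, and `tcp_port_open("192.168.1.22", 4443)` the login endpoint the install script polls. A False result can mean either a firewall rule or a service that has not started yet.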


April 22nd, 2016 09:00

Hello Holger,

I faced the same issue and decided to try correcting the script code to solve it.

I fixed the undefined 'DockerCommandLineFlags' variable and some other minor issues, but something is still missing in this script: I'm stuck in the authentication phase, where the script tries to connect to the ECS service inside the container but cannot, even after 30 minutes of trying...

[22/Apr/2016 12:53:03] INFO [root:596] Problem reaching authentication server. Retrying shortly.

Executing getAuthToken: curl -i -k https://10.5.124.78:4443/login -u root:ChangeMe

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (7) Failed connect to 10.5.124.78:4443; Connection refused

[22/Apr/2016 12:53:33] INFO [root:596] Problem reaching authentication server. Retrying shortly.

[22/Apr/2016 12:53:33] CRITICAL [root:599] Authentication service not yet started.

[22/Apr/2016 12:53:33] INFO [root:704] Step 1 Completed.  Navigate to the administrator website that is available from any of the ECS data nodes.         The ECS administrative portal can be accessed from port 443. For example: https://ecs-node-external-ip-address.         The website may take a few minutes to become available.

In case you want to try, I attached my script with the corrections mentioned above.

ps: I'm not a developer or anything like it, just a user that wanted to experiment with ECS and faced the same issue..

Regards,


Caio

1 Attachment


April 22nd, 2016 15:00

Hello Aaron,

I tried the latest version on GitHub today, but I hit a strange error that mentioned the ecsstandalone container name instead of the ecsmultinode container name. I found the reference on line 553 of the step1 script:

os.system("docker "+' '.join(DockerCommandLineFlags)+" exec -t -i ecsstandalone bash -c \"cd /tmp/kennethreitz-requests-* && python setup.py install\"")

Even after correcting this command, I still can't reach the ECS service after 30 minutes of waiting. Do you know if there is some additional work to be done in order to get this working?
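A hard-coded container name like the `ecsstandalone` one above is easy to miss when a script is adapted from single-node to multi-node. Here is a sketch of one way to avoid it, assuming the script only needs to build "docker exec" command strings; `CONTAINER_NAME` and `docker_exec` are hypothetical names, not part of the actual step1 script.

```python
DockerCommandLineFlags = []       # extra docker flags, empty by default
CONTAINER_NAME = "ecsmultinode"   # hypothetical single place to set the name


def docker_exec(command, interactive=False):
    """Build a 'docker exec' command line against CONTAINER_NAME."""
    parts = ["docker"] + list(DockerCommandLineFlags) + ["exec", "-t"]
    if interactive:
        parts.append("-i")
    parts += [CONTAINER_NAME, command]
    return " ".join(parts)


# The line from the step1 script, now pointing at the multi-node container.
install_requests = docker_exec(
    'bash -c "cd /tmp/kennethreitz-requests-* && python setup.py install"',
    interactive=True,
)
print(install_requests)
```

The string could then be passed to `os.system` the same way the script already does, and the container name would only need changing in one place.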

btw: I have shut down the firewalld service and it still does not work. The strangest thing is that if I run the script for the single-node ECS deployment on the same host, it works like a charm.

Thanks

April 25th, 2016 05:00

Hi Aaron

step1 script worked. step2 failed with:

[ecs@ecsnode1 ecs-multi-node]$ sudo python step2_object_provisioning.py --ECSNodes=192.168.1.22 192.168.1.23 192.168.1.24 192.168.1.25 --Namespace=mnns1 --ObjectVArray=mnova1 --ObjectVPool=mnovp1 --UserName=emccode --DataStoreName=mnds1 --VDCName=mnvdc1 --MethodName=

[sudo] password for ecs:

Executing getAuthToken: curl -i -k https://192.168.1.22:4443/login -u root:ChangeMe

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current

                                 Dload  Upload   Total   Spent    Left  Speed

100    93  100    93    0     0    169      0 --:--:-- --:--:-- --:--:--   170

Auth Token  BAAcZHFkRnRtaGlaNG1ocDlycHlxT2ZMdEZLbHo0PQMAjAQASHVybjpzdG9yYWdlb3M6VmlydHVhbERhdGFDZW50ZXJEYXRhOmJkM2NlMzNlLWM2NjYtNDM4Yi04ZjY3LTIzYzA4MGMxMmYzMQIADTE0NjE1ODg3NTQ0ODgDAC51cm46VG9rZW46OTQwYWQ3YzQtNWZhZS00ZTZlLWFjYmMtMDM0ODM4OWU0MDNlAgAC0A8=

ECSNodes: 192.168.1.22

Traceback (most recent call last):

  File "step2_object_provisioning.py", line 360, in <module>

    main(sys.argv[1:])

  File "step2_object_provisioning.py", line 279, in main

    print("ObjectVArray: %s" %ObjectVArray)

UnboundLocalError: local variable 'ObjectVArray' referenced before assignment

[ecs@ecsnode1 ecs-multi-node]$

Thanks a lot, Holger

April 25th, 2016 07:00

Holger,

I believe it's a result of your formatting; you should use a comma-delimited list rather than a space-delimited one for ECSNodes, e.g.


[ecs@ecsnode1 ecs-multi-node]$ sudo python step2_object_provisioning.py --ECSNodes=192.168.1.22,192.168.1.23,192.168.1.24,192.168.1.25 --Namespace=mnns1 --ObjectVArray=mnova1 --ObjectVPool=mnovp1 --UserName=emccode --DataStoreName=mnds1 --VDCName=mnvdc1 --MethodName=
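Why the spaces matter can be shown in a few lines, assuming the step2 script parses its arguments with Python's `getopt` module (an assumption; the option list below is a shortened, illustrative subset). The shell splits space-separated addresses into separate arguments, and `getopt.getopt` stops parsing at the first argument that is not an option, so the options after it are never seen.

```python
import getopt

# Assumed subset of the long options accepted by the step2 script.
LONG_OPTIONS = ["ECSNodes=", "Namespace=", "ObjectVArray="]


def parse(argv):
    """Return (options dict, leftover positional arguments)."""
    opts, leftover = getopt.getopt(argv, "", LONG_OPTIONS)
    return dict(opts), leftover


# Space-delimited: the second address arrives as a separate argument,
# and getopt stops there.
spaced, rest = parse(["--ECSNodes=192.168.1.22", "192.168.1.23",
                      "--Namespace=mnns1", "--ObjectVArray=mnova1"])
print(spaced)  # only --ECSNodes was parsed
print(rest)    # everything after it is left over

# Comma-delimited: one argument, so every option is seen.
joined, rest = parse(["--ECSNodes=192.168.1.22,192.168.1.23",
                      "--Namespace=mnns1", "--ObjectVArray=mnova1"])
print(joined["--ECSNodes"].split(","))
```

That would match the output above, where only "ECSNodes: 192.168.1.22" was printed before the script hit a variable that had never been assigned.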

Honestly not sure what Caio is seeing if his environment works fine with single-node but fails authentication on multi-node on the same host; no additional setup should be necessary by comparison. Insight from other developers would be welcome here.

April 25th, 2016 08:00

Hi Aaron


Thanks for the reply and correction. Sorry for the trouble.

I will retry; even on my SSD-backed ESX host it will take a short while. Then I'll update you.

As for the failing authentication, this took a long while on my machine to complete. But it finished.

I will keep you posted, Holger
