openshift / openshift-ansible-contrib
Additional roles and playbooks for OpenShift installation and management
Home Page: https://www.openshift.com
License: Apache License 2.0
We should be reusing the roles in the core repo so that the appropriate variables can be used.
Hello,
I intended to start up a cluster with small defaults but higher-capacity app nodes, so I used the app_instance_type parameter. The result, however, is that all nodes are the same size, t2.medium, and no errors were reported.
$ ./ose-on-aws.py --region=eu-west-1 --rhsm-user=jbrannst --public-hosted-zone=ocp.rocks --keypair=OpenShift-Key --rhsm-pool="Employee SKU" --app-instance-type=m4.xlarge --ami=ami-8b8c57f8 --s3-bucket-name=ocp-infra-registry-123456 --s3-username=ec2-user
RHSM password?:
Configured values:
stack_name: openshift-infra
ami: ami-8b8c57f8
region: eu-west-1
master_instance_type: m4.large
node_instance_type: t2.medium
app_instance_type: m4.xlarge
bastion_instance_type: t2.micro
keypair: OpenShift-Key
create_key: no
key_path: /dev/null
create_vpc: yes
vpc_id: None
private_subnet_id1: None
private_subnet_id2: None
private_subnet_id3: None
public_subnet_id1: None
public_subnet_id2: None
public_subnet_id3: None
byo_bastion: no
bastion_sg: /dev/null
console port: 443
deployment_type: openshift-enterprise
public_hosted_zone: ocp.rocks
app_dns_prefix: apps
apps_dns: apps.ocp.rocks
rhsm_user: jbrannst
rhsm_password: *******
rhsm_pool: Employee SKU
containerized: False
s3_bucket_name: ocp-infra-registry-123456
s3_username: ec2-user
It looks like a 'Subscription Name' update has left the pool RegEx[1] matching nothing.
Subscription Name: Red Hat OpenShift Container Platform, Premium, 2-Core
Subscription Name: Red Hat OpenShift Container Platform, Standard, 2-Core
Subscription Name: Red Hat OpenShift Container Platform Broker/Master Infrastructure
ExecStart=/usr/bin/kubectl proxy -p 8080 --address=0.0.0.0 --accept-hosts=^*$ --config=/etc/origin/master/admin.kubeconfig
This exposes a world readable cluster admin port on the internet. We should not be doing this.
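A less exposed unit is easy to sketch: binding to loopback and dropping the wildcard --accept-hosts restores kubectl proxy's default localhost-only host filter. This is a sketch, not a tested replacement for the ref-arch unit:

```ini
ExecStart=/usr/bin/kubectl proxy -p 8080 --address=127.0.0.1 --config=/etc/origin/master/admin.kubeconfig
```

Anything that still needs remote access to the API should then go through an authenticated endpoint rather than the raw proxy.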
Encountered this during a normal deprovision run - inventory failed because someone was deleting volumes at the same time. This almost certainly will lead to flaking in the CI runs when we turn them on.
ventory()
File "/usr/share/ansible/openshift-ansible-gce/inventory/gce/hosts/gce.py", line 130, in __init__
print(self.json_format_dict(self.group_instances(zones),
File "/usr/share/ansible/openshift-ansible-gce/inventory/gce/hosts/gce.py", line 311, in group_instances
for node in self.driver.list_nodes():
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 1606, in list_nodes
v.get('instances', [])]
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 5300, in _to_node
extra['boot_disk'] = self.ex_get_volume(bd['name'], bd['zone'])
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 4178, in ex_get_volume
response = self.connection.request(request, method='GET').object
File "/usr/lib/python2.7/site-packages/libcloud/compute/drivers/gce.py", line 120, in request
response = super(GCEConnection, self).request(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/libcloud/common/google.py", line 718, in request
*args, **kwargs)
File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 797, in request
response = responseCls(**kwargs)
File "/usr/lib/python2.7/site-packages/libcloud/common/base.py", line 145, in __init__
self.object = self.parse_body()
File "/usr/lib/python2.7/site-packages/libcloud/common/google.py", line 271, in parse_body
raise ResourceNotFoundError(message, self.status, code)
libcloud.common.google.ResourceNotFoundError: {u'domain': u'global', u'message': u"The resource 'projects/openshift-gce-devel/zones/us-central1-a/disks/qe-chezhang-0106-master-1' was not found", u'reason': u'notFound'}
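One way to tolerate this race is to retry the listing when a resource vanishes mid-enumeration. Sketched generically in Python — NotFoundError and the callable are stand-ins for libcloud's ResourceNotFoundError and the list_nodes driver call; this is not a patch to gce.py:

```python
# Sketch: tolerate resources deleted between the list call and the
# per-resource describe call that libcloud makes while building nodes.
import time

class NotFoundError(Exception):
    """Stand-in for libcloud.common.google.ResourceNotFoundError."""

def list_nodes_with_retry(list_nodes_once, attempts=3, delay=0.0):
    """Retry a listing call that can race with concurrent deletions."""
    for attempt in range(attempts):
        try:
            return list_nodes_once()
        except NotFoundError:
            # A disk or instance vanished mid-listing; list again.
            if attempt == attempts - 1:
                raise
            time.sleep(delay)
```

For CI this would at least turn a hard inventory failure into a re-list, at the cost of masking genuinely missing resources until the retries are exhausted.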
Running gcloud after manually resolving yum errors on the bastion results in the following SSH error:
To install or remove components at your current SDK version [138.0.0], run:
$ gcloud components install COMPONENT_ID
$ gcloud components remove COMPONENT_ID
To update your SDK installation to the latest version [138.0.0], run:
$ gcloud components update
Backing up [/home/cloud-user/.bashrc] to [/home/cloud-user/.bashrc.backup].
[/home/cloud-user/.bashrc] has been updated.
Start a new shell for the changes to take effect.
For more information on how to get started, please visit:
https://cloud.google.com/sdk/docs/quickstarts
Generating public/private rsa key pair.
Your identification has been saved in /home/cloud-user/.ssh/google_compute_engine.
Your public key has been saved in /home/cloud-user/.ssh/google_compute_engine.pub.
The key fingerprint is:
1d:ec:5d:80:ed:b3:cd:2b:89:91:32:72:ea:61:2a:db cloud-user
The key's randomart image is:
+--[ RSA 2048]----+
| o. |
| .. .. |
| o. . |
| o oo. |
| S o..= |
| . + o . o |
| o+ o o . . |
| .. o.. . o . |
| .oE.. . |
+-----------------+
Updating project ssh metadata...\Updated [https://www.googleapis.com/compute/v1/projects/machinelearning-nick].
Updating project ssh metadata...done.
Warning: Permanently added 'compute.3888108791522642653' (ECDSA) to the list of known hosts.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
(the line above repeats 12 more times)
(I'm tearing down this build so ssh key information will disappear thus no security risk)
Several commands in ./gcloud.sh hung because they expected a region (both create and describe).
++ gcloud --project openshift-gce-devel compute addresses describe origin-ci-master-network-lb-ip '--format=value(address)'
For the following addresses:
- [origin-ci-master-network-lb-ip]
choose a region:
[1] asia-east1
[2] europe-west1
[3] us-central1
[4] us-east1
[5] us-west1
Please enter your numeric choice: 3
I added --region flags to all the commands that didn't have it specified and that fixed the issue.
$ gcloud version
Google Cloud SDK 130.0.0
alpha 2016.01.12
beta 2016.01.12
bq 2.0.24
bq-nix 2.0.24
core 2016.10.07
core-nix 2016.03.28
gsutil 4.21
gsutil-nix 4.18
Hi,
vagrant up --provider=libvirt --debug results in an infinite loop.
==> node1: Updating /etc/hosts file on host machine (password may be required)...
INFO environment: Getting machine: node1 (libvirt)
INFO environment: Returning cached machine: node1 (libvirt)
INFO environment: Getting machine: admin1 (libvirt)
INFO environment: Returning cached machine: admin1 (libvirt)
INFO warden: Calling OUT action: #<VagrantPlugins::HostManager::Action::UpdateAll:0x0000000352d6d8>
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::ConfigValidate:0x0000000352d728>
INFO provision: Writing provisioning sentinel so we don't provision again
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::Provision:0x00000003565808>
INFO warden: Calling OUT action: #<VagrantPlugins::ProviderLibvirt::Action::CreateDomain:0x000000035a4990>
INFO warden: Calling OUT action: #<VagrantPlugins::ProviderLibvirt::Action::CreateDomainVolume:0x0000000360b8c0>
INFO warden: Calling OUT action: #<VagrantPlugins::ProviderLibvirt::Action::HandleBoxImage:0x0000000364bdd0>
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::HandleBox:0x0000000368feb8>
INFO warden: Calling OUT action: #<VagrantPlugins::ProviderLibvirt::Action::HandleStoragePool:0x000000036e4878>
INFO warden: Calling OUT action: #<VagrantPlugins::ProviderLibvirt::Action::SetNameOfDomain:0x00000003728d70>
INFO warden: Calling OUT action: #<Proc:0x000000037acd28@/opt/vagrant/embedded/gems/gems/vagrant-1.8.6/lib/vagrant/action/warden.rb:94 (lambda)>
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::Call:0x000000036ec230>
INFO warden: Calling OUT action: #<Vagrant::Action::Builtin::ConfigValidate:0x000000036ec280>
INFO interface: Machine: action ["up", "end", {:target=>:node1}]
INFO environment: Released process lock: machine-action-258d7116eec64fc595d6c05f5e723d3c
DEBUG environment: Attempting to acquire process-lock: dotlock
INFO environment: Acquired process lock: dotlock
INFO environment: Released process lock: dotlock
DEBUG ssh: Sending SSH keep-alive...
... (the keep-alive message repeats indefinitely)
Best,
Peter
Hi again
When I provision a service, let's say Jenkins, the URL scheme is http://{service}-{project}.apps.domain.com, e.g. http://jenkins-me.apps.domain.com. Is there a setting to change the separator from - to . so I get http://jenkins.me.apps.domain.com instead?
I get various connection failures when browsing the OSE web console, though I have no issues with the command-line client. Shutting down the other masters, 02 and 03, solves the problem, but I assume it should work in HA as well.
Maybe the asynchronous loading fails when a request is load-balanced to a different master than the one that served the first page. This is after permanently accepting all certificates. I am running Firefox 49.0.2 and have seen this issue before. I noticed a mention in an internal doc about configuring a reverse proxy; maybe there is some additional AWS configuration I must do manually that I missed?
command
./ose-on-aws.py --create-key=yes [email protected] --rhsm-password=XXXX --public-hosted-zone=ocp.alberttwong.com --key-path=/root/.ssh/id_rsa.pub --keypair=us-east-1 --rhsm-pool=Employee
error message
ERROR: "Forbidden", while: getting RDS instances
ERROR! Inventory script (inventory/aws/hosts/ec2.py) had an execution error: ERROR: "Forbidden", while: getting RDS instances
I modified my IAM policy to the one below, plus S3FullAccess:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1459269951000",
      "Effect": "Allow",
      "Action": [
        "cloudformation:*",
        "iam:*",
        "route53:*",
        "elasticloadbalancing:*",
        "ec2:*",
        "cloudwatch:*",
        "autoscaling:*"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}
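The "Forbidden" on RDS suggests the policy above grants nothing under rds:, while the ec2.py inventory also enumerates RDS instances. A sketch of the missing statement, assuming read-only access is enough:

```json
{
  "Effect": "Allow",
  "Action": [
    "rds:Describe*"
  ],
  "Resource": ["*"]
}
```

Alternatively, if I recall the inventory options correctly, ec2.ini has an rds flag that can disable RDS enumeration outright.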
The openshift-vm-facts role that is referenced in openshift-install.yaml is not present in the roles directory.
ERROR! the role 'openshift-vm-facts' was not found in /root/openshift/reference-architecture/vmware-ansible/roles:/root/openshift/reference-architecture/vmware-ansible:/opt/ansible/roles:/root/openshift/reference-architecture/vmware-ansible/roles
The error appears to have been in '/root/openshift/reference-architecture/vmware-ansible/openshift-install.yaml': line 9, column 9, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
- instance-groups
- openshift-vm-facts
^ here
./gcloud on Fedora fails with this error:
Warning: Permanently added 'compute.3888108791522642653' (ECDSA) to the list of known hosts.
ansible-config.yml 100% 783 13.8KB/s 00:00
Loaded plugins: search-disabled-repos
There are no enabled repos.
Run "yum repolist all" to see the repos you have.
To enable Red Hat Subscription Management repositories:
subscription-manager repos --enable
To enable custom repositories:
yum-config-manager --enable
Connection to 104.197.128.176 closed.
I don't see rotation of /var/log/messages happening in the GCE ref arch, and we have a very high debug level configured. First, do we have this already and I missed it? Second, if not, do we have it in OpenShift Ansible? Third, if not, we should probably add it to the ref-arch, since this is a very likely cause of node failure on GCE.
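If it is missing, a minimal logrotate drop-in would be a small addition. A sketch only — the file name, retention, and schedule here are assumptions, not the ref-arch's choices:

```
# /etc/logrotate.d/messages-highdebug (hypothetical file name)
/var/log/messages {
    daily
    rotate 7
    compress
    delaycompress
    missingok
    notifempty
}
```

A real drop-in would also need the rsyslog postrotate reload that the stock syslog rule carries, and should not conflict with that stock rule.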
Hi,
I tried to deploy OpenShift Origin using these Ansible playbooks with Vagrant, and for some reason, when trying to connect to the web console with admin/admin123, I always got "Invalid username or password". I then changed the htpasswd file to a re-hashed value of "admin123" and it worked. What's the actual password behind "$2y$11$jJioXC3WgyRq.FVy1vqtfuywDwEZp18d9Kkqb4MgFVzlgCGQNwy36"?
I guess either the documentation or the Vagrantfile should be updated with something that matches.
Am I alone with this problem?
When using ose-on-aws.py, a password containing special characters must be escaped or quoted, otherwise the shell mangles it. For example, --rhsm-password=My$ecret arrives as My$. A workaround is to single-quote it: --rhsm-password='My$ecret'.
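The same pitfall can be avoided when building the command line programmatically; a minimal sketch using Python's shlex.quote (the helper name is mine, not part of ose-on-aws.py):

```python
# Sketch: quote a secret so the shell passes it through verbatim.
import shlex

def rhsm_password_arg(password):
    """Build an --rhsm-password argument that survives the shell."""
    # shlex.quote wraps anything containing shell metacharacters in
    # single quotes, so $ecret is not expanded as a variable.
    return "--rhsm-password=" + shlex.quote(password)
```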
Hi
I'm having an issue I don't know how to solve. Basically you cannot run the s3 user tasks more than once and still retrieve the required data structure.
If the user already exists and this task runs, you do not get the s3user.user_meta data needed by the subsequent tasks, i.e.
TASK [s3-registry-user : debug s3] *********************************************
task path: playbooks/roles/s3-registry-user/tasks/main.yaml:9
ok: [localhost] => {
"s3user": {
"changed": false,
"created_keys": [],
"groups": null,
"keys": {
"CHANGED": "Active"
},
"user_name": "apim"
}
}
Which results in
TASK [s3-registry-user : Set fact] *********************************************
task path: playbooks/roles/s3-registry-user/tasks/main.yaml:11
fatal: [localhost]: FAILED! => {
"failed": true,
"msg": "the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'user_meta'\n\nThe error appears to have been in 'playbooks/roles/s3-registry-user/tasks/main.yaml': line 11, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n debug: var=s3user\n- name: Set fact\n ^ here\n"
}
This means that any method to add an infra node will fail, and any further run of ose-on-aws.py
will not complete due to the missing data structures. A way around this could be to pre-create the s3 user, store the AWS secret, and pass it in as a parameter, but that seems clunky.
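The fix belongs in the role itself, but the failure mode is easy to sketch in Python: guard the nested lookup instead of assuming the iam module always returns user_meta. Names mirror the s3user output above; the shape below user_meta is an assumption for illustration:

```python
# Sketch: tolerate a registered result without user_meta (user pre-exists).
def extract_access_keys(s3user):
    """Return freshly created access keys, or None if the user already existed."""
    meta = s3user.get("user_meta")
    if meta is None:
        # User existed before this run: the module returned only key status,
        # so the caller must fall back to previously stored credentials.
        return None
    return meta.get("access_keys")
```

The Ansible equivalent would be a when: guard (or a default() filter) on the set_fact task, plus a path for reusing stored credentials.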
TASK [instance-groups : Add app instances to host group] ***********************
[DEPRECATION WARNING]: Skipping task due to undefined Error, in the future this will be a fatal error.: 'dict object' has no attribute 'tag_provision_node'.
This feature will
be removed in a future release. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
skipping: [localhost]
Fail with file not found:
TASK [ssh-key : OSE ec2 key] ***************************************************
task path: /home/mangis/workspace/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/ssh-key/tasks/main.yaml:2
skipping: [localhost] => (item=) => {"changed": false, "item": "", "skip_reason": "Conditional check failed", "skipped": true}
TASK [cloudformation-infra : Create Greenfield Infrastructure] *****************
task path: /home/mangis/workspace/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/cloudformation-infra/tasks/main.yaml:2
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: mangis
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo $HOME/.ansible/tmp/ansible-tmp-1475326149.27-207773419727810 `" && echo ansible-tmp-1475326149.27-207773419727810="` echo $HOME/.ansible/tmp/ansible-tmp-1475326149.27-207773419727810 `" ) && sleep 0'
<127.0.0.1> PUT /tmp/tmpTaGfYL TO /home/mangis/.ansible/tmp/ansible-tmp-1475326149.27-207773419727810/cloudformation
<127.0.0.1> EXEC /bin/sh -c 'LANG=en_US.UTF-8 LC_ALL=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 /usr/bin/python2 /home/mangis/.ansible/tmp/ansible-tmp-1475326149.27-207773419727810/cloudformation; rm -rf "/home/mangis/.ansible/tmp/ansible-tmp-1475326149.27-207773419727810/" > /dev/null 2>&1 && sleep 0'
An exception occurred during task execution. The full traceback is:
Traceback (most recent call last):
File "/tmp/ansible_R4Ze5y/ansible_module_cloudformation.py", line 401, in <module>
main()
File "/tmp/ansible_R4Ze5y/ansible_module_cloudformation.py", line 269, in main
template_body = open(module.params['template'], 'r').read()
IOError: [Errno 2] No such file or directory: 'roles/cloudformation-infra/files/greenfield.json'
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "invocation": {"module_name": "cloudformation"}, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_R4Ze5y/ansible_module_cloudformation.py\", line 401, in <module>\n main()\n File \"/tmp/ansible_R4Ze5y/ansible_module_cloudformation.py\", line 269, in main\n template_body = open(module.params['template'], 'r').read()\nIOError: [Errno 2] No such file or directory: 'roles/cloudformation-infra/files/greenfield.json'\n", "module_stdout": "", "msg": "MODULE FAILURE", "parsed": false}
I am testing this with a slightly modified version, as I don't have a subscription, but that should not matter here. It looks like the template path is not being resolved.
Is anyone working on auto-scaling groups?
If not, I wouldn't mind starting on it.
I'd like to be able to separate the image-creation step from the provisioning of the instance group. I was thinking gcloud-image.sh and gcloud.sh, where gcloud.sh would either skip the image creation or call into gcloud-image.sh if and only if REGISTERED_IMAGE was unset.
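The guard is simple; a sketch of the intended control flow in Python (gcloud-image.sh is the proposed script, not an existing one, and the run parameter is injectable just to make the logic testable):

```python
# Sketch: run the image-creation step only when REGISTERED_IMAGE is unset.
import os
import subprocess

def ensure_image(env=None, run=subprocess.check_call):
    """Call gcloud-image.sh iff no pre-registered image is configured."""
    env = os.environ if env is None else env
    if env.get("REGISTERED_IMAGE"):
        return False  # image already registered; skip creation
    run(["./gcloud-image.sh"])
    return True
```

In gcloud.sh itself this collapses to a one-line shell test around the call.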
I tried a containerized setup with this tool in Amazon Ireland:
./ose-on-aws.py --keypair=$AWS_KEY --rhsm-user=$RHUSER --rhsm-password=$RHPWD --public-hosted-zone=konttikoulu.fi --rhsm-pool=$RHPOOL --region eu-west-1 --ami ami-02ace471 --master-instance-type t2.medium --node-instance-type t2.large --app-instance-type t2.medium --containerized true --app-dns-prefix=iken
which leads to failure in:
TASK [set_fact] ****************************************************************
fatal: [ose-master01.konttikoulu.fi]: FAILED! => {
"failed": true
}
MSG:
the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'openshift'
The error appears to have been in '/usr/share/ansible/openshift-ansible/playbooks/common/openshift-cluster/initialize_facts.yml': line 12, column 5, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
hostname: "{{ openshift_hostname | default(None) }}"
- set_fact:
^ here
fatal: [ose-infra-node02.konttikoulu.fi]: FAILED! => {
"failed": true
}
My clone of this tool is of version:
commit f5a7778
Merge: dd87087 df0f724
Author: Jason DeTiberus [email protected]
Date: Mon Nov 21 12:10:41 2016 -0500
and I'm running this from the latest RHEL 7 container on my laptop.
openshift-ansible-playbooks-3.3.54-1.git.0.61a1dee.el7.noarch
I'm running my setup using the qcow downloaded from the server. Is this because I'm not in the right pool for my subscription?
+ gcloud --project openshift-gce-devel compute copy-files /Users/clayton/projects/origin/src/github.com/openshift/openshift-ansible-contrib/reference-architecture/gce-cli/ansible-config.yml cloud-user@bastion: --zone us-central1-a
Warning: Permanently added 'compute.5845721122299694843' (ECDSA) to the list of known hosts.
ansible-config.yml 100% 779 0.8KB/s 00:00
+ gcloud --project openshift-gce-devel compute ssh cloud-user@bastion --zone us-central1-a --ssh-flag=-t --command 'sudo sh -c '\''
yum install -y python-libcloud atomic-openshift-utils;
if ! grep -q "export GCE_PROJECT=openshift-gce-devel" /etc/profile.d/ocp.sh 2>/dev/null; then
echo "export GCE_PROJECT=openshift-gce-devel" >> /etc/profile.d/ocp.sh;
fi
if ! grep -q "export INVENTORY_IP_TYPE=internal" /etc/profile.d/ocp.sh 2>/dev/null; then
echo "export INVENTORY_IP_TYPE=internal" >> /etc/profile.d/ocp.sh;
fi
'\'''
Loaded plugins: search-disabled-repos
google-cloud-compute/signature | 454 B 00:00:00
google-cloud-compute/signature | 1.4 kB 00:00:00 !!!
google-cloud-compute/primary | 1.8 kB 00:00:00
google-cloud-compute 4/4
No package python-libcloud available.
No package atomic-openshift-utils available.
Error: Nothing to do
Connection to 104.155.172.153 closed.
Hi
My deployments sometimes get stuck for quite a while on the RHN registration. After being "stuck" for two hours, one failed:
PLAY [localhost] ***************************************************************
TASK [instance-groups : Add bastion to group] **********************************
changed: [localhost]
TASK [instance-groups : Add masters to requisite groups] ***********************
changed: [localhost] => (item=ip-10-20-4-101.us-west-2.compute.internal)
changed: [localhost] => (item=ip-10-20-5-157.us-west-2.compute.internal)
changed: [localhost] => (item=ip-10-20-6-26.us-west-2.compute.internal)
TASK [instance-groups : Add a master to the primary masters group] *************
changed: [localhost] => (item=ip-10-20-4-101.us-west-2.compute.internal)
TASK [instance-groups : Add infra instances to host group] *********************
changed: [localhost] => (item=ip-10-20-5-61.us-west-2.compute.internal)
changed: [localhost] => (item=ip-10-20-4-22.us-west-2.compute.internal)
TASK [instance-groups : Add app instances to host group] ***********************
changed: [localhost] => (item=ip-10-20-5-170.us-west-2.compute.internal)
changed: [localhost] => (item=ip-10-20-4-177.us-west-2.compute.internal)
TASK [instance-groups : Add app instances to host group] ***********************
PLAY [localhost] ***************************************************************
TASK [host-up : check to see if host is available] *****************************
ok: [localhost]
PLAY [cluster_hosts] ***********************************************************
TASK [setup] *******************************************************************
ok: [ose-master01....com]
TASK [rhsm-subscription : Register host] ***************************************
changed: [ose-master01....com]
PLAY [cluster_hosts] ***********************************************************
TASK [setup] *******************************************************************
ok: [ose-master02....com]
TASK [rhsm-subscription : Register host] ***************************************
fatal: [ose-master02....com]: UNREACHABLE! => {
"changed": false,
"unreachable": true
}
MSG:
Failed to connect to the host via ssh: Shared connection to ose-master02....com closed.
PLAY RECAP *********************************************************************
localhost : ok=6 changed=5 unreachable=0 failed=0
ose-master01....com : ok=2 changed=1 unreachable=0 failed=0
ose-master02....com : ok=1 changed=0 unreachable=1 failed=0
I suspect it may be caused by some DNS propagation delay or another race condition. Connecting to master-02 did work when connecting manually:
ssh -v ose-master02.....com
OpenSSH_7.2p2, OpenSSL 1.0.2h-fips 3 May 2016
debug1: Reading configuration data /home/jhenner/.ssh/config
debug1: /home/jhenner/.ssh/config line 81: Applying options for *
debug1: /home/jhenner/.ssh/config line 198: Applying options for *.....com
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 58: Applying options for *
debug1: Executing proxy command: exec ssh ec2-user@bastion -W ose-master02.....com:22
debug1: permanently_drop_suid: 1000
debug1: identity file /home/jhenner/work/cm-infra/jenkins-rhel-slave/id_rsa type 1
debug1: key_load_public: No such file or directory
debug1: identity file /home/jhenner/work/cm-infra/jenkins-rhel-slave/id_rsa-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_7.2
Warning: Permanently added 'bastion.....com' (ECDSA) to the list of known hosts.
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.6.1
debug1: match: OpenSSH_6.6.1 pat OpenSSH_6.6.1* compat 0x04000000
debug1: Authenticating to ose-master02.....com:22 as 'ec2-user'
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: algorithm: [email protected]
debug1: kex: host key algorithm: ecdsa-sha2-nistp256
debug1: kex: server->client cipher: [email protected] MAC: <implicit> compression: none
debug1: kex: client->server cipher: [email protected] MAC: <implicit> compression: none
debug1: kex: [email protected] need=64 dh_need=64
debug1: kex: [email protected] need=64 dh_need=64
debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
debug1: Server host key: ecdsa-sha2-nistp256 SHA256:Xxq5YiZ1fL5MiO4pF8pzvHfg/BvUwBX/qjjRIQhz4kU
Warning: Permanently added 'ose-master02.....com' (ECDSA) to the list of known hosts.
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: rekey after 134217728 blocks
debug1: SSH2_MSG_NEWKEYS received
debug1: Skipping ssh-dss key jhenner@jezevec - not in PubkeyAcceptedKeyTypes
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Next authentication method: publickey
debug1: Offering RSA public key: jhenner@veverka
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Offering RSA public key: /home/jhenner/work/cm-infra/jenkins-rhel-slave/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 277
debug1: Authentication succeeded (publickey).
Authenticated to ose-master02.....com (via proxy).
debug1: channel 0: new [client-session]
debug1: Requesting [email protected]
debug1: Entering interactive session.
debug1: pledge: proc
debug1: Sending environment.
debug1: Sending env LC_PAPER = cs_CZ.utf8
debug1: Sending env LC_MONETARY = cs_CZ.utf8
debug1: Sending env LC_NUMERIC = cs_CZ.utf8
debug1: Sending env XMODIFIERS = @im=none
debug1: Sending env LANG = en_US.utf8
debug1: Sending env LC_MEASUREMENT = cs_CZ.utf8
debug1: Sending env LC_TIME = cs_CZ.utf8
Last login: Wed Oct 26 09:34:25 2016 from ip-10-20-1-39.us-west-2.compute.internal
./ose-on-aws.py --rhsm-user=* --rhsm-password=** --public-hosted-zone=....com --keypair=jenkins --ami=ami-775e4f16 --rhsm-pool=ES0113909 --no-confirm --region=us-west-2
When I run the brownfield deployment (i.e. run it again), it passes this phase with no problems.
To tear down, you need to run:
ansible-playbook -i inventory/aws/hosts -e 'region=us-west-2 s3_username=openshift-s3-docker-registry ci=true' playbooks/teardown.yaml
Change the region to your own.
What is the AMI to use if I want to install in another AWS region? It seems that I can only install in us-east-1, as I don't have the AMI for other regions.
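A per-region lookup table (suggested elsewhere in this thread set) is one way to handle this. The AMI IDs below are placeholders, not real images — they would need to be filled in from the RHEL images published in each region:

```python
# Sketch: resolve an AMI per region, preferring an explicit --ami override.
REGION_AMIS = {
    "us-east-1": "ami-xxxxxxxx",  # placeholder, not a real AMI
    "us-west-2": "ami-yyyyyyyy",  # placeholder
    "eu-west-1": "ami-zzzzzzzz",  # placeholder
}

def ami_for_region(region, override=None):
    """Return the AMI for a region, or exit with a hint to pass --ami."""
    if override:
        return override
    try:
        return REGION_AMIS[region]
    except KeyError:
        raise SystemExit("No known AMI for region %s; pass --ami" % region)
```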
TASK [instance-groups : Add masters to requisite groups] ***********************
task path: /root/AWS/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/instance-groups/tasks/main.yaml:9
fatal: [localhost]: FAILED! => {
"failed": true
}
MSG:
'dict object' has no attribute 'tag_openshift_role_master'
getting this error...
TASK [host-up : check to see if host is available] *****************************
fatal: [localhost]: FAILED! => {"changed": false, "elapsed": 300, "failed": true, "msg": "Timeout when waiting for bastion.ocp.alberttwong.com:22"}
I tried to connect manually:
[root@localhost aws-ansible]# ssh [email protected]
[ec2-user@ip-10-20-1-191 ~]$ ls
[ec2-user@ip-10-20-1-191 ~]$ exit
logout
Connection to ec2-35-161-6-30.us-west-2.compute.amazonaws.com closed.
It didn't ask me for a username and let me straight in.
I found out the issue is that somehow the DNS entries in Route53 aren't propagating down to me. Make sure you can get to your bastion (bastion.ocp.alberttwong.com) by ssh [email protected]. If this doesn't work, fix your DNS.
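The propagation check can be done up front; a minimal sketch that verifies the bastion name resolves before any SSH attempt (the host name is just an example):

```python
# Sketch: fail fast when the bastion's Route53 record has not propagated yet.
import socket

def dns_resolves(host):
    """Return True if `host` currently resolves on this machine."""
    try:
        socket.gethostbyname(host)
        return True
    except socket.gaierror:
        return False

# e.g. check dns_resolves("bastion.ocp.example.com") before running playbooks
```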
Hi again :)
I tried moving openshift_master_identity_providers: from playbooks/openshift-setup.yaml to playbooks/vars/main.yaml so I could use a different OAuth provider without editing a core playbook. I commented the provider out of openshift-setup.yaml, but I always ended up with the default DenyAll provider when I put it in vars/main.yaml.
I'd like the ability to change the OAuth provider per environment, e.g. using the --env-file= feature from #39.
The use case is to have this repo as a git submodule of a larger deployment system that manages multiple IaaS providers and environments; I don't want to have to modify an upstream repo.
As a user I want to add a node to my cluster; I know the size and what it should do, and I have my credentials ready. I'd prefer not to have to dig into the detailed network/security setup of my cluster.
add_node.py
Things the user can easily specify:
region: eu-west-1
node_instance_type: r3.xlarge
keypair: OpenShift-Key
public_hosted_zone: ocp.rocks
shortname: ose-app-node04
rhsm_user: jbrannst
rhsm_password: *******
rhsm_pool: Employee SKU
Items that are hard to specify, for which we should be able to provide defaults, if necessary by querying AWS:
ami: ami-ce66d8bd
(can be looked up via a hardcoded per-region table, as suggested in another thread)
subnet_id: subnet-44763020
(as there are typically 3 relevant options, they should be printed with one pre-selected, asking the user to confirm or enter another)
node_sg: sg-f62ce390
(there is typically only one option)
iam_role: OpenShift-Infra-NodeInstanceProfile-8DDK8K32INSK
(also only one option)
I have no idea how hard this would be to do, but it would definitely boost the user experience.
Hi,
I tried to create a sample environment with the following command:
vagrant up --provider=libvirt --debug
Reported Errors:
An error occurred while executing multiple actions in parallel.
Any errors that occurred are shown below.
An error occurred while executing the action on the 'node1'
machine. Please handle this error then try again:
Vagrant can't use the requested machine because it is locked! This
means that another Vagrant process is currently reading or modifying
the machine. Please wait for that Vagrant process to end and try
again. Details about the machine are shown below:
Name: node1
Provider: libvirt
An error occurred while executing the action on the 'node2'
machine. Please handle this error then try again:
While attempting to connect with SSH, a "no route to host" (EHOSTUNREACH)
error was received. Please verify your network settings are correct
and try again.
An error occurred while executing the action on the 'admin1'
machine. Please handle this error then try again:
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
Debug Information:
DEBUG ssh: == Net-SSH connection debug-level log END ==
INFO retryable: Retryable exception raised: #<Errno::EHOSTUNREACH: No route to host - connect(2) for 192.168.121.64:22>
INFO ssh: Attempting to connect to SSH...
INFO ssh: - Host: 192.168.121.64
INFO ssh: - Port: 22
INFO ssh: - Username: vagrant
INFO ssh: - Password? false
INFO ssh: - Key Path: ["/home/labuser/openshift-ansible-contrib/vagrant/.vagrant/machines/node1/libvirt/private_key"]
DEBUG ssh: - connect_opts: {:auth_methods=>["none", "hostbased", "publickey"], :config=>false, :forward_agent=>false, :send_env=>false, :keys_only=>true, :paranoid=>false, :password=>nil, :port=>22, :timeout=>15, :user_known_hosts_file=>[], :verbose=>:debug, :logger=>#<Logger:0x007fc480bf8820 @progname=nil, @level=0, @default_formatter=#<Logger::Formatter:0x007fc480bf87f8 @datetime_format=nil>, @formatter=nil, @logdev=#<Logger::LogDevice:0x007fc480bf8730 @shift_size=nil, @shift_age=nil, @filename=nil, @dev=#<StringIO:0x007fc480bf8870>, @mutex=#<Logger::LogDevice::LogDeviceMutex:0x007fc480bf8708 @mon_owner=nil, @mon_count=0, @mon_mutex=#<Mutex:0x007fc480bf8690>>>, :keys=>["/home/labuser/openshift-ansible-contrib/vagrant/.vagrant/machines/node1/libvirt/private_key"]}
DEBUG ssh: == Net-SSH connection debug-level log START ==
DEBUG ssh: D, [2016-10-25T07:01:55.922869 #4853] DEBUG -- net.ssh.transport.session[3fe2405c97b8]: establishing connection to 192.168.121.64:22
DEBUG ssh: == Net-SSH connection debug-level log END ==
ERROR warden: Error occurred: While attempting to connect with SSH, a "no route to host" (EHOSTUNREACH)
error was received. Please verify your network settings are correct
and try again.
INFO warden: Beginning recovery process...
INFO warden: Calling recover: #<VagrantPlugins::ProviderLibvirt::Action::WaitTillUp:0x00000002621928>
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
ERROR warden: Error occurred: While attempting to connect with SSH, a "no route to host" (EHOSTUNREACH)
error was received. Please verify your network settings are correct
and try again.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
ERROR warden: Error occurred: While attempting to connect with SSH, a "no route to host" (EHOSTUNREACH)
error was received. Please verify your network settings are correct
and try again.
INFO warden: Beginning recovery process...
INFO warden: Calling recover: #<Vagrant::Action::Builtin::Call:0x000000024b6598>
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO warden: Recovery complete.
INFO warden: Beginning recovery process...
INFO warden: Recovery complete.
INFO environment: Released process lock: machine-action-985e874cf515a9058c4a05af16e98c77
INFO interface: error: An error occurred. The error will be shown after all tasks complete.
INFO interface: error: ==> master1: An error occurred. The error will be shown after all tasks complete.
==> master1: An error occurred. The error will be shown after all tasks complete.
Host:
Ubuntu 16.04.01
Ansible 2.1.2.0
Vagrant 1.8.6
Best,
Peter
Using Ansible compiled from source, having add_node defined in the instance groups causes the installation to fail, whereas with the RH-provided RPM we are only warned with:
[DEPRECATION WARNING]: Skipping task due to undefined Error, in the future this will be a fatal error.: 'dict object' has no attribute 'tag_provision_node'.
This feature will be removed in a future release.
Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
skipping: [localhost]
Instead of leveraging kube-proxy, it would probably be better to stand up either Apache or HAProxy to forward only the healthz/ready endpoint, rather than proxying the entire API behind 8080.
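A minimal sketch of what that could look like with HAProxy (the hostname, port, and paths here are illustrative assumptions, not taken from the repo):

```haproxy
# Hypothetical haproxy.cfg fragment: expose only the health endpoint locally,
# deny everything else instead of proxying the whole API.
frontend healthz
    mode http
    bind 127.0.0.1:8080
    default_backend master-api-healthz

backend master-api-healthz
    mode http
    # Allow only the health/readiness paths through.
    acl is_healthz path /healthz /healthz/ready
    http-request deny if !is_healthz
    server master1 master1.example.com:443 ssl verify none
```

This keeps the rest of the API surface off the unauthenticated port while still giving load balancers something to probe.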
Hi, first of all, congratulations on your good work!
I'm trying to install OpenShift Origin on AWS in the us-east-1 region.
The ose-on-aws.py command executes almost successfully. However, I get the following error on all 3 masters:
TASK [openshift_cloud_provider : Create cloud config] **************************
fatal: [ose-master01.foxtek.net]: FAILED! => {
"changed": false,
"failed": true,
"rc": 257
}
MSG: Destination /etc/origin/cloudprovider/aws.conf does not exist !
So, the "play recap" shows the following summary:
PLAY RECAP *********************************************************************
localhost : ok=19 changed=13 unreachable=0 failed=0
ose-app-node01.foxtek.net : ok=52 changed=8 unreachable=0 failed=0
ose-app-node02.foxtek.net : ok=52 changed=8 unreachable=0 failed=0
ose-infra-node01.foxtek.net : ok=52 changed=8 unreachable=0 failed=0
ose-infra-node02.foxtek.net : ok=52 changed=8 unreachable=0 failed=0
ose-master01.foxtek.net : ok=218 changed=65 unreachable=0 failed=1
ose-master02.foxtek.net : ok=195 changed=64 unreachable=0 failed=1
ose-master03.foxtek.net : ok=195 changed=64 unreachable=0 failed=1
Do you have any ideas regarding why the /etc/origin/cloudprovider/aws.conf file was not created on any of the master nodes?
I have connected to the master nodes ("ssh [email protected]") and the directory /etc/origin/cloudprovider exists but is totally empty (no files inside). Any ideas?
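For context on what should have landed there: the Kubernetes AWS cloud provider config that OpenShift templates into that directory is typically just a tiny INI file naming the availability zone (the zone value below is only an example):

```ini
[Global]
Zone = us-east-1a
```

So an empty /etc/origin/cloudprovider suggests the templating task itself never ran or failed silently, rather than a partially written file.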
Thank you very much in advance!
Best Regards
Hi,
https://github.com/openshift/openshift-ansible-contrib/blob/master/vagrant/README.md
Section: Installation
After cloning the GitHub repo, the documented path "vagrant-openshift-cluster/vagrant" does not exist.
It should be:
cd openshift-ansible-contrib/vagrant/
Best,
Peter
Hi, Where is the cloudformation template located for this HA architecture?
I am looking to make some modifications in cloud formation template and then deploy the stack.
Any help?
Thanks in advance
The uninstall command is wrong (it currently says "playook"). It should be:
ansible-playbook -i inventory/aws/hosts -e 'region=us-east-1 stack_name=openshift-infra ci=false' playbooks/teardown.yaml
./ose-on-aws.py --rhsm-user=$USERNAME --rhsm-password=$PASS --public-hosted-zone=jhenner.mooo.com --keypair=jenkins --ami=ami-775e4f16 --rhsm-pool=ES0113909 --no-confirm --region=us-west-2
Configured values:
ami: ami-775e4f16
region: us-west-2
master_instance_type: m4.large
node_instance_type: t2.medium
bastion_instance_type: t2.micro
keypair: jenkins
create_key: no
key_path: /dev/null
create_vpc: yes
vpc_id: None
private_subnet_id1: None
private_subnet_id2: None
private_subnet_id3: None
public_subnet_id1: None
public_subnet_id2: None
public_subnet_id3: None
byo_bastion: no
bastion_sg: /dev/null
console port: 443
deployment_type: openshift-enterprise
public_hosted_zone: jhenner.mooo.com
app_dns_prefix: apps
apps_dns: apps.jhenner.mooo.com
rhsm_user: ******
rhsm_password: *******
rhsm_pool: ES0113909
containerized: False
TASK [s3-registry-user : Create S3 OpenShift registry user] ********************
ok: [localhost]
TASK [s3-registry-user : Set fact] *********************************************
fatal: [localhost]: FAILED! => {
"failed": true
}
MSG:
the field 'args' has an invalid value, which appears to include a variable that is undefined. The error was: 'dict object' has no attribute 'user_meta'
The error appears to have been in '/home/jenkins/workspace/ose-on-aws-deploy/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/s3-registry-user/tasks/main.yaml': line 9, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
register: s3user
- name: Set fact
^ here
This is with the commit 8c52fb6
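A possible workaround sketch (hypothetical: `s3user` is the variable registered by the task above, but the fact name and key path below are my assumptions, not the role's actual code) would be to guard the fact against runs where the module returns no user_meta:

```yaml
# Sketch only: skip the fact when the registered result has no user_meta,
# e.g. when the IAM user already existed and the module returned nothing new.
- name: Set fact
  set_fact:
    s3_user_access_key: "{{ s3user.user_meta.access_keys[0].access_key_id }}"
  when: s3user.user_meta is defined
```

Whether skipping is acceptable here depends on how the fact is consumed later in the role.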
Below is a snip from the end of my vagrant provisioning. I believe this shows that admin1 (the one running the ansible playbook in question) CAN in fact reach all the other nodes, but for some reason this one task consistently fails with this same error. Not sure what's going on.
TASK [openshift_node_certificates : Ensure CA certificate exists on openshift_ca_host] ***
ok: [master1 -> 192.168.50.20] => {"changed": false, "stat": {"atime": 1478024342.644244, "checksum": "6f7070de78ad472fe8fee4232319111aaf3873d5", "ctime": 1478024342.638244, "dev": 64769, "exists": true, "gid": 0, "gr_name": "root", "inode": 1573519, "isblk": false, "ischr": false, "isdir": false, "isfifo": false, "isgid": false, "islnk": false, "isreg": true, "issock": false, "isuid": false, "md5": "66d821c9bb1611bcbc784110e8c23ba6", "mode": "0644", "mtime": 1478024342.638244, "nlink": 1, "path": "/etc/origin/master/ca.crt", "pw_name": "root", "rgrp": true, "roth": true, "rusr": true, "size": 1070, "uid": 0, "wgrp": false, "woth": false, "wusr": true, "xgrp": false, "xoth": false, "xusr": false}}
TASK [openshift_node_certificates : fail] **************************************
skipping: [master1] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}
TASK [openshift_node_certificates : Check status of node certificates] *********
ok: [node2] => (item=system:node:node2.example.com.crt) => {"changed": false, "item": "system:node:node2.example.com.crt", "stat": {"exists": false}}
ok: [master1] => (item=system:node:master1.example.com.crt) => {"changed": false, "item": "system:node:master1.example.com.crt", "stat": {"exists": false}}
ok: [node1] => (item=system:node:node1.example.com.crt) => {"changed": false, "item": "system:node:node1.example.com.crt", "stat": {"exists": false}}
ok: [node2] => (item=system:node:node2.example.com.key) => {"changed": false, "item": "system:node:node2.example.com.key", "stat": {"exists": false}}
ok: [master1] => (item=system:node:master1.example.com.key) => {"changed": false, "item": "system:node:master1.example.com.key", "stat": {"exists": false}}
ok: [node1] => (item=system:node:node1.example.com.key) => {"changed": false, "item": "system:node:node1.example.com.key", "stat": {"exists": false}}
ok: [master1] => (item=system:node:master1.example.com.kubeconfig) => {"changed": false, "item": "system:node:master1.example.com.kubeconfig", "stat": {"exists": false}}
ok: [node1] => (item=system:node:node1.example.com.kubeconfig) => {"changed": false, "item": "system:node:node1.example.com.kubeconfig", "stat": {"exists": false}}
ok: [node2] => (item=system:node:node2.example.com.kubeconfig) => {"changed": false, "item": "system:node:node2.example.com.kubeconfig", "stat": {"exists": false}}
ok: [master1] => (item=ca.crt) => {"changed": false, "item": "ca.crt", "stat": {"exists": false}}
ok: [node1] => (item=ca.crt) => {"changed": false, "item": "ca.crt", "stat": {"exists": false}}
ok: [node2] => (item=ca.crt) => {"changed": false, "item": "ca.crt", "stat": {"exists": false}}
ok: [master1] => (item=server.key) => {"changed": false, "item": "server.key", "stat": {"exists": false}}
ok: [node2] => (item=server.key) => {"changed": false, "item": "server.key", "stat": {"exists": false}}
ok: [node1] => (item=server.key) => {"changed": false, "item": "server.key", "stat": {"exists": false}}
ok: [master1] => (item=server.crt) => {"changed": false, "item": "server.crt", "stat": {"exists": false}}
ok: [node1] => (item=server.crt) => {"changed": false, "item": "server.crt", "stat": {"exists": false}}
ok: [node2] => (item=server.crt) => {"changed": false, "item": "server.crt", "stat": {"exists": false}}
TASK [openshift_node_certificates : set_fact] **********************************
ok: [master1] => {"ansible_facts": {"node_certs_missing": true}, "changed": false}
ok: [node1] => {"ansible_facts": {"node_certs_missing": true}, "changed": false}
ok: [node2] => {"ansible_facts": {"node_certs_missing": true}, "changed": false}
TASK [openshift_node_certificates : Create openshift_generated_configs_dir if it does not exist] ***
fatal: [node1]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
fatal: [node2]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
fatal: [master1]: UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
[root@localhost ~]# ssh [email protected]
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the ECDSA key sent by the remote host is
06:35:3b:37:90:8e:50:32:4c:c0:67:d2:0e:54:79:0c.
Please contact your system administrator.
Add correct host key in /root/.ssh/known_hosts to get rid of this message.
Offending ECDSA key in /root/.ssh/known_hosts:2
Password authentication is disabled to avoid man-in-the-middle attacks.
Keyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.
Agent forwarding is disabled to avoid man-in-the-middle attacks.
Error: forwarding disabled due to host key check failure
ssh_exchange_identification: Connection closed by remote host
[root@localhost ~]# ssh [email protected]
Last login: Wed Oct 19 23:25:31 2016 from cpe-75-83-58-118.socal.res.rr.com
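When a Vagrant box is recreated, the stale entry can be dropped from known_hosts with ssh-keygen -R. A self-contained sketch (the IP comes from the log above; the throwaway file and placeholder key are stand-ins so the real ~/.ssh/known_hosts is untouched):

```shell
# Demonstrate clearing a stale host key from a known_hosts file.
kh=$(mktemp)
printf '192.168.50.21 ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopq\n' > "$kh"
# Remove all keys belonging to that host from the file.
ssh-keygen -R 192.168.50.21 -f "$kh" >/dev/null 2>&1
remaining=$(grep -c '192.168.50.21' "$kh" || true)
echo "entries remaining: $remaining"
rm -f "$kh" "$kh.old"
```

Against the real file it is just `ssh-keygen -R 192.168.50.21`, after which the next connection records the new key.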
related to #58
When installing from a Vagrant image, etc., the time might not be synced correctly. This may cause Python errors that are really hard to figure out.
I tried to deploy OCP on AWS using the python script :
./ose-on-aws.py --keypair=OSE-key --region eu-central-1 --ami ami-7def1712 --create-key=yes --key-path=/root/.ssh/id_rsa.pub --rhsm-user=dwojciec --rhsm-password=password \
--public-hosted-zone=mydomain.com --rhsm-pool="Red Hat OpenShift Container Platform, Standard, 2-Core" \
--github-client-secret=6746b9659154d680a06ee9ba07b1d379582ab40f --github-organization=myorg-openshift \
--github-client-id=f64169d9eaa3efccf87e -vvv
The error received is :
TASK [cloudformation-infra : Create Greenfield Infrastructure] *****************
task path: /root/AWS/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/cloudformation-infra/tasks/main.yaml:2
Using module file /usr/lib/python2.7/site-packages/ansible/modules/core/cloud/amazon/cloudformation.py
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "`echo $HOME/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145`" && echo ansible-tmp-1483376807.65-246631564182145="`echo $HOME/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145`" ) && sleep 0'
<127.0.0.1> PUT /tmp/tmph8KP_7 TO /root/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145/cloudformation.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145/ /root/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145/cloudformation.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python2 /root/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145/cloudformation.py; rm -rf "/root/.ansible/tmp/ansible-tmp-1483376807.65-246631564182145/" > /dev/null 2>&1 && sleep 0'
fatal: [localhost]: FAILED! => {
"changed": false,
"failed": true,
"invocation": {
"module_args": {
"aws_access_key": null,
"aws_secret_key": null,
"disable_rollback": false,
"ec2_url": null,
"notification_arns": null,
"profile": null,
"region": "eu-central-1",
"security_token": null,
"stack_name": "openshift-infra",
"stack_policy": null,
"state": "present",
"tags": null,
"template": "roles/cloudformation-infra/files/greenfield.json",
"template_format": null,
"template_parameters": {
"AmiId": "ami-7def1712",
"AppNodeInstanceType": "t2.medium",
"AppWildcardDomain": "*.apps.dwojciec.com",
"BastionInstanceType": "t2.micro",
"BastionRootVolType": "gp2",
"BastionUserData": "I2Nsb3VkLWNvbmZpZwp1c2VyczoKLSBkZWZhdWx0CgpzeXN0ZW1faW5mbzoKICBkZWZhdWx0X3VzZXI6CiAgICBuYW1lOiBlYzItdXNlcg==",
"InfraDockerVolSize": 25,
"InfraDockerVolType": "gp2",
"InfraInstanceType": "t2.medium",
"InfraRootVolSize": 15,
"InfraRootVolType": "gp2",
"KeyName": "OSE-key",
"MasterApiPort": "443",
"MasterClusterHostname": "internal-openshift-master.dwojciec.com",
"MasterClusterPublicHostname": "openshift-master.dwojciec.com",
"MasterDockerVolSize": 25,
"MasterDockerVolType": "gp2",
"MasterEmptyVolSize": 5,
"MasterEmptyVolType": "gp2",
"MasterEtcdVolSize": 25,
"MasterEtcdVolType": "gp2",
"MasterHealthTarget": "TCP:443",
"MasterInstanceType": "m4.large",
"MasterRootVolSize": 10,
"MasterRootVolType": "gp2",
"MasterUserData": "I2Nsb3VkLWNvbmZpZwpjbG91ZF9jb25maWdfbW9kdWxlczoKLSBkaXNrX3NldHVwCi0gbW91bnRzCgpmc19zZXR1cDoKLSBsYWJlbDogZXRjZF9zdG9yYWdlCiAgZmlsZXN5c3RlbTogeGZzCiAgZGV2aWNlOiAvZGV2L3h2ZGMKICBwYXJ0aXRpb246IGF1dG8KLSBsYWJlbDogZW1wdHlkaXIKICBmaWxlc3lzdGVtOiB4ZnMKICBkZXZpY2U6IC9kZXYveHZkZAogIHBhcnRpdGlvbjogYXV0bwoKcnVuY21kOgotIG1rZGlyIC1wIC92YXIvbGliL2V0Y2QKLSBta2RpciAtcCAvdmFyL2xpYi9vcmlnaW4vb3BlbnNoaWZ0LmxvY2FsLnZvbHVtZXMKCm1vdW50czoKLSBbIC9kZXYveHZkYywgL3Zhci9saWIvZXRjZCwgeGZzLCAiZGVmYXVsdHMiIF0KLSBbIC9kZXYveHZkZCwgL3Zhci9saWIvb3JpZ2luL29wZW5zaGlmdC5sb2NhbC52b2x1bWVzLCB4ZnMsICJkZWZhdWx0cyxncXVvdGEiIF0KCgp3cml0ZV9maWxlczoKLSBjb250ZW50OiB8CiAgICBERVZTPScvZGV2L3h2ZGInCiAgICBWRz1kb2NrZXJfdm9sCiAgICBEQVRBX1NJWkU9OTUlVkcKICAgIEVYVFJBX0RPQ0tFUl9TVE9SQUdFX09QVElPTlM9Ii0tc3RvcmFnZS1vcHQgZG0uYmFzZXNpemU9M0ciCiAgcGF0aDogL2V0Yy9zeXNjb25maWcvZG9ja2VyLXN0b3JhZ2Utc2V0dXAKICBvd25lcjogcm9vdDpyb290Cgp1c2VyczoKLSBkZWZhdWx0CgpzeXN0ZW1faW5mbzoKICBkZWZhdWx0X3VzZXI6CiAgICBuYW1lOiBlYzItdXNlcg==",
"NodeDockerVolSize": 25,
"NodeDockerVolType": "gp2",
"NodeEmptyVolSize": 50,
"NodeEmptyVolType": "gp2",
"NodeRootVolSize": 15,
"NodeRootVolType": "gp2",
"NodeUserData": "I2Nsb3VkLWNvbmZpZwpjbG91ZF9jb25maWdfbW9kdWxlczoKLSBkaXNrX3NldHVwCi0gbW91bnRzCgpmc19zZXR1cDoKLSBsYWJlbDogZW1wdHlkaXIKICBmaWxlc3lzdGVtOiB4ZnMKICBkZXZpY2U6IC9kZXYveHZkYwogIHBhcnRpdGlvbjogYXV0bwoKcnVuY21kOgotIG1rZGlyIC1wIC92YXIvbGliL29yaWdpbi9vcGVuc2hpZnQubG9jYWwudm9sdW1lcwoKbW91bnRzOgotIFsgL2Rldi94dmRjLCAvdmFyL2xpYi9vcmlnaW4vb3BlbnNoaWZ0LmxvY2FsLnZvbHVtZXMsIHhmcywgImRlZmF1bHRzLGdxdW90YSIgXQoKd3JpdGVfZmlsZXM6Ci0gY29udGVudDogfAogICAgREVWUz0nL2Rldi94dmRiJwogICAgVkc9ZG9ja2VyX3ZvbAogICAgREFUQV9TSVpFPTk1JVZHCiAgICBFWFRSQV9ET0NLRVJfU1RPUkFHRV9PUFRJT05TPSItLXN0b3JhZ2Utb3B0IGRtLmJhc2VzaXplPTNHIgogIHBhdGg6IC9ldGMvc3lzY29uZmlnL2RvY2tlci1zdG9yYWdlLXNldHVwCiAgb3duZXI6IHJvb3Q6cm9vdAoKdXNlcnM6Ci0gZGVmYXVsdAoKc3lzdGVtX2luZm86CiAgZGVmYXVsdF91c2VyOgogICAgbmFtZTogZWMyLXVzZXI=",
"PublicHostedZone": "dwojciec.com",
"Region": "eu-central-1",
"Route53HostedZone": "dwojciec.com.",
"S3BucketName": "openshift-infra-ocp-registry-dwojciec",
"S3User": "openshift-infra-s3-openshift-user",
"SubnetAvailabilityZones": "eu-central-1a,eu-central-1b",
"SubnetCidrBlocks": "10.20.1.0/24,10.20.2.0/24,10.20.3.0/24,10.20.4.0/24,10.20.5.0/24,10.20.6.0/24",
"VpcCidrBlock": "10.20.0.0/16",
"VpcName": "ose-multi-az-vpc-openshift-infra"
},
"template_url": null,
"validate_certs": true
},
"module_name": "cloudformation"
}
}
MSG:
Template error: Fn::Select cannot select nonexistent value at index 2
PLAY RECAP *********************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=1
I tried using the default region us-east-1 instead of the European region and I received this issue in the CloudFormation console:
CREATE_FAILED AWS::EC2::Subnet PublicSubnet3 Value (us-east-1c) for parameter availabilityZone is invalid. Subnets can currently only be created in the following availability zones: us-east-1e, us-east-1a, us-east-1b, us-east-1d.
I only succeeded by using: --region us-east-2 --ami ami-0a33696f
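A way to see why both regions fail: CloudFormation's Fn::Select is zero-indexed, and the error message above says the template selects index 2 from the AZ list, i.e. it expects a third availability zone. The eu-central-1 list passed in (SubnetAvailabilityZones above) has only two entries, which can be emulated roughly like this:

```shell
# eu-central-1 exposed only two AZs at the time of this report; selecting
# index 2 (the third element) from that list has nothing to return.
azs="eu-central-1a,eu-central-1b"
first=$(echo "$azs" | cut -d, -f1)
third=$(echo "$azs" | cut -d, -f3)   # empty: no third field exists
echo "first=$first third=$third"
```

Regions that expose three or more usable AZs (like us-east-2 here) avoid the error.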
Hi. Even though I have set the --public-hosted-zone parameter, accessing the console (for example https://openshift-master.MYDOMAIN.TLD/console/) redirects me to
https://openshift-master.rcook-aws.sysdeseng.com/oauth2callback/github?error=redirect_uri_mismatch&error_description=The+redirect_uri+MUST+match+the+registered+callback+URL+for+this+application.&error_uri=https%3A%2F%2Fdeveloper.github.com%2Fv3%2Foauth%2F%23redirect-uri-mismatch&...
It's possible that this has already been solved in the base openshift-ansible and we should be reusing a pattern from there.
I'm trying to deploy the aws-ansible reference architecture, and get this error when running ose-on-aws.py, seemingly when it tries to apply playbooks/openshift-install.yaml:
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.
The error appears to have been in '/vagrant/openshift-ansible-contrib/reference-architecture/aws-ansible/playbooks/roles/prerequisite/tasks/main.yaml': line 2, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
---
- name: Gather facts
^ here
I've tried this on both Mac OSX and CentOS 7 in Vagrant (above). In both cases all dependencies are installed and openshift-ansible is checked out - although on OSX the destination path has to be changed to /usr/local/share/... due to OSX security policy on /usr/share.
Me again :)
I'm reading this doc on OCP/AWS integration but can't find the AWS_ vars in any of the specified files.
Am I missing something?
Is there a blocker to using boto3? I'd like to use the cloudformation_facts module, and it's only in boto3.
Maybe more a question than an issue.
After provisioning I got all the infrastructure deployed with security groups, EC2 instances, etc.
As per the reference architecture, the bastion is the ONLY SSH entry point to the platform. But we are running Ansible from outside AWS, so access to the master hosts to build OpenShift itself is not permitted by the security group.
Action failing:
- hosts: cluster_hosts
gather_facts: yes
become: yes
serial: 1
user: ec2-user
vars_files:
- vars/main.yaml
roles:
- rhsm-subscription
The error is simple: no SSH possible.
PLAY [bastion] *****************************************************************
TASK [host-up : check to see if host is available] *****************************
ok: [bastion.bgol.lt] => {"changed": false, "elapsed": 20, "path": null, "port": 22, "search_regex": null, "state": "started"}
PLAY [cluster_hosts] ***********************************************************
TASK [setup] *******************************************************************
fatal: [ose-master01.bgol.lt]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh.", "unreachable": true}
PLAY RECAP *********************************************************************
bastion.bgol.lt : ok=1 changed=0 unreachable=0 failed=0
localhost : ok=5 changed=5 unreachable=0 failed=0
ose-master01.bgol.lt : ok=0 changed=0 unreachable=1 failed=0
Does this mean we can't build the infrastructure using an external Ansible master, or am I missing something?
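One common way to handle this (a sketch only; the hostnames come from the recap above, and whether the repo already generates an equivalent config is worth checking) is an ~/.ssh/config that tunnels every cluster host through the bastion:

```
# Hypothetical ssh config: reach the private masters/nodes via the bastion.
Host bastion
    HostName bastion.bgol.lt
    User ec2-user

Host *.bgol.lt
    User ec2-user
    ProxyCommand ssh -W %h:%p bastion
```

With this in place (and matching ssh_args in ansible.cfg), Ansible run from outside AWS can reach ose-master01.bgol.lt even though only the bastion's security group allows inbound SSH.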
Error message "- The tar archive is not a valid image." when running the ./gcloud script on Mac.