openshift / assisted-installer-agent
License: Apache License 2.0
The following branches are being fast-forwarded from the current development branch (master) as placeholders for future releases. No merging is allowed into these release branches until they are unfrozen for production release.
release-4.17
release-4.18
For more information, see the branching documentation.
I'm using the self-hosted assisted installer service to install Single Node OKD.
The assisted installer service is running in podman containers, as documented here.
This method of doing a single-node install of OKD used to work, but it has started to fail recently (within the last 30 days or so).
The host registers with the installer service but gets stuck on an NTP synchronization failure, as seen in the attached screenshot.
Looking into the pod logs of the assisted installer service, I see this message:
level=error msg="Received step reply <ntp-synchronizer-392f0f02> from infra-env <ff4ce4b9-a3cd-4c50-b258-24cfbba8d1e3> host <68b15b04-5cb1-429f-9778-3c8727d0235d> exit-code <-1> stderr <chronyc exited with non-zero exit code 127: \nchronyc: error while loading shared libraries: libnettle.so.8: cannot open shared object file: No such file or directory\n> stdout <>" func=github.com/openshift/assisted-service/internal/bminventory.logReplyReceived file="/go/src/github.com/openshift/origin/internal/bminventory/inventory.go:2992" go-id=9762 host_id=68b15b04-5cb1-429f-9778-3c8727d0235d infra_env_id=ff4ce4b9-a3cd-4c50-b258-24cfbba8d1e3 pkg=Inventory request_id=6a4edac8-f290-4cb2-813e-f6a67ef9c50b
The relevant part of the message is: chronyc: error while loading shared libraries: libnettle.so.8: cannot open shared object file: No such file or directory
I believe the root cause is the change introduced by this commit.
That commit mounts the chronyc binary of the underlying OS (on which the assisted-installer-agent container runs) into the /usr/bin directory inside the container. In my particular instance, that host OS is Fedora CoreOS 35.20220327.3.0. The problem in this case is that the chronyc command is a dynamically linked ELF that depends on the libnettle.so.8 shared library... which isn't present in the container. The container does contain libnettle.so.6, though.
Anyway, IMO this [bind-mounting the chronyc command from the underlying OS] is a container anti-pattern.
Wouldn't it be a better approach to use the chronyc installed by dnf install chrony in the Dockerfile here, which is used to build the assisted installer agent container image?
@tsorya, could you have a look at the change introduced in that commit? It adds a significant prerequisite: the shared libraries against which the host's chronyc binary is dynamically linked must also be present in the assisted installer agent container image. Is there a different approach?
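A quick way to confirm this kind of mismatch is to list the unresolved shared-library dependencies of the bind-mounted binary from inside the container. A minimal sketch (the /usr/bin/chronyc path is the bind-mount location described above):

```shell
# List any shared libraries a binary cannot resolve inside this container.
# ldd prints "not found" next to dependencies the dynamic loader cannot locate.
check_missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# On the failing image, per the error above, this flags libnettle.so.8.
check_missing_libs /usr/bin/chronyc
```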
release-ocm-2.9
release-ocm-2.8
release-ocm-2.10
Using the assisted installer with a guest VM on VirtualBox, RAM is not correctly detected. I cannot progress with the install because the 16 GiB minimum is not met, despite the guest having 16 GiB. See below:
At the time of opening this, I can't attach the agent logs because, for some reason, I can't SSH to the host despite generating an ISO with my public key.
release-ocm-2.11
Hey folks,
I'm trying to use your utility with a new Red Hat OpenShift installation (4.6) on Hetzner root servers.
The agent is working, but it detects only the physical interface (enp4s0) and not the vswitch interface. I created the vswitch device with the following commands:
sudo ip link add link enp4s0 name enp4s0.4000 type vlan id 4000
sudo ip link set enp4s0.4000 mtu 1400
sudo ip link set dev enp4s0.4000 up
sudo ip addr add 192.168.100.3/24 brd 192.168.100.255 dev enp4s0.4000
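For what it's worth, the same interface can also be expressed declaratively as a NetworkManager keyfile so it survives reboots; this is just the persistent form of the ip commands above (file path and section layout per NetworkManager's keyfile format, not a fix for the agent's detection):

```ini
# /etc/NetworkManager/system-connections/enp4s0.4000.nmconnection
# (must be owned by root with mode 600 for NetworkManager to load it)
[connection]
id=enp4s0.4000
type=vlan
interface-name=enp4s0.4000

[vlan]
parent=enp4s0
id=4000

[ethernet]
mtu=1400

[ipv4]
method=manual
address1=192.168.100.3/24
```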
My nmcli output (censored public IP):
enp4s0: connected to Wired Connection
"Intel I210"
ethernet (igb), 10:7B:44:AF:F1:EC, hw, mtu 1500
ip4 default
inet4 **.4.**.115/26
route4 0.0.0.0/0
route4 **.4.**.64/26
inet6 ****::127b:****:feaf:****/64
route6 fe80::/64
route6 ff00::/8
cni-podman0: connected to cni-podman0
"cni-podman0"
bridge, 12:FF:14:F2:7A:5F, sw, mtu 1500
inet4 10.88.0.1/16
route4 10.88.0.0/16
inet6 fe80::10ff:14ff:fef2:7a5f/64
route6 fe80::/64
route6 ff00::/8
enp4s0.4000: connected to enp4s0.4000
"enp4s0.4000"
vlan, 10:7B:44:AF:F1:EC, sw, mtu 1400
inet4 192.168.100.3/24
route4 192.168.100.0/24
inet6 fe80::127b:44ff:feaf:f1ec/64
route6 ff00::/8
route6 fe80::/64
lo: unmanaged
"lo"
loopback (unknown), 00:00:00:00:00:00, sw, mtu 65536
DNS configuration:
servers: 213.133.98.98 213.133.99.99 213.133.100.100
domains: your-server.de
interface: enp4s0
agent --text=true json output (with irrelevant information, such as the public IP, censored):
{
"bmc_address": "0.0.0.0",
"bmc_v6address": "::/0",
"boot": {
"current_boot_mode": "bios"
},
"cpu": {
"architecture": "x86_64",
"count": 12,
"flags": [
"fpu",
"vme",
"de",
"pse",
"tsc",
"msr",
"pae",
"mce",
"cx8",
"apic",
"sep",
"mtrr",
"pge",
"mca",
"cmov",
"pat",
"pse36",
"clflush",
"dts",
"acpi",
"mmx",
"fxsr",
"sse",
"sse2",
"ss",
"ht",
"tm",
"pbe",
"syscall",
"nx",
"pdpe1gb",
"rdtscp",
"lm",
"constant_tsc",
"arch_perfmon",
"pebs",
"bts",
"rep_good",
"nopl",
"xtopology",
"nonstop_tsc",
"cpuid",
"aperfmperf",
"pni",
"pclmulqdq",
"dtes64",
"monitor",
"ds_cpl",
"vmx",
"smx",
"est",
"tm2",
"ssse3",
"sdbg",
"fma",
"cx16",
"xtpr",
"pdcm",
"pcid",
"dca",
"sse4_1",
"sse4_2",
"x2apic",
"movbe",
"popcnt",
"tsc_deadline_timer",
"aes",
"xsave",
"avx",
"f16c",
"rdrand",
"lahf_lm",
"abm",
"cpuid_fault",
"epb",
"invpcid_single",
"pti",
"intel_ppin",
"ssbd",
"ibrs",
"ibpb",
"stibp",
"tpr_shadow",
"vnmi",
"flexpriority",
"ept",
"vpid",
"fsgsbase",
"tsc_adjust",
"bmi1",
"avx2",
"smep",
"bmi2",
"erms",
"invpcid",
"cqm",
"xsaveopt",
"cqm_llc",
"cqm_occup_llc",
"dtherm",
"ida",
"arat",
"pln",
"pts",
"md_clear",
"flush_l1d"
],
"frequency": 3800,
"model_name": "Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz"
},
"disks": [
{
"by_path": "/dev/disk/by-path/pci-0000:00:1f.2-ata-1",
"drive_type": "SSD",
"hctl": "0:0:0:0",
"model": "SAMSUNG_MZ7LM480",
"name": "sda",
"path": "/dev/sda",
"serial": "S1YJNX0H603507",
"size_bytes": 480103981056,
"vendor": "ATA",
"wwn": "0x5002538c403064d7"
},
{
"by_path": "/dev/disk/by-path/pci-0000:00:1f.2-ata-3",
"drive_type": "SSD",
"hctl": "2:0:0:0",
"model": "SAMSUNG_MZ7LM480",
"name": "sdb",
"path": "/dev/sdb",
"serial": "S1YJNX0H603505",
"size_bytes": 480103981056,
"vendor": "ATA",
"wwn": "0x5002538c403064d5"
}
],
"hostname": "static.***.251.***.138.clients.your-server.de",
"interfaces": [
{
"biosdevname": "em1",
"flags": [
"up",
"broadcast",
"multicast"
],
"has_carrier": true,
"ipv4_addresses": [
"***.201.***.139/26"
],
"ipv6_addresses": [
"****::3697:****:fe5a:***/64"
],
"mac_address": "**:97:f6:**:0d:8f",
"mtu": 1500,
"name": "enp4s0",
"product": "0x1533",
"speed_mbps": 1000,
"vendor": "0x8086"
}
],
"memory": {
"physical_bytes": 274877906944,
"usable_bytes": 270361206784
},
"system_vendor": {
"manufacturer": "ASUSTeK COMPUTER INC.",
"product_name": "Z10PA-U8 Series",
"serial_number": "System Serial Number"
}
}
Hi,
We use Assisted Service v2.3.1 to deploy OCP 4.8.29.
We have a 3 (control-plane) + 2 (worker) AI-based cluster deployment on bare-metal hardware. The hosts are not in the Ready state to proceed with installation (as seen on the Network page in the AI GUI); the host status is Insufficient even after NTP sync is successful.
^* 192.168.10.4 3 10 377 947 -292us[ -295us] +/- 92ms
$ timedatectl
Local time: Tue 2022-05-10 15:51:51 UTC
Universal time: Tue 2022-05-10 15:51:51 UTC
RTC time: Tue 2022-05-10 15:51:51
Time zone: UTC (UTC, +0000)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no
5/10/2022, 7:21:04 PM Updated status of the cluster to insufficient
5/10/2022, 7:21:04 PM Cluster validation 'api-vip-defined' is now fixed
5/10/2022, 7:07:40 PM Cluster validation 'ntp-server-configured' is now fixed
5/10/2022, 7:07:38 PM Host sl12345.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:07:14 PM Host sl12346.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:06:58 PM Host sl12347.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:06:34 PM Host sl12348.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:05:38 PM Cluster validation 'sufficient-masters-count' is now fixed
5/10/2022, 7:05:38 PM
warning
Host sl12345.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:05:38 PM Host sl12349.net: validation 'ntp-synced' is now fixed
5/10/2022, 7:05:14 PM
warning
Host sl12346.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:04:58 PM
warning
Host sl12347.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)
5/10/2022, 7:04:54 PM Host 67bb1e80-ccb6-2902-bebd-c722391c6b27: Successfully registered
5/10/2022, 7:04:39 PM
warning
Cluster validation 'ntp-server-configured' that used to succeed is now failing
5/10/2022, 7:04:34 PM
warning
Host sl12348.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server)
5/10/2022, 7:04:30 PM Host 90570f72-619d-2327-e948-7a1ac68387b6: Successfully registered
5/10/2022, 7:03:38 PM
warning
Cluster validation 'all-hosts-are-ready-to-install' that used to succeed is now failing
5/10/2022, 7:03:38 PM
warning
Host sl12349.net: updated status from discovering to insufficient (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server)
"configuration": [
{
"id": "pull-secret-set",
"status": "success",
"message": "The pull secret is set."
}
],
"hosts-data": [
{
"id": "all-hosts-are-ready-to-install",
"status": "failure",
"message": "The cluster has hosts that are not ready to install."
},
{
"id": "sufficient-masters-count",
"status": "success",
"message": "The cluster has a sufficient number of master candidates."
}
],
"network": [
{
"id": "api-vip-defined",
"status": "success",
"message": "The API virtual IP is defined."
},
{
"id": "api-vip-valid",
"status": "success",
"message": "api vip 192.168.10.40 belongs to the Machine CIDR and is not in use."
},
{
"id": "cluster-cidr-defined",
"status": "success",
"message": "The Cluster Network CIDR is defined."
},
{
"id": "dns-domain-defined",
"status": "success",
"message": "The base domain is defined."
},
{
"id": "ingress-vip-defined",
"status": "success",
"message": "The Ingress virtual IP is defined."
},
{
"id": "ingress-vip-valid",
"status": "success",
"message": "ingress vip 192.168.10.41 belongs to the Machine CIDR and is not in use."
},
{
"id": "machine-cidr-defined",
"status": "success",
"message": "The Machine Network CIDR is defined."
},
{
"id": "machine-cidr-equals-to-calculated-cidr",
"status": "success",
"message": "The Cluster Machine CIDR is equivalent to the calculated CIDR."
},
{
"id": "network-prefix-valid",
"status": "success",
"message": "The Cluster Network prefix is valid."
},
{
"id": "network-type-valid",
"status": "success",
"message": "The cluster has a valid network type"
},
{
"id": "networks-same-address-families",
"status": "success",
"message": "Same address families for all networks."
},
{
"id": "no-cidrs-overlapping",
"status": "success",
"message": "No CIDRS are overlapping."
},
{
"id": "ntp-server-configured",
"status": "success",
"message": "No ntp problems found"
},
{
"id": "service-cidr-defined",
"status": "success",
"message": "The Service Network CIDR is defined."
}
],
"operators": [
{
"id": "cnv-requirements-satisfied",
"status": "success",
"message": "cnv is disabled"
},
{
"id": "lso-requirements-satisfied",
"status": "success",
"message": "lso is disabled"
},
{
"id": "odf-requirements-satisfied",
"status": "success",
"message": "odf is disabled"
}
]
}
quay.io/edge-infrastructure/postgresql-12-centos7:0.3.25
quay.io/edge-infrastructure/assisted-service:v2.3.1
quay.io/edge-infrastructure/assisted-installer-ui:v2.3.9
quay.io/edge-infrastructure/assisted-image-service:v2.3.1
quay.io/edge-infrastructure/assisted-installer-agent:v2.3.1
quay.io/edge-infrastructure/assisted-installer:v2.3.1
quay.io/edge-infrastructure/assisted-installer-controller:v2.3.1
We are unable to proceed with the installation while the hosts remain in the 'Insufficient' state.
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
These updates have been manually edited so Renovate will no longer make changes. To discard all commits and start over, click on a checkbox.
Dockerfile.assisted_installer_agent
Dockerfile.assisted_installer_agent-build
registry.access.redhat.com/ubi9/go-toolset 1.20
Dockerfile.ocp
subsystem/Dockerfile.agent_test
None of my motherboards have a valid serial number in DMI.
All return the "To be filled by O.E.M." string.
Could this be the reason the assisted-installer-agent podman container starts only on the first PC and not on the others?
release-4.8
Contact the Test Platform or Automated Release teams for more information.
If you use the assisted installer with a VM provider that does not randomize the UUID as part of a VM clone operation, the hosts will all share the same UUID, which the assisted installer interface does not expect or handle.
Instead of a list of hosts showing up in the interface, only one host entry will show, and it will switch between all discovered hosts that share that UUID. If a distinct UUID is a requirement for AI, then the interface should recognize when hosts all have the same UUID and advise the user of the issue.
Although I have not tested it yet, I assume that for Parallels the --regenerate-src-uuid parameter should be used when cloning.
We are using c5 instances to test OpenShift cluster deployment with the assisted installer. The web UI reports the nodes as if they don't have any memory installed.
Below are the relevant outputs:
$ podman run --rm --net=host --privileged quay.io/ocpmetal/assisted-installer-agent inventory
{"bmc_address":"0.0.0.0","bmc_v6address":"::/0","boot":{"current_boot_mode":"bios"},"cpu":{"architecture":"x86_64","count":2,"flags":["fpu","vme","de","pse","tsc","msr","pae","mce","cx8","apic","sep","mtrr","pge","mca","cmov","pat","pse36","clflush","mmx","fxsr","sse","sse2","ss","ht","syscall","nx","pdpe1gb","rdtscp","lm","constant_tsc","rep_good","nopl","xtopology","nonstop_tsc","cpuid","aperfmperf","tsc_known_freq","pni","pclmulqdq","ssse3","fma","cx16","pcid","sse4_1","sse4_2","x2apic","movbe","popcnt","tsc_deadline_timer","aes","xsave","avx","f16c","rdrand","hypervisor","lahf_lm","abm","3dnowprefetch","invpcid_single","pti","fsgsbase","tsc_adjust","bmi1","hle","avx2","smep","bmi2","erms","invpcid","rtm","mpx","avx512f","avx512dq","rdseed","adx","smap","clflushopt","clwb","avx512cd","avx512bw","avx512vl","xsaveopt","xsavec","xgetbv1","xsaves","ida","arat","pku","ospke"],"frequency":3399.996,"model_name":"Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"},"disks":[{"drive_type":"SSD","name":"nvme0n1","path":"/dev/nvme0n1","size_bytes":21474836480}],"hostname":"ip-10-0-0-152.ec2.internal","interfaces":[{"flags":["up","broadcast","multicast"],"has_carrier":true,"ipv4_addresses":["10.0.0.152/24"],"ipv6_addresses":["fe80::c0c:72ff:fe1c:57d3/64"],"mac_address":"0e:0c:72:1c:57:d3","mtu":9001,"name":"eth0","product":"0xec20","vendor":"0x1d0f"}],"memory":{"usable_bytes":3710193664},"system_vendor":{"manufacturer":"Amazon EC2","product_name":"c5.large","serial_number":"ec294ec1-e09d-ff8c-21ac-81f963ae1a37"}}
$ podman run --rm --net=host --privileged quay.io/ocpmetal/assisted-installer-agent dmidecode -t 17
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
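As a data point when debugging this, the kernel's own view of memory is available regardless of whether the hypervisor populates SMBIOS "Memory Device" (type 17) entries. A quick sketch:

```shell
# Compare the kernel's view of memory with what dmidecode reports.
# /proc/meminfo is populated by the kernel and does not depend on SMBIOS
# type 17 entries, which some hypervisors leave empty.
mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gib=$(( (mem_kib + 524288) / 1048576 ))  # round to nearest GiB
echo "MemTotal: ${mem_kib} kB (~${mem_gib} GiB)"
```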
release-ocm-2.6
release-ocm-2.12
release-4.16
release-4.17
In the k4e-device-worker project, we rely on the inventory to gather HW info from the device, see:
https://github.com/jakub-dzon/k4e-device-worker/blob/main/internal/hardware/hardware.go#L13
However, since the device-worker runs as a process and not as a container, there are a few failures when collecting certain data, e.g.:
Oct 17 19:00:06 localhost.localdomain yggdrasild[1000]: [yggdrasild] 2021/10/17 19:00:06 [/usr/libexec/yggdrasil/device-worker] WARNING: Unable to read product_family: open /host/sys/class/dmi/id/product_family: no such file or directory
We'd like the option not to set the chroot to /host when not running as a container.
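A sketch of the kind of fallback we have in mind (the paths are the standard sysfs locations; /host is the prefix used when containerized; the variable names here are hypothetical):

```shell
# Hypothetical sketch: only prefix sysfs paths with /host when the host
# filesystem is actually mounted there (i.e. when running as a container).
if [ -e /host/sys/class/dmi/id ]; then
    DMI_ROOT=/host/sys/class/dmi/id
else
    DMI_ROOT=/sys/class/dmi/id
fi

# Read a DMI field if present, without failing when it is absent.
product_family=$(cat "$DMI_ROOT/product_family" 2>/dev/null || echo "unknown")
echo "product_family: $product_family"
```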
Hello,
I'm trying to install onto a set of machines with ASUS system boards. Unfortunately, the boards all report "To be filled by O.E.M." for their baseboard serial numbers. The service is not set up to handle this case, so only one of the nodes is recognized by the installer at a time, because the generated host ID is identical for all the hosts.
assisted-installer-agent/src/scanners/machine_uuid_scanner.go
Lines 104 to 106 in 569f4e4
assisted-installer-agent/src/scanners/machine_uuid_scanner.go
Lines 21 to 29 in 569f4e4
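As an illustration of a possible fallback (a sketch, not the agent's current behavior): when the DMI serial is the known placeholder string, a host-unique ID could be derived from another stable per-host value, such as the first NIC's MAC address:

```shell
# Hypothetical fallback sketch: if the DMI serial is the OEM placeholder,
# hash a per-host value (here the first non-zero NIC MAC) into a stable ID.
serial="To be filled by O.E.M."
case "$serial" in
    "To be filled by O.E.M."|"")
        # Take the first non-loopback MAC address as a per-host stable value.
        mac=$(cat /sys/class/net/*/address 2>/dev/null | grep -v '^00:00:00' | head -n1)
        host_id=$(printf '%s' "$mac" | md5sum | awk '{print $1}')
        ;;
    *)
        host_id=$(printf '%s' "$serial" | md5sum | awk '{print $1}')
        ;;
esac
echo "$host_id"
```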
Thank you
release-ocm-2.7
Hi,
According to the OpenShift documentation, the hosts parameter is optional in the agent-config.yaml file: https://docs.openshift.com/container-platform/4.12/installing/installing_with_agent_based_installer/installing-with-agent-based-installer.html
However, with the following configuration I'm experiencing a crash when creating the ISO image.
The error does not occur if I add hosts; that works. However, my need is to create the boot ISO so I can create my VMs from vSphere with DHCP.
apiVersion: v1alpha1
metadata:
name: cluster-yxdn
rendezvousIP: 10.0.155.74
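For reference, this is the shape of a minimal hosts entry that avoids the crash (the hostname, interface name, and MAC address below are placeholders):

```yaml
hosts:
  - hostname: master-0
    interfaces:
      - name: eno1
        macAddress: 00:ef:44:21:e6:a5
```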
Error message:
DEBUG Fetching Agent Manifests...
DEBUG Fetching Agent PullSecret...
DEBUG Fetching Install Config...
DEBUG Reusing previously-fetched Install Config
DEBUG Generating Agent PullSecret...
DEBUG Fetching InfraEnv Config...
DEBUG Fetching Install Config...
DEBUG Reusing previously-fetched Install Config
DEBUG Fetching Agent Config...
DEBUG Reusing previously-fetched Agent Config
DEBUG Generating InfraEnv Config...
DEBUG Fetching NMState Config...
DEBUG Fetching Agent Config...
DEBUG Reusing previously-fetched Agent Config
DEBUG Fetching Install Config...
DEBUG Reusing previously-fetched Install Config
DEBUG Generating NMState Config...
DEBUG Fetching AgentClusterInstall Config...
DEBUG Fetching Install Config...
DEBUG Reusing previously-fetched Install Config
DEBUG Generating AgentClusterInstall Config...
panic: runtime error: index out of range [0] with length 0
goroutine 1 [running]:
github.com/openshift/installer/pkg/asset/agent/manifests.(*AgentClusterInstall).Generate(0xc000ae6310, 0x5?)
/go/src/github.com/openshift/installer/pkg/asset/agent/manifests/agentclusterinstall.go:185 +0xe85
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc0009c93b0, {0x1a81ca60, 0xc000ae6310}, {0xc000ac63aa, 0x6})
/go/src/github.com/openshift/installer/pkg/asset/store/store.go:227 +0x5fa
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc0009c93b0, {0x1a81ca90, 0xc0002d4620}, {0xc000a4d00c, 0x4})
/go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x75b
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc0009c93b0, {0x1a81ca30, 0xc0009ffb30}, {0x4bf26f2, 0x2})
/go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x75b
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc0009c93b0, {0x7f616b57a918, 0x1dbb22a0}, {0x0, 0x0})
/go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x75b
github.com/openshift/installer/pkg/asset/store.(*storeImpl).Fetch(0x7ffd48a143f2?, {0x7f616b57a918, 0x1dbb22a0}, {0x1db7f9c0, 0x3, 0x3})
/go/src/github.com/openshift/installer/pkg/asset/store/store.go:77 +0x48
main.runTargetCmd.func1({0x7ffd48a143f2, 0x7f})
/go/src/github.com/openshift/installer/cmd/openshift-install/create.go:261 +0x125
main.runTargetCmd.func2(0x1db8cce0?, {0xc0009c90b0?, 0x3?, 0x3?})
/go/src/github.com/openshift/installer/cmd/openshift-install/create.go:291 +0xe7
github.com/spf13/cobra.(*Command).execute(0x1db8cce0, {0xc0009c9050, 0x3, 0x3})
/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:876 +0x67b
github.com/spf13/cobra.(*Command).ExecuteC(0xc00042e280)
/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:990 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
/go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:918
main.installerMain()
/go/src/github.com/openshift/installer/cmd/openshift-install/main.go:61 +0x2b0
The agent service doesn't start on cluster nodes (booted from agent.iso) because okd-binaries.sh can't pull an OKD image: HTTPS_PROXY is not set and /etc/profile.d/proxy.sh doesn't exist.
As a temporary workaround, it's possible to connect to each node and add export HTTP(S)_PROXY=_proxy_url_ before the podman pull line:
sudo vi /usr/local/bin/okd-binaries.sh ; sudo systemctl restart okd-overlay.service
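Concretely, the temporary edit looks like this (the proxy URL below is a placeholder for your environment's proxy):

```shell
# Added inside /usr/local/bin/okd-binaries.sh, before the `podman pull` line.
# The proxy URL is a placeholder; use your environment's actual proxy.
export HTTP_PROXY="http://proxy.example.com:3128"
export HTTPS_PROXY="http://proxy.example.com:3128"
export NO_PROXY="localhost,127.0.0.1"
```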
After this, discovery and installation succeed.
$ podman images
REPOSITORY TAG IMAGE ID CREATED SIZE
localhost/podman-pause 4.2.0-1667932451 0988768b0889 4 days ago 739 kB
quay.io/edge-infrastructure/assisted-image-service latest fe99790f4d21 5 days ago 400 MB
quay.io/edge-infrastructure/assisted-service latest 73d6d4a24ee2 5 days ago 973 MB
quay.io/centos7/postgresql-12-centos7 latest ed33c67aa405 7 days ago 383 MB
quay.io/edge-infrastructure/assisted-installer-ui latest 1e6ad6c750ff 8 days ago 472 MB
I don't know if it's related, but if I don't add HTTP(S)_PROXY in okd-configmap.yml, the cluster nodes join discovery without any problem (though installation then fails because the assisted-installer can't download installation files).