Comments (10)
I'd be surprised if this were a general problem since I don't remember seeing this ever in e2e. Can you provide any more details about what the cluster template looked like or if you were doing anything to it around the time it was being deleted? It would also be helpful to know what the capz-controller-manager was logging for the CAPZ resources while they were stuck and what the full YAML of the resources was then.
from cluster-api-provider-azure.
I think these were the errors while it was in that state. The logs below were captured while the AKS cluster could not be created.
ASO logs
I0214 23:48:07.738380 1 generic_reconciler.go:96] "msg"="Encountered error, re-queuing..." "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default" "result"={"Requeue":false,"RequeueAfter":180000000000}
I0214 23:51:07.737209 1 common.go:58] "msg"="Reconcile invoked" "annotations"={"serviceoperator.azure.com/credential-from":"argocluster-aso-secret","serviceoperator.azure.com/operator-namespace":"capz-system","serviceoperator.azure.com/reconcile-policy":"skip","serviceoperator.azure.com/resource-id":"/subscriptions/#-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster"} "conditions"="[Condition [Ready], Status = \"False\", ObservedGeneration = 1, Severity = \"Warning\", Reason = \"Failed\", Message = \"error getting status for resource ID \\\"/subscriptions/#-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster\\\": getting resource with ID: \\\"/subscriptions/#-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster\\\": ClientSecretCredential: unable to resolve an endpoint: server response error:\\n Get \\\"https://login.microsoftonline.com/#-345f-445e-90b7-31ac0c5cf7ef/v2.0/.well-known/openid-configuration\\\": dial tcp: lookup login.microsoftonline.com on 10.96.0.10:53: server misbehaving\", LastTransitionTime = \"2024-02-14 23:41:18 +0000 UTC\"]" "creationTimestamp"="2024-02-14T23:40:55Z" "deletionTimestamp"=null "finalizers"=["serviceoperator.azure.com/finalizer"] "generation"=1 "kind"={"kind":"ResourceGroup","apiVersion":"resources.azure.com/v1api20200601storage"} "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default" "owner"=null "ownerReferences"=[{"apiVersion":"infrastructure.cluster.x-k8s.io/v1beta1","kind":"AzureManagedControlPlane","name":"argocluster","uid":"acd970a7-adff-4fb8-af20-98014aee4f3d","controller":true,"blockOwnerDeletion":true}] "resourceVersion"="732785" "uid"="162c81d9-4718-49e8-a5f3-d3a7a305d12e"
I0214 23:51:07.737286 1 generic_reconciler.go:335] "msg"="Skipping creation/update of resource due to policy" "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default" "serviceoperator.azure.com/reconcile-policy"="skip"
I0214 23:51:07.737464 1 azure_generic_arm_reconciler_instance.go:421] "msg"="Resource successfully created/updated" "azureName"="argocluster" "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default" "resourceID"="/subscriptions/##-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster"
I0214 23:51:07.737700 1 recorder.go:104] "msg"="Using credential from \"default/argocluster-aso-secret\"" "logger"="events" "object"={"kind":"ResourceGroup","namespace":"default","name":"argocluster","uid":"162c81d9-4718-49e8-a5f3-d3a7a305d12e","apiVersion":"resources.azure.com/v1api20200601storage","resourceVersion":"732785"} "reason"="CredentialFrom" "type"="Normal"
E0214 23:51:30.238458 1 generic_reconciler.go:361] "msg"="Encountered error impacting Ready condition" "error"="Reason: Failed, Severity: Warning, RetryClassification: RetrySlow, Cause: error getting status for resource ID \"/subscriptions/#-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster\": getting resource with ID: \"/subscriptions/#-4518-4d05-9d6d-a29b5cf33a8d/resourceGroups/argocluster\": ClientSecretCredential: unable to resolve an endpoint: server response error:\n Get \"https://login.microsoftonline.com/#-345f-445e-90b7-31ac0c5cf7ef/v2.0/.well-known/openid-configuration\": dial tcp: lookup login.microsoftonline.com on 10.96.0.10:53: server misbehaving" "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default"
I0214 23:51:30.280058 1 generic_reconciler.go:96] "msg"="Encountered error, re-queuing..." "logger"="controllers.ResourceGroupController" "name"="argocluster" "namespace"="default" "result"={"Requeue":false,"RequeueAfter":180000000000}
CAPZ logs
I0214 23:58:49.719804 1 azuremanagedcluster_controller.go:169] "Successfully reconciled" logger="controllers.AzureManagedClusterReconciler.Reconcile" controller="azuremanagedcluster" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AzureManagedCluster" AzureManagedCluster="default/argocluster" namespace="default" name="argocluster" reconcileID="4f7de029-f703-4b0b-9c80-1fd98c860b88" kind="AzureManagedCluster" namespace="default" name="argocluster" x-ms-correlation-request-id="821674f9-3b7e-4d53-ad30-05646b3e8a39" cluster="argocluster" controlPlane="argocluster"
I0214 23:58:49.720816 1 azuremanagedmachinepool_controller.go:194] "AzureManagedControlPlane is not initialized" logger="controllers.AzureManagedMachinePoolReconciler.Reconcile" controller="azuremanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AzureManagedMachinePool" AzureManagedMachinePool="default/argocluster-pool0" namespace="default" name="argocluster-pool0" reconcileID="ce0d7739-b492-45e9-8e32-e3225ac5b3e5" namespace="default" name="argocluster-pool0" kind="AzureManagedMachinePool" x-ms-correlation-request-id="07d964de-46a9-424a-8caa-a300e0848848" ownerCluster="argocluster"
I0214 23:58:57.004553 1 azuremanagedcontrolplane_controller.go:234] "Reconciling AzureManagedControlPlane" logger="controllers.AzureManagedControlPlaneReconciler.reconcileNormal" controller="azuremanagedcontrolplane" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="AzureManagedControlPlane" AzureManagedControlPlane="default/argocluster" namespace="default" name="argocluster" reconcileID="496335b7-c7c5-42d6-b1a7-4e99ef398eca" x-ms-correlation-request-id="cced4b6d-844f-46f0-8d73-a07c07b8d9ff"
E0214 23:58:57.004718 1 managedcontrolplane.go:427] "Unable to determine if ManagedControlPlaneScope VNET is managed by capz, assuming unmanaged" err="VirtualNetwork.network.azure.com \"argocluster\" not found" logger="scope.ManagedControlPlaneScope.IsVnetManaged" x-ms-correlation-request-id="28d3fddd-4e53-4226-ac43-d50c841734ff" AzureManagedCluster="argocluster"
Unfortunately, I can't reproduce the problem right now - but will circle back when/if I do.
It almost seems like there's something wonky in your Workload ID setup or your subscription or something. I've never seen this kind of error in e2e or locally.
I have a live repro of this now. I'm pretty sure you can reproduce it by doing the following:
- Let CAPZ create a cluster
- Power off the CAPZ management cluster
- Manually delete the AKS cluster resource (but leave everything else in the RG)
- Bring the CAPZ management cluster back online
Here are the logs I have right now from ASO (also using the 1.14.0 release).
I0314 21:49:12.626378 1 recorder.go:104] "msg"="Reason: ResourceNotFound, Severity: Warning, RetryClassification: RetrySlow, Cause: The Resource 'Microsoft.ContainerService/managedClusters/argocluster' under resource group 'argocluster' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix: PUT https://management.azure.com/subscriptions/####/resourceGroups/argocluster/providers/Microsoft.ContainerService/managedClusters/argocluster/agentPools/pool1\n--------------------------------------------------------------------------------\nRESPONSE 404: 404 Not Found\nERROR CODE: ResourceNotFound\n
#4609 may or may not be connected to this issue, but since both issues concern deletion, I am mentioning it here as well.