playworker changed the title from "Cilium sometimes ends up in a failed state" to "Cilium sometimes ends up in a failed state unable to contact K8s API Server" on Aug 8, 2024
Hello @playworker,
We are aware of this issue and are currently working on a fix. In the meantime, here are a couple of workarounds we've tested to temporarily fix the issue:
Summary
Sometimes when things go wrong with the k8s cluster, Cilium ends up in a failed state and I don't know how to recover it.
The Cilium failure looks a lot like this: cilium/cilium#20679
The Cilium Operator and the DaemonSet pods are trying to contact the K8s API Server but can't. I don't believe the IP address they're trying is correct; it's a 10.x.x.x address.
I've not been able to recover from this error, but I haven't really tried much. Disabling the network using the k8s CLI tool doesn't have any effect.
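One way to narrow this down is to check whether the address the pods are using is reachable at all from a node. The snippet below is only a minimal sketch: the `can_reach` helper is hypothetical, and the host and port in the usage note are placeholders, not values taken from this cluster. It only tests whether a TCP connection can be opened and does not verify TLS or that the endpoint is really the API Server.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it does not speak TLS or HTTP,
    so a True result only means something is listening on that address.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False
```

For example, `can_reach("10.x.x.x", 443)` with the real address from the Cilium logs substituted in (the port is an assumption; use whatever port appears in the log line). A `False` result from a node would suggest the address is unreachable from there, while `True` would point toward a problem elsewhere.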
What Should Happen Instead?
I'm not sure what the underlying issue is. If it is an issue with the API Server IP address being wrong, then I guess that needs to be set correctly somehow.
Reproduction Steps
The most recent time this happened, I set the containerd_custom_registries setting to a bad value; it included a semi-colon in the middle of the string:
I corrected the setting, but the k8s cluster in Juju ended up in an errored state, and the Cilium Operator and pods ended up in the situation described above. I managed to recover the k8s units in Juju by downgrading the release and then bumping it back up again, but I am unable to recover the Cilium installation to a working state.
System information
inspection-report-20240808_102202.tar.gz
Can you suggest a fix?
No response
Are you interested in contributing with a fix?
No response