status is error when cloud-init status --long only shows warnings #5836

Open
hector-gao opened this issue Oct 21, 2024 · 1 comment
Labels
bug Something isn't working correctly new An issue that still needs triage

Comments

@hector-gao

Bug report

The instance boots and the network is configured correctly, but cloud-init status reports error. The errors field of the status report is empty. IIUC, the status should be degraded done when there are only warnings.

Steps to reproduce the problem

gcloud compute instances create cos113 --zone=us-west2-b --image=cos-113-18244-151-100 --image-project=cos-cloud

Environment details

  • Cloud-init version: 23.4.4
  • Operating System Distribution: Container Optimized OS
  • Cloud provider, platform or installer type: Google Cloud Platform

cloud-init logs

......
2024-10-21 23:07:21,561 - DataSourceGCE.py[DEBUG]: Looking for the primary NIC in: ['eth0']
2024-10-21 23:07:21,563 - dhcp.py[DEBUG]: Skip dhclient configuration: No dhclient command found.
2024-10-21 23:07:21,563 - dhcp.py[WARNING]: DHCP client not found: dhclient
2024-10-21 23:07:21,564 - dhcp.py[WARNING]: DHCP client not found: dhcpcd
2024-10-21 23:07:21,564 - dhcp.py[DEBUG]: Skip udhcpc configuration: No udhcpc command found.
2024-10-21 23:07:21,564 - dhcp.py[WARNING]: DHCP client not found: udhcpc
2024-10-21 23:07:21,564 - DataSourceGCE.py[WARNING]: Did not find a fallback interface on gce.
2024-10-21 23:07:21,565 - handlers.py[DEBUG]: finish: init-local/search-GCELocal: FAIL: no local data found from DataSourceGCELocal
2024-10-21 23:07:21,565 - util.py[WARNING]: Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed
2024-10-21 23:07:21,566 - util.py[DEBUG]: Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/cloudinit/sources/__init__.py", line 1017, in find_source
    if s.update_metadata_if_supported(
  File "/usr/lib/python3.8/site-packages/cloudinit/sources/__init__.py", line 903, in update_metadata_if_supported
    result = self.get_data()
  File "/usr/lib/python3.8/site-packages/cloudinit/sources/__init__.py", line 438, in get_data
    return_value = self._check_and_get_data()
  File "/usr/lib/python3.8/site-packages/cloudinit/sources/__init__.py", line 370, in _check_and_get_data
    return self._get_data()
  File "/usr/lib/python3.8/site-packages/cloudinit/sources/DataSourceGCE.py", line 145, in _get_data
    if not ret["success"]:
UnboundLocalError: local variable 'ret' referenced before assignment
......
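
The UnboundLocalError at the end of the traceback is the usual symptom of a variable that is only assigned on a code path that never completed. One plausible reading of the log is that every candidate NIC was skipped because no DHCP client was found, so ret was never bound. The sketch below illustrates that pattern and a defensive fix; NoDhcpClient, fetch_metadata, get_data_buggy and get_data_fixed are hypothetical names for illustration, not the actual DataSourceGCE code.

```python
class NoDhcpClient(Exception):
    """Hypothetical error raised when no DHCP client binary is available."""


def fetch_metadata(nic, dhcp_available):
    """Hypothetical stand-in for bringing up a NIC and reading metadata."""
    if not dhcp_available:
        raise NoDhcpClient(nic)
    return {"success": True, "nic": nic}


def get_data_buggy(nics, dhcp_available):
    for nic in nics:
        try:
            ret = fetch_metadata(nic, dhcp_available)
        except NoDhcpClient:
            continue  # 'ret' stays unbound if every NIC fails here
        if ret["success"]:
            break
    # Raises UnboundLocalError when no NIC ever produced a result.
    return ret["success"]


def get_data_fixed(nics, dhcp_available):
    ret = {"success": False}  # default bound before the loop
    for nic in nics:
        try:
            ret = fetch_metadata(nic, dhcp_available)
        except NoDhcpClient:
            continue
        if ret["success"]:
            break
    return ret["success"]
```

With a default bound before the loop, the no-DHCP-client case returns a clean failure instead of raising.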

output of cloud-init status --long:
status: error
extended_status: error
boot_status_code: enabled-by-generator
last_update: Mon, 21 Oct 2024 23:07:26 +0000
detail:
DataSourceGCE
errors: []
recoverable_errors:
WARNING:
- DHCP client not found: dhclient
- DHCP client not found: dhcpcd
- DHCP client not found: udhcpc
- Did not find a fallback interface on gce.
- Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed
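
The inconsistency in the report above (status is error while errors is empty) can be checked mechanically. The following is a minimal sketch, assuming the report has already been loaded into a Python dict with the same field names as the output above; find_status_mismatch is a hypothetical helper, not part of cloud-init.

```python
def find_status_mismatch(report):
    """Return a message if status is 'error' but 'errors' is empty, else None."""
    status = report.get("status")
    errors = report.get("errors") or []
    recoverable = report.get("recoverable_errors") or {}
    if status == "error" and not errors:
        # Count messages across all recoverable levels (e.g. WARNING).
        count = sum(len(msgs) for msgs in recoverable.values())
        return f"status is 'error' but errors is empty ({count} recoverable message(s))"
    return None


# Field values copied from the report above.
report = {
    "status": "error",
    "extended_status": "error",
    "errors": [],
    "recoverable_errors": {
        "WARNING": [
            "DHCP client not found: dhclient",
            "DHCP client not found: dhcpcd",
            "DHCP client not found: udhcpc",
            "Did not find a fallback interface on gce.",
            "Getting data from <class 'cloudinit.sources.DataSourceGCE.DataSourceGCELocal'> failed",
        ]
    },
}

print(find_status_mismatch(report))
```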

@hector-gao
Author

The error seems to be caused by the cloud-init services being reported as "disabled", even though all of them ran successfully: https://github.com/canonical/cloud-init/blob/ubuntu/23.4.4-0ubuntu0_23.10.1/cloudinit/cmd/status.py#L341-L342

That raises two follow-up questions:

  1. Why is the error not listed under errors in the output of cloud-init status --long?
  2. Why must the services be "enabled" or "static"? In COS, we use the Gentoo eclass function https://github.com/gentoo/gentoo/blob/master/eclass/systemd.eclass#L267 to enable the services. It creates a symlink in the .wants directory of the target unit, which ensures the cloud-init services are executed, but systemctl does not recognize them as "enabled". Also, if a program or user disables cloud-init.service after boot, cloud-init status will report error, even though that does not necessarily indicate an actual error in the cloud-init run.
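
To make the second question concrete, here is a minimal sketch of how a check on the unit file state can disagree with whether a service actually ran. It assumes the properties come from `systemctl show --property=ActiveState,UnitFileState,SubState` as KEY=VALUE lines; the two rules and the sample values are assumptions for illustration, not the code in status.py and not output captured from the COS instance.

```python
def parse_show_output(text):
    """Parse KEY=VALUE lines as printed by 'systemctl show --property=...'."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def unit_file_state_ok(props):
    """Assumed rule: only 'enabled*' or 'static' unit file states pass."""
    state = props.get("UnitFileState", "")
    return state.startswith("enabled") or state == "static"


def ran_successfully(props):
    """Assumed rule: a finished oneshot unit is 'active' with substate 'exited'."""
    return props.get("ActiveState") == "active" and props.get("SubState") == "exited"


# Hypothetical sample for a service pulled in via a .wants symlink.
sample = "ActiveState=active\nUnitFileState=disabled\nSubState=exited\n"
props = parse_show_output(sample)

print(unit_file_state_ok(props), ran_successfully(props))
```

Under these assumptions the service is flagged by the unit-file-state rule even though its runtime state shows a completed run, which is the mismatch described above.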
