dhcp: add option to use NetworkManager for DHCP discovery #5563

Closed

@lkundrak lkundrak commented Jul 29, 2024

Many distros nowadays, especially now that the ISC DHCP client has been abandoned upstream, ship NetworkManager as their only DHCP client. Allow using it for the init stage networking.

  • I have signed the CLA: https://ubuntu.com/legal/contributors
  • I have added my Github username to tools/.github-cla-signers
  • I have included a comprehensive commit message using the guide below
  • I have added unit tests to cover the new behavior under tests/unittests/
    • Test files should map to source files i.e. a source file cloudinit/example.py should be tested by tests/unittests/test_example.py
    • Run unit tests with tox -e py3
  • I have kept the change small, avoiding unnecessary whitespace or non-functional changes.
  • I have added a reference to issues that this PR relates to in the PR message (Refs #1234, Fixes #1234)
  • I have updated the documentation with the changed behavior.
    • If the change doesn't change the user interface and is trivial, this step may be skipped.
    • Cloud-config documentation is generated from the jsonschema.
    • Generate docs with tox -e docs.

@github-actions github-actions bot added the documentation This Pull Request changes documentation label Jul 29, 2024
Many distros nowadays, especially now that the ISC DHCP client has been
abandoned upstream, ship NetworkManager as their only DHCP client. Allow
using it for the init stage networking.

Signed-off-by: Lubomir Rintel <[email protected]>
@TheRealFalcon TheRealFalcon removed the documentation This Pull Request changes documentation label Jul 29, 2024
@holmanb holmanb self-assigned this Jul 29, 2024

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Aug 13, 2024
@holmanb holmanb removed the stale-pr Pull request is stale; will be auto-closed soon label Aug 13, 2024

@holmanb holmanb left a comment

Thanks for this proposal @lkundrak! I haven't tested it yet, but I just left a couple of comments inline for you and other community members. Changing the whole Local stage to be after NetworkManager would break some expected behaviors, but perhaps those things don't need to stay as they are in the long term.

@@ -1011,4 +1012,228 @@ def parse_static_routes(routes: str) -> List[Tuple[str, str]]:
return []


ALL_DHCP_CLIENTS = [Dhcpcd, IscDhclient, Udhcpc]
class NetworkManagerDhcpClient(DhcpClient):

I haven't looked at this code thoroughly, but it does appear to have the same shortcoming as the udhcpc implementation: missing Azure support for option 245.

cc @cjp256

On Azure I see this:

$ nmcli --terse --fields DHCP4 device show eth0
DHCP4.OPTION[1]:dhcp_client_identifier = 01:00:0d:3a:91:ab:22
DHCP4.OPTION[2]:dhcp_lease_time = 4294967295
DHCP4.OPTION[3]:dhcp_server_identifier = 168.63.129.16
DHCP4.OPTION[4]:domain_name = vd15dwc2m3fevjchgt4zmpijze.gx.internal.cloudapp.net
DHCP4.OPTION[5]:domain_name_servers = 168.63.129.16
DHCP4.OPTION[6]:ip_address = 10.0.0.6
DHCP4.OPTION[7]:next_server = 168.63.129.16
DHCP4.OPTION[8]:private_245 = a8:3f:81:10
DHCP4.OPTION[9]:requested_broadcast_address = 1
DHCP4.OPTION[10]:requested_domain_name = 1
DHCP4.OPTION[11]:requested_domain_name_servers = 1
DHCP4.OPTION[12]:requested_domain_search = 1
DHCP4.OPTION[13]:requested_host_name = 1
DHCP4.OPTION[14]:requested_interface_mtu = 1
DHCP4.OPTION[15]:requested_ms_classless_static_routes = 1
DHCP4.OPTION[16]:requested_nis_domain = 1
DHCP4.OPTION[17]:requested_nis_servers = 1
DHCP4.OPTION[18]:requested_ntp_servers = 1
DHCP4.OPTION[19]:requested_rfc3442_classless_static_routes = 1
DHCP4.OPTION[20]:requested_root_path = 1
DHCP4.OPTION[21]:requested_routers = 1
DHCP4.OPTION[22]:requested_static_routes = 1
DHCP4.OPTION[23]:requested_subnet_mask = 1
DHCP4.OPTION[24]:requested_time_offset = 1
DHCP4.OPTION[25]:requested_wpad = 1
DHCP4.OPTION[26]:rfc3442_classless_static_routes = 0.0.0.0/0 10.0.0.1 168.63.129.16/32 10.0.0.1 169.254.169.254/32 10.0.0.1
DHCP4.OPTION[27]:routers = 10.0.0.1
DHCP4.OPTION[28]:subnet_mask = 255.255.255.0

So we have access to the wireserver IP address, but this PR doesn't seem to deal with this special option correctly.
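
To make the option 245 point concrete, here is a minimal sketch of how the `private_245` value from the output above could be turned into the wireserver address. It is not code from this PR or from cloud-init: the helper names `parse_nmcli_dhcp4` and `decode_option_245` are hypothetical, and it assumes nmcli prints lines in exactly the `DHCP4.OPTION[n]:key = value` form shown above.

```python
import ipaddress
from typing import Dict, Optional


def parse_nmcli_dhcp4(output: str) -> Dict[str, str]:
    """Parse `nmcli --terse --fields DHCP4 device show <iface>` output.

    Hypothetical helper: assumes each line looks like
    `DHCP4.OPTION[8]:private_245 = a8:3f:81:10`.
    """
    options: Dict[str, str] = {}
    for line in output.splitlines():
        # Drop the "DHCP4.OPTION[n]" prefix; keep everything after the first colon.
        _, sep, rest = line.partition(":")
        if not sep or " = " not in rest:
            continue  # not an option line, skip it
        key, _, value = rest.partition(" = ")
        options[key.strip()] = value.strip()
    return options


def decode_option_245(value: str) -> Optional[str]:
    """Convert colon-separated hex bytes (e.g. a8:3f:81:10) to dotted-quad IPv4.

    Returns None if the value is not valid hex or is not exactly four bytes.
    """
    try:
        raw = bytes.fromhex(value.replace(":", ""))
        return str(ipaddress.IPv4Address(raw))
    except ValueError:
        return None
```

With the lease shown above, `decode_option_245(parse_nmcli_dhcp4(output)["private_245"])` yields `168.63.129.16`, which matches the `dhcp_server_identifier` in the same output.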

@@ -12,7 +12,7 @@ After=systemd-remount-fs.service
Requires=dbus.socket
After=dbus.socket
{% endif %}
-Before=NetworkManager.service
+After=NetworkManager.service

This will break some cloud-init expectations, so I don't think that we can accept this as it is. We could always just patch it for downstreams using NetworkManager, but this would leave cloud-init behaving differently on different distributions, and it really misses the point of what the local stage is supposed to be doing.

It might be that cloud-init doesn't really need to get a network configuration before the daemon is up for datasources that require network - if that is true then we could always use the "activator" codepath - but that would change things like what hostname is advertised to the DHCP server (iirc there is a bug related to this). Thoughts @cjp256?

Either way, if we go down that route, this proposal would need some changes, and it would be a big architectural change, so let me think about this a bit.

@github-actions github-actions bot added the stale-pr Pull request is stale; will be auto-closed soon label Aug 29, 2024
@github-actions github-actions bot closed this Sep 5, 2024