
feat: single process optimization #5489

Merged
merged 22 commits on Aug 2, 2024

Conversation

holmanb
Member

@holmanb holmanb commented Jul 4, 2024

Proposed Commit Message

feat: Single process optimization

Python interpreter initialization and module import time
contribute a significant amount of wall clock time to
cloud-init's runtime (and therefore to total boot time).

Cloud-init has four stages. Each stage starts its own Python
interpreter and loads the same libraries. To eliminate the
redundant work of starting an interpreter and loading libraries,
this changes cloud-init to run as a single process. Systemd
service ordering is retained by using the existing cloud-init
services as shims which use a synchronization protocol to start
each cloud-init stage and to communicate that each stage is
complete to the init system. Since multiple cloud-init processes
sit in the critical chain of starting the system, this reduces
boot time (including time to ssh login and time to cloud-init
completion).

Currently only systemd is supported, but the synchronization
protocol should be capable of supporting other init systems
as well with minor changes.

Note: This enables many additional follow-on improvements that
eliminate redundant work. However, these potential improvements
are temporarily ignored. This commit has been structured to
minimize the changes required to capture the majority of the
performance savings while preserving correctness and backwards
compatibility.

Since this changes the semantics of the existing cloud-init unit
files, this change takes the opportunity to rename a systemd unit
whose name causes frequent user confusion. The unit named
cloud-init.service is often mistaken by users for the only
cloud-init service, when it is simply one of four stages. This
stage is documented as the "network" stage, so the service is
renamed to "cloud-init-network.service". A new notify service,
which contains the cloud-init process, is added as part of this
implementation. This unit is named "cloud-init-main.service".
 
Synchronization protocol
========================

- create one Unix socket for each systemd service stage
- send sd_notify()
- For each of the four stages (local, network, config, final):
   - when init system sends "start" to the Unix socket, start the
     stage
   - when running stage is complete, send "done" to Unix socket
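The steps above can be sketched as a small context manager over Unix datagram sockets (a hedged illustration of the protocol, not the PR's exact implementation; the `sync` name, message framing, and socket paths are assumptions):

```python
import contextlib
import socket

DEFAULT_RUN_DIR = "/run/cloud-init/share"  # assumed socket directory

@contextlib.contextmanager
def sync(stage: str, run_dir: str = DEFAULT_RUN_DIR):
    """Wait for the init system's shim to say "start", run the stage, reply "done"."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(f"{run_dir}/{stage}.sock")
    try:
        msg, shim_addr = sock.recvfrom(1024)  # blocks until the shim service runs
        if msg != b"start":
            raise RuntimeError(f"unexpected message for stage {stage}: {msg!r}")
        yield  # the stage's work happens here
        sock.sendto(b"done", shim_addr)  # unblocks the shim; systemd marks the unit done
    finally:
        sock.close()
```

Each systemd shim unit then just writes "start" to its stage's socket and waits for the reply, so the existing Before=/After= ordering between units is preserved.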

File changes
============

socket.py (new)
---------------

- define a systemd-notify helper function
- define a context manager which implements a multi-socket
  synchronization protocol

cloud-init.service -> cloud-init-network.service (renamed)
----------------------------------------------------------

- renamed to cloud-init-network.service

cloud-{init-local,init-network,config,final}.services
-----------------------------------------------------

- change ExecStart to use netcat to connect to Unix socket and:
  - send a start message
  - wait for completion response
- note: a pure Python equivalent is possible for any downstreams
  which do not package openbsd's netcat
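Such a pure Python shim might look like the following (a sketch under assumed socket paths and message framing; the shipped services use the nc.openbsd one-liner instead):

```python
import socket
import sys

def shim(stage: str, run_dir: str = "/run/cloud-init/share") -> None:
    """Tell the long-running cloud-init process to run a stage, then wait."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    # Bind a return socket so the stage runner has an address to reply to.
    sock.bind(f"{run_dir}/{stage}-return.sock")
    sock.sendto(b"start", f"{run_dir}/{stage}.sock")
    reply, _ = sock.recvfrom(1024)  # blocks until the stage is complete
    sock.close()
    print(reply.decode())

if __name__ == "__main__":
    shim(sys.argv[1])  # e.g. shim("local") from cloud-init-local.service
```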

cloud-init-main.service (new)
-----------------------------

 - set the service Type to 'notify'
 - invoke cloud-init in single process mode
 - adopt systemd ordering requirements from cloud-init-local.service
 - adopt KillMode from cloud-final.service

main.py
-------

 - Add command line flag to indicate "all stages" mode
 - In this mode run each stage followed by an IPC
   synchronization protocol step

cloud-final.service
-------------------

- drop KillMode

cloud-init-local.service
------------------------

- drop dependencies made redundant by ordering after
  cloud-init-main.service

Performance Impact
==================

On Ubuntu 24.04, Python's wall clock start-up time as measured with
`time python3 -c 'import cloudinit.cmd.main'` on a few cloud types:

lxc container: 0.256s
QEMU machine:  0.300s
gce instance:  0.367s
ec2 instance:  0.491s

This change eliminates 1x this start-up time from time to ssh and
3x this start-up time from cloud-init's total completion. The
total benefit varies based on the platform hosting the instance,
but all platforms will measurably benefit from this change.
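The numbers above come from timing a fresh interpreter importing cloud-init's main module. A rough way to reproduce the measurement (a sketch; `json` stands in for `cloudinit.cmd.main` so the snippet runs anywhere):

```python
import subprocess
import sys
import time

def interpreter_cost(module: str) -> float:
    """Wall-clock seconds to start a fresh Python and import one module."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start

# Four stages as four processes pay this cost four times; one process pays it once.
print(f"{interpreter_cost('json'):.3f}s")  # use cloudinit.cmd.main on an installed system
```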

BREAKING_CHANGE: Run all four cloud-init services as a single systemd service.

(if you made it this far, don't worry the implementation code is only 2x longer than this commit message!)

Design

Why use Unix sockets?

The implementation requires a fast, event-driven, lossless, bi-directional, cross platform IPC mechanism that is easily limited to root-only send/receive. Alternatives obviously exist, but Unix sockets seem like a natural fit.

Why not just share one socket?

A socket per stage keeps the implementation simple and makes failure modes easy to reason about.

Why openbsd netcat?

It supports Unix sockets and it's 7x faster than a bare bones pure Python implementation. A C / Rust / Go based implementation would be trivial to write, but netcat already exists and does the job sufficiently fast.

Why a new service?

  1. any external service that is ordered Before=cloud-init-local.service can now run in parallel to the interpreter loading

  2. by decoupling interpreter loading from running cloud-init logic, systemd can potentially start loading cloud-init earlier in boot (note the difference in ordering between cloud-init-single.service vs cloud-init-local.service)

  3. changing all four services to be shims that use an identical synchronization protocol to trigger and wait for the cloud-init stages keeps the design symmetrical - this makes it easier to debug and describe

Security implications

Dependencies

No additional dependencies required - this change implements our own sd_notify.
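For reference, an sd_notify helper can be written in a few lines against systemd's documented NOTIFY_SOCKET datagram protocol (a hedged sketch, not the PR's exact code):

```python
import os
import socket

def sd_notify(message: str) -> None:
    """Send a state notification (e.g. "READY=1") to the init system.

    systemd passes the notification socket path in NOTIFY_SOCKET; a
    leading "@" denotes the abstract socket namespace.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return  # not running under a Type=notify service; nothing to do
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(message.encode())
```

With something like this in place, the cloud-init-main.service process can report `READY=1` to systemd without a libsystemd dependency.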

File permissions

The directory which contains the unix sockets is limited to root-only (see cloudinit/socket.py::67):

os.makedirs(f"{DEFAULT_RUN_DIR}/share", mode=0o700, exist_ok=True)

On Ubuntu DEFAULT_RUN_DIR resolves to /run/cloud-init/ which has the following access:

$ stat /run/cloud-init/
...
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)

Currently (before this PR), writes to this directory could modify cloud-init's configuration; cloud-init depends on POSIX permissions.
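The root-only restriction can be sanity-checked with a few lines (illustrative only; this recreates the directory pattern in a scratch location rather than touching /run):

```python
import os
import stat
import tempfile

# Recreate the pattern from cloudinit/socket.py in a throwaway directory.
run_dir = tempfile.mkdtemp()
share_dir = os.path.join(run_dir, "share")
os.makedirs(share_dir, mode=0o700, exist_ok=True)

# Owner-only access: group and other cannot list or reach the sockets.
mode = stat.S_IMODE(os.stat(share_dir).st_mode)
assert mode == 0o700, oct(mode)
```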

Todo (this PR)

  • unittest - this code gets exercised by every integration test, but a unittest
    would benefit development and could better stress test the synchronization code
    for race conditions
  • fix integration test tests/integration_tests/test_multi_part_user_data_handling.py::test_cloud_config_archive_boot_hook_logging
  • update performance results in the above commit message

Future work (followup PRs, required pre-24.3 release)

  • update man page
  • update CLI docs
  • update debugging docs per feat: single process optimization #5489 (comment)
  • fix logging
    • cloud-init pipes all output to tee by default (which is completely unnecessary for stream handling - we can probably ditch this dependency), which leaves processes hanging around after cloud-init signals completion. This leaves extraneous log messages in the output of systemctl status cloud-init-single.service (see Appendix B).
[RFC] Names (closed)

cloud-init --single-process and cloud-init-single.service are the working names that I've used, but I think that we can do better.

  • cloud-init - this naming proposal is the spiritual successor to this spec. We could rename cloud-init.service to cloud-init-network.service and repurpose the name cloud-init.service for the service that, you know, "runs cloud-init". This would probably be the best option in the long run, but it comes at the cost of short-term pain for distributions. Since we won't backport this change to old releases, this might be feasible. I'd be willing to write a blog post and release notes explaining the choice, and the changes visible to users/distributors should be pretty straightforward: any package that has an ordering dependency on cloud-init.service could just `sed -i s/cloud-init.service/cloud-init-network.service/g`.
  • runner / run - the names cloud-init --run and cloud-init-runner.service make it clear that a thing is running, but some might still see a cloud-init.service and incorrectly assume that this is the name of the service that runs cloud-init (many have made this incorrect assumption)

Rejected names

  • daemon - this isn't a long-lived process. While this lives longer than the previous four processes, it simply combines their four lifespans into one. This name would be misleading.
  • all - cloud-init --all and cloud-init-all.service convey the right meaning, but require context to understand - the only reason to call this "all" is that it replaces multiple parts.

Appendix

A. systemctl status cloud-init-local.service

 cloud-init-local.service - Cloud-init: Local Stage (pre-network)
     Loaded: loaded (/usr/lib/systemd/system/cloud-init-local.service; enabled; preset: enabled)
     Active: active (exited) since Tue 2024-07-16 00:51:39 CEST; 16h ago
   Main PID: 254 (code=exited, status=0/SUCCESS)
        CPU: 1ms

Jul 16 00:51:39 cloudinit-0715-225129iftdf1yb systemd[1]: Starting cloud-init-local.service - Cloud-init: Local Stage (pre-network)...
Jul 16 00:51:39 cloudinit-0715-225129iftdf1yb nc.openbsd[254]: done
Jul 16 00:51:39 cloudinit-0715-225129iftdf1yb systemd[1]: Finished cloud-init-local.service - Cloud-init: Local Stage (pre-network).

B. systemctl status cloud-init-single.service

○ cloud-init-single.service - Cloud-init: Single Process
     Loaded: loaded (/usr/lib/systemd/system/cloud-init-single.service; enabled; preset: enabled)
     Active: inactive (dead) since Tue 2024-07-16 00:52:11 CEST; 16h ago
   Duration: 31.569s
    Process: 76 ExecStart=/usr/bin/cloud-init --single-process (code=exited, status=0/SUCCESS)
   Main PID: 76 (code=exited, status=0/SUCCESS)
     Status: "Complete"
        CPU: 10.096s



Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb cloud-init[257]: | ... . o =oO.    |
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb cloud-init[257]: +----[SHA256]-----+
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Deactivated successfully.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 256 (sh) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 257 (tee) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 273 (sh) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 274 (tee) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 369 (sh) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Unit process 1161 (sh) remains running after unit stopped.
Jul 16 00:52:11 cloudinit-0715-225129iftdf1yb systemd[1]: cloud-init-single.service: Consumed 10.095s CPU time, 228.9M memory peak, 0B memory swap peak.

C. Example log from instance booted with latest commit:

single-process.log

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

@holmanb holmanb force-pushed the holmanb/fastpath-single-proc branch 3 times, most recently from 5920138 to bdc76a4 Compare July 5, 2024 03:24
@a-dubs
Collaborator

a-dubs commented Jul 5, 2024

(if you made it this far, don't worry the implementation code is only 2x longer than this commit message!)

🤣

@aciba90 aciba90 self-requested a review July 5, 2024 16:02
Member

@TheRealFalcon TheRealFalcon left a comment


Great work here! I have some comments but they're all very minor. We should take some extra time to think about any additional testing that may need to happen, but overall +1 to the approach.

[RFC] Names

--single-process is fine, but I'm -1 to cloud-init-single.service simply because we have the cloud-init single command that could potentially be confusing.

What about cloud-init boot and cloud-init-boot.service?

Long term, I like the idea of making this cloud-init.service, but I'm not sure it's worth the churn of doing it at the same time as this PR. Ideally, this PR won't introduce any backwards incompatibility, but given the scope of the change, I'm sure there will be some effects we need to deal with, so I think I'd rather keep intentional breakage separate.

Why openbsd netcat?

I don't know the history or difference between implementations, but is nc.openbsd installed everywhere? If not, is the difference meaningful such that we couldn't just use nc?

@holmanb
Member Author

holmanb commented Jul 15, 2024

Great work here! ... overall +1 to the approach.

Thanks!

We should take some extra time to think about any additional testing that may need to happen

Agreed. I have an idea for an integration test that would support verifying the behavior more easily.

[RFC] Names

...

Long term, I like the idea of making this cloud-init.service, but I'm not sure it's worth the churn of doing it at the same time as this PR. Ideally, this PR won't introduce any backwards incompatibility, but given the scope of the change, I'm sure there will be some effects we need to deal with, so I think I'd rather keep intentional breakage separate.

This definitely breaks backwards compatibility and will require downstream maintainers to patch their vendored service files if they want to benefit from the performance improvements here. If we do this change separately from a rename, that means two breaking changes rather than one. Since we already require downstream changes as a result of this PR, I'd prefer to do the rename now - a cleanup alongside a performance improvement is less painful than two separate breaking changes. This will require a community communication email / post to communicate the changes required of downstreams, so the rename can be included in the "what this means for you" section for maintainers. Thoughts?

is nc.openbsd installed everywhere?

I can't promise that it is used everywhere, but I do know that it is pretty widely used, and is already a dependency of ubuntu-minimal:

$ apt rdepends netcat-openbsd                                        
netcat-openbsd
Reverse Depends:
  Recommends: libvirt-daemon
  Suggests: mariadb-server
  Recommends: libvirt-daemon
  Depends: xletters
  Depends: tomcat10-user
 |Depends: rtpengine-utils
  Suggests: mariadb-server
 |Depends: ifupdown-extra
  Depends: freedombox
  Depends: ubuntu-minimal

If not, is the difference meaningful such that we couldn't just use nc?

Like I mentioned before, openbsd's netcat supports Unix sockets. I don't see any other implementations that do.

If we'd prefer to avoid more dependencies, a C / Rust / Go implementation of the behavior we need would be trivial to write using minimal dependencies, but that comes with the complexity of cloud-init packages becoming arch-dependent - not a huge burden, but another packaging burden (though it's probably worth considering, since this would likely improve performance further).

@holmanb holmanb force-pushed the holmanb/fastpath-single-proc branch 3 times, most recently from 5692c50 to c6e9e4b Compare July 15, 2024 23:11
Contributor

@aciba90 aciba90 left a comment


Nice work here, thanks for the effort!

I don't know the history or difference between implementations, but is nc.openbsd installed everywhere? If not, is the difference meaningful such that we couldn't just use nc?

@TheRealFalcon: AFAIK the standard netcat does not implement unix sockets.

@@ -38,12 +37,13 @@ ExecStartPre=/bin/mkdir -p /run/cloud-init
ExecStartPre=/sbin/restorecon /run/cloud-init
ExecStartPre=/usr/bin/touch /run/cloud-init/enabled
{% endif %}
ExecStart=/usr/bin/cloud-init init --local
ExecStart=nc.openbsd -Uu -W1 /run/cloud-init/share/local.sock -s /run/cloud-init/share/local-return.sock
Contributor

This requires an explicit downstream Ubuntu runtime dependency on netcat-openbsd.

Member Author

@holmanb holmanb Jul 18, 2024

This requires an explicit downstream Ubuntu runtime dependency on netcat-openbsd.

Yes, it would be best to add this dependency to the packaging. However, since ubuntu-minimal already includes it, this won't functionally change anything.

I plan to add that to the Ubuntu packaging in a followup PR.

Contributor

Sweet! Even if it's a noop, I think explicitly expressing the dependency is the way to go, as cloud-init may be installed in environments where ubuntu-minimal is not, or ubuntu-minimal might drop the dependency on netcat-openbsd in the future.

@holmanb
Member Author

holmanb commented Jul 18, 2024

Nice work here, thanks for the effort!

@aciba90 thanks for the review! I'll ping you when I've addressed your points.

I don't know the history or difference between implementations, but is nc.openbsd installed everywhere? If not, is the difference meaningful such that we couldn't just use nc?

@TheRealFalcon: AFAIK the standard netcat does not implement unix sockets.

@TheRealFalcon also worth noting that I'm pretty sure nc.openbsd is nc on Ubuntu. It is on Ubuntu Noble at least.

@blackboxsw blackboxsw added packaging Supplemental package review requested security labels Jul 18, 2024
@TheRealFalcon
Member

This definitely breaks backwards compatibility and will require downstream maintainers to patch their vendored service files if they want to benefit from the performance improvements here. If we do this change separately from a rename, that means two breaking changes rather than one. Since we already require downstream changes as a result of this PR, I'd prefer to do the rename now - a cleanup alongside a performance improvement is less painful than two separate breaking changes. This will require a community communication email / post to communicate the changes required of downstreams, so the rename can be included in the "what this means for you" section for maintainers. Thoughts?

Makes sense. +1. We're just changing filenames though, right? Not also the CLI as you had initially spec'd out?

You and Alberto also worked through my nc concerns.

@holmanb
Member Author

holmanb commented Jul 19, 2024

Makes sense. +1. We're just changing filenames though, right? Not also the CLI as you had initially spec'd out?

Correct, I'm not going to touch pre-existing cli names.

For now I'm going to rename it to cloud-init --all-stages (named for what it is doing) rather than cloud-init --single-process (named for how it is different from the past). If there are any objections or suggestions over this name please let me know.

@holmanb holmanb force-pushed the holmanb/fastpath-single-proc branch 3 times, most recently from ac9e590 to 49bebbf Compare July 23, 2024 21:36
@holmanb
Member Author

holmanb commented Jul 23, 2024

Requesting re-review @TheRealFalcon. I added another unit test. I think that the test coverage does a pretty good job of exercising the ordering requirements and error conditions of the protocol in just the unit test suite, and it uses only the standard library, so I'm not sure what integration tests would add beyond what is already covered, since every integration test in the test suite already exercises the netcat / systemd service portion of this.

If dedicated integration tests are desired, I'm happy to add them (with expectations laid out for what is desired). Note that the unit tests added in this PR make use of the same operating system primitives (Unix sockets) as any integration test would - with only sd_notify mocked, plus a couple of calls that were added to allow running cloud-init --all-stages under a debugger.

I'm currently working on a community notice to advertise the impacts to users and packagers.

@setharnold setharnold left a comment

This sounds like an awesome performance improvement.

I'm concerned about reusing a systemd unit name for a completely different thing. I'd rather no names be re-used: someone somewhere has fingers that know the names, or documentation that refers to the names, or (maybe not, for cloud-init?) service files and configurations on disk, or terraform plans, or ansible recipes, etc., that refer to the old units with the old meanings.

So: I suggest not using an old name for a new purpose.

One common problem with "multiple-stage" software is when something, or someone, causes the stages to execute out of order. This might not be quite as vulnerable as a TLS implementation in a webserver, but if the different phases have strict ordering requirements, we should have some simple checks for that ordering here.

Thanks

# notify systemd that this stage has completed
socket.sd_notify("READY=1")
# wait for cloud-init-local.service to start
with sync("local"):


Are there ordering requirements inherent to cloud-init's different stages? If local must be run before network, for example, this would be an ideal place to encode these requirements.

Member Author

Are there ordering requirements inherent to cloud-init's different stages?

Yes, these ordering requirements are inherent requirements. This order is already encoded in the init system. Currently cloud-init-local.service is ordered before cloud-init.service which is before cloud-config.service which is before cloud-final.service.

local must be run before network, for example, this would be an ideal place to encode these requirements.

@setharnold Maybe you had something specific in mind that I'm missing?


If these have to happen in lockstep order, maybe a variable to show what stage it's on:

stage = 0
with sync("local"):
    if stage < 0:
        error_and_exit("unexpected stage error")
    stage += 1
    # stuff
with sync("network"):
    if stage < 1:
        error_and_exit("network must follow local")
    stage += 1
    # stuff
with sync("config"):
    if stage < 2:
        error_and_exit("config must follow network")
    stage += 1
    # stuff
# ...and so on for "final"

If the ordering is more vague - say "config" requires "local" but doesn't care about "network" - then local_done, network_done, config_done, and so on boolean variables could mark these completed in turn; each stage could then check the status of the phases it depends upon.

Member Author

If these have to happen in lockstep order, maybe a variable to show what stage it's on:

stage = 0
with sync("local"):
    if stage < 0:
        error_and_exit("unexpected stage error")
    stage += 1
    # stuff
with sync("network"):
    if stage < 1:
        error_and_exit("network must follow local")
    stage += 1
    # stuff
with sync("config"):
    if stage < 2:
        error_and_exit("config must follow network")
    stage += 1
    # stuff
# ...and so on for "final"

If the ordering is more vague - say "config" requires "local" but doesn't care about "network" - then local_done, network_done, config_done, and so on boolean variables could mark these completed in turn; each stage could then check the status of the phases it depends upon.

We could add code like this, but it wouldn't be possible for error_and_exit() to ever be called, regardless of the order in which messages are received on the unix sockets. This is single-threaded code with a single entry point, so in order for the network stage to run, the local stage has to have run.


I think I've just got less faith than you in the systemd-enforced ordering, that's all. If you're sure I'm over-thinking it, feel free to leave it out.

Member Author

I think I've just got less faith than you in the systemd-enforced ordering, that's all.

Systemd is not the only thing enforcing ordering - it is already enforced in the Python code as well. If systemd triggers stages out of order, earlier stages will not get skipped - if "network" is triggered before "local", this Python code will wait until systemd triggers "local".

The reason is that the ordering of these stages isn't enforced in the definition of the context manager. It is enforced by the order in which the context manager is invoked - since with sync("local") is called before with sync("network"), the network stage cannot start until the local stage has completed. No parallelism / async / threading / coroutines / etc. are in use; the code simply runs one stage after the next, but not until systemd has signaled that it is time for that stage to run.

If you're sure I'm over-thinking it, feel free to leave it out.

I'm pretty sure that's what is happening, but I want to make sure you're comfortable with and understand how it works too. Hopefully the following annotated code helps.

with sync("local"):
   # "start" must have been received on the local socket to enter the context manager
   # (triggered by the one-liner in cloud-init-local.service)

# once the code gets to this point, the "local" stage has completed (and a response was
# sent in the context manager to the shim process in cloud-init-local.service)
with sync("network"):
   # "start" must have been received on the network socket to enter the context manager
   # (triggered by the one-liner in cloud-init-network.service) - in addition to
   # all of the code above being completed

# once the code gets to this point, the "network" stage has completed (and a response 
# was sent in the context manager to the shim process in cloud-init-network.service)
with sync("config"):
   # "start" must have been received on the config socket to enter the context manager
   # (triggered by the one-liner in cloud-config.service) - in addition to
   # the code above being completed

# once the code gets to this point, the "config" stage has completed (and a response 
# was sent in the context manager to the shim process in cloud-config.service)
with sync("final"):
   # "start" must have been received on the final socket to enter the context manager
   # (triggered by the one-liner in cloud-final.service) - in addition to
   # the code above being completed

So even if systemd triggers events out of order, cloud-init won't run them out of order. It will just wait until the next stage in the order has completed.

Please let me know if you have any questions.


Oh! now I can see why you're confident :) thanks!

@@ -10,7 +10,7 @@ ConditionEnvironment=!KERNEL_CMDLINE=cloud-init=disabled

[Service]
Type=oneshot
ExecStart=/usr/bin/cloud-init modules --mode=config
ExecStart=sh -c 'echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/config.sock -s /run/cloud-init/share/config-return.sock | sh'
Collaborator

This is a fairly complex invocation for an ExecStart with no surrounding commentary or documentation to aid in explaining the intent of the socket setup and pipe chain. It also makes typical triage entrypoints harder to read when folks may jump into systemctl status cloud-*.service
as it pushes visibility of exit codes off screen because our ExecStart is so long.

It may aid in readability/maintainability if cloud-init delivered a simple wrapper script in /usr/lib/cloudinit/ that would take a single command line param for the stage name (local|network|config|final) and create these separate named sockets. We can then document in that script the intent of opening the socket and the act of passing socket payloads through to sh.

Member Author

This is a fairly complex invocation for an ExecStart with no surrounding commentary or documentation to aid in explaining the intent of the socket setup and pipe chain.

Agreed, inline-documentation would help a lot. I'm happy to add that.

It also makes typical triage entrypoints harder to read when folks may jump into systemctl status cloud-*.service as it pushes visibility of exit codes off screen because our ExecStart is so long.

You're referring to systemd's pager truncating the message on the Process: line, right?

$ systemctl status cloud-init-local.service
● cloud-init-local.service - Cloud-init: Local Stage (pre-network)
     Loaded: loaded (/usr/lib/systemd/system/cloud-init-local.service; enabled; preset: enabled)
     Active: active (exited) since Wed 2024-07-24 19:53:46 UTC; 2min 25s ago
    Process: 254 ExecStart=sh -c echo "start" | nc.openbsd -Uu -W1 /run/cloud-init/share/local.sock -s /run/cloud-init/share/local-return.sock | sh (code=exited, status=0/S>
   Main PID: 254 (code=exited, status=0/SUCCESS)
        CPU: 5ms

I don't think that this is really a concern. This off-screen information is duplicated on the following line, and even if it wasn't, one can always use --no-pager or pipe to cat / less / etc.

It may aid in readability/maintainability if cloud-init delivered a simple wrapper script in /usr/lib/cloudinit/ that would take a single command line param for the stage name (local|network|config|final) and create these separate named sockets. We can then document in that script the intent of opening the socket and the act of passing through socket payloads to sh.

I'd really rather not hide the implementation in a script. The original implementation was dead simple - just a single nc invocation - but to get error codes into the shim services I compromised on that simplicity. I'm very averse to making this any more complex than it already is. I don't want another ds-identify. Frankly, I'd sooner drop the error codes in the shim services or rewrite this one-liner in C/Rust/Go (but in preference of shipping code that we already have packaging for, I've proposed it like this for now, with the intent of potentially rewriting it in a different language eventually).

Also, putting this in a file would add a file load and read to the critical chain per-stage.

Collaborator

@blackboxsw blackboxsw Jul 24, 2024

Agreed, inline-documentation would help a lot. I'm happy to add that.

This meets the majority of my concern, I'm good with your concerns about adding a wrapper layer adding complexity or obscurity where it may not be needed.

If we have comments within this unit that describe the intent of this interaction, then I can back down on the "please make systemctl status cloud-init.service more readable/simple" point.

Member Author

See the latest commit, which I think addresses the remaining concerns in this PR.

Collaborator

Yes looking great.

Resolved review threads: cloudinit/cmd/main.py, cloudinit/socket.py
# the second value contains the path to a unix socket on which to
# reply, which is expected to be /path/to/{self.stage}-return.sock
sock = self.sockets[self.stage]
chunk, self.remote = sock.recvfrom(5)
Collaborator

Historically, systemd units have been kicked out of systemd boot goals due to misconfiguration (unit dependency ordering cycles) with other systemd units. Do we want to implement some sort of long-timeout exception handling here to avoid blocking indefinitely, and hard error with "cloud-init didn't start expected X stage in X mins" to unblock the rest of boot?

Member Author

Historically systemd units have been kicked out of systemd boot goals due to misconfiguration ordering cycles (unit dependency chain issues) with other systemd units.
Do we want to implement some sort of long timeout exception handling here to not block indefinitely and hard error that "cloud-init didn't start expected X stage in X mins" to unblock the rest of boot?

This is a condition which should be discovered before the image ever gets shipped, not something that should bite end users.

I really don't think we want that. What you are referring to is a runtime symptom of a problem that in almost all cases is introduced at build time by the combination of packages installed on the system. Services in systemd can take an unknowable amount of time to complete, so a timeout isn't a correct solution for this problem; the correct solution is to fix the ordering cycle. Adding such a timeout could break real users with long boot times just to introduce a broken workaround for packagers.

Yes, cloud-init will behave differently than before if booted out due to an ordering cycle. However, I don't think this is something an end user should ever see unless they themselves are messing with the boot order or their OS vendor made a mistake - and even in those cases it still doesn't seem like a problem for cloud-init to attempt to solve at the expense of real users with slow boot times.

Collaborator

This is a condition which should be discovered before the image ever gets shipped, not something that should bite end users.

This does happen sometimes to (image creator) end users who install supplemental systemd services in their golden images, which has resulted in bogus bugs/questions about cloud-init not running right (because it was ejected from the boot goals due to systemd unit dependency conflicts). I was wondering if there are ways we could make that condition a bit more conspicuous, but the symptoms below are probably "good enough" for those image creators to triage and determine what is causing the block in boot.

Here are the symptoms a user would encounter if we don't instrument a timeout of some sort. In images where cloud-network.service is ejected from boot goals due to cyclic dependency ordering, the system will be in the following state:

  • systemctl status will sit indefinitely in the "starting" state.
  • systemctl list-jobs will not show the missing/disabled cloud-network.service, but will instead show cloud-final.service and cloud-init.target in the waiting state, and cloud-config.service sitting indefinitely in the "running" state with no logs
  • systemctl status cloud-init.service will show Waiting on external services to complete (network stage)
  • cloud-init status --format=yaml will show extended_status: running and no start times for the init stage in the resulting YAML output.

These are probably reasonable enough breadcrumbs to look into on systems which exhibit this configuration problem and may not warrant further arbitrary timeouts which could adversely affect slow system startup.

Member Author

@holmanb holmanb Jul 25, 2024

This is a condition which should be discovered before the image ever gets shipped, not something that should bite end users.

This does happen sometimes to (image creator) end-users who install supplemental systemd services in their golden images which has resulted in bogus bugs/questions about cloud-init not running right (because it was ejected from the boot goals due to systemd unit dependency conflicts). I was wondering if there are ways we could make that condition a bit more conspicuous, but the symptoms below are probably "good enough" for those image creators to triage and determine what is causing the block in boot.

This can easily be checked by running systemd-analyze verify default.target and inspecting the output, or by checking the journal for messages like "Breaking ordering cycle by deleting job cloud-init.service".

I can add that to the docs in the appropriate place for anyone that runs into this. I'll make it a separate PR. I've added it to the "future work" list in the PR description.

Resolved review thread: systemd/cloud-network.service.tmpl (outdated)
@holmanb
Member Author

holmanb commented Jul 24, 2024

Thanks for the review @setharnold!

So: I suggest not using an old name for a new purpose.

You make good points. I was hopeful that this would be a good name for it, and I still think it is; however, you're right that this may make the transition unnecessarily painful. I've changed the name of the primary service to cloud-init-main.service.

One common problem with "multiple-stage" software is when something, or someone, causes the stages to execute out of order. This might not be quite as vulnerable as a TLS implementation in a webserver, but if the different phases have strict ordering requirements, we should have some simple checks for that ordering here.

With these changes, cloud-init can only run these stages in the required order. Cloud-init uses systemd ordering to guarantee that a stage is not started before it is ready, and it will only run each stage once the previous one has run and once it has received the "start" message for the next stage. This PR also includes tests that exercise this ordering functionality, so I think we already have in place what is being suggested, unless I missed something.

@github-actions github-actions bot added the documentation This Pull Request changes documentation label Jul 24, 2024
@holmanb holmanb force-pushed the holmanb/fastpath-single-proc branch from 40992ea to ae7a4da Compare July 24, 2024 23:12
@holmanb
Member Author

holmanb commented Jul 24, 2024

I just rebased on tip of main to fix a unit test that failed due to service rename.

@holmanb holmanb force-pushed the holmanb/fastpath-single-proc branch from 0de6016 to cb7bb25 Compare August 2, 2024 20:24
@holmanb
Member Author

holmanb commented Aug 2, 2024

The cc_mounts PR that we just landed introduced a merge conflict with this PR, so I just pushed a change which resolves that.

@holmanb holmanb merged commit 143bc9e into canonical:main Aug 2, 2024
23 checks passed
holmanb added a commit to holmanb/cloud-init that referenced this pull request Aug 2, 2024
Python interpreter initialization and module import time
contribute a significant amount of wall clock time to
cloud-init's runtime (and therefore to total boot time).

Cloud-init has four stages. Each stage starts its own Python
interpreter and loads the same libraries. To eliminate the
redundant work of starting an interpreter and loading libraries,
this changes cloud-init to run as a single process. Systemd
service ordering is retained by using the existing cloud-init
services as shims which use a synchronization protocol to start
each cloud-init stage and to communicate that each stage is
complete to the init system. Since multiple cloud-init processes
sit in the critical chain of starting the system, this reduces
boot time (including time to ssh login and time to cloud-init
completion).

Currently only systemd is supported, but the synchronization
protocol should be capable of supporting other init systems
as well with minor changes.

Note: This enables many additional follow-on improvements that
eliminate redundant work. However, these potential improvements
are temporarily ignored. This commit has been structured to
minimize the changes required to capture the majority of primary
performance savings while preserving correctness and the ability
to preserve backwards compatibility.

Since this changes the semantics of the existing cloud-init unit
files, this change takes the opportunity to rename one of its
systemd units which causes frequent user confusion. The unit named
cloud-init.service is often mistaken by users for being the only
cloud-init service, when it is simply one of four stages. This
stage is documented as the "network" stage, so this service will
be renamed to "cloud-init-network.service". A new notify service
is added as part of this implementation which contains the
cloud-init process. This unit is named "cloud-init-main.service".
 
Synchronization protocol
========================

- create one Unix socket for each systemd service stage
- send sd_notify()
- For each of the four stages (local, network, config, final):
   - when init system sends "start" to the Unix socket, start the
     stage
   - when running stage is complete, send "done" to Unix socket
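The handshake above can be sketched end-to-end with unix datagram sockets. This is a minimal, self-contained illustration: the paths, the 5-byte framing, and the helper names are assumptions for the sketch, not cloud-init's exact code, and the sd_notify() step is omitted.

```python
import os
import socket
import tempfile
import threading

STAGES = ["local", "network", "config", "final"]

def make_server_sockets(rundir):
    """Bind one unix datagram socket per stage (the cloud-init side)."""
    socks = {}
    for stage in STAGES:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        sock.bind(os.path.join(rundir, f"{stage}.sock"))
        socks[stage] = sock
    return socks

def run_stage(stage, socks):
    """Block until the init system sends "start", then reply "done"."""
    sock = socks[stage]
    msg, remote = sock.recvfrom(5)        # expect b"start"
    assert msg == b"start"
    # ... the real stage work would run here ...
    sock.sendto(b"done", remote)          # reply to the sender's socket

def shim(rundir, stage):
    """The init-system side: what each shim unit's nc invocation does."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    sock.bind(os.path.join(rundir, f"{stage}-return.sock"))  # reply address
    sock.sendto(b"start", os.path.join(rundir, f"{stage}.sock"))
    reply, _ = sock.recvfrom(5)           # blocks until the stage completes
    sock.close()
    return reply

rundir = tempfile.mkdtemp()
socks = make_server_sockets(rundir)
results = {}

def init_system():
    for stage in STAGES:                  # systemd ordering, serialized
        results[stage] = shim(rundir, stage)

t = threading.Thread(target=init_system)
t.start()
for stage in STAGES:                      # cloud-init's single process
    run_stage(stage, socks)
t.join()
print(results)
```

Because each shim binds its own return socket before sending "start", the single cloud-init process learns the reply address from recvfrom() and can acknowledge completion to the correct unit.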

File changes
============

socket.py (new)
---------------

- define a systemd-notify helper function
- define a context manager which implements a multi-socket
  synchronization protocol

cloud-init.service -> cloud-init-network.service (renamed)
----------------------------------------------------------

- renamed to cloud-init-network.service

cloud-{init-local,init-network,config,final}.services
-----------------------------------------------------

- change ExecStart to use netcat to connect to Unix socket and:
  - send a start message
  - wait for completion response
- note: a pure Python equivalent is possible for any downstreams
  which do not package openbsd's netcat

cloud-init-main.service (new)
-----------------------------

 - set service type to 'notify'
 - invoke cloud-init in single process mode
 - adopt systemd ordering requirements from cloud-init-local.service
 - adopt KillMode from cloud-final.service

main.py
-------

 - Add command line flag to indicate "all stages" mode
 - In this mode run each stage followed by an IPC
   synchronization protocol step

cloud-final.services
--------------------

- drop KillMode

cloud-init-local.services
-------------------------

- drop dependencies made redundant by ordering after
  cloud-init-main.service

Performance Impact
==================

On Ubuntu 24.04, Python's wall clock start up time as measured with
`time python3 -c 'import cloudinit.cmd.main'` on a few cloud types:

lxc container: 0.256s
QEMU machine:  0.300s
gce instance:  0.367s
ec2 instance:  0.491s
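The measurement can be reproduced for any module set; here is a small sketch using a stdlib module as a stand-in for cloudinit.cmd.main (which may not be importable on a development machine):

```python
import subprocess
import sys
import time

def interpreter_startup_seconds(module="json"):
    """Wall-clock cost of spawning a fresh Python interpreter and
    importing a module - the overhead each extra cloud-init stage paid."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start

print(f"{interpreter_startup_seconds():.3f}s")
```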

This change eliminates 1x this start up time from time to ssh and
3x this start up time from cloud-init's total completion time. The
total benefit varies based on the platform hosting the instance, but
all platforms will measurably benefit from this change.

BREAKING_CHANGE: Run all four cloud-init services as a single systemd service.
holmanb added a commit that referenced this pull request Aug 6, 2024
@holmanb holmanb mentioned this pull request Aug 8, 2024
@holmanb
Member Author

holmanb commented Oct 11, 2024

Anyone who prefers to avoid the openbsd netcat dependency might consider the following patch.

0001-Demonstrate-how-to-avoid-netcat-dependency.txt

archlinux-github pushed a commit to archlinux/archiso that referenced this pull request Oct 27, 2024
Adapt enabled services in both baseline and releng profiles to cope with
changes in cloud-init 24.3.

Related-to: https://gitlab.archlinux.org/archlinux/packaging/packages/cloud-init/-/issues/3
Related-to: canonical/cloud-init#5489
Signed-off-by: David Runge <[email protected]>
@dvzrv

dvzrv commented Oct 27, 2024

Any that prefer to avoid the openbsd netcat dependency might consider the following patch.

0001-Demonstrate-how-to-avoid-netcat-dependency.txt

@holmanb If this is possible, I don't understand why cloud-init would not just rely on Python fully in the first place instead of depending on something as undermaintained as openbsd-netcat? 🤔
Downstreams would have to maintain your proposed patch indefinitely, which is brittle and an unnecessary overhead for them.
I think it would be much more future-proof to drop the dependency on openbsd-netcat and instead just use Python here going forward. 🙏

Relatedly, this change has not been mentioned in the release announcements, or the changelog (it can only be found in the last footnote of a separate announcement on an Ubuntu discourse post).

I'm raising this because I am packaging this project for Arch Linux, and due to the opaque handling, these breaking changes were not initially clear to me (which first broke cloud-init generally, and then our integration with our installation medium, which I am still fixing).
Additionally, not all downstreams have an "alternatives" system, so gnu-netcat and openbsd-netcat cannot be installed side by side, while offering different features (notably, gnu-netcat does not offer -U for dealing with unix sockets).

archlinux-github pushed a commit to archlinux/archiso that referenced this pull request Oct 27, 2024
@holmanb
Member Author

holmanb commented Oct 27, 2024

Any that prefer to avoid the openbsd netcat dependency might consider the following patch.
0001-Demonstrate-how-to-avoid-netcat-dependency.txt

@holmanb If this is possible, I don't understand why cloud-init would not just rely on Python fully in the first place

Openbsd-netcat is significantly faster to start, based on our testing. A compiled language would likely perform even better with fewer dependencies (outside of a compiler) - we may implement that in the future. Since cloud-init sits in the critical chain during boot, a pure-Python shim would reduce the benefits received by this change while keeping its other complexity.
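The startup-cost claim can be checked with a rough benchmark; here `true` stands in for a small compiled binary like nc (an assumption for illustration, not cloud-init's actual test methodology):

```python
import subprocess
import sys
import time

def mean_startup(cmd, runs=10):
    """Average wall-clock time to spawn a process and wait for it to exit."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(cmd, check=True)
    return (time.perf_counter() - start) / runs

print("compiled binary:", mean_startup(["true"]))
print("python startup :", mean_startup([sys.executable, "-c", "pass"]))
```

On typical systems the compiled binary starts an order of magnitude faster than a fresh Python interpreter, which is the gap each shim invocation pays at boot.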

and instead on something quite undermaintained as openbsd-netcat?

How is it undermaintained?

Relatedly, this change has not been mentioned in the release announcements, or the changelog

(it can only be found in the last footnote of a separate announcement on an Ubuntu discourse post).

This was actually mentioned under the "Breaking Changes" section in the release announcements. We broadcast it everywhere, but the release notes admittedly didn't describe the change in detail.

Breaking Changes:

Warning issued for systemd environments if cloud-init boot stages are
called by PIDs other than the root process. It is expected that boot
stages are only called by systemd units and not post-production scripts
or tools.
Introduce a performance optimization by sharing the python environment and setup costs across all four boot stages with a new cloud-init-main.service

🤔 Downstreams would have to maintain your proposed patch indefinitely, which is brittle and a unnecessary overhead for them. I think it would be much more future proof to drop the dependency on openbsd-netcat and instead just use Python here going forward. 🙏
...
I'm raising this, because I am packaging this project for Arch Linux and due to the intransparent handling these breaking changes were not initially clear to me (which broke first cloud-init generally, but also our integration with our installation medium afterwards and I am still fixing that).

I see the frustration, sorry :(

Additionally, not all downstreams have an "alternatives" system, so gnu-netcat and openbsd-netcat can not be installed side-by-side, while offering different features (notably, gnu-netcat does not offer -U for dealing with unix sockets).

If you were to replace gnu-netcat with openbsd-netcat, like some other distros have, this wouldn't be an issue. Would that be possible?

archlinux-github pushed a commit to archlinux/archiso that referenced this pull request Oct 28, 2024
Labels: documentation, packaging, security

7 participants