WIP: early hostkey generation #5728

Draft
wants to merge 1 commit into
base: main


@TheRealFalcon TheRealFalcon commented Sep 23, 2024

Proposed Commit Message

TODO

Additional Context

I'm looking for feedback on the approach. The code currently works for the happy path; tests, docs, and some refactoring are still needed.

Canonical internal only: https://docs.google.com/document/d/1-Cfn-UIpN26vef_LeE0CtQg5D15r3ZRo0fERJ7R0XFg/edit

root@me:~# cloud-init analyze blame
-- Boot Record 01 --
     00.05800s (modules-config/config-grub_dpkg)
     00.02800s (init-network/config-users_groups)
     00.02200s (modules-final/config-keys_to_console)
     00.01300s (modules-config/config-apt_configure)
     00.01000s (init-local/search-LXD)
     00.00400s (init-network/config-ssh)
     00.00300s (init-network/config-growpart)
     00.00200s (modules-final/config-final_message)
     00.00100s (modules-final/config-install_hotplug)
     00.00100s (modules-config/config-locale)
     00.00100s (modules-config/check-cache)
     00.00100s (init-network/consume-user-data)
     00.00100s (init-network/config-set_passwords)
     00.00100s (init-network/config-set_hostname)
     00.00100s (init-network/config-resizefs)
     00.00100s (init-network/activate-datasource)
     00.00000s (modules-final/config-ssh_authkey_fingerprints)
     00.00000s (modules-final/config-scripts_vendor)
     00.00000s (modules-final/config-scripts_user)
     00.00000s (modules-final/config-scripts_per_once)
     00.00000s (modules-final/config-scripts_per_instance)
     00.00000s (modules-final/config-scripts_per_boot)
     00.00000s (modules-final/config-reset_rmc)
     00.00000s (modules-final/check-cache)
     00.00000s (modules-config/config-ssh_import_id)
     00.00000s (modules-config/config-byobu)
     00.00000s (init-network/setup-datasource)
     00.00000s (init-network/consume-vendor-data2)
     00.00000s (init-network/consume-vendor-data)
     00.00000s (init-network/config-update_hostname)
     00.00000s (init-network/config-seed_random)
     00.00000s (init-network/config-mounts)
     00.00000s (init-network/check-cache)
     00.00000s (init-local/check-cache)

Test Steps

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

@TheRealFalcon TheRealFalcon marked this pull request as draft September 23, 2024 14:29
@holmanb holmanb self-assigned this Sep 23, 2024
try:
    # Using subprocess.Popen instead of subp.subp to run
    # multiple ssh-keygen commands in parallel.
    p = subprocess.Popen(
Member

Why generate these in parallel? RSA keygen is time-consuming, but generating the other keys is not, so I'm not convinced that the extra complexity outweighs the benefit.

Member Author

This doesn't really seem any more complex to me, but serial is fine too.
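For comparison, the two options discussed here can be sketched as follows. This is an illustration only, not the code in this PR: `run_parallel`, `run_serial`, and `keygen_commands` are hypothetical helpers, and the key file naming is an assumption. The parallel version starts every process before waiting on any of them, so total wall time is roughly that of the slowest command.

```python
import subprocess
from typing import List, Sequence, Tuple

Result = Tuple[int, bytes]


def run_parallel(commands: Sequence[Sequence[str]]) -> List[Result]:
    """Start all commands first, then collect their results in order."""
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        for cmd in commands
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # waits for this process to exit
        results.append((proc.returncode, out))
    return results


def run_serial(commands: Sequence[Sequence[str]]) -> List[Result]:
    """Run commands one after another; simpler, but the run times add up."""
    results = []
    for cmd in commands:
        done = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        results.append((done.returncode, done.stdout))
    return results


def keygen_commands(
    key_dir: str, key_types: Sequence[str] = ("rsa", "ecdsa", "ed25519")
) -> List[List[str]]:
    # Hypothetical helper: the ssh_host_<type>_key naming is an assumption.
    return [
        ["ssh-keygen", "-q", "-t", key_type, "-N", "",
         "-f", f"{key_dir}/ssh_host_{key_type}_key"]
        for key_type in key_types
    ]
```

Both functions take the same command list, so switching between them is a one-line change at the call site, for example `run_parallel(keygen_commands(key_dir))`.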

    LOG.warning("Failed to retrieve early generated host keys")
    return []

key_dir = get_early_host_key_dir(rundir)
Member

nit: why not just use early_key_fifo_path here?

Member Author

Whoops, I forgot to update the name (it shouldn't end with _dir). The function exists because the rundir has to be passed in, and to make sure the same path is used in multiple places.

early_keys: List[ssh_util.KeyPair] = (
    ssh_util.wait_for_early_generated_keys(rundir)
)
if not early_keys or cfg.get("seed_random"):
Member

There is a problem with this that needs to be addressed.

Cloud-init's Azure and OpenStack code has automatic entropy seeding, which bypasses cloud-config. This change highlights the fact that cloud-init implements datasource-specific code both in configuration modules and in datasource modules. Simply checking the merged configuration is insufficient because, for some reason, when this was implemented it was decided that this shouldn't just be transformed into vendor-data to be merged (and overwritten by user-data).

I strongly suspect that automatic entropy seeding exists on these platforms for legacy reasons only[1][2], and is no longer needed. However, until this tech debt has been resolved, cloud-init should still respect the entropy provided by these platforms.

[1] torvalds/linux@f2580a9
[2] https://bugs.launchpad.net/nova/+bug/1789868

Member Author

Aside: I realized this should be random_seed rather than seed_random. Having the key in the opposite order from the module name is confusing.

To your point, I was aware of datasources doing their own thing here. I saw random_seed in the datasources and thought it was using the same key. I didn't realize it's storing the seed under a separate metadata key. This if statement can just be updated to also check if random_seed is in cloud.datasource.metadata, correct?

Member

> This if statement can just be updated to also check if random_seed is in cloud.datasource.metadata, correct?

Yes, I believe so
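A minimal sketch of the combined check agreed on above, assuming the datasource stores its seed under a `random_seed` metadata key as described in this thread. `must_regenerate_host_keys` is a hypothetical name used only for illustration; in the PR the condition lives inline in the `if` statement.

```python
from typing import Any, Mapping, Optional, Sequence


def must_regenerate_host_keys(
    early_keys: Sequence[Any],
    cfg: Mapping[str, Any],
    metadata: Optional[Mapping[str, Any]],
) -> bool:
    """Return True when the early-generated keys should not be used."""
    if not early_keys:
        # Early generation failed or never ran.
        return True
    if cfg.get("random_seed"):
        # The merged cloud-config supplies entropy to seed first.
        return True
    # Assumption: datasources such as Azure and OpenStack expose their
    # seed under a "random_seed" key in the datasource metadata.
    if metadata and "random_seed" in metadata:
        return True
    return False
```

With this shape, keys generated before the platform-provided entropy was applied are discarded whenever either source supplies a seed.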

early_key_fifo_path = _get_early_key_fifo_path(rundir)
if not early_key_fifo_path.exists():
    return []
if early_key_fifo_path.read_bytes() != b"done":
Member

If key generation is not complete, this read will block. It would be good to know when that happens, for example by timing the read with performance.Timed or performance.timed().
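To illustrate the blocking behaviour and one way to surface it, here is a self-contained sketch that times the FIFO read with `time.monotonic()` and logs how long it waited. It is a stand-in for the idea only and does not use cloud-init's `performance` helpers, whose exact signatures are not shown in this thread. `read_fifo_timed`, `demo`, and the `warn_after` threshold are assumptions made for the example, and `os.mkfifo` makes it POSIX-only.

```python
import logging
import os
import tempfile
import threading
import time
from pathlib import Path

LOG = logging.getLogger(__name__)


def read_fifo_timed(fifo_path: Path, warn_after: float = 1.0) -> bytes:
    """Read a FIFO to EOF and log how long the read blocked."""
    start = time.monotonic()
    # Blocks until a writer opens the FIFO, writes, and closes it.
    data = fifo_path.read_bytes()
    elapsed = time.monotonic() - start
    level = logging.WARNING if elapsed >= warn_after else logging.DEBUG
    LOG.log(level, "Waited %.3fs reading %s", elapsed, fifo_path)
    return data


def demo() -> bytes:
    """Simulate a key generator that signals completion after a delay."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo = Path(tmp) / "early_keys.fifo"
        os.mkfifo(fifo)

        def writer() -> None:
            time.sleep(0.05)  # pretend key generation is still running
            fifo.write_bytes(b"done")

        thread = threading.Thread(target=writer)
        thread.start()
        try:
            return read_fifo_timed(fifo)
        finally:
            thread.join()
```

Logging at warning level only past a threshold keeps the normal case quiet while still recording boots where the config stage had to wait on key generation.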

Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close.

If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging TheRealFalcon, and he will ensure that someone takes a look soon.

(If the pull request is closed and you would like to continue working on it, please do tag TheRealFalcon to reopen it.)

@github-actions github-actions bot added the stale-pr label (Pull request is stale; will be auto-closed soon) Oct 19, 2024
@TheRealFalcon TheRealFalcon removed the stale-pr label (Pull request is stale; will be auto-closed soon) Oct 21, 2024