feat: Support URI sources in write_files module #5505

Merged
merged 24 commits into from
Jul 22, 2024

Conversation

LRitzdorf
Contributor

@LRitzdorf LRitzdorf commented Jul 10, 2024

Proposed Commit Message

feat(write_files): support URI content sources

This change adds an optional `source` key to the `write_files` module,
allowing users to specify a URI from which to load file contents. This
facilitates more flexible multi-part configurations, as file contents
can be managed via external sources such as independent Git
repositories.

Fixes GH-5500

Additional Context

Resolves #5500

Test Steps

Unit tests, included, cover:

  • file:// and HTTP URI functionality
  • Fallback behavior (i.e. use inline content if the provided URI fails to return usable data)
  • Updated schema validation
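The fallback behavior listed above can be illustrated with a small standalone sketch. It is not the code from this PR: `read_source_or_fallback` is a hypothetical helper, and it uses the standard library's `urllib.request` as a stand-in for cloud-init's own URL helper, purely to show the "try the URI, otherwise use inline content" idea with a `file://` URI.

```python
import tempfile
import urllib.error
import urllib.request
from pathlib import Path
from typing import Optional


def read_source_or_fallback(uri: Optional[str], inline: str) -> str:
    """Return the text found at `uri`, or `inline` if the URI is unusable.

    Hypothetical helper for illustration only; cloud-init's real
    implementation differs.
    """
    if uri is None:
        return inline
    try:
        # urlopen handles file:// as well as http(s):// URIs.
        with urllib.request.urlopen(uri, timeout=5) as resp:
            return resp.read().decode("utf-8")
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable, missing, malformed, or undecodable source:
        # fall back to the inline content.
        return inline


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        present = Path(tmp) / "motd"
        present.write_text("from uri\n")
        missing = Path(tmp) / "missing"
        print(read_source_or_fallback(present.as_uri(), "inline\n"))
        print(read_source_or_fallback(missing.as_uri(), "inline\n"))
```

A missing file raises `URLError` inside `urlopen`, so the second call returns the inline text without touching the network, which is also why a `file://` URI makes for a fast fallback test.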

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

@LRitzdorf
Contributor Author

As I continue to work on these changes, I have a few questions for anyone with real cloud-init experience:

  • Do we want to support templating of the source URI?
    The phone_home module seems to do this, but... only with the instance ID?
  • Which, if any, of the other kwargs supported by readurl should be exposed as module parameters?
    Retry behavior and headers, in particular, stick out to me as settings that the user might want to control. On the other hand, how many parameters is too many?

This isn't really an appropriate example, since the `write_files` module
now supports URL sources. (Also, this example wrote to `/tmp`, which
conflicts with other advice on the examples page.)
...rather than passing a Paths object, which is just pain. Also, this
way lets us pass None as well, if desired.
@LRitzdorf
Contributor Author

LRitzdorf commented Jul 11, 2024

I have one (now two) new tests currently, which access content from a file:// URI. There should probably also be one for a "real" HTTP(S) URI as well, but I'm not sure about setting that up — would we also need to launch a lightweight HTTP server? Is there already some kind of support for this in the testing framework?

We should also test reading from a "real" (i.e. HTTP[S]) URI, though
setting this up might be... involved. Would we need a local server?
@LRitzdorf LRitzdorf marked this pull request as ready for review July 12, 2024 15:02
@TheRealFalcon TheRealFalcon added the CLA signed The submitter of the PR has signed the CLA label Jul 12, 2024
Member

@TheRealFalcon TheRealFalcon left a comment

Thanks for this contribution! Overall, things look good here. I left a few inline comments as well as answers to your questions here.

Do we want to support templating of the source URI?

cloud-config supports jinja templating, so module specific templating shouldn't be necessary (if I understood the question correctly). The cc_phone_home snippet you found is very old and likely predates the ability to use jinja.

Which, if any, of the other kwargs supported by readurl should be exposed as module parameters?

That's a great question with no great answer. I prefer to be opinionated so as to reduce the number of options that need to be specified. The retry defaults you provided seem reasonable, though I'd add a comment specifying that they're completely arbitrary and just seemed like good defaults. It makes sense to include headers for any kind of token or basic auth, but I don't think we need more options than that. To accommodate this, I think it'd be better to make source a dictionary. It can then have subkeys like uri, which would be required, and headers, which would be optional. If in the future we decide we need more than that, we can then add them to the dictionary.
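As a rough illustration of the "source as a dictionary" shape suggested here, the sketch below checks a `source` mapping with a required `uri` and optional `headers`. The function name `normalize_source` and the error messages are hypothetical; in cloud-init itself this kind of validation would live in the module's JSON schema rather than in hand-written Python.

```python
from collections.abc import Mapping

# Keys accepted in this sketch; more could be added later without
# changing the overall shape of the `source` entry.
ALLOWED_KEYS = {"uri", "headers"}


def normalize_source(source: dict) -> dict:
    """Validate a `source` mapping and fill in defaults (illustrative only)."""
    if not isinstance(source.get("uri"), str):
        raise ValueError("source requires a string 'uri'")
    unknown = set(source) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unsupported source keys: {sorted(unknown)}")
    headers = source.get("headers", {})
    if not isinstance(headers, Mapping):
        raise ValueError("'headers' must be a mapping of header names to values")
    return {"uri": source["uri"], "headers": dict(headers)}


if __name__ == "__main__":
    # example.invalid is a placeholder host, not a real endpoint.
    print(normalize_source({"uri": "https://example.invalid/motd"}))
    print(
        normalize_source(
            {
                "uri": "https://example.invalid/motd",
                "headers": {"Authorization": "Bearer <token>"},
            }
        )
    )
```

Rejecting unknown keys keeps the option set small for now while leaving room to extend the dictionary later.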

I have one (now two) new tests currently, which access content from a file:// URI. There should probably also be one for a "real" HTTP(S) URI as well, but I'm not sure about setting that up — would we also need to launch a lightweight HTTP server? Is there already some kind of support for this in the testing framework?

For unit tests, we use responses to mock the server responses. For this effort I think that should be sufficient. Can we also get a test for the fallback behavior?
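The point of mocking is that no HTTP server has to run during unit tests. The sketch below shows the same idea using only the standard library's `unittest.mock`; it is a stand-in, not the `responses` API and not cloud-init's test code. `fetch_text` is a hypothetical function under test, and `example.invalid` is a placeholder host.

```python
import io
import urllib.request
from unittest import mock


def fetch_text(uri: str) -> str:
    """Hypothetical function under test: read a URI and decode it as UTF-8."""
    with urllib.request.urlopen(uri, timeout=5) as resp:
        return resp.read().decode("utf-8")


def test_fetch_text_uses_mocked_response() -> None:
    # BytesIO supports the context-manager protocol and .read(), so it can
    # play the role of the HTTP response object here.
    fake_response = io.BytesIO(b"hello from mock\n")
    with mock.patch(
        "urllib.request.urlopen", return_value=fake_response
    ) as opened:
        assert fetch_text("http://example.invalid/motd") == "hello from mock\n"
    # The request never reached the network; only the mock was called.
    opened.assert_called_once_with("http://example.invalid/motd", timeout=5)


if __name__ == "__main__":
    test_fetch_text_uses_mocked_response()
    print("ok")
```

With `responses`, the equivalent test would register a canned reply for the URL instead of patching the opener, but the structure (register a fake reply, call the code, assert on the result) is the same.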

Let me know if you have any extra questions!

cloudinit/config/cc_write_files.py (outdated, resolved)
cloudinit/config/cc_write_files.py (outdated, resolved)
cloudinit/config/cc_write_files.py (outdated, resolved)
@LRitzdorf
Contributor Author

Thanks for the feedback — that's very useful, especially since I'm new to cloud-init. Especially the "source as a dictionary" bit; that'll be much cleaner. Will have updates soon!

(Also, side note: there's already a test for the fallback behavior; that's the second one I alluded to in my previous edit. It currently uses a file URI, but that could change if desired.)

@TheRealFalcon
Member

(Also, side note: there's already a test for the fallback behavior; that's the second one I alluded to in my previous edit. It currently uses a file URI, but that could change if desired.)

Ah, sorry, I missed that. That works!

@LRitzdorf
Contributor Author

All right, comments addressed — and a fair bit of formatting fixed; tests under older Pythons should actually pass now!

FWIW, I did try changing the fallback test to use an HTTP URI, but that takes several seconds to fail (due to going through the OS's networking stack), so I've left it with a file URI. Not worth slowing down the tests for a tiny bit of added realism.

Member

@TheRealFalcon TheRealFalcon left a comment

Thanks for the updates! I left a few more minor inline comments, but I think it's really close now. I'm also approving the CI runs, so check for any failures there too.

cloudinit/config/cc_write_files.py (resolved)
doc/module-docs/cc_write_files/example6.yaml (outdated, resolved)
cloudinit/config/cc_write_files.py (outdated, resolved)
@LRitzdorf
Contributor Author

Great, thanks! I've been running tests manually on my system as well, but it looks like I missed ruff. That's fixed now, along with your other comments!

Member

@holmanb holmanb left a comment

@LRitzdorf Thanks for this! I see a small change that we should make to the schema before merging. See my comments below.

Member

@holmanb holmanb left a comment

This looks fine to me. Thanks @LRitzdorf, and welcome to cloud-init!

Member

@TheRealFalcon TheRealFalcon left a comment

LGTM! Thanks for the contribution @LRitzdorf !

@TheRealFalcon TheRealFalcon merged commit 7c2d4fd into canonical:main Jul 22, 2024
23 checks passed
holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Aug 2, 2024
@LRitzdorf LRitzdorf deleted the write-files-url-source branch August 5, 2024 18:09
holmanb pushed a commit that referenced this pull request Aug 6, 2024
Development

Successfully merging this pull request may close these issues.

[enhancement]: Support downloading file contents with write_files module
3 participants