New persist() method #361

Open
dreadatour opened this issue Aug 27, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@dreadatour
Contributor

Follow-up to #327.

Sometimes it is useful to save intermediate chain state. Because operations are lazy, chains are not executed immediately and intermediate results are not stored.

For example, if we want to create dc_filtered_1 and dc_embeddings from dc, then without saving the intermediate result the dc chain will be executed twice, once for each child.

It is possible to do this with the save() method without the name param, and we also have the exec() method, but persist() looks like a better and more descriptive name for this operation.

Once the persist() method is implemented, we may want to make the name param in the save() method mandatory.
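
To illustrate the double execution, here is a minimal sketch using a toy lazy chain. The `LazyChain` class, its `map`/`filter`/`collect` methods, and the `expensive` function are all hypothetical stand-ins written for this example; they are not the library's actual API, and the real persist() may behave differently (for instance, it would presumably store results rather than keep them in memory).

```python
class LazyChain:
    """Toy lazy chain (hypothetical, for illustration only)."""

    def __init__(self, compute):
        # `compute` is a zero-argument callable that returns a list of rows.
        self._compute = compute

    def map(self, fn):
        # Nothing runs here; the work is deferred until collect().
        return LazyChain(lambda: [fn(x) for x in self._compute()])

    def filter(self, pred):
        return LazyChain(lambda: [x for x in self._compute() if pred(x)])

    def persist(self):
        # Run the chain once, now, and return a chain backed by the result.
        rows = self._compute()
        return LazyChain(lambda: rows)

    def collect(self):
        return self._compute()


calls = {"expensive": 0}


def expensive(x):
    # Stand-in for a costly step; counts how often it is invoked.
    calls["expensive"] += 1
    return x * 10


dc = LazyChain(lambda: [1, 2, 3]).map(expensive)

# Without persisting: each child re-runs the whole parent chain.
dc_filtered_1 = dc.filter(lambda x: x > 10)
dc_embeddings = dc.map(lambda x: x + 1)
dc_filtered_1.collect()
dc_embeddings.collect()
calls_without_persist = calls["expensive"]

# With persisting: the parent chain runs once and both children reuse it.
calls["expensive"] = 0
dc_saved = dc.persist()
dc_saved.filter(lambda x: x > 10).collect()
dc_saved.map(lambda x: x + 1).collect()
calls_with_persist = calls["expensive"]

print(calls_without_persist, calls_with_persist)
```

In this toy version the expensive step runs once per row per child without persist(), and only once per row with it.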

@mattseddon
Member

How about materialise instead of persist? Just a suggestion.

@rlamy
Contributor

rlamy commented Aug 27, 2024

.persist() is the name of the method in the dataframe API standard. I think that's what we should use - assuming it works exactly as described in the standard.
