
Introduce prepare for eval, fix evaluation bug #789

Open · wants to merge 9 commits into base: dev

Conversation

@Niccolo-Ajroldi (Contributor) commented on Sep 15, 2024

Description

This pull request introduces a prepare_for_eval function and updates the code to support it.

The implementation follows the blueprint of @fsschneider in #719 (comment) and fixes the bug where a submission that exceeds max_runtime still gets a free evaluation (again, #719 (comment)).

Function signature

The arguments of prepare_for_eval are the same as those of update_params, except that batch is omitted: prepare_for_eval should be agnostic to the last batch used during training. The return type is the same as that of update_params.

def prepare_for_eval(workload: spec.Workload,
                     current_param_container: spec.ParameterContainer,
                     current_params_types: spec.ParameterTypeTree,
                     model_state: spec.ModelAuxiliaryState,
                     hyperparameters: spec.Hyperparameters,
                     loss_type: spec.LossType,
                     optimizer_state: spec.OptimizerState,
                     eval_results: List[Tuple[int, float]],
                     global_step: int,
                     rng: spec.RandomState) -> spec.UpdateReturn:
  return (optimizer_state, current_param_container, model_state)
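
For illustration, this hook lets a submission evaluate different parameters from the ones it trains with. The following sketch is hypothetical and not part of this PR; it assumes a submission that keeps an exponential moving average of its weights as the second element of its optimizer_state tuple:

# Hypothetical submission-side use (illustrative only): optimizer_state is
# assumed to be an (opt_state, ema_params) tuple maintained by the submission.
def prepare_for_eval(workload, current_param_container, current_params_types,
                     model_state, hyperparameters, loss_type, optimizer_state,
                     eval_results, global_step, rng):
  del workload, current_param_container, current_params_types
  del hyperparameters, loss_type, eval_results, global_step, rng
  opt_state, ema_params = optimizer_state
  # Evaluate the averaged weights instead of the raw training weights.
  return ((opt_state, ema_params), ema_params, model_state)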

List of changes

In submission_runner.py (a rough sketch of the resulting flow follows this list of changes):

  • add timed call to prepare_for_eval
  • add profiler
  • move del batch before prepare_for_eval (instead of before evaluation)
  • update accumulated_submission_time after prepare_for_eval
  • compute is_time_remaining after prepare_for_eval
  • proceed to eval iff is_time_remaining
  • add prep_eval_rng

Minor changes:

  • add PrepareForEvalFn to spec
  • add prepare_for_eval to submission template
  • add prepare_for_eval to all PyTorch and JAX submissions
  • update the docs
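
For reference, here is a rough sketch of the resulting flow in submission_runner.py. This is not the exact diff: variable names are taken from the snippets in this thread where possible, and the remaining names (model_params_types, max_allowed_runtime_sec, the eval_model arguments) are placeholders.

# Rough sketch of the runner-side flow around evaluation (names are partly
# placeholders; see the lead-in above).
del batch
with profiler.profile('Prepare for eval'):
  prepare_for_eval_start_time = get_time()
  optimizer_state, model_params, model_state = prepare_for_eval(
      workload, model_params, model_params_types, model_state,
      hyperparameters, loss_type, optimizer_state, eval_results,
      global_step, prep_eval_rng)
  prepare_for_eval_end_time = get_time()

# Time spent in prepare_for_eval counts towards the submission budget.
train_state['accumulated_submission_time'] += (
    prepare_for_eval_end_time - prepare_for_eval_start_time)

# No free evaluation for a submission that has exceeded max_runtime.
train_state['is_time_remaining'] = (
    train_state['accumulated_submission_time'] < max_allowed_runtime_sec)
if train_state['is_time_remaining']:
  with profiler.profile('Evaluation'):
    latest_eval_result = workload.eval_model(...)  # eval arguments elided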

Fixes #719 and #758.

@Niccolo-Ajroldi requested a review from a team as a code owner on September 15, 2024, 11:32

github-actions bot commented Sep 15, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@priyakasimbeg changed the base branch from main to dev on October 17, 2024, 01:22
@@ -518,6 +548,7 @@ def score_submission_on_workload(workload: spec.Workload,
init_optimizer_state = submission_module.init_optimizer_state
update_params = submission_module.update_params
data_selection = submission_module.data_selection
prepare_for_eval = submission_module.prepare_for_eval
Contributor

This will break the submission modules from the competition. Should we maybe default this to a no-op function, e.g. in a try/except block?

Contributor (Author)

It depends on whether we want to make the code backward compatible with the previous submissions. If so, we could make prepare_for_eval a no-op by doing something like this:

prepare_for_eval = getattr(submission_module, 'prepare_for_eval', None)

And then call it only if it is not None:

# Prepare for evaluation (timed).
if prepare_for_eval is not None:

  with profiler.profile('Prepare for eval'):
    del batch
    prepare_for_eval_start_time = get_time()
    optimizer_state, model_params, model_state = prepare_for_eval(...)
    prepare_for_eval_end_time = get_time()

  # Update submission time.
  train_state['accumulated_submission_time'] += (
      prepare_for_eval_end_time - prepare_for_eval_start_time)

What do you think?

My only doubt is that I am not sure we can make all the API changes backward compatible with the submissions; for example, #785 is trickier to implement in such a way.

Contributor

At the moment, I think backwards compatibility is important since we are transitioning to the rolling leaderboard (which may involve hardware changes) and may have to rescore some submissions.
I don't think we want to bear the responsibility of modifying the existing submissions' code.
So I prefer this solution of getting the prepare_for_eval function and calling it only if it is not None.

Contributor (Author)

I agree, it makes sense! I changed the code accordingly and tested it for backward compatibility; it looks good.

Successfully merging this pull request may close these issues.

Inform submission about evaluation step