Support for optimizers that require an `.eval()` step #758

netw0rkf10w · 2024-04-11T20:36:24Z

Description

First of all, if this feature is already supported, please consider this a question.

I'm trying to reproduce some results of existing algorithms such as SGD and AdamW, as well as the recently proposed ScheduleFree. The latter in particular requires an `.eval()` step before the validation phase, and I haven't figured out how to do that properly (there doesn't seem to be any indication in their submission code).
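To make the question concrete, here is a minimal pure-Python sketch of why such an optimizer needs an `.eval()` step. It assumes the Schedule-Free SGD update with uniform averaging, where gradients are taken at an interpolated point y while the averaged point x is the one to validate. The class name `ToyScheduleFreeSGD`, the scalar quadratic objective, and the hyperparameter values are illustrative assumptions, not the submission's code.

```python
class ToyScheduleFreeSGD:
    """Toy scalar Schedule-Free SGD (hypothetical, for illustration only).

    `param` holds y (the gradient point) in train mode and
    x (the averaged point) in eval mode. Requires 0 < beta <= 1.
    """

    def __init__(self, init, lr=0.1, beta=0.9):
        self.param = init        # weight seen by the "model"; starts as y
        self.z = init            # base SGD iterate
        self.lr = lr
        self.beta = beta
        self.t = 0
        self.train_mode = True

    def step(self, grad):
        """Apply one update; `grad` must have been computed at y."""
        assert self.train_mode, "call train() before step()"
        self.t += 1
        y = self.param
        # Recover x from y = (1 - beta) * z + beta * x.
        x = (y - (1.0 - self.beta) * self.z) / self.beta
        self.z = self.z - self.lr * grad      # plain SGD step on z
        c = 1.0 / self.t                      # uniform averaging weight
        x = (1.0 - c) * x + c * self.z        # running mean of the z iterates
        self.param = (1.0 - self.beta) * self.z + self.beta * x

    def eval(self):
        """Switch the stored weight from y to x before validation."""
        if self.train_mode:
            self.param = (self.param - (1.0 - self.beta) * self.z) / self.beta
            self.train_mode = False

    def train(self):
        """Switch the stored weight from x back to y before training."""
        if not self.train_mode:
            self.param = (1.0 - self.beta) * self.z + self.beta * self.param
            self.train_mode = True


# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
opt = ToyScheduleFreeSGD(init=0.0)
z_history = []
for _ in range(20):
    grad = 2.0 * (opt.param - 3.0)   # gradient at y (train mode)
    opt.step(grad)
    z_history.append(opt.z)

y_train = opt.param   # weight used for gradients
opt.eval()
x_eval = opt.param    # weight that should be validated
opt.train()           # restore y before resuming training
```

Without the `eval()` call, validation would run on `y_train` instead of `x_eval`, which is a different set of weights.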

Any help would be greatly appreciated! Thank you in advance!

adefazio · 2024-04-11T20:48:00Z

I use the closure form of the algorithm to avoid needing the eval() call. Another user has requested support for this sort of eval() mode so that exponential weight averaging can be implemented. The organizers said they will look at adding it in the future. It should help with performance.

netw0rkf10w · 2024-04-12T15:38:35Z

Thanks a lot, @adefazio. Nice trick!

As I understand it, you basically swap the extrapolated points at every step, is that correct? That seems to put your algorithm at a disadvantage, though. Have you observed a considerable slowdown compared to the .eval() version?

And I'm quite surprised that the proposed feature hasn't been added to support your method. It's just two lines of code (actually we only need one). Or maybe I'm missing something?

Would love to hear your opinion on this, @priyakasimbeg. Thanks.

adefazio · 2024-04-12T16:10:06Z

The overhead is minimal; parameter copying is very fast.
The issue was raised very close to the deadline, so given the low overhead of the copy, it was decided that changing the competition API at that point was not warranted.
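
The closure-style swap discussed in this exchange might look roughly like the following pure-Python sketch. It assumes the stored weight is the averaged point x between steps and is temporarily replaced by the interpolated point y while the closure computes the gradient. `ToyScheduleFreeSGDClosure` is a hypothetical name, and for brevity the closure returns a gradient instead of a loss.

```python
class ToyScheduleFreeSGDClosure:
    """Toy scalar closure-form Schedule-Free SGD (hypothetical sketch).

    `param` always holds the averaged point x between steps, so no
    separate eval() call is needed before validation.
    """

    def __init__(self, init, lr=0.1, beta=0.9):
        self.param = init   # averaged iterate x
        self.z = init       # base SGD iterate
        self.lr = lr
        self.beta = beta
        self.t = 0

    def step(self, closure):
        """`closure()` returns the gradient at the current `param`."""
        x = self.param
        # Swap in y so the closure evaluates the gradient there.
        self.param = (1.0 - self.beta) * self.z + self.beta * x
        grad = closure()
        self.t += 1
        self.z = self.z - self.lr * grad      # plain SGD step on z
        c = 1.0 / self.t                      # uniform averaging weight
        # Swap the updated average x back in as the stored weight.
        self.param = (1.0 - c) * x + c * self.z


# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
opt = ToyScheduleFreeSGDClosure(init=0.0)
z_history = []
grad_points = []   # points at which the gradient was actually evaluated


def closure():
    grad_points.append(opt.param)
    return 2.0 * (opt.param - 3.0)


for _ in range(20):
    opt.step(closure)
    z_history.append(opt.z)
```

Because the stored weight is already x after every step, validation can run at any time without a mode switch, at the cost of an extra parameter write per step.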

priyakasimbeg · 2024-09-03T17:51:24Z

Hi, we're planning to discuss this feature request on Thursday, 9/5, during the WG meeting.

priyakasimbeg · 2024-10-30T22:07:35Z

I believe this is partially addressed in #798 (decouple BN statistic updates from using running statistics) and will be fully addressed once pr/789 for #719 has been submitted. So I am deduping this. Please feel free to reopen if I have missed anything.

netw0rkf10w mentioned this issue Apr 18, 2024

Add support for .eval() #759

Closed

adefazio mentioned this issue May 30, 2024

Control over batch-norm running_mean/var buffers #767

Closed

Niccolo-Ajroldi mentioned this issue Sep 15, 2024

Introduce prepare for eval, fix evaluation bug #789

Open

priyakasimbeg closed this as completed Oct 30, 2024
