Support for optimizers that require an .eval() step #758
Comments
I use the closure form of the algorithm to avoid needing to do the eval() call. There has been a request from another user to support this sort of eval() mode so that exponential weight averaging can be implemented. The organizers said that this is something they will look at adding in the future. It should help with performance.
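For concreteness, here is a minimal sketch of the closure-based pattern, assuming the published schedulefree package and its AdamWScheduleFreeClosure class (class and argument names may differ across versions; this is an illustration, not the exact submission code):

```python
import torch
import torch.nn.functional as F
import schedulefree  # assumed: pip install schedulefree

model = torch.nn.Linear(10, 1)
# The closure variant handles the switching between training and evaluation
# points internally, so no explicit optimizer.train()/optimizer.eval() calls
# are needed around validation.
optimizer = schedulefree.AdamWScheduleFreeClosure(model.parameters(), lr=1e-3)

def training_step(inputs, targets):
    def closure():
        optimizer.zero_grad()
        loss = F.mse_loss(model(inputs), targets)
        loss.backward()
        return loss
    # step() re-evaluates the closure at whatever point the optimizer needs.
    optimizer.step(closure)

# Toy usage with random data, just to show the call pattern.
training_step(torch.randn(8, 10), torch.randn(8, 1))
```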
Thanks a lot, @adefazio. Nice trick! As I understand it, you basically swap in the extrapolated points at every step, is that correct? That seems to put your algorithm at a disadvantage, though. Have you observed a considerable slowdown compared to the eval()-based variant? I'm also quite surprised that the proposed feature hasn't been added to support your method. It's just two lines of code (actually we only need one). Or maybe I'm missing something? Would love to hear your opinion on this, @priyakasimbeg. Thanks.
The overhead is minimal; parameter copying is very fast.
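To illustrate why the overhead is small, a generic parameter swap amounts to a few in-place copies per tensor. The helper below is hypothetical and only illustrative; the actual bookkeeping lives inside the optimizer:

```python
import torch

@torch.no_grad()
def swap_params(model, stash):
    """Exchange the model's current parameters with the tensors in `stash`, in place."""
    for p, saved in zip(model.parameters(), stash):
        tmp = p.detach().clone()
        p.copy_(saved)
        saved.copy_(tmp)

# Usage sketch: `eval_params` would hold the averaged/extrapolated weights.
# swap_params(model, eval_params)   # switch to evaluation weights
# ... run validation ...
# swap_params(model, eval_params)   # switch back to training weights
```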
Hi, we're planning to discuss this feature request during the WG meeting on Thursday 9/5.
Description
First of all, if this feature is already supported, then please consider this a question.
I'm trying to reproduce some results of existing algorithms such as SGD, AdamW, and also the recently proposed ScheduleFree. For the latter in particular, there is an .eval() step that needs to be done before the validation phase, and I haven't figured out how to do that properly (there doesn't seem to be any indication in their submission code). Any help would be greatly appreciated! Thank you in advance!
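For reference, the pattern I understand the ScheduleFree authors describe is switching the optimizer itself between modes around validation. Below is a sketch of that pattern in a plain training loop, assuming the schedulefree package's non-closure AdamWScheduleFree class; how to wire this into the benchmark's submission API is exactly what this issue is asking about:

```python
import torch
import torch.nn.functional as F
import schedulefree  # assumed: pip install schedulefree

model = torch.nn.Linear(10, 1)
optimizer = schedulefree.AdamWScheduleFree(model.parameters(), lr=1e-3)
# Toy data so the sketch runs end to end.
data = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(5)]

for epoch in range(3):
    model.train()
    optimizer.train()        # switch the optimizer's parameters into training mode
    for inputs, targets in data:
        optimizer.zero_grad()
        F.mse_loss(model(inputs), targets).backward()
        optimizer.step()

    model.eval()
    optimizer.eval()         # swap in the evaluation (averaged) weights before validating
    with torch.no_grad():
        val_loss = sum(F.mse_loss(model(x), y) for x, y in data) / len(data)
```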