This repository has been archived by the owner on Oct 31, 2023. It is now read-only.
I believe that higher could be used for hyperparameter optimization using bilevel programming. I have attempted to adapt the given meta-learning example for bilevel programming. However, I am somewhat unsure as to whether I have done it correctly. Here is a general structure of what I have done:
# Get optimizers
inner_optim = torch.optim.Adam(params=model.parameters(), lr=args.learning_rate)
outer_optim = torch.optim.Adam(params=hp.parameters(), lr=args.learning_rate)

# Training loop
num_inner_iter = args.inner_loop
for epoch in range(args.epochs):
    outer_optim.zero_grad()
    with higher.innerloop_ctx(
        model=model,
        opt=inner_optim,
        copy_initial_weights=False,
        track_higher_grads=False,
    ) as (fmodel, diffopt):
        for _ in range(num_inner_iter):
            # Forward pass
            train_out = fmodel(transformed_features, hp)
            train_loss = custom_loss(predicted=train_out, actual=train_labels)
            diffopt.step(train_loss)
        val_out = fmodel(transformed_features_val, hp)
        val_loss = custom_loss(predicted=val_out, actual=val_labels)
        val_loss.backward()
    outer_optim.step()
Does the above look correct? Or am I misunderstanding something?
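For reference, here is a minimal sketch of the bilevel structure in plain Python, with no higher or torch involved. Everything in it is a toy assumption made up for illustration: a scalar weight `w`, a scalar hyperparameter `lam` acting as an L2 penalty, and the targets `A` and `B`. The point it tries to show is that the hypergradient of the validation loss with respect to `lam` only carries a training signal if the derivative is propagated through every inner update. In higher, that propagation is what `track_higher_grads=True` enables, whereas the snippet above sets it to `False`.

```python
# Toy bilevel problem (all names and values are illustrative assumptions):
#   inner: minimize (w - A)^2 + lam * w^2 over w by gradient descent
#   outer: minimize (w_T - B)^2 over lam, where w_T is the unrolled result
A = 2.0  # training target
B = 1.0  # validation target


def unroll(lam, lr=0.1, steps=20, track=True):
    """Run the inner loop from w = 0 and return (w_T, d w_T / d lam)."""
    w, dw = 0.0, 0.0
    for _ in range(steps):
        grad_w = 2.0 * (w - A) + 2.0 * lam * w
        if track:
            # Chain rule through the update: d(grad_w)/d(lam),
            # evaluated at the current w before it is updated.
            dgrad = 2.0 * dw + 2.0 * w + 2.0 * lam * dw
            dw = dw - lr * dgrad
        w = w - lr * grad_w
    return w, dw


def val_loss(w):
    return (w - B) ** 2


def hypergrad(lam, track=True):
    """d val_loss / d lam via the unrolled inner loop."""
    w, dw = unroll(lam, track=track)
    return 2.0 * (w - B) * dw


# Outer loop: plain gradient descent on the hyperparameter.
lam = 0.1
for _ in range(50):
    lam -= 0.1 * hypergrad(lam)
```

With `track=False` the sensitivity `dw` stays at zero, so `hypergrad` returns zero and `lam` would never move. In this toy the validation loss does not use `lam` directly; in the snippet above `hp` is also passed to the validation forward pass, so some gradient would still reach it, but not the part that flows through the inner training steps.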
I am working on something related to two-level optimization, and Higher could help simplify the code. I would like to see some examples of meta-optimizers using Higher.