
Using higher for hyperparameter optimization #135

aruniyer opened this issue Dec 21, 2022 · 1 comment

@aruniyer

Hi,

I believe that higher could be used for hyperparameter optimization via bilevel programming. I have attempted to adapt the given meta-learning example to this setting, but I am somewhat unsure whether I have done it correctly. Here is the general structure of what I have done:

import torch
import higher

# Get optimizers: inner for the model parameters, outer for the hyperparameters
inner_optim = torch.optim.Adam(params=model.parameters(), lr=args.learning_rate)
outer_optim = torch.optim.Adam(params=hp.parameters(), lr=args.learning_rate)

# Training loop
num_inner_iter = args.inner_loop
for epoch in range(args.epochs):
    outer_optim.zero_grad()
    with higher.innerloop_ctx(
        model=model,
        opt=inner_optim,
        copy_initial_weights=False,
        track_higher_grads=False,
    ) as (fmodel, diffopt):
        for _ in range(num_inner_iter):
            # Forward pass
            train_out = fmodel(transformed_features, hp)
            train_loss = custom_loss(predicted=train_out, actual=train_labels)
            diffopt.step(train_loss)

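        # Outer objective: validation loss computed with the adapted fmodel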
        val_out = fmodel(transformed_features_val, hp)
        val_loss = custom_loss(predicted=val_out, actual=val_labels)
        val_loss.backward()
    outer_optim.step()

Does the above look correct? Or am I misunderstanding something?
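
For reference, model, hp, and custom_loss in the snippet above are simplified placeholders for my own code; roughly, the shapes I have in mind are something like this:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in model whose forward pass also consumes the hyperparameter module
class Model(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, hp):
        # hp() returns the current (learnable) hyperparameter value, e.g. a scale
        return self.linear(x) * hp()

# Stand-in hyperparameter container, so that hp.parameters() works with Adam
class HyperParams(nn.Module):
    def __init__(self):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(1))

    def forward(self):
        return self.log_scale.exp()

def custom_loss(predicted, actual):
    # Plain MSE as a stand-in for the actual loss
    return F.mse_loss(predicted, actual)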

@NoraAl

NoraAl commented Dec 22, 2022

I am working on something related to two-level (bilevel) optimization, and higher could help simplify the code. I would like to see some examples of meta-optimizers using higher.
