
more iteration than setting #65

Open
wt12318 opened this issue May 30, 2022 · 6 comments

Comments


wt12318 commented May 30, 2022

Hi,

When I set num_iteration to 50, the actual number of iterations run is more than 50:

from mango import Tuner

config = dict()
config["optimizer"] = "Bayesian"
config["num_iteration"] = 50

tuner = Tuner(HYPERPARAMETERS,
              objective=run_one_training,
              conf_dict=config)
results = tuner.minimize()

The MLflow UI shows that it has run 62 iterations:
[screenshot: MLflow run list showing 62 runs]
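
To double-check the count, a minimal sketch that lists the logged runs (assuming each objective call opens its own mlflow.start_run() and the runs above were logged to the active MLflow experiment with the default tracking setup):

import mlflow

# One objective evaluation corresponds to one MLflow run here;
# search_runs() returns a pandas DataFrame with one row per run.
runs = mlflow.search_runs()
print(len(runs))  # 62 in the screenshot above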

@sandeep-iitr
Collaborator

Hi,
Thanks for asking this question.

Internally, Mango runs a few random iterations to properly initialize the optimizer.
By default, the number of these random iterations is 2.
You can change this with the config parameter 'initial_random'.
So, in most cases, your total number of iterations will be num_iteration + initial_random.

However, this parameter is only a suggestion to the optimizer, and in some cases it may run more random iterations to get a proper initialization. This happens for problems where the variation in the objective value is very small; Mango may then internally decide to run more random iterations to make sure it finds good regions of the hyperparameter space. For most problems, setting initial_random bounds the total number of iterations as expected.

This can also happen when some of the random iterations do not succeed and your objective function handles those failures; Mango then runs additional random iterations to make sure the requested number of random iterations (2 by default) succeed.
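
For example, a minimal sketch of bounding the expected total number of evaluations (reusing the HYPERPARAMETERS and run_one_training from the snippet above; the expected total is num_iteration + initial_random):

from mango import Tuner

config = dict()
config["optimizer"] = "Bayesian"
config["num_iteration"] = 50
config["initial_random"] = 2   # default value; expected total evaluations: 50 + 2 = 52

tuner = Tuner(HYPERPARAMETERS,
              objective=run_one_training,
              conf_dict=config)
results = tuner.minimize()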


wt12318 commented May 30, 2022

Thank you

wt12318 closed this as completed May 30, 2022
wt12318 reopened this Jun 9, 2022

wt12318 commented Jun 9, 2022

Hi,

I set initial_random to one, but it still runs more iterations than I set. Also, the total number of combinations of all my parameters is 36, yet it runs more than 36 iterations. Why does this happen?

Thank you.

@sandeep-iitr
Copy link
Collaborator

Can you share more details about your parameter space and the definition of your objective function?


wt12318 commented Jun 10, 2022

Thank you for your reply. Here are my objective function and parameter space:

@scheduler.parallel(n_jobs=36)
def run_one_training(**params):
    with mlflow.start_run() as run:
        # Log parameters used in this experiment
        for key in params.keys():
            mlflow.log_param(key, params[key])

        # Loading the dataset
        print("Loading dataset...")
        train_dataset = TCRpMHCDataset(root="/public/slst/home/wutao2/TCR_neo/data/", filename="train_dt.csv",aaindex=aaindex, test=False, val=False)
        test_dataset = TCRpMHCDataset(root="/public/slst/home/wutao2/TCR_neo/data/", filename="val_dt.csv", aaindex=aaindex, test=False, val=True)

        # Prepare training
        train_loader = DataLoader(train_dataset, batch_size=params["batch_size"], shuffle=True)
        test_loader = DataLoader(test_dataset, batch_size=params["batch_size"], shuffle=True)

        # Loading the model
        print("Loading model...")
        model_params = {k: v for k, v in params.items() if k.startswith("model_")}
        model = GNN(feature_size=train_dataset[0].x.shape[1], model_params=model_params) 
        model = model.to(device)
        print(f"Number of parameters: {count_parameters(model)}")
        mlflow.log_param("num_params", count_parameters(model))

        # BCEWithLogitsLoss; a pos_weight < 1 would favor precision, > 1 recall (not set here)
        loss_fn = torch.nn.BCEWithLogitsLoss()
        optimizer = torch.optim.Adam(model.parameters(), 
                                    lr=params["learning_rate"],
                                    weight_decay=0)
        #scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=params["scheduler_gamma"])
        
        # Start training
        best_loss = 1000
        early_stopping_counter = 0
        for epoch in range(20): 
            if early_stopping_counter <= 5:  # stop after 5 epochs without improvement
                # Training
                model.train()
                loss = train_one_epoch(epoch, model, train_loader, optimizer, loss_fn)
                print(f"Epoch {epoch} | Train Loss {loss}")
                mlflow.log_metric(key="Train loss", value=float(loss), step=epoch)

                # Testing
                model.eval()
                if epoch % 1 == 0:
                    loss = test(epoch, model, test_loader, loss_fn)
                    print(f"Epoch {epoch} | Test Loss {loss}")
                    mlflow.log_metric(key="Test loss", value=float(loss), step=epoch)
                    
                    # Update best loss
                    if float(loss) < best_loss:
                        best_loss = loss
                        # Save the currently best model 
                        mlflow.pytorch.log_model(model, "model", signature=SIGNATURE)
                        
                        early_stopping_counter = 0
                    else:
                        early_stopping_counter += 1

            else:
                print("Early stopping due to no improvement.")
                return [best_loss]
    print(f"Finishing training with best test loss: {best_loss}")
    return [best_loss]

HYPERPARAMETERS = {
    "batch_size": [32,64,128],
    "learning_rate": [0.001,0.0001],
    "model_embedding_size": [32,64,128],
    "model_layers": [2,3],
    "model_dropout_rate": [0.5]
}

torch.set_num_threads(36)
torch.manual_seed(2022060801)
print("Running hyperparameter search...")
config = dict()
config["optimizer"] = "Bayesian"
config["num_iteration"] = 36
config["initial_random"] = 1

tuner = Tuner(HYPERPARAMETERS, 
              run_one_training,
              config) 
results = tuner.minimize()

[screenshot: MLflow run list]
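
For reference, a minimal sketch (copying the HYPERPARAMETERS dict above) that counts the number of discrete combinations in this search space:

import math

HYPERPARAMETERS = {
    "batch_size": [32, 64, 128],
    "learning_rate": [0.001, 0.0001],
    "model_embedding_size": [32, 64, 128],
    "model_layers": [2, 3],
    "model_dropout_rate": [0.5],
}

# 3 * 2 * 3 * 2 * 1 = 36 discrete combinations
n_combinations = math.prod(len(choices) for choices in HYPERPARAMETERS.values())
print(n_combinations)  # prints 36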

@sandeep-iitr
Collaborator

Hi,
Thanks for providing the details. I have been a little busy with an immediate deadline over the last few days.
I will work on reproducing this issue next week and will update you with a solution or more information.
