Description
When mixing callable functions and string aliases in the evaluation metrics and/or the objective function, the first_metric_only flag selects the first built-in metric if there is one, and only then considers the callable functions as metrics.
This comes from the inner_eval function, which starts with the built-in metrics if there are any.
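To illustrate the ordering, here is a simplified sketch of the resulting evaluation list (not LightGBM's actual implementation; the scores are placeholder values):

# Simplified sketch of the ordering described above -- NOT LightGBM's
# actual code. Results from built-in metrics come first, results from
# callable metrics after, so with first_metric_only=True a built-in
# metric always ends up as the "first" metric.
builtin_results = [("validation", "l2", 0.42, False)]
callable_results = [
    ("validation", "neg_correlation", -0.99, False),
    ("validation", "custom_l2", 0.42, False),
]
evaluation_result_list = builtin_results + callable_results
print(evaluation_result_list[0][1])  # 'l2', even though the callables were listed first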
Selecting a custom metric for early stopping therefore required adjustments to make it work.
The built-in metric can come from the LGBMRegressor instantiation, from a built-in objective function, or from a string in the eval_metric list passed to the fit method.
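For reference, a minimal sketch (assuming the scikit-learn API of lightgbm 4.x) of those three entry points:

import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=5, random_state=1)

# 1. a built-in metric set at instantiation:
m1 = lgb.LGBMRegressor(metric="l2")
# 2. a built-in objective, which brings its default metric along:
m2 = lgb.LGBMRegressor(objective="l2")
# 3. a string alias in the eval_metric list passed to fit:
m3 = lgb.LGBMRegressor()
m3.fit(X, y, eval_set=[(X, y)], eval_metric=["l2"])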
To have something that is easier to control, people should use either:
- an objective and metrics that all come from strings, or
- an objective and metrics that all come from callables,
as sketched below.
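Here is a sketch of those two set-ups (the l2_obj and l2_eval helpers below are illustrative names, not LightGBM API):

import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=5, random_state=1)

# everything from strings:
m_str = lgb.LGBMRegressor(objective="l2")
m_str.fit(X, y, eval_set=[(X, y)], eval_metric=["l2", "l1"])

# everything from callables:
def l2_obj(y_true, y_pred):
    return 2 * (y_pred - y_true), 2 * np.ones_like(y_pred)

def l2_eval(y_true, y_pred):
    return "custom_l2", float(np.mean((y_pred - y_true) ** 2)), False

m_call = lgb.LGBMRegressor(objective=l2_obj, metric=None)
m_call.fit(X, y, eval_set=[(X, y)], eval_metric=[l2_eval])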
The Booster API handles some of the cases mentioned above, but not all, and the bug arises there as well.
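For example, a rough Booster-API equivalent (a sketch assuming lightgbm 4.x, where a callable objective goes through params and custom metrics through feval; l2_fobj and mse_feval are illustrative helpers):

import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=10000, n_features=20, random_state=1)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=1)
train_set = lgb.Dataset(X_train, label=y_train)
val_set = lgb.Dataset(X_val, label=y_val, reference=train_set)

def l2_fobj(preds, train_data):
    # custom objective, Booster-API signature
    y_true = train_data.get_label()
    return 2 * (preds - y_true), 2 * np.ones_like(preds)

def mse_feval(preds, eval_data):
    # custom metric, Booster-API signature
    y_true = eval_data.get_label()
    return "custom_l2", float(np.mean((preds - y_true) ** 2)), False

booster = lgb.train(
    {"objective": l2_fobj, "metric": "None"},
    train_set,
    valid_sets=[val_set],
    valid_names=["validation"],
    feval=[mse_feval],
    callbacks=[lgb.early_stopping(stopping_rounds=5, first_metric_only=True)],
)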
Reproducible example

Here is the adjusted code I needed to make the example work (L2 loss with early stopping on a custom metric):
import numpy as np
import lightgbm as lgb
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split


def neg_correlation(y_true, y_pred):
    # custom eval metric: negative Pearson correlation (lower is better)
    is_higher_better = False
    return "neg_correlation", -np.corrcoef(y_pred, y_true)[1, 0], is_higher_better


def l2_loss(y_true, y_pred):
    # custom objective: gradient and hessian of the squared error
    grad = 2 * (y_pred - y_true)
    hess = 2 * np.ones_like(y_pred)
    return grad, hess


def mse_metric(y_true, y_pred):
    # custom eval metric mirroring the built-in l2 metric
    is_higher_better = False
    return "custom_l2", np.mean((y_pred - y_true) ** 2), is_higher_better


SEED = 1
X, y = make_regression(
    n_samples=10000, n_features=20, n_informative=2, random_state=SEED
)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=SEED)

params = {
    "early_stopping_round": 5,
    "first_metric_only": True,
    "random_state": SEED,
    "objective": l2_loss,  # comment this line or use "l2" to observe the switch to l2 as first metric
    "metric": None,  # comment this line to observe the switch to l2 as first metric
}

model = lgb.LGBMRegressor(**params)
model.fit(
    X_train,
    y_train,
    eval_set=[(X_val, y_val)],
    eval_names=["validation"],
    eval_metric=[
        neg_correlation,
        mse_metric,
        # "l2",  # uncomment this line to observe the switch to l2 as first metric
    ],
)
Here is a screenshot of the two scripts with the terminal output at the bottom:
Environment info

requirements.txt:

lightgbm==4.3.0
scikit-learn==1.4.2

Additional Comments

I opened a PR (#6424) to propose an additional way of handling the early-stopping callback that would prevent this kind of behaviour.
Since the built-in metrics can come from anywhere, and it is hard to know which one is the first metric when the inner_eval function is called, I suggest adding an argument giving the name of the metric the early-stopping callback should use.
This approach seems to be the one that minimizes changes to the code base.
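As a minimal sketch of the idea (not the actual implementation in #6424; early_stopping_on is a hypothetical helper), a callback could track only an explicitly named metric:

import lightgbm as lgb

def early_stopping_on(metric_name, stopping_rounds):
    # Hypothetical sketch, not PR #6424: early-stop on the metric with
    # the given name, wherever it sits in the evaluation list.
    best = {"score": None, "iteration": 0}

    def _callback(env):
        for _data_name, name, score, is_higher_better in env.evaluation_result_list:
            if name != metric_name:
                continue
            improved = best["score"] is None or (
                score > best["score"] if is_higher_better else score < best["score"]
            )
            if improved:
                best["score"], best["iteration"] = score, env.iteration
            elif env.iteration - best["iteration"] >= stopping_rounds:
                raise lgb.callback.EarlyStopException(
                    best["iteration"], env.evaluation_result_list
                )

    return _callback

It would be used as model.fit(..., callbacks=[early_stopping_on("custom_l2", 5)]).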
Related issue: #6223