-
The key reason is that user-item interactions from the validation set are not ranked during testing. For example, suppose that for one user the model ranks the candidate items as follows:

a1, b1, a2, b2, a3, b3, a4, b4, a5, b5

where a1–a5 are the user's validation ground-truth items and b1–b5 are the test ground-truth items. Because validation ground-truth items are excluded from the candidate list when calculating test results, the top-5 lists for validation and test differ. Validation: a1, b1, a2, b2, a3 (the ground-truth items a1, a2, a3 sit at ranks 1, 3, and 5). Test: b1, b2, b3, b4, b5 (with the a-items removed, the test ground-truth items move up and fill all five slots). So the test metrics tend to look better than the validation metrics.
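A minimal sketch of this effect (the item names and the `recall_at_k` helper are illustrative, not from any specific library): masking validation ground-truth items out of the ranking before scoring the test set pushes the test items up, so the same model scores higher on test than on validation.

```python
# Full ranking produced by a hypothetical model for one user.
full_ranking = ["a1", "b1", "a2", "b2", "a3", "b3", "a4", "b4", "a5", "b5"]
val_truth = {"a1", "a2", "a3", "a4", "a5"}   # validation ground truth
test_truth = {"b1", "b2", "b3", "b4", "b5"}  # test ground truth

def recall_at_k(ranking, truth, k=5, exclude=frozenset()):
    """Recall@k over a ranking, optionally masking out excluded items first."""
    visible = [item for item in ranking if item not in exclude]
    hits = sum(1 for item in visible[:k] if item in truth)
    return hits / len(truth)

# Validation: score against the full ranking (top-5 = a1, b1, a2, b2, a3).
print(recall_at_k(full_ranking, val_truth))                       # 0.6

# Test: validation ground truth is excluded, so top-5 = b1..b5.
print(recall_at_k(full_ranking, test_truth, exclude=val_truth))   # 1.0
```

The model is identical in both calls; only the masking of validation items changes, which is enough to make the test recall look systematically better.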
-
I see this explanation on the docs site, but I don't understand it. Since we're using the validation set to decide early stopping, shouldn't the validation results be, on average, slightly better than the test results?
In my observations I don't think I've ever seen the validation results come out better; the test results appear to be systematically better.