Question about cdf-xy.pt #1

Open
jiazhi5 opened this issue Oct 20, 2024 · 4 comments

Comments

jiazhi5 commented Oct 20, 2024

Hello,

Thank you for this nice work. I have a few questions. What is cdf-xy.pt for? I am wondering whether I can directly use the output of dreamsim_l2.measure(images) as the W1KP score. Thank you.

Member

daemon commented Oct 21, 2024

Thanks for your interest. It is strongly advised to use cdf-xy.pt to normalize the scores, but it isn't strictly necessary.

Author

jiazhi5 commented Oct 26, 2024

Thank you for your explanation. I am curious about how those values are derived. Additionally, to obtain a reasonable k-expected maximum, how many images (N) should be used? For instance, when k = 5 and N = 5, the score appears quite noisy. I remember that in Figure 5 of the paper, the k-expected maximum was calculated for k ranging from 2 to 300 with N = 300. Is that correct? If so, is there any guideline or relationship suggesting that N should be, for example, at least ten times k to achieve a reasonable score?

Member

daemon commented Oct 28, 2024

These were derived as the empirical CDF (ECDF) of real score values computed on DiffusionDB prompts, i.e., we generated many image sets and sorted their scores.
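For readers who want to see what ECDF normalization looks like in practice, here is a minimal sketch. It does not load cdf-xy.pt, whose internal layout is not described in this thread; the function names `build_ecdf` and `normalize`, and the toy reference scores, are hypothetical and only illustrate the general technique of sorting reference scores and mapping a raw score to its percentile.

```python
import numpy as np


def build_ecdf(reference_scores):
    """Return (xs, ys): sorted reference scores and their cumulative fractions.

    Hypothetical helper; the real cdf-xy.pt may store its table differently.
    """
    xs = np.sort(np.asarray(reference_scores, dtype=float))
    ys = np.arange(1, xs.size + 1) / xs.size
    return xs, ys


def normalize(raw_score, xs, ys):
    """Map a raw score into [0, 1] by linear interpolation on the ECDF table."""
    # Scores below the smallest reference value map to 0, above the largest to 1.
    return np.interp(raw_score, xs, ys, left=0.0, right=1.0)


# Toy reference scores standing in for scores measured on many image sets.
xs, ys = build_ecdf([0.3, 0.1, 0.2, 0.4])
print(normalize(0.25, xs, ys))
```

The normalized value reads as "the fraction of reference image sets with a score at or below this one", which is easier to compare across prompts than a raw distance.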

> to obtain a reasonable k-expected maximum, how many images (N) should be used?

Good question. If N = 5 and k = 5, then it is effectively taking the max pairwise score, which is technically unbiased but, as you've noticed, very high variance. The underlying theory is the subsampling bootstrap, so you would want a small-ish k and a much larger N. In the paper, we fixed N to 300 and varied k to compute that estimate. For k = 5, it's probably okay for N to be 50-100 (ten to twenty times k).
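The subsampling idea can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes the k-expected maximum is the average, over random size-k subsets of the N images, of the largest pairwise distance within each subset. The function name `k_expected_max`, the Monte Carlo draw count, and the synthetic distance matrix are all hypothetical.

```python
import numpy as np


def k_expected_max(dist, k, n_draws=2000, seed=0):
    """Monte Carlo estimate of the expected max pairwise distance among k of N items.

    dist: symmetric (N, N) matrix of pairwise distances (assumed precomputed).
    """
    n = dist.shape[0]
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= N")
    rng = np.random.default_rng(seed)
    upper = np.triu_indices(k, 1)  # index pairs above the diagonal
    total = 0.0
    for _ in range(n_draws):
        idx = rng.choice(n, size=k, replace=False)  # one subsample of size k
        sub = dist[np.ix_(idx, idx)]
        total += sub[upper].max()
    return total / n_draws


# Synthetic stand-in for N = 100 image embeddings (not real W1KP data).
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 8))
dist = np.linalg.norm(points[:, None] - points[None, :], axis=-1)

print(k_expected_max(dist, k=5))          # averages over many subsets
print(k_expected_max(dist[:5, :5], k=5))  # N = k: only one subset exists
```

With N = k there is a single possible subset, so the estimate collapses to one maximum and nothing is averaged, which is why it looks noisy. With N much larger than k, many distinct subsets contribute to the average.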

Author

jiazhi5 commented Oct 28, 2024

Thank you so much for the explanation!
