Face Distortion #52

Open
zrealli opened this issue Nov 27, 2024 · 11 comments
Labels
Answered Answered the question

Comments

@zrealli

zrealli commented Nov 27, 2024

Hi,
When the face region is relatively small, it tends to become distorted. Is this due to the high compression ratio of DC-AE? Is there a solution to this problem? Would using a 2K image work better?
Thank you very much for your wonderful work!!
[Image: A festive image of a large family gathered around a holiday dinner table, their faces lit with the w...]
[Image: A candid snapshot of travelers boarding a train at a busy station, with luggage in hand and exciteme...]

@lawrence-cj
Collaborator

Face distortion happens when the face is relatively small in the image. This problem will be mitigated in Sana-1.5 with DC-AE 1.5 later this year.
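
For a rough sense of scale, here is a back-of-the-envelope sketch that counts how many latent positions a face spans per side. The 32x factor is the compression figure mentioned later in this thread; the 8x comparison point, the face size (1/16 of the image width), and the function names are assumptions for illustration, not measurements from Sana or DC-AE.

```python
def face_px_at_resolution(face_fraction: float, image_px: int) -> float:
    """Side length in pixels of a face that spans `face_fraction` of the image width."""
    return face_fraction * image_px


def latent_cells_per_side(face_px: float, downsample: int) -> float:
    """Latent positions covered by one side of a square face region,
    given the autoencoder's spatial downsampling factor."""
    return face_px / downsample


if __name__ == "__main__":
    # Hypothetical scenario: a face that is 1/16 of the image width.
    for image_px in (1024, 2048):
        face_px = face_px_at_resolution(1 / 16, image_px)
        for factor in (8, 32):
            cells = latent_cells_per_side(face_px, factor)
            print(f"{image_px}px image, {face_px:.0f}px face, "
                  f"{factor}x downsampling: {cells:.1f} latent cells per side")
```

Under these assumed numbers, a 64-pixel face maps to roughly a 2x2 latent patch at 32x downsampling, and moving to a 2K image only raises that to 4x4. That is consistent with higher resolution helping without fully solving the problem.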

lawrence-cj added the "Answered" label (Answered the question) on Nov 27, 2024
@zrealli
Author

zrealli commented Nov 29, 2024

> Face distortion happens when the face is relatively small in the image. This problem will be mitigated in Sana-1.5 with DC-AE 1.5 later this year.

Thanks for your response! Could it be addressed by training on a facial dataset at 2K resolution?

@lawrence-cj
Collaborator

> Thanks for your response! Could it be addressed by training on a facial dataset at 2K resolution?

It will help if you try to fine-tune on some facial datasets with smaller faces.

@Deng-Xian-Sheng

Facial datasets usually contain large faces.
Should we fine-tune on a dataset containing only faces, or on a dataset with very small faces?
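
One way to act on the suggestion of fine-tuning on smaller faces is to filter an existing annotated dataset by how much of the frame each face occupies. The following is a minimal sketch, assuming each sample already comes with face bounding boxes; the `Sample` structure, its field names, and the 2% area threshold are hypothetical and not taken from the Sana repository.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


@dataclass
class Sample:
    path: str
    width: int
    height: int
    face_boxes: List[Box]  # assumed to come from some face detector


def face_area_fraction(sample: Sample, box: Box) -> float:
    """Fraction of the image area covered by one face box."""
    x0, y0, x1, y1 = box
    face_area = max(0, x1 - x0) * max(0, y1 - y0)
    return face_area / (sample.width * sample.height)


def has_small_face(sample: Sample, max_fraction: float = 0.02) -> bool:
    """True if at least one non-empty face box covers at most `max_fraction` of the image."""
    return any(
        0 < face_area_fraction(sample, box) <= max_fraction
        for box in sample.face_boxes
    )


def select_small_face_samples(samples: List[Sample], max_fraction: float = 0.02) -> List[Sample]:
    """Keep only samples that contain at least one small face."""
    return [s for s in samples if has_small_face(s, max_fraction)]


if __name__ == "__main__":
    portrait = Sample("portrait.jpg", 1024, 1024, [(256, 256, 768, 768)])
    crowd = Sample("crowd.jpg", 1024, 1024, [(100, 100, 164, 164), (500, 300, 580, 380)])
    print([s.path for s in select_small_face_samples([portrait, crowd])])
```

A filter like this keeps group and street scenes while dropping close-up portraits, which matches the "smaller faces" reading of the advice; the right threshold would need to be tuned on real data.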

@zrealli
Author

zrealli commented Dec 4, 2024

> > Thanks for your response! Could it be addressed by training on a facial dataset at 2K resolution?
>
> It will help if you try to fine-tune on some facial datasets with smaller faces.

I tried fine-tuning on a 200k+ facial dataset at 1.5K resolution, which resulted in a slight improvement, but it's still not a perfect solution. Looking forward to Sana-1.5 and the new DC-AE!
[Image: A candid snapshot of travelers boarding a train at a busy station, with luggage in hand and exciteme...]

@Deng-Xian-Sheng

Deng-Xian-Sheng commented Dec 4, 2024 via email

@Muinez

Muinez commented Dec 5, 2024

> Face distortion happens when the face is relatively small in the image. This problem will be mitigated in Sana-1.5 with DC-AE 1.5 later this year.

@lawrence-cj will DC-AE remain an AE or switch to a VAE?

@Deng-Xian-Sheng

Deng-Xian-Sheng commented Dec 5, 2024 via email

@Muinez

Muinez commented Dec 6, 2024

> Impossible to convert to VAE

Why?

@zideliu

zideliu commented Dec 9, 2024

> > Impossible to convert to VAE
>
> Why?

I think it's because the feature dimensions are different

@Deng-Xian-Sheng

> > Impossible to convert to VAE
>
> Why?

If you think about it, this is interesting. They went to so much trouble (and published a paper) to switch from VAE to AE and achieve 32x compression.

Now, if you ask them to go back to VAE, they will punch the screen when they hear this.

[Image]

It is estimated that training such a model costs 200 US dollars, in terms of renting computing power.

5 participants