Compress-Align

Compress & Align: Curating Image-Text Data with Human Knowledge

Compress-Align is the first general-purpose image-to-text human preference reward model. It is trained on a total of 10k pairs of expert comparisons and outperforms prevailing image-text scoring methods, such as CLIP-Score (by 30.3%) and BLIP-Score (by 33.5%), at capturing the nuances of human preference in image-text alignment.

If you find Compress-Align's open-source effort useful, please 🌟 the repository to encourage our continued development!

News

[2024.3.22] The code and data are coming soon.

Quick Start

Acknowledgement

We are very grateful that this work is supported by a gift from the TPU Research Cloud (TRC) program and the Google Cloud Research Credits program.

Citation

@article{zhang2023compress,
  title={Compress \& Align: Curating Image-Text Data with Human Knowledge},
  author={Lei Zhang and Fangxun Shu and Sucheng Ren and Bingchen Zhao and Hao Jiang and Cihang Xie},
  journal={arXiv preprint arXiv:2312.06726},
  year={2023}
}
