Compress-Align

📃 Paper • 🖼 Dataset • 🤗 HF Repo

Compress & Align: Curating Image-Text Data with Human Knowledge

Compress-Align is the first general-purpose image-to-text human preference reward model. It is trained on a total of 10k pairs of expert comparisons and outperforms prevailing image-text scoring methods, such as CLIP-Score (by 30.3%) and BLIP-Score (by 33.5%), at capturing the nuances of human preference on image-text alignment.
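For readers unfamiliar with reward models trained on pairwise comparisons, the following is a minimal sketch of how such comparisons are commonly used. This README does not describe the training objective or the evaluation protocol, so everything here is an assumption: the Bradley-Terry style loss is a common choice for preference models, not necessarily the one used by Compress-Align, and the function names `pairwise_preference_loss` and `preference_accuracy` are hypothetical.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry style loss, -log(sigmoid(s_pref - s_rej)).

    ASSUMPTION: a generic objective for pairwise comparisons, not confirmed
    to be the one used by Compress-Align.
    """
    margin = score_preferred - score_rejected
    # log(1 + exp(-margin)), written so that exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def preference_accuracy(pairs) -> float:
    """Fraction of pairs where the human-preferred text scores strictly higher.

    `pairs` is an iterable of (score_preferred, score_rejected) tuples.
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for s_pref, s_rej in pairs if s_pref > s_rej)
    return correct / len(pairs)
```

The loss shrinks as the model scores the expert-preferred text further above the rejected one, and the accuracy function is one plausible way to compare a scorer against human judgments on held-out pairs.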

If you find Compress-Align's open-source effort useful, please 🌟 the repo to encourage our future development!

News

[2024.3.22] The code and data are coming soon.

Quick Start

Acknowledgement

We are very grateful that this work is supported by a gift from the TPU Research Cloud (TRC) program and the Google Cloud Research Credits program.

Citation

@article{zhang2023compress,
  title={Compress \& Align: Curating Image-Text Data with Human Knowledge},
  author={Lei Zhang and Fangxun Shu and Sucheng Ren and Bingchen Zhao and Hao Jiang and Cihang Xie},
  journal={arXiv preprint arXiv:2312.06726},
  year={2023}
}