This repository includes the official implementation and dataset of our paper "Compress & Align: Curating Image-Text Data with Human Knowledge".

UCSC-VLAA/Compress-Align

Compress-Align

📃 Paper • 🖼 Dataset • 🤗 HF Repo

Compress & Align: Curating Image-Text Data with Human Knowledge

Compress-Align is the first general-purpose image-to-text human preference reward model. It is trained on a total of 10k pairs of expert comparisons and outperforms prevailing image-text scoring methods, such as CLIP-Score (by 30.3%) and BLIP-Score (by 33.5%), at capturing the nuances of human preference on image-text alignment.
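To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch of a Bradley-Terry style preference loss, a common choice for reward models trained on "A is preferred over B" labels. This is an illustration only: the README does not state which objective Compress-Align uses, and the function names `pairwise_preference_loss` and `pairwise_accuracy` are hypothetical and not part of this repository.

```python
import math


def pairwise_preference_loss(score_preferred: float, score_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(score_preferred - score_rejected)).

    Assumption: a reward model assigns one scalar score to each image-text
    pair, and annotators marked one caption as preferred over the other.
    """
    margin = score_preferred - score_rejected
    # -log(sigmoid(m)) equals softplus(-m); branch on the sign for stability.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(score_pairs) -> float:
    """Fraction of (preferred, rejected) score pairs that are ranked correctly."""
    score_pairs = list(score_pairs)
    correct = sum(1 for preferred, rejected in score_pairs if preferred > rejected)
    return correct / len(score_pairs)


if __name__ == "__main__":
    # Toy scores, made up for illustration; not results from the paper.
    toy_pairs = [(1.0, 0.0), (0.0, 1.0), (2.0, 1.0), (3.0, 0.0)]
    print(pairwise_accuracy(toy_pairs))
    print(pairwise_preference_loss(2.0, 0.0))
```

The loss shrinks as the preferred caption's score rises above the rejected one, so minimizing it pushes the model to rank pairs the way the annotators did. Agreement with held-out human comparisons, as in `pairwise_accuracy`, is one simple way such a scorer could be compared against others.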

If you find Compress-Align's open-source effort useful, please 🌟 the repository to encourage our future development!

News

[2024.3.22] The code and data are coming soon.

Quick Start

Acknowledgement

We are very grateful that this work is supported by a gift from the TPU Research Cloud (TRC) program and the Google Cloud Research Credits program.

Citation

@article{zhang2023compress,
  title={Compress \& Align: Curating Image-Text Data with Human Knowledge},
  author={Lei Zhang and Fangxun Shu and Sucheng Ren and Bingchen Zhao and Hao Jiang and Cihang Xie},
  journal={arXiv preprint arXiv:2312.06726},
  year={2023}
}
