
BrushEdit

πŸ˜ƒ This repository contains the implementation of "BrushEdit: All-In-One Image Inpainting and Editing".

Keywords: Image Inpainting, Image Generation, Image Editing, Diffusion Models, MLLM Agent, Instruction-based Editing

TL;DR: BrushEdit is an advanced, unified AI agent for image inpainting and editing.
Main Elements: πŸ› οΈ Fully automated / 🀠 Interactive editing.

Yaowei Li1*, Yuxuan Bian3*, Xuan Ju3*, Zhaoyang Zhang2‑, Junhao Zhuang4, Ying Shan2βœ‰, Yuexian Zou1βœ‰
, Qiang Xu3βœ‰
1Peking University 2ARC Lab, Tencent PCG 3The Chinese University of Hong Kong 4Tsinghua University
*Equal Contribution ‑Project Lead βœ‰Corresponding Author

🌐Project Page | πŸ“œarXiv | πŸ“ΉVideo | πŸ€—Hugging Face Demo | πŸ€—Hugging Face Model

[Demo video: 1214_BrushEdit_480_60FPS_release.mp4]

4K HD Introduction Video: YouTube.


TODO

  • Release the code of BrushEdit (an MLLM-driven agent for image editing and inpainting).
  • Release the paper and webpage. More info: BrushEdit
  • Release the BrushNetX checkpoint (a more powerful BrushNet).
  • Release the Gradio demo.

πŸ› οΈ Pipeline Overview

BrushEdit consists of four main steps: (i) editing category classification: determine the type of edit required; (ii) identification of the primary editing object: identify the main object to be edited; (iii) acquisition of the editing mask and target caption: generate the editing mask and the corresponding target caption; (iv) image inpainting: perform the actual edit. Steps (i) to (iii) use pre-trained MLLMs and detection models to determine the editing type, target object, editing mask, and target caption. Step (iv) performs the edit with an improved dual-branch inpainting model, BrushNetX, which inpaints the target areas based on the target caption and editing mask, leveraging the generative power and background-preservation capabilities of inpainting models.
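To make the flow concrete, here is a minimal pseudocode sketch of the agent loop described above. All function and object names are hypothetical placeholders for illustration, not the repo's actual API:

def brushedit_agent(image, instruction, mllm, detector, inpainter):
    # (i) Classify the editing category (e.g., add, remove, replace).
    category = mllm.classify_edit_type(image, instruction)
    # (ii) Identify the primary object to be edited.
    target_object = mllm.identify_target(image, instruction, category)
    # (iii) Detection/segmentation models produce the editing mask;
    #       the MLLM writes the target caption for the edited result.
    mask = detector.segment(image, target_object)
    target_caption = mllm.generate_target_caption(image, instruction, category)
    # (iv) The dual-branch inpainting model (BrushNetX) fills the masked
    #      region from the target caption while preserving the background.
    return inpainter(image=image, mask=mask, prompt=target_caption)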

[Teaser figure]

πŸš€ Getting Started

Environment Requirement 🌍

BrushEdit has been implemented and tested with CUDA 11.8, PyTorch 2.0.1, and Python 3.10.6.

Clone the repo:

git clone https://github.com/TencentARC/BrushEdit.git

We recommend first creating a virtual environment with conda and installing PyTorch following the official instructions. For example:

conda create -n brushedit python=3.10.6 -y
conda activate brushedit
python -m pip install --upgrade pip
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

Then, you can install the diffusers version implemented in this repo with:

pip install -e .

After that, you can install the required packages through:

pip install -r app/requirements.txt

Download Checkpoints πŸ’Ύ

Checkpoints of BrushEdit can be downloaded using the following command.

sh app/down_load_brushedit.sh

The ckpt folder contains:

  • BrushNetX pretrained checkpoint for Stable Diffusion v1.5 (brushnetX)
  • Pretrained Stable Diffusion v1.5 checkpoint (e.g., realisticVisionV60B1_v51VAE from Civitai). You can use scripts/convert_original_stable_diffusion_to_diffusers.py to process other models downloaded from Civitai (see the example command after this list).
  • Pretrained GroundingDINO checkpoint from the official release.
  • Pretrained SAM checkpoint from the official release.
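For reference, a typical invocation of the conversion script looks like this (the paths below are placeholders; check the script's --help for the options supported by your diffusers version):

python scripts/convert_original_stable_diffusion_to_diffusers.py \
  --checkpoint_path /path/to/downloaded_model.safetensors \
  --from_safetensors \
  --dump_path models/base_model/your_model_name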

The checkpoint directory structure should look like this:

|-- models
    |-- base_model
        |-- realisticVisionV60B1_v51VAE
            |-- model_index.json
            |-- vae
            |-- ...
        |-- dreamshaper_8
            |-- ...
        |-- epicrealism_naturalSinRC1VAE
            |-- ...
        |-- meinamix_meinaV11
            |-- ...
        |-- ...
    |-- brushnetX
        |-- config.json
        |-- diffusion_pytorch_model.safetensors
    |-- grounding_dino
        |-- groundingdino_swint_ogc.pth
    |-- sam
        |-- sam_vit_h_4b8939.pth
    |-- vlm
        |-- llava-v1.6-mistral-7b-hf
          |-- ...
        |-- llava-v1.6-vicuna-13b-hf
          |-- ...
        |-- Qwen2-VL-7B-Instruct
          |-- ...
        |-- ...
      

We provide five base diffusion models:

  • Dreamshaper_8 is a versatile model that can generate impressive portrait and landscape images.
  • Epicrealism_naturalSinRC1VAE is a realistic-style model that excels at generating portraits.
  • HenmixReal_v5c is a model that specializes in generating realistic images of women.
  • Meinamix_meinaV11 is a model that excels at generating images in an animated style.
  • RealisticVisionV60B1_v51VAE is a highly generalized realistic-style model.

The BrushNetX checkpoint represents an enhanced version of BrushNet, having been trained on a more diverse dataset to improve its editing capabilities, such as deletion and replacement.
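As a rough sketch, loading the BrushNetX weights could look like the following. The class names (BrushNetModel, StableDiffusionBrushNetPipeline) follow the BrushNet codebase this repo builds on; verify them against the diffusers fork installed from this repo before use:

import torch
from diffusers import BrushNetModel, StableDiffusionBrushNetPipeline

# Load the BrushNetX branch and plug it into a Stable Diffusion v1.5 base model.
brushnet = BrushNetModel.from_pretrained(
    "models/brushnetX", torch_dtype=torch.float16
)
pipe = StableDiffusionBrushNetPipeline.from_pretrained(
    "models/base_model/realisticVisionV60B1_v51VAE",
    brushnet=brushnet,
    torch_dtype=torch.float16,
).to("cuda")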

We provide two local VLM models: Qwen2-VL-7B-Instruct and llama3-llava-next-8b-hf. We strongly recommend using GPT-4o for reasoning: after selecting gpt4-o as the VLM model, enter your API key and click the Submit and Verify button. If the output is success, you can use GPT-4o normally. As a second choice, we recommend the Qwen2-VL model.

You can also download more pretrained VLM models from QwenVL and LLaVA-Next, e.g., with hf_hub_download or snapshot_download from huggingface_hub.
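For example, a model can be fetched into the expected folder layout with snapshot_download (the repo ID below is one example; any QwenVL or LLaVA-Next variant works the same way):

from huggingface_hub import snapshot_download

# Download a VLM into the models/vlm directory used by the demo.
snapshot_download(
    repo_id="Qwen/Qwen2-VL-7B-Instruct",
    local_dir="models/vlm/Qwen2-VL-7B-Instruct",
)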

πŸƒπŸΌ Running Scripts

πŸ€— BrushEdit demo

You can run the demo using the script:

sh app/run_app.sh 

πŸ‘» Demo Features

[Demo interface overview]

πŸ’‘ Fundamental Features:

  • 🎨 Aspect Ratio: Select the aspect ratio of the image. To prevent OOM, 1024px is the maximum resolution.
  • 🎨 VLM Model: Select the VLM model. We use preloaded models to save time. To use other VLM models, download them and uncomment the relevant lines in vlm_template.py from our GitHub repo.
  • 🎨 Generate Mask: Generate a mask for the region that may need to be edited, based on the input instructions.
  • 🎨 Square/Circle Mask: Based on the existing mask, generate a square or circular mask. (A coarser mask gives the model more freedom when editing.)
  • 🎨 Invert Mask: Invert the mask to generate a new mask.
  • 🎨 Dilation/Erosion Mask: Expand or shrink the mask to include or exclude more areas (see the sketch after this list).
  • 🎨 Move Mask: Move the mask to a new position.
  • 🎨 Generate Target Prompt: Generate a target prompt based on the input instructions.
  • 🎨 Target Prompt: The description of the masked area; enter or modify it manually when the VLM-generated content does not meet expectations.
  • 🎨 Blending: Blend BrushNet's output with the original input so that unedited areas keep the original image details. (Turning it off works better for removal.)
  • 🎨 Control length: The strength of the editing and inpainting.
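The mask adjustment and blending features above are standard image operations. Below is a minimal OpenCV sketch of dilation/erosion and mask-based blending for illustration; it is not the demo's exact implementation:

import cv2
import numpy as np

def adjust_mask(mask, pixels, dilate=True):
    # Expand (dilate) or shrink (erode) a 0/255 binary mask by roughly `pixels` px.
    kernel = np.ones((pixels, pixels), np.uint8)
    op = cv2.dilate if dilate else cv2.erode
    return op(mask, kernel, iterations=1)

def blend(original, edited, mask):
    # Keep original pixels outside the mask; blur the mask edge to soften the seam.
    alpha = cv2.GaussianBlur(mask.astype(np.float32) / 255.0, (21, 21), 0)[..., None]
    return (alpha * edited + (1.0 - alpha) * original).astype(np.uint8)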

πŸ’‘ Advanced Features:

  • 🎨 Base Model: We use preloaded models to save time. To use other base models, download them and uncomment the relevant lines in the corresponding template file in our GitHub repo.
  • 🎨 Blending: Blend BrushNet's output with the original input so that unedited areas keep the original image details. (Turning it off works better for removal.)
  • 🎨 Control length: The strength of the editing and inpainting.
  • 🎨 Num samples: The number of samples to generate.
  • 🎨 Negative prompt: The negative prompt for classifier-free guidance.
  • 🎨 Guidance scale: The guidance scale for classifier-free guidance (see the sketch after this list).
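For reference, the negative prompt and guidance scale interact through the standard classifier-free guidance formula, sketched below (a schematic, not the pipeline's exact code):

import torch

def cfg_combine(noise_uncond, noise_cond, guidance_scale):
    # Extrapolate from the negative-prompt (unconditional) prediction
    # toward the prompt-conditioned prediction; larger scales follow
    # the prompt more strongly.
    return noise_uncond + guidance_scale * (noise_cond - noise_uncond)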

🀝🏼 Cite Us

@misc{li2024brushedit,
  title={BrushEdit: All-In-One Image Inpainting and Editing}, 
  author={Yaowei Li and Yuxuan Bian and Xuan Ju and Zhaoyang Zhang and Junhao Zhuang and Ying Shan and Yuexian Zou and Qiang Xu},
  year={2024},
  eprint={2412.10316},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}


πŸ’– Acknowledgement

Our code is modified from diffusers and BrushNet; thanks to all the contributors!

❓ Contact

For any questions, feel free to email [email protected].

🌟 Star History

[Star History Chart]
