
How to run CogVideoX1.5-5B-I2V as int8? #596

Open
FurkanGozukara opened this issue Dec 11, 2024 · 3 comments
FurkanGozukara commented Dec 11, 2024

I am running it as shown below, but it still uses 22 GB of VRAM and is very slow on an RTX 3090.

What am I doing wrong?



import torch
from diffusers import AutoencoderKLCogVideoX, CogVideoXTransformer3DModel, CogVideoXImageToVideoPipeline
from diffusers.utils import export_to_video, load_image
from transformers import T5EncoderModel
from torchao.quantization import quantize_, int8_weight_only

quantization = int8_weight_only

# Load each component in bfloat16, then quantize its weights to int8 in place
text_encoder = T5EncoderModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="text_encoder",
                                              torch_dtype=torch.bfloat16)
quantize_(text_encoder, quantization())

transformer = CogVideoXTransformer3DModel.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="transformer",
                                                          torch_dtype=torch.bfloat16)
quantize_(transformer, quantization())

vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX1.5-5B-I2V", subfolder="vae", torch_dtype=torch.bfloat16)
quantize_(vae, quantization())

# Create pipeline and run inference
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX1.5-5B-I2V",
    text_encoder=text_encoder,
    transformer=transformer,
    vae=vae,
    torch_dtype=torch.bfloat16,
)

# Reduce peak VRAM: offload idle components to CPU and decode the VAE in tiles/slices
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()
pipe.vae.enable_slicing()

prompt = "a fast car"
image = load_image(image="input.png")
video = pipe(
    prompt=prompt,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=24,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "output.mp4", fps=8)
FurkanGozukara (Author) commented:
Running it this way made it use 8 GB. Do I set the video FPS in the pipe?

prompt = "a fast car"
image = load_image(image="input.png")
video = pipe(
    prompt=prompt,
    height=480,
    width=720,
    image=image,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=12,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]
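For reference, in diffusers the frame rate is not a `pipe()` argument; it is applied when the frames are written out, as in the `export_to_video(video, "output.mp4", fps=8)` call in the first snippet. `num_frames` decides how many frames are generated, while `fps` only decides how fast they play back. A minimal sketch of that relationship, using only the numbers from the snippets above:

```python
# fps only affects playback speed of the exported file; num_frames
# (a pipe() argument) decides how many frames are generated.
num_frames = 12  # as in the pipe() call above
fps = 8          # as passed to export_to_video in the first snippet
clip_length_s = num_frames / fps
print(clip_length_s)  # 1.5
```

So generating 12 frames and exporting at 8 fps yields roughly a 1.5-second clip.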

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR self-assigned this Dec 13, 2024
zRzRzRzRzRzRzR (Member) commented Dec 13, 2024

Yes, it is necessary; otherwise the default is 49 frames, which is 8 * 6 + 1 (this default does not apply to CogVideoX1.5-5B). Please adjust each parameter according to cli_demo. Thank you.
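A minimal sketch of the frame-count pattern described in the reply, assuming the `fps * seconds + 1` form it gives for the (non-1.5) CogVideoX-5B models; CogVideoX1.5 uses different values, so check cli_demo in the repo for those:

```python
# Hypothetical helper illustrating the pattern above:
# num_frames = fps * seconds + 1, so the default 49 = 8 * 6 + 1.
def cogvideox_num_frames(seconds: int, fps: int = 8) -> int:
    return fps * seconds + 1

print(cogvideox_num_frames(6))  # 49, the default mentioned above
```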

FurkanGozukara (Author) commented:
@zRzRzRzRzRzRzR Amazing work! I got it running with huge speed and low VRAM on Windows.

However, how we need to prompt is still a mystery to me. Can you guide me?

Here is an example.

I used this prompt: A highly detailed, majestic dragon, with shimmering orange and white scales, slowly turns its head to gaze intently with a piercing golden eye, as glowing embers drift softly in the air around it, creating a magical, slightly mysterious atmosphere in a blurred forest background.

[Attached video: video_0004.mp4 ("dragon")]
