Release joliGEN v4.0.0 · jolibrain/joliGEN

This major release adds many improvements, along with video generation with diffusion and super-resolution with supervised metrics, including for consistency models.

Features

adding separate control of vertical and horizontal flips as augmentation (a7a6109)
aligned crops for super-resolution (8418470)
allow tf32 on cudnn (367cd91)
better Canny for cond image with background (c3c7de6)
consistency models with supervised losses (ed701ad)
data: random bbox for inpainting (764646d)
input and output multiple and different channels (6bcd64c)
load models without strictness (073d57c)
max number of visualized images from train/test set (24f0e81)
ml: add option for vid inference (c3f83b7)
ml: add supervised loss with GANs with aligned datasets (d7f5119)
ml: added LPIPS supervised loss with GANs (70e8ee4)
ml: adding example of CM+discriminator (b6b8b64)
ml: batched prompts for turbo (023dd54)
ml: Canny can use a range of dropout probabilities (7b4c860)
ml: canny dropout for vid (06ce7d7)
ml: CM with added discriminator (10516e0)
ml: consistency models for pix2pix (cd92712)
ml: CUT turbo (cdd508f)
ml: debug args (a11172b)
ml: debug crop (51c9fd6)
ml: debug for canny inference (930f3ce)
ml: debug for canny threshold (dca0bfa)
ml: debug for vid metrics (7c57471)
ml: debug inference_vid for canny (17b9a29)
ml: debug vid for frame limit (ff97c03)
ml: debug vid metric (ba43725)
ml: DISTS supervised loss for aligned data (56273ef)
ml: FID,KID,MSID for multiple test sets and non 8 bit images (74b0e65)
ml: fix canny range option (c102ee0)
ml: fix inference regeneration and crop canny (f75196f)
ml: HDiT for GANs (58bedff)
ml: HDiT generator (9a95f1f)
ml: jenkins test inference print (b68ab53)
ml: L1 or MSE for diffusion multiscale loss (06e3d6a)
ml: metric fvd for video (6d458a3)
ml: min-SNR loss weight for diffusion, 2303.09556 (c802119)
ml: modif for horse2zebra prompt (b66a954)
ml: multiple test sets (6db745c)
ml: option for max_sequence_lenght of video generation (12cfc1b)
ml: prompt for inference horse2zebra (b8e9929)
ml: random canny inside batch (70919cd)
ml: rename dataloader for video generation (98b1315)
ml: The implementation of UNetVid for generating video with temporal consistency and inference (43b7018)
ml: unchange fill_img_with_canny with random drop canny (a2ed3fc)
ml: UNetVid for generating video with bs > 1 (00f11bc)
ml: vid try autoregressive inference (5b92031)
multi-prompt local works (b98746a)
multiprompt (2bffc8b)
multistep lr scheduler (01c3558)
train_finetune for finetuning gans/others and removing / adding losses and networks (2f26503)
unet_vid motion module fine-grained configuration (813e435)
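
For readers unfamiliar with the min-SNR loss weight listed above (arXiv 2303.09556), the idea can be sketched as follows. This is a minimal illustration for the noise-prediction case, not joliGEN's actual implementation: the function names, the linear beta schedule, and the default gamma of 5 are all assumptions made for the example.

```python
def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear beta schedule.

    The schedule and its endpoints are assumptions for illustration only.
    """
    alpha_bars = []
    prod = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def min_snr_weight(alpha_bar_t, gamma=5.0):
    """Min-SNR-gamma weight for an epsilon-prediction loss: min(SNR, gamma) / SNR."""
    snr = alpha_bar_t / (1.0 - alpha_bar_t)  # signal-to-noise ratio at this timestep
    return min(snr, gamma) / snr


alpha_bars = linear_alpha_bar()
# One weight per timestep; multiply the per-sample loss at timestep t by weights[t].
weights = [min_snr_weight(a) for a in alpha_bars]
```

Low-noise timesteps (high SNR) are down-weighted, while high-noise timesteps keep a weight of 1, which is the balancing effect the weighting is meant to provide.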

Bug Fixes

aligned dataset, resize domain A only if necessary (4127571)
allowing for no NCE with cut (9d8ff9b)
clamp bbox to image size during inference (fc3874d)
cm at test time (706356b)
cm with conditioning (0fd2d14)
consistency model schedule upon resume (88d03f9)
consistency models with input/output different channels (db61821)
crash in inference script, errors in documentation (f99dd34)
cut options at test time (dcd2438)
D input is G output size with gans (194f42b)
diff across input/output channels in gans (6845816)
diff real/fake not needed + cleanup (5cbd1f0)
diffusion inference for images > 8bits (aefdc38)
diffusion with input and output of different channel size (cd264de)
disable hdit flop count (8c449f8)
fix pytest rootdir (1fe0e80)
further lowering the input test size of cut-turbo (6914731)
gan inference script with prompts (cef7681)
gan metrics reference (d5570b6)
GAN semantic visual output (d3a5565)
GAN semantic visual output (e7ee6bd)
gen_single_image.py for images with channels > 3 (9ad4aaa)
hdit out_channel (84473fc)
identity with cut turbo (2538c00)
inference with images > 8bit and GANs (34e6c96)
input size of cut-turbo test (2c024c2)
interpolation size selection for projected discriminators (ef045d0)
load_image replacement (5af5803)
loading of ema models (995c5eb)
lora config saving with multiple gpus (c98617d)
lower img2img turbo test memory footprint (54a6ab4)
missing SSIM metric option (8530851)
ml: multiscale diffusion loss for any input resolution (5c9f997)
multi-gpu ddp collective mismatch upon resume (471fbbc)
multi-gpu with frozen base network (1a07342)
multiple test sets with test.py + SSIM (06762fb)
option default cut_nce_idt (4c5ec6d)
palette options at test time (75f7b04)
parser uses model_type for model level options (76095b5)
paths are only required for video generation (eb39ec5)
paths loading prompts file (35d2ef3)
perceptual loss for cm when input and output channels differ (ca81789)
potential bug in gen_single_diffusion model path (0cf63fe)
projected discriminator allows grayscale input (44fb458)
prompt unaligned loading (e25d4b1)
rename sketch options in examples (6930d00)
RGB order for diffusion inpainting (eff8a57)
rgbn cut lpips supervision (17cfbb2)
sam for single channel inputs (397f837)
segformer generator for single channel inputs (1eb6695)
show full test set output with GANs (31efdcd)
single dataset (a6266d8)
supervised loss for aligned GANs, with unit tests (e21ddd3)
supervised perceptual metrics all with piq and configurable + lambda weight (d77c3c5)
test image output tensor visuals (19596b2)
tifffile import (a09b5ed)
total_iters wrong variable (066dc1b)
train batch visuals (24adb61)
typo in semantic threshold test variable (5082c36)
unet mha output for GANs (075b6c6)
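
As an illustration of the bbox clamping fix listed above, a bounding box can be restricted to the image bounds roughly like this. It is a hypothetical sketch, not the code from the commit: the function name `clamp_bbox` and the `(x_min, y_min, x_max, y_max)` pixel convention are assumptions.

```python
def clamp_bbox(bbox, img_width, img_height):
    """Clamp an (x_min, y_min, x_max, y_max) box to lie within the image.

    The coordinate convention is an assumption made for this example.
    """
    x_min, y_min, x_max, y_max = bbox
    x_min = max(0, min(x_min, img_width))
    x_max = max(0, min(x_max, img_width))
    y_min = max(0, min(y_min, img_height))
    y_max = max(0, min(y_max, img_height))
    return (x_min, y_min, x_max, y_max)


# A box that spills over the left, right, and bottom edges of a 256x100 image.
clamped = clamp_bbox((-10, 5, 300, 120), img_width=256, img_height=100)
```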

Docker images:

GPU (CUDA only): docker pull docker.jolibrain.com/joligen_server:v4.0.0
All images available from https://docker.jolibrain.com/#!/taglist/joligen_server
