
More Valve Steam deck detail for previous test #3

Open
ClashSAN opened this issue Feb 8, 2023 · 6 comments
@ClashSAN
Contributor

ClashSAN commented Feb 8, 2023

I recently went back to replicate my tests with additional parameters. Could you add the part below to the section I made?

Update:

old webui commit 0b8911d
brkirch branch 2cc07719

I can verify that on the old webui commit (one day before the automatic Linux install) the Anything-V3-pruned fp32 model gives accelerated speeds, and the 4 GB of allocated GPU memory is being used. The output is almost always black, with an occasional blank badge picture. There is an initial 40-second hang when first running inference for your instance, and again when you switch sizes. I alternated between 256x256 and 192x256. Running in CPU mode instead is slower, but of course yields actual results. Larger sizes crash the machine. This round I tested combinations of --opt-sub-quad-attention, --upcast-sampling, --no-half-vae, and --opt-split-attention-v1 (lower memory) on both the new and old commits. I'd like to try AUTOMATIC1111/stable-diffusion-webui#3556 (comment) next.
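For anyone trying to reproduce the two setups mentioned above, here is a rough sketch of checking out both revisions. The fork URL for the brkirch branch and the assumption that the short hashes resolve as written are mine, not stated in the thread:

```shell
# Sketch only: fork URL and short-hash resolution are assumptions.
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
git checkout 0b8911d   # the "old webui commit" referenced above

# For the brkirch branch commit, assuming the fork lives here:
git remote add brkirch https://github.com/brkirch/stable-diffusion-webui
git fetch brkirch
git checkout 2cc07719
```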

@daniandtheweb
Owner

How is it possible that the results were accelerated before that commit? Could you explain in more detail? I'd like to understand.

@ClashSAN
Contributor Author

ClashSAN commented Feb 8, 2023

Yeah, sorry. Most of the tests (the previous part I wrote) were done with AUTOMATIC1111/stable-diffusion-webui@0b8911d. I went back to test this flag: --opt-sub-quad-attention

but I also tested with --upcast-sampling on the brkirch branch, as a replacement for --no-half

--precision full --no-half --lowvram 
--opt-sub-quad-attention
--opt-sub-quad-attention --no-half-vae
--opt-sub-quad-attention --upcast-sampling
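For reference, these combinations would normally be passed to the launcher through the webui's standard COMMANDLINE_ARGS variable in webui-user.sh, one combination per run (the file layout is the webui's convention; the flag sets are the ones listed above):

```shell
# webui-user.sh -- uncomment exactly one line, then run ./webui.sh
export COMMANDLINE_ARGS="--precision full --no-half --lowvram"
# export COMMANDLINE_ARGS="--opt-sub-quad-attention"
# export COMMANDLINE_ARGS="--opt-sub-quad-attention --no-half-vae"
# export COMMANDLINE_ARGS="--opt-sub-quad-attention --upcast-sampling"
```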

> before the commit the results were accelerated?

No. For both commits where I'm testing the various flags, results are hardware accelerated, the outputs are mostly black, and larger sizes cause the device to stall and crash.

I still have hope for this system. My 4 GB laptop GPU can generate at 4.6 it/s at batch size 17, 512x512 in parallel, with xformers.
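As a back-of-envelope check on that figure: 4.6 it/s at batch size 17 means 4.6 × 17 image-steps per second; assuming a typical 20-step sampler (an assumption, the comment doesn't state the step count), that is roughly 3.9 finished images per second:

```python
# Rough throughput estimate for the laptop-GPU figure quoted above.
ITS_PER_SEC = 4.6      # iterations per second, as reported
BATCH_SIZE = 17        # images processed per iteration
STEPS_PER_IMAGE = 20   # assumed sampling steps (not stated in the thread)

images_per_sec = ITS_PER_SEC * BATCH_SIZE / STEPS_PER_IMAGE
print(f"{images_per_sec:.2f} images/sec")  # 3.91 images/sec
```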

@ClashSAN
Contributor Author

I got it working on Windows, at 3.6 s/it at 512x512. lshqqytiger/stable-diffusion-webui-amdgpu#14

@daniandtheweb
Owner

That's great, I've never seen that DirectML fork before. I guess the official webui could implement that as well.

@daniandtheweb
Owner

Have you been able to make it work on Linux?

@ClashSAN
Contributor Author

Nope, and it peeves me. It could be much faster than on Windows, and I could use all 10 GB of VRAM for training (DirectML takes up all 4 GB plus the expanded 6 GB of shared GPU memory).

@ClashSAN ClashSAN reopened this Feb 28, 2023