#▶|stable-video-diffusion
I'm reading all the posts here, I'll try running the preliminary comfy workflow posted here.
Have to debug
Announcing Stable Video Diffusion Support on imaginairy.
The easiest* way to generate videos locally!
pip install "imaginairy==14.0.0b1"
aimg videogen --start-image pearl-girl.jpg
*if you are blessed with a 3090/4090 GPU
https://github.com/brycedrennan/imaginAIry#stable-video-diffusion
help! I dont have a pt13.txt file
they removed it. just use the pt2 one
Anyone have guidelines on how to set the settings? I tell it to make a sample and it just makes a 0 second mp4 of the image.
How long are my 4090 brethren's iterations taking here?
looking at ~25 for my first blast here
Can't comment on a 4090, but it's taking about 75 seconds to generate a 14 frame 1024x576 video while using a 4080
Would I be better off with lowvram mode?
I'm on 6 minutes for a 25 frame
I'll give it a test after this is done cooking
Check your VRAM as you generate; if it's too high, it'll dip into RAM and slow down immensely. I fill all 16GB of VRAM just doing 22ish frames and decoding 1 at a time
I see, that's indeed what she's done
It doesn't help that changing the frame count for some reason auto-changes the decode amount too 😅 gotta remember to change them both, I've forgotten a few times and majorly regretted it
im installing this to test it, i have a 4090
it went down to the last iteration, maxed out my cpu for a few seconds then dumped 24 gigs to my ram
seems like it's slowly going down?
gpu utilization back down to baseline
idk what I'm doing wrong, but every time I hit "sample" it just produces a single frame
T != 1, right?
yup
I built out my own colab and am running it on a V100 with T=1 and low vram on. Doesn't seem to crash, but at the same time, it isn't producing much. The videos are a single frame, effectively 0 seconds long.
Just to clarify, we're talking about T and not "Decode T Frames", right?
T should be a higher number, such as 14, but "Decode T Frames" should be 1
Your T is 1, you're only generating 1 frame
well doh lol
T needs to be higher than 1, enter in a slightly bigger value and hit enter. Then go to "Decode T Frames" and change it back to 1 and hit enter
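For anyone else tripped up by this, the arithmetic behind the "0 second mp4" is just frames divided by frame rate, so T=1 gives a clip that is one frame long. A minimal sketch; the name `clip_seconds` is made up for illustration and is not part of the streamlit app:

```python
def clip_seconds(num_frames: int, fps: float) -> float:
    """Length of a clip in seconds: frame count divided by frames per second."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# 32 frames played back at 8 fps
print(clip_seconds(32, 8))
# T=1: a single frame, which players show as a 0 second video
print(clip_seconds(1, 8))
```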
thank you
man, i spent all day making the colab work and THIS is what got me... lol
Hover text or something would really help in this scenario 😛
it would help immensely.
in other news, I have a seemingly working google colab if anyone with a pro account wants to work it.
Only thing I haven't done for it is include an automatic "low vram" modification to the appropriate .py file
nothing amazing, but produced in about 5 seconds
surprisingly stable in non default resolutions
Here's the google colab if anyone wants it: https://colab.research.google.com/gist/wordbrew/3cde4dadeb15db22eb3047aac55acafc/copy-of-stable-diffusion-video.ipynb
A few notes:
- You'll need a NGROK account and API key.
- You need to manually go into streamlit_helpers.py file after cloning the repo and turn on low_vram.
- When you spin up the actual program in the last cell, use the NGROK public URL, not the two streamlit URLS.
Other than that, it should be fairly foolproof.
Okay someone needs to test this for me, I'm claiming to have generation working on graphics cards with as little as 6 GB
anyone with a lower VRAM graphics card willing to give it a go?
You have it set up to work for your google colab notebook too?
I was trying to get it set up earlier with my 980ti
I haven't tried it with google colab. good idea
I'm regularly landing at no more than 10GB for generation up to 14 frames
yeah that matches what I'm seeing
with a GTX 980ti. I'll try to see what's possible
does T 14 mean 14 frames?
yeah, but make sure to go into the "Decode t frames at a time (set small if you are low on VRAM)" option just above the "SAMPLE" button and change that to 1 after you've set T
I'm stretching it with this V100... 31 steps and 28 frames, hitting 15GB lol
I dont see a "SAMPLE" button
you have to load the model first
Ignore and load in an image
dang my google colab crashes because it runs out of non-gpu memory
High vram always and forever lol
its because the lowvram is storing everything in CPU ram.
My gpu ram stacked up for some reason. Hit a CUDA error because it was just sitting at 12gb without doing anything...
Are u guys running stable video dif on automatic or comfy or both? Which one runs better
Same, the VRAM and frame correlation is quite proportional
this proportional to be exact
I dont believe it works in A1111 yet. Someone on here got it to work on comfy. Right now it's a separate ui.
I'm running SVD_XT on Google Colab on a V100. Hitting max GPU usage at 32 frames, 8fps, 30 steps
HI all, half my user interface is missing???? what am I doing wrong? and also this " File "C:\Users\admin\anaconda3\envs\genModelVideo\lib\site-packages\torch\cuda\__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled"...
My GPU took a breather midway through a sample
is the web version not working for anyone else or is it just me?
there is no web version
as in the ai image generator, ai art webpage
Stable Diffusion Online. FREE AI Image Generator. It generates images from a simple text description using Stable Diffusion XL
nevermind, had adblocker on 😒
I am facing the same issue, did you get it fixed?
you just ignore the error and upload an image
I did mate, nothing
oh well the error you got tells you whats wrong
either you dont have a graphics card or didn't install a torch version that can talk to it
got a 4090
then wrong version of torch
so probably torch...but how come other menus are available
cause they didn't need torch to render a menu
i got a couple 4090 24g's myself ready to test any workflows, those renders are insane
I am finding lots of very high quality Gifs made with Prompt Travel in Animatediff -- but no one mentions how to get that crisp quality.
I got this, I had to install 2.0.1+cu118 or whatever instead of 2.0.1
did it work? I am a newbie and I don't wanna play with pytorch and cuda and screw up my stable diffusion 😅
Where should I manually place the open_clip_pytorch_model.bin file? On the first run it downloaded it, but right after hitting 100% it keeps failing with a disconnect. And I can't see open_clip_pytorch_model.bin in my g-drive
that's while loading a model inside the app
https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K/tree/main you can D/L from here
It places it automatically. You dont have to place it anywhere.
Well, if you're in your own conda environment, you shouldn't have to worry 'bout that 😛 but yes, it did work to install that version
At least, for my 4080
in a second run I have only
VideoTransformerBlock is using checkpointing VideoTransformerBlock is using checkpointing VideoTransformerBlock is using checkpointing ^C
it's kinda frustrating 😦
Weird. Are you going through all of the checkpoints in order?
yes. also lowvram was set to True
maybe something wrong with this ngrok
yeah, maybe. that's really odd.
I'm not able to troubleshoot anything on the colab right now though.
really sorry.
no prob, I will try to figure it out
maybe you have paid ver of Colab?
it seems like mine can't load a model and disconnects
yes. Sorry. You will not be able to use that colab without Pro.
You need a V100 to do anything in it
maybe this is the key
sorry that wasn't clear.
If you get that thing working on a T4, please let me know lol
I was inside the generative-models directory, then ran this command in the anaconda terminal, "python -m pip install torch==2.0.1+cu118 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118" then after just ran this "conda activate genModelVideo" and then this "streamlit run video_sampling.py"...still the same, 🥲 can you share your steps if you don't mind?
I think you need to do the install while the conda env is active, but I could be wrong
😭
You should be all good, just activate genModelVideo and run that torch install again 🙂
Dude. it worked...Thanks a million
🥰 😘
is that the sdxl beta motion module?
yeah
same situation even with V100. do you have free ngrok?
yes
I found a solution https://replicate.com/lucataco/svd
as for this problem – I ran colab with bigger amount of RAM and connection is fine so far. Previously noticed that disconnect appeared only after RAM overflow (it was 16gb by default). Interface just loaded 🙂
face broken lol
but walking motion and the details of the cloth better than others
few days and somebody would add adetailer to the workflow. But the walking is TOP QUALITY!
/25_frames
somewhat successful test of 14 frames in ComfyUI at 1024x576 with SVD on a 3080 with 10GB VRAM... probably did some system ram offloading but it ran in like 3-4 mins
Im having this error
Error occurred when executing SVDimg2vid:
[Errno 2] No such file or directory: 'D:\ComfyUI_windows_portable\custom_nodes\ComfyUI-SVD\svd\configs\svd.yaml'
the file is there so idk whats wrong
file literally here
yeah sorry I don't know how to fix that, it happens with the portable install... someone else had that too and opted to just do venv install instead
I don't get it at all, everything looks to point to right path and the error makes no sense to me
something about python working directory, someone smarter can maybe answer but I don't know 😦
Oh thats fine still thank you for releasing that
I've learned to hate the portable install overall, it's always problems with custom nodes and it's harder to troubleshoot 😛
do you know how to use git ?
yeah all that
so basically, if I remember everything:
- clone the repository (git clone https://github.com/comfyanonymous/ComfyUI.git)
- change to that folder
- run: python -m venv venv
- run: venv/scripts/activate (depending on commandline you use, .bat if cmd)
- install torch: https://pytorch.org/get-started/locally/
- pip install -r requirements.txt
- run comfy with: python main.py
and you need python 3.10 or 3.11
So technically I can just copy my comfy folder as backup and do that inside there?
I just dont want to ruin my setup
maybe, I don't use the portable so dunno what all it does different
you can copy the custom nodes but you'd need to install their requirements again
I see ty
with the speed everything updates, it's not a bad idea to just install all with latest 😛
SVD with interpolation to 60 fps
How many frames do you get with a 4090 
or two? 😜
I guess if this runs fine over comfy, we can use the workflow on a multiGPU swarm session and run in parallel 2 or more videos

thoughts about this @fallen wren
Guys is there like an ultra noob tutorial on how to get the best results with stable video diffusion? Including installation
you could just use the path of the script, as in comfyui_svd.py, and search for config from there, like Path(__file__).parent / f"svd/configs/{variant}.yaml", which would hopefully bypass any folder differences
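Roughly what that suggestion looks like as code: a minimal sketch, assuming the node script sits next to an `svd/configs/` folder as in the error path above. The helper name `config_path` is hypothetical and not taken from the ComfyUI-SVD repo.

```python
from pathlib import Path


def config_path(script_file: str, variant: str) -> Path:
    """Build the config path relative to the script's own folder.

    This does not depend on the current working directory, which is the
    thing suspected to differ between the portable and venv installs.
    """
    return Path(script_file).parent / "svd" / "configs" / f"{variant}.yaml"


# Inside the node script this would be called as config_path(__file__, "svd").
print(config_path("/nodes/ComfyUI-SVD/comfyui_svd.py", "svd"))
```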
45 was max... but it really breaks after the 25 anyway, so not much point
thing is it does point to right file, and it works on normal manual venv install.. just not the portable one
it's relative to the comfy folder so it shouldn't be any different
yeah but I feel like that's because the directory is hardcoded, and maybe the portable comfy believes the root folder is somewhere else
but the error points to the exact right file, still says "not found"
🤷♂️ just saying
I really don't want to install it just to debug this 😅
but I'd imagine setting the path manually could work
yeah I don't particularly feel like downloading yet another comfy to see
works for me too so
nice
offloading the model between sampling and decoding seems to allow this to run the 14 frames on 10GB
need to figure out how to add rest of the settings though... like samplers
yup, noticed just about 9.8gb at 1024x576, good job
it seems it made yet another ending to Nier: Automata 😅
Where do I get the svd.yaml?
it is included but for some reason portable install fails to see it, I can't figure it out :/
?
isn't that what I wrote? 😄
yes
wouldn't be first time I make such mistake, but that's correct
what are we talking about now
i'd say more but it's 5 am and brain dead so have past me instead #🐝|swarm-ui message
opening the comfy venv instead of the portable
confirmation that we can use stableSwarmUI on multiGPU setups to infer parallel videos
then go to localhost:8188 on browser
Cool to hear man, thanks for making a node this fast too 
Ill check it out when I find the time
btw the seed in your node gets cut off at 2048 since you didn't set the max, should be something like "seed": ("INT", {"max": 2**64 - 1}), (also makes it compatible with other seed nodes)
oh shi...
yeah
thanks, I'll just use what the default ksampler uses:"seed": ("INT", {"default": 0, "min": 0, "max": 0xffffffffffffffff}),
yeah, that works too, same number just different representation
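For reference, a minimal sketch of where that seed entry lives in a ComfyUI-style node. The class name `ExampleSeedNode` is hypothetical; only the `INPUT_TYPES` classmethod layout and the seed options quoted above are taken from how ComfyUI nodes are normally declared.

```python
# 0xffffffffffffffff and 2**64 - 1 are the same number written two ways.
MAX_SEED = 0xFFFFFFFFFFFFFFFF


class ExampleSeedNode:  # hypothetical node, for illustration only
    @classmethod
    def INPUT_TYPES(cls):
        # With an explicit "max" the seed widget accepts the full 64-bit
        # range instead of being capped at the widget default.
        return {
            "required": {
                "seed": ("INT", {"default": 0, "min": 0, "max": MAX_SEED}),
            }
        }


print(MAX_SEED == 2**64 - 1)
```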
not really sure what I managed but works 
with illustrations, anime etc. it just tends to do that :/
finally got things working, troubleshooting CUDA torch install was a real PITA
I need to apologize, you were absolutely right, installed portable and immediately saw it, made all paths relative to the actual script folder now and it works fine now
cheers
too much coffee
ok so now portable install works, sorry for the confusion. Installing this with comfy from scratch would be:```
- Download and extract latest release from here: https://github.com/comfyanonymous/ComfyUI/releases
- Install ComfyUI-Manager: https://github.com/ltdrdata/ComfyUI-Manager
- Install the SVD node with the manager: https://github.com/kijai/ComfyUI-SVD (check screenshot how since it's not listed yet)
- Download either of the .safetensors from the links below to the "ComfyUI-SVD/svd/checkpoints" folder:
https://huggingface.co/stabilityai/stable-video-diffusion-img2vid
https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt```
really impressed by how well SVD is sticking to input image clarity
I see that some people on Windows had to install PyTorch 2.0.1 with CUDA 11.8 but Im wondering, do I need to install that just through cmd or also install it to Windows itself with the CUDA toolkit .exe?
are you on Windows? I think I am trying to work out the same issue.
what fps did you have it generate at? any other tricks? seems rare for me to get such a good output. but also I'm trying it mostly on paintings
Amazing. Care to share your settings for these? Wondering how it is kept so crisp on the faces.
really amazing looking. Im curious too
I got this error. but I just installed Torch 2.0.1 with cu118
2023-11-23 08
32.986 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 142, in <module>
value_dict["cond_frames"] = img + cond_aug * torch.randn_like(img)
TypeError: randn_like(): argument 'input' (position 1) must be Tensor, not NoneType
2023-11-23 08:45:51.495 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 137, in <module>
img = load_img_for_prediction(W, H)
File "C:\Users\ocean\generative-models\scripts\demo\streamlit_helpers.py", line 889, in load_img_for_prediction
return image.to(device) * 2.0 - 1.0
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
I think I had other Torch installed from before though. should I uninstall torch and reinstall just Torch 2.0.1 with cu118?
I created a fork and you can get the pt13 here: https://github.com/Jordain/StableVideoP13/tree/main/requirements
I haven't tested it myself, but from some of the comments on my tutorial, people said it wasn't working with pt2 and once they tried the pt13 it worked. Some cuda error. There are dependencies in the pt13 that include cuda with the install. So I am guessing that pt13 should be the one Windows users use.
Yes, I am on W11, with 4090. You need to really check over what version of Torch python is actually seeing because I kept trying to install the PyTorch 2.0.1 with CUDA 11.8 but it kept seeing some cached copy that used the CPU instead and that broke everything.
Luckily I kept notes
# check CUDA version
nvcc --version
# check Nvidia Driver info:
nvidia-smi
nvcc --version shows the CUDA toolkit version installed on your system, and nvidia-smi shows the highest CUDA version your driver supports. It's normal if they don't match.
I have the latest Nvidia driver so my driver CUDA was 12.something and the nvcc was still 11.8
everything is easier if you have Anaconda installed
cd to user/generative-models
conda create -n genModelVideo python=3.10.11
conda activate genModelVideo
then run through the instructions I'm sure you're already looking at from yesterday about getting this running
before you hit
pip install -r requirements/pt2.txt
you should make sure the requirements/pt2.txt file has +cu118 set on these guys:
torch>=2.0.1+cu118
torchaudio>=2.0.2+cu118
torchvision>=0.15.2+cu118
this is where things kept screwing up for me
the torch cached by conda was the CPU one not the CUDA 11.8 one
so after running pip install -r requirements/pt2.txt you need to check if python sees the CUDA torch or not
to do that you just run python from CLI and then
`>>> import torch`
`>>> torch.cuda.is_available()`
once it returns True you're good to go with starting streamlit
streamlit run video_sampling.py
if conda keeps giving you CPU torch then running torch.cuda.is_available() will return false and you need to uninstall the torch packages
well explained @severe moon
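If you'd rather not type that into the REPL every time, here is the same check as a small script. It's a sketch, assuming you run it with the Python from the activated genModelVideo env; the function name `describe_torch` is made up for illustration. `torch.version.cuda` is None on CPU-only builds, which is the cached-CPU-torch case described above.

```python
def describe_torch() -> str:
    """Report which torch build this interpreter actually imports."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    # torch.version.cuda is None when the installed wheel is CPU-only.
    build = torch.version.cuda or "none (CPU-only build)"
    return (
        f"torch {torch.__version__}, built for CUDA: {build}, "
        f"cuda available: {torch.cuda.is_available()}"
    )


print(describe_torch())
```

If it reports a CPU-only build, that's the cue to uninstall and reinstall before starting streamlit.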
you can uninstall with
pip3 uninstall torch torchvision torchaudio
you might also need to run
conda uninstall pytorch torchvision torchaudio
after the pip uninstall just to be sure. If you run these commands a few times until it says it can't find what you want uninstalled then you're good to reinstall.
maybe even try both 😉
after that to meet the requirements we need those specific torch versions
I finally got them successfully installed with
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
after that you don't want to run pip install -r requirements/pt2.txt again or it might try uninstalling things
finally i ran the pip install . command and then started streamlit
and it all worked
Hey Kijai, I am circling back to this. When I follow your instructions I don't see the model in the checkpoints dropdown. Also do we need to run the requirements file if we are using the portable comfyui version?
I tried searching but didn't see a solution. Has anyone run into this problem:
`Error occurred when executing SVDimg2vid:
No module named 'wcwidth'`
Portable comfyui install, installed kijai/ComfyUI-SVD following his instructions just a few minutes ago so it should be the latest on github.
if you install through the manager it should take care of everything now
I tested with completely fresh install and it worked with those steps
alright i'll try again thanks
don't know what you mean by checkpoints dropdown?
I did an update instead of downloading the portable comfyui again
the load checkpoint dropdown
just need to reinstall the custom node, not the whole thing
meant to say that it for sure works with fresh install now
that's just any SD model
not relevant
if you have no models you do of course need one 😛
but the svd stuff is separate from all that, it's really just single node that runs it
takes image in, gives images out... nothing else to it currently
I added example workflow for doing simple text to vid with it now
durrr silly mistake haha ok thanks. I thought I had to select svd in the checkpoint.
so how do I choose between svd or svd_xt?
just swap them out?
nah sadly it can't interact with any of that currently
in the node you can choose
you can have both checkpoints in the folder
which version of Python should we be running?
thank you @severe moon for the suggestions. im gonna try all that uninstall and reinstall with those specific packages. hoping it works. but I got an xformers warning before
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu)
Python 3.10.11 (you have 3.10.6)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Holy sh*t the latest version worked after reinstall! Thank you and Hello World.
Any advice on why I see this error though? Should I be using 576x1024?
Restored from D:\Comfy UI\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-SVD\svd/checkpoints/svd.safetensors with 0 missing and 0 unexpected keys
WARNING: The conditioning frame you provided is not 576x1024. This leads to suboptimal performance as model was only trained on 576x1024. Consider increasing cond_aug.
I think the source image needs to match the dimensions of what you are outputting. I read another user say that
if need be, crop in PS or another tool
the input was the 1024x1024 output from the generated sdxl image in the workflow
It's trained at 1024x576 and works best with it (the warning lists height first, so its 576x1024 is the same thing)
Sometimes square works as well, and even portrait....but often it starts to squash it too wide
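If you want to crop instead of squashing, the crop box for the 1024x576 shape is easy to work out. A minimal sketch in plain Python; the function name `center_crop_box` is hypothetical, and the centered crop is just one reasonable choice (you may prefer to keep the top of a portrait image instead).

```python
def center_crop_box(width: int, height: int, target_w: int = 1024, target_h: int = 576):
    """Return (left, upper, right, lower) for a centered crop with the target aspect ratio."""
    if min(width, height, target_w, target_h) <= 0:
        raise ValueError("all dimensions must be positive")
    if width * target_h > height * target_w:
        # Source is wider than the target shape: trim the sides.
        new_w = height * target_w // target_h
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is taller than (or equal to) the target shape: trim top and bottom.
    new_h = width * target_h // target_w
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)


# The 1024x1024 SDXL output mentioned above
print(center_crop_box(1024, 1024))
```

The tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it should drop straight into a crop followed by a resize to 1024x576.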
how did you get the 2 extra green inputs on the CLIPTextEncodeSDXL?
If you right click a node you can convert any setting widget to an input
thanks :)!
I installed all of these and tried to run something and got this
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu)
Python 3.10.11 (you have 3.10.6)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\watcher\local_sources_watcher.py:177: UserWarning: Torchaudio's I/O functions now support par-call bakcend dispatch. Importing backend implementation directly is no longer guaranteed to work. Please use backend keyword with load/save/info function, instead of calling the udnerlying implementation directly.
lambda m: [p for p in m.path._path],
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
Initialized embedder #3: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #4: ConcatTimestepEmbedderND with 0 params. Trainable: False
Loading model from checkpoints/svd.safetensors
2023-11-23 10:41:06.952 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 142, in <module>
value_dict["cond_frames"] = img + cond_aug * torch.randn_like(img)
TypeError: randn_like(): argument 'input' (position 1) must be Tensor, not NoneType
2023-11-23 10:41:34.255 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 137, in <module>
img = load_img_for_prediction(W, H)
File "C:\Users\ocean\generative-models\scripts\demo\streamlit_helpers.py", line 889, in load_img_for_prediction
return image.to(device) * 2.0 - 1.0
File "C:\Users\ocean\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
pip uninstall torch, then download the right version from PyTorch's website
I was trying what @severe moon had succeeded with here
which ones are the "right versions"?
Well these are your two issues:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu) Do you have an RTX card?
Python 3.10.11 (you have 3.10.6) when starting your venv you need to make it 3.10.11
If you're following most of the instructions given here so far, you should be using anaconda and setting python with this step:
conda create -n genModelVideo python=3.10.11
so some version of 3.10
GTX980ti
This tells you everything you need to know:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu)
Python 3.10.11 (you have 3.10.6)
You need these torch versions: PyTorch 2.1.0+cu121
Did you try going through my video tutorial? https://www.youtube.com/watch?v=HMW9hVoQa0M&t=46s
Setup Instructions (Python 3.10.11, 4090, working on Windows): https://pastebin.com/YpqNSHFy
If you are still running into cuda issues after trying p13.txt take a read at this: https://pastebin.com/eSiVGGzA
Requirements:
Probably can run it if you have at least 6Gb of VRAM
Anaconda
Git
Generative-Models github
SVD or SVD_XT
Download Lin...
I have Anaconda and that is what I am using. I must be messing up somewhere. sorry
ok best thing to do is just to create a new virtual environment
I have this saved in my doc of notes. I planned to watch it. Maybe Ive been using too many pieces of diff install notes from people and Im on 5 streets lol
faster than trying to fix the one you have
okay cool how would I do that
? thank you in advance
are you activating afterwards? Open Anaconda
cd to user/generative-models
conda create -n genModelVideo python=3.10.11
conda activate genModelVideo
user/generative-models is wherever you installed generative-models
yeah it's important you make the environment with 3.10.11 like dragon said
for anyone who needs a different Torch version you can find the old ones here: https://pytorch.org/get-started/previous-versions/
(base) C:\Users\ocean\generative-models>conda create -n genModelVideo python=3.10.11
WARNING: A conda environment already exists at 'C:\Users\ocean\anaconda3\envs\genModelVideo'
Remove existing environment (y/[n])?
you probably need a different version of torch+cuda than I did, looks to me like you need:
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
ah silly me thinking I could use what everyone else needed lol
your error message about xformers explains it:
xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu)
Python 3.10.11 (you have 3.10.6)
that's where I was confused. between seeing what worked for others (you) and what the errors were telling me in plain English haha thank you
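Since that warning really does spell out the mismatch, here is a small sketch that pulls the versions out of it. The regexes are written against the exact wording pasted above and are an assumption about the format; other xformers versions may phrase it differently. The function name `parse_xformers_warning` is made up.

```python
import re


def parse_xformers_warning(text: str) -> dict:
    """Extract expected vs. installed versions from the xformers warning text."""
    result = {}
    torch_match = re.search(r"PyTorch (\S+) with CUDA (\d+) \(you have (\S+)\)", text)
    if torch_match:
        result["torch_expected"], result["cuda_build"], result["torch_found"] = torch_match.groups()
    python_match = re.search(r"Python (\S+) \(you have (\S+)\)", text)
    if python_match:
        result["python_expected"], result["python_found"] = python_match.groups()
    return result


warning = """xFormers was built for:
PyTorch 2.1.0+cu121 with CUDA 1201 (you have 2.1.0+cpu)
Python 3.10.11 (you have 3.10.6)"""

info = parse_xformers_warning(warning)
# A "+cpu" suffix on the installed torch means the CPU-only wheel got installed.
cpu_only = info.get("torch_found", "").endswith("+cpu")
print(info)
print("CPU-only torch installed:", cpu_only)
```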
so after creating a new venv I should follow your video?
basically for anyone with this issue I think you need to run nvcc --version inside your conda environment to see what the python in there needs
for many of us we borked the install already
and xformers will throw the error talking about how it was already built
sure. I explain creating the venv in the video too. But if your cuda version is different than mine you still might need to add in what dragon said. But it might work right away
it should be built with the same version as nvcc --version spits out. then you go to the old Torch version page to find torch builds that will work with your xformers
it looks like another custom node for ComfyUI was added recently, here is the link
https://github.com/thecooltechguy/ComfyUI-Stable-Video-Diffusion
@grim tangle Hey I got it! Thanks great tool!
Yeah that one separates the code into 3 nodes, it's a more comprehensive approach but also does not add anything currently
Things will probably change a lot once we get more tools Emad hinted at...
@grim tangle is the decode t frames at time = decoding_t? I see you set it to 1. Does that 1 = 1 in the streamlit web app?
yes
when I tried a low number in the web app it took forever, but in comfyUI it took like 5 mins
ok so if i increase it, it will be faster?
yeah that takes long
this is not really meant for that high res
it can work but even the motion is usually not right if you use anything but wide aspects
motion bucket to max gives fun stuff sometimes :d
I'll have to test that out. I had a little creature that did a little bob that looked good at 1024x1024, but i'll have to try something else
ok well 48 decoding didn't work in comfyUI lol
OOM
no you can't put that very high at all 😄
ComfyUI-SVD is far better on low vram
the decoding stage is not what takes long anyway
I did 48 in the web app and it worked
48 decode_t?
yeah
is there any web service where we can use stable video?
it's just how many frames are decoded at the same time of the batch
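To picture what that setting does: a minimal sketch of decoding a batch of frames a few at a time. `decode_fn` stands in for the real VAE decode and is hypothetical, as is the name `decode_in_chunks`; the point is only that a smaller `decoding_t` means more, smaller decode calls, so less is held in VRAM at once.

```python
from typing import Callable, List, Sequence


def decode_in_chunks(latents: Sequence, decode_fn: Callable[[Sequence], List], decoding_t: int) -> List:
    """Decode `latents` in slices of at most `decoding_t` frames and join the results."""
    if decoding_t < 1:
        raise ValueError("decoding_t must be at least 1")
    frames: List = []
    for start in range(0, len(latents), decoding_t):
        # Only this slice goes through the decoder in one call.
        frames.extend(decode_fn(latents[start:start + decoding_t]))
    return frames


# Stand-in decoder that records how many frames each call received.
chunk_sizes = []


def fake_decode(chunk):
    chunk_sizes.append(len(chunk))
    return [x * 2 for x in chunk]


decoded = decode_in_chunks(list(range(14)), fake_decode, decoding_t=4)
print(len(decoded), chunk_sizes)
```

The output frames are the same either way; only the peak memory and the number of decode calls change.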
I was generating more frames in the web app
ok let me try it again and increase the number of frames
I have 24Gb
it's pretty useless to generate frames above 14 for SVD and 25 for SVD-XT anyway, nothing useful gets generated
but yeah I don't get why it worked in the web app, I feel like 48 is equal to the one in comfyui. If I set the web app to 1 it will take like 5 hours
14 frames should take about 30 secs
ok
and 25 should take a minute on a 4090
ok I just did
conda activate genModelVideo
so should I do
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
?
is there a way to force stop a generation in comfyui?
yeah that looks right
for this node... not currently, besides just stopping whole thing from console (ctrl+c)
after you get the right cuda torch combo installed you'll need to make requirements/pt2.txt match
so edit the pt2.txt to match what I just installed?
respectively *
So I've always wanted to visit Pripyat. Now it's even more unrealistic. So i made myself a short nostalgic movie 🙂
yeah, update torch, torchvision torchaudio:
torch>=2.0.1+cu121
torchaudio>=2.0.2+cu121
torchvision>=0.15.2+cu121
although that's the step that could go bad, if conda decides to overwrite the torch versions, then just uninstall all like I mentioned above
then reinstall conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia again
after you install them and then do the requirements
you need to check that python is actually using the right ones
Here's a short clip reel of 12 results. I probably deleted 5 more that were terrible before compiling.
ah I see now. thank you. going to try this. ill do the python import etc. check too to make sure it all took
good luck! you can also just run the streamlit startup command too, if the xformers got bad torch it'll throw that same error
like i said it was a PITA to get conda to not install the wrong torch versions
ah good idea. that way itll give me the error read out and show me whats up. thank you again
I bet. youre a G.
wrestling that snake. props
Runway-esque clip time extension in ComfyUI, currently does it 4x, but you can just copy and paste those blocks to add more runs ❤️
It will generate an SDXL image (but you can just load one in too if you want) then do a SVD run, grab the final frame, do another run, repeat, then interpolates all the runs together at the end.
https://github.com/kijai/ComfyUI-SVD/ <-- SVD Node for ComfyUI
doing line 1 of 2 of these
is line 2 necessary in my case? what does "pip install ." do
pip install -r requirements/pt2.txt
pip install .
yeah the first time it goes through the requirements list to make sure it has all the right versions
then the second line it ... builds the cache stuff it needs in the pt2 folder
or something like that
ah that makes sense. thank you
I managed to figure it out for windows. FFMpeg should be specified as a requirement, as I had to butcher the code to figure that out.
i dunno how python works, only enough to be dangerous
same
YMMV
This look right?
torch>=2.1.0+cu121
torchaudio>=2.1.0+cu121
torchvision>=0.16.2+cu121
im about to run the pt2.txt
hmmm, maybe set them from >= to == to make sure it doesn't try to get cute and install some higher version
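A rough sketch of that ">= to ==" edit, in case doing it by hand is error-prone. The helper name pin_torch_requirements is made up for illustration, and the version numbers are just the ones from the conda command earlier in the thread; this only rewrites text, it does not install anything.

```python
import re

# Hypothetical helper: only the three torch packages are touched, everything else is left alone.
TORCH_LINE = re.compile(r"^(torch|torchvision|torchaudio)\s*>=\s*(\S+)$")


def pin_torch_requirements(text: str) -> str:
    """Rewrite 'torch>=X' style lines to 'torch==X' so pip cannot pick a newer build."""
    pinned = []
    for line in text.splitlines():
        match = TORCH_LINE.match(line.strip())
        if match:
            pinned.append(f"{match.group(1)}=={match.group(2)}")
        else:
            pinned.append(line)
    return "\n".join(pinned)


example = "torch>=2.1.0+cu121\ntorchaudio>=2.1.0+cu121\ntorchvision>=0.16.0+cu121\nnumpy>=1.24"
print(pin_torch_requirements(example))
```

The numpy line is there only to show that unrelated requirements pass through unchanged.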
It certainly gets my thumbs up compared to other video generators at the moment. I'll have to do some tests tomorrow to extend the videos with the same subject
Gobble gobble
I'm running this on a card with 12gb VRAM. idk what y'all are on with 40gb requirement
Only 1 s with 12gb vram?
I can crank that up, the sampling will probably just take longer. the VRAM can run out if the Decode t frames at a time param is set high
it got real cute
uninstalling and trying again lol
this is where it got cute btw
Using cached torch-2.1.1-cp310-cp310-win_amd64.whl (192.3 MB)
Installing collected packages: torch
What is the maximum time that can be achieved with 12 GB vram?
any way I can clear this cached torch?
(this is true for my comfy node too)
I managed to get to 4 seconds with 12FPS, because if I go above that the sampling time will be over 5 minutes
14 frames barely fits 10GB card yeah
Doesn't sound bad at all 4 seconds
also, will this eventually be natively supported for stuff like A1111? I'm not sure how efficient is the inference we're using for this
yup, that sucks. so what I had to do was something like: do the pip installs, then uninstall the torches, then install the correct torch with conda, then check that python was getting cuda torch and not cpu torch. I can't recall exactly whether I then ran streamlit and it worked or whether I also did pip install . first, but definitely do not rerun pip install -r requirements/pt2.txt once you think the torch is sorted out
I also recall at some point deleting the .pt2 folder to clear all the cached stuff out
okay cool this helps. I have stuff to try. thank you and Happy Turkey Day Dragon
yeah first batch of food is out of the instant pot, i'll be incognito for some time once the food coma hits me. good luck!
ok first it acted like I installed it successfully.
I double-checked everything was uninstalled: it gave me the three orange warnings saying those packages aren't installed.
(genModelVideo) C:\Users\ocean\generative-models>conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
Channels:
- pytorch
- nvidia
- defaults
Platform: win-64
Collecting package metadata (repodata.json): done
Solving environment: done
All requested packages already installed.
so then...I think it's all good, start streamlit UI and
2023-11-23 13
50.794 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\anaconda3\envs\genModelVideo\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 3, in <module>
from pytorch_lightning import seed_everything
File "C:\Users\ocean\anaconda3\envs\genModelVideo\lib\site-packages\pytorch_lightning\__init__.py", line 25, in <module>
from lightning_fabric.utilities.seed import seed_everything # noqa: E402
File "C:\Users\ocean\anaconda3\envs\genModelVideo\lib\site-packages\lightning_fabric\__init__.py", line 29, in <module>
from lightning_fabric.fabric import Fabric # noqa: E402
File "C:\Users\ocean\anaconda3\envs\genModelVideo\lib\site-packages\lightning_fabric\fabric.py", line 21, in <module>
import torch
ModuleNotFoundError: No module named 'torch'
makes no sense. I just executed
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
Are you in the venv when executing the UI script?
yes, afaik
(genModelVideo) C:\Users\ocean\generative-models>streamlit run video_sampling.py
something must be wrong with Conda then. I'm personally not using Conda and it works fine
I just did python -m venv venv, then venv/scripts/activate, installed requirements(and FFmpeg) and it works normally
Thank you for this... I will have to try again with only 8 frames
at what stage do you do this if you dont install with Anaconda?
wdym? just in the main directory, I made the venv and installed everything normally (except Triton, which I had to build from source because I'm on Windows)
I must be confused. apologies
What is imaginAIry AI? Is there one of these graphs for SVD?
I have a 3060 (12gb vram), how did you set this up?
I dont understand why it keeps giving this error. I have uninstalled 5 times now
2023-11-23 13:56:29.227 Uncaught app exception
Traceback (most recent call last):
File "C:\Users\ocean\anaconda3\envs\genModelVideo\lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "C:\Users\ocean\generative-models\video_sampling.py", line 5, in <module>
from scripts.demo.streamlit_helpers import *
File "C:\Users\ocean\generative-models\scripts\demo\streamlit_helpers.py", line 12, in <module>
import torchvision.transforms as TT
ModuleNotFoundError: No module named 'torchvision'
each time I uninstall and check it was uninstalled I run
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 cudatoolkit=11.8 -c pytorch -c nvidia
Then I run
pip install .
then
streamlit run video_sampling.py
and it gives that error. I have no idea what to do to fix it
I want you to do fine-tuning to stable video and have Realistic vision Video diffusion 🔥
using images I've made with SDXL technically makes this txt2img2vid, I wonder if that's going to be the workflow once this gets more adapted to
first attempt
i like how he's rollerskating away from the explosions 😂
bummer man. If you uninstall and then go to reinstall but it tells you it's already installed then something ain't right.
"All requested packages already installed." - this means it thinks it has torch already but then your error message says you definitely don't:
ModuleNotFoundError: No module named 'torch'
I'm not sure how to help at this point. It was confusing enough trying to troubleshoot this locally on my own system. But it definitely looks like you need to sort out how to get the correct torch versions showing from inside conda. After the reinstall you should be able to type python and then follow this simple code check:
`# Check GPU Availability
The easiest way to check if you have access to GPUs is to call torch.cuda.is_available(). If it returns True, it means the system has the Nvidia driver correctly installed.
import torch
torch.cuda.is_available()`
On my base system without torch installed (and a different python version) it spits out this:
Python 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
>>> torch.cuda.is_available()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'torch' is not defined
Inside my genModelVideo conda instance it instead says this once it was installed correctly:
(genModelVideo) F:\SDVideo\generative-models>python
Python 3.10.11 | packaged by Anaconda, Inc. | (main, May 16 2023, 00:55:32) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.cuda.is_available()
True
If you somehow manage to have the cpu torch installed then when you run the torch.cuda.is_available() bit it will instead return False. This is the simplest check I could find to see if Torch was even going to work at all when starting up streamlit.
You can exit the python CLI with CTRL+Z followed by Enter (on Windows), or by typing exit()
Another useful tool for troubleshooting all this is conda list as it will show you all the versions installed inside that environment.
Hopefully this new infodump is useful to you or someone else trying to get past torch install issues.
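The same check can be put in a small script so it does not crash when torch is missing. This is a minimal sketch: torch_status is a made-up name, and the three labels it returns are just shorthand for the three cases described above (no torch, cpu torch, cuda torch). Run it from inside the activated environment so it inspects the same interpreter streamlit will use.

```python
import importlib.util
import sys


def torch_status() -> str:
    """Return 'missing', 'cpu-only', or 'cuda' for the interpreter running this script."""
    # find_spec looks the package up without importing it, so a missing torch is not an error here.
    if importlib.util.find_spec("torch") is None:
        return "missing"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu-only"


if __name__ == "__main__":
    # Printing the interpreter path helps confirm the conda env (not the base install) is in use.
    print("python:", sys.executable)
    print("torch:", torch_status())
```

If the printed python path is not inside the genModelVideo environment, the packages were probably installed somewhere other than where streamlit is looking.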
FPS doesn't affect memory usage. only total frames generated
imaginairy is a tool to generate videos and images. https://github.com/brycedrennan/imaginAIry
If you install the proper version of torch first, I dare say it's the easiest way to get started making videos.
pip install "imaginairy==14.0.0b3"
The graph applies to all 4 models released by Stability AI (svd, svd_xt, svd_image_decoder, svd_xt_image_decoder)
So it’s like a webui for svd?
Technically true, though I find it all relevant when taking into account length of the animation and smoothness of the video.
If I want a 4 second clip at 8fps, that's different T than 4 second clip at 12fps, which correlates to higher memory usage
i wouldnt go that far since there is no ui 🙂 its a command line tool
far fewer features but I prefer the command line. its a lot easier to install than webui and easier to use for some.
yeah because of how limited resources are, I just always output at 6fps and then interpolate if I want to make it look nice
can anyone please share a comfy workflow so I can drag and drop in my comfy to try?
Interesting
I was talking about video length. if I go over T=24 the generation time triples itself due to my VRAM.
right. but also because svd_xt was only trained for 25 frames, if it goes over it doesnt do well. I did the rocket one for example at 40 frames and the rocket stops moving (but keeps exhausting). pretty funny. I'll find it right now
that's a really smart idea. what's your preferred interpolation software?
I didn't use the xt version yet. if I go 25 frames my video would be under 1 second long
I use the normal version with T=18 and I'm getting decent videos IMO
number of frames doesnt determine video length. the combination of number of frames and FPS determines length
so rocket at 40 frames, but 6 fps is almost 7 seconds
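The frames/FPS relationship being argued about here is plain arithmetic; a quick sketch with made-up function names:

```python
import math


def clip_seconds(frames: int, fps: float) -> float:
    """Playback length of a clip: frame count divided by frames per second."""
    return frames / fps


def frames_needed(seconds: float, fps: float) -> int:
    """How many frames (T) have to be generated to fill a given length at a given fps."""
    return math.ceil(seconds * fps)


print(round(clip_seconds(40, 6), 2))              # 6.67 (the 40-frame rocket at 6 fps)
print(frames_needed(4, 8), frames_needed(4, 12))  # 32 48 (same 4 seconds, different T)
```

So FPS by itself costs no memory, but picking a higher FPS for the same clip length means asking for more frames, which is where the memory goes.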
6fps would cause the video to look odd though
I think about 12fps is usually fine to make the most out of the range of frames the current models output
Could someone guide me on how to adjust these parameters to achieve better results?
For one, you're actually able to manage 14 frames decoded at a time?! I'm impressed.
yeah, but you can interpolate to make it look fine after
this video doesn't play does it
nope
The main thing I've noticed is for motion bucket id, the lower it is the less the movement. Anything over about 200 is a waste of time.
Here is same clip with same parameters, but done with motion bucket id set at: 1, 10, 50, 100, and 500
My king, thanks for this. If I can ask a question as a noob, can you tell me how I load an existing image to this work flow...😅
There is a local A100 GPU available for use.
I would expect that you can get good results by, say, generating 25 frames and then using the last 5 frames as the input for a subsequent generation
Temporal inpainting
yeah I've been wanting to try that. haven't had a chance yet
very nice thing to just have on hand haha
is there a comfy workflow that integrates the LCM lora and LCM sampler
I would like to quickly test the SVD on ComfyUI,lol
I am not a programmer or algorithm engineer, but thanks to GPT, I am able to execute some commands and write code.
Bunch of random tests 🤯
The fact that it's possible to make this stuff locally is just fricking crazy..
I got the SVDimg2vid node to work in ComfyUI
Looks good
It doesn't really work like that (yet?), but here:
LCM just speeds up making the initial input image but the SVD part requires its own model so no real speed gains.....
And here's an image version, should have the workflow embedded:
Thanks, Dragon 🤜🤛
Error occurred when executing SVDDecoder:
Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 15.94 GiB
Requested : 1.75 GiB
Device limit : 23.99 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction)
: 17179869184.00 GiB
how could I fix this I have a 4090 
close some tabs? :p
This is from ComfyUI or....?
really though you need to futz the numbers low enough to not cause SVD to ask for shared memory....
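One way to read the numbers in that error, worked through as arithmetic. The interpretation is an inference and not something anyone in the thread confirmed: the printed "PyTorch limit" converts to exactly 2**64 bytes, which looks like an "unlimited" placeholder and not a real cap, and the request would fit under the device limit if nothing else were holding memory.

```python
GIB = 1024 ** 3

allocated_gib = 15.94               # "Currently allocated"
requested_gib = 1.75                # "Requested"
device_limit_gib = 23.99            # "Device limit"
pytorch_limit_gib = 17179869184.00  # "PyTorch limit" exactly as printed

# The printed limit, converted to bytes, is exactly 2**64.
print(int(pytorch_limit_gib) * GIB == 2 ** 64)   # True

needed_gib = allocated_gib + requested_gib
print(round(needed_gib, 2))                      # 17.69
print(round(device_limit_gib - needed_gib, 2))   # 6.3
```

Since CUDA still reports 0 bytes free, roughly 6 GiB is apparently taken by something the "currently allocated" figure does not count, which fits the advice to close other things and lower the settings.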
yeah when I run the models with streamlit it knows how to jump to shared memory and everything just gets way slower but in Comfy the initial node settings gave me the same error. you can grab my spaghetti example to see my settings that are working on my 4090 but I basically got them from attashe's image on this discussion of memory issues: https://github.com/kijai/ComfyUI-SVD/issues/2
I dont even know what streamlit is so imagine my limited knowledge lol
here's my node settings:
streamlit is just a web GUI; it's how the official Stability AI generative-models repo runs its example to load the model
Oh i see
comfy has official SVD nodes built in now :D
cc @grim tangle ^
How do we update it
Here's comfy's official sample workflow for SVD img2vid in ComfyUI + Examples page: https://comfyanonymous.github.io/ComfyUI_examples/video/
Awesome, thank you!
if you're using Swarm, comfy auto-updates itself by default. If you're using a separate comfy install, there's an update folder with update_comfyui.bat inside. Alternately you can git pull in terminal
ps comfy said it works (slowly) on a GTX 1080 8 GIB
eh?
did you feed an actual image in? (the example.png in comfy is not a great init lol)
Currently testing it on a 1080 right now, will try on a 3080 a bit later on
put this project in ComfyUI/custom_nodes ?
problem was cfg was set at 2.5 by default
is there a new example.png for this workflow showing all the nodes hooked up? i ate too much turkey earlier to figure this out.....
you don't need custom nodes anymore, it's built-in to comfy now
just update your comfy install to latest
@severe moon
Qn: what was your base fps from SVD? and what interpolation settings and method did you do to get it from xx fps to 60fps ?
derp, thanks
Now i need to see how people are making such high quality videos
The shaking effect is from SVD? how can that be?
First try used TopazAI for the fps etc
it just applied its own motion based on seed, randomly
all videos from 6 fps, after that I interpolate to 60 fps with 2x slow-mo, and I can also add 2x upscale (if the video quality is low)
I believe if you add some motion blur to your input image, that will encourage the model to give you motion that matches the blur
added examples page to that pin https://comfyanonymous.github.io/ComfyUI_examples/video/
Can u share the original 6 fps video of the horse? also why 60 fps vs 30 fps? What software did u use for the interpolation? in the case of the horse, did u do 2x upscale?
the examples page explains this: VideoLinearCFGGuidance: This node improves sampling for these video models a bit, what it does is linearly scale the cfg across the different frames. In the above example the first frame will be cfg 2.5 (the cfg set in the sampler) the middle frame 1.75 and the last frame 1.0 (the min_cfg in the node).
(the video model prefers low CFGs, unlike regular image SD wanting super high ones)
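The per-frame values quoted from the examples page can be reproduced with a straight linear ramp. This is only a sketch of the schedule as described there, not ComfyUI's actual node code; linear_cfg_schedule is a made-up name, and 25 frames is chosen because it has an exact middle frame.

```python
def linear_cfg_schedule(cfg: float, min_cfg: float, num_frames: int) -> list[float]:
    """CFG for each frame, ramping linearly from cfg (first frame) to min_cfg (last frame)."""
    if num_frames == 1:
        return [cfg]
    step = (min_cfg - cfg) / (num_frames - 1)
    return [cfg + step * i for i in range(num_frames)]


schedule = linear_cfg_schedule(cfg=2.5, min_cfg=1.0, num_frames=25)
print(schedule[0], schedule[12], schedule[-1])  # 2.5 1.75 1.0
```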
im going to try to install imaginAIry rn, I might ask for your help if that is ok, also, I dont see any txt2vid in the commandline arguments, is it only img2vid?
TOPAZ Video AI
Thanks
Hopefully my 4090 doesnt blow up
thx
I just get big scary error messages once it hits the ksampler using this JSON:
TypeError: unsupported operand type(s) for *=: 'int' and 'NoneType'
Fantastic
I like how it serves as both a demonstration of the strengths and weaknesses
Yeah no text to vid yet. I should add that soon. Would be easy
will required computing power be about the same?
Whatever falai uses
What's the motion bucket id for this?
127
does motion bucket number matters?
Ive been using the same number
ty
so the lower the number the better it seems
lower seems to be less motion, more stable, higher increases chaos
do you think computing power would be the same? also, I opened a cmd line in A:\Stable-Video-Diffusion> and ran pip install "imaginairy==14.0.0b3", then pip install imaginairy, but I don't see anything in the directory. what am I doing wrong?
also havent installed my models yet so where do i put them when I do
last spaghetti guy was motion bucket around 40 and this one is 100, way more spaghetti:
So eventually will we be able to apply LoRAs to this process to control the camera direction? I'm keen to generate some zoom ins and I 'm guess that the best way to control that at the moment is with different seeds
Try adding motion blur like mcmonkey said
Of course the disadvantage is then you have motion blur
Someone create a video with these images please
Is there an updated version of this WF
@hazy gorge doesn't work very well for this one
Can not be downloaded 🙁
oh, I accidentally encoded as h265 so discord won't embed it, gimme one second
Anime and manga references don't work well with SVDs. Usually it's just a camera shift
updated comfy approach is pinned
@hazy gorge
this?
yes, the model should be easily trainable for custom motions n wotnot i think
mine did the same just looks like a slideshow animation
no the pin above that one lol
yeah, it does the same thing for oil paintings
Excellent!
yes this is one of the limits of the current research preview model, it has a heavy photorealism bias
Only works on people it seems
can you explain the "pin"?
^ that
it's the top pin
right above the one you screenshotted
works on all kinds of stuff
#▶|stable-video-diffusion message
I tried this WF and there were a lot of red nodes to install, but when I installed the missing nodes using manager, I didn't see any available nodes
I might be doing something wrong
Im also using the comfyui release
im using it rn
not important but you haven't updated for a minor fix 50 minutes ago jsyk
ohhhh
lol
(literally just a typo fix, SDV should've been SVD)
Must I use SVD with the corresponding SVD CKP
here's my workflow, it automatically adjusts resolution and also uses rife interpolation node and saves .mp4
will test now
I think you use the svd_xt
- with imaginairy the models are automatically downloaded.
- yes pip installing the software doesn't put any files in the current directory (except when it makes videos or images)
- pip install "imaginairy==14.0.0b3" is all you should run to install
- after installation you should run
aimg videogen --start-image your-image-here.png
here I fix the final framerate because in the original workflow I leave it at 12 even after interpolation so the result is in slow motion
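Why leaving the framerate at 12 after interpolation plays back in slow motion, as arithmetic. Assumption, labelled loudly: an interpolator with multiplier k adds k - 1 new frames between each neighbouring pair, so n frames become (n - 1) * k + 1. Actual nodes may count frames slightly differently, and the function names are made up.

```python
def interpolated_frame_count(frames: int, multiplier: int) -> int:
    # Assumed behaviour: (multiplier - 1) new frames between each neighbouring pair.
    return (frames - 1) * multiplier + 1


def playback_seconds(frames: int, fps: float) -> float:
    return frames / fps


base_frames, base_fps, multiplier = 25, 12, 2
out_frames = interpolated_frame_count(base_frames, multiplier)

print(out_frames)                                                    # 49
print(round(playback_seconds(base_frames, base_fps), 2))             # 2.08 original length
print(round(playback_seconds(out_frames, base_fps), 2))              # 4.08 fps left at 12: slow motion
print(round(playback_seconds(out_frames, base_fps * multiplier), 2)) # 2.04 fps scaled with the multiplier
```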
the text2image2video example is cool. Here I used realvisXL as the image part
i noticed realvisxl seems to create images that make good motion
(also I have it gen images at 1344x768 before feeding to the video model at its 1024x576, as the image model prefers 1mp)
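For getting from 1344x768 down to 1024x576, one option is scale-to-cover plus a centre crop, since the two aspect ratios are close but not identical. This is only one possible approach and not necessarily what the workflow above does; cover_resize_and_crop is a made-up name.

```python
def cover_resize_and_crop(src_w: int, src_h: int, dst_w: int = 1024, dst_h: int = 576):
    """Return (resized_size, crop_box) that fills dst_w x dst_h without stretching the image."""
    # Scale so that both dimensions are at least as large as the target.
    scale = max(dst_w / src_w, dst_h / src_h)
    new_w = max(dst_w, round(src_w * scale))
    new_h = max(dst_h, round(src_h * scale))
    # Centre the crop window inside the resized image.
    left = (new_w - dst_w) // 2
    top = (new_h - dst_h) // 2
    return (new_w, new_h), (left, top, left + dst_w, top + dst_h)


print(cover_resize_and_crop(1344, 768))  # ((1024, 585), (0, 4, 1024, 580))
```

The crop box uses the (left, top, right, bottom) convention, so it can be handed to an image library's crop call after resizing.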
(also I'm using VHS nodes to save the files instead of the webp thingy) (which i specify cause i just noticed it default named as 'animatediff' lol)
how you made it so long?
so simple but I love it
There are 4 videos with slow-mo
the over-motioned failures are entertaining to watch
need SVD-LCM to experiment faster lol
Or something like this 😆
sound on if you want to relax
so cool
comfy's official implementation is on another level 😄
fancy!
i find that if the face area is small, the face will be broken
and how can I save video instead of webp
use vhs video combine
thx
woah
what motion bucket?
Looks super cool
First time creating a music cover mv with AI. The dream of lovely movies is getting closer and closer
I managed to run it on a GTX1080 earlier today so yes
thanks
Hi can I ask questions about SVD install here or is there a better channel for this ?
What are those nodes referring to? 👀
you need to update ComfyUI
Shit that's always the step i'm forgetting............. Thanks to remind -_-' ahaha
I made a thing. SVD is next level. All of the assets in this vid are generated with A.I. — script, images, voiceover, animation, and background music.
standard anime fight
it is possible to extend your video using every last frame from previous one as an input image for repeating iteration of sampling 🙂
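The last-frame chaining idea as a loop. Everything here is hypothetical scaffolding: extend_clip is a made-up name, generate stands for whatever runs one SVD pass (image in, list of frames out), and fake_generate is a stand-in so the sketch runs without a model. It also assumes the first frame of each later run repeats its input image, so that frame is skipped.

```python
from typing import Callable, List, TypeVar

Frame = TypeVar("Frame")


def extend_clip(start_image: Frame, generate: Callable[[Frame], List[Frame]], runs: int) -> List[Frame]:
    """Chain several generations, feeding each run the last frame of the previous one."""
    frames: List[Frame] = []
    seed_image = start_image
    for _ in range(runs):
        clip = generate(seed_image)
        # Assumption: later runs start on a copy of their input image, so drop that duplicate.
        frames.extend(clip if not frames else clip[1:])
        seed_image = clip[-1]
    return frames


# Stand-in generator: "frames" are just integers counting up from the input.
def fake_generate(image: int) -> List[int]:
    return [image + i for i in range(4)]


print(extend_clip(0, fake_generate, runs=3))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

This only handles the stitching; keeping motion direction consistent between runs still comes down to seeds and parameters.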
I tried to implement your idea, but it takes a lot of luck to get the set direction of the camera movement to be the same.
yes, it requires some tries with parameters 🙂 reducing or increasing motion, augmentation etc.
Was this human one run in high vram mode on an A100? Could you share how you are able to get this human face not distorted over 4secs?
Again how are u getting the human faces not being distorted?
Usually, if the face is in the foreground, it won't collapse, but faces in the middleground will almost always be distorted.
What mode and gpu card are u running this on? vram mode = low and 4090 or a100?
This is the first time I've heard of these tips, I just use official repo with the default settings https://github.com/Stability-AI/generative-models
what card are you using?
4080
well gentlemen... let's get cooking
this is EXTRAORDINARY
meme -> proper input format -> profit
Omg my 4090 arrives in less than a month, I'm so ready to test this new insanity !!!
Does telling it what the camera movement is supposed to be work ?
Hey guys whats the best replicate link? Am such a noob lol
I want the videos to be maybe 4 seconds long or 3 its fine
some initial sampler comparisons (I know it should be compared across steps and schedulers too)
superb work. I was dreaming about good drift generation... now they are moving pictures....
10 steps only test
Is not much but testing the SVD models with 2D artifacts output some interesting results.
it look like an NFT ..
Personally I find failures as interesting as successes
btw, I was only able to run the SVD models using the official implementation from comfy;
Yep running that too, with a few changes
any feedback will be appreciated, I'm testing changing parameters to see how it compares to animediff
I implemented an auto size from the input image with potential downscale
and changed it to combine stuff into the VHS combine node
why do you cut the size in half?
Cause my source images tend to be pretty large
(some mistakes with the title there, sampler was euler)
1530x1920 is the current one
SO half the size makes sense
mmm... actually, i was wrong. i feed it 3mb but generate 1024x576
⚠️ Kinda creepy
lol my attempts are nothing vs yours, but I'm trying.
j
my PC after the release of SVD 😆
is there a way to prompt it? eg that a woman smiles or a plane starts…?
i use that too
What that button
Lets you create a video
Okay
Not right now, all you can prompt for is the input image
so everything was working perfectly, I restarted my computer and relaunched comfyui.
Everything uninstalled and reinstalled itself for some reason upon launch and now I get this error:
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
How do I make it try to reinstall? And why did it do all that in the first place?
Comfy won't launch at all. Any tips?
the amount of nsfw videos being generate in one blink of this eye...
⚠️ 🕷️ Spider
so besides "the regular" stuff, is there any secret sauce to add more stuff yet?
Hello all o/ noob questions: I am interested in learning and exploring animation with SD,
- Should I start by learning Deforum + controlnet plugin or choose Stable Video Diffusion
- Is it enough for learn and explore the basics by using the available google colabs, or will I need to run it and play with it locally with a powerful graphics card
1 - SVD is pretty basic, you feed it an existing image and then can change a few settings but it's brand new and not very feature heavy. Deforum has been around for way longer and has more to explore so maybe start with SVD and then branch off to Deforum later.
2 - If you are comfortable with colabs it's a great way to access way more powerful hardware (even better with Pro accounts). Free accounts might hit limits with what GPUs are available because so many people try accessing the big GPUs for free. Pro account should give you more freedom for choosing but of course cost money. There's really no need to go out and buy a powerful GPU until you have a better sense of why you would want to do so but if you already happen to have a decent rig then by all means start locally.
oh man, I've managed to avoid playing Division for a few weeks now, don't get me started lol :p
💀what are the chances
keep on the grind
I mean I think they did intro a new event this week called Black Friday so....... I will prolly cave today or tomorrow
then again messing with SVD is super fun and time consuming.....
Did another tutorial, which is an easier install than the Stable Video Diffusion streamlit. Here is the ComfyUI install: https://youtu.be/hoIobzZmNiM?si=4Cg-ntO0qEnrybKU
An easier way to generate videos using stable video diffusion models.
Stable Video Diffusion ComfyUI install:
Requirements:
ComfyUI: https://github.com/comfyanonymous/ComfyUI#installing
ComfyUI-Manager: https://github.com/ltdrdata/ComfyUI-Manager
SDXL: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main
you can use any...
yeah thank you, I was the one who replied to your videos.
LOL 1 click method lk that one
but my pc is not built like a super computer 💀
might work with 6Gb of VRAM
8gb of ram???
maybe I know bryce was asking people who had 6Gb of vram to try it
so it might work
might take a while, but it might work
yeah it works with 6gb GPU VRAM, but you can only do 4 frames
l heard it needed a gpu sm like that
8gbs or colab???
which is better
colab+ takes 3 secs to generate 💀
Anyone has demo sites?
yes you need a gpu. are you saying you have a free colab with it working? nice if so
well stick to colab then lol
How much vram does the gpus you usually get in free colab have, again?
its in colab but its has a web ui so its cool
30gbs l think
Apparently 24gb K80s? But that's still a lot better than local for everyone without a 3090
true l was also using openx l think its nother free gpu thingy
it ran 1 video in over 2 minutes
not bad , A100
bro is not in a edit 💀
epiccc
would you like sharing settings? I'm struggling to generate good outputs. https://streamable.com/cct2l1
I use SVD-XT to let me get up to 25 frames or more. I tend to set fps at 6 or 7 max. Produce a base video from SVD, then run it through flowframes to interpolate up to 24 or 48fps.
so, in essence, cheating a bit. SVD is really limited right now in output
got it, I will try that, the part that I have missing is the interpolation.
flowframes is free and works well even on computers without a gpu. I run interpolation up to 4x original fps and only takes like 5 min max per video
do you know if there is anything similar for Linux (Arch)? I heard about Topaz but that's windows and Mac.
I'm not aware, sorry.
Got it, I’ll figure it out and share later the outputs, interpolated.
dang. these are way good. I can only get it to pan the image. What's your setup?
Stuff like this - after a lot of work and experimentation. (ComfyUI)
I am using this space: https://huggingface.co/spaces/multimodalart/stable-video-diffusion
is 127 bucket ID the best? I played with it a bit, at 300, everything gets choppy, at lower levels, it doesn't seem so bad. Like 50 or so?
the lower the motion bucket, the less movement, past 200 it pretty much breaks. starting seed also leads to a lot of random movement choices between seeds but generally bucket around 40 or 50 feels less chaotic to me for photoreal subjects, and starting around 100 or so for painting/art leads to more movement and less tendency to just pan over the image
but it still all feels much more random, more art than science at this point
ooh. this is good info. thanks man.
yeah, i dont like that you cant even send in a prompt or anything, it would be nice to at least direct a tiny bit.
I've been toying with feeding SVD subjects with greenscreen backgrounds and then did a real quick and dirty comp with color keying in premiere:
When i try to load videos into flowframes from SVD i got an error. Do you use the WEBP file?
this is more or less my workflow too, make the SVD at 6fps and then into flowframes to smooth it out. can get a little smoothy looking....
haven't tried .webp - when I save my SVD output it's in MP4 format
I don't have a node to save in mp4, can you tell where to get or share your workflow?
are you using ComfyUI?
yes
this should have my workflow embedded in it
I'm just hooking the output up to VHS Video Combine node
it lets you choose output format
Doesn't work, i guess. I will try this node
I save as mp4 and have no problems
so I tested flowframe with webp input, it errors out and complains about input format being incompatible
dragging the vid doesn't work to make ComfyUI show the workflow? how about this?
That works. Thank you for sharing
found it, after doing some reading I'm using RIFE; https://arxiv.org/abs/2011.06294
Real-time video frame interpolation (VFI) is very useful in video processing, media players, and display devices. We propose RIFE, a Real-time Intermediate Flow Estimation algorithm for VFI. To realize a high-quality flow-based VFI method, RIFE uses a neural network named IFNet that can estimate the intermediate flows end-to-end with much faster...
gonna just... steal that one.
@sterile mesa sorry for not tagging you. check above.
Great work!
looking for good sampler settings for least steps used, here's a cfg comparison with just 12 steps
thank you
thank you indeed
sorry for my confusion, but not understanding the sampler use. Is that the image generation sampler? Or the SVD sampler? Because I don't recognize a plain EULER sampler.
in comfy we just use the standard sampler, so all the sampler options are available including euler
For real? Is LCM available?
it is, but not very useful with SVD from my findings, can't use the LCM loras
sounds like I need to get comfy up and running with all this now
seems easier than figuring out how to make this work as an extension for A1111 lol
can recommend, I even build a workflow to generate these grids, it's so versatile
is this an extension for comfyUI or a standalone app?
to me it looks like CFG 2.0 and sticking with Euler gets the best results.
Hi all. How to use stable video with interface on windows 11 locally?
An easier way to generate videos using stable video diffusion models.
Stable Video Diffusion ComfyUI install:
Requirements:
ComfyUI: https://github.com/comfyanonymous/ComfyUI#installing
ComfyUI-Manager: https://github.com/ltdrdata/ComfyUI-Manager
SDXL: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main
you can use any...
it's available as a node in comfy in this pack: https://github.com/Fannovel16/ComfyUI-Frame-Interpolation
thanks!
Thank you very much!
same, find any fix?
I saw someone else make this one, and realized that with editing it looped perfectly, so I couldn't resist doing said editing.
FYI: I tried to get it to always loop when generating. didnt work out for me.
https://x.com/bryced8/status/1728125709661663642?s=20
How hard would it be to make a modification on the model with a defined end frame as well, for it to move towards?
Probably pretty easy for a finetune or controlnet, once those exist. Essentially that's the same as interpolating though.
Well, I guess to define the style of movement.
Which is why I suspect my approach didn't work well: if the start and end points are the same, then interpolation is boring
I suppose, but the two are probably similar, and going to be a part of the same workflow
between generation and interpolation
maybe we just want a model trained in reverse
were you trying to loop one section of the image or the entire image?
the entire image
You need to give it 3 images to essentially interpolate, I think.
A start image, a midpoint image, then using the start again as an end
looping in one section is pretty cool too
To ensure some amount of movement happens, but it's undone
I'm sure people will figure out how to maximize this technology, just like they do with images.
So essentially, generate outwards like it does now, then interpolate back to the starting point.
You can use the pingpong effect with the video combine node in comfyUI to make it loop
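A rough sketch of what a pingpong loop does to the frame order, in case it helps picture it. This is only an illustration in plain Python, not the video combine node's actual code; the `pingpong` helper and the frame names are made up, and it assumes the reversed half skips the first and last frame so they are not shown twice.

```python
def pingpong(frames):
    """Play frames forward, then backward, without repeating the two endpoints."""
    if len(frames) < 3:
        # Nothing to mirror with fewer than three frames.
        return list(frames)
    # frames[-2:0:-1] walks from the second-to-last frame back to the second frame.
    return list(frames) + list(frames[-2:0:-1])


frames = ["f0", "f1", "f2", "f3"]
print(pingpong(frames))  # ['f0', 'f1', 'f2', 'f3', 'f2', 'f1']
```

Played on repeat, the last frame of the output leads straight back into `f0`, which is why the motion looks like it is "undone" rather than cut.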
Still trying to get my comfy UI set up for myself, but I'll have to try that.
aimg videogen --start-image dog.jpg
Traceback (most recent call last):
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\Scripts\aimg.exe\__main__.py", line 7, in <module>
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click_shell\core.py", line 164, in invoke
    ret = super(Shell, self).invoke(ctx)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\imaginairy\cli\videogen.py", line 78, in videogen_cmd
    generate_video(
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\imaginairy\video_sample.py", line 56, in generate_video
    torch.cuda.reset_peak_memory_stats()
  File "C:\Users\u\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\cuda\memory.py", line 307, in reset_peak_memory_stats
    return torch._C._cuda_resetPeakMemoryStats(device)
AttributeError: module 'torch._C' has no attribute '_cuda_resetPeakMemoryStats'
I dont understand, it doesnt appear the svd model cards are installed, also i get this ^
I'm working on improving this specific error right now. But basically it means you either need to install the proper torch version or buy a graphics card
This is the new error message:
```
"CUDA is not available. This will be verrrry slow or not work at all.\n"
"If you have a GPU, make sure you have CUDA installed and PyTorch is compiled with CUDA support.\n"
"Unfortunately, we cannot automatically install the proper version.\n\n"
"You can install the proper version by following these directions:\n"
"https://pytorch.org/get-started/locally/"
```
how do i install proper torch version? i have a 3060
follow instructions here: https://pytorch.org/get-started/locally/.
I'd really like that step to be automated but I haven't found a way yet
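For anyone unsure whether their installed torch is the CPU-only build, a quick check along these lines may help before reinstalling. It is a minimal sketch and not part of imaginairy; the `cuda_status` helper is a made-up name, and it only uses `torch.cuda.is_available()` and `torch.version.cuda`.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "cuda_build": None}
    import torch

    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        # CUDA version torch was compiled against; None on CPU-only builds.
        "cuda_build": torch.version.cuda,
    }


print(cuda_status())
```

If `cuda_build` comes back as `None`, torch was installed without CUDA support, and the selector on the PyTorch "get started" page gives the pip command for a CUDA build.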
I just installed the latest comfy portable from the main github and copied my favorite checkpoints over to test things out, it seems to have a newer torch/cuda combo than what I had and everything from comfy's SVD examples page worked first try
i think i already have python installed given I have idle and can use pip install commands, is what i need to download CUDA?
Has anyone had it work with a gtx 1080ti 11gb in ComfyUI?
@pastel storm if you just want to play around with the SVD, it's easier to use the comfyUI version
should work
oh, can you send me a tutorial on setting it up? ive always used automatic1111 for SD so I dont know how to use comfyUI or nodes
An easier way to generate videos using stable video diffusion models.
Stable Video Diffusion ComfyUI install:
Requirements:
ComfyUI: https://github.com/comfyanonymous/ComfyUI#installing
ComfyUI-Manager: https://github.com/ltdrdata/ComfyUI-Manager
SDXL: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main
you can use any...
thanks
one reason for this is incompatibility with "Advanced_FreeU" -custom node
Is it really easier to install comfyui? wont you still have to install the proper torch version? if not how do they get around that?
Comfy has really good documentation, so likely that makes it easier.
Actually, this turned out pretty nice! It took me a little while to find decent settings, but I think overall, not bad!
yeah fair enough
Interpolate that and slow it down and it'll be ace
hmm, got a good example? Definitely the hair is moving way too fast.
I honestly prefer the style of Automatic's UI, but with Comfy I've been able to get SDXL to work, and I haven't with Automatic, so I keep both for different purposes
Visually, and in regards to navigating the UI
Yeah I use both as well. Starting to use ComfyUI a bit more now than before though.
I've been meaning to go to comfy but am just too comfortable in A1111
Both have a place, I think.
Maybe Automatic will make some adaptations that make it better at some of the other stuff
A1111 is really good for img2img, and inpainting.
Hi, How do I generate Audio from the text. The voice pitch/synthesis will be based on audio sample. Which model I can use?
comfyui is very easy to install they have a one click installer. The node based system takes some time to get used to though. I use blender nodes and houdini so I am familiar with the node structure. I keep trying to use blender shortcuts in comfyUI 😆
but auto1111 has great extensions so I still prefer it
nvm figured it out
.
are the 14 and 25 frame model cards locked to that frame count or can I raise them and its just that the 25 frame would be best for going higher and 14 for going lower?
you can change them, but it will affect your results
how will it affect them other than the intended change?
or are you just saying that if i raise it the total length of the video is extended
and im out of memory
what fps was this at?
20
How do you interrupt a generation with Comfyui? I'm rusty
what does lossless do in comfyui?
ctrl c in cli
Ty. But i remember there's a more organic method
Anything past 25 frames for svdxt is going to break the animation path
can you elaborate on what this means
now i remember. You press "Q" and it shows the queue and the cancel button 😄
Svd is only trained to 14 frames, svdxt to 25 frames. So once you pass those points the animation is more likely to fall apart or look funky.
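To make the limits concrete, here is a small sketch of a pre-flight check on the frame count. The 14 and 25 figures are the ones quoted above; the `TRAINED_FRAMES` table, the model keys, and the `check_frame_count` helper are hypothetical names for illustration, not anything from ComfyUI.

```python
# Frame counts each checkpoint was trained on, as quoted in the chat above.
TRAINED_FRAMES = {"svd": 14, "svd_xt": 25}


def check_frame_count(model, video_frames):
    """Return (ok, limit): ok is False when video_frames exceeds the trained length."""
    limit = TRAINED_FRAMES[model]
    return video_frames <= limit, limit


for model, frames in [("svd", 14), ("svd_xt", 25), ("svd_xt", 40)]:
    ok, limit = check_frame_count(model, frames)
    note = "fine" if ok else f"past the trained {limit} frames, motion may fall apart"
    print(f"{model} @ {frames} frames: {note}")
```

Going over the limit is still allowed, it just means the extra frames are outside what the model saw in training.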
Have you guys found any good motion bucket id's?
ok
It isn't super noticeable but it is there. Here's an example, though this is interpolated and slowed so it's harder to notice.
Toward end of the clip, the cloud path shifts
is there a way to interpolate so i can extend my generation lengths?
what way do u do it
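On the length question, the arithmetic is roughly as follows. This is a sketch under an assumption: that an interpolator with multiplier `m` puts `m - 1` new frames between each pair of original frames. Actual nodes may count frames slightly differently, and both helper names are made up.

```python
def interpolated_frame_count(n_frames, multiplier):
    """Frames after inserting (multiplier - 1) new frames between each original pair."""
    if n_frames < 2:
        return n_frames
    return (n_frames - 1) * multiplier + 1


def duration_seconds(n_frames, fps):
    """Clip length when n_frames are played back at fps."""
    return n_frames / fps


original = 25                                   # frames from the generation
doubled = interpolated_frame_count(original, 2)
print(doubled)                                  # 49
print(round(duration_seconds(original, 12), 2)) # 2.08
print(round(duration_seconds(doubled, 12), 2))  # 4.08
```

So interpolating and keeping the same fps stretches the clip (slower motion), while interpolating and raising the fps keeps the length and smooths it.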
Just that the lower you go, the less movement there is. Range is 1 to 200.
Has anyone implemented prompts yet?
I see ty
Depends. I find that there is a good range for different types of shots. For close-ups and shots of people I like between 11-50. For long shots I like 100-500
Does anyone else notice that motion id seems to affect the depth of motion? And that the system is really good at depth and splitting the input image into layers for adding motion. I've noticed a higher id leans more towards background motion, where a low id can bring the motion to the foreground.
I am reading the research paper and found this interesting: under camera motion it has 3 different types: horizontal moving, zooming, and static.
Ok one is 47 motion (left) and the other one is 127 motion (right)
would be interesting to know how to be able to force it to do one of those 3 movements
there will be loras
There is some fun stuff to be had letting it degenerate like that and using it as a transition?
for sure
thinking of just making an "endless" workflow that uses some frame from previous gen
it would be random, but that can be fun 😄
like my cat...
The quality degrades with each end frame
I don't think I showed it here
unless upscaled
Stuff of nightmares
it does, was thinking of using like middle frame and then the blurred ending as the transition to next or something like that
could even be from new img
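The "endless" idea could be sketched like this: generate a segment, pick a frame from it (the middle one here, as suggested above), and feed that in as the next start image. Everything in this snippet is hypothetical; `fake_generate` is a stand-in that just produces labels, where a real workflow would call the SVD sampler.

```python
def middle_frame(frames):
    """Pick the middle frame of a segment as the seed for the next one."""
    return frames[len(frames) // 2]


def chain_segments(start_image, generate, n_segments, pick=middle_frame):
    """Generate n_segments clips, seeding each from a frame of the previous clip."""
    video = []
    image = start_image
    for _ in range(n_segments):
        frames = generate(image)
        video.extend(frames)
        image = pick(frames)
    return video


def fake_generate(image, n=4):
    # Stand-in for a real image-to-video call: returns n labelled "frames".
    return [f"{image}>{i}" for i in range(n)]


clip = chain_segments("cat", fake_generate, n_segments=2)
print(len(clip))  # 8
print(clip[4])    # cat>2>0
```

Seeding from the middle rather than the last frame is one way to dodge the quality loss mentioned above, since the later frames are the blurriest.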
what the actual hell
thats terrifying
you never saw a fast cat? 😄
would the animate diff motion loras work? https://github.com/guoyww/animatediff/
lol
nope
😦
they are training, or have trained their own, it was in the paper too afaik
when there's no camera movement, the bucket doesn't ruin it at all (or do anything really)
how do you export the grid from comfyUI?
o boy lol
enjoy 😛
not the most user friendly yet
success! I actually had that installed, uninstalled it and it is now generating, thanks
speaking of though, FreeU_V2 seems beneficial for SVD especially with low steps
interesting
couple of emmas
all of these were made with
video_frames == 25
motion_bucket_id == 127
augmentation_level == 0.01
fps == 12
steps == 30
cfg == 2.5
sampler_name == euler
scheduler == normal
and face swapped at the end with inswapper_128
is there a way to save the generated file from svd to a gif or mp4? it is saving to webp in the workflow i have and when i 'save' it externally as the .webp the movement is gone?
this one works just fine for me. It's called vhs_videocombine
ah ok thanks.. its funny i have seen a few work flows and downloaded a few things but this 'official' method in the comfyui official post is the only one i've got to work and it seems like many of them are using different nodes?
why is the gif not working
the best thing you can do is download the workflow, and install missing custom nodes. that will fill in the gaps for you.
yeah, i had a workflow with that module in it earlier, just replaced the saveanimatedwebp with videocombine, gif output now. thanks
works now
it's probably because comfyui doesn't have many video related nodes yet, this is the first... the animatediff community (me included) has been doing lots of custom nodes around video before