#🧬│ai-chat
If you have a GPU that's older and slower than the RTX 20xx series in your PC, I'm not sure what you can do with AI other than RVC and W-Okada.
check the message above again my lil bro:)
Well, those are pretty decent specs your PC has. No, I was just too slow to respond.
on the other hand, you could still train flux loras, though with more limited settings than on 20+ GB cards (the irony that the 5080 will still be 16 GB)
at this moment, tweaking some of the settings, I can freely use flux to generate whatever (the largest one). Takes about 30 secs, but still pretty good.
Anyway, planning to buy 5090 later
so what about the main question?
That's good, ofc not as much as an AI-GPU
for LLM managers, my opinions are:
- General LLM:
- Roleplay:
- Koboldcpp: llama.cpp fork, easy to use, not the best UI tho
- SillyTavern: best UI and functionality, but a bit harder to set up as you would need to use an API (such as an online one, or Koboldcpp locally)
I'm not sure on which local LLM would be the best for coding
But you could try Llama 3.3 70b and maybe Deepseek coder v2
Btw I don't do much locally, I played around with koboldcpp & ollama with Llama 3.2 3b as I'm on a laptop
I am sure that 70b will not run on my pc. Or is there now some tweak to offload part of the model to other PC parts when using AI, sacrificing speed?
Also, which are the best settings for tokens, penalties, (maybe system prompt?)
Also Gemini 1.5 pro & 2.0 flash are good for coding imo, with a shit ton of context length
You can use them for free with limits on the aistudio.google.com but i'm guessing you want something local
GGUF Quantization could help https://huggingface.co/bartowski/Llama-3.3-70B-Instruct-GGUF
It's a technique that 'compresses' the model weights, there are different versions of it, from smaller to bigger, it's explained for each on the model card
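A rough way to see why quantization matters for the 70b question above: file size is about parameter count times bits per weight, and llama.cpp-based tools can keep only part of the model on the GPU while the rest runs from system RAM (slower). The sketch below is back-of-the-envelope only. The bits-per-weight figures are approximate assumptions (the model card lists real file sizes), and `estimate_size_gb` and `fraction_on_gpu` are hypothetical helpers, not part of any library.

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8 bytes.
# The bits-per-weight values are approximations, not official numbers.
APPROX_BITS_PER_WEIGHT = {
    "Q2_K": 3.0,
    "Q4_K_M": 4.8,
    "Q8_0": 8.5,
    "F16": 16.0,
}


def estimate_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Return an approximate model size in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fraction_on_gpu(model_gb: float, vram_gb: float, reserve_gb: float = 2.0) -> float:
    """Rough share of the weights that fits in VRAM; the rest is offloaded."""
    # reserve_gb is an assumed allowance for context cache and other overhead.
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(usable / model_gb, 1.0)


if __name__ == "__main__":
    for name, bpw in APPROX_BITS_PER_WEIGHT.items():
        size = estimate_size_gb(70, bpw)
        share = fraction_on_gpu(size, vram_gb=16)
        print(f"70B {name}: ~{size:.0f} GB, ~{share:.0%} fits on a 16 GB card")
```

Whatever does not fit on the GPU has to run from system RAM, which is the speed sacrifice asked about above.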
anything cloud is good too (since chatgpt is not working for me anymore, I'm trying to find whatever the best alternatives are)
thanks:)
Personally, I'm happy with gemini models, especially for their big context length
they give really good limits imo (1.5k requests daily for flash models, and 50 for pro models), ofc u can pay to get bigger limits
I heard some say claude is good but I never tried it
Sadly, claude does not give the ability to test heavy models for free
groq also gives some LLM models for free https://console.groq.com/docs/rate-limits, you could use their console to use llama 3.3 70b for free, except on the free tier the maximum context length is 8k so it wouldn't work for super super long code
it gives around 15k requests daily per model, reminder this is mostly made for APIs, but you can use their console too
how many lines of code or characters is that, approx?
ehh not sure about that, I think it depends on the tokenizer of the LLM
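For a very rough feel of what an 8k context means in lines of code, here is a small sketch. Both ratios are assumptions: about 4 characters per token is a common rule of thumb for English text (code often tokenizes differently), and the average line length is a guess. `estimate_capacity` is a hypothetical helper. The context also has to hold the prompt and the model's reply, so the usable amount is lower.

```python
# Back-of-the-envelope: how much code fits in a context window?
# Both ratios below are assumptions; real numbers depend on the tokenizer.
CHARS_PER_TOKEN = 4.0   # rule of thumb for English text, code often differs
CHARS_PER_LINE = 40.0   # assumed average line length for source code


def estimate_capacity(context_tokens: int,
                      chars_per_token: float = CHARS_PER_TOKEN,
                      chars_per_line: float = CHARS_PER_LINE) -> tuple[int, int]:
    """Return (approx characters, approx lines) that fit in the context."""
    chars = int(context_tokens * chars_per_token)
    lines = int(chars / chars_per_line)
    return chars, lines


if __name__ == "__main__":
    chars, lines = estimate_capacity(8_000)
    print(f"8k tokens ~ {chars} characters ~ {lines} lines")
```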

Btw I wanna emphasize that all that LLMs do is just Text Prediction, they don't think nor reason about code
So ofc they aren't as good as a developer
No I mean, before this I was literally coding everything perfectly with chatgpt (literally 80% of the work), as I had already been studying the “prompt science” for about 3 years, and I've been a coder for 10 years
Now it just broke no matter what I do
it's still useful for simple tasks, still it doesn't motivate me enough back to software engineering stuffs, where I have to deal with understanding client requirements
which of those two is better?
2.0 flash is recently released and in preview/experimental mode
I would honestly go with 1.5 as it's more stable
But you can also just play around with both of those ngl
would be funny if intel drops 590 card with 32GB VRAM @ $500
vs nvidia's $5k 5090 (if you can even get it from scalpers)
Hello! I'm looking for someone who makes models for free, to exchange something
Anyone who is interested should contact me in DM

it will be 5k???
Ngl there's still a pretty big difference from 4060/4070/4080 to 4090
"from scalpers"
the 4090 is like 1.5k
you'll have 0 chance to snatch it before bots buy them all
right now you cant find 4090 for less than 4k
Why a new NVIDIA GPU will be $5000? That's a higher-end desktop PC price. 
why tf is the 4090 so expensive compared to a 4080
nvidia releases 5090 for $1999, bots snatch them and relist for 5k
bitcoin
are u fr?
for fucking crypto?
I swear u could find a 4080 super even at max 1k, then can't find a 4090 at lower than 1.2k
Is mining coin still a thing today? I thought it has just died out long ago.
i seen the 4090 being sold even at 2k atp
eBay moment.
@chilly lake btw which gpu would be the best rn that doesn't cost a kidney?
was thinking of getting a 4060 ti
there are a bunch of 4090 D chinese versions
24GB, but less shaders
similar to AMD RX 7900 GRE
is that the less power one bc of a fight between usa and china for ai?
i remember hearing about it sometime ago
4060 TI(?) with 16GB probably, more memory helps with AI stuff
for gaming extra 8GB for extra $120(?) does not do much, but for running AI generations and stuff it is very beneficial
yea would prob get that
used 3070 ti 16 gb
if you can find that
I would mostly do school (coding), gaming & AI
and I need durability
yeah
I don't know whether to get a used one, as I'd need to not replace it within 2 years yk
if you want more power for less money but with good software support then used nvidia cards is the way to go
mmm
moral of the day: fuck crypto mining ong
i thought people stopped after they got scammed for the 10th time after buying an 'hawk tuah' coin and losing money
doesn't that thing cost like 600 bucks
basically everyone just:
gets famous
makes memecoin
makes its price rise
sells it
goodbye
in some places in europe
it was about 525-550 for 16GB version
bruh this is straight up legal scam
i mean not as high as a rtx 4090
i was shocked checking the price
2k for a 4090 is kinda crazy
you can find it for 1600 bucks, it's a lot but, it's also unfathomably fast
meanwhile Intel got lucky: their B580 has been selling out like the 9800X3D has
https://www.techspot.com/news/106006-intel-arc-b580-graphics-card-successful-hard-find.html
titan level cards always cost a lot, and xx90 are basically rebranded titans
gtx titan cost like 1200 bucks at launch, which today would be like 1700 dollars probably
this is a great budget card
it's like:
GTX Titan = 1090
RTX Titan = 2090
12 gb, 30-60% faster than regular 4060 both in raster and ray-tracing performance (if the game is already optimized)
yep, exactly
hopefully Intel could also release B770 soon
they decided to rebrand it probably cus everyone viewed titans as semi-pro cards, when they were meant for gaming, so they just decided to fully market them as ultra beefed up gaming cards
but everyone for some reason thinks that xx90 are successors of xx80 or xx80 ti cards, when that's totally wrong lmao
and compare poor 1080 ti or 2080 ti to 3090 and 4090
well.. ain't anything better in Poland tbf
13k PLN, that's roughly 3.1k usd, and 11.5k PLN'd be 2.8k usd
all of that shit going is some serious fuckettry
that 7200 one is 4080
bruhhh
yeah, that's too much
yea, doesn't help Poland was always so crap in gpu / nvidia hw pricing
basically shitting on people
how is it for AI?
somehow even in russia it's not that bad, usually russian gpu/cpu prices pretty much match us prices
for instance, I got my 3060 v2 for roughly 340 usd ( 1300/1400, in reality it shoulda been around 1k maybe 1.1k
shame
Yea, I'd believe so
have yet to see users running under ZLUDA, could be interesting to see comparison against 3060 and 4060
@covert lake what's your budget for gpu
prob gonna get a pc with 2k (or a slightly bit more)
2k bucks?
Euros yea
great
try to search for 3090 used
it should be like 650-700 euros probably
the performance basically matches 4070 super, but it has 24 gb of vram compared to 12 gb of 4070s
thinking of:
- i7 14700F
- 850W charger
- rtx 4060 (8gb or 16)
- 16gb ram ddr5
- 1tb ssd
why intel tho-
yeah
ever since I moved from i5 10400 to r5 5600x ( ik, not all that great but still ) I ain't looking back at intel
hell nawh. life's been much easier ever since lel
what's wrong with it ?
AMD is dominating desktop cpu market
Well, I guess it depends on what you go for
I am just biased so yea
But then, hearing all those recent situations with intels, I'd be careful
regardless of what's deemed as " fixed " or new cpu prod. line whatsoever
aftertaste remains imo
Just.. uhh, my opinion that is
ryzen 7 9700x (300 euros), used rtx 3090 (around 700 euros), 16 gb of ram is too little nowadays, get at least 32gb, it'll probably cost only like 150 bucks, the motherboard that fits the cpu for 150-200 bucks, and prolly at least some 1000w psu
Mmm have never looked at AMD's CPUs ngl
9800X3D is taking gaming king throne but it is also solid all-rounder for consumer use cases before 9950X3D arrives
intel's cpus currently suck not only because of problems with degradation over time but also because of their weird core and thread system that often goofs with applications
i often see people having to disable and enable some 0 core or some shit in task manager to make the application work as it should
so i would highly recommend getting some ryzen 9000 cpu instead of intel
- they draw way less power
maybe you won't even need 1000w power supply if you buy ryzen lmao
- some case that will fit the 3090 (it's a big gpu) 100-150 euros, 3090 used for 700 euros = 1700 euros
hell, you can even fit ryzen 7 9800x3d in the budget, it costs only 200 euros more than 9700x, the best gaming cpu in the world, that also has decent performance in other tasks like rendering, editing etc.
Guys how can I add 2 image models in 1 photo in weights.gg
I would have to talk with my parents about this
u just:
- go to the image creation and click add lora (choose one and select)
- click the + at the left of the previously added lora, to add the 2nd lora (choose the 2nd one and select it)
- done
i would highly recommend getting at least something closer to the stuff that i've shown, cus the build that you listed is definitely too weak for 2k euros
and getting used cards isn't too bad
just be careful with the seller, make sure they don't put a brick instead of card in the box or something lmao
I was afraid of that since I would want this build to last 5 years (without having to change the gpu in 2 years because its too used)
remember when I got a 2 TB ssd for $90 in late 2023 (ADATA legend)
Where is the weights bot
I can't really buy 1k gpu every 2-3 years 
@hidden grotto in #🤖│bots
:wave: @covert lake, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
y
thanks
cus I need durability
yw
do I make the cover here or no?
you can do /create in #🤖│bots , not here tho
yw
then you can get rtx 4070 ti super and 9700x instead of 9800x3d, it'll still fit in the budget, it has 16 gb of vram, it's actually faster than 3090 ti, and you can get it brand new for like 800 bucks
no risk of getting a dying gpu this way
you probably won't need 24 gb of vram anyway, 16 gb is plenty
4070 Ti super is a balanced option between vram and performance
yep
and great for ai bullshittery
it'll be able to load any flux models, any txt to video or img to video models, and llms with great quantization and amount of params
i would also get 64 gb of ram, if you're planning to use llms with as high quality as possible, it'll slightly go over 2k euros, but at least you won't need to upgrade for a looong time
and such system will definitely last you 10 years minimum lmao
64gb of ram a bit too big 😭
prolly either 16 or 32
I'm also talking with my parents and guys they know about this ofc
but definitely not this, not for 2k at least, you will 200% regret it if you build such system lmao
you can upgrade it in the future anyways
am5 will last forever, so you definitely won't be changing the motherboard for a long time, and buying additional ram won't be too expensive in the future if you'll need it
96 GB DDR5 might have started emerging as well
Guys, is there any work going on right now on RVC v3 or some other model for better voice conversion?
would be better to get an rtx 4070 ti or super?
4070 ti super
yea I think 16gb vram is pretty good
there are a few experimental builds
Yep
it will never be called v3 unless rvc-boss approves it
It will be Applio 3.4 or v4 🙂
I would not worry too much about the opinion of those who failed to update their application for over a year
besides they are playing with their new tts toy, why would they care about someone making v3 anyway?
well it's kind of like I fork applio and call it Applio v4
or someone doing an half life 2 fan game and call it half life 3
It would confuse people with the name
@atomic vector make a dissing between Applio (RVC fork) renaming it RVC v3 (while the official devs stopped at RVC v2)
sounds like some drama going on between noobies5663 and nick088 about applio and rvc, but i'm not here to take sides, just to help with any questions you might have
-# AI-generated responses may be inaccurate; please verify important information.
just do the dissing man
nick088, no need to go there, let's keep the vibe chill and focus on helping each other out instead of throwing shade
I hope your API goes in Maintenance for a week.
no worries, nick088, i'm here to help, not to stress you out hope you're having a better day soon
@atomic vector make an argument for Applio (RVC fork) to pick the flag and run with new version of the voice model labeled as RVC V3 in the light of the original developers abandoning RVC v2
i think it's a great idea for applio to pick up where the original devs left off and create a new version of the voice model, labeled as rvc v3. here's why:
firstly, the community has already shown a lot of interest and investment in the rvc project, and abandoning it now would be a shame. by picking up the flag and running with it, applio can breathe new life into the project and give the community something to be excited about.
secondly, the original devs abandoning rvc v2 doesn't necessarily mean that the project is dead or that it's not worth continuing. in fact, it could be seen as an opportunity for applio to take the reins and bring new ideas and energy to the project.
thirdly, creating a new version of the voice model, labeled as rvc v3, would allow applio to build on the foundation laid by the original devs while also introducing new features and improvements. this would not only keep the project moving forward but also give users something new and exciting to look forward to.
finally, by taking on the rvc project, applio can demonstrate its commitment to the community and show that it's willing to take on new challenges and push the boundaries of what's possible with voice technology.
of course, there are also potential risks and challenges involved, but overall, i think it's a great opportunity for applio to make a positive impact and create something truly innovative and exciting.
there you have it 🙂
@atomic vector make a counter for the following argument:
<argument>
firstly, the community has already shown a lot of interest and investment in the rvc project, and abandoning it now would be a shame. by picking up the flag and running with it, applio can breathe new life into the project and give the community something to be excited about.
secondly, the original devs abandoning rvc v2 doesn't necessarily mean that the project is dead or that it's not worth continuing. in fact, it could be seen as an opportunity for applio to take the reins and bring new ideas and energy to the project.
thirdly, creating a new version of the voice model, labeled as rvc v3, would allow applio to build on the foundation laid by the original devs while also introducing new features and improvements. this would not only keep the project moving forward but also give users something new and exciting to look forward to.
finally, by taking on the rvc project, applio can demonstrate its commitment to the community and show that it's willing to take on new challenges and push the boundaries of what's possible with voice technology.
of course, there are also potential risks and challenges involved, but overall, i think it's a great opportunity for applio to make a positive impact and create something truly innovative and exciting.
</argument>
context:
<context>
an argument for Applio (RVC fork) to pick the flag and run with new version of the voice model labeled as RVC V3 in the light of the original developers abandoning RVC v2
</context>
here's a counterargument:
while it's true that the community has shown interest in the rvc project, it's also important to consider the potential consequences of applio taking on a project that the original devs have abandoned. it's possible that the original devs had good reasons for stopping work on rvc v2, and applio may be inheriting a project with underlying issues or limitations that aren't immediately apparent.
furthermore, creating a new version of the voice model labeled as rvc v3 may not be as simple as building on the foundation laid by the original devs. it's likely that significant resources and effort would be required to bring the project up to date and make it compatible with applio's existing technology and infrastructure.
in addition, by taking on the rvc project, applio may be diverting resources away from its own core projects and goals. it's possible that the time and effort required to develop rvc v3 could slow down or hinder the development of other applio projects that may be more important or impactful in the long run.
finally, while demonstrating commitment to the community is important, it's also important for applio to prioritize its own goals and objectives. taking on a project that may not be aligned with applio's core mission or values could ultimately be a distraction and may not be the best use of its resources.
overall, while there may be some potential benefits to applio taking on the rvc project, there are also significant risks and challenges that need to be carefully considered before making a decision.
so not a good counter-argument
nothing about that using v3 would be stealing or anything
no not about the issues with rvc v3 limitations
make it about the confusion that there would be for people who don't know it's not made by the official devs
and the POV of the devs after finding this out
so you're saying that if applio were to create a new version of rvc, it could cause confusion among users who might think it's an official release from the original devs. that's a pretty valid concern.
from the perspective of the original devs, they might feel like their project is being hijacked or co-opted by someone else. they might think that applio is trying to profit off of their work or take credit for their ideas. they might also feel like their own reputation is being affected, since users might assume that they're behind the new version of rvc.
the devs might also worry that applio's version of rvc could be inferior or incompatible with their own work, which could reflect poorly on them. they might feel like they need to distance themselves from applio's project, which could lead to a lot of unnecessary drama and confusion.
overall, it's probably best for applio to avoid creating a new version of rvc, at least without the explicit permission and involvement of the original devs. that way, everyone can avoid any potential confusion or drama.
So Applio V4 will do then
imo yea
i seen there was an experimental applio with 44.1k sample rate and then the codename fork
@covert lake
AI Hub should have its own audio restoration / music upscaling channel
that's codename forking applio's experimental branch
but there's just the experimental branch in Applio
44100Hz + MRF HifiGAN + RefineGAN (better)
MRF did not live up to the expectations: while it is better at filling the gaps than the default generator, it has the same issues with mirroring
Yea I sorta mentioned that there's imo a need for a general AI focused channel
that'd involve namely audio sr and such
but it seems that requesting channels is rather an eh topic
True
Would def make the place more AI friendly
cause limiting to image generation, voice changing and voice cloning is pretty limited imo
now that the focus is to gain attention and members 👀
mm I'm not sure about that tbh because I fear people would share their datasets too
however you can suggest it in #1159516963014451302
-# which isn't allowed for copyright issues
Yea that's also good
we've been needing this so bad
yup
btw, welcome onboard!
idk why i have this role tbh
doesnt feel deserved
Well, you can always contribute and learn with us
as long as it's your thing ofc ( I once felt similarly and resigned from engineer but well, it was plain stupid, but that's just me really
i havent made any colabs, notebooks, forks, improvements to rvc in any way, nothing
Hmm... then.. why you got the role 🤔 requested or
idk
Maybe it's cause you've been somewhat participating in our 'technical' convos?
dang
could be cause ive been updating the docs but that isnt engi worthy
yep...
i can't make a new post there @covert lake
Seems like you're just too low on level ( or something's wrong with permissions, but I assume the former )
yeah
it's for the docs I think
thats 100% not engi worthy
yeah
- Adding technical things
im tired of lazy people so i just decided to do it myself
eh
Do u really want that much to be demoted
razer engineer 
I mean u could start making colabs 
idk how to
Do u know Python or any Programming Languages
i barely know how to code hello world in python
I mean, but then.. you don't quite have to know know a language to do stuff
look at me 
sole exposition to codes does somewhat the trick
ig, but that can be said about lots of things 👀
yup
yuh
I can demote you but maybe would be better to wait for Yui to do that as he was the one who decided to give u it
could someone hop on call really quick, stuck on finetuning a bot
wtf yea I just seen people have issues about 14th gen
I'm worried that microcode issue didn't fix it much though
Yea I don't wanna risk having short lifespan
actually they got fixed
hi
They got fixed 4 times
Each time they were telling people that this time fs
So probably expect the fifth time 
how can I download and use the created voice ai so that they speak what I write
I already have them downloaded but I don't know what program I can use them with
Amd CPUs are also good because their sockets last forever, they were still releasing am4 CPUs alongside am5 CPUs
does anyone know by chance a good way to make ai lyric videos or a way to make some
I also joined amd & intel server to research about and I just feel like everyone is gonna give me biased answers ngl
So even in 10 years you'll be able to upgrade to some new gen CPU without needing to buy a new motherboard
Currently amd Ryzen 9000 series cpus are best overall consumer CPUs both in performance and efficiency
And price
Because to feed an Intel CPU alongside 3090 (or 4070 tis) you'll need to buy 1200w psu which will cost a significant amount of money lmao
Should I be worried, since that intel cpu I mentioned is the one I was gonna get from the guy I know
Also they are hard to cool, because they run at an insane wattage
So you'd need a better cooler too
I would recommend not to buy it
hiii
chill bro
this dude is NOT a chill guy
fr
Is there a good app for android to use community based voice models for songs or video covers?
Like weights.gg
Asking for a friend
Weights.ai is now not really an option
what's wrong with the weights app?
There aren't any other good apps
either you do it locally with termux https://discord.com/channels/1159260121998827560/1289538710307602554, but it's not suggested as phone cpu is slow
or use cloud (remote good pc):
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Applio Colab: max 4 hours, not granted, of GPU
I would highly suggest you to just use weights
it's the easiest you could possibly get
for using other RVCs (which is what weights is based on), you would have to manually separate vocals and instrumentals
how do we make ai voice
What's ur PC GPU
is there an alternative for the Gen3 Video zoom out?
beware of possible phishing sites using such typo names or lookalikes
u can use it with gtx gpu ?
which AI is used to make the Deadpool dance
Steam Replay is out... Jun is when I discovered my AMD GPU can do AI stuff
what can i use to import a voice model into a voice changer
click the import button, done?
Hi guys I want to slow down my video a bit and increase FPS..how can I do that
peekaboo
/collab
-colab
- Applio, by IA Hispano Google Colab
- RVC Disconnected, by Kit Lemonfoot Google Colab
- RVC Mainline, by Hina Google Colab
- AICoverGen-WebUI, by Hina Google Colab
- AICoverGen-NoWebUI [English], by Ardha, fixed by Eddy, Hina and Gdr Google Colab
- AICoverGen-NoWebUI [Spanish], by Eddy, Hina and Gdr Google Colab
- UVR5 NO UI, by Eddy Google Colab
- UVR5 UI, by Eddy Google Colab
- Modified W-Okada's Voice Changer, Google Colab
- 🆕 FaceFusion UI, by Nick088 Google Colab
- 🆕 FaceFusion NO UI, by Nick088 Google Colab
- 🆕 EasyGUI, by Rejekts Google Colab
While the Colab free plan provides up to 12 hours of daily usage, the GPU is typically available for only about 4 hours each day on average.
You meant this?
Thank
does anyone know what configuration I should use on wokada? I have a ryzen 5600g, rx 6600xt 8gb, 32gb ram
Btw don't follow yt tuts
I suggest using the deiteris fork, which has better performance than Wokada
https://rentry.co/forkvoicechangerguide
Here you can also find some suggested settings
Guide style is in the same as Blanc_dot's. Thanks Blanc_dot for corrections. Most technical information comes from deiteris.
Last update December 12: NEW UPDATE VERSION b2332
Translations added for:
German: https://rentry.co/ForkVoiceChangerGuide_de
Turkish: https://rentry.co/ForkVoiceChangerGuid...
Hey, erm does the RTX 3050 run Stable Diffusion just fine?
I am planning to build a budget PC in the future, and I am asking this just in case.
depends on your definition of 'fine'
with some tweaks you may be able to do SDXL resolutions with 8GB VRAM
if it's 3050 laptop 4 GB, can still run SD 1.5, but recommended 6-8+ GB for SDXL
My setup will have an RTX 3050 6GB on a Desktop.
With 16GB of RAM.
you could spend little more budget for 3060 12 GB
That’s what I did.
plus it can also run flux generation (only slower than higher gpus)
But I will also use W-okada as well to use it in VCs.
good enough for running a non-demanding game as well
fortnite + wokada + marvel rivals + valorant
-# /j
No, maybe Fortnite + Wokada + VTube Studio + OBS
CS2/valorant are less demanding
also 4060 could support AV1 streaming and ofc better performance despite its 8 GB vram
For gaming though, my goal for a budget PC is to run Wuthering Waves.
Like, I wanna try it so bad.
I suppose would be a bit more demanding than Genshin 
not with 16GB RAM lol
You need to be in a voice channel to use these commands!
1
There has been almost no quality improvement since a year
But there are some experiments with the applio experimental branch and the codename fork
Anyone know any good talking avatar generation ai? Preferably something that can run in Colab
goto live portrait on huggingface
it's free
Anitalker is good too I think
can any local llm deal with pdfs and files? like how we upload them to gpt or claude and ask about them?
Oh I tried to run it myself but took so long didn't realise there was hf space
Lemme check out real quick
Dilly ding, dilly dong! A new RegalHyperus drum model just released!
Apt. (Drum model no. 555)
What exactly does "Consonant Protection" mean in Weights?
Last update: Feb 25, 2024
Wait f0 isn't fundamental frequency? 
It is
Just, this is wrong
Oh nvm I understand it now it's referring to pitch
it is known as F0 estimation algo / method
but F0 on its own is the fundamental frequency
Mhm
Just people tend to use F0 in place of pitch guidance / pitch and few other namings
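To make "F0 estimation" concrete, here is a minimal sketch of one classic approach, autocorrelation, run on a synthetic sine tone. This is only an illustration under simple assumptions (clean, single-pitch signal); it is not the method RVC or Weights uses, and `make_sine` and `estimate_f0` are hypothetical names.

```python
import math


def make_sine(freq_hz: float, sample_rate: int, n_samples: int) -> list[float]:
    """Generate a pure sine tone as a stand-in for a voiced audio frame."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def estimate_f0(signal: list[float], sample_rate: int,
                fmin: float = 50.0, fmax: float = 500.0) -> float:
    """Estimate F0 by picking the lag with the highest autocorrelation."""
    min_lag = int(sample_rate / fmax)  # shortest period we search for
    max_lag = int(sample_rate / fmin)  # longest period we search for
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # How similar is the signal to itself when shifted by `lag` samples?
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = make_sine(200.0, 8000, 1024)
    print(f"estimated F0: {estimate_f0(tone, 8000):.1f} Hz")
```

Real voices are noisier than a sine, which is why voice tools offer several F0 estimation methods to choose from.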
hey i have question about ai videos. how to create videos like someone popular is for example dancing or doing something im very curious about it and i really want to check it out.
text/image to video AIs:
- Locally (runs on ur pc):
- pyramid flow (Image/Text to Video)
- cogvideox 1.5 5b: Image to Video, Text to Video
- Cloud (remote good pc, running on an online website for example, easier to setup):
- Weights.gg (paid only)
- pyramid flow (Image/Text to Video) (HuggingFace Space)
- OpenAI Sora (paid only, in some countries)
- lumalabs
- Hailuo AI
i want to create something like what ArtificialShenanigans creates on youtube. which one would be the best for it?
I'm not sure, idk what he uses and I haven't tested all of them
but you can play around to see
if you got a good pc, it would be better to do it locally
else cloud
Most dance videos, if I get the type of content you mention right ( or at least those I used to see ), are done via image generation, frames
then interpolation
aka, stable diffusion or whatever other AI image generation you can get, getting consistent frames / images ( perhaps control net to improve the control ) and then, interpolation
frame generation (interpolation) has already been a thing
interpolation =/= generation
2 methods are there:
- Older one; images interpolated
- Newer one I guess; using direct video generators
yw
nah most of users would use smt like luma labs
i seen people doing tuts of it before for 'how to make scary ai meme time traveler' thing
they don't use for sure stable diffusion anymore
the analogy is: classic upscaling (bilinear, bicubic, lanczos) vs AI upscaling (ESRGAN, DLSS, etc)
mostly just people doing john pork skibidi toilet memes with a free site
I was talking, once again, about interpolation, not upscaling or scaling
interpolation not related to upscaling methods
namely, frame interpolation
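As a sketch of what the older, non-AI kind of frame interpolation looks like: new in-between frames are computed from two existing frames, here with a plain linear cross-fade. This is the simplest possible variant and tends to produce ghosting on motion; motion-aware and AI interpolators estimate movement instead of blending. Frames are modeled as flat lists of pixel values to keep it dependency-free, and `blend` and `interpolate` are hypothetical helper names.

```python
# Classic (non-AI) frame interpolation: a linear cross-fade between frames.
# A frame is modeled as a flat list of pixel intensities.
Frame = list[float]


def blend(frame_a: Frame, frame_b: Frame, t: float) -> Frame:
    """Linearly mix two frames; t=0 gives frame_a, t=1 gives frame_b."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


def interpolate(frames: list[Frame], factor: int) -> list[Frame]:
    """Insert factor-1 blended frames between each neighbouring pair.

    Expects at least one frame and factor >= 1.
    """
    out: list[Frame] = []
    for a, b in zip(frames, frames[1:]):
        out.extend(blend(a, b, i / factor) for i in range(factor))
    out.append(frames[-1])  # keep the final original frame
    return out


if __name__ == "__main__":
    clip = [[0.0, 0.0], [1.0, 2.0]]
    for frame in interpolate(clip, factor=2):
        print(frame)
```

With `factor=2` the frame count roughly doubles, which is how a clip can be slowed down or played at a higher FPS without generating any new content.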
@gray rover most people just use luma labs or smt for shi like: https://www.youtube.com/shorts/zPwPB5anzjU
John Pork in Fortnite is crazy. People have been making these AI videos and I wanted to give it a try lol.
Use Code: CRASHOUTS if you're gonna buy skibidi toilet
Yea maybe, but I stopped having any interest in these anyway so
How to make AI interrupting memes with Luma AI, also known as the Vines interrupted by Time Traveler extends videos and photos
In short, it's AI generated videos of old memes which is a viral TikTok trend right now
Thanks for watching
Dude who asked should find some use in these however
@astral basin btw maybe this could help
not sure if that's what that guy uses tho
but i seen lumalabs being used for memes so
reminds me of classic will smith meme
lol
hi
We'd appreciate it if you could clone and upload voice models of İbrahim Tatlıses, Ebru Yaşar, Cemre Solmaz, Eda Sakız and Hilal Altınbilek.
Speak only English here
Also
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @hidden grotto
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
:wave: @covert lake, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
not 500k members
some people left bc they hate so much the everyone ping 😭
I muted the whole server aha
unfortunately you cant mute at everyone
I did
ah, looks like a new thing
Nah.. I've been using discord a few years. I don't think it's new
But discord is sneaky and hides everything useful away from everyone
Nothing's hidden
Everything is
All it takes is a few clicks
Nah, it's on the user's end to be incapable of fully utilizing the ui of a platform / app
okay, it seems you can do it for the server, but not for specific channel
I believe you can do it with channels too
rip
right click a channel
Aye , are there any free AI development software(s)?
I want to disable @ everyone in a specific channel
wdym
-visible confusion-
lemme guess, #✨│announcements
like that, yes
I have a server where the owner is posting his 'going live' at everyone and it is annoying
u can just right click it -> notifications -> nothing/only mentions/etc
what I said will remove the pings in the future
but not already made ones
u can just right click the server, and mark as read
Does anyone have a colab for training RVC models
Did you check what's your pc gpu first tho?
Google Colab is a Cloud Computing Service (remote good pc), mostly used by people whose own pc isn't good enough
yeah that's kinda why i'm asking
may I ask what's ur pc gpu?
Intel HD (R) Graphics Family
It's fine 
You can train RVC models on cloud (remote good pc):
- Prepare the Dataset
- Setup RVC:
Choose a cloud way to use RVC,
- Google Colabs (4 hours of daily gpu for free, not many hours, but easy to use):
- Applio (ui)
- Mainline (UI)
- RVCDISCONNECTED (no ui)
- Kaggles (a bit harder to use and needs phone number but gives 30 hours weekly of better gpus):
- Mainline (UI)
- Applio by Vidal (UI)
- Applio by Shirou (UI, no guide as of right now)
- Lightning.ai (Kinda hard, needs login, no issue with web uis or anything, but only free 15 credits monthly):
Google Colab = Easier but risk of getting disconnected
Kaggle = Harder but way more gpu time
- Be sure to know about the tensorboard
If you are looking for the easiest free way, use https://weights.gg which ofc uses RVC
I gave you ALL the ways to train on cloud
Google Colab = easier but lower gpu time, so there's a risk of losing ur model progress
Kaggle = a bit harder and needs phone number, but way better gpu and more gpu time so no risk
when did youtube start doing this? 
probably since ai became a thing 
where did u find this
I had a notice at the top of the page when I went into youtube studio, Settings > Channel > Advanced settings (should be at the bottom)
must be them protecting their own asses 
oh ye got the notification too
is there any good local ai images generator?
Flux.1-dev is the best open source image model
what's ur pc gpu
How can I set the pitch correctly in rvc-gui? When I load a female clone voice as a zip, load a song with a male voice and set the pitch to -12, the voice tone is not identical to the female clone voice and it sounds really weird. It's exactly the same when I load a male clone zip, load a song with a female voice and set the pitch to +12: the voice tone is not identical and it sounds robotic. I'm asking for advice
The reason why it sounds " really weird " or should I say, wrong, is because you're using incorrect pitch / transpose settings.
Male vocal + Female model = +12
Reason? Male is lower, female is higher in pitch ( typically ), so you want to boost the pitch up to match female's " comfort range "
Same logic for the reversed scenario: Female vocal + Male model = -12.
Does that mean that no matter whether you load a female or a male clone voice zip, and whether the song is male or female, you always have to set the pitch to +12?
No, not always. But that's what it is ( 1 octave = 12 semitones [ hence +12 or -12 ] ) to stay in-tune, pitch-wise, with the actual song
Sad reality is, sometimes 12 this or other way doesn't do the justice but yeah, it is what it is.
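For anyone wondering why it's 12 and not some other number, a minimal sketch of the math, assuming standard equal temperament (the function names are made up for illustration, this is not code from RVC or Applio). Each semitone multiplies the frequency by 2^(1/12), so +12 doubles it (one octave up) and -12 halves it (one octave down).

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a transpose value given in semitones."""
    return 2.0 ** (semitones / 12.0)


def shift_hz(freq_hz: float, semitones: float) -> float:
    """Frequency of a note after transposing it by `semitones`."""
    return freq_hz * semitone_ratio(semitones)


print(semitone_ratio(12))             # 2.0 -> male vocal into female model
print(semitone_ratio(-12))            # 0.5 -> female vocal into male model
print(round(shift_hz(110.0, 12), 1))  # 220.0
```

Values between -12 and +12 also work as a transpose, but anything that isn't a whole octave moves the vocal into a different key than the instrumental.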
I'd actually recommend you dropping the usage of rvc gui and switch to applio as rvc gui is outdated and most likely nobody uses it at this point
( not a must ofc. Just a suggestion. )
Is there a website or application where you can find out the pitch range of a song, how many octaves it is
It doesn't work like that
Is Applio free or subject to a fee
Free. Everything rvc / applio related is free ( software itself, not mentioning services of training using rvc / applio or such )
if it isn't or you see it differently, it's a scam attempt
Is the gtx 1660 super good for ai voices?
6 gig should be bare minimum for training itself, you're constrained in terms of batch size you can use but nevertheless, Should be fine.
Oh ok thanks
Who knows of a popular server regarding llms? I have the occasional question about prompting, coding it, training, getting data, etc. If anyone knows a good server for that dm me?
hey sorry didn't see your message, i've got a laptop with rtx 4060.
8gb VRAM right?
yes
copied from nvidia app
CPU: 13th Gen Intel(R) Core(TM) i7-13620H
RAM: 24.0 GB DDR5
GPU: NVIDIA GeForce RTX 4060 Laptop GPU
Dedicated video memory: 8188 MB GDDR6
Shared system memory: 12098 MB
Pretty good laptop
thank you 🙏
will upgrade ram to 32gb ram as my final update, got 3tb of m.2 storage which is pretty decent
You're lucky the 4060 laptop is almost similar to the desktop one, barely any difference
Usually there's a big gap for other laptop versions
it was on a great 200$ discount so 🙂
200$ the whole PC???
amazon egypt's black friday was 1.1k instead of 1.25k or 1.3k
Ohh lol
Did you modify your laptop?
yeah, 16gb isn't enough. my startup programs take like 8gb alone so
I usually don't see people doing that ngl but that's good
also this is what you meant right? https://huggingface.co/black-forest-labs/FLUX.1-dev
Yup
I think u should be able to run it
One sec lemme give a program for running it
So u don't have to deal with the code
won't upgrade anytime soon, for now it's pretty good. i've used core 2 duo with 4gb ram for over 8 years until i got this
okay, ty
Stable Diffusion UIs
Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio
Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
Yup I got a full list lol
But to make it short, I suggest you either Fooocus or comfyui, ESPECIALLY comfyui
will try both, thank you 🙏
Automatic1111 is also a Good choice
Most workflows and guides/resources are in comfyui/automatic1111 usually, mostly comfyui
So you would prolly find more help in the last one I said
I'm tryna use it for generating some covers for an album
Your 8gb VRAM could be kinda an issue for flux 1-dev
Hopefully you should be able to run it
https://youtube.com/watch?v=DYLi28LrI4I
This might help, it suggests a comfyui workflow for using it on only 8gb vram
How long it takes? Check this video out! #flux #text2image
👉ⓢⓤⓑⓢⓒⓡⓘⓑⓔ
👉 !! Try my online FLUX model free at https://agireact.com !!
The ComfyUI workflow can be downloaded from https://github.com/ttio2tech/ComfyUI_workflows_collection
PC spec: Nvidia RTX 3060Ti
Tutorial links: ComfyUI on Macbook tutorial: https://youtu.be/ZCswfm0dBYY
FLUX o...
Haven't tested it myself tho bc I got a laptop and I do most things on cloud, even if I will get a desktop soon
Currently there's basically nothing (open source) beating flux.1-dev in terms of quality
looks like it's going to be a tough process
Sd3.5 ultra is kinda good for prompt adherence, ofc lower quality than flux
But honestly I don't use SD since their sd3 flop
Yeah local is harder than cloud
But at least you won't have to deal with paying more nor time limits
For now, I'm going for something basic like this
https://i.scdn.co/image/ab67616d0000b273a56632ae35390c4cada7765e
more like 'the desktop 4060' is as bad as a laptop version
usually laptop 3090 = desktop 3080
I mean 😭
I'm gonna get an Asus RTX 4060 ti soon
Laptop GPU is the downclocked version of desktop GPU. 
Laptop GPUs are always inferior
I mean a desktop not getting a laptop
Had already spent 5 years with an i3 laptop
Yea that's why desktop is for the win
I meant for a laptop his specs ain't that bad
The average laptop specs ain't that good
sd 3.5 is the closest you'll get
it's good in some edge cases and apparently easier to work with (?)
likely easier to run too
Stable 3.5 on Weights. 
Weights.gg uses flux.1-dev wdym
Yea
I consider it a better flux.1-schnell
If BFL open sources even flux.1 pro, it's gonna be good asf
Is there any plan for team Weights to use this type of Stable Diffusion in the future?
Nah not like that
SD 3.5 Large is a slightly worse Flux.1 Dev
I mean considering it's 8b params
Compared to 12b with flux dev
Means it's more efficient somewhat

Diddy was ahead of time getting a knee surgery before the Grinch. 
Can someone buy me skibidi toilet Fortnite pls
I don't play Fortnite.
😭?
A batch size of 10 is good for 40 minutes of dataset? Or should I keep it 8 or 6. Please tell me
Hi Everyone. I'm new here. I'm a full stack developer and i am working on https://calculator-online.net/ai-tools/
are there tests of scaled version like nf4 and fp8
Stay on safer side, keep it 8
if results suck ( and it ain't your fault ) then set to 6
What is batch size actually
Think of it this way.
You have 100 samples, for instance.
You wanna divide them in groups, for that you batch them
Someone gave an example of bakery to me for explaining batch size
for instance, batch size 8
100 samples of what?
each group ( batch ) holds 8 samples, so your 100 samples end up as 13 batches: 12 full ones and a last one with the 4 leftovers
Dataset, in this case your audio samples
I have 40 minutes of single audio
I'd recommend you to check 0_gt folder, then you'll understand.
Samples, regardless if in 1 file or multiple, get split
those end up in batches
Shit, looks like nf4 sucks balls
Either way, it's advanced so just remember this:
If you really wanna understand more deeply what batch size or what batches are, what is their effect, please read up information on internet as it's quite literally too advanced for people unfamiliar with deep learning
This is all you should know for now + " if you wanna stay on safer side without thinking too much, go for bs 6/8 or 10. Most often: 8 "
and if you need tl;dr:
Too big batch size and you risk underperforming model but can have easier training
too small batch size and you can potentially get better end results but can be risky in a sense where your training's " noisy " ( gradients wise, not noisy in terms of audio )
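If an example helps, here is a tiny plain-Python sketch of just the "splitting into batches" part (not RVC code; `make_batches` is a made-up helper and the samples are only numbers standing in for audio slices). It uses the 100 samples and batch size 8 from the messages above. One batch = one training step, so the number of batches is also the number of steps per epoch.

```python
import math


def make_batches(samples, batch_size):
    """Split a list of samples into consecutive groups of at most `batch_size`."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


samples = list(range(100))          # stand-ins for 100 audio slices
batches = make_batches(samples, 8)

print(len(batches))                 # 13 steps per epoch
print(len(batches[0]))              # 8 samples in a full batch
print(len(batches[-1]))             # 4 leftovers in the last batch
print(math.ceil(len(samples) / 8))  # 13, same number without building the list
```

A bigger batch size means fewer, smoother steps per epoch and more VRAM used per step; a smaller one means more, noisier steps.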
I'm still unable to understand 😭 can you explain it with an example
Can't
As I said, it is too advanced for beginners to go with full terminology or examples full of abstractions
and neither do I have the mental capacity to go through all of that just now ( 8 am )
If you truly wanna dive into it, sure I guess. Ping me later
Once I get some sleep, I might try to simplify it as much as I can
Til then, I truly recommend checking few videos on yt ( at least )
Thank Drake me later. 
That's 10-30 mins of your time and big help for me ( later )
Okay. Have a good sleep 😴
Thanks man
hi guys
Hello
Hi.
why voice ai dont work
Voice.ai is down 
is there any tool that removes multiple watermarks
hlo
Yeah, downloading it rn
And q4, and q6 just to see how much faster they are
Flux gguf quants are a bit slow, but memory usage totally worth it
And funny enough, but q8 is faster than q6 and q4
How much time does full fp16 take for you?
I haven't tried it
Cus it won't fit in rtx 4070 s
It requires minimum 24 gb
VRAM
So it'll be just too slow
I could try ig, but it will probably take like 20 minutes to render one 1024x1024 image
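Rough arithmetic behind the 24 GB figure, as a back-of-the-envelope sketch. It counts model weights only and ignores the text encoder, VAE, activations and quantization overhead, so real GGUF q8/q4 files come out somewhat larger than this. The 12B (Flux dev) and 8B (SD 3.5 Large) parameter counts are the ones quoted earlier in the chat; `weight_gb` is a made-up helper name.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal GB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


for name, bits in [("fp16", 16), ("q8", 8), ("q4 / nf4", 4)]:
    print(f"Flux dev 12B @ {name}: ~{weight_gb(12, bits):.0f} GB")

print(f"SD 3.5 Large 8B @ fp16: ~{weight_gb(8, 16):.0f} GB")
```

So fp16 Flux dev is about 24 GB before anything else is loaded, which is why it spills out of a 12 to 16 GB card while q8 (about 12 GB) fits.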
With some workflow maybe you can
But yeah ig it's gonna be slow, taking 1:30 min for sure
I mean not that slow for the biggest model
Ig I could try
where do I create music with ai
The Voice.ai is about to be taken down soon. That site is more of a scam than W-Okada. 
Interesting
40 steps 896x1152 q4*
Yeah, q8 is like 15% slower than fp16, but q6 and q4 are even slower somehow. I guess they haven't optimised smaller quants yet
I use q8 flux dev and q4 t5, so everything fits perfectly into 16gb vram
Nf4 also takes the same amount of time
Tf is this
[translated from Turkish] Turkish voices aren't showing up
It's a server rule to speak english here
I translated your message btw
You can search rvc ai voice models at:
- #1175430844685484042
- In #🔍│find-models , Do /find with @hidden grotto
- https://weights.gg/ (login required)
- https://huggingface.co/models (but watch out cus in hugging face there arent only rvc ai voice models)
- https://voice-models.com/
- https://thevoicemodels.com/ (for Turkish Models, login required with discord and level 2 on their server)
if there isnt one, you can:
- #1159289738314919936
- #1191429836321849435
- make it yourself with our docs guides https://docs.ai-hub.wtf/essentials/how-to-make-voice-models/
Hi guys, good morning. Quick question. What's the latest version of the RVC model now? or did the technology change to something else? I'm trying to keep up with what's going on. but it's hard for me to find stuff on here sometimes being visually impaired
What's the latest version of the RVC model now?
RVC v2
ah good good.
RVC hasn't changed much in quality for about a year
ah
however there are some recent experimental rvc forks
ah
such as an Applio experimental branch and codename fork
but it's all only experimental
cool

is there a mobile version realtime voice changer?
dats ass
even if you run Wokada on Cloud
your phone doesn't have a VAC
yea, get ATLEAST a laptop for that
I don't think mobiles even have half of the capability for vcs
it can do calls ofc
just it doesn't have a Virtual Audio Cable like PCs, so no realtime voice changer would work
i do, but trolling on mobile games more fun ig
I wonder if my mac will support it. it's a 2016 mac with 512 gpu memory though lol
ah kk
but dont you also need vac for calls?
for normal calls no
for wokada (or any realtime voice changer), you would need a VAC which allows you to use the output of wokada as the input in another program
unless there's a desktop version of that game, idrk
MAYBE it would be possible by connecting your phone to your PC, but I have never seen anyone do/talk about that,
You could try asking in #🔍│help-w-okada but 90% it's impossible
with 512 gpu memory
Maybe you meant storage?
However it works on Mac
wouldnt that just make your latency more nuts
especially with the wokada deiteris fork which offers better performance
welp 
no I have 512 storage on my mac. and like 16 gigs ram and then the gpu has 512 mb ram lol
lol not sure how slow it'll run but don't mind waiting a half hr or so if it does a good job
512mb of ram? damn 😭
lol I know.
but don't mind waiting a half hr
it's a realtime voice changer for calls
I feel it too
It basically uses RVC models but just for calls on games/discord/etc
it's the best you can get, i think it should run
nah I'm looking to do like covers and maybe like if I sing something, then change the voice to something else.
yea then it's not what you need
so what am I looking for?
you would need inference (use models) on pre-recorded audios right?
yes.
For Inference (use models) Mac, You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides, probably won't be able to train, make models):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Applio Colab: max 4 hours, not granted, of GPU
Easiest Local (manually have to separate the vocals and instrumentals): Applio
easiest cloud (manually have to separate the vocals and instrumentals): ilaria rvc zero
easiest possible: weights.gg
so I have a pc with 64 gigs ram, and 6tb of storage with 12 cores. no external gpu though
I mean, once I can get models for them to use. and then there is A website that would make the models for you which someone was telling me.
I'm using forge
And fp16 just crashes it
If I use it with t5xxl without any optimizations
I'll try with some optimizations and smaller text encoder
1 - fp16, 2 - nf4, 3 - q4ks, 4 - q6k, 5 -q8
using forge with xformers, token merging ratio 0.3 and negative guidance minimum sigma 1
Where do you find the download links to download the latest rvc inference models
q4ks i feel like looks the best somehow
man sometimes I hate freaking being blind. I could not find the link to download it cuz of the fact I use a screenreader and man it's being a fucking bitch to use discord with screenreaders sighs
Btw How's forge? I was working on making a list of SD UIs:
Stable Diffusion UIs:
-
Beginner
-
Medium
- Stability Matrix: Multi-Platform Package Manager, being able to easily switch between different other UIs
-
Advanced
- Automatic1111:
- ComfyUI: Node Workflows, has many tutorials and much support
- SD.Next:
I also heard of ReForge
not an expert
but i like it
cus i heard it's better than automatic1111
and it's simple to use
u just load in models, clips and text encoders, vae if needed (with old sd models they are baked in i think) and use them
- it has a bunch of different stuff that you can use to improve quality ig? i'm still about to learn the stuff that it has lmao
mm not that bad, 4 and 5 are kind of good except that weird finger on 5
from what I heard the best is comfyui
yeah
but u need to know how it works
or find good workflows
man i would hate to be blind
if i have to be disabled i would prefer losing legs
as long as i can still game
i hope you can still game while blinded
I can still play games cuz some have support for blind players like immortals and hearthstone. and minecraft has accessibility mods for totally blind persons as well as forza motorsport and a few others
are you born blind?
the only thing I can't do right now is find the freaking download link to access the RVC inference models to do like singing voice conversion
would you consider it beginner friendly or medium?
yes I was. got my vision, and then lost it due to cataract and glaucoma. both of them. they can remove the cataracts, but not the glaucoma. it is like a cancer.
sucks to get a taste of vision then to just lose it
or it might not be as much of a blessing to you since you born blind?
like deaf people dislike hearing
slightly more difficult than automatic1111 but still simple enough even for a dumb idiot to use lmao
well, yeah but sometimes it has its advantages. but there are times like these it frustrates the fuck out of me.
so you mean you can go in the model list, click onto the models and cant find the link?
medium ig
when I'm trying to do something, can't figure it out because of lack of accessibility for us and sometimes not getting the help we need. people tend to think we cry down but that's not it.
I could put also automatic1111 as medium
I can find the channels. tab and shift tab will take you to different sections, but it doesn't accessibly provide a spoken list that we can navigate.
nope I don't
yeah it's def not hard to use
i guess you can just scroll a bit take a picture and let the ai tell you where the models are
actually not really. so how the software works, it uses something called MSAA objects to navigate. codes and such are included with alt text so we can access it. everything has to be labeled and provide an accessible way such as hotkeys or tabbable buttons and such for us to use.
if you want i can fetch you some links
though youll have to manage your own inference
Please thank you.
that's fine. as long as I am getting the help I'll do it...
kk. Let me know when you add me so I can check it
i think it's something with the text encoder
I'm sorry for my rant other guys as well. but man this fucking sucks right now that it is so hard to do with out sight
for some reason the more optimized text encoder just gives more anatomically accurate results
here for example, 1 is t5xxl, and 2 is t5e4
and t5xxl just made the head too fucking big lmao
im texting you blinddude
k
hey ai
Realtime voice changer for calls?
What's ur PC gpu
yes
3060 ti
Wokada is the program to use RVC (Retrieval-based-Voice-Conversion, Speech To Speech Models) in realtime for calls
There's the fork (modified version), the deiteris fork which has better performance
-realtime
1st link is wokada fork, SUGGESTED
2nd link is original wokada
how do i make my own voice model for ai covers? and why weights.gg doesn't support utf-8 chars?
im using mmvc or smth is that good?
if you got it from yt tuts
it's not suggested
you prolly got an old version
i would highly suggest you to delete that, and download the deiteris fork
it helps in your performance
so you have less delay
send the link lemme see
I meant of the tutorial
ohhh
also kinda weird that the guy shared it on mega
It uses the Original Wokada version shown in https://rentry.co/VoiceChangerGuide#download-nvidia
Guide Written by:
Github - Blanc-dot
Discord User - https://discord.com/users/824922747423031359 aka VTArcelia or Arcy for short
Thanks to the following people : lusbert, poopmaster, felt, fazemasta, antasma, shadictl, x_hina, sushi
thanks are for anything added to guide, taken from any talks, s...
but it's way better to use the wokada deiteris fork
it helps in performance and in having less delay
k i will
Guide for deiteris' modified W-Okada's Voice Changer Client (Fork)
this one right
you read the guide and download the nvidia version
there's no yt tut for the wokada deiteris fork
ok
@dapper ginkgo #1159290752195633273 message I actually thought ur name was a typo of CivitAI ngl
well. No
I know how to write

So, what are people finding to be the best free voice changer software? I'm trying out RVC
Are you looking for realtime voice changer for calls?
and what's ur pc gpu?
Yeah, pretty much. I have a 4070 ti
RVC is for Inference (use models) on pre-recorded audios and train (make) models
it has a realtime too, but it's not as good as wokada
Wokada is the program to use RVC (Retrieval-based-Voice-Conversion, Speech To Speech) Models in realtime for calls
There's the fork (modified version), the deiteris fork which has better performance
-rt
i suggest you the 1st link, the wokada deiteris fork
Ah, sounds great, I will do. Thanks for the help!
yw, for any issues use #🔍│help-w-okada
I take AI samples as like a mb, and when I'm done. It's like 80MB
(after using special softwares)
Quality is key
Where is huggingface?
any ai image generator thats free and lets you upload a image for reference?
Like if I want a guitar drawn in the style of this image
To generate images for free (text2img), either:
- Use @elder willow in https://discord.com/channels/1159260121998827560/1202754985255764060 (It's powered by DALLE3, from ChatGPT+), pretty easy
- Other easy and good ways, with weights.gg, are:
- Use /image with @hidden grotto in https://discord.com/channels/1159260121998827560/1202754985255764060
- Create an image on their site https://www.weights.gg/ (which you can also use LoRAs, Low-Rank Adaptations, basically a small trained additional model to adjust your generation)
- Use Open Source Models like stable diffusion & flux that could be a bit **harder** but good, what's ur pc gpu? As you could run them locally (on ur pc) or on cloud (remote good pc)
weights or local are ur best options
Alr thanks!
how do i upload a finished model to weights.gg?
any realtime voice changers for calls for windows or mac?
click on Train Model then Upload a Voice Model
gonna reply in #🔍│help-w-okada
hello
what does the voice changer fork guide do
should i uninstall the w-okada one @covert lake
The OG W-Okada? You can keep that one if the fork W-Okada doesn't work on your PC.
um, nice to meet you, but when can I train the model for free?
@hidden grotto
:wave: @wanton spruce, How can I help?
Available Commands:
• @weights find <query> or /find <query> - Search for RVC Voice Models
• /create - Create an AI Cover
• /image - Generate an Image
gpt o3 is coming out
We’re dead
Alr, tested and working.
@night lake @glad nebula
Pushing 3.0.1 update in a sec
for now, only for generator
guess 4 gig users can now shine or somethin'
lol
😮
i can finally use a bs above 10
incorporating noobies' fixes n improvements too
so all in check
Actually.. I haven't tested the limits butttt
from base 13 bs max, I could go up to 20
perhaps could do more. ( There's also 1 inplace operation more )
I suppose you could help the speed a lil with tensorfloat32 usage
or... improve the memory even more with memory fallback in nvidia ctrl panel 
don't do it
it allows to fallback to extra normal ram memory instead of oom'ing
but you know the deal with ram memory
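For the tensorfloat32 part, a minimal sketch of what turning it on usually looks like, assuming the training code runs on PyTorch (where in the fork this would go, and whether the fork already sets it, is not something stated here; `enable_tf32` is a made-up wrapper). The two `allow_tf32` flags are standard PyTorch settings and only have an effect on NVIDIA Ampere or newer GPUs.

```python
def enable_tf32() -> bool:
    """Turn on TF32 for matmuls and cuDNN convolutions; return True if the flags were set."""
    try:
        import torch
    except ImportError:
        return False  # PyTorch not installed, nothing to configure

    # Trades a little float32 precision for speed on Ampere+ tensor cores.
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True
    return True


print("TF32 flags set:", enable_tf32())
```

This only affects speed of fp32 math, it does not reduce VRAM usage.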
i wont
-# im on amd
i wish 😭
yeah...
the new update also includes the inplace that reduces vram usage without checkpointing?
that alone allows 8 batch size on 8gb vram gpus in fp32 training
yea, improved inplace is by default
checkpointing is optional from the ui level
3.0.1 up
bruhhh, can't promise how stable this is as it's just a test run but damn, even 23 works ( 9 bs higher than baseline oom (( bs 14 )) )
lmao
Either way. Cya around, going off
yup, works
huge update
now, I can rest
peace
good night
Gnight


night king
hi
hi any recco settings for nvidia rtx 2050
annie are you okay
how i can use rvc 2 models?
oh thanks
its better than weights ?
guys pls help. i used to make ai covers with google collab thingy like a year ago, now its dead. does someone have a new link to it? or how do you guys make your song covers
the link to old one i used: https://colab.research.google.com/drive/1Gj6UTf2gicndUW_tVheVhTXIIYpFTYc7?usp=sharing#scrollTo=7vh6vphDwO0b
ive been trying to make a rvc model
with colab
but now with it being shut down
rip my rvc making
.
hello, can someone please help me get the voice changer to work
ive been trying i cant seem to make it work
What seems to be the issue?
First of all did you install the version that is for your gpu?
the voice keeps cutting out for some reason, its like very delayed takes 10 seconds for it to say anything even though i said it
i downloaded for amd
i cant even speak to anyone with the voice changer
Is your GPU modern?
The chunk settings etc is what determines delay I think
#🔍│help-w-okada probably knows more
i dont know, maybe if you want to come on screenshare into a voice chat
If your PC has a GPU that's as new as or newer than the GTX 10xx series, W-Okada should be able to run on GPU.
its not newer
Is your GPU lower than that?
I am not on my computer atm and probably won't be until tomorrow
let me check
its AMD Radeon(TM) Vega 8 Graphics @solar torrent
Oh wow, the AMD Radeon. This GPU was released in 2019, so it should be able to run fine. For AMD/Intel GPU, download the DirectML version of fork W-Okada.
did you download wokada from a youtube video?
Oh wait, that's an integrated one. That means your PC doesn't have a dedicated GPU. 
i downloaded it from huggingface.co
wait what
did you see our written guides or did you use another tutorial
its a pc* not a laptop
i used another tutorial
do you maybe see a GPU 0 or GPU 1?
else you didn't buy a dedicated gpu
gpu 0
show a screenshot in #🔍│help-w-okada , maybe u got integrated and dedicated gpu
aight
it sounds kinda weird u didn't buy a gpu for ur desktop
Sorry to butt in suddenly. But I would like to hear your stuff about the announcement of the o3 model.
I was expecting more action here but it looks like the people here are still getting up to speed on this, thats fine.
Nah, I'm too slow for a trend. 
[translated from Korean] picture
gpt just isn't as much of a big deal anymore ( or so it seems to me
You sure? Im seeing a haywire of reactions coming from X and other spaces.
I think we just need to wait a bit more here. I really want to hear what the folks here have to say.
that's right, X
truth to be told, not a lot of people here are into X / Twitter I think but guess you right, we'll wait and see how it goes
You mean 𝕏??
I use X/Twitter 🔥
i seen it's pretty good but not like I can use it anyways lol
oh, how come?
Wake up, everyone! A new RegalHyperus drum model just released!
Get Lucky (Drum model no. 556)
u gotta fill a whole ass form for using o3 in early access
o3 isn't publicly available yet, only way to get access is to apply for Early access safety testing
https://openai.com/index/early-access-for-safety-testing/
ye
I know about the forms, IDK if my uni is applicable as an org..
Doesnt need to be X in particular, even ops places like hacker news are applicable.
Just lookin for the wisdom of this crowd
yuh, got it
Eksu
Wake up, everyone! A new RegalHyperus drum model just released!
Mor Ardain (Drum model no. 557)
Hii guys
It’s been a while since I participated in voice model making
Are there any new standards I should know about?
Aside of recommending the use of forks with averaged loss
not much
Wish I knew what that means
You used tensorboard before?
Yep
How to download?
In that case, I'll simplify it to:
Tensorboard logs each epoch's loss based on ' each epoch last step's loss '
Average makes it so it logs additional metrics but averaged over custom N steps ( my approach )
or Noobies approach where it's averaged over some period / amount of steps? epochs? something like that
Visually looks like so
ex. My epoch ( each ) is 44 steps. Instead of logging a loss that depicts performance of epoch at the last step, it logs average ( in my scenario, every 11 steps so 1/4th of an epoch )
This should be... pretty straightforward ye
Uh huh
In short, helps to know how your training goes
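A small sketch of the averaging idea in plain Python, using the 44 steps per epoch and "every 11 steps" numbers from above. This is not the actual fork code: `averaged_losses` is a made-up helper and the loss values are invented just to have something to average.

```python
def averaged_losses(step_losses, every_n):
    """Mean loss over each full window of `every_n` consecutive steps."""
    averages = []
    for start in range(0, len(step_losses) - every_n + 1, every_n):
        window = step_losses[start:start + every_n]
        averages.append(sum(window) / every_n)
    return averages


# One epoch of 44 steps with invented, slowly falling loss values.
epoch_losses = [1.0 - 0.01 * step for step in range(44)]

last_step_only = epoch_losses[-1]             # what plain per-epoch logging keeps
averaged = averaged_losses(epoch_losses, 11)  # 4 points per epoch instead of 1

print(round(last_step_only, 2))
print([round(value, 2) for value in averaged])
```

The last step's loss depends on whichever batch happened to land there, so the averaged curve is usually less jumpy and easier to read in tensorboard.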
Have the models from UVR improved at all?
Naturally, mvsep too
mel-roformer based, bs-roformer based ( mvsep, not sure on uvr as I don't use it )
those are kings now
You have got to provide me with a dictionary
Those are just names of models for separation ( and their architecture
oh what do you use
personally? still bs-roformer is my pick ( even if some prefer mel one
You can test both really, mvsep's free
Hi
https://mvsep.com/en/demo
Or you can see the results of users' separations in here
Ok!
Why when I do this the both models mix together and their faces look very weird?
Because you're using 2 models at the same time lol
Maybe try to be specific on separating them and using each trigger word
Ok do you know a way I can separate the backing vocals from the song and make the ai model sing them?
What's your PC GPU?
Because you can run RVC (Retrieval-based-Voice-Conversion, Speech To Speech, the AI used for every AI covers also on Weights.gg) on your pc
Your GPU is good enough to do inference (use models) locally (on ur pc), you won't be able to train (make models) but use them
You can:
- Locally (runs on your pc so the speed depends on that, you will have to set it up with the guides):
- Cloud (remote good pc, easier and faster than ur PC but it's limited):
- Ilaria RVC Zero: fastest and simplest that you can get for free
- Weights.gg: Partnered with AI Hub, lets u do them easily but u may be in a queue
- Applio Colab: max 4 hours of GPU, not guaranteed
Easiest possible (automatically separates vocals & instrumentals) : weights.gg
easiest cloud: Ilaria rvc zero
easiest local: Applio
Weights.gg does that automatically when doing an AI Cover
else you would have to look at https://docs.ai-hub.wtf/rvc/resources/vocal-isolation/
Last update: Dec 17, 2024
Ok thanks
Yw
Hello, the virtual cable isn't detecting input voice or maybe it's some bug I don't know,
Can anyone help me with it please!
Wrong channel, show me your settings in #🔍│help-w-okada , Frieza
rerew
?
chat
could I theoretically run rvc on a jailbroken iPhone
not the voice changer part just audio conversion
Maybe. If you got the code for it.
ok thanks
That's not a definitive answer. But, you never know unless you try it. Since, I assume, there was no information on the internet.
I could have just run tailscale on my iPhone and pc but I’m kind of interested how this would run so I’m just gonna try it
there should be termux lookalike for iOS, so you can run under it
I was thinking about running the apple silicon version
I have bash and python installed and was going to look at the other dependencies after
it is possible on Termux, but it's Android only
however, there are unofficial iOS alternatives to Termux, like iSH and a-Shell
maybe try those, but the commands could be different
Also, a reminder that it's just a phone, and considering that iPhones, except the latest one, have only 6 GB of RAM, it won't be the best
ofc running on a good pc or cloud will be better,
but this can be a funni experiment
goodluck
if u actually manage to do this, it would be good if u write a guide in case the commands are different, so other iPhone users can do it too
I have native shell running
no need for one of those apps hopefully