hello everyone, i'm trying to separate two people cross-talking in the same audio file. i've tried using a blind source separation model, but it badly affected the audio quality. i've seen some models on youtube that managed to maintain good audio quality, but they didn't have any tutorials or links, they only showed the before and after results. i'm looking for recommendations for blind source separation models or tools that can separate the two voices without degrading the audio quality. any help will be greatly appreciated!
#looking for high quality blind source separation for two people cross talking in an audio
so you mean 2 people talking at the same time? maybe you could use HP 6 Karaoke in UVR (be sure the input file is vocals only)
yes two people talking at the same time, ill try that thank u
yo i installed the app and everything but i dont know where to get the HP 6 Karaoke thing ure talking about
Click on the Settings Button (wrench icon to the left of Start Processing) and go to the Download Center tab
Select the VR Arch process method and download the HP 6 Karaoke model
Also if you want a guide for UVR https://docs.google.com/document/d/1_T9d4pmNg0iS7yzbYdLc0AjDG03cNPACzC0tXT6COUg/edit?usp=sharing
i got it im downloading it rn ill let u know if it works
thanks again
Yw
what do i select here?
x-minus pro and mvsep may have a multispeaker model, and spectralayers 11 has an unmix module for that purpose, though when I tried it, it wasn't really effective on some of my cases
yea ive been thinking about getting spectralayers but ill see if UVR has any decent results first
For process method put vr arch, then for the model use the hp 6 karaoke one
Be sure the vocals are clean and use them as the input btw
i did but when i choose vr arch the models disappeared
ooh nvm
their places changed
the voice separation isn't really working. anyway, here's the model i used before, the voice separation was good but the audio quality was really bad https://huggingface.co/speechbrain/sepformer-wsj02mix
Wdym that isn't really working? The input was the cleaned vocals right
idk what u mean by cleaned vocals, but here's the original audio i used to test
here are the results
Mmm weird
Maybe you could try again using the vocals result as the input? The audio was kinda noisy and the vocals result seems less noisy
i also tried audio files that had no noise and the results were similar
your other options are medleyvox and the multispeaker models I mentioned above
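One possible reason the SepFormer model linked earlier in the thread (speechbrain/sepformer-wsj02mix) sounded bad even when the separation itself was good: as far as its model card says, that checkpoint works on 8 kHz audio, so anything above 4 kHz is lost no matter how clean the input is. Below is a minimal sketch for checking how much bandwidth a given WAV input would lose, using only Python's standard `wave` module. The 8000 Hz constant is an assumption taken from the model card, and the helper names and `mix.wav` are hypothetical.

```python
import wave

# Assumption: the sample rate the speechbrain/sepformer-wsj02mix checkpoint
# works at, according to its model card (8 kHz). Adjust for other models.
MODEL_SAMPLE_RATE_HZ = 8000


def wav_sample_rate(path):
    """Return the sample rate (Hz) of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def highest_frequency_hz(sample_rate_hz):
    """Nyquist limit: the highest frequency a sample rate can represent."""
    return sample_rate_hz / 2


def bandwidth_report(path, model_rate_hz=MODEL_SAMPLE_RATE_HZ):
    """Compare the input file's bandwidth with what the model can output."""
    input_rate = wav_sample_rate(path)
    return {
        "input_rate_hz": input_rate,
        "input_max_freq_hz": highest_frequency_hz(input_rate),
        "model_max_freq_hz": highest_frequency_hz(model_rate_hz),
        # True when the model's rate is lower than the input's,
        # i.e. high frequencies will be thrown away.
        "bandwidth_lost": input_rate > model_rate_hz,
    }


if __name__ == "__main__":
    # "mix.wav" is a hypothetical file name; point it at your own recording.
    print(bandwidth_report("mix.wav"))
```

If `bandwidth_lost` comes back true, the muffled sound is probably down to the 8 kHz limit rather than the separation itself, so a model trained at a higher sample rate may be worth trying.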