#Converting ulaw 8khz to pcm 16khz 16 bit?


limpid orbit
#

When I run soxi "filename.wav" I see it shows:

Channels       : 1
Sample Rate    : 8000
Precision      : 16-bit
Duration       : 00:00:00.08 = 659 samples ~ 6.17813 CDDA sectors
File Size      : 44
Bit Rate       : 4.27k
Sample Encoding: 16-bit Signed Integer PCM

If I write the audio to a temporary file, then load it and save it again, it works and I get:

Channels       : 1
Sample Rate    : 16000
Precision      : 16-bit
Duration       : 00:00:13.18 = 210938 samples ~ 988.772 CDDA sectors
File Size      : 422k
Bit Rate       : 256k
Sample Encoding: 16-bit Signed Integer PCM
limpid orbit
#
System.out.println("uLaw: " + ulawFormat.getEncoding() + "," + ulawFormat.getSampleRate() + "," + ulawFormat.getSampleSizeInBits() + "," + ulawFormat.getChannels() + "," + ulawFormat.getFrameSize() + "," + ulawFormat.getFrameRate() + "," + ulawFormat.isBigEndian() + "," + ulawAudioInputStream.getFrameLength());

gives:
uLaw: ULAW,8000.0,8,1,160,50.0,false,659

System.out.println("PCM: " + pcmForm.getEncoding() + "," + pcmForm.getSampleRate() + "," + pcmForm.getSampleSizeInBits() + "," + pcmForm.getChannels() + "," + pcmForm.getFrameSize() + "," + pcmForm.getFrameRate() + "," + pcmForm.isBigEndian() + "," + pcmAudioInputStream.getFrameLength());

gives:
PCM: PCM_SIGNED,8000.0,16,1,2,8000.0,false,659

random sierraBOT
#

Closed the thread due to inactivity.

If your question was not resolved yet, feel free to just post a message to reopen it, or create a new thread. But try to improve the quality of your question to make it easier to help you 👍

limpid orbit
#

It wasn't even 24 hours?

#

That's a joke

unborn girder
#

To be honest, I feel like it's hard to get an answer to this question here

#

Can you give a bit of context on what libraries you are using and that kinda stuff

limpid orbit
#

Javax.sound

#

That's all

soft field
#

cause after 12h pretty much all helpers have seen ur question already

limpid orbit
#

Not sure what else I can add? I've included the code, the library etc. What else can I include to help?

soft field
#

thing is, keeping it open for longer wont magically generate u an answer. after everyone saw it, there wont be suddenly another user who didnt see it yet

#

so either no one knew the answer, or it wasnt easy enough to help and they moved on

limpid orbit
#

Yes that's what I figured it's too complicated

#

So I guess I'll leave it closed. Thanks for the explanation! I just wanted to make sure 🙂

#

I'll figure it out myself

soft field
#

u could try to create a minimal example so that people can try it themselves

limpid orbit
#

Not sure it can be more minimal

soft field
#

it means that i can copy pasta and run it

#

right now i can't

#

im missing the sound file and the code is not just a single main method

#

so i cant try it out myself locally

#

without extra effort

limpid orbit
#

Alright I'll do that. Most people when they say minimum they want it down to one or two lines so I will get the strings and make an example in the next couple of hours. Thank you very much

soft field
#

but yeah, it's a very specific question. likely better suited for StackOverflow

limpid orbit
#

Yeah I don't do stackoverflow. I never get help and they usually just troll and have a god complex

pale sierra
#

but a minimal reproducible example would be important to provide

#

problem with SO is that most beginners have bad first experiences because the questions asked by beginners are mostly bad (either duplicate or smth similar)

#

its more an intermediate/expert kind of thing

#

but based on the complex question it seems like you know what you do

soft field
#

barely any asker reads the rules and takes them serious. which is unfortunate, since the other side is quite strict about them

#

low quality questions get stomped

#

anything that requires ping pong or guessing, which can't be immediately answered, gets stomped

#

anything that shows no research effort or similar gets stomped

pale sierra
#

but for a good reason imo

soft field
#

what matters is that u have a high "question asking" skill. and then u also get a high quality answer

pale sierra
#

we allow low quality questions which makes it sometimes hard for the helper
but easier for the asker

pale sierra
#

I didnt say your question is low quality

#

you said

Yeah I don't do stackoverflow. I never get help and they usually just troll and have a god complex
and I answered how to instead get good answers

limpid orbit
#

So I should open a new thread? Or update here?

pale sierra
#

it was about Stackoverflow

#

I would give it a try

limpid orbit
#

? I mean to post my updated question here.

pale sierra
#

update it here

limpid orbit
#

ok

#
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.util.Base64;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;


public class BaseAudioExample {

    public static void main(String[] args) throws LineUnavailableException, IOException {
        
               
        File file = new File(
            "/path/to/base64.txt");
        
        BufferedReader br
            = new BufferedReader(new FileReader(file));
        
        StringBuilder sb = new StringBuilder();
        String st;
        while ((st = br.readLine()) != null)
        {
            sb.append(st);
        }
        
        
        
        byte[] decodedAudioBytes = Base64.getDecoder().decode(sb.toString());
        ByteArrayInputStream inputStream = new ByteArrayInputStream(decodedAudioBytes);

        AudioFormat audioFormat = new AudioFormat(
                AudioFormat.Encoding.ULAW,  // Audio format details here
                8000,
                8,
                1,
                160,
                50,
                false
        );
        AudioInputStream ulawAudioInputStream = new AudioInputStream(inputStream, audioFormat, decodedAudioBytes.length / audioFormat.getFrameSize());    
   
#

AudioFormat ulawFormat = ulawAudioInputStream.getFormat();
AudioFormat pcmFormat = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED,
                ulawFormat.getSampleRate(),
                16, // 16-bit sample size
                ulawFormat.getChannels(),
                ulawFormat.getChannels() * 2, // Frame size
                ulawFormat.getSampleRate(),
                false // Little endian
            );
            
AudioInputStream pcmAudioInputStream = AudioSystem.getAudioInputStream(pcmFormat, ulawAudioInputStream);
AudioFormat pcmForm = pcmAudioInputStream.getFormat();

// Open a line to play the audio
DataLine.Info info = new DataLine.Info(SourceDataLine.class, pcmForm);
SourceDataLine line = (SourceDataLine) AudioSystem.getLine(info);
line.open(pcmForm);
            
// Start playing
line.start();                      
byte[] buffer = new byte[4096];
int bytesRead;
// Read and play audio data from the stream
while ((bytesRead = pcmAudioInputStream.read(buffer)) != -1)
{
    line.write(buffer, 0, bytesRead);
}
AudioSystem.write(pcmAudioInputStream, AudioFileFormat.Type.WAVE, new FileOutputStream("/path/to/save/.wav"));
    }
}

#

It will play fine over the speakers, but not save correctly on the wav

#

Mind you this example is only to illustrate my issue.

pale sierra
#

let me try it on my end

limpid orbit
#

And the rules you guys are mentioning are in #rules ?

#

I guess in the example I also didn't close the streams. So keep that in mind

pale sierra
#

I think Zabu said smth about Stackoverflow rules

limpid orbit
#

Ohhh yea then it was about stackoverflow. I always make it a point to read rules before I post so I was confused

pale sierra
#

all good

limpid orbit
#

If you need more context I am trying to get this to work for transcription but it has to be on pcm

pale sierra
#

I dont even know what ulaw/pcm is

#

maybe tell me the basics

limpid orbit
#

It needs to be in:

16bit little-endian mono
Or
8khz 16bit little-endian mono

pale sierra
#

they are algos for audio?

limpid orbit
#

Yea well codecs

#

G.711 is a narrowband audio codec originally designed for use in telephony that provides toll-quality audio at 64 kbit/s. It is an ITU-T standard (Recommendation) for audio encoding, titled Pulse code modulation (PCM) of voice frequencies released for use in 1972.
G.711 passes audio signals in the frequency band of 300–3400 Hz and samples them ...

#

ffmpeg -i file.mp3 -ar 16000 -ac 1 file.wav this works if I do it with ProcessBuilder

But I want to not have to save it convert it and then load it from the disk before being able to do anything with the file

#

Because I'll be getting these in real time. In the base64 I sent I merged all the incoming base64 together

pale sierra
#

I changed it to this:

Path base64 = Path.of(Main.class.getResource("/base64.txt").toURI());
byte[] decodedAudioBytes = Base64.getDecoder().decode(Files.readAllBytes(base64));

at the beginning to load the file but that already gives me input byte array has incorrect ending byte at 140628
your method with old I/O is working but this might be important for the File writing later and thus fails maybe

#

but yeah file is also corrupted for me when creating

pale sierra
#

this throws me an exception already, but I tried with old I/O like you did and it played correctly but corrupted file

limpid orbit
#

and for sure the conversion happens before playing

pale sierra
#

this exception might be important and could be the problem why the file is corrupted

limpid orbit
#

i am

pale sierra
#

with your path

limpid orbit
#

I guess though it's worth mentioning that normally I am not loading this base64 string from a file

#

its being sent from a json object

#

in a socket that is streaming

pale sierra
#

I think the base64 file might already be corrupted

#

and that the playing using java is working fine until that byte

limpid orbit
#

if I save the file im being sent from the network

#

its not

pale sierra
#

but the file is corrupted overall

limpid orbit
#

I can save the file in the same format and its fine

pale sierra
#

so you mean save it in a ulaw format and then its fine?

limpid orbit
#

yes

#

try it

pale sierra
#

nah all good, I trust you

#

if you already tested it

limpid orbit
#

yea i have

#

and just did it again

pale sierra
#

ah I found it, my base64.txt got a new line at the end which was unintended

limpid orbit
#

oh

pale sierra
#

my guess would be that the conversion from ulaw -> pcm is failing somewhere but the playing is working

#

so its probably the conversion from pcm -> wav file

#
System.out.println(AudioSystem.isFileTypeSupported(AudioFileFormat.Type.WAVE, pcmAudioInputStream));

this prints true though

limpid orbit
#

also if you want to add this

#
System.out.println("PCM: " + pcmForm.getEncoding() + "," + pcmForm.getSampleRate() + "," + pcmForm.getSampleSizeInBits() + "," + pcmForm.getChannels() + "," + pcmForm.getFrameSize() + "," + pcmForm.getFrameRate() + "," + pcmForm.isBigEndian() + "," + pcmAudioInputStream.getFrameLength());

System.out.println("uLaw: " + ulawFormat.getEncoding() + "," + ulawFormat.getSampleRate() + "," + ulawFormat.getSampleSizeInBits() + "," + ulawFormat.getChannels() + "," + ulawFormat.getFrameSize() + "," + ulawFormat.getFrameRate() + "," + ulawFormat.isBigEndian() + "," + ulawAudioInputStream.getFrameLength());
                        
#

if I change things too much here it gives a conversion failed error

#

and fails long before playing audio

#

during the actual conversion

#

if you mess around with the audio formats

#

You are testing on windows?

pale sierra
#

yeah

limpid orbit
#

Okay well that rules out drivers I think

#

Because I've been testing on Debian

pale sierra
#

this prints false:

System.out.println(AudioSystem.isConversionSupported(audioFormat, pcmFormat));
limpid orbit
#

Interesting

pale sierra
#

same with

            System.out.println(AudioSystem.isConversionSupported(ulawFormat, pcmFormat));
#

playing the audio still works lmao

#

these are the provided codecs it checks

limpid orbit
#

so it cant convert?

pale sierra
#

seems like its not supported

#

but I am not enough into javas audio api to tell you what to do now

#

though I think the UlawCodec should work for that

pale sierra
#

this returns PCM_SIGNED as target if its sample size is 8

#

but you want 16

limpid orbit
#

8 will work

pale sierra
#

that is probably the issue

#

though nvm the source is actually 8

limpid orbit
#

java.lang.IllegalArgumentException: Unsupported conversion: PCM_SIGNED 8000.0 Hz, 8 bit, mono, 2 bytes/frame, from ULAW 8000.0 Hz, 8 bit, mono, 160 bytes/frame, 50.0 frames/second,

pale sierra
#

yeah I misread

#

I do some more debugging to see where its having a problem

#

n v m

#
            System.out.println(AudioSystem.isConversionSupported(pcmFormat, ulawFormat));

this returns true

#

the arguments were swapped
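
For reference, AudioSystem.isConversionSupported(AudioFormat targetFormat, AudioFormat sourceFormat) takes the target first and the source second, which is easy to get backwards. A minimal sketch of a wrapper that names the direction explicitly. The class and method names (ConversionCheckSketch, canConvert, ulaw8k, pcm8k16bit) are hypothetical, and the µ-law format here assumes one byte per frame rather than the 160 bytes/frame used in the thread:

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

public class ConversionCheckSketch {

    // Hypothetical helper: source first, target second (the reverse of the API order).
    static boolean canConvert(AudioFormat source, AudioFormat target) {
        return AudioSystem.isConversionSupported(target, source);
    }

    // 8 kHz mono mu-law, assuming one 8-bit sample per frame.
    static AudioFormat ulaw8k() {
        return new AudioFormat(AudioFormat.Encoding.ULAW, 8000f, 8, 1, 1, 8000f, false);
    }

    // 8 kHz mono 16-bit signed little-endian PCM (2 bytes per frame).
    static AudioFormat pcm8k16bit() {
        return new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 8000f, 16, 1, 2, 8000f, false);
    }

    public static void main(String[] args) {
        System.out.println("ulaw -> pcm supported: " + canConvert(ulaw8k(), pcm8k16bit()));
    }
}
```

Wrapping the call this way is only a readability choice; calling the API directly with the target format first gives the same result.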

#

💀

#

mb

limpid orbit
#

Lol

#

All good

pale sierra
#

idk something weird is happening

#
System.out.println(pcmAudioInputStream.available());

this prints the expected 105469

limpid orbit
#

Yes I tested all that before even attempting to ask

#

I even wrote each byte to the file

#

I did do something last night that made it play but it was .08 seconds

pale sierra
#
System.out.println(AudioSystem.write(pcmAudioInputStream, AudioFileFormat.Type.WAVE, Files.newOutputStream(Path.of("D://test.wav"), StandardOpenOption.CREATE)));

but this prints 1362 bytes written

#

this getAudioFileFormat got the property (byte length 1362) but why?

limpid orbit
#

Maybe because it's missing the header?

pale sierra
#

this calculation gives 1362

limpid orbit
#

Well during the conversion the frame size is getting smaller?

#

Which it needs to get larger?

pale sierra
#

frameLength is 659 and frameSize is 160 based on the debugger

limpid orbit
#

After the conversion?

pale sierra
#

a wait the frameSize is from streamFormat not from stream

#

and that frameSize is 2

pale sierra
#

the header got a size of 44

#

so its calculating 659 * 2 + 44 -> 1362

#

but isnt that too small?

limpid orbit
#

Yes

pale sierra
#

isnt pcm > ulaw

limpid orbit
#

Yes

#

Ulaw is compressed

pale sierra
#

this got a 2bytes/frame setting

#

could that be faulty

limpid orbit
#

I think that's where the issue is. I haven't touched audio conversion until yesterday

#

But I think even then it would play

pale sierra
#

you set the frameSize with ulawFormat.getChannels() * 2 but why?

#

getChannels returns 1 and thus the result is 2

#

and I think that when playing instead of saving the audio it just doesnt calculate its size but instead play the stream until its end

#

and the saving saves the content until its calculation point (1362)
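
The numbers in this exchange can be reproduced with plain arithmetic. This sketch assumes that the frame length passed to the AudioInputStream constructor (byte count divided by the declared frame size) carries over to the converted stream, as the printed PCM frame length of 659 suggests. It also shows what the size would be if the µ-law format were declared with one byte per frame, which is an assumption about the likely cause, not something confirmed in the thread. The class and method names are hypothetical:

```java
public class WavSizeSketch {

    // WAV header size as reported in the thread.
    static final int WAV_HEADER_BYTES = 44;

    // Number of frames the stream claims to hold.
    static long frameLength(long byteCount, int frameSize) {
        return byteCount / frameSize;
    }

    // Size of a WAV file holding that many PCM frames.
    static long wavBytes(long frameLength, int pcmFrameSize) {
        return frameLength * pcmFrameSize + WAV_HEADER_BYTES;
    }

    public static void main(String[] args) {
        long ulawBytes = 105469; // value printed by available() in the thread

        long declared = frameLength(ulawBytes, 160); // frame size used in the thread
        long perSample = frameLength(ulawBytes, 1);  // assumed: one byte per mono sample

        System.out.println(declared + " frames -> " + wavBytes(declared, 2) + " bytes");   // 659 frames -> 1362 bytes
        System.out.println(perSample + " frames -> " + wavBytes(perSample, 2) + " bytes"); // 105469 frames -> 210982 bytes
    }
}
```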

limpid orbit
#

I need mono

pale sierra
#

ok that means for this?

limpid orbit
#

yes

pale sierra
#

xD

limpid orbit
#

ulawFormat.getChannels() * 2 calculates the frame size in bytes based on the number of audio channels and the assumption that each sample is represented by 16 bits (2 bytes). This frame size is important when working with audio because it helps determine the amount of data processed or transferred at a given time.

pale sierra
#

but how do you get from 160 bytes/frame (ulaw) to 2 bytes/frame (pcm) this already makes no sense for me

limpid orbit
#

In ulaw, each sample takes 1 byte (8 bits)
PCM takes 2 bytes (16 bits)
This difference in encoding results in different frame sizes, with PCM having a smaller frame size (2 bytes/frame for stereo) compared to μ-law (160 bytes/frame for stereo).

#

Or am I mistaken?
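
For reference, G.711 µ-law stores each sample in one byte and expands it to a 16-bit linear value on decode, so at the same sample rate the decoded PCM is twice as many bytes as the µ-law input. A sketch of the commonly used expansion formula follows. It is not the JDK's internal codec, and the class and method names are hypothetical:

```java
public class UlawDecodeSketch {

    private static final int BIAS = 0x84;

    // Expands one mu-law byte to a 16-bit linear PCM sample.
    static short decode(byte ulaw) {
        int u = ~ulaw & 0xFF;            // mu-law bytes are stored bit-inverted
        int exponent = (u >> 4) & 0x07;  // 3-bit segment number
        int mantissa = u & 0x0F;         // 4-bit position within the segment
        int magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS;
        return (short) ((u & 0x80) != 0 ? -magnitude : magnitude);
    }

    // Decodes a whole buffer to 16-bit little-endian PCM: 2 output bytes per input byte.
    static byte[] decodeToPcm16le(byte[] ulaw) {
        byte[] pcm = new byte[ulaw.length * 2];
        for (int i = 0; i < ulaw.length; i++) {
            short s = decode(ulaw[i]);
            pcm[2 * i] = (byte) s;            // low byte first (little-endian)
            pcm[2 * i + 1] = (byte) (s >> 8); // high byte
        }
        return pcm;
    }

    public static void main(String[] args) {
        System.out.println(decode((byte) 0xFF)); // 0
        System.out.println(decode((byte) 0x80)); // 32124
        System.out.println(decode((byte) 0x00)); // -32124
    }
}
```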

pale sierra
#

I dont really know this

limpid orbit
#

me either

#

xD

pale sierra
#

I thought ulaw is compresed

limpid orbit
#

removing *2 gives me this

Unsupported conversion: PCM_SIGNED 8000.0 Hz, 16 bit, mono, 1 bytes/frame, little-endian from ULAW 8000.0 Hz, 8 bit, mono, 160 bytes/frame, 50.0 frames/second,
pale sierra
#

so it only makes sense that pcm takes more bytes per frame no?

limpid orbit
#

yea same error only * 2 works

#

For mono PCM audio, which has one channel, the frame size would be:
Frame Size = Number of Channels * Bits per Sample / 8
Frame Size = 1 channel * 16 bits / 8 bits per byte
Frame Size = 2 bytes
So, for mono PCM audio, the frame size is 2 bytes per frame. Each frame represents one unit of audio data for the mono channel.
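
The same formula applied to the µ-law side gives 1 channel * 8 bits / 8 = 1 byte per frame. A hedged sketch of the whole conversion under that assumption (µ-law declared with frame size 1 and frame rate 8000 instead of 160 and 50), so that the frame length equals the number of samples. The class name, method names, and the in-memory output are illustrative choices, not the code from the thread:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class UlawToWavSketch {

    // Frame size in bytes = channels * bits per sample / 8.
    static int frameSize(int channels, int bitsPerSample) {
        return channels * bitsPerSample / 8;
    }

    // Converts raw 8 kHz mono mu-law bytes to an in-memory 16-bit PCM WAV file.
    static byte[] ulawToWav(byte[] ulawBytes) {
        int channels = 1;
        float sampleRate = 8000f;
        AudioFormat ulawFormat = new AudioFormat(AudioFormat.Encoding.ULAW,
                sampleRate, 8, channels, frameSize(channels, 8), sampleRate, false);
        AudioFormat pcmFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED,
                sampleRate, 16, channels, frameSize(channels, 16), sampleRate, false);

        // With a 1-byte frame, the frame length is simply the number of samples.
        long frames = ulawBytes.length / ulawFormat.getFrameSize();

        try (AudioInputStream ulaw = new AudioInputStream(
                     new ByteArrayInputStream(ulawBytes), ulawFormat, frames);
             AudioInputStream pcm = AudioSystem.getAudioInputStream(pcmFormat, ulaw)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            AudioSystem.write(pcm, AudioFileFormat.Type.WAVE, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] oneSecondOfSilence = new byte[8000];
        Arrays.fill(oneSecondOfSilence, (byte) 0xFF); // 0xFF is mu-law silence
        byte[] wav = ulawToWav(oneSecondOfSilence);
        System.out.println("WAV bytes: " + wav.length);
    }
}
```

Writing to a ByteArrayOutputStream keeps everything in memory, which matches the stated goal of not going through a temporary file; the same stream could be handed to a file output instead.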

pale sierra
#

but this would mean that the pcm audio is smaller in size, so to have the same length in time to play the sound it needs less frames/second

#

because the length of the sound needs to be the same no?

limpid orbit
#

Yea that's where I'm confused. But if that was the case it wouldn't play correct either?

#

Because I came up with these settings while testing

pale sierra
#

hm and the playing works fine

limpid orbit
#

Before these it sounded slow, like the chipmunks etc

#

Or alot of crackling

#

Actually

#

I have an idea

pale sierra
#

lets see xD

limpid orbit
#

Let me find another file

#

same

pale sierra
#

the implementation just writes until this calculated size

#

but no idea what to do against that

#

and I think this is the point where its worth making a Stackoverflow post

#

should I write one real quick

#

?

limpid orbit
#

if you want. Like i said I usually avoid there now cause i either never get an answer or just rude comments

#

I asked a coworker but I'm pretty sure they won't know.

#

I have another file to use though for audio

#

rather than the one i sent

pale sierra
#

this is a good summarization of the conversion right?

store ulaw to pcm converted sound to wave file
limpid orbit
#

I think so.

pale sierra
#

also its stereo ulaw to mono pcm right?

#

nvm both are mono?

limpid orbit
#

this is the original file saved

pale sierra
#

now we just need to wait

#

probably till tomorrow

#

I will tell you about any new information

#

also I am wondering, your title says 8khz (ulaw) - 16khz/16bit (pcm)
but isnt the pcmFormat in the code just using getSampleRate which is also 8khz?

limpid orbit
#

I get the same results

#

With 8 or 16

pale sierra
#

I get unsupported conversions when changing to 16

limpid orbit
#

On pcm?

#

Weird

pale sierra
#

yeah
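
One possible reading of this failure is that the µ-law codec only changes the encoding and keeps the sample rate, so asking for 16 kHz in the same step has nothing to match. If 16 kHz output is really needed, the rate change can be done as a separate step after decoding. A deliberately naive sketch (linear interpolation, no low-pass filtering, hypothetical class and method names) that doubles the rate of 16-bit little-endian mono PCM:

```java
public class UpsampleSketch {

    // Doubles the sample rate of 16-bit little-endian mono PCM by inserting
    // the average of each pair of neighbouring samples between them.
    static byte[] upsample2x(byte[] pcm) {
        int n = pcm.length / 2;
        short[] in = new short[n];
        for (int i = 0; i < n; i++) {
            // Low byte first, then the sign-carrying high byte.
            in[i] = (short) ((pcm[2 * i] & 0xFF) | (pcm[2 * i + 1] << 8));
        }

        byte[] out = new byte[n * 4];
        for (int i = 0; i < n; i++) {
            int current = in[i];
            int next = (i + 1 < n) ? in[i + 1] : current; // repeat the last sample at the end
            writeSample(out, 4 * i, current);
            writeSample(out, 4 * i + 2, (current + next) / 2);
        }
        return out;
    }

    private static void writeSample(byte[] out, int pos, int sample) {
        out[pos] = (byte) sample;
        out[pos + 1] = (byte) (sample >> 8);
    }

    public static void main(String[] args) {
        byte[] pcm8k = {0, 0, 100, 0};         // samples 0 and 100
        byte[] pcm16k = upsample2x(pcm8k);     // samples 0, 50, 100, 100
        System.out.println(pcm16k.length + " bytes");
    }
}
```

Upsampling this way does not add any information above the original 4 kHz bandwidth; it only changes the container rate, which may or may not be enough for a recognizer trained on true 16 kHz audio.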

limpid orbit
#

PCM_SIGNED 8000.0 Hz, 8 bit, mono, 2 bytes/frame, from ULAW 8000.0 Hz, 8 bit, mono, 160 bytes/frame, 50.0 frames/second

#

so i guess i do too

#

sorry tried a lot of things

pale sierra
#

all good

#

I was just wondering bc of the title

#

we will see if someone answers

limpid orbit
#

funny

#

I asked chatgpt to review and it fixed it

#

replaced

System.out.println("Actually written: " + AudioSystem.write(pcmAudioInputStream, AudioFileFormat.Type.WAVE, Files.newOutputStream(Path.of("D://test.wav"), StandardOpenOption.CREATE)));

with

try (inputStream; ulawAudioInputStream; AudioInputStream pcmAudioInputStream = AudioSystem.getAudioInputStream(pcmFormat, ulawAudioInputStream)) {
System.out.println("Available: " + pcmAudioInputStream.available());
byte[] audioData = new byte[pcmAudioInputStream.available()];
int bytesRead = pcmAudioInputStream.read(audioData);
System.out.println("Actually written: " + writeWavFile("/path/to/wav.wav", audioData, ulawFormat));
}

and added:

private static int writeWavFile(String filePath, byte[] audioData, AudioFormat ulawFormat) throws IOException {
AudioFormat format = new AudioFormat(
AudioFormat.Encoding.PCM_SIGNED,
 ulawFormat.getSampleRate(),
16, // 16-bit sample size
ulawFormat.getChannels(),
ulawFormat.getChannels() * 2, // Frame size
ulawFormat.getSampleRate(),
 false // Little endian
);
limpid orbit
#

@pale sierra thanks for the help. I can now stream the audio directly from an input stream to cmusphinx4 without creating a file. Now I just need to retrain it with 8khz instead of 16khz i think 🙂

22:47:31.518 INFO speedTracker            Total Time Audio: 3.43s  Proc: 3.12s 0.91 X real time
22:47:31.518 INFO memoryTracker           Mem  Total: 3800.00 Mb  Free: 3036.33 Mb
22:47:31.518 INFO memoryTracker           Used: This: 763.67 Mb  Avg: 763.67 Mb  Max: 763.67 Mb
22:47:31.518 INFO trieNgramModel       LM Cache Size: 12950 Hits: 2350499 Misses: 12950
Hypothesis: activation but i knew of a remake of the skull
22:47:31.876 INFO speedTracker            This  Time Audio: 1694062080.00s  Proc: 0.29s  Speed: 0.00 X real time
22:47:31.876 INFO speedTracker            Total Time Audio: 1694062080.00s  Proc: 3.41s 0.00 X real time
22:47:31.876 INFO memoryTracker           Mem  Total: 3800.00 Mb  Free: 3361.35 Mb
22:47:31.876 INFO memoryTracker           Used: This: 438.65 Mb  Avg: 601.16 Mb  Max: 763.67 Mb
22:47:31.876 INFO trieNgramModel       LM Cache Size: 13625 Hits: 2442934 Misses: 13625
Hypothesis: the art
#

Ill go ahead and close this.

pale sierra
#

ok the stackoverflow got no answer till now