#Resampling data in C (audio)

icy cargo
#

I tried the following code for upsampling, but it only produces noise:

/* 
 * ratio: ratio between the 2 sample rates
 * bufsz: size of the source buffer
 * buffer: source buffer
 */
static char*
resample(char *buffer, int bufsz, float ratio)
{
        int nsize = bufsz * ratio;
        char *to_return = malloc(nsize);

        for (int i = 0; i < nsize; i++) {
                if (i == 0) {
                        to_return[i] = buffer[0];
                        continue;
                }

                if (i == nsize - 1) {
                        to_return[i] = buffer[bufsz-1];
                        continue;
                }

                int j = i / ratio;
                to_return[i] = buffer[j];
        }

        for (int i = 1; i < nsize-1; i++) {
                if (to_return[i] == to_return[i-1] || to_return[i] == to_return[i+1]) {
                        to_return[i] = (to_return[i+1] + to_return[i-1])/ (float) 2;
                }
        }

        return to_return;
}

icy cargo
#

This code is to help me go from 44100Hz to 48000Hz

odd cargo
#

One thing I see is that your if condition (i == 0) could be outside the loop, and you can start the for loop with i = 1

#

not sure what you're doing though

icy cargo
#

I have a buffer of size bufsz which is a collection of samples at 44100Hz. The resample function is to convert the buffer to a 48000Hz signal. The first loop uses the ratio, which is 48000/44100. The ratio is used to get an index in buffer and copy that value into to_return. This is done by taking "i" and approximating what index it corresponds to in the buffer. Since doing this creates duplicate values, I use the second loop to smooth things out by averaging the previous and next values of the current index.

forest lava
#

your second pass seems to be adding the noise in my testing

#

i think you did your smooth wrong

#

should be something like

int old = to_return[0];    
for (int i = 1; i < nsize-1; i++) {
    int cur = to_return[i];
    to_return[i] = (old + cur + to_return[i+1]) / 3;
    old = cur;
}
#

not quite right either :/

#

hmm

#

alright I think I found the problem @icy cargo :3

char *to_return = malloc(nsize);

should be unsigned char*

otherwise the division in

 to_return[i] = (to_return[i+1] + to_return[i-1])/ (float) 2;

causes unexpected results

#

it would also be good to cast to_return[...] to something larger than a char to guarantee that you don't have any overflows
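To make the point about signedness and overflow concrete, here is a minimal sketch, assuming unsigned 8-bit PCM input. The helper names `avg_u8` and `avg_as_signed` are made up for illustration and are not part of the original code.

```c
#include <stdint.h>

/* Average two unsigned 8-bit samples without overflow.
 * avg_u8 is a hypothetical helper, not part of the original post. */
uint8_t
avg_u8(uint8_t a, uint8_t b)
{
        /* Widen to int first: 200 + 250 does not fit in 8 bits. */
        int sum = (int)a + (int)b;
        return (uint8_t)(sum / 2);
}

/* The same arithmetic on signed chars, for comparison. The bytes 200
 * and 250 read as -56 and -6 when char is signed, so the "average" of
 * two loud samples comes out as a small negative number. */
int
avg_as_signed(signed char a, signed char b)
{
        return ((int)a + (int)b) / 2;
}
```

With these, `avg_u8(200, 250)` gives 225, while the same two bytes seen as signed values average to -31, which is the kind of unexpected result described above.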

icy cargo
#

Setting it to unsigned char makes it sound better but my ears are still bleeding unfortunately

forest lava
#

can you send an example wav file?

#

my tests are basically middle C for 2.5 seconds then A for 2.5 seconds

icy cargo
#

I can't send you that, sorry, it's copyrighted :/

forest lava
#

aww

icy cargo
#

I used ffmpeg to convert it to 44100Hz

#

I can send you the link but not the file

#

The original track is in 48000Hz

#

I'm using vorbis to decode it and grab the buffer and read it with syscalls

icy cargo
#

Huh?

#

What's your OS?

forest lava
#

linux :3

#

I did swap out your smoothing pass with my own thing

icy cargo
#

Wait, are you converting then outputting the file or just the signal?

#

Cause what I'm trying to do is like what mpv does

forest lava
#

i've converted the original soinc webm to a u8 pcm file at 44.1khz

#

and thats what i'm taking as input

#

and i'm just printing out the raw pcm at 48khz and piping it into ffmpeg to convert it to a wav

icy cargo
#

pcm is uncompressed right?

forest lava
#

yeah

icy cargo
#

Maybe it's because I'm playing an ogg file that I'm getting these noises

#

Cause I converted it into a 44100Hz ogg file

#

My function could also be wrong too. I'll try figuring it out

icy cargo
#

Hold on, you did that with my function, the one you slightly corrected?

forest lava
#

yeah

icy cargo
#

Alright so... watch out for your ears:

forest lava
#

🥴

#

yeah that don't sound very good

icy cargo
#

I'm getting there tho. I think

forest lava
#

the original video is in stereo

#

do you know if you are getting a mono or stereo stream?

icy cargo
#

It's in stereo. Despite the sound being wonky, if you ignore the noises it sounds like it does in the video

forest lava
#

if you are getting stereo you have to split the two interleaved streams, run the function on both and then interleave them back together

#

or make a more complicated function that basically does the same thing but with two unsigned chars at a time instead of one
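A minimal sketch of the split-and-rejoin approach, assuming interleaved 8-bit samples in L R L R order as in the code at this point in the thread (for 16-bit audio the element type would be `int16_t` instead). The function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

/* Split interleaved stereo (L R L R ...) into two mono arrays.
 * frames is the number of L/R pairs, not the number of samples. */
void
deinterleave(const uint8_t *in, size_t frames, uint8_t *left, uint8_t *right)
{
        for (size_t i = 0; i < frames; i++) {
                left[i]  = in[2 * i];
                right[i] = in[2 * i + 1];
        }
}

/* Merge two mono arrays back into one interleaved stereo stream. */
void
interleave(const uint8_t *left, const uint8_t *right, size_t frames,
           uint8_t *out)
{
        for (size_t i = 0; i < frames; i++) {
                out[2 * i]     = left[i];
                out[2 * i + 1] = right[i];
        }
}
```

Each channel would be resampled on its own between the two calls. Resampling the interleaved buffer directly mixes left and right samples together, which is one plausible source of the noise.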

icy cargo
#

Ok. I'll try doing it in mono first to see if my function works for me

#

I have a feeling though that resampling shouldn't be done like how I did it

forest lava
#

It works

#

¯_(ツ)_/¯

#

wikipedia would have you do some sort of interpolation, but you aren't going to magically get more quality than the original file, it will just be a little smoothed out

#

which actually might sound weird for chip tune stuff

icy cargo
#

I'm not gonna stick with chip tunes though. I'm doing this so I can play different audio files with a medium margin of configuration

forest lava
#

yeah interpolation is kinda hard :/

#

easiest way would just be linear interpolation

#

basically you view your audio stream as a set of points (x, y) where x = time and y = magnitude and then you draw lines between consecutive points

icy cargo
#

That's kinda what I'm doing in a sense, since I'm taking the average between points

#

I feel like

#

But I get earrape interpolation instead

forest lava
#

not quite you are taking the floor of i / ratio as your index which basically just repeats the same value until you pass the time at which the next sample starts
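A small sketch of that floor mapping for the 44100 Hz to 48000 Hz case, using the reduced ratio 160/147 in integer math. The function name is made up for illustration.

```c
/* Source index that the floor of i / ratio picks for output sample i,
 * with ratio = 48000 / 44100 = 160 / 147. */
int
floor_source_index(int i)
{
        return (int)((long)i * 147 / 160);
}
```

Outputs 0 and 1 both map to source sample 0, and outputs 12 and 13 both map to source sample 11, so roughly every twelfth value is a plain repeat rather than an in-between value.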

#

for interpolation at any given time between the last sample and the next sample you want to find the value that lies on the line between them

icy cargo
#

I thought I was fixing that by creating a second loop that nullifies any repeating values. Basically what you're saying is that I have to make a single loop and create a function that returns the sample y value for each index?

forest lava
#

your second loop doesn't remove repeat values it finds the edges of repeat values and averages them

icy cargo
#

Yeah in a sense edges could be identical values as well

forest lava
#

so if you had like 1 3 3 3 3 3 3 1 that would get smoothed out to 1 2 3 3 3 3 3 2 1

#

wait it does something weirder than that

#

1 3 3 3 3 3 3 1 would go to 1 2 2 2 2 2 3 1

#

i believe
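Tracing the second loop by hand is error-prone because it modifies the buffer in place, so each change feeds into the next comparison. A sketch that reproduces that loop on unsigned 8-bit samples (`original_smooth` is a name chosen here, and integer division stands in for the float division plus truncation in the original):

```c
#include <stddef.h>

/* The smoothing pass from the original post, applied in place.
 * buf[i - 1] has already been overwritten by the time it is read. */
void
original_smooth(unsigned char *buf, size_t n)
{
        for (size_t i = 1; i + 1 < n; i++) {
                if (buf[i] == buf[i - 1] || buf[i] == buf[i + 1])
                        buf[i] = (unsigned char)((buf[i + 1] + buf[i - 1]) / 2);
        }
}
```

Run on 1 3 3 3 3 3 3 1 it leaves 1 2 2 2 2 2 3 1: the first 3 is pulled down to 2, every following 3 is averaged with that already modified 2, and the last 3 survives because by then it matches neither neighbour.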

icy cargo
#

Yeah it does

#

So for interpolation I'd need to find a and b in ax+b

forest lava
#

ideally with linear interpolation instead of 1 1 1 7 7 7 7 you would get 1 2 3 4 5 6 7

icy cargo
#

Which is easy since values are in sampleA but the problem is how am I supposed to fill sampleB if its bigger

forest lava
#

when you are at some time T you find a sample that happens before T and a sample that happens after T draw a line between them and find the y value at your current time

#

if a sample happens at time T use the sample value
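The "value on the line" can be written as a one-line helper. This is only a sketch: `lerp_sample` and its parameters are hypothetical names, with `frac` standing for how far time T lies between the sample before it and the sample after it.

```c
/* Value on the straight line between two consecutive samples.
 * y0 is the sample at or before time T, y1 the sample after it,
 * and frac (0 <= frac < 1) is the position of T between the two. */
float
lerp_sample(float y0, float y1, float frac)
{
        return y0 + (y1 - y0) * frac;
}
```

When T falls exactly on a sample, `frac` is 0 and the helper returns that sample unchanged; halfway between samples 1 and 7 it returns 4.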

icy cargo
#

Maybe I expressed myself wrong. Let's say this is the buffer at 44100Hz: 1 1 1 7 7 7 7. I want to convert it to 48000Hz, which will have more values. How can I get a value for x if the graph stops at, let's say, x-50?

#

Or am I getting something wrong?

forest lava
#

so your original sample array will be smaller than your final sample array

but you play them back in the same amount of time

#

basically the points are farther apart in the first sample array and closer together in the second sample array

icy cargo
#

Yeah since there's more samples

icy cargo
#

So in a sense the ratio would be ideal for that

forest lava
#

you will need the ratio yes

icy cargo
#

Then you'd need the values at the edges to get the a

#

Ok I think I got it:

1. Get the ratio
2. Loop through the final sample array
3. Convert the index from the final sample array to the source sample array index
4. Find the function with values at the edges in source sample array
5. Get your y value
6. Repeat
forest lava
#
1. Get the ratio
2. Loop through the final sample array
3. Convert the index from the final sample array to its time T
4. Find the samples in the source sample array that happen before T and after T
5. Find the line between them
6. Get your y value at time T
7. Repeat
#

step 4 is easier than it looks, the two samples are consecutive (or the same sample) and they are either the same as the previous iteration or the next consecutive pair
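The steps above can be sketched as follows. This is one possible reading, not the code anyone in the thread ran: it assumes signed 16-bit mono samples, takes the two rates in Hz, and uses integer math for the time-to-index conversion. The name `resample_linear` and the `out_len` parameter are introduced here for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Linear-interpolation resampler. Returns a malloc'd buffer (caller
 * frees) and stores its length in *out_len, or returns NULL on empty
 * input or allocation failure. */
int16_t *
resample_linear(const int16_t *in, size_t in_len,
                unsigned in_rate, unsigned out_rate, size_t *out_len)
{
        if (in_len == 0 || in_rate == 0 || out_rate == 0)
                return NULL;

        size_t n = (size_t)((uint64_t)in_len * out_rate / in_rate);
        if (n == 0)
                return NULL;

        int16_t *out = malloc(n * sizeof *out);
        if (out == NULL)
                return NULL;

        for (size_t i = 0; i < n; i++) {
                /* Output sample i happens at T = i / out_rate seconds,
                 * which is source position i * in_rate / out_rate. */
                uint64_t num = (uint64_t)i * in_rate;
                size_t idx = (size_t)(num / out_rate);
                double frac = (double)(num % out_rate) / out_rate;

                /* Samples just before and just after T; the last sample
                 * is repeated when there is no next one. */
                int16_t y0 = in[idx];
                int16_t y1 = (idx + 1 < in_len) ? in[idx + 1] : in[idx];

                out[i] = (int16_t)(y0 + (y1 - y0) * frac);
        }

        *out_len = n;
        return out;
}
```

For example, the four samples 0 10 20 30 going from 2 Hz to 4 Hz come out as 0 5 10 15 20 25 30 30, with the final value held because there is nothing after it to interpolate towards.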

icy cargo
#

When you say T do you mean values between source array index 0 to source array max index?

forest lava
#

the time in seconds

#

when that sample happens

#

for example your 50th sample at 44.1khz happens 50 * (1 / 44.1khz) seconds into the song

icy cargo
#

In that case that means I'd need the whole song

#

The problem is that vorbis's ov_read() only gives me crumbles

forest lava
#

you can do it as an offset from the beginning of the crumble :)
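One way to work chunk by chunk is to carry a little state between calls: how many output samples have been produced, how many source samples have gone by, and the last sample of the previous chunk. The sketch below assumes signed 16-bit mono samples; `struct lerp_state` and `lerp_chunk` are hypothetical names, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Streaming state; zero-initialize before the first chunk. */
struct lerp_state {
        uint64_t next_out;   /* index of the next output sample */
        uint64_t consumed;   /* source samples seen in earlier chunks */
        int16_t  prev;       /* last sample of the previous chunk */
};

/* Resample one chunk and return the number of samples written to out.
 * out needs room for roughly n * out_rate / in_rate + 2 samples. */
size_t
lerp_chunk(struct lerp_state *st, const int16_t *in, size_t n,
           unsigned in_rate, unsigned out_rate, int16_t *out)
{
        size_t written = 0;

        if (n == 0)
                return 0;

        for (;;) {
                uint64_t num = st->next_out * in_rate;
                uint64_t idx = num / out_rate;   /* global source index */
                double frac = (double)(num % out_rate) / out_rate;

                /* Samples idx and idx + 1 are both needed; stop when
                 * idx + 1 has not arrived yet. */
                if (idx + 1 >= st->consumed + n)
                        break;

                /* idx is either the carried-over sample or in this chunk. */
                int16_t y0 = (idx < st->consumed) ? st->prev
                                                  : in[idx - st->consumed];
                int16_t y1 = in[idx + 1 - st->consumed];

                out[written++] = (int16_t)(y0 + (y1 - y0) * frac);
                st->next_out++;
        }

        st->prev = in[n - 1];
        st->consumed += n;
        return written;
}
```

Feeding 0 10 and then 20 30 at 2 Hz to 4 Hz yields 0 5 from the first call and 10 15 20 25 from the second, the same values a whole-buffer pass would give, with no click at the chunk boundary.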

icy cargo
#
static char*
resample(char *buffer, int bufsz, float ratio)
{
        int fbufsz = bufsz * ratio;

        unsigned char *to_return = malloc(fbufsz);
        memset(to_return, 0, fbufsz);

        to_return[0] = buffer[0];
        to_return[fbufsz-1] = buffer[bufsz-1];

        for (int i = 1; i < fbufsz-1; i++) {
                float T = i / ratio;

                /*
                 * linear interpolation
                 */
                int x1 = ((int) T) - 1;
                int x2 = ((int) T) + 1;
                float a = (buffer[x2] - buffer[x1]) / (float) (x2 - x1);
                float b = buffer[x1] - a * x1;

                to_return[i] = (int) (a * T + b);
        }


        return to_return;
}
I think I'm doing something wrong
#

EARS BEWARE my attempt at playing it at 48000Hz

coarse mauve
#

As I understand the process, typically up sampling 44100 to 48000 would involve resampling first to the least common multiple of the input and output sample rates (interpolation). You would then apply a low-pass filter to prevent aliasing. You would then down-sample the filtered signal to the target sample rate (decimation).

So, for these two sample rates the GCD would be 300. The up-sample factor would be 48000 / 300 or 160 and the down-sample factor would be 44100 / 300 or 147. For interpolation, you would insert up-sample factor minus 1 zeros between every sample in the original recording. You would then apply a low pass filter with a cut-off at the Nyquist frequency of the original recording (22050 Hz). Lastly you would down sample to your target through decimation by taking every down-sample-factor-th sample from the filtered signal and chucking the rest.

Inserting zeros increases the sample rate but inserts high frequency artifacts. Since the original recording cannot reproduce frequencies above the Nyquist frequency, you remove everything above that frequency with a low pass filter after interpolation but before decimation so those high frequency components cannot cause aliasing.
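The factor arithmetic above can be checked with a few lines of C. This is just a sketch of the GCD reduction; `gcd_u` and `resample_factors` are names chosen for illustration.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned
gcd_u(unsigned a, unsigned b)
{
        while (b != 0) {
                unsigned t = a % b;
                a = b;
                b = t;
        }
        return a;
}

/* Reduce out_rate / in_rate to the smallest up/down factor pair. */
void
resample_factors(unsigned in_rate, unsigned out_rate,
                 unsigned *up, unsigned *down)
{
        unsigned g = gcd_u(in_rate, out_rate);

        *up = out_rate / g;
        *down = in_rate / g;
}
```

For 44100 and 48000 the GCD is 300, giving an up factor of 160 and a down factor of 147.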

forest lava
#

for the lowpass filter you would need a fourier transform and an inverse fourier transform. basically you take your signal to frequency space, zero out all frequencies above the Nyquist frequency and then apply the inverse fourier transform to the result

coarse mauve
#

A simple FIR filter will do nicely
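For reference, applying an FIR filter is just a weighted sum of the most recent inputs. The sketch below shows only that step, with the coefficients passed in by the caller; designing a low-pass suited to this resampling job is a separate problem that it does not cover. `fir_filter` is an illustrative name.

```c
#include <stddef.h>

/* Direct-form FIR: y[n] = sum over k of h[k] * x[n - k],
 * treating x as zero before the start of the buffer. */
void
fir_filter(const float *x, size_t n, const float *h, size_t taps, float *y)
{
        for (size_t i = 0; i < n; i++) {
                float acc = 0.0f;

                for (size_t k = 0; k < taps && k <= i; k++)
                        acc += h[k] * x[i - k];
                y[i] = acc;
        }
}
```

With the three taps 0.25, 0.5, 0.25 (a toy smoothing kernel, assumed here only for the example) a constant input of 4 settles to an output of 4 after the first two samples. There is no feedback from earlier outputs, only from earlier inputs.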

icy cargo
#
/*
 * buffer: sample array
 * size: size of sample array
 * ufactor: 160
 * dfactor: 147
 */
static char*
resample(char *buffer, int size, int ufactor, int dfactor)
{
  int fsize = size * ufactor;

  unsigned char *to_return = malloc(fsize);
  memset(to_return, 0, fsize);

  for (int i = 0; i < size; i++) {
    int j = i * ufactor / dfactor;
    to_return[j] = buffer[i];
  }

  return to_return;
}
Now I just gotta low-pass filter I think
coarse mauve
#

hmmm... maybe closer to this?

    for (int i = 0; i < size; i++) {
        to_return[i * ufactor] = buffer[i];
    }
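A self-contained sketch of that zero-stuffing step, assuming signed 16-bit samples rather than the char buffers used so far. `zero_stuff` is a made-up name; the returned buffer holds n * factor samples.

```c
#include <stdint.h>
#include <stdlib.h>

/* Upsample by an integer factor: each input sample is followed by
 * (factor - 1) zeros. Returns NULL on empty input or allocation
 * failure; the caller frees the result. */
int16_t *
zero_stuff(const int16_t *in, size_t n, unsigned factor)
{
        if (n == 0 || factor == 0)
                return NULL;

        /* calloc zero-fills, so only the original samples are written. */
        int16_t *out = calloc(n * factor, sizeof *out);
        if (out == NULL)
                return NULL;

        for (size_t i = 0; i < n; i++)
                out[i * factor] = in[i];

        return out;
}
```

Note the allocation: with a factor of 160 the intermediate buffer is 160 times the size of the input.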
coarse mauve
#

Is it your intent to use 8 bit signed samples?

icy cargo
#

16 bit signed right now

#

I'm on an OpenBSD machine trying to play with sndio a 16 bit signed mono file with a sample rate of 44100Hz on a stereo device with a sample rate of 48000Hz

coarse mauve
#

Well, your resample code is being passed a pointer to a buffer of type char and the return buffer is being treated as an unsigned char, both 8 bit, one signed, the other not. So, these should change to uint16_t or equivalent. So if your samples are unsigned, then the "zero" crossing point is biased up to the 1/2 way point in your 16 bit range (32768). Nothing wrong with it, just something to keep in mind.

#

Oh, sorry you said signed, but I read it as unsigned. Disregard.

#

You do still need to fix the data type for each sample in and out however.
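Since the decoder hands back a char buffer and a byte count, one way to get properly typed samples is to copy the bytes into an `int16_t` array before resampling. The sketch assumes native-endian signed 16-bit PCM; `bytes_to_s16` is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

/* Copy a byte buffer of native-endian signed 16-bit PCM into int16_t
 * samples. Returns the number of samples written; a trailing odd byte
 * is ignored. */
size_t
bytes_to_s16(const char *bytes, size_t nbytes, int16_t *samples)
{
        size_t n = nbytes / sizeof(int16_t);

        memcpy(samples, bytes, n * sizeof(int16_t));
        return n;
}
```

The sample count is half the byte count, which matters for every loop bound and buffer size in the resampler: indexing the byte buffer as if each byte were a sample mixes the low and high halves of neighbouring samples.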

coarse mauve
#

It needs to be large enough for the up-sampled version of the recording.

icy cargo
#

Oh I see I gotta low pass filter then down sample

coarse mauve
#

yes, that is correct

icy cargo
#

A low-pass filter is a filter that passes signals with a frequency lower than a selected cutoff frequency and attenuates signals with frequencies higher than the cutoff frequency. The exact frequency response of the filter depends on the filter design. The filter is sometimes called a high-cut filter, or treble-cut filter in audio applications. ...

#

From what I understand a low-pass filter needs values such as RC and delta time

coarse mauve
#

I would recommend looking at an FIR filter. You can of course do whatever works for you.

icy cargo
#

Is it better in term of performance and simplicity?

coarse mauve
#

They are good for audio applications. They are completely stable because they do not use feedback. They can have a linear phase response (when the coefficients are symmetric), so you are not introducing phase distortion. FIR filters can be very precise in their frequency shape and response, plus they are simple to implement.

coarse mauve
#

Sure, there are a lot of tools out there that can accomplish the task of resampling an audio file. If the goal is to convert the file, go for it. If the goal is to implement something on your own, happy to help.

smoky gazelle
#

Also I suggest you write code to normalize the output files to -1 dB true peak, measuring the true peak with 4x oversampling. So if your audio is at 48 kHz, resample to 192 kHz and measure the max peak there. The difference between the 48 kHz max peak and the 192 kHz max peak is how much the plain sample peak underestimates the true peak. You want to stay 1 dB below full scale because even at 4x oversampling there is still a risk of around 0.5 dB of error.
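As a starting point, here is a sketch that measures the plain sample peak of a signed 16-bit buffer and computes the gain that would bring it to -1 dBFS. It does not do the 4x oversampling, so it is not a true-peak measurement; the same peak search would be run on the oversampled copy for that. Both function names are made up, and the constant 0.8912509 is 10^(-1/20).

```c
#include <stddef.h>
#include <stdint.h>

/* Largest absolute sample value (sample peak, not true peak). */
int
sample_peak(const int16_t *buf, size_t n)
{
        int peak = 0;

        for (size_t i = 0; i < n; i++) {
                int v = buf[i] < 0 ? -(int)buf[i] : (int)buf[i];

                if (v > peak)
                        peak = v;
        }
        return peak;
}

/* Linear gain that moves `peak` to -1 dB relative to full scale
 * (32768 for 16-bit audio). Returns 1 for a silent buffer. */
float
gain_for_minus_1db(int peak)
{
        if (peak == 0)
                return 1.0f;
        return 0.8912509f * 32768.0f / (float)peak;
}
```

Multiplying every sample by the returned gain, then rounding and clamping to the 16-bit range, would complete the normalization.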

icy cargo
#

I'll try to get the resampling to 48000Hz working first, since it's not working right now. Then I'd apply the quality fix with the true peak.

#

From what I understand right now the GCD of 48000Hz and 44100Hz is 300

#

When using GCD I get : 48000/44100 => 160/147

#

So I have to:

  1. Upsample by x160
  2. Apply FIR on the sample array
  3. Downsample by x147
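The three steps can be sketched literally, with one deliberate simplification: the FIR used here is a triangular kernel of length 2 * up - 1, which turns the whole pipeline into linear interpolation. A real anti-aliasing low-pass would be a longer, designed filter. The function name, float samples, and the `out_len` parameter are assumptions for the example.

```c
#include <stdlib.h>

/* Rational resampling by up/down: zero-stuff, filter, decimate.
 * Returns a malloc'd buffer of *out_len samples, or NULL on failure. */
float *
resample_rational(const float *x, size_t n, unsigned up, unsigned down,
                  size_t *out_len)
{
        if (n == 0 || up == 0 || down == 0)
                return NULL;

        size_t ulen = n * up;
        size_t ylen = ulen / down;
        if (ylen == 0)
                return NULL;

        float *u = calloc(ulen, sizeof *u);
        float *y = malloc(ylen * sizeof *y);
        if (u == NULL || y == NULL) {
                free(u);
                free(y);
                return NULL;
        }

        /* 1. Upsample: original samples land on multiples of up. */
        for (size_t i = 0; i < n; i++)
                u[i * up] = x[i];

        /* 2 and 3. Filter, but only at the positions that decimation
         * keeps (every down-th sample of the upsampled stream). */
        for (size_t j = 0; j < ylen; j++) {
                long m = (long)(j * down);
                float acc = 0.0f;

                for (long k = -(long)(up - 1); k <= (long)(up - 1); k++) {
                        long src = m - k;
                        long mag = k < 0 ? -k : k;

                        if (src < 0 || (size_t)src >= ulen)
                                continue;
                        /* Triangular tap: 1 at the centre, falling to 0. */
                        acc += (1.0f - (float)mag / (float)up) * u[src];
                }
                y[j] = acc;
        }

        free(u);
        *out_len = ylen;
        return y;
}
```

With up = 4 and down = 3 the input 0 8 16 24 becomes 0 6 12 18 24. The intermediate buffer is up times the input size, so for 160/147 practical implementations usually avoid building it and compute only the needed taps per output sample (a polyphase filter).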
#

I will also normalize like you wrote Juan by finding the true peak

coarse mauve
#

sounds like a plan

icy cargo
#
static char*
resample(char *buffer, int size, int ufactor, int dfactor)
{
        int usize = size * ufactor;             /* upsample size */
        int dsize = usize / dfactor;    /* downsample size */

        char *usample = malloc(usize);
        char *dsample = malloc(dsize);

        memset(usample, 0, usize);
        memset(dsample, 0, dsize);

        /*
         * upsampling the original audio
         */
        usample[0] = buffer[0];
        for (int i = 1; i < size; i++) {
                usample[i * ufactor] = buffer[i];
        }

        /*
         * low pass filter
         */
        int cutoff = 44100/2;
        float alpha = LPFAlpha(cutoff, 44100);

        int filterArray[usize];
        memset(filterArray, 0, sizeof(filterArray));


        filterArray[0] = alpha * usample[0];
        for (int i = 1; i < usize; i++) {
                filterArray[i] = alpha * usample[i-1] + (1-alpha) * filterArray[i-1];
                if (filterArray[i] != 0)
                        printf("%d; ", filterArray[i]);
        }
        printf("\n\n");

        /*
         * downsample
         */

        for (int i = 0; i < dsize; i++) {
                int x = i * dfactor;
                dsample[i] = filterArray[x];
        }
        return dsample;
}
Went with LPF after all, I understand it a bit better. I just need to do some tweaks and I think I'll be alright
coarse mauve
#

Looks like a good start. A couple of comments to consider:

  • the code as written is assuming 8 bit signed samples during up sampling.
  • the filterArray is assuming 32 bit signed samples
  • I don't see an implementation of the LPFAlpha so I cannot comment on the behaviour.
  • the down sample is converting 32 bit filtered samples to 8 bit samples.

For giggles, I implemented a FIR lowpass filter of order 51 and plotted the filter shape. It turned out pretty good though it could be improved. I would be keen to see the performance of your design.

icy cargo
#

Your first point makes me reflect though. Apparently vorbis files are signed 16 bits in general, which is the case in this situation. However, you said that I wrote the code for signed 8-bit samples, which from what I understand is not right and might be the reason why my realtime conversion is not working properly.

#

Also here is the LPFAlpha:

/*
 * the formula is cutoffFreq = 1/(2*pi * RC)
 * pi => 3.1416...
 */
static float
LPFAlpha(float cutoffFreq, float sample_rate)
{
        float RC = 1.0 / (cutoffFreq * PI * 2);
        float dt = 1.0 /sample_rate;
        float alpha = dt /(RC+ dt);

        return alpha;
}
coarse mauve
#

Well, in reality your LPFAlpha function is merely computing the smoothing factor for a discrete-time first-order low-pass filter based on the filter's cutoff frequency and the sample rate. It is not the filter itself. You would use it to compute the filtered output typically according to this formula:

y[n]=α⋅x[n]+(1−α)⋅y[n−1]

In this formula y[n] is the filtered output at the current time step. x[n] is the current input sample. y[n-1] of course is the filtered output from the previous time step. and α is your smoothing factor. You should be able to update your sample stream in place if you use this approach.

Your approach has some advantages and disadvantages. It is simple, requiring only the alpha value and the previous and current samples. The computational cost is very low, O(1) per sample, with minimal state (only the previous filtered sample). All of this makes it very suitable for real-time processing.

You will find however that there are some disadvantages. I mention them not to discourage you. I strongly recommend you complete the implementation using this approach.

This filter will have limited frequency response meaning it will struggle to attenuate frequencies close to Nyquist. The roll-off will only be about 6dB per octave due to its first order design. You will also see significant phase distortion due to its recursive nature (IIR) particularly near the cutoff frequency which in audio applications is important. Lastly this kind of filter is more sensitive to numeric precision issues.

Again I recommend you update your implementation and get this approach working and you can decide if it is sufficient. Many is the time I have filtered the PWM output of an audio stream with a simple 1K0 resistor and 0.01uf capacitor.

#

You would calculate the alpha value only once and use it to apply the formula above to each sample.
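A sketch of the formula applied in place, assuming float samples and y[-1] = 0. The names `lpf_alpha` and `lpf_inplace` are chosen here; the alpha computation mirrors the LPFAlpha function posted above.

```c
#include <stddef.h>

/* Smoothing factor for a first-order low-pass: alpha = dt / (RC + dt). */
float
lpf_alpha(float cutoff_hz, float sample_rate_hz)
{
        const float pi = 3.14159265f;
        float rc = 1.0f / (2.0f * pi * cutoff_hz);
        float dt = 1.0f / sample_rate_hz;

        return dt / (rc + dt);
}

/* In place: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]. */
void
lpf_inplace(float *buf, size_t n, float alpha)
{
        float prev = 0.0f;   /* y[-1], assumed to be silence */

        for (size_t i = 0; i < n; i++) {
                prev = alpha * buf[i] + (1.0f - alpha) * prev;
                buf[i] = prev;
        }
}
```

Two things stand out when comparing this with the earlier resample attempt, offered as observations rather than confirmed fixes: the formula uses the current input x[n], not the previous one, and the sample rate passed to the alpha calculation should be the rate of the stream being filtered. For a stream upsampled by 160 that is 44100 * 160 Hz, which gives an alpha of roughly 0.019 for a 22050 Hz cutoff.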

#

Your filter frequency and phase response will look something like this:

coarse mauve
#

There is one other thing you said that concerned me, namely that you are using vorbis files. While I am not intimately familiar with vorbis, what I believe is that vorbis is specifically designed as a lossy, high compression audio format. What I mean by lossy is that it specifically will drop data from the original to achieve high compression while endeavoring to maintain some level of audio quality.

#

The upshot of this is that nothing we have been talking about takes into account any compression of the audio samples. I am presuming in everything I have said that the audio samples are uncompressed PCM data. So, if you are going to use audio from such a source you will need to uncompress it and store it as simple PCM data for what we are doing here to work properly.

icy cargo
#

So the compression shouldn't really be an issue

coarse mauve
#

agreed, thank you for the clarification.

coarse mauve
#

If you are having trouble with filters, you could minimize the need for a filter by doing a linear interpolation of the values inserted rather than inserting zeros and low-pass filtering as you would not be inserting quite the amount of high frequency artifacts. Even fewer artifacts will be inserted if you use sinc interpolation.

icy cargo
#

Haha yeah that's my issue I think I implemented the thing wrong

icy cargo
#

I think I will do this. The ratio is M/L. I will upsample by M and interpolate zeros every Mth element

coarse mauve
#

Happy to give feedback on what you did if you would like. It may be good enough for you with interpolated (linear) samples between the real samples, so you only have to directly upsample from 44100 to 48000

icy cargo
#
static int16_t*
resample(int16_t *buffer, int elements, int ufactor, int dfactor)
{
        int size = elements * ufactor / dfactor;
        int16_t *result = malloc(sizeof(int16_t) * size);
        memset(result, 0, size * sizeof(int16_t));

        result[0] = buffer[0];
        result[size] = buffer[elements];

        /* Linear interpolation */
        for (int i = 1; i < size-1; i++) {
                float x0 = (i - 1) / (float) size;
                float x1 = (i + 1) / (float) size;

                float y0 = buffer[(int) (x0 * elements)];
                float y1 = buffer[(int) (x1 * elements)];

                /*
                 * Setting the result value
                 */
                float x = i / (float)size;
                result[i] = ( ( y0 * (x1 - x) + y1 * (x - x0) ) / (x1 - x0) );

                printf("%d; ", result[i]);
        }
        printf("\n\n");



        return result;
}
Already sounds better but still a bit of noise
#

Lots of buzz sound

#

IT WORKS!!!

#

Thanks @coarse mauve @forest lava @smoky gazelle For the feedback

#

!solved

coarse mauve
#

You have some problems writing out of bounds and in your interpolation logic. Try this out:

static int16_t*
resample(int16_t *buffer, int elements, int ufactor, int dfactor)
{
    int size = elements * ufactor / dfactor;
    int16_t *result = malloc(sizeof(int16_t) * size);
    if (!result) {
        return NULL; // Handle memory allocation failure
    }

    // Assign the boundaries
    result[0] = buffer[0];
    result[size - 1] = buffer[elements - 1];

    // Linear interpolation
    for (int i = 1; i < size - 1; i++) {
        float x = (float)i / (size - 1);
        float x0 = (x * (elements - 1));
        int idx0 = (int)x0;
        int idx1 = idx0 + 1;

        float y0 = buffer[idx0];
        float y1 = buffer[idx1];

        // Interpolation
        result[i] = (int16_t)((y0 * (idx1 - x0) + y1 * (x0 - idx0)) / (idx1 - idx0));
    }

    return result;
}
#

You can always do the filter later or a sinc interpolation if you are not happy with the result

icy cargo
#

Thanks for the fix I didn't realize it. I'm realizing how accurate you have to be when doing audio programming