Any guidance on where my time start ends | OpenAI | Page 1

velvet magnet May 31, 2023, 2:52 PM

#

Let me know if you get an answer for this. I also need to figure out a policy of adding timestamps.

dusk hazel Jun 1, 2023, 4:47 AM

#

If we had some example training data from someone experienced that would probably help.

velvet magnet Jun 1, 2023, 2:53 PM

#

We think it looks a little like this, but haven't verified yet: <|0.0|>I have a cute dog.<|4.4|><|4.8|>She is small but runs so fast.<|9.2|> Timestamps should be rounded to the nearest 0.02 seconds.
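For anyone who wants to try the format above, here is a minimal Python sketch of the rounding and token layout. It is unverified against real training data, just like the example itself. The function names are made up for illustration, and the two-decimal rendering (<|4.40|> rather than <|4.4|>) is an assumption about how the timestamp tokens are spelled in Whisper's vocabulary.

```python
def to_timestamp_token(seconds, step=0.02):
    """Round to the nearest `step` seconds and render as a <|t.tt|> string.

    Two decimal places are an assumption; the example in the post uses one.
    """
    ticks = round(seconds / step)      # whole number of 0.02 s steps
    return f"<|{ticks * step:.2f}|>"


def label_segments(segments):
    """Join (start_s, end_s, text) tuples into one labeled transcript string."""
    return "".join(
        f"{to_timestamp_token(start)}{text}{to_timestamp_token(end)}"
        for start, end, text in segments
    )


# Hypothetical hand-measured boundaries, not yet on the 0.02 s grid
example = [
    (0.0, 4.407, "I have a cute dog."),
    (4.806, 9.2, "She is small but runs so fast."),
]
print(label_segments(example))
```

Rounding through an integer tick count keeps every value on the 0.02-second grid instead of relying on decimal rounding of the raw float.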

dusk hazel Jun 2, 2023, 10:59 PM

#

But how does that look w/r/t the starts/ends of the waveforms?

dusk hazel Jun 3, 2023, 1:52 AM

#

https://github.com/openai/whisper/discussions/1417

GitHub

Tiny dataset as an example of labeling? · openai whisper · Discussi...

Can anyone provide a tiny dataset example (just one might even do) where we can see the ideal positioning of the start/stop times for our text? I'm not sure: How close I should go to the start ...

velvet magnet Jun 5, 2023, 2:59 PM

#

Oh, you mean where to position the timestamps with respect to the word waveforms within the audio file? I assume we should position them as close as possible to where the audible voice starts and stops. So assuming there is no background noise and no filler words that we might want to ignore (I'm not sure if we should ignore filler words or not...), the timestamps would have zero amplitude on one side and the smallest perceivable amplitude on the other. But I'm still not sure how sensitive Whisper training will be to variations or imperfect positions. I hope to find that out, too.
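A rough sketch of that idea in Python, under the same no-background-noise assumption: take the first and last samples whose amplitude reaches some threshold as the boundaries. The function name and the threshold value are hypothetical, and on real recordings a plain amplitude threshold will likely be too crude.

```python
def voiced_span(samples, sample_rate, threshold=0.01):
    """Return (start_s, end_s) spanning the first to the last sample whose
    absolute amplitude reaches `threshold`, or None if nothing does.

    `threshold` is an illustrative stand-in for "smallest perceivable amplitude".
    """
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return None
    start = loud[0] / sample_rate
    end = (loud[-1] + 1) / sample_rate   # just past the last audible sample
    return start, end


# Toy clip at a 100 Hz sample rate: 0.5 s silence, 1 s of "voice", 0.5 s silence
rate = 100
clip = [0.0] * 50 + [0.2, -0.3] * 50 + [0.0] * 50
print(voiced_span(clip, rate))
```

The resulting start and end times would still need rounding to the 0.02-second grid before being written as timestamps.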

dusk hazel Jun 6, 2023, 4:07 AM

#

Yeah, exactly. I position them right at the boundaries as seen in the spectrogram, which shows the subtle, low-volume but important parts of a vocal pattern more clearly; the waveform display may not show these as visibly.

#

I found this, which may be important, but I'm not sure if it relates to fine-tuning yet (I've not investigated). It might relate to the practical functionality of inference though:
https://github.com/openai/whisper/discussions/759#discussioncomment-4934838

GitHub

Fine-tuning Whisper · openai whisper · Discussion #759

I am trying to fine-tune the whisper to improve the WER for a simulated telephone records in English. I am using the "small model" and a dataset of around 32 hours in English with the aud...

#



whisper/whisper/audio.py

Line 17 in 7858aa9
 CHUNK_LENGTH = 30
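
The arithmetic behind that constant, as a small Python sketch. The 16 kHz sample rate is the SAMPLE_RATE value defined in the same file; the 0.02-second step is the rounding suggested earlier in this thread. Whether fine-tuning labels must stay inside a single 30-second window is an assumption here, not something confirmed in this thread, and `fits_in_chunk` is a made-up helper name.

```python
CHUNK_LENGTH = 30       # seconds, as in whisper/audio.py
SAMPLE_RATE = 16000     # Hz, SAMPLE_RATE in the same file
TIMESTAMP_STEP = 0.02   # seconds, the rounding step suggested in this thread

# Samples in one 30-second window
n_samples = CHUNK_LENGTH * SAMPLE_RATE

# Distinct timestamp values from 0.00 to 30.00 inclusive
n_timestamp_values = round(CHUNK_LENGTH / TIMESTAMP_STEP) + 1


def fits_in_chunk(time_s):
    """True if a timestamp (relative to the window start) lies inside the window."""
    return 0.0 <= time_s <= CHUNK_LENGTH


print(n_samples, n_timestamp_values, fits_in_chunk(9.2), fits_in_chunk(31.0))
```

If the assumption holds, timestamps in a training example would be expressed relative to the start of its 30-second window rather than to the start of the full recording.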

velvet magnet Jun 6, 2023, 2:01 PM

#

Good point. What tool do you use to see the spectrogram?

#

Oh, I didn't realize Audacity does that, too.
