#cheap+efficient way to transcribe audio?


charred dagger
#

I'm trying to train a model on a lot of podcast data and I have a bunch of m4a files that I'm looking to transcribe. I found some services, but they seem expensive. Does anyone know a good way to do this efficiently and without spending a ton of money? Is there maybe some program that I can download?

grizzled rapids
# charred dagger Trying to train a model on a lot of podcast data and I have a bunch of m4a files...

There is a model released by OpenAI called Whisper. It's an Automatic Speech Recognition model that is free to use & download. You can find a few additional videos on how to use it in your workflow.
https://www.youtube.com/watch?v=HbY51mVKrcE

In this video tutorial we show how to quickly convert any audio into text using OpenAI's Whisper - a free open source language audio to text library that works in many different languages!

Whisper Repo: https://github.com/openai/whisper
Whisper Paper: https://openai.com/blog/whisper/

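A minimal sketch of how the Whisper suggestion could be applied to a folder of m4a files, assuming the `openai-whisper` package is installed (`pip install openai-whisper`) and `ffmpeg` is available on the PATH. The helper names, the `base` model size, and the directory names are illustrative assumptions, not something from the thread or the video.

```python
from pathlib import Path


def find_audio(root, suffix=".m4a"):
    """Return every file under root with the given suffix, sorted for a stable order."""
    return sorted(p for p in Path(root).rglob(f"*{suffix}") if p.is_file())


def transcript_path(audio_path, out_dir):
    """Map an audio file to a .txt file of the same name inside out_dir."""
    return Path(out_dir) / (Path(audio_path).stem + ".txt")


def transcribe_all(root, out_dir, model_name="base"):
    """Transcribe each m4a file under root and write one text file per episode."""
    # Imported here so the helpers above work even without Whisper installed.
    import whisper

    model = whisper.load_model(model_name)  # smaller models are faster, larger ones more accurate
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    for audio in find_audio(root):
        target = transcript_path(audio, out)
        if target.exists():
            continue  # skip episodes already transcribed, so the run can be resumed
        result = model.transcribe(str(audio))
        target.write_text(result["text"].strip() + "\n", encoding="utf-8")
```

Calling `transcribe_all("podcasts", "transcripts")` (both directory names are hypothetical) would process the whole folder locally. It runs on a CPU, though a GPU makes long episodes much quicker.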
charred dagger
grizzled rapids
#

Sure, post them here in case others have better answers

charred dagger
grizzled rapids
#

You can just do it here if you think it’s relevant, or start a new thread if you think it’s tangential

#

Just @ me on the new thread if you do decide to create a new one

charred dagger
# grizzled rapids You can just do here if you think it’s relevant or a new thread if you think it’...

It is a tiny bit tangential, but still a follow-up to my last question. I now have all of my data, have cleaned it up a little, and am about to format it into the JSON file. My goal is to emulate the personality of the podcast host based on the raw podcast data that I gathered. Some people were telling me to just put the raw text into the JSON file, without the 'prompt' 'response' format that people seem to use for other types of data. Is this accurate? I really appreciate the help. I used to program all the time in high school, but ended up getting huge issues in both of my hands, and now ML seems like a fascinating world to explore

#

My goal is to be able to ask a question and get a response in the ballpark of their personality
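For illustration only, here is a rough sketch of what the two layouts mentioned above might look like as JSON Lines records. The `prompt` and `response` keys follow the wording in the question; the `text` key, the chunk size, and the function names are assumptions, and the exact schema depends on whichever training tool is used. This shows the shapes side by side and does not settle which one suits the personality goal.

```python
import json


def raw_records(transcript, max_chars=1000):
    """Group paragraphs into chunks of at most roughly max_chars, one {"text": ...} record each."""
    records, current = [], ""
    for para in (p.strip() for p in transcript.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            records.append({"text": current})  # chunk is full, start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        records.append({"text": current})
    return records


def pair_record(prompt, response):
    """Build one record in the prompt/response layout."""
    return {"prompt": prompt, "response": response}


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"
```

The raw layout keeps the host's speech as plain text chunks, while the paired layout needs a question (or the guest's preceding turn) to go with each of the host's answers.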

charred dagger
lunar bobcat
stray robin
#

Or an LLM