#Hey everyone, in need of urgent help. So I have an electron app and I need to use speech recognition
It works on Edge too, just doesn't work on shell-based Chromium browsers like electron
see the details I provided in my question
lemme paste the content here too
So I have an Electron app that uses the web speech API (SpeechRecognition) to take the user's voice, however, it's not working. The code:
if ("webkitSpeechRecognition" in window) {
  const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = new SpeechRecognition();
  recognition.onstart = () => {
    console.log("We are listening. Try speaking into the microphone.");
  };
  recognition.onspeechend = () => {
    recognition.stop();
  };
  recognition.onresult = (event) => {
    const transcript = event.results[0][0].transcript;
    console.log(transcript);
  };
  recognition.start();
} else {
  alert("Browser not supported.");
}
It logs "We are listening..." in the console, but no matter what you say, it gives no output. Running the exact same code in Google Chrome works, and whatever I say is logged by the console.log(transcript); line. I did some more research and it turns out that Google recently stopped supporting the Web Speech API in shell-based Chromium windows (as far as I know, everything that is not Google Chrome or MS Edge), so that seems to be why it is not working in my Electron app.
See: the electron-speech library's end and the Artyom.js issue
So is there any way I can get it to work in Electron?
speech recognition cli and api for node using electron. Latest version: 1.0.7, last published: 6 years ago. Start using electron-speech in your project by running npm i electron-speech. There are no other projects in the npm registry using electron-speech.
Saw this thread in StackOverflow. http://stackoverflow.com/questions/36214413/webkitspeechrecognition-returning-network-error-in-electron I'm using Artyom with Electron - can anyone confirm...
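To see what is actually failing in Electron, it may help to attach error and end handlers to the recognition object from the code above. This is only a minimal sketch: attachDiagnostics is a hypothetical helper name, not part of any library. The error and end events and the event.error code come from the Web Speech API. The linked StackOverflow title reports a "network" error, but that is not confirmed here.

```javascript
// Hypothetical helper: attaches logging handlers to a SpeechRecognition-like object.
// `log` is injectable so the helper can be exercised outside a browser.
function attachDiagnostics(recognition, log = console.log) {
  recognition.onerror = (event) => {
    // event.error is a short error code string from the Web Speech API,
    // for example "network" or "not-allowed".
    log(`Speech recognition error: ${event.error}`);
  };
  recognition.onend = () => {
    // Fires when the recognition service disconnects, with or without a result.
    log("Speech recognition ended.");
  };
  return recognition;
}

// Usage in the renderer process; skipped where the API (or window) is missing.
if (typeof window !== "undefined" && "webkitSpeechRecognition" in window) {
  const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const recognition = attachDiagnostics(new SpeechRecognition());
  recognition.start();
}
```

If the console then shows an error code right after "We are listening...", that at least tells you the request is being rejected rather than the microphone being silent.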
Correct, it used to work without the API key because it was built into all Chromium-based browsers. Now that Google has disabled the Web Speech API in Electron, it doesn't work.
I know, that's why I asked the question. How do I get speech recognition to work then?
it is not chromium based
It is. It works on MS Edge too
and Google Chrome
Edge and Chrome are not Chromium
all Chromium based browsers support it except shell-based ones like Electron
bro... they are. What are you talking about?
They are both Chromium-based web browsers.
Electron is Chromium based too but it's a shell-based Chromium browser which isn't supported for the web speech API anymore.
they use a Google API key
See the links I included above.
It's built into them, running the same speech recognition code in Google Chrome or MS Edge without an API key works, while it doesn't work in Electron.
See the link I included, they make this clear actually :)
View this too:
https://stackoverflow.com/questions/60501410/electron-not-working-with-web-speech-api
It's not an issue, Google made the decision to disable the Web Speech API on shell-based Chromium browsers (Electron included) so that only Google Chrome, MS Edge, and other proper Chromium-based browsers can utilize it. So I am looking for another way to get user's speech in JavaScript Electron.
Wait, Discord is built with Electron too, right? @heady forge
Discord builds Electron from source
i.e. ?
Sorry, it's difficult to understand what you're trying to say.
Are you saying they don't use Electron or they do use Electron?
Discord uses a custom version of Electron
oh ok I see
How do they get the user's speech in voice chat?
Like, you click the VC channel and you can speak through your microphone and it detects your speech.
How does it do that?
analyze audio data
and how do they get the audio data?
what API do they use?
can't be web speech API :/
coz it doesn't work with shell-based browsers as we know
with navigator.mediaDevices.getUserMedia
Ahh okay, they use the media stream API
Do you think it is theoretically possible to code something that:
1: Takes the user's live audio speech with navigator.mediaDevices.getUserMedia
2: Sends it to a backend server
??
Because if we can send it to the backend, then we would be able to use a backend-based service to analyze the speech of that user and return a response.
That's possible, right
?
possible but too expensive
How so?
Jasper is a free backend service, so could we send the audio stream to the Python backend?
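The two steps discussed above (capture with navigator.mediaDevices.getUserMedia, then send to a backend) could be sketched roughly like this. It assumes a WebSocket server that accepts binary audio chunks and replies with text. The URL and the names makeChunkForwarder and streamMicrophone are made up for illustration, and nothing here is specific to Jasper or any other backend. MediaRecorder produces compressed audio in a browser-chosen format, so the backend would have to decode whatever it receives.

```javascript
// Assumption: a hypothetical backend listening here for binary audio chunks.
const BACKEND_URL = "ws://localhost:8765/audio";
const WS_OPEN = 1; // same value as WebSocket.OPEN

// Returns a MediaRecorder "dataavailable" handler that forwards
// non-empty chunks while the socket is open.
function makeChunkForwarder(socket) {
  return (event) => {
    if (event.data.size > 0 && socket.readyState === WS_OPEN) {
      socket.send(event.data);
    }
  };
}

// Step 1: capture the microphone. Step 2: stream it to the backend.
// Resolves to a function that stops recording and closes the connection.
async function streamMicrophone(url = BACKEND_URL) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const socket = new WebSocket(url);
  const recorder = new MediaRecorder(stream);

  recorder.ondataavailable = makeChunkForwarder(socket);
  // Start recording once connected, emitting a chunk roughly every 250 ms.
  socket.onopen = () => recorder.start(250);
  // Assumption: the backend answers with transcript text.
  socket.onmessage = (event) => console.log("Backend reply:", event.data);

  return () => {
    if (recorder.state !== "inactive") recorder.stop();
    stream.getTracks().forEach((track) => track.stop());
    socket.close();
  };
}
```

In the renderer you would call streamMicrophone() after a user action and keep the returned function around to stop the stream later. The cost concern raised above applies to the backend side, since it has to process audio for as long as the connection stays open.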