#speechio + vosk function get_command(1) - I'm probably using it wrong


harsh vigil
#

On an Ubuntu laptop I am running speechio with vosk as the speech recogniser, but get_command(1) is returning an empty string.

I have tested the microphone using aplay and arecord defaults, so I think that is OK.
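
One way to separate an audio problem from a code problem is to check whether a recording actually contains signal. A minimal sketch, assuming a 16-bit PCM WAV such as one recorded with `arecord -f S16_LE test.wav`; the function name `wav_peak_and_rms` and the file name are made up for illustration:

```python
import math
import struct
import wave


def wav_peak_and_rms(path):
    """Return (peak, rms) of a 16-bit PCM WAV file, on a 0..32768 scale."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    count = len(frames) // 2  # two bytes per sample
    if count == 0:
        return 0, 0.0
    samples = struct.unpack(f"<{count}h", frames[: count * 2])
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return peak, rms
```

A peak and RMS close to zero would point at the microphone path (wrong device, muted input, another process holding the device) rather than at the get_command call.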

Should I debug my code or the audio?

The setup for all services and the code are attached.

uneven tiger
#

At the moment, we’re having issues with using vosk as the “listen_provider” in the speech module. The SDK team is working on a fix with how modules communicate with each other. Apologies for the inconvenience. 🙏

#

I will add a note to the speech module README until we have that issue resolved.

outer bough
#

@harsh vigil in the meantime you can use the vosk and piper modules independently, also with the local-llm module if you want. The speechio module is basically a module that binds all of this together, but you could do that in your user code.
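
Binding the pieces together in user code could look roughly like the sketch below. It assumes the two speech services expose async `to_text(speech=..., format=...)` and `to_speech(text=...)` methods, which are the calls that appear later in this thread. `transcribe_and_reply`, `make_reply`, and the two stub classes are hypothetical stand-ins so the example runs without a robot.

```python
import asyncio


async def transcribe_and_reply(stt, tts, audio_bytes, make_reply):
    """Speech-to-text, build a reply string, then text-to-speech."""
    heard = await stt.to_text(speech=audio_bytes, format="wav")
    if not heard:
        return heard, None  # nothing recognised, so nothing to say
    reply_audio = await tts.to_speech(text=make_reply(heard))
    return heard, reply_audio


class FakeStt:
    """Hypothetical stand-in for the vosk speech service."""

    async def to_text(self, speech, format):
        return "hello robot" if speech else ""


class FakeTts:
    """Hypothetical stand-in for the piper speech service."""

    async def to_speech(self, text):
        return text.encode("utf-8")


def demo():
    return asyncio.run(
        transcribe_and_reply(
            FakeStt(), FakeTts(), b"\x00\x01", lambda t: f"You said: {t}"
        )
    )
```

With real services, `stt` and `tts` would be the objects returned by `SpeechService.from_robot(...)`, and `make_reply` could call out to the local-llm module instead of formatting a string.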

harsh vigil
# outer bough @harsh vigil in the meantime you can use the vosk and piper modules ind...

I found this https://cloud.google.com/text-to-speech/docs/authentication and so set up Google Cloud, but I am getting an unusual log message:

2024-04-05T15:38:41.213Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:38:18.832Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:38:13.854Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:38:06.587Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:38:04.037Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:37:58.930Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:37:43.154Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:37:24.396Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  
2024-04-05T15:37:18.928Z   warn robot_server.speechio     logger named %q sent a log over gRPC with an invalid level %qspeechioWARNING  

Config is attached:

uneven tiger
#

For any Linux single board computer or MacOS computer 🙂

harsh vigil
#

PS: lesson 1 of debugging on a Linux laptop: don't be logged into the laptop, or it will hog the mic and speaker.

harsh vigil
#

Ahhh - I think vosk might work remotely if I use audio input locally and send the audio. Short on time, but plenty of things to do 😄
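
Sending locally captured audio to a remote recogniser means packaging it as bytes first. A minimal sketch that wraps raw 16-bit mono PCM in a WAV container in memory, on the assumption that the remote `to_text` call accepts WAV bytes with `format="wav"` as in the calls shown in this thread; `pcm_to_wav_bytes` and the 16 kHz default are illustrative choices:

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw little-endian PCM samples in a WAV container, in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    # wave leaves a caller-supplied buffer open, so the bytes can be read back
    return buffer.getvalue()
```

The result could then be passed as the `speech` argument, for example `await speech_vosk.to_text(speech=pcm_to_wav_bytes(pcm), format="wav")`.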

harsh vigil
#

OK, splitting things up and adding a remote from the Raspberry Pi 4 to the laptop running vosk generates a "not implemented" error:

      File "/home/blm/python/viam_sdk/remote_check.py", line 68, in <module>
        asyncio.run(main())
      File "/home/blm/python/viam_sdk/remote_check.py", line 53, in main
        speech_text = await speech_vosk.to_text(speech=audio_bytes, format="wav")
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/blm/viam_env/lib/python3.11/site-packages/speech_service_api/api.py", line 174, in to_text
        response: ToTextResponse = await self.client.ToText(request)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

harsh vigil
#

I am getting the same "not implemented" error on piper:

      File "/home/blm/python/viam_sdk/remote_check.py", line 58, in main
        audio_to_output = await speech_piper.to_speech(text="Implmenting your command now")
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/home/blm/viam_env/lib/python3.11/site-packages/speech_service_api/api.py", line 179, in to_speech
        response: ToSpeechResponse = await self.client.ToSpeech(request)

with this code:

    speech_piper = SpeechService.from_robot(robot, name="blah-blah-blah:piper")
    audio_to_output = await speech_piper.to_speech(text="Implmenting your command now")
    print(f"Audio back received, length: {len(audio_to_output)}, type: {type(audio_to_output)}")
harsh vigil
# harsh vigil Am getting the same not implemented on piper File "/home/blm/python/viam_sd...

I can run the code locally on the laptop:

    # speech
    speech_vosk = SpeechService.from_robot(robot, name="vosk")
    speech_piper = SpeechService.from_robot(robot, name="piper")
    

    # Quick test speech
    print("Testing speech output")
    piper_response = await speech_piper.to_speech("Hello, How can I help you?")
    print(f"Piper response:- length: {len(piper_response)} type: {type(piper_response)} details: {piper_response}")

    speech_text = await speech_vosk.to_text(speech=piper_response, format="wav")
    print(f"Vosk response:- length: {len(speech_text)} type: {type(speech_text)} details: {speech_text}")

But the same code fails when run against the remote:

    # Setup vosk and piper for speech
    speech_vosk = SpeechService.from_robot(robot, name="viamubuntulaptop2-main:vosk")
    speech_piper = SpeechService.from_robot(robot, name="viamubuntulaptop2-main:piper")
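
Before calling a remote service it can help to confirm the exact name the main machine exposes for it. A minimal sketch, assuming the connected client offers a list of resource-name objects with a `name` attribute (the Viam Python SDK's `robot.resource_names` is the assumed source here) and that remote resources appear as `<remote>:<name>`, matching the names used in this thread; `match_resources` is a hypothetical helper:

```python
from types import SimpleNamespace


def match_resources(resource_names, short_name):
    """Return full names equal to short_name or ending in ':short_name'."""
    matches = []
    for resource in resource_names:
        full = resource.name
        if full == short_name or full.endswith(":" + short_name):
            matches.append(full)
    return sorted(matches)


# Stand-in data for demonstration; with a live connection this would be
# robot.resource_names (an assumption about the SDK, see the text above).
example = [
    SimpleNamespace(name="viamubuntulaptop2-main:vosk"),
    SimpleNamespace(name="viamubuntulaptop2-main:piper"),
    SimpleNamespace(name="rpi_cam3"),
]
```

If `match_resources(robot.resource_names, "vosk")` came back empty, the name passed to `from_robot` would be the first thing to check; if the name is listed and the call still fails, the problem is more likely in how the request is forwarded to the remote.
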
harsh vigil
#

PS: uform works fine as a remote:

rpi_cam3 get_image return value: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1920x1080 at 0x7F8D6A9D10>
uform get_classifications_from_camera return value: [class_name: "A man and a woman are standing close together in a room. The man is wearing glasses and a brown jacket with a zipper. The woman is wearing a white sweater and has a necklace. They both appear to be smiling. Behind them, there is a white board with red dots and a bulletin board with papers."
confidence: 1