#devs_voice-archived | Home Assistant | Page 4

cedar moon Feb 15, 2024, 12:47 PM

#

also it would be possible to create automations by voice with the correct entities, if GPT would know them

#

If there are any projects already started I would love to participate. Otherwise I would try to start some experiments with it

scarlet echo Feb 15, 2024, 12:50 PM

#

shouldn't such an integration expose its own conversation agent and not necessarily rely on the current intents as a dependency? 🤔 however, I see no issue with multiple PRs for new intents if you want to use the existing intents.

cedar moon Feb 15, 2024, 12:55 PM

#

Yeah, that is where I'm not sure about how to start. intents are only used by the home assistant conversation agent?

scarlet echo Feb 15, 2024, 12:56 PM

#

cedar moon Yeah, that is where I'm not sure about how to start. intents are only used by t...

for now, yes, as far as i am aware. i don't know of any other integration that uses them

cedar moon Feb 15, 2024, 12:59 PM

#

ok thx. I will take a look at the current ChatGPT agent and try to integrate functions to it

#

it's all new for me (LangChain and the conversation agents) so it will be very experimental

nimble ferry Feb 16, 2024, 11:21 AM

#

I have a personal project that does this by adding a new conversation agent. I started by copy-pasting the OpenAI one and then added callable functions to it (like control_light, shopping list and calendar). Works pretty well.

#

I decided to branch it off a bit though, so that I run the conversation agent as a small api inside a docker container instead and I will make the conversation agent component in home assistant a thin client that only passes on the conversation to this api. I find it much faster to develop this way, probably because I'm not experienced enough with how to set up a dev environment in home assistant (I end up restarting the dev container a lot).

nimble ferry Feb 16, 2024, 2:26 PM

#

i tried creating a wyoming service using elevenlabs because i thought i would be able to reduce latency by processing one sentence at a time and sending one AudioChunk for each sentence. but it seems like the client waits for AudioStop before starting to play audio (only tried in the config "try voice" in the browser). is there a way around this?
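A minimal Python sketch of the per-sentence idea described above, with no Wyoming or ElevenLabs calls involved. The names `split_sentences`, `stream_sentence_audio` and `fake_tts` are made up for illustration, and `fake_tts` only stands in for a real TTS request.

```python
import re
from typing import Callable, Iterator, List


def split_sentences(text: str) -> List[str]:
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def stream_sentence_audio(
    text: str, synthesize: Callable[[str], bytes]
) -> Iterator[bytes]:
    """Yield one audio payload per sentence instead of one for the whole text."""
    for sentence in split_sentences(text):
        yield synthesize(sentence)


def fake_tts(sentence: str) -> bytes:
    """Stand-in for a real TTS call; it just encodes the text."""
    return sentence.encode("utf-8")


for chunk in stream_sentence_audio("Lights are on. Anything else?", fake_tts):
    print(chunk)
```

Whether this reduces latency depends on the client starting playback before the final stop event, which is exactly the open question in the message above.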

nimble ferry Feb 16, 2024, 5:49 PM

#

Same with the mobile assistant, anyone know if this is the expected behaviour for all clients? E.g. the wyoming-satellite also?

worthy wave Feb 16, 2024, 7:44 PM

#

Hey everyone, I have a very small question 🙂 Could you explain the difference between slots and requires_context? When should I use slots? It would be nice if you could also share an example 🙂 E.g., what is the difference between these two YAML configurations?

- sentences:
    - "close <name> [in <area>]"
  slots:
    domain: "cover"

and

- sentences:
    - "close <name> [in <area>]"
  requires_context:
    domain: cover

I found information about slots only at this link: https://developers.home-assistant.io/docs/voice/intent-recognition/template-sentence-syntax#responses and about requires_context only here: https://developers.home-assistant.io/docs/voice/intent-recognition/template-sentence-syntax#requiresexcludes-context. I also checked the example from the hassil repository https://github.com/home-assistant/hassil/blob/main/examples/en.yaml but I still can't understand the difference.

scarlet echo Feb 16, 2024, 9:14 PM

#

requires_context: {domain: "cover"} means you are only going to match that sentence if the provided {name} (<name> is shorthand for [the] {name}) is a cover

#

slots: {domain: "cover"} means you are setting the value of a slot named domain to cover, regardless of what domain the {name} has, which may have implications in how the intent is handled

worthy wave Feb 16, 2024, 9:22 PM

#

scarlet echo `requires_context: {domain: "cover"}` means you are only going to match that sen...

this part makes 100% sense to me 💪 and that's exactly how I understood it

worthy wave Feb 16, 2024, 9:23 PM

#

scarlet echo `slots: {domain: "cover"}` means you are setting the value of a slot named `doma...

unfortunately, I still don't understand this part: why should I set the domain value via slots?

scarlet echo Feb 16, 2024, 9:29 PM

#

Generally, you shouldn't. But say you have a sentence like "is any door locked". There is no entity name there to reference the lock domain, so you have to let the intent handler know that it should query for entities pertaining to the lock domain

#

https://github.com/home-assistant/intents/blob/54dc139833e778ce6e114bf35e87d17077e1dcf4/sentences/en/lock_HassGetState.yaml#L13 is the example above
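To make the distinction above concrete, here is a toy Python sketch. It is not hassil's code: `match_sentence` and the dictionary layout are assumptions made purely for illustration. The point is that `requires_context` behaves like a filter on whatever `{name}` resolved to, while `slots` simply adds fixed values to the result.

```python
from typing import Optional


def match_sentence(rule: dict, entity: Optional[dict]) -> Optional[dict]:
    """Toy matcher: return the slots for a rule, or None if the rule is rejected.

    `entity` stands in for whatever {name} resolved to, or None when the
    sentence contains no {name} at all.
    """
    context = entity or {}
    # requires_context: every listed key must match the entity's context.
    for key, wanted in rule.get("requires_context", {}).items():
        if context.get(key) != wanted:
            return None
    result = {}
    if entity is not None:
        result["name"] = entity["name"]
    # slots: copied into the result unconditionally.
    result.update(rule.get("slots", {}))
    return result


garage_door = {"name": "garage door", "domain": "cover"}
front_lock = {"name": "front door", "domain": "lock"}

filter_rule = {"requires_context": {"domain": "cover"}}
slot_rule = {"slots": {"domain": "cover"}}

print(match_sentence(filter_rule, garage_door))  # accepted: the entity is a cover
print(match_sentence(filter_rule, front_lock))   # rejected: the entity is a lock
print(match_sentence(slot_rule, front_lock))     # accepted, with domain set to "cover"
print(match_sentence({"slots": {"domain": "lock"}}, None))  # the "is any door locked" case
```

The last call mirrors the "is any door locked" case: with no `{name}` in the sentence, the fixed slot is the only thing telling the handler which domain to query.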

west gulchBOT Feb 16, 2024, 9:43 PM

#

@worthy wave I converted your message into a file since it's above 15 lines :+1:

📎 message.txt

scarlet echo Feb 16, 2024, 9:48 PM

#

That is correct, but i can't remember off the top of my head if you need requires_context if you don't have {name}. I am inclined to say no

worthy wave Feb 16, 2024, 9:49 PM

#

scarlet echo That is correct, but i can't remember off the top of my head if you need `requir...

Now it's much more understandable for me 😀 Thank you very much for your help 💪

worthy wave Feb 16, 2024, 10:11 PM

#

If I may ask one more thing, I'll change the topic a bit. 🙂 I have two M5Atom Echo devices. My experience with using them and wake words is not very good. Is there any way to check how sound is transmitted from the M5Atom to HA, and to watch live how HA tries to recognize wake words? I don't really know where the problem is. I have HA installed on a NUC 13 Pro, which is a powerful machine. Very often the wake word is not recognized.

I also did tests with a Jabra SPEAK 510 MS speaker (via USB), and here I also had a lot of problems getting wake words to work properly. Do you know how to find out where the problem is?

#

I tested the script https://github.com/dscripka/openWakeWord/blob/main/examples/detect_from_microphone.py locally, and in each case every selected wake word was recognized correctly.

west gulchBOT Feb 16, 2024, 10:12 PM

#

@worthy wave I converted your message into a file since it's above 15 lines :+1:

📎 message.yaml

worthy wave Feb 16, 2024, 10:16 PM

#

I watched the entire last presentation (Voice Assistant Contest Launch), where it was suggested to use the debug mode in Pipeline. But unfortunately it doesn't work very well for me.

scarlet echo Feb 16, 2024, 10:17 PM

#

worthy wave If I may ask one more thing, I'll change the topic a bit. 🙂 I have two devices ...

That is a question answered multiple times (and pinned) in #voice-assistants-archived

#

#voice-assistants-archived message

worthy wave Feb 16, 2024, 10:18 PM

#

Thanks, I'm going to look for it

scarlet echo Feb 16, 2024, 10:43 PM

#

scarlet echo https://discord.com/channels/330944238910963714/646814454063038466/1193310018334...

No need, i linked it

west gulchBOT Feb 21, 2024, 3:39 PM

#

@worthy wave I converted your message into a file since it's above 15 lines :+1:

📎 message.txt

worthy wave Feb 21, 2024, 3:41 PM

#

Sorry, I created the message above by mistake 🙂 @scarlet echo I have another question. Is it possible to use the same sentence to support two different domains? E.g., in Poland we can say "Open the door". The meaning of such a sentence is twofold: open the door (e.g. patio doors with an electric motor) and unlock the door (just unlock the door lock). So it covers two different domains: cover.door and lock. Can I do something like that? (Check the example below)

scarlet echo Feb 21, 2024, 4:21 PM

#

worthy wave Sorry, I wrongly create message above 🙂 <@535933407939657745> I have another q...

Yes. It requires a combination of requires_context and excludes_context so as to make the sentences mutually exclusive. I've had the same issue with opening covers in Romanian, where "opening" is a subset of "turning on" (there are a few more words for turning on)

#

In your example you are missing a requires_context: {domain: "lock"} (or cover, respectively). If there are any clashes with words in the generic homeassistant_HassTurnOn or Off, then you need to add a specific excludes_context there

#

Otherwise, the sentence might match the generic homeassistant_HassTurnXx

#

However, the door and the lock in your example need to have different names/aliases, otherwise the first matching sentence's intent will be handled
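A rough Python sketch of the mutual-exclusion idea, under stated assumptions: the intent names (`OpenCoverIntent`, `UnlockIntent`, `GenericTurnOn`) and the rule table are hypothetical and are not taken from the intents repo. Each rule for the shared sentence either requires a domain or excludes the domains handled elsewhere, so exactly one rule accepts a given entity.

```python
from typing import Optional

# Hypothetical rules for one shared sentence such as "open <name>".
RULES = [
    {"intent": "OpenCoverIntent", "requires_context": {"domain": "cover"}},
    {"intent": "UnlockIntent", "requires_context": {"domain": "lock"}},
    # Generic fallback that must not fire for covers or locks.
    {"intent": "GenericTurnOn", "excludes_context": {"domain": ["cover", "lock"]}},
]


def pick_intent(entity: dict) -> Optional[str]:
    """Return the intent of the first rule whose constraints accept the entity."""
    for rule in RULES:
        required = rule.get("requires_context", {})
        excluded = rule.get("excludes_context", {})
        if any(entity.get(key) != value for key, value in required.items()):
            continue
        if any(entity.get(key) in banned for key, banned in excluded.items()):
            continue
        return rule["intent"]
    return None


print(pick_intent({"name": "patio door", "domain": "cover"}))
print(pick_intent({"name": "front door lock", "domain": "lock"}))
print(pick_intent({"name": "coffee machine", "domain": "switch"}))
```

The entities carry different names here for the reason given above: if the door and the lock shared a name, the first matching rule would win regardless of which one was meant.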

hollow hollow Feb 24, 2024, 4:04 AM

#

i just wrote Whisper API wyoming server. https://github.com/ser/wyoming-whisper-api-client

#

using it with local whisper.cpp server instance

#

it's not dramatically faster than faster-whisper, but its advantage is that it uses a much more popular API

broken elbow Feb 24, 2024, 10:01 AM

#

Hey, I know you can't judge a language you don't understand, but if anyone notices some obvious stupidities in this PR concerning something other than the language structure itself, please let me know. https://github.com/home-assistant/intents/pull/1990
I have never done a PR this large, or this fast, so... I'll have to fix some missing tests. I wonder why none of my local commands complained about that?

cunning veldt Feb 24, 2024, 10:29 AM

#

Hi, does anyone know if there is a plan to add GPU support for Whisper and Piper in Home Assistant, so that we can plug in and use a GPU? Thanks

hollow hollow Feb 24, 2024, 10:39 AM

#

cunning veldt Hi, does anyone know if there is a plan to integrate GPU to Whisper and Piper to...

https://github.com/ser/wyoming-whisper-api-client but no docker support atm

worthy wave Feb 24, 2024, 11:53 AM

#

Hi @ivory vessel What is the minimum length/number of voice samples needed to create a new Piper voice? Of course, I am aware that the more the better... 😀 I'm wondering whether 4 hours is enough or whether much more is needed. My second question is whether you could train a new language from such samples, or whether I have to do it myself. My only problem is that I don't have the right equipment for something like this. Of course, the audio will be publicly available 🙂 P.S. I'm talking about the Polish language

hollow hollow Feb 24, 2024, 11:57 AM

#

@worthy wave have you tried xtts2?

worthy wave Feb 24, 2024, 11:58 AM

#

@ivory vessel On a completely different topic, I'm wondering whether it is possible to train the current wake words using your own voice, or is that rather difficult to do? I'm asking because after my recent tests I noticed that the M5 Stack Echo microphone works properly in my configuration, but because my accent is not English, the wake word does not trigger very reliably. I was wondering if I could record several dozen utterances of the word by household members and use them to train the model to work better.

worthy wave Feb 24, 2024, 11:58 AM

#

hollow hollow <@928022708430721127> have you tried xtts2?

What is this?

#

are you talking about https://github.com/BoltzmannEntropy/xtts2-ui?

hollow hollow Feb 24, 2024, 12:00 PM

#

worthy wave <@638799193586139136> A completely different topic I'm wondering is whether it i...

https://github.com/dscripka/openWakeWord#training-new-models

worthy wave Feb 24, 2024, 12:01 PM

#

hollow hollow https://github.com/dscripka/openWakeWord#training-new-models

yes yes.. I saw this, but it works only with english accent

hollow hollow Feb 24, 2024, 12:02 PM

#

worthy wave yes yes.. I saw this, but it works only with english accent

so no, there is nothing else and custom verifying models do not work for home assistant

hollow hollow Feb 24, 2024, 12:03 PM

#

worthy wave you talk about https://github.com/BoltzmannEntropy/xtts2-ui?

https://github.com/coqui-ai/TTS with XTTS2 model https://huggingface.co/coqui/XTTS-v2

#

the quality of this model will blow you away

worthy wave Feb 24, 2024, 12:04 PM

#

that is the reason why I ask whether it is possible to add several dozen additional voice samples to create "my own" wake words from the existing models 🙂

worthy wave Feb 24, 2024, 12:05 PM

#

hollow hollow https://github.com/coqui-ai/TTS with XTTS2 model https://huggingface.co/coqui/XT...

great, do you have any example? 🙂

hollow hollow Feb 24, 2024, 12:05 PM

#

you need to read the openWakeWord GitHub issues and possibly chat with Mr. dscripka

#

i think there are examples on https://huggingface.co/coqui/XTTS-v2

worthy wave Feb 24, 2024, 12:07 PM

#

yep, but without polish language example

hollow hollow Feb 24, 2024, 12:07 PM

#

Polish is just as awesome

#

you won't be able to tell it apart from real speech

#

please note, GPU is a must for that model

#

to have real time inference

scarlet echo Feb 24, 2024, 12:08 PM

#

worthy wave Hi <@638799193586139136> What is the minimum length/number of voice samples to b...

https://github.com/rhasspy/piper-recording-studio

scarlet echo Feb 24, 2024, 12:10 PM

#

worthy wave <@638799193586139136> A completely different topic I'm wondering is whether it i...

https://github.com/rhasspy/hassio-addons/tree/master/snowboy

#

Both topics seem a bit more suitable for #voice-assistants-archived (where they have been discussed previously)

worthy wave Feb 24, 2024, 12:12 PM

#

scarlet echo https://github.com/rhasspy/piper-recording-studio

I have also seen it. I already have samples, 1500 examples (about 2 hours), but I do not know what to do with them 🙂

scarlet echo Feb 24, 2024, 12:13 PM

#

Piper recording studio provides the "right" amount of samples you need to read aloud to train a TTS voice model

hollow hollow Feb 24, 2024, 12:14 PM

#

XTTS2 needs just a few samples, btw

hollow hollow Feb 24, 2024, 2:13 PM

#

OMG, this model is totally crazy super fast

#

https://huggingface.co/distil-whisper/distil-large-v2#whispercpp

broken elbow Feb 24, 2024, 9:36 PM

#

when is the DL to merge translations for 2024.3?

worthy wave Feb 25, 2024, 11:36 PM

#

@scarlet echo I have a small question 🙂 I'm one of the Polish language leaders, but I don't know why I can't merge a PR that I prepared and another person approved. Can you explain what I should do to be able to merge PRs related to the Polish language as well? Sorry if this question isn't for you 🙂

scarlet echo Feb 26, 2024, 7:46 AM

#

worthy wave <@535933407939657745> I have small question 🙂 I'm one of Polish leader, but I d...

i'm not a repo admin, but i see you're not in the language-leaders group https://github.com/orgs/home-assistant/teams/language-leaders?query= so I don't think you have write access to the intents repo. only @ivory vessel can help

#

for example @shell dirge is in that group and he never had issues with merging PRs

severe oyster Feb 26, 2024, 6:28 PM

#

Working on the Slovenian translation of sentences in VA, and one thing I can't figure out (it's on my priority list for now) is how to handle the speech output of numbers. Example: the text output (of a sensor value) is 22.5 °C, which is fine, but when spoken on VA I get "two-two-five C" (in Slovenian). Any advice on how to get the whole number spoken? And the decimals? The spoken output ignores the decimal point... Thanks

scarlet echo Feb 26, 2024, 9:13 PM

#

Does Slovenian use a comma (,) as a decimal separator?

#

If so, take a look at this and the following messages #devs_voice-archived message

scarlet aurora Feb 26, 2024, 11:08 PM

#

doesn't the intents repo use the same sentence parser?
On HA, for "coloca batatas na lista" I get "batatas n" as the item.
Using script.intentfest parse from the intents repo, I get "batatas "

#

https://github.com/home-assistant/hassil/issues/92
looks like that bug, but in this case it only happens on HA 🤔 which is weirder

scarlet echo Feb 27, 2024, 8:25 AM

#

scarlet aurora don't the intents repo use the same setence parser? On HA, for "coloca batatas n...

you have to check which version of hassil you have both in HA and in the intents devcontainer (or whatever you're using)

ivory vessel Feb 27, 2024, 3:19 PM

#

worthy wave Hi <@638799193586139136> What is the minimum length/number of voice samples to b...

I have some Polish sentences here: https://github.com/rhasspy/piper-recording-studio/tree/master/prompts/Polish (Poland)_pl-PL
This is about 2 hours worth of audio, and is definitely enough since I've trained a few voices so far 🙂
If you'd like to use the contribution website to record (https://contribute.rhasspy.org/) let me know at voice@nabucasa.com and I can get you a login code.

ivory vessel Feb 27, 2024, 3:21 PM

#

worthy wave <@638799193586139136> A completely different topic I'm wondering is whether it i...

This is possible with snowboy: https://github.com/rhasspy/wyoming-snowboy
For openWakeWord, I will need to train a large multi-speaker Polish model in order to train Polish wake words. The data exists, I just need to find the time 😄

worthy wave Feb 27, 2024, 3:22 PM

#

ivory vessel I have some Polish sentences here: <https://github.com/rhasspy/piper-recording-s...

I have recordings with text. However, at the moment I do not have the equipment to train new models. I can share these recordings now, then maybe new models can be trained 🙂

#

I plan to create at least one more women's voice

ivory vessel Feb 27, 2024, 3:23 PM

#

worthy wave I have recordings with text. However, at the moment I do not have the equipment ...

Sure, if you're willing to share the audio I can train a new voice 👍

worthy wave Feb 27, 2024, 3:24 PM

#

ivory vessel This is possible with snowboy: <https://github.com/rhasspy/wyoming-snowboy> For ...

Regarding snowboy, unfortunately I had problems with it and ultimately I was unable to train a new model with my voice. I also noticed that it only supports English and Chinese

#

For now, I have focused on a major update of the Polish language for intents 🙂

#

I don't know if there is any option to speed up PR checking.. 😅 because I don't have the ability to merge them

ivory vessel Feb 27, 2024, 3:30 PM

#

worthy wave I don't know if there is any option to speed up PR checking.. 😅 because I don't...

It looks like the language leaders are taking a look at your PR (assuming https://github.com/home-assistant/intents/pull/1996). We still have more than a week before the next HA release, so plenty of time.

worthy wave Feb 27, 2024, 3:32 PM

#

Yes, but unfortunately this is only part of the changes... the rest, related to checking sensor and binary_sensor, is still missing... and I don't want to work on it until this PR is closed (the two are somewhat related) 🙂

cobalt needle Feb 27, 2024, 4:03 PM

#

@ivory vessel If I am currently training a voice and the test output seems to have hit a plateau, can I simply halt training, update the dataset wav and csv with more data, run prep again, then resume from my current checkpoint? Will doing so start training with the additional data, or do I need to start over with a clean set of training folders?

ivory vessel Feb 27, 2024, 4:43 PM

#

cobalt needle <@638799193586139136> If I am currently training a voice and the test output see...

Yes, this should work just fine. I'm guessing this would be the better approach, but an alternative would be to just prepare the new data and resume training on that alone. It would be worth experimenting with if you have the time, and I'd be interested in the results 🙂

cobalt needle Feb 27, 2024, 4:49 PM

#

ivory vessel Yes, this should just fine. I'm guessing this would be the better approach, but ...

Well at the moment I am working with limited data as I am trying to train a voice off of existing samples not samples I am creating..

As it takes hours to see the results, I was just curious whether I had the correct approach

spare forge Feb 27, 2024, 7:35 PM

#

one question about the voice assistant: Do we want to add basic-but-not-strictly-smart-home-related commands like "what time is it?"

#

I find myself missing asking Alexa what time it is, adding timers while I'm cooking, etc...

#

I think adding them would make for a smoother transition for people already using alexa and google speakers

worthy wave Feb 27, 2024, 7:42 PM

#

spare forge I find myself missing asking alexa alexa about what time is it, adding timers wh...

"what time is it" you can add directly from automation 🙂 it is very easy..

spare forge Feb 27, 2024, 7:43 PM

#

@worthy wave you mean with custom sentences?

worthy wave Feb 27, 2024, 7:43 PM

#

no, just automation inside HA

spare forge Feb 27, 2024, 7:44 PM

#

I'm not following I'm afraid

worthy wave Feb 27, 2024, 7:44 PM

#

https://www.home-assistant.io/voice_control/custom_sentences/#to-add-a-custom-sentence-to-trigger-an-automation

spare forge Feb 27, 2024, 7:44 PM

#

so what I said, with custom sentences ^

worthy wave Feb 27, 2024, 7:45 PM

#

this is custom sentences https://www.home-assistant.io/voice_control/custom_sentences/#setting-up-custom-sentences-in-configurationyaml

#

😄

#

these are two different configurations

spare forge Feb 27, 2024, 7:45 PM

#

I see. I saw them as one and the same

#

for sure, one can add any sentence to perform anything they want. My question is if we should ship by default several of the most common ones

#

much like we have sentences for the weather, we could have them for time

#

and similarly to how we have sentences to manage a shopping list, I'd expect sentences to set alarms and timers

#

I'm speaking from a user's perspective that is looking to replace alexa with HA

scarlet echo Feb 27, 2024, 7:55 PM

#

@spare forge go ahead and propose some intents and/or sentences. There's nothing set in stone

cobalt needle Feb 27, 2024, 7:56 PM

#

There is a VA expectations poll in the forums that might give hints as to what people expect
https://community.home-assistant.io/t/poll-what-do-you-use-your-voice-assistant-for-what-do-you-expect-it-to-do-multiple-selections/693669/5

#

Setting timers and alarms is very high on that list

spare forge Feb 27, 2024, 7:58 PM

#

@scarlet echo in the architecture repo?

scarlet echo Feb 27, 2024, 8:00 PM

#

spare forge <@535933407939657745> in the architecture repo?

I think you can just go for it. @ivory vessel ? Thoughts?

spare forge Feb 27, 2024, 8:04 PM

#

I did a couple proposals in the past, but I wasn't sure if the architecture one was the right repo for it or voice had another one

ivory vessel Feb 27, 2024, 8:04 PM

#

@spare forge Can you link those to me so I can collect them into a list? Thanks!

spare forge Feb 27, 2024, 8:05 PM

#

ivory vessel <@315146369402929153> Can you link those to me so I can collect them into a list...

I'm not aware of a comprehensive list of basic tasks that alexa knows how to handle, but I'll try to search one or compile one myself

#

some of them, like timers, might require creating new services, others like asking for the time seem rather simple

scarlet echo Feb 27, 2024, 8:07 PM

#

All of them require new intents

cobalt needle Feb 27, 2024, 8:12 PM

#

spare forge some of them, like timers, might require creating new services, others like aski...

Here is one for Google https://support.google.com/assistant/answer/7172842?hl=en

severe oyster Feb 27, 2024, 8:12 PM

#

Thanks @scarlet echo, it's the same in Slovenian. A value with (,) is spoken properly by TTS, a value with (.) is not. But I could only check that with the Try voice button in the VA settings, since the jinja2 replace filter doesn't change the sensor value properly. The sensor state in Dev tools is shown with . (see pic1 https://ibb.co/X5XG1dx ), but when I open it, it's localized with , (see pic2 https://ibb.co/ygfF7XG ) 🤨 It's slowly driving me mad... In the template editor in the developer tools I get the sensor state with a dot (.) (see pic3 https://ibb.co/zVsJyBN), even though I have the number format configured in my personal settings as 1.234.567,89 (see pic4 https://ibb.co/4WSzHhv ). If I use the replace filter in a template I get this 🙃 https://ibb.co/WHcQB95
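The separator swap itself is simple to show in plain Python. This is only a sketch of the idea: `speakable_number` is a made-up helper, not a Home Assistant or Jinja2 function, and it assumes (as reported above) that the TTS voice reads a comma-separated value correctly.

```python
def speakable_number(value: float, decimals: int = 1, separator: str = ",") -> str:
    """Format a number with a locale-style decimal separator before sending it to TTS."""
    text = f"{value:.{decimals}f}"
    return text.replace(".", separator)


print(speakable_number(22.5))     # 22,5
print(speakable_number(22.0, 0))  # 22
```

Formatting from a numeric value, instead of replacing characters in an already localized string, avoids ambiguity with thousands separators such as the one in 1.234.567,89.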

spare forge Feb 27, 2024, 8:12 PM

#

scarlet echo All of them require new intents

new intents for sure, but some might require even new services that don't exist

ivory vessel Feb 27, 2024, 8:23 PM

#

Timers are going to require HA to be able to initiate a TTS response on the satellite, which it currently can't do. But I think this will be pretty straightforward.

#

Well, really just an event when the timer elapses. Not necessarily TTS.

cobalt needle Feb 27, 2024, 8:45 PM

#

A timer is just a future time stamped event with a name and destination media device really

#

some assistants allow you to set named timers
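That description maps fairly directly onto a small data structure. The sketch below is a standalone illustration, not Home Assistant code; `NamedTimer`, `TimerRegistry` and the `target` field are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass
class NamedTimer:
    name: str           # e.g. "pasta"
    fires_at: datetime  # the future timestamp
    target: str         # hypothetical id of the device that should be notified


class TimerRegistry:
    """Keeps named timers and reports the ones that have elapsed."""

    def __init__(self) -> None:
        self._timers: Dict[str, NamedTimer] = {}

    def start(self, name: str, duration: timedelta, target: str, now: datetime) -> NamedTimer:
        timer = NamedTimer(name, now + duration, target)
        self._timers[name] = timer  # reusing a name replaces the older timer
        return timer

    def pop_elapsed(self, now: datetime) -> List[NamedTimer]:
        """Remove and return every timer whose timestamp has passed."""
        elapsed = [t for t in self._timers.values() if t.fires_at <= now]
        for timer in elapsed:
            del self._timers[timer.name]
        return sorted(elapsed, key=lambda t: t.fires_at)


start = datetime(2024, 2, 27, 20, 45)
registry = TimerRegistry()
registry.start("pasta", timedelta(minutes=10), "kitchen_satellite", start)
registry.start("cake", timedelta(minutes=40), "kitchen_satellite", start)
print([t.name for t in registry.pop_elapsed(start + timedelta(minutes=15))])
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would more likely schedule a callback or fire an event when each timer elapses.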

spare forge Feb 27, 2024, 8:57 PM

#

named timers are critical for people who cook IMO. I set timers to know when something i'm boiling is done, while I'm baking something else, while there's another one for the max screen time of my daughter

#

I 100% need names on my timers

ivory vessel Feb 27, 2024, 9:20 PM

#

Named timers will definitely be supported 👍

scarlet aurora Feb 27, 2024, 9:48 PM

#

scarlet echo you have to look what version of hassil you have both in HA and in the intents d...

I am using HA 2024.2 (the problem exists for 2 or 3 releases already), and checked out intents tag 2024.2.2, so I think they are on the same version

#

well, even HA on dev has the same issue, so it is not a version mismatch...

scarlet aurora Feb 27, 2024, 11:53 PM

#

found it! its because HA uses recognize_all while the intents parse script uses recognize

#

and recognize_all returns two results: the correct one and the one with the "n"
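A toy illustration of why "first result" and "all results" can disagree. This is not hassil's matcher: the template shape `coloca {item} [n]a lista` is a guess at what the Portuguese sentence might look like, and both function names are made up. With loose whitespace handling around a wildcard, the same input can be split in more than one valid way.

```python
from typing import List, Optional


def all_item_candidates(text: str) -> List[str]:
    """Enumerate every wildcard split for the assumed 'coloca {item} [n]a lista'."""
    prefix = "coloca "
    if not text.startswith(prefix):
        return []
    rest = text[len(prefix):]
    candidates = []
    for i in range(1, len(rest)):
        item, tail = rest[:i], rest[i:]
        # The tail must be the fixed ending, with or without the optional "n".
        if tail.strip() in ("na lista", "a lista"):
            candidates.append(item)
    return candidates


def first_item_candidate(text: str) -> Optional[str]:
    """Mimic a 'first match only' lookup."""
    candidates = all_item_candidates(text)
    return candidates[0] if candidates else None


sentence = "coloca batatas na lista"
print(repr(first_item_candidate(sentence)))
print(all_item_candidates(sentence))
```

A caller that consumes every result needs its own policy for choosing among them, for example preferring the candidate with the shortest stripped text; that policy is an assumption here, not something the thread confirms.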

spare forge Feb 28, 2024, 1:59 PM

#

@ivory vessel I opened two discussions in the architecture repo. One for adding support for basic sentences like asking the time, setting alarms, etc..., and another one to add a "Brief mode" similar to the one alexa has, which makes it prefer shorter responses over verbose ones

scarlet echo Feb 28, 2024, 9:51 PM

#

Can one set up a Wyoming server or something and capture all recorded audio from an ESPHome satellite without trying to use it for HA? I.e. without running it through a pipeline?

reef anchor Feb 29, 2024, 8:36 PM

#

@ivory vessel Is this intent release still just a beta release? Will there be another release before the official launch? I wanted to perfect the cover and valve parts for area management today, but you were super fast this month. 🙂

ivory vessel Feb 29, 2024, 11:49 PM

#

spare forge <@638799193586139136> I opened two discussions in the architecture repo. One for...

Thanks! I'll take a look.

ivory vessel Feb 29, 2024, 11:50 PM

#

reef anchor <@638799193586139136> Is this intent release still just a beta release? Will the...

Yeah, I'll do another bump on release day so there's still plenty of time 🙂

cobalt needle Mar 1, 2024, 4:59 AM

#

ivory vessel Yes, this should just fine. I'm guessing this would be the better approach, but ...

As a follow-up: since you need a fairly significant minimum amount of data for the prep to complete without error, creating a completely new dataset wasn't practical. Instead I have added some new incremental data targeted at problem words and removed some redundant data from the main set to try and PULL it to a better spot... It is still slurring some words, however. I might abandon the current checkpoints and go back to starting from a good one with a revised dataset, but at the moment my loss gen is going down, so I am going to give it some time to bake

marsh roost Mar 1, 2024, 3:53 PM

#

Not sure if I should be on the ESPHome Discord or here... I'm trying to set up the S3-BOX-3 and the wake word works... but then when I try to interact with my voice pipeline/assistant, the last log entry I get is:

[D][esp_adf.microphone:273]: Microphone started
[D][voice_assistant:414]: State changed from STARTING_MICROPHONE to STREAMING_MICROPHONE

... I'm not sure if anything is making it to whisper or not

cobalt needle Mar 2, 2024, 7:41 PM

#

Hmm even after re-training my voice model seems to be slurring words as compared to the source data, I wonder if it is a factor of having too small of a dataset or if the training prep stage is attributing the wrong phonetic breakdown of the source for some reason.

floral path Mar 4, 2024, 9:46 AM

#

Do I get it right, that when I want to limit the intent to a specific domain, e.g.:

requires_context:
  domain:
    - cover

This only works if the sentence contains the entity {name}? So there is no way to say I want to control entities of a specific domain in an {area}?

#

And if you do not mind a second question: how do I address devices in the same area as the assistant device? I have found 2 ways:

requires_context:
  area:
    slot: true

#

and

slots:
  name: "all"

Are they equal? Which one is right? How do they work? I could not find it in the doc.

scarlet echo Mar 4, 2024, 11:08 AM

#

floral path Do I get it right, that when I want to limit the intent to a specific domain, e...

It's probably better to explain what you want to achieve. Generically speaking, requires_context: {domain: ['cover']} applies to entities referenced by {name}, indeed

#

If you want to target devices which are in a specific area, you need to ~~requires_context: {area: 'bedroom'}~~ slots: {area: 'bedroom'} for example

scarlet echo Mar 4, 2024, 11:11 AM

#

floral path And if you do not mind a second question. How to address devices in the same are...

what this does is that it creates a condition to filter only entities which have an area assigned, but slot: true means that the area of the satellite is promoted to a slot and used for filtering entities from the same area that the satellite is in (which is provided automatically in the context)

scarlet echo Mar 4, 2024, 11:12 AM

#

floral path and ``` slots: name: "all" ``` Are they equal? Which one is right? Ho...

this one does next to nothing. it's used for automatically upgrading the service call to name: "all", but it would do the same if name was None. for example in requests like turn on the lights in the kitchen

#

either way, it has nothing to do with area

floral path Mar 4, 2024, 11:14 AM

#

Thanks

scarlet echo Mar 4, 2024, 11:16 AM

#

scarlet echo If you want to target devices which are in a specific area, you need to ~~`requi...

i think i made a mistake here and I corrected it

floral path Mar 4, 2024, 11:16 AM

#

scarlet echo It's probably better to explain what you want to achieve. Generically speaking, ...

if i want to ask for the temperature in the bedroom, I'd like to ask what is the temperature in the {area} bedroom. But I think I have to ask what is the temperature on the bedroom temperature sensor to get the entity name in it

#

(and in this case this is going to crash into the climate domain, but that's another thing)

scarlet echo Mar 4, 2024, 11:17 AM

#

are we talking about built-in sentences?

floral path Mar 4, 2024, 11:18 AM

#

Yes, have been contributing the sentences in cs. It works, but in these two areas I do not fully understand how it works.

scarlet echo Mar 4, 2024, 11:19 AM

#

understood. there was a voluntary choice not to respond to what's the temperature in the bedroom with temperature sensors, but only with climate current_temperature

#

a sensor with device_class: temperature could just as well be a 3D printing nozzle temp sensor, a fridge temperature sensor etc. whereas climate entities are pretty straightforward

#

that said, you can still ask (in English, at least) what is the <device_class> of <sensor name> and some variations, but you have to name the sensor (or its aliases)

floral path Mar 4, 2024, 11:21 AM

#

understood. probably works for thermostats. if you use TRVs they usually do not measure temperature, but I guess it is what it is

scarlet echo Mar 4, 2024, 11:22 AM

#

some TRVs expose climate entities. what exact entities do you have?

#

just the TRV switch?

floral path Mar 4, 2024, 11:23 AM

#

No, Z-Wave or Zigbee TRVs. And they show up as climate entities. Anyway, I can always have a custom automation to inject the current temperature from the room sensor if I am desperate 🙂

scarlet echo Mar 4, 2024, 11:24 AM

#

floral path No, zwave or zigbee TRVs. And they show as climate entities. Anyway, I can alway...

so what's the problem, then?

#

i mean if they expose climate entities

floral path Mar 4, 2024, 11:25 AM

#

got sidetracked, the question was about filtering by device class when referring to an area, not the device name. And about using the same area as the satellite

scarlet echo Mar 4, 2024, 11:26 AM

#

specifically for temperatures, you don't need to filter anything, as the HassClimateGetTemperature intent only applies to the climate domain, for which there are no device_classes. you can't query sensors and I've briefly explained why

scarlet echo Mar 4, 2024, 11:27 AM

#

floral path And if you do not mind a second question. How to address devices in the same are...

for querying/issuing commands in the same area as the satellite, this is how to do it

#

that piece of YAML makes sure that area was included in the context (coming from the area assigned to the satellite) and it promotes the area to a slot, which is then used for filtering entities
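
A rough Python sketch of the behaviour described above. This is not hassil's actual code: the function names and the dictionary shapes are assumptions made purely for illustration. It rejects a sentence when a required context key is missing, and copies the area from the context into the slots when it is present, so the slot can then be used to filter entities.

```python
from typing import Optional


def promote_context_to_slots(
    context: dict[str, str],
    slots: dict[str, str],
    required_keys: list[str],
) -> Optional[dict[str, str]]:
    """Return slots extended with the required context keys, or None if one is missing.

    Hypothetical helper modelling "require this context key and promote it to a slot".
    """
    merged = dict(slots)
    for key in required_keys:
        if key not in context:
            # E.g. the satellite has no area assigned, so the sentence does not match.
            return None
        # Promote the context value to a slot; an explicitly set slot value wins.
        merged.setdefault(key, context[key])
    return merged


def filter_entities_by_area(
    entities: list[dict[str, str]], slots: dict[str, str]
) -> list[str]:
    """Keep only the entity IDs whose area equals the area slot (if one is set)."""
    area = slots.get("area")
    return [e["entity_id"] for e in entities if area is None or e["area"] == area]


# Made-up entities; the satellite's area arrives through the context.
entities = [
    {"entity_id": "light.bedroom_lamp", "area": "bedroom"},
    {"entity_id": "light.kitchen_spots", "area": "kitchen"},
]
slots = promote_context_to_slots({"area": "bedroom"}, {"domain": "light"}, ["area"])
if slots is not None:
    print(filter_entities_by_area(entities, slots))
```

Without an area in the context the helper returns `None`, which stands in for "this sentence block does not match at all".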

#

if you're good with Python, i strongly recommend going through the hassil code and the default_agent to really understand how intent recognition and handling works

ivory vessel Mar 4, 2024, 6:07 PM

#

cobalt needle Hmm even after re-training my voice model seems to be slurring words as compared...

Can you provide an audio example? Also, please remind me of the voice model's language.

cobalt needle Mar 4, 2024, 7:16 PM

#

ivory vessel Can you provide an audio example? Also, please remind me of the voice model's la...

What is a good way / your preferred way of sharing audio files? I am training an English model from existing samples and synthetic ones from another model.

cobalt needle Mar 4, 2024, 8:19 PM

#

@ivory vessel sent you some samples as a direct message

floral path Mar 4, 2024, 8:30 PM

#

scarlet echo for querying/issueing commands in the same area as the satellite, this is how to...

Can you please check if I got it right? I believe this is how it is supposed to be for the lights (it had the name: all formula, I changed it to what you explained, if I got it right). Plus I added something similar for covers.

scarlet echo Mar 5, 2024, 8:10 AM

#

floral path Can you please check if I get it right? I believe [this](https://github.com/home...

yup, that looks ok to me

floral path Mar 5, 2024, 8:15 AM

#

scarlet echo yup, that looks ok to me

Thanks

reef anchor Mar 5, 2024, 12:45 PM

#

ivory vessel Yeah, I'll do another bump on release day so there's still plenty of time 🙂

Well, I managed to include everything 🙂 even the weather status display is now working in Hungarian. Thank you for waiting this long.

severe oyster Mar 6, 2024, 11:02 AM

#

Do we have any intent for stopping opening/closing cover? Example: when the sentence is called: 'open the blinds' the HA starts opening the blinds which takes some time...if we want to stop in the middle or in some desired position - is there any intent already? Or did I miss something? Thanks!

scarlet echo Mar 6, 2024, 11:03 AM

#

severe oyster Do we have any intent for stopping opening/closing cover? Example: when the sent...

we don't currently

severe oyster Mar 6, 2024, 11:06 AM

#

Thanks @scarlet echo, probably the only way is with automation and custom sentence as a trigger?

scarlet echo Mar 6, 2024, 11:07 AM

#

i guess, yes. but we could add the sentence(s) and intent. could you open a ticket please?

floral path Mar 6, 2024, 11:37 AM

#

Speaking of covers, the documentation says that HassOpenCover and HassCloseCover are deprecated, and we shall use HassTurnOn/Off. Is that the goal? I haven't seen any language that has this implemented (at least EN does not have it). How does the Stop fit in?

severe oyster Mar 6, 2024, 11:40 AM

#

scarlet echo i guess, yes. but we could add the sentence(s) and intent. could you open a tick...

Opened in https://github.com/home-assistant/intents/issues/2047

severe oyster Mar 6, 2024, 11:42 AM

#

floral path Speaking of covers, the [documentation](https://developers.home-assistant.io/doc...

Isn't it cover_HassTurnOn and cover_HassTurnOff we are talking about? In intents I mean

floral path Mar 6, 2024, 11:43 AM

#

severe oyster Opened in https://github.com/home-assistant/intents/issues/2047

Easy solution for your last comment - take a sharpie and draw a scale on the wall next to the blind. I am sure the partner will understand 😄

scarlet echo Mar 6, 2024, 11:44 AM

#

floral path Easy solution for your last comment - take a sharpie and draw a scale on the wal...

that's IF the cover supports HassSetPosition. If it doesn't, you're out of luck

scarlet echo Mar 6, 2024, 11:46 AM

#

floral path Speaking of covers, the [documentation](https://developers.home-assistant.io/doc...

what do you mean it's not implemented? there are no HassOpenCover and HassCloseCover being pushed forward, as only HassTurnOn and HassTurnOff are used in Assist, in all languages https://github.com/home-assistant/intents/blob/main/sentences/en/cover_HassTurnOn.yaml

floral path Mar 6, 2024, 11:50 AM

#

I mean that if I look at the HassTurnOn intents:

      - sentences:
          - "<turn> on (<area> <name>|<name> [in <area>])"
          - "[<turn>] (<area> <name>|<name> [in <area>]) [to] on"
          - "activate (<area> <name>|<name> [in <area>])"

It reacts to turn: "(turn|switch|change)" or activate. So the words open or raise that are used in HassOpenCover are not implemented.

severe oyster Mar 6, 2024, 11:50 AM

#

floral path Easy solution for your last comment - take a sharpie and draw a scale on the wal...

🙂 The desired position is relative...for instance relative to the shade from sun, and it's 'different' every day (even not noticeably) 😀

scarlet echo Mar 6, 2024, 11:51 AM

#

floral path I mean that if I look at the HassTurnOn intents: ``` - sentences: ...

https://github.com/home-assistant/intents/blob/c761e797446a1d17d02849417fe101a999832ef5/sentences/en/homeassistant_HassTurnOn.yaml#L12

floral path Mar 6, 2024, 11:52 AM

#

Yes, cover is in the domain. So you can turn or switch it on, or activate it (whatever it means)

scarlet echo Mar 6, 2024, 11:53 AM

#

no, it's listed in the excludes_context, which specifically does not match these sentences with entities from the cover domain
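
To illustrate the difference between a domain being required and being excluded, here is a small Python sketch. It is a simplified model, not the real matcher: `sentence_applies` and the two example blocks are hypothetical, and the exclusion list contains only `cover` for brevity.

```python
def sentence_applies(
    entity_context: dict[str, str],
    requires_context: dict[str, list[str]],
    excludes_context: dict[str, list[str]],
) -> bool:
    """Decide whether a sentence block may match an entity with the given context."""
    for key, allowed in requires_context.items():
        # Every required key must be present with one of the allowed values.
        if entity_context.get(key) not in allowed:
            return False
    for key, excluded in excludes_context.items():
        # Any excluded value rules the entity out.
        if entity_context.get(key) in excluded:
            return False
    return True


# Hypothetical stand-ins for a generic "turn on" block and a cover-specific one.
GENERIC_TURN_ON = {"requires": {}, "excludes": {"domain": ["cover"]}}
COVER_TURN_ON = {"requires": {"domain": ["cover"]}, "excludes": {}}

cover = {"domain": "cover"}
switch = {"domain": "switch"}

print(sentence_applies(cover, GENERIC_TURN_ON["requires"], GENERIC_TURN_ON["excludes"]))
print(sentence_applies(switch, GENERIC_TURN_ON["requires"], GENERIC_TURN_ON["excludes"]))
print(sentence_applies(cover, COVER_TURN_ON["requires"], COVER_TURN_ON["excludes"]))
```

In this model the generic block rejects the cover and accepts the switch, while the cover-specific block is the one that accepts the cover.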

severe oyster Mar 6, 2024, 11:53 AM

#

floral path Yes, cover is in the domain. So you can turn or switch it on, or activate it (wh...

I can't agree. A cover is open or closed, not on or off like a switch

floral path Mar 6, 2024, 11:53 AM

#

scarlet echo no, it's listed in the `excludes_context`, which specifically does not match the...

Sorry, Mea Culpa

scarlet echo Mar 6, 2024, 11:54 AM

#

what does match is what i've linked 5 messages before: #devs_voice-archived message

floral path Mar 6, 2024, 11:57 AM

#

Aaa, I was confused. Sorry for wasting the time/space here

severe oyster Mar 6, 2024, 6:57 PM

#

@scarlet echo just checking your PR https://github.com/home-assistant/intents/pull/2045 because I had some trouble with the valve intent (I had to differentiate between set position and open valve, so I had to use a synonym in sl). Is this the reason homeassistant_HassSetPosition.yaml was deleted? Thanks!

scarlet echo Mar 6, 2024, 7:09 PM

#

severe oyster @scarlet echo just checking your PR https://github.com/home-assistant/in...

I have split the homeassistant_HassSetPosition into domains and nothing else supports the intent other than cover and valve, so yes, i deleted the homeassistant domain

severe oyster Mar 6, 2024, 8:11 PM

#

Some help please: the response for sensor_HassGetState, which has the form {{ slots.name | capitalize }} je {{ state.state_with_unit }}, reads clumsily in sl (Slovene).
How can <class> (from the expansion rules) be added in front of slots.name, so that the response is more human-friendly? E.g. for a duration sensor: "**Trajanje** trenutnega programa pomivalnega stroja je 64 min". I need the bolded (**) word, which comes from the <class> expansion rules. If I put slots.device_class in front, I get the device class untranslated, e.g. duration instead of trajanje in Slovene. 🙏

dire oar Mar 6, 2024, 10:43 PM

#

I'm interested in designing open source Voice Assistant hardware and an elegant case compatible with Echo Dot V3 accessories. What component-level hardware would be best for the community to get far-field audio capture with 3+ mics, and be able to exclude its own audio? I'm currently working on the project over here https://community.home-assistant.io/t/far-field-satellite-with-an-elegant-3d-printed-enclosures/699893

#

I don't feel like there are any ideal off-the-shelf modules that can quite compete with Amazon or Google, especially when it comes to the mic arrays for directional/far-field capture. Has the community found a good voice processing unit that works with an ESP32?

scarlet echo Mar 7, 2024, 7:08 AM

#

dire oar I don't feel like there are any ideal off the shelf modules that can quite compe...

the Onju Voice is pretty decent from a few meters away https://github.com/justLV/onju-voice

dire oar Mar 7, 2024, 7:44 AM

#

It's a nice project, though it still lacks a dedicated VPU, and I would guess it has issues with detecting voice during barge-in, or with echo cancellation. Wouldn't this type of chip make the audio pickup easier on the ESP32? https://www.microsemi.com/document-portal/doc_download/136798-zl38063-datasheet

#

I suppose the question is, what hardware would make the developers' life easier when making an ideal smart speaker? To start with, focusing on a mic array that can work in noisy environments from a distance

broken elbow Mar 7, 2024, 8:42 AM

#

Hey, for some reason whenever i say "Säädä", Assist hears "Saada" (which is also a Finnish word). Should I fix this in intents, or open an issue elsewhere? Translation is roughly "Modify". "Saada" isn't something that I can immediately think for any voice commands.

scarlet echo Mar 7, 2024, 9:42 AM

#

broken elbow Hey, for some reason whenever i say "Säädä", Assist hears "Saada" (which is also...

That's an issue with the STT engine. What are you using?
Mike H has a workaround for similar sounding words which would help in exactly these situations, but it's not ready for prime time yet (mostly due to missing text-to-phoneme libraries with a usable license)
To answer your question, adding nonsensical words to the sentences just because that's what the STT hears is a band-aid on a broken bone and i would advise against it

broken elbow Mar 7, 2024, 9:44 AM

#

scarlet echo That's an issue with the STT engine. What are you using? Mike H has a workaround...

This is what I thought as well that it's just a "bandaid". I'm using HA Cloud for STT

scarlet echo Mar 7, 2024, 9:48 AM

#

broken elbow This is what I thought as well that it's just a "bandaid". I'm using HA Cloud fo...

"I'm using HA Cloud for STT"
Ouch! Take a look at this discussion for details on the other thing i mentioned https://github.com/home-assistant/hassil/pull/80

quasi blade Mar 7, 2024, 9:49 AM

#

Hello
Is there any way to set up Assist so that, when it doesn't understand a request, it sends the message to the GPT API? That way we could use both the control of Assist and the power of AI at the same time

copper yacht Mar 7, 2024, 11:15 AM

#

There is no built in way as of right now, only with a custom integration.

dense sphinx Mar 7, 2024, 1:37 PM

#

@scarlet echo I was excited to see my media_player intents in the release today! I totally appreciate dev time is precious but was wondering if any of the other service calls, particularly media_previous_track (as we now have next track) are on the roadmap? I have the custom intents I am using for all the other service calls ready to go!

scarlet echo Mar 7, 2024, 1:39 PM

#

dense sphinx @scarlet echo I was excited to see my `media_player` intents in the rele...

Thank you for those contributions, btw! I have no clue about the roadmap. But I encourage you to add those new intents whenever you want (just mark them as unsupported). If you need any help, I'm here

#

for example, I've just opened a PR for the implementation of a HassClimateSetTemperature intent. No idea if it was on the roadmap, but I've heard many times that the roadmap was largely influenced by community contributions

dense sphinx Mar 7, 2024, 1:42 PM

#

scarlet echo for example, I've just opened a PR for the implementation of a `HassClimateSetTe...

Great I can do that. What do you have to do to mark them as unsupported?

scarlet echo Mar 7, 2024, 1:44 PM

#

dense sphinx Great I can do that. What do you have to do to mark them as unsupported?

https://github.com/home-assistant/intents/blob/d0b5b7b44a0d404850c564f42c543951b2cd60a5/intents.yaml#L179

dense sphinx Mar 7, 2024, 1:47 PM

#

scarlet echo https://github.com/home-assistant/intents/blob/d0b5b7b44a0d404850c564f42c543951b...

OK so I will need to amend that file, add the sentences file and the tests file. Will try and get to it soon!

scarlet echo Mar 7, 2024, 1:52 PM

#

dense sphinx OK so I will need to amend that file, add the sentences file and the tests file....

...just for English and then make sure all linters and tests pass

dense sphinx Mar 7, 2024, 1:56 PM

#

scarlet echo ...just for English and then make sure all linters and tests pass

Last question. I saw the discussion above about homophones. In one of my intents I have "Clear <media_player> (queue|cue|Q|cube)". Is this kosher? I can foresee frowns about cube (must be my poor pronunciation, but it saves me a lot of "didn't understand"s!) but the other three are homophones, so should I list at least those three?

scarlet echo Mar 7, 2024, 1:57 PM

#

you should not list them, especially for a first iteration of a new intent. however, if they help you, i'd totally suggest having that particular custom sentence on your system, tied to the same intent

hollow silo Mar 7, 2024, 3:04 PM

#

@ivory vessel @noble copper were the last intent updates included in yesterday's release? The new Dutch intents are not working (e.g. volume of media players and vacuum start)

#

They don't work in 2024.3, but they do work on the 2024.4 nightly

ivory vessel Mar 7, 2024, 3:05 PM

#

hollow silo @ivory vessel @noble copper were the last intents updates includ...

I messed up and got the PR in too late 😬
We'll have to wait for the point release, unfortunately.

hollow silo Mar 7, 2024, 3:05 PM

#

okay, no worries, but that explains it 🙂

#

something else

#

STT sometimes adds commas. If it adds them everywhere, all is fine, but not if it only adds one

ivory vessel Mar 7, 2024, 3:23 PM

#

Weird, seems to work in English

#

Oh, wait. It does fail with the same intent. I'll take a look.

hollow silo Mar 7, 2024, 3:28 PM

#

Thanks!

reef anchor Mar 7, 2024, 4:07 PM

#

ivory vessel I messed up and got the PR in too late 😬 We'll have to wait for the point relea...

Oh, this is such good news, I've been searching for two hours what the difference is between the dev and the stable weather 😄 Because what I put in doesn't work. But now I'm relieved.

broken elbow Mar 7, 2024, 6:29 PM

#

i also noticed today that "set curtain to 90%" worked but for some reason "set curtain to 100%" became "set curtain, to 100%" (translated from finnish)

ivory vessel Mar 7, 2024, 7:09 PM

#

reef anchor Oh, this is such good news, I've been searching for two hours what the differenc...

Sorry about that! I was trying to get some last minute PRs merged and I missed the window 😞

reef anchor Mar 7, 2024, 8:56 PM

#

ivory vessel Sorry about that! I was trying to get some last minute PR's merged and I missed ...

Oh, I didn't write this in a negative way. I was just really happy that I didn't mess something up. I couldn't do what you guys do month after month 😉 I do pay attention to what work goes into releasing a main build.

dire oar Mar 7, 2024, 10:00 PM

#

Is there a list of ADCs/VPUs that are currently supported by the project, be it with an ESP32 or RPi? I'm aiming to design a new satellite with beamforming

dire oar Mar 7, 2024, 10:40 PM

#

Is the ZL38063 supported? https://www.microsemi.com/product-directory/connected-home/4432-zl38063

ivory vessel Mar 7, 2024, 10:48 PM

#

ESPHome supports I2S microphones directly: https://esphome.io/components/microphone/i2s_audio
The Raspberry Pi would need an I2S kernel module.

dense sphinx Mar 7, 2024, 11:04 PM

#

@scarlet echo Just doing these extra media player intents. I think I have got it all working. Tests are all passing EXCEPT when I add a response key to the tests. Then I get an assertion error even though it is an identical format to the previous ones I have done. Ideas? The error looks like AssertionError: No response template for intent HassMediaClearPlaylist named default: clear TV queue

scarlet echo Mar 8, 2024, 7:27 AM

#

Testing issues

scarlet echo Mar 8, 2024, 10:15 AM

#

should we have support for (custom) integrations to define their own set of intents and sentences which could be added by default to the Home Assistant conversation agent?
so for example, the Alarmo integration could expose an "arm Alarmo" or "arm [the] home alarm" sentence that the default conversation agent would adopt and have ready to use instantly
or since I've been discussing with Gav from Music Assistant, the MA integration could expose specific intents for media playback or other media-related actions
thoughts?

dire oar Mar 8, 2024, 9:49 PM

#

ivory vessel ESPHome supports I2S microphones directly: <https://esphome.io/components/microp...

So I assume that I would treat the processed output from the ZL38063 as a Microphone input to the ESP32 then 🙂

How advanced is the current audio processing on the ESP32-S3? Can we do acoustic echo cancellation and/or beamforming in software?

#

Has anyone managed to show the beam form direction picked up on the mic array back to the end user? Specifically in a similar way to an amazon echo?

ivory vessel Mar 9, 2024, 12:34 AM

#

dire oar So I assume that I would treat the processed output from the ZL38063 as a Microp...

The ESP32-S3 is capable of echo cancellation and other audio clean up (not sure about beam forming), but it is not being used currently in ESPHome. Espressif's ADF libraries are not exactly easy to use outside of their examples 😄

ivory vessel Mar 9, 2024, 12:35 AM

#

dire oar Has anyone managed to show the beam form direction picked up on the mic array ba...

I only see echo cancellation, blind source separation, and noise suppression listed here: https://www.espressif.com/en/news/esp-afe-algorithms

dire oar Mar 9, 2024, 2:59 AM

#

ivory vessel The ESP32-S3 is capable of echo cancellation and other audio clean up (not sure ...

Ah, it's good to know that it's technically possible. I'm guessing it boils down to Espressif's IDF having the library while PlatformIO lacks it. That also explains why the development boards I've seen omit advanced sound processors (as they expect the S3 to do a bit more heavy lifting)

#

Thank you for the link, that's what I was looking for, I just couldn't find the keywords to track that down

dire oar Mar 9, 2024, 8:16 PM

#

ivory vessel The ESP32-S3 is capable of echo cancellation and other audio clean up (not sure ...

If there currently isn't support because the Espressif library has not been ported to PlatformIO, do you think there is a need to offload that processing to a dedicated hardware chip such as https://www.microsemi.com/product-directory/connected-home/4432-zl38063 ?

dire oar Mar 10, 2024, 6:24 PM

#

How is development going with regard to flash and RAM usage? Looking through the ESP32-S3 datasheet, it seems that it supports up to 1 GB of both. Would that be of any use (above the 16 MB and 8 MB)?
It looks like 64 MB would be easy enough to reach while still staying inside the virtual address space

scarlet echo Mar 11, 2024, 5:24 PM

#

I've opened a discussion in the architecture repo for including devices among the things Assist can query to do its job (e.g. what is the <device_class> in <device_name> - what is the temperature in the fridge). If you think it's worthy, please vote and/or comment https://github.com/home-assistant/architecture/discussions/1060

worldly narwhal Mar 12, 2024, 3:40 PM

#

bump ^^

What can I do to get this merged? - I tried making the same changes via the preferred github codespace method however I don't have the necessary permission to push, I think I need to be a language leader? feels like a chicken-egg problem, need two PRs to be language lead, but can't get these merged 😅

scarlet echo Mar 12, 2024, 5:24 PM

#

worldly narwhal bump ^^ What can I do to get this merged? - I tried making the same changes via...

@ivory vessel

ivory vessel Mar 12, 2024, 8:14 PM

#

worldly narwhal bump ^^ What can I do to get this merged? - I tried making the same changes via...

Hi @worldly narwhal, sorry about the wait! I've assigned myself these PRs to take a look 👍

ivory vessel Mar 12, 2024, 9:48 PM

#

@worldly narwhal There was some problem with the CI and I couldn't get your PRs to run the tests. I pulled the changes into a single PR and got it merged. Adding you as a language leader for sw now. Thanks for your patience 🙂

floral path Mar 13, 2024, 7:33 AM

#

worldly narwhal bump ^^ What can I do to get this merged? - I tried making the same changes via...

I think the documentation is slightly incorrect. You need to publish it to your repo first, then make a PR from there. I do not think people have the right to publish the branch to home-assistant/intents. Correct?

scarlet echo Mar 13, 2024, 7:52 AM

#

You need to open 2 PRs, not to merge them yourself

floral path Mar 13, 2024, 10:10 AM

#

I was not talking about merging, but about creating the PR. I think when I follow the documentation step by step, when I create a PR, it tries to publish the branch to home-assistant/intents first, and then create a PR from this branch to main. But I have no permission to create a branch on home-assistant/intents. So I had to publish it to my account first, and create the PR from there (I think it automatically creates a fork of home-assistant/intents first - don't catch me there, I am not a git expert).

worldly narwhal Mar 13, 2024, 12:03 PM

#

ivory vessel @worldly narwhal There was some problem with the CI and I couldn't get your...

🥳 thanks! wait was well worth it 😆 I'll get started on the rest - if there's something I should change/fix in the future to avoid CI problems I'm all ears

worldly narwhal Mar 13, 2024, 12:42 PM

#

floral path I was not talking about Merge, but creating the PR. I think when I follow the do...

that step was indeed a bit confusing, but it's clear to me personally now, going to stick to the codespace method 😄 (these first PRs were from a forked intents repo - hadn't yet seen the message that codespace was preferred), after trying a few times i'll see if there's a way/need to improve the tutorial

scarlet echo Mar 13, 2024, 1:47 PM

#

worldly narwhal that step was indeed a bit confusing, but clear for me personnally now, going to...

codespaces have nothing to do with forks (you can create a codespace on your fork) and it's a good idea to update the sentences on your own fork and then create PRs. You can also work on branches in the intents repo, but never commit directly (without a PR) on the main branch

floral path Mar 13, 2024, 1:57 PM

#

scarlet echo codespaces have nothing to do with forks (you can create a codespace on your for...

I think the last sentence is not correct. People are generally not allowed to create or commit to branches on home-assistant/intents (and I do not mean main). You might not see that as you have more rights.
So it is not only a good idea to update the sentences in your own fork, it's the only way. And this is also what is confusing in the documentation.
Nobody was talking about making commits to main directly, I think.

scarlet echo Mar 13, 2024, 2:34 PM

#

Now that @worldly narwhal is a language leader, he can commit directly to the repo, which is not advisable. that was my point

floral path Mar 13, 2024, 2:50 PM

#

Ok. I think we were talking about the documentation in general.

dire oar Mar 13, 2024, 8:39 PM

#

@scarlet echo not sure if I've asked here before, I'm looking at making an "ideal" open source smart speaker PCB, with whatever hardware would be best suited to this project and Willow. Could you please advise who would be the best members to contact to collaborate with?

scarlet echo Mar 13, 2024, 10:07 PM

#

Can't say i can think of too many people, you'll probably have more success on the ESPHome server. Here are a few that come to mind:
@static stump, founder of Raspiaudio, probably up to his ears in closed source hardware design
@lyric harbor, founder of Willow, unsure about his availability
I don't know if he's on this server, but Sebastian from SmartSolutions4Home may be another good pick as a skilled electrical engineer https://smartsolutions4home.com/about-me/

#

Note that the above message has not notified the tagged people that they were tagged

lean beacon Mar 13, 2024, 10:59 PM

#

Are there docs on how to stream sound over websockets to the assist_pipeline integration for wake word detection? Or is it the same as for stt?
Also, in what format does the audio stream have to be? (I'm not really experienced in working with audio formats)

ivory vessel Mar 14, 2024, 12:48 AM

#

lean beacon Are there docs on how to stream sound over websockets to the assist_pipeline int...

https://developers.home-assistant.io/docs/voice/pipelines/

#

Same format. You just need to set the start stage to wake.

lean beacon Mar 14, 2024, 12:50 AM

#

ivory vessel Same format. You just need to set the start stage to wake.

And in what format should the audio stream be? Can't find that in that document.

ivory vessel Mar 14, 2024, 12:55 AM

#

lean beacon And in what format should the audio stream be? Can't find that in that document.

It's near the bottom: https://developers.home-assistant.io/docs/voice/pipelines/#sending-speech-data

You send one byte with the handler id, then raw 16 kHz mono audio with 16-bit (signed) samples.
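
As a minimal sketch of that framing in Python: the helper names are made up, the handler id `1` is only a placeholder (the real id is supplied by Home Assistant when the pipeline run starts), and little-endian byte order is an assumption, since the message above only says "16-bit (signed)". Sending the bytes over the websocket is left out.

```python
import math
import struct

SAMPLE_RATE_HZ = 16_000  # 16 kHz, mono


def float_to_pcm16(samples: list[float]) -> bytes:
    """Convert floats in [-1.0, 1.0] to signed 16-bit PCM (little-endian assumed)."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(s * 32767) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def make_audio_message(handler_id: int, pcm_chunk: bytes) -> bytes:
    """Prefix a raw PCM chunk with the one-byte handler id."""
    if not 0 <= handler_id <= 255:
        raise ValueError("handler id must fit in one byte")
    return bytes([handler_id]) + pcm_chunk


# 20 ms of a 440 Hz tone: 320 samples, so 640 bytes of PCM plus 1 prefix byte.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ) for n in range(320)]
message = make_audio_message(1, float_to_pcm16(tone))  # 1 is a placeholder id
print(len(message))  # 641
```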

lean beacon Mar 14, 2024, 1:11 AM

#

Thanks!

west gulchBOT Mar 14, 2024, 4:15 PM

#

@lean beacon I converted your message into a file since it's above 15 lines 👍

📎 message.txt

broken beacon Mar 14, 2024, 8:35 PM

#

Is there any development being done for the recommended M5 Atom Echo to solve the issues with it when used with Home Assistant?

compact gate Mar 14, 2024, 8:41 PM

#

That's far too vague to answer, and not related to development

#

Please continue in #voice-assistants-archived

ivory vessel Mar 14, 2024, 9:31 PM

#

west gulch @lean beacon I converted your message into a file since it's above 15 l...

It will resample to 16 kHz automatically, but this will cause more CPU usage in HA.

lean beacon Mar 14, 2024, 10:54 PM

#

Ah okay thanks.

left socket Mar 15, 2024, 6:24 PM

#

What power supply is everyone using to power the M5 stack? I am looking to replace my Amazon Echo devices and will need 8 of them.

scarlet echo Mar 15, 2024, 6:25 PM

#

left socket What power supply is everyone using to power the M5 stack? I am looking to repla...

#voice-assistants-archived

left socket Mar 15, 2024, 6:25 PM

#

scarlet echo #voice-assistants-archived

Oops... Apologies, I thought I was in that channel.

unborn saddle Mar 20, 2024, 12:38 PM

#

pressing the button on the Echo to say 'Doe de espresso uit' (it's in Dutch, Espresso is an alias), the response is 'Sorry, ik kan geen apparaat vinden met de naam De Espresso'. Would that be an issue to report here, or is that expected?

#

I could have sworn it did act properly before, so guess there was some development that changed its behavior

neat relic Mar 20, 2024, 12:53 PM

#

Echo -> Alexa right?

severe forum Mar 20, 2024, 6:56 PM

#

my VoIP assistant connectors don't disconnect when the call is finished. What more info is needed to file a bug?

#

one is on for 20 hours now.

unborn saddle Mar 20, 2024, 10:17 PM

#

neat relic Echo -> Alexa right?

no! Atom Echo, my apologies (wake word is disabled, that's why I need to push the button)

hollow silo Mar 21, 2024, 9:14 PM

#

unborn saddle pressing button on Echo to say 'Doe de espresso uit' (is in Dutch, Espresso is a...

That's a bit strange

#

Can't reproduce it

unborn saddle Mar 21, 2024, 9:21 PM

#

Is that when you talk to the device, or when you type? in my case it's when I give the voice command

#

i have this switch, with aliases

Schermafbeelding_2024-03-21_om_22.22.44.png

young wadi Mar 22, 2024, 11:22 PM

#

I just want to thank the devs, the latest update fixed my over-a-month-fight with the voice assistant ecosystem

young wadi Mar 23, 2024, 1:10 AM

#

aaaand didn't survive a reboot, damn pulseaudio, you're savage!

severe forum Mar 23, 2024, 9:51 PM

#

heya. the VoIP assistant does strange things when Asterisk breaks in and moves the call to another channel.
the assist stays open and the assist processing in the debug view runs forever.

scarlet aurora Mar 23, 2024, 11:22 PM

#

The lists in the _common.yaml file of the intents cannot have the same "in" value for multiple "out" values, or can they?

#

in portuguese, "persiana" can be used for both blind and shutter...

#

and also "estore"

worthy wave Mar 26, 2024, 2:58 PM

#

Hi @ivory vessel, is it possible to pass metadata from a sentence to its response (I mean render_response)? E.g. for the example sentence below, it would be nice to have metadata <key>: <value> available in the response, e.g. one_sensor: "{{ metadata.response_text }} {{ state.state_with_unit }}".

# Wind speed
- sentences:
    - "<what_is_the_class_of_name>"
  response: one_sensor
  requires_context:
    domain: sensor
    device_class: wind_speed
  slots:
    domain: sensor
    device_class: wind_speed
  expansion_rules:
    class: "(prędkość|szybkość) [wiatru]"
  metadata:
    response_text: Prędkość wiatru wynosi

#

The problem in Polish is that in order to create a correct response for the indicated device, you would need the name of the device in its base form (without inflection), which is impossible with the current YAML configuration. That's why I wanted to prepare answers that do not include the name of the device. They will contain information about the class of the device we are asking about.

scarlet echo Mar 26, 2024, 3:02 PM

#

take my upvote!

worthy wave Mar 26, 2024, 3:02 PM

#

Currently, to do this I have to prepare a large number of responses for each device..

#

something like this (but it doesn't look very good in the main configuration):

#

one_sensor_apparent_power: Moc pozorna urządzenia wynosi {{ state.state_with_unit }}
one_sensor_aqi: Indeks jakości powietrza wynosi {{ state.state_with_unit }}
one_sensor_atmospheric_pressure: Ciśnienie atmosferyczne wynosi {{ state.state_with_unit }}
one_sensor_battery: Poziom baterii wynosi {{ state.state_with_unit }}
one_sensor_carbon_dioxide: Stężenie dwutlenku węgla wynosi {{ state.state_with_unit }}
one_sensor_carbon_monoxide: Stężenie tlenku węgla wynosi {{ state.state_with_unit }}
one_sensor_current: Natężenie prądu elektrycznego wynosi {{ state.state_with_unit }}
one_sensor_data_rate: Prędkość transferu danych wynosi {{ state.state_with_unit }}
one_sensor_data_size: Rozmiar danych wynosi {{ state.state_with_unit }}
one_sensor_date: Data w kalendarzu to {{ state.state_with_unit }}
one_sensor_distance: Odległość wynosi {{ state.state_with_unit }}
...

scarlet echo Mar 26, 2024, 3:20 PM

#

@worthy wave although it's an absolutely excellent idea and I can try to do that PR myself, just FYI you can hack this as we speak with slots instead of metadata

worthy wave Mar 26, 2024, 3:20 PM

#

and/or it would help if the exact text that was recognised from expansion_rules could be added to the response. E.g. the response could get an extra key like {{ rules.class }}, which would contain text like prędkość wiatru or szybkość wiatru 🙂 I know it won't be easy, but it would certainly make it easier to create correct answers 😀

#

@scarlet echo yes I know that I can use slots..

#

but slots are not a good place to add just a response.. I prefer to create lots of responses instead 🙂

#

it may work now, but in the future it can cause a lot of problems 🙂

scarlet echo Mar 26, 2024, 3:22 PM

#

sentences/xx/homeassistant_HassWhatever.yaml

intents:
  HassWhatever:
    data:
      - sentences:
          - "abracadabra"
        slots:
          testslot: "test value"
        response: testslot

#

responses/xx/HassWhatever.yaml

responses:
  intents:
    HassWhatever:
      testslot: "{{ slots.testslot }}"

#

$ python3 -m script.intentfest parse --language en --sentence 'abracadabra'
{
  "text": "abracadabra",
  "match": true,
  "intent": "HassWhatever",
  "slots": {
    "testslot": "test value"
  },
  "context": {},
  "response_key": "testslot",
  "response": "test value"
}

worthy wave Mar 26, 2024, 3:23 PM

#

yes, I know I can do that... but I don't want to do like that 🙂

scarlet echo Mar 26, 2024, 3:23 PM

#

i totally agree

worthy wave Mar 26, 2024, 3:28 PM

#

@scarlet echo maybe you have tested this on HA, small question: if I have a real device in HA, e.g. living room door, and I create an alias like door in living room, then when I ask: What is the state of door in living room? What will be the value of {{ slots.name }}? living room door or door in living room?

scarlet echo Mar 26, 2024, 3:29 PM

#

the slot text is what you said, in this case door in living room

worthy wave Mar 26, 2024, 3:30 PM

#

Big thanks.. so it still doesn't solve the problem of converting the Polish device name to its base form 🙂

#

maybe there is some magic field in HA that I can fill in to always use this form for answers.. hehe 😅

scarlet echo Mar 26, 2024, 3:35 PM

#

we have loads of issues with not having both the slot "text" and value being available in responses (and other places). basically, there are places where we need both the "translated" and "untranslated" versions of a slot (e.g. a zone name + ID or an entity_id along with the friendly name etc.). i'm not sure of the timeline for this (or if there even is one) and i'm reluctant to implement it, as many of my contributions have become severely outdated by the time someone reviewed them and i don't have enough time to keep them up to date
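
To make the "text" versus "value" distinction concrete, here is a minimal sketch of what carrying both could look like. Everything in it is hypothetical: the `SlotMatch` class, the `render` helper, the `<slot>_text` / `<slot>_value` field names and the entity ID are made up for illustration, and Python's `str.format` stands in for the real Jinja response templates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotMatch:
    """Hypothetical container holding both forms of a matched slot."""

    text: str   # what the user actually said
    value: str  # the canonical value (e.g. an entity_id or zone ID)


def render(template: str, slots: dict[str, SlotMatch]) -> str:
    """Expose every slot as <name>_text and <name>_value to the template."""
    fields: dict[str, str] = {}
    for name, slot in slots.items():
        fields[f"{name}_text"] = slot.text
        fields[f"{name}_value"] = slot.value
    return template.format(**fields)


slots = {
    # The entity ID below is an invented example.
    "name": SlotMatch(text="door in living room", value="binary_sensor.living_room_door"),
}
print(render("{name_text} ({name_value}) is open", slots))
```

With both forms available, a response could use the spoken text while an action uses the canonical value.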

worthy wave Mar 26, 2024, 3:38 PM

#

yes, I saw your PR (and branch) related to translations.. that is the reason why, for Polish, I use a lot of conditions to create a correct response. Again, I know that is not a good solution, but without it the response would not make sense in Polish

scarlet echo Mar 26, 2024, 3:39 PM

#

you're gonna love this, then 😛 https://community.home-assistant.io/t/entity-metadata-like-number-gender-or-number-for-localization-or-user-generated-names/535963

worthy wave Mar 26, 2024, 3:42 PM

#

I have exactly the same problem in Polish language 😅

#

I don't know the Romanian language, but I see a lot of similarities to the Polish language

ivory vessel Mar 27, 2024, 5:17 PM

#

worthy wave Hi <@638799193586139136> , is it possible to add `metadata` information from sen...

Created a PR 🙂: https://github.com/home-assistant/intents/pull/2108

#

FYI @scarlet echo I decided to rename the range scale to multiplier per your suggestion.

scarlet echo Mar 27, 2024, 5:26 PM

#

I didn't realize the modification was in the intents repo. After a quick skim, i thought the core had to be altered

ivory vessel Mar 27, 2024, 5:26 PM

#

It does, I forgot to mention that this is to try it out first before I modify core.

scarlet echo Mar 27, 2024, 5:26 PM

#

But then again, after a week of vacation and hundreds of emails both at work and personal, my context switching fu was not at its peak 😅

ivory vessel Mar 27, 2024, 5:27 PM

#

Nope, you're definitely right 😄

scarlet echo Mar 27, 2024, 5:28 PM

#

Since you're online, Mike, great work with a certain Andean mammal! 😋 Can't wait to test it out

ivory vessel Mar 27, 2024, 5:56 PM

#

Lol, thanks! It can't control HA just yet, but that's the next step. I have a proof-of-concept working, but we decided to generalize things just a bit more 😉

west gulchBOT Mar 27, 2024, 10:32 PM

#

@worthy wave I converted your message into a file since it's above 15 lines :+1:

📎 message.txt

worthy wave Mar 27, 2024, 10:34 PM

#

result:

===================================================================================================================== test session starts
platform darwin -- Python 3.12.2, pytest-8.1.1, pluggy-1.4.0
rootdir: .../Home Assistant/intents
configfile: pyproject.toml
collected 79 items / 75 deselected / 4 selected

tests/test_language_intents.py ..                                                                                                                                                                                                                       [ 50%]
tests/test_language_sentences.py ..                                                                                                                                                                                                                     [100%]

============================================================================================================== 4 passed, 75 deselected in 1.08s

#

@ivory vessel big thanks for these small changes.. They will really help to design a better voice experience ❤️ 💪 😀

tidal ridge Mar 29, 2024, 10:02 PM

#

scarlet echo we have loads of issues with not having both the slot "text" and value being ava...

Sorry I’m just dropping in here… I think there is a linguistic aspect that you both are getting at that might need to be handled differently than a slot. Some languages have prepositional phrases, whereas others have postpositional phrases. Adding to the fun, within the phrase the order of the linguistic object will change.

So having specific software objects that reflect the parts of speech is useful. My thinking is that the slot could be derived based on the language setting affecting the construction of the parts of speech determined through an NLP library pulling apart the words through chunking, stemming, and lemmatization.

#

That last sentence was trying to do too much.

scarlet echo Mar 29, 2024, 10:31 PM

#

What is your point?

tidal ridge Mar 30, 2024, 10:59 AM

#

Yes… apologies… The point is to handle the linguistic differences prior to attempting to handle the intent and contents of the slot.

tidal ridge Mar 30, 2024, 4:08 PM

#

STT -> Pragmatics -> Semantics -> Syntax -> Morphology -> Translate -> Intent and Slot -> Action

scarlet echo Mar 30, 2024, 7:35 PM

#

tidal ridge STT -> Pragmatics -> Semantics -> Syntax -> Morphology -> Translate -> Intent an...

Agreed, but all those pieces are missing atm. Some may be a bit of overkill. I doubt anyone will be against somebody implementing those things.

tidal ridge Mar 31, 2024, 8:41 PM

#

scarlet echo Agreed, but all those pieces are missing atm. Some may be a bit of overkill. I d...

I had a similar thought as I was typing, “Aren’t you doing this right now, you big dummy? If not you, then who?”

My inner dialogue can be brutal.

So I played around a bit. That pipeline, as I typed it, is miserable. If nothing else, it is incredibly slow.

Now I’m mulling over if this is feasible as typed. 🥸😒

scarlet echo Apr 1, 2024, 6:10 AM

#

tidal ridge I had a similar thought as I was typing, “Aren’t you doing this right now, you b...

My inner dialogue can be brutal.
😆

hollow hollow Apr 2, 2024, 11:52 AM

#

Hello guys! I've just finished an initial version of a custom integration for AllTalk TTS.

#

https://github.com/ser/AllTalkTTS

#

AllTalk TTS is in my opinion the best currently available TTS system.

#

tests, comments, patches are really welcome

worthy wave Apr 3, 2024, 6:12 PM

#

Hey guys, I have an RTX 4090 and I'm trying to train a new Polish voice, but I have a problem with torch version 1.13.1

>>> import torch
>>> print(torch.__version__)
1.13.1+cu117
>>> out = torch.fft.rfft(torch.randn(1000).cuda())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR

Does anyone know how to solve this problem? I looked for some solutions but unfortunately I can't find anything..

#

It is worth adding that I am working on Ubuntu 22.04 🙂

worthy wave Apr 3, 2024, 9:29 PM

#

I found a solution 🙂 https://github.com/rhasspy/piper/issues/295 the first tests look really great

worthy wave Apr 5, 2024, 1:09 PM

#

@scarlet echo could you please review this PR? https://github.com/home-assistant/intents/pull/2108 😉

scarlet echo Apr 5, 2024, 1:17 PM

#

worthy wave <@535933407939657745> could you please review this PR? https://github.com/home-a...

done. sorry for the delay

hollow silo Apr 9, 2024, 12:38 PM

#

I have something I don't understand.
In cover_HassTurnOn I have the following intent

      - sentences:
          - open [de|het] <curtain> <in> <area>
          - "[de|het] <curtain> <in> <area> openen"
          - "[<doe>] [de|het] <curtain> (<open> <in> <area>|<in> <area> <open>)"
        response: "cover"
        requires_context:
          device_class: "curtain"
          domain: "cover"

<curtain> refers to "(gordijn[en]|vitrage[s])"

#

These are the tests (which pass just fine)

  - sentences:
      - Open het gordijn in de woonkamer
      - Vitrage woonkamer open
      - Doe het gordijn open in de woonkamer
    intent:
      name: HassTurnOn
      slots:
        area: Woonkamer
      context:
        device_class: curtain
        domain: cover
    response: Geopend

#

but if I try an individual sentence from those tests, it doesn't work

#

@TheFes ➜ /workspaces/intents (nl_volume) $ python3 -m script.intentfest parse --language nl --sentence 'Open het gordijn in de woonkamer'
{
  "text": "Open het gordijn in de woonkamer",
  "match": false
}

scarlet echo Apr 9, 2024, 2:36 PM

#

that sounds like it's because you're not inferring context in your command. i can't remember how you can do that, though

hollow hollow Apr 11, 2024, 2:44 AM

#

Hello guys with privileges, I need you to review my AllTalkTTS HACS PR: https://github.com/hacs/default/pull/2457

#

You will not regret, it's seriously the most advanced existing free TTS

compact gate Apr 11, 2024, 2:53 AM

#

The folks here have nothing to do with that

hollow hollow Apr 11, 2024, 3:05 AM

#

that's bad, making this integration i did not realise that hacs backlog is 3 months long 😦

#

i would not bother

#

it seriously negatively affects HA project as it drains steam from developers

scarlet echo Apr 11, 2024, 7:55 AM

#

a few things here @hollow hollow

HACS (as the name implies) is not HA, but the community store. this channel is dedicated to developers on HA voice stuff
you can always list any repo as a custom Github repo in your HACS instance and use it without waiting for HACS to merge a PR adding the repo to the default collection. you can also instruct your potential users to do that
as the HACS documentation for publishers states, the backlog is quite long and it will take a while to get to yours. out of personal experience, it took about 4 months for me
"it seriously negatively affects HA project as it drains steam from developers" - although I agree (again, out of personal experience) that it can be frustrating and exhausting to wait as a developer for your contribution to get merged and used somehow, i seriously doubt that this process (be it in regards to HA or HACS) affects the HA project. there are just too many great things happening all at once for your great thing to make it a dealbreaker. and only so little manpower to handle and organize all that greatness
when you created the PR in HACS, this was in the description. don't believe me? edit your PR (unless you've deleted everything there)

<!--
DO NOT REQUEST REVIEWS, THAT IS JUST RUDE, IF YOU DO THE PULL REQUEST WILL BE CLOSED!
Make sure to check out the guide here: https://hacs.xyz/docs/publish/start
-->

hollow hollow Apr 11, 2024, 8:00 AM

#

thanks for your lecture, it was funny!

hollow hollow Apr 11, 2024, 8:32 AM

#

But I would really like that review

#

Github clearly writes on the PR page: review required

#

I suppose it must be a Schrödinger review then - it is required and not requested at the same time!

#

And going back to your funny lecture a bit, the only person I see doing reviews is a Nabu Casa employee, it sounds like HA-related thing

scarlet echo Apr 11, 2024, 8:39 AM

#

hollow hollow Github clearly writes on the PR page: review required

"...those who refuse will be shot at dawn" is in the fineprint

hollow hollow Apr 11, 2024, 8:40 AM

#

3 months delays are evidently their internal problem which may be related to lack of workforce or bad procedures

scarlet echo Apr 11, 2024, 8:40 AM

#

i suggest you ask for a refund

hollow hollow Apr 11, 2024, 8:40 AM

#

as they could delegate someone from a community

scarlet echo Apr 11, 2024, 8:41 AM

#

now please stop spamming this channel with off-topic things you don't (want to) grasp

hollow hollow Apr 11, 2024, 8:41 AM

#

I will just ignore your dumb comments, it will be easier 🙂

#

They are funny though

drowsy inlet Apr 11, 2024, 8:51 AM

#

hollow hollow I will just ignore your dumb comments, it will be easier 🙂

with that attitude you are almost guaranteeing nobody will review your PR, devs read these channels too 🙂

split bison Apr 11, 2024, 8:52 AM

#

And frankly, you're alienating potential users by being difficult to deal with 😉

#

I'm going to step in as a moderator and say: you're in the wrong place, you need to follow the rules of this and GitHub

earnest marsh Apr 11, 2024, 8:53 AM

#

hollow hollow 3 months delays are evidently their internal problem which may be related to lac...

Or technical limitations maybe? You are making assumptions there.

#

Making a negative scene, without having actual foundations or context

hollow hollow Apr 11, 2024, 8:54 AM

#

As provoked by mr telelele I checked there is only one reviewer, so I think Mr Stefan you can't blackmail me efficiently

earnest marsh Apr 11, 2024, 8:54 AM

#

It isn't blackmailing IMHO, it is true. Such approach wouldn't be received well in general

#

You might not want to hear that, that is fine 🤷‍♂️

hollow hollow Apr 11, 2024, 8:55 AM

#

I clearly presented the actual foundation: 3 months no review and hundreds of reviews waiting, so you are simply lying, Mr Frenck

earnest marsh Apr 11, 2024, 8:55 AM

#

anyways, HACS != Home Assistant development, so this might not be the right place for this

split bison Apr 11, 2024, 8:55 AM

#

hollow hollow I clearly presented the actual foundation: 3 months no review and hundreds of re...

Pro tip: do not accuse the Home Assistant core team of lying

hollow hollow Apr 11, 2024, 8:55 AM

#

Yes we can close this topic indeed

earnest marsh Apr 11, 2024, 8:55 AM

#

@hollow hollow Sorry, that felt offensive, where was I lying?

hollow hollow Apr 11, 2024, 8:55 AM

#

Why not if he is lying

drowsy inlet Apr 11, 2024, 8:56 AM

#

hollow hollow As provoked by mr telelele I checked there is only one reviewer, so I think Mr S...

https://tenor.com/view/michael-scott-the-office-steve-gif-25464260

hollow hollow Apr 11, 2024, 8:56 AM

#

you told i have not presented the foundation

#

Which I clearly did

earnest marsh Apr 11, 2024, 8:56 AM

#

That is not what I said

hollow hollow Apr 11, 2024, 8:56 AM

#

Mr Frenck said: "without having actual foundations"

earnest marsh Apr 11, 2024, 8:57 AM

#

I said: You are making assumptions on what is happening or the reasons what is going on, while there is no response and thus no foundations for those conclusions. You are guessing

#

yes, you had no response and no context, you cannot make such conclusions out of thin air

hollow hollow Apr 11, 2024, 8:57 AM

#

no you wrote "without having actual foundations "

#

it was a lie

#

you just jumped on me

earnest marsh Apr 11, 2024, 8:57 AM

#

I understand you are unhappy with the wait, but 🤷‍♂️ You are also drawing conclusions based on nothing but wait

hollow hollow Apr 11, 2024, 8:57 AM

#

probably because it's a real problem

earnest marsh Apr 11, 2024, 8:58 AM

#

Alright, ok this is going nowhere. Let's stop this here. This is not HA voice development related.

hollow hollow Apr 11, 2024, 8:59 AM

#

OK

solid cradle Apr 11, 2024, 11:38 AM

#

https://tenor.com/view/jake-gyllenhaal-astonished-dumbfounded-surprised-impressed-gif-15443679

subtle sierra Apr 11, 2024, 11:01 PM

#

I just opened this PR 5 days ago that improves the voice assistant Arabic language https://github.com/home-assistant/intents/pull/2125 but I found a test that failed, however I only edited the yaml file https://github.com/home-assistant/intents/actions/runs/8646997707/job/23722202820?pr=2125 so what's the problem?

scarlet echo Apr 12, 2024, 6:35 AM

#

subtle sierra I just opened this PR 5 days ago that improves the voice assistant Arabic langua...

you need to edit the tests to match the changes you made https://github.com/home-assistant/intents/blob/main/tests/ar/light_HassLightSet.yaml

subtle sierra Apr 15, 2024, 7:16 PM

#

I modified it and it still fails!

west gulchBOT Apr 15, 2024, 7:43 PM

#

@subtle sierra I converted your message into a file since it's above 15 lines :+1:

📎 message.txt

scarlet echo Apr 15, 2024, 8:20 PM

#

sorry, i can't read Arabic so it's pretty hard for me to help out. i'd suggest tagging the AR language leaders in your PR, asking for help

worthy wave Apr 19, 2024, 4:37 PM

#

Hey guys, has anyone tried adding their own pretrained voice to piper? Do you know how to do this? I tried various ways but unfortunately it doesn't work..

worthy wave Apr 20, 2024, 6:57 PM

#

Finally, it started working for me 🙂 here are a few tips for using your own voice:

Add the new files to /share/piper, but name them wg_glos_meski.onnx and wg_glos_meski.onnx.json. Don't use pl_PL-wg_glos_meski-medium.onnx
Restart the piper add-on and HA core to see the changes
Update your pipeline and select the new voice that was added to your HA; in my case it was just wg_glos_meski (medium)
Update all automations where you use the tts.speak service. Select your new voice under options by setting voice: wg_glos_meski, where the value (wg_glos_meski) is the name of the voice:

service: tts.speak
data:
  media_player_entity_id: media_player.korytarz_homepod
  cache: false
  options:
    voice: wg_glos_meski
  message: Wykryto wyciek wody w kuchni pod ekspresem do kawy.
target:
  entity_id: tts.piper
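
The same service call can also be made from outside Home Assistant through its REST API (`POST /api/services/<domain>/<service>` with a long-lived access token). The sketch below only builds the request. The base URL and token are placeholders you would replace, and the function name `build_tts_request` is made up for this example.

```python
import json
import urllib.request


def build_tts_request(base_url: str, token: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a tts.speak service call for the custom voice."""
    payload = {
        "entity_id": "tts.piper",
        "media_player_entity_id": "media_player.korytarz_homepod",
        "message": message,
        "cache": False,
        "options": {"voice": "wg_glos_meski"},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/services/tts/speak",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder URL and token; both are assumptions for illustration.
    request = build_tts_request(
        "http://homeassistant.local:8123",
        "YOUR_LONG_LIVED_TOKEN",
        "Wykryto wyciek wody w kuchni pod ekspresem do kawy.",
    )
    print(request.get_method(), request.full_url)
    # urllib.request.urlopen(request)  # uncomment to actually send the call
```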

#

The only small problem is the inability to set the voice directly in the add-on 🙂 Even when I try to set it directly via the YAML configuration.. it is just impossible 🙂 because the add-on voice list is hardcoded in the add-on configuration: https://github.com/home-assistant/addons/blob/master/piper/config.yaml#L28. If I set something else, I get an error message 🙂

worthy wave Apr 22, 2024, 8:34 PM

#

@scarlet echo How do you rate the onju-voice? Are you satisfied with this speaker? How do wake words perform when there is slight background noise? And most importantly, does it cope well when we say a command over slight noise? PS. I'm asking because I saw that you prepared a video on YT with instructions on replacing a PCB 😉

scarlet echo Apr 23, 2024, 7:44 AM

#

worthy wave <@535933407939657745> How do you rate the effect of onju-voice? Are you satisfie...

i'm pretty happy with it. you can find multiple posts about it in #voice-assistants-archived (search for "onju"). however, I am not a "power user", i don't use it along with TV and whatnot. for my needs, it's perfectly suited

worthy wave Apr 23, 2024, 7:02 PM

#

scarlet echo i'm pretty happy with it. you can find multiple posts about it in <#646814454063...

thanks for the information, I'm asking you because I know you use it yourself, unfortunately in my case Atom Echo did not work properly even though many people had no problem

scarlet echo Apr 23, 2024, 7:25 PM

#

The hardware in the Onju is pretty good. I am waiting for the software to improve so as to fully utilize it 😬

meager onyx Apr 30, 2024, 2:46 PM

#

Hi how do I create custom models for micro wake word?

#

Or is it better to use openwakeword for the time being?

scarlet echo May 13, 2024, 8:04 AM

#

I think we may have made the wrong decision in regards to querying cover entities for questions such as "Which windows are open?". But switching to binary_sensor would be a breaking change, so I started a poll here to get some feedback on which type of entities people have https://github.com/home-assistant/intents/discussions/2168
If it will reveal that binary_sensors are more prevalent, is it ok to switch the default target domain to binary_sensor? @ivory vessel @noble copper

hardy cargo May 13, 2024, 8:10 AM

#

meager onyx Hi how do I create custom models for micro wake word?

for custom models, at the moment it is better to use openwakeword or snowboy; creating them for mWW is quite an in-depth and complex task. The process is described in detail here https://github.com/kahrendt/microWakeWord

noble copper May 13, 2024, 10:54 AM

#

scarlet echo I think we may have made the wrong decision in regards to querying `cover` entit...

We shouldn't make a decision but support both

scarlet echo May 13, 2024, 11:03 AM

#

noble copper We shouldn't make a decision but support both

a decision was made ~1 year ago. i've discussed with Mike a potential solution to support such a thing, i.e. "entity" slot lists, so you can have more than one entity referenced in a sentence (e.g. "is Paulus at the supermarket?"), with one or more filters which the entities should match
the trouble is that PRs in that area get stale and I, for one, don't have the time to redo everything from the ground up after 2 months because everything got overhauled
so the proposed solution was simply just as bad, but more general

scarlet echo May 13, 2024, 2:18 PM

#

alternatively, we could create a binary_sensor_as_x helper, which turns binary_sensors into covers only from the interface, similarly to switch_as_x https://github.com/home-assistant/architecture/discussions/1084

noble copper May 17, 2024, 3:12 PM

#

I've opened a bounty to get outbound calls working for our voip utils https://github.com/home-assistant-libs/voip-utils/issues/17

hollow steeple May 17, 2024, 4:55 PM

#

All paths lead to pjsip, as with most SIP UAs you can probably think of. If you want to get hacky with it, I'm sure you could hack together a bastardised UA that just makes a call and does not care too much about playing nicely, as with the original UA specific to a Grandstream HT801. It really depends on how long you want the call to be and how the UAS reacts. If the audio you want to transmit is under 30 seconds, you can probably get away with murder and just send an INVITE, wait for a 200 OK, ACK it and send/receive audio (with very specific codec choices, as with the original). What's your actual use case? I can be a bit more specific with more information on what you want to achieve (Does it need to auth? What does it actually need to call?).
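
For anyone unfamiliar with what "just send an INVITE" involves, here is a rough sketch of building a bare-bones SIP INVITE with an SDP body, following the general message layout of RFC 3261. It is not how voip-utils does it. The function name `build_invite`, the port numbers and the PCMU codec choice are assumptions for illustration, and authentication, sending the message, and the 200 OK / ACK handling are all left out.

```python
import uuid


def build_invite(caller: str, callee: str, local_ip: str,
                 sip_port: int = 5060, rtp_port: int = 40000) -> str:
    """Return the text of a minimal SIP INVITE offering one PCMU audio stream."""
    # SDP body: a single audio stream, payload type 0 (PCMU) is an assumed codec.
    sdp = "\r\n".join([
        "v=0",
        f"o=- 0 0 IN IP4 {local_ip}",
        "s=call",
        f"c=IN IP4 {local_ip}",
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP 0",
        "a=rtpmap:0 PCMU/8000",
        "a=sendrecv",
    ]) + "\r\n"

    user = caller.split("@")[0]
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        # The branch parameter must start with the magic cookie z9hG4bK.
        f"Via: SIP/2.0/UDP {local_ip}:{sip_port};branch=z9hG4bK{uuid.uuid4().hex}",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{callee}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_ip}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_ip}:{sip_port}>",
        "Content-Type: application/sdp",
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    print(build_invite("alice@192.0.2.1", "bob@192.0.2.10", "192.0.2.1"))
```

A real UA would send this over UDP, wait for the 200 OK, reply with an ACK and then stream RTP to the address given in the answer's SDP.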

spare forge May 24, 2024, 2:31 PM

#

@scarlet echo I updated to the latest version of onju-voice-microwakeword from your repo and it seems to only recognize the wake up word once and then never ever again. Does that sound like a known issue to you? Also, playing media doesn't seem to do anything

scarlet echo May 25, 2024, 5:09 AM

#

spare forge <@535933407939657745> I updated to the latest version of onju-voice-microwakewor...

https://github.com/tetele/onju-voice-satellite/issues/46
Also, this is rather a question for #voice-assistants-archived

naive parcel May 25, 2024, 10:06 AM

#

What sort of hardware is recommended to train/finetune a new voice for Piper? Is a single GPU with 24GB of VRAM enough, or would a multi-GPU setup be preferred?
(another way to phrase this question is "What hardware is Mike training piper on")

#

(i'm interested in contributing voices for Piper)

naive parcel May 25, 2024, 3:07 PM

#

Also, are there tools for processing public domain audio book recordings into a dataset?
I was thinking of like, a tool that uses whisper to transcribe the audio files and save all that metadata into a csv file
I remember Mike saying somewhere that he used public domain audio books to create voices for piper, so I imagine he didn't create datasets manually and used some tools for automating the process
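
The "transcribe and save metadata" idea can be sketched in a few lines. This is only an illustration under assumptions: the `id|text` line layout follows the LJSpeech-style `metadata.csv` convention, and the helper name `build_metadata_lines` is made up. The transcriber is passed in as a function, so the sketch does not depend on any particular speech-to-text package.

```python
from pathlib import Path
from typing import Callable, Iterable


def build_metadata_lines(
    wav_paths: Iterable[Path],
    transcribe: Callable[[Path], str],
) -> list[str]:
    """Return 'clip_id|transcript' lines, skipping clips with empty transcripts."""
    lines = []
    for wav in sorted(wav_paths):
        # Collapse whitespace and drop the delimiter so each clip stays on one line.
        text = " ".join(transcribe(wav).replace("|", " ").split())
        if text:
            lines.append(f"{wav.stem}|{text}")
    return lines


if __name__ == "__main__":
    # With the openai-whisper package installed, the transcriber could be e.g.:
    #   model = whisper.load_model("base")
    #   transcribe = lambda path: model.transcribe(str(path))["text"]
    # A stand-in is used here so the sketch runs without extra dependencies.
    def fake_transcribe(path: Path) -> str:
        return f"placeholder transcript for {path.name}"

    clips = Path("dataset/wav").glob("*.wav")  # assumed folder layout
    Path("metadata.csv").write_text(
        "\n".join(build_metadata_lines(clips, fake_transcribe)) + "\n",
        encoding="utf-8",
    )
```

The transcripts would still need a manual pass before training.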

naive parcel May 28, 2024, 6:07 PM

#

naive parcel Also, are there tools for processing public domain audio book recordings into a ...

It was a PAIN to set up (had to compile freaking gcc 11 to build one ancient module old enough to go to school) and it errors out or stalls unless you feed it a single 48khz WAV file (and I also had to run it in CPU mode because it was video memory oom-ing) but this achieved what I wanted:
https://github.com/davidmartinrius/speech-dataset-generator

#

The whisper transcription is OK but dodgy in a few places, still looking for good software to edit the metadata

little robin May 28, 2024, 9:31 PM

#

Hi all, I would like to extend the conversation component so that it can send TTS messages to, for example, esphome's Voice_Assistant. I have been looking into the code and it seems easy to do without needing to change very much. Sadly, I don't have the skills and environment to do it myself; I tried. Is there anyone willing to help me out?

scarlet echo May 28, 2024, 9:42 PM

#

little robin Hi all, I would like to extend the conversation component to allow to send TTS m...

that sounds out of scope for a conversation agent. have you opened up an architecture discussion?
you can use the tts.speak service and target any media_player, one of which could be in the ESPHome voice satellite

little robin May 28, 2024, 9:51 PM

#

I know I can use the media player. But some devices do not support the media_player option. And from what I understand, the media player requires the MP3 codec to play audio, while VA uses WAV audio.

little robin May 28, 2024, 9:55 PM

#

scarlet echo that sounds out of scope for a conversation agent. have you opened up an [archit...

I see this suggestion as a conversation starter, like HAOS: "Did you take your meds?", ME: "Yes"/"No" etc. HAOS: when "Yes" => "Okay, noted", else "Time to take them now." etc.

#

Anyway, I will read the architecture discussion forum and post my suggestion there.

little robin May 28, 2024, 11:05 PM

#

https://community.home-assistant.io/t/start-conversation-from-ha/733856

scarlet echo May 29, 2024, 4:43 AM

#

little robin I see this suggestion as an conversation starter, like HAOS: "Did you take you m...

Oh, ok. I had the same proposal last year https://github.com/home-assistant/architecture/discussions/907

little robin May 29, 2024, 9:34 AM

#

scarlet echo Oh, ok. I had the same proposal last year https://github.com/home-assistant/arch...

And what was the response to that?

#

there were a lot of reactions to your suggestion, I see, but as far as I can tell none of them amount to "let me implement it". Am I right?

scarlet echo May 29, 2024, 10:19 AM

#

correct. there needs to be a decision from an architectural standpoint in order to guarantee the merging of the feature

little robin May 29, 2024, 12:16 PM

#

From looking at the code, it is almost there; no architectural changes are needed, imho, it is just extending the already existing code. The assist_pipeline has all the setup options that are needed to send the TTS messages.

scarlet echo May 29, 2024, 12:18 PM

#

you're looking at it simplistically, i fear. OK, so you can emit a TTS message, but how will that tie into your response? the proper approach, in my view, is to build the foundations for a conversation (i.e. back and forth messages) that can be started by either party

little robin May 29, 2024, 12:29 PM

#

No kidding, this would be my optimal solution as well, and I fully agree with that. I still believe this is possible within the current pipeline architecture. Maybe not as extensive as your proposal, but I see some light in the dark.

#

My approach is to do it step by step. First the initial message from HA, and later controlling the responses. With a proper automation setup this can be done by adding different triggers that respond to what is said.

hollow silo May 29, 2024, 12:42 PM

#

working on the timer intents now, but I get this error when doing the tests
FAILED tests/test_language_intents.py::test_homeassistant_HassCancelTimer[nl] - AssertionError: Intent HassCancelTimer does not support slot 'seconds'. See intents.yaml for supported slots

#

any idea where this comes from? until now I only made direct translations from EN to NL, and didn't use seconds directly anywhere

little robin May 29, 2024, 12:54 PM

#

did you do a full search on 'seconds'?

#

I'm sure you did 😉

hollow silo May 29, 2024, 6:16 PM

#

https://github.com/home-assistant/intents/pull/2194 this is what I have now, still not sure where those unsupported slots come from

little robin May 29, 2024, 11:25 PM

#

@thefly what happens when someone says something like pizzatimer instead of pizza timer?

#

or keukentimer vs keuken timer

hollow silo May 30, 2024, 11:27 AM

#

little robin @thefly what happens when someone says something like pizzatimer instead of pizz...

there's an optional space between them, so it works with or without

#

{area}[ ]timer and {timer_name:name}[ ]timer
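
To see why the optional space covers both spellings, here is a toy translation of such a template into a regular expression. It is a deliberately tiny sketch, not how the real sentence matcher works: it handles only literal text, `[optional]` groups and `{list:slot}` placeholders, and the function name `template_to_regex` is made up.

```python
import re


def template_to_regex(template: str) -> re.Pattern:
    """Translate a tiny subset of the sentence template syntax into a regex."""
    parts = []
    i = 0
    while i < len(template):
        char = template[i]
        if char == "[":
            # [text] is optional literal text.
            end = template.index("]", i)
            parts.append("(?:" + re.escape(template[i + 1:end]) + ")?")
            i = end + 1
        elif char == "{":
            # {list} or {list:slot}; the slot name is used for the capture group.
            end = template.index("}", i)
            slot = template[i + 1:end].split(":")[-1]
            parts.append(f"(?P<{slot}>\\w+?)")
            i = end + 1
        else:
            parts.append(re.escape(char))
            i += 1
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


pattern = template_to_regex("{timer_name:name}[ ]timer")
for sentence in ("pizzatimer", "pizza timer"):
    print(sentence, "->", pattern.match(sentence).group("name"))
```

Both sentences yield the same slot text, which is the behaviour described above for pizzatimer vs pizza timer.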

west gulchBOT May 31, 2024, 7:27 AM

#

@scarlet echo I converted your message into a file since it's above 15 lines :+1:

📎 message.txt

hollow silo May 31, 2024, 7:28 AM

#

Ah nice, there is a party in my area this weekend to celebrate the intents repo https://www.intentsfestival.nl/en/

scarlet echo May 31, 2024, 7:29 AM

#

do you... <cough cough> intend on going?

#

(i'll see myself out)

hollow silo May 31, 2024, 7:31 AM

#

If the winds are good, I can just listen to it for free (also when I intend to sleep 😅 )

hardy cargo May 31, 2024, 7:33 AM

#

there is a campsite! so you can stay in**tents**

scarlet echo May 31, 2024, 7:35 AM

#

i feel bad about my on-topic message question becoming less relevant, but do they test for compliance at the entrance, before allowing you to go to the main stage?

hollow silo May 31, 2024, 8:16 AM

#

on-topic then

#

It's a bit unclear to me when to use slots and when to use requires_context

#

I had requires_context here, but that didn't work. changing it to slots makes it work

      - sentences:
          - "open [de] garage[ ][deur]"
          - "[de] garage[ ][deur] openen"
          - "[<doe>] [de] garage[ ][deur] <open>"
          - "<zou> [de] garage[ ][deur] ((<open> willen | <open> kunnen | <open>[ ])<doe>|openen)"
          - "<zou> [de] garage[ ][deur] (kunnen|willen) [<open>[ ]<doe>|openen]"
        response: "cover_device_class"
        slots:
          device_class: "garage"
          domain: "cover"

#

I think I initially just copied this from the EN version

scarlet echo May 31, 2024, 8:21 AM

#

requires_context (in sentences definition) is for when:

you use a {name} in the sentence and want to make sure the sentence matches a certain domain (although that can be enforced through the filename - domain_IntentName), device_class etc.
you want to make sure that the satellite used had an area assigned, to treat sentences like area-aware without the user naming the area

#

slots (in sentences definition) is for when you want to specify a certain slot value without the user saying it. for example

- sentences:
    - "start a half hour timer"
  slots:
    minutes: 30

#

this will populate the slot value with a value you specify, then will hand it over to the conversation agent to use it
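
The difference between the two keys can be modelled in a few lines. This is a simplified sketch and not the matcher's real code: `match_entry` is a hypothetical function, and it assumes the sentence text itself has already matched. `requires_context` acts as a filter on the incoming context, while `slots` injects fixed values into the result.

```python
from typing import Optional


def match_entry(entry: dict, context: dict) -> Optional[dict]:
    """Return the injected slots if the entry's context requirements are met."""
    # requires_context: every listed key must be present with the expected value.
    for key, expected in entry.get("requires_context", {}).items():
        if context.get(key) != expected:
            return None
    # slots: fixed values handed on without the user saying them.
    return dict(entry.get("slots", {}))


timer_entry = {
    "sentences": ["start a half hour timer"],
    "slots": {"minutes": 30},
}
curtain_entry = {
    "sentences": ["open [de|het] <curtain> <in> <area>"],
    "requires_context": {"device_class": "curtain", "domain": "cover"},
}

print(match_entry(timer_entry, {}))    # slots are injected regardless of context
print(match_entry(curtain_entry, {}))  # no context supplied, so no match
print(match_entry(curtain_entry, {"device_class": "curtain", "domain": "cover"}))
```

In this model, a sentence with `requires_context` does not match when no context is supplied, while `slots` entries always pass their values on.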

hollow silo May 31, 2024, 8:24 AM

#

ah thanks

scarlet echo May 31, 2024, 8:25 AM

#

there's a slight issue with the context in tests, as far as i see it. you can't send context without expecting it as a slot, so the sentences MUST send out context as a slot, and i see that as a bug which I have tried correcting https://github.com/home-assistant/intents/pull/2142

#

basically, at the moment, input context should be the same as output context in a test

#

which should not be the case, as i see it

scarlet echo May 31, 2024, 12:58 PM

#

I'm not sure who needs to hear this, but the new code review bot in the intents repo seems very good. A welcome addition, thanks!

hollow silo May 31, 2024, 2:25 PM

#

yeah, it's active on all Home Assistant repos, but it seems to provide good information

fair hill Jun 1, 2024, 9:16 AM

#

How can I apply to become a language leader? I see that for DE there are quite a few open PRs that have neither comments nor reviews. To distribute the workload I'd like to help out and join the current leaders.

pseudo bobcat Jun 3, 2024, 2:52 PM

#

Hi all, apologies if it's the wrong place to ask a Voice question. I have voice working from an ESPHome device to HA. The wake words are running fine, but when the command is spoken, HA doesn't detect the end of the sentence (long pauses until timeout). The phrase is correct, there's just lots of silence.

#

is there a setting I can butcher to experiment more?

compact gate Jun 3, 2024, 3:10 PM

#

More for #voice-assistants-archived

#

You can try tweaking these:

  noise_suppression_level: 2
  auto_gain: 31dBFS
  volume_multiplier: 2.0

#

And: https://www.home-assistant.io/voice_control/troubleshooting/#to-tweak-the-assist-audio-configuration-for-your-device

pseudo bobcat Jun 3, 2024, 3:48 PM

#

compact gate You can try tweaking these: ``` noise_suppression_level: 2 auto_gain: 31dBFS...

Thank you!

little robin Jun 6, 2024, 10:22 AM

#

Dear @neon swan, not sure if this is the right way to contact you. I'm working on a mobile VA device (https://discord.com/channels/429907082951524364/1171818251983011920). For this I'd like to create a solution that lets HA talk to the VA directly, without the need to answer a question 😄. I know this can be done using the media_player. The thing is that I don't want to add a heavy component like that in ESPHome just to get announcements from HA. I have been looking into the HA core code and I believe that the current architecture has everything that is needed to allow this.
But I'd like to confirm what I figured out and talk about how to implement this new feature.

rotund frigate Jun 6, 2024, 3:39 PM

#

Hello

So I managed to have the answer to my query on the Atom Echo from GPT (OpenAI Extended) spoken either via an mp3 sent to it or directly, with the text being spoken by Alexa's voice on my Echo Dot. The thing is that I did all this by modifying the component's code. If possible, I'd like to do this directly from the Atom Echo YAML config and send the text of the answer via the notify service that makes a request to the Echo Dot, but I don't know how to retrieve the text of the answer. I only manage to get the audio, or I can also get the text of what I've said, i.e. my STT.

To get the text of what I have said, it is

on_stt_end:
  then:
    - lambda: id(stt_text) = x.c_str();

I tried tts_text but it's not working. Does anyone have a clue? Thanks

compact gate Jun 6, 2024, 4:04 PM

#

please don't crosspost, and this is not a dev question

#

you'll get more support for that over in the ESPHome Discord

#

https://discord.gg/nmvbJXED

calm sleet Jun 6, 2024, 10:39 PM

#

How do folks reviewing new sentences feel about using an LLM to generate alternative sentences? If I'm not blindly committing but instead using it to come up with alternative natural sounding permutations, will that be accepted?

#

After the new release talking about sticking an LLM in to parse intents in an online system, I got thinking that some of that flexibility could be achieved offline by letting the LLM come up with sentence structures for each intent. So basically the inverse.

#

Example PR: https://github.com/home-assistant/intents/pull/2219

scarlet echo Jun 7, 2024, 6:52 AM

#

if it makes sense, nobody cares you used AI to generate alternative sentences
Mike's initial plan with the intents repo was precisely to have grammatically correct sentences for doing stuff and then to have some ML model come up with new ones based on them

scarlet echo Jun 7, 2024, 6:53 AM

#

calm sleet Example PR: https://github.com/home-assistant/intents/pull/2219

that PR seems good, i'll formally review it on Github today

calm sleet Jun 7, 2024, 12:45 PM

#

Great! Thanks. I just wanted to make sure it wasn’t against some policy. I’ll carry on. I have a few other PRs in draft to expand sentences quite a bit. I also noticed that there’s a lot of repeated patterns between files. I’ve got another branch started to kind of refactor that a bit. I’ll make sure to keep the patches manageable though.

scarlet echo Jun 7, 2024, 12:47 PM

#

calm sleet Great! Thanks. I just wanted to make sure it wasn’t against some policy. I’ll ca...

i've left some feedback. you might want to apply it throughout the things you want to modify
please don't create huge PRs. one per domain or new piece of functionality seems about right

calm sleet Jun 7, 2024, 12:52 PM

#

Yea. That sounds good. Mind having some discussion about the feedback here?

Mostly around precise grammar structure vs same intent. Targeting precision rather than capturing intent seems unintuitive to me given that it will force users to speak very precisely. Some users may have different grasps of the English language.

For example: If my toddler says: “is all the shades open?” It is not correct grammar, however the intent is unambiguous. It feels to me that it would be best to respond to this kind of query with the expected intent all the same.

scarlet echo Jun 7, 2024, 12:53 PM

#

fo' shizzle

scarlet echo Jun 7, 2024, 12:54 PM

#

calm sleet Yea. That sounds good. Mind having some discussion about the feedback here? Mos...

joking aside, i don't know what to say about that. I think sentences should be grammatically correct, as @ivory vessel initially wanted them. if that changed, he'll let us know

#

but to answer the initial part of your question, this is a good place to discuss such issues. the best place would be here https://github.com/home-assistant/intents/discussions

calm sleet Jun 7, 2024, 1:00 PM

#

Thanks for the pointer. And for this topic, probably specifically here: https://github.com/home-assistant/intents/discussions/871

scarlet echo Jun 7, 2024, 1:01 PM

#

i guess that's a good place and there are already people engaged there

ivory vessel Jun 7, 2024, 5:15 PM

#

scarlet echo joking aside, i don't know what to say about that. I think sentences should be g...

Some slightly "incorrect" sentences are fine, like subject/verb disagreement. If I could get a spare moment, I could try and implement the second stage of the plan and train a small machine learning model on the existing sentences 😄

calm sleet Jun 7, 2024, 9:31 PM

#

Ah, ok. Thanks for the guidance! I've got a few PRs that I'm working on to hopefully make a lot of sentences more general. I've found that it still has the feeling of needing magic incantations while the sentence count is small. Expanding to cover more variants, even grammatically incorrect ones, will help with positive response rates.

verbal arch Jun 20, 2024, 9:39 PM

#

Discuss m5 Atom stuff here? I've had 2 working as voice assistants for 9 months, but now one never detects speech. I've factory reset, rebuilt firmware and uploaded it, power cycled, whatever I could think of. It sees and logs button presses but as for speech it just stays in the WAITING_FOR_VAD state forever, whereas the other unit "detects speech" even in a quiet empty room every 5 seconds. Does the hardware just die?

scarlet echo Jun 21, 2024, 5:56 AM

#

verbal arch Discuss m5 Atom stuff here? I've had 2 working as voice assistants for 9 months,...

#voice-assistants-archived

smoky forge Jun 26, 2024, 11:01 PM

#

Is it possible to have Assist either (1) not respond with a vocal answer, or (2) respond with just saying "Done".

I know what I've asked it to do, so I really don't need to be informed that switch has been turned on when that's what I've asked it to do.

fallow cedar Jun 27, 2024, 2:47 PM

#

does home assistant support streaming chunked TTS responses?
Seems like it's waiting for the full file to download and then it plays back instead of playing immediately

ivory vessel Jun 28, 2024, 2:52 PM

#

smoky forge Is it possible to have Assist either (1) not respond with a vocal answer, or (2)...

Not yet, but it's been proposed to have 3 options: (1) always respond, (2) only respond when targeting things outside the current area, and (3) no responses.

ivory vessel Jun 28, 2024, 2:53 PM

#

fallow cedar does home assistant support streaming chunked TTS responses? Seems like it's wai...

No, the TTS system is tied in with the media system which is file/URL based.

fallow cedar Jun 28, 2024, 2:55 PM

#

ivory vessel No, the TTS system is tied in with the media system which is file/URL based.

Do you know if there are plans to change this? I figure it would improve the interaction with voice assistants (especially when generation is local)

ivory vessel Jun 28, 2024, 3:03 PM

#

fallow cedar Do you know if there plans to change this? I figure it would improve the interac...

No plans for now. It would be a major overhaul to the TTS and media integrations. Right now, everything assumes a complete file.

compact gate Jun 28, 2024, 4:00 PM

#

smoky forge Is it possible to have Assist either (1) not respond with a vocal answer, or (2)...

I've used conversation triggers and intents only because I want to control the response. Ideally, I could just choose a tone to emulate Alexa

blissful cave Jun 28, 2024, 5:50 PM

#

compact gate I've used conversation triggers and intents only because I want to control the r...

It won't squelch the tts response but you can use wyoming protocol to send an event that HA listens to and responds with an mp3 file

#

So mine, for example, uses a little python blip to pick a random file from a folder of Star Trek computer beeps when it hears the wake word and when it is done transcribing STT. You could do something similar at any point in the pipeline
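
A minimal sketch of what such a "python blip" could look like. The function name, folder layout and extension list are assumptions for illustration, not the code actually used above:

```python
import random
from pathlib import Path
from typing import Optional, Sequence


def pick_random_sound(
    folder: str, extensions: Sequence[str] = (".mp3", ".wav")
) -> Optional[Path]:
    """Return a random audio file from `folder`, or None if there is none.

    Hypothetical helper: how the chosen file is then played (media_player,
    Wyoming event, ...) is left out on purpose.
    """
    candidates = [
        p
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    ]
    return random.choice(candidates) if candidates else None
```

Returning None for an empty folder lets the caller simply skip the sound instead of raising in the middle of a pipeline run.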

blissful cave Jun 28, 2024, 5:52 PM

#

ivory vessel No plans for now. It would be a major overhaul to the TTS and media integration...

Are there any already available open source solutions that we might be able to explore as a stop-gap or tack-on solution until something more holistic can be drawn up?

#

I know in my own home latency between end of stt and response is the biggest friction point right now so it would be great to explore this as one avenue towards better responsiveness. I haven't made any contributions to the project yet but I'd be happy to jump in now that I'm happy with where my individual setup is

compact gate Jun 28, 2024, 5:55 PM

#

blissful cave It won't squelch the tts response but you can use wyoming protocol to send an ev...

I looked into some ways to do it and all seemed kind of hacky, so I settled for 'Done' (compared to 'turned on input underscore Boolean', which is just silly 🙂 )

blissful cave Jun 28, 2024, 6:06 PM

#

compact gate I looked into some ways to do it and all seemed kind hacky, so I settled for 'Do...

Hmmm yeah now that I look, it doesn't seem like there's a way to specifically play an event when it succeeds at the action it performed. Just when it succeeds at listening to you, or generating a response, which may not in itself be indicative of success

#

That being said, I think you can get what you want by playing a success noise on one of those two conditions and returning no speech from the intent. That way, if there's an error firing or finding the intent, it will still inform you verbally, but if it succeeds, it will play a success noise and then should proceed silently

#

I'd be happy to share the code to get that working but I don't want to spam the dev chat 🙂

compact gate Jun 28, 2024, 6:16 PM

#

Appreciate the offer, but I'm okay with what I have for now. I'm more focused on walk-up-and-talk reliability across all my devices for now

fallow cedar Jun 28, 2024, 6:27 PM

#

blissful cave Are there any already availible open source solutions that we might be able to e...

If you find one, let me know!
I thought music assistant might offer a way but I need to check further into it.

blissful cave Jun 28, 2024, 6:35 PM

#

fallow cedar If you find one, let me know! I thought music assistant might offer a way but I ...

It could be a great idea to bark up their tree. Probably a lot of experience among their contributors hacking media_player to extend functionality

#

I assume that's why the integration makes duplicates the way it does

fallow cedar Jun 28, 2024, 6:45 PM

#

blissful cave It could be a great idea to bark up their tree. Probably a lot of experience amo...

I'm going to modify the OpenAI TTS extension to use chunked audio and will try to pass it to a Music Assistant entity... let's see!

fallow cedar Jun 28, 2024, 7:19 PM

#

fallow cedar im going to modify the open ai tts extension to use chunked audio and will try t...

I was a little too hopeful, there's still some more work needed to make that work.

delicate pike Jun 29, 2024, 9:13 AM

#

@ivory vessel first of all, thank you for the great work on Assist (I've been following you since Rhasspy), and thanks to the whole team here too. I'm currently experimenting with a quite small LLM without a GPU: Ollama gemma:2B (the system runs on a server with 8GB of memory). Without any "sensors" data in the template, the response time is acceptable (of course, due to the limits of the system, it takes a couple of milliseconds to respond). With the sensor data added to the template in the context, it takes quite a bit longer (sometimes minutes; from the tests I've done, it is roughly proportional to the amount of data in the context).

I thought it might be possible to combine Gemma with Assist. The basic idea is to use Assist to detect the user intent and provide the context to Gemma, which formulates the response (user: "what is the temp. in Living Room" -> Assist gets the data for the Living Room and provides the context to Gemma, or user: "turn on Entrance light" -> Assist gets the intent and does the action -> Gemma gets the result context). Since the intent detection passes through Assist, it would also be possible to control the house this way.

I know this probably isn't a real "AI", or rather it is a kind of AI team implementation, but it could help keep everything local and give similar performance to the current OpenAI implementation, even on limited systems. When the "Conversation" detects a "general topic" intent, such as user: "why the sky is blue" -> Assist would not find the device "sky" -> the user input should go to Gemma directly. I could invest some time playing around with it. This is just to share the idea with you guys. Thanks once again for the great work.
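
The routing described above could be sketched roughly like this. Everything here is hypothetical: `match_intent` stands in for Assist's intent recognition, `ask_llm` for a call to Gemma via Ollama, and `toy_match_intent` with its hard-coded value is only a placeholder so the sketch runs on its own:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IntentResult:
    intent: str    # name of the matched intent
    context: dict  # small result context to hand to the LLM


def build_prompt(text: str, result: IntentResult) -> str:
    """Give the LLM only the data relevant to this one request."""
    return f"User: {text}\nIntent: {result.intent}\nData: {result.context}"


def handle_utterance(
    text: str,
    match_intent: Callable[[str], Optional[IntentResult]],
    ask_llm: Callable[[str], str],
) -> str:
    result = match_intent(text)
    if result is None:
        # No intent matched ("why is the sky blue"): pass the input through.
        return ask_llm(text)
    # Intent matched: the LLM only phrases the answer from a small context.
    return ask_llm(build_prompt(text, result))


def toy_match_intent(text: str) -> Optional[IntentResult]:
    """Placeholder matcher with made-up data, for illustration only."""
    if "temperature" in text.lower():
        return IntentResult("HassGetState", {"area": "living room", "state": 21.5})
    return None
```

The point of the design is that the prompt stays small: the LLM sees one intent result rather than every sensor in the template, which is where the long response times came from.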

ivory vessel Jun 30, 2024, 12:56 AM

#

blissful cave Are there any already availible open source solutions that we might be able to e...

I think any solution would need to exist outside HA's TTS system (either as a custom integration or externally). The Wyoming protocol that's used for Piper is streaming by default, so it's at least possible to create something based on it. The next version of Piper will include streaming voices as well, so we're getting closer.

ivory vessel Jun 30, 2024, 1:02 AM

#

delicate pike @ivory vessel first of all thank you for the great work on Assist (I'm ...

I'm very interested in these sorts of experiments 🙂
Assist could definitely be used to detect the intent, though (as everyone knows) it's fairly rigid. Some ideas I've had for doing this differently:

- Use the LLM via text to categorize the intent (slow)
- Use the LLM to get an embedding of the user's sentence and compare it with pre-computed embeddings of the various intents (faster)
- Use a tiny BERT model to train an intent classifier (should be even faster)

What kind of server are you running?
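
The second idea (comparing an embedding of the user's sentence with pre-computed intent embeddings) could be sketched as follows. This is only an illustration: `embed` is a toy bag-of-words stand-in for a real LLM embedding, and the example sentences and the 0.5 threshold are assumptions:

```python
import math
from collections import Counter
from typing import Optional


def embed(sentence: str) -> Counter:
    """Toy stand-in for an LLM embedding: a bag-of-words count vector."""
    return Counter(sentence.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Computed once ahead of time, so only the user's sentence is embedded
# at request time. One illustrative sentence per intent.
INTENT_EMBEDDINGS = {
    "HassTurnOn": embed("turn on the light"),
    "HassTurnOff": embed("turn off the light"),
    "HassGetState": embed("what is the temperature"),
}


def classify(sentence: str, threshold: float = 0.5) -> Optional[str]:
    """Return the closest intent, or None if nothing is similar enough."""
    query = embed(sentence)
    best_intent, best_score = max(
        ((name, cosine(query, emb)) for name, emb in INTENT_EMBEDDINGS.items()),
        key=lambda pair: pair[1],
    )
    return best_intent if best_score >= threshold else None
```

With a real embedding model only `embed` would change; the lookup stays a cheap similarity search, which is why this variant should be faster than asking the LLM to categorize the text.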

blissful cave Jun 30, 2024, 6:07 AM

#

the "media_content_type" attribute of media_player might be a good axis for that tts convo from earlier.

#

Might provide an angle to extend some tts optimizations into that integration at least. It would probably make sense from an end-user perspective as well to offer them an intended mode for voice, since media_player will increasingly be used for tts

delicate pike Jun 30, 2024, 6:49 AM

#

ivory vessel I'm very interested in these sorts of experiments 🙂 Assist could definitely be...

I’m running it on a Fujitsu Server Primergy Tx140 S1 (an old and cheap server). HA runs on Docker (it’s a supervised version), and the OS is Debian Server 11 (amd64). Ollama is installed directly on the OS, so I use localhost to connect to it. I’m thinking of opening a branch soon and starting to work on it, if you don’t mind.
I also ordered an AMD GPU just in case, but the target is to get it working smoothly. I can assure you (I can do a demo video to show you what I mentioned) that when Ollama gets only, for example, the stats of the sun position, the response is handled in a few milliseconds. I’m currently studying the docs of LLM.py.