#I'll post in here since I don't want to flood the channel anymore


chilly timber
#

hey can i ask which terminal you are running this in, what OS, what shell

#

@livid spear

livid spear
#

shell

#

python 3.11

#

windsurf

chilly timber
#

Windows?

livid spear
#

mac

chilly timber
#

can you run the program again to the point where it gets stuck and leave it like that

livid spear
#

sorry not bash zsh

chilly timber
#

and then open another terminal -- and find the process ID? something like ps aux | grep -i python and one of them might be the one you are running (that is stuck)

#

once you are sure you have the right process ID

#

can you kill -SIGINT process_id_here

#

then take a look at the program that is stuck and see if anything changed
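For reference, the `kill -SIGINT process_id_here` step above can also be scripted from Python. This is a minimal sketch, assuming a POSIX system (macOS/Linux). The child process is only a stand-in for the stuck program, not the ElevenLabs demo, and `send_sigint` and `demo` are hypothetical helper names.

```python
import os
import signal
import subprocess
import sys

# Stand-in for the "stuck" program: it installs a SIGINT handler,
# reports that it is ready, then sleeps for a long time.
CHILD_CODE = """
import signal, sys, time
signal.signal(signal.SIGINT, lambda sig, frame: sys.exit(0))
print("ready", flush=True)
time.sleep(60)
"""


def send_sigint(pid: int) -> None:
    """Python equivalent of `kill -SIGINT <pid>` (hypothetical helper)."""
    os.kill(pid, signal.SIGINT)


def demo() -> int:
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        text=True,
    )
    # Wait until the child has installed its handler before signalling it.
    child.stdout.readline()
    send_sigint(child.pid)
    # If the child reacts to the signal, wait() returns quickly.
    # A truly stuck process would raise subprocess.TimeoutExpired here.
    returncode = child.wait(timeout=10)
    child.stdout.close()
    return returncode


if __name__ == "__main__":
    print("child exit code:", demo())
```

If `wait()` times out instead of returning, the process received the signal but did not act on it, which is the situation being debugged in this thread.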

#

i'm gonna see what happens when I try their example code. I've never used their example code so I'm curious now

#

okay i had to do it in windows due to audio

#

but... i'm getting what you get with their example--- control+c does nothing

#

but the call happens

#

gonna see how to make it work

#

@livid spear hey you still there

livid spear
#

yeah

#

sorry

#

trying a few things

chilly timber
#

np still looking into things too

#

is your program 1 file or split up

livid spear
#

1 file for this

chilly timber
#

can you take a look at this ```python
import os
import signal
import sys
import threading
import time

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

def main():
    AGENT_ID = "redacted"
    API_KEY = "redacted"

    if not AGENT_ID:
        sys.stderr.write("AGENT_ID environment variable must be set\n")
        sys.exit(1)

    if not API_KEY:
        sys.stderr.write("ELEVENLABS_API_KEY not set, assuming the agent is public\n")

    client = ElevenLabs(api_key=API_KEY)
    conversation = Conversation(
        client,
        AGENT_ID,
        # Assume auth is required when API_KEY is set
        requires_auth=bool(API_KEY),
        audio_interface=DefaultAudioInterface(),
        callback_agent_response=lambda response: print(f"Agent: {response}"),
        callback_agent_response_correction=lambda original, corrected: print(f"Agent: {original} -> {corrected}"),
        callback_user_transcript=lambda transcript: print(f"User: {transcript}"),
        # callback_latency_measurement=lambda latency: print(f"Latency: {latency}ms"),
    )

    shutdown_flag = False
    conversation_id = None

    def signal_handler(sig, frame):
        nonlocal shutdown_flag
        print(f"\nReceived signal {sig}. Shutting down...")
        shutdown_flag = True
        conversation.end_session()

    # Register signal handlers
    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    conversation.start_session()
    print("Conversation started. Press Ctrl+C to exit.")

    # Non-blocking wait - check periodically if we should exit
    try:
        while not shutdown_flag:
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nKeyboardInterrupt received. Shutting down...")
        conversation.end_session()

    # Try to get conversation ID with a short timeout
    def get_conversation_id():
        nonlocal conversation_id
        try:
            conversation_id = conversation.wait_for_session_end()
        except Exception:
            pass

    id_thread = threading.Thread(target=get_conversation_id, daemon=True)
    id_thread.start()
    id_thread.join(timeout=2.0)  # Wait max 2 seconds for conversation ID

    if conversation_id:
        print(f"Conversation ID: {conversation_id}")

    print("Program terminated.")


if __name__ == "__main__":
    main()```
#

and adapt it to your code

#

there are 2 new imports (time, threading)

#

and then maybe 10-20 lines of code otherwise

#

their example code might be bugged or sensitive to the environment... i'm not sure. I could test it on a mac

#

the main differences are the 2 imports

#

and then-----> (see below)

#

stuff starting from shutdown_flag

#

ending right before conversation.start_session()

#

and then again starting the next line ("print") and extending through that try/catch and all the way to termination message

#

that should make it more robust to be able to actually end. right now it's just stuck due to some kind of deadlock/sync issue, or maybe something else. once you get out of the convo, you'll be able to do whatever afterwards in the code
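The "wait max 2 seconds" part of that code is a general pattern: run the call that might block in a daemon thread and only `join` it for a limited time. A minimal sketch with stand-in functions, where `call_with_timeout`, `fast_call`, `stuck_call`, and the ID `"conv_123"` are all made up for illustration and no ElevenLabs code is involved:

```python
import threading
from typing import Callable, Optional


def call_with_timeout(fn: Callable[[], str], timeout: float) -> Optional[str]:
    """Run fn() in a daemon thread. Return its result, or None if it has
    not finished within `timeout` seconds."""
    result: dict = {}

    def worker() -> None:
        result["value"] = fn()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout=timeout)  # stop waiting after `timeout` seconds
    return result.get("value")


def fast_call() -> str:
    return "conv_123"  # made-up ID for illustration


def stuck_call() -> str:
    threading.Event().wait()  # blocks forever, like a call that never returns
    return "never reached"


if __name__ == "__main__":
    print(call_with_timeout(fast_call, timeout=2.0))
    print(call_with_timeout(stuck_call, timeout=0.5))
```

Because the worker is a daemon thread, a call that never returns does not keep the interpreter alive once the main thread finishes.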

#

i could adapt your code for you (at least the snippet you sent me) --- but the imports you'll have to add

#

here:```python
# NOTE: needs os, signal, threading, time and the elevenlabs imports at the top of the file,
# plus a module-level eleven_client
def start_elevenlabs_sdk_conversation(questions: str, agent_id: str = 'place_holder'):
    """
    Starts an intake conversation using the ElevenLabs Conversational AI SDK.
    This version leverages ElevenLabs' built-in conversation management.
    """
    if eleven_client is None:
        print("ElevenLabs client not initialized. Cannot start SDK conversation.")
        return
    print("ElevenLabs client initialized. Starting SDK conversation.")

    dynamic_vars = {
        "list_of_questions": f"{questions}"
    }
    print(f'dynamic_vars: {dynamic_vars}')
    config = ConversationInitiationData(
        dynamic_variables=dynamic_vars,
    )
    conversation = Conversation(
        eleven_client,
        agent_id,
        config=config,
        requires_auth=bool(os.getenv('ELEVENLABS_API_KEY')),  # This checks if key is in env
        audio_interface=DefaultAudioInterface(),
        callback_agent_response=lambda response: print(f'Agent: {response}'),
        callback_agent_response_correction=lambda original, corrected: print(f'Agent: {original} -> {corrected}'),
        callback_user_transcript=lambda transcript: print(f'User: {transcript}'),
    )

    shutdown_flag = False
    conversation_id = None

    def signal_handler(sig, frame):
        nonlocal shutdown_flag
        print(f"\nReceived signal {sig}. Shutting down...")
        shutdown_flag = True
        conversation.end_session()

    # Register signal handlers
    signal.signal(signal.SIGINT, signal_handler)
    signal.signal(signal.SIGTERM, signal_handler)

    conversation.start_session()
    print("Conversation started. Press Ctrl+C to exit.")

    # Non-blocking wait - check periodically if we should exit
    try:
        while not shutdown_flag:
            time.sleep(0.1)
    except KeyboardInterrupt:
        print("\nKeyboardInterrupt received. Shutting down...")
        conversation.end_session()

    # Try to get conversation ID with a short timeout
    def get_conversation_id():
        nonlocal conversation_id
        try:
            conversation_id = conversation.wait_for_session_end()
        except Exception:
            pass

    id_thread = threading.Thread(target=get_conversation_id, daemon=True)
    id_thread.start()
    id_thread.join(timeout=2.0)  # Wait max 2 seconds for conversation ID

    if conversation_id:
        print(f"Conversation ID: {conversation_id}")

    print("Program terminated.")```
#

some of the indentation might be fucked up though sorry. hopefully just a space or two or space vs tab

#

@livid spear let me know if you have questions about that or if it works/ or you choose another solution

livid spear
#

yes implementing it rn

#

still freezes

#

problem is conversation.end_session() or conversation_id = conversation.wait_for_session_end()

chilly timber
#

can you show me the current code you are running please

#

there should be a straightforward fix for this

livid spear
#

when the convo is done is when it freezes

#

so when the convo gets complete nothing in the terminal will work

#

which is fine

chilly timber
#

wait wait wait.

#

before it is done, and you do control+c, it DOES end?

livid spear
#

yes

chilly timber
#

yeah this is kool. best guess is something when it ends creates a deadlock

#

(two things waiting on each other)

#

but can I still see what the current code looks like just in case

livid spear
#

ya

#

I literally c/p your code

#

the second function

chilly timber
#

remind me, without that signal part, what happened when you did press control+c

#

program would respond by exiting right

livid spear
#

ya

#

I think I may have solved it

chilly timber
#

what worked

livid spear
#

I gotta.

#

c

chilly timber
#

i'm curious

livid spear
#

but on the website

#

there are 2 ways to cancel the call

#

end_call tool (auto-chosen)

#

or in the system prompt

#

I think they are conflicting

#

like both are getting triggered?

#

end_call tool gets triggered when all questions are answered

chilly timber
#

hmmmm. no those should work together

#

the system message tells it the tool is there, reinforces it

#

the tool is actual tool -- this leads to the thing running the bot/llm/etc. to respond

livid spear
#

ctrl + C works when end_call is turned off

chilly timber
#

interesting!

livid spear
#

and all questions are answered

chilly timber
#

okay

#

so

livid spear
#

but I still need it to end the call without me hitting ctrl + C

#

any ideas for that?

#

is that .end_session()?

chilly timber
#

what I'm hearing is that whenever whatever is controlled by the end_call tool is turned on, and the convo ends, then it gets stuck. when this end_call option is off in the UI, it does respond

#

if that's true, reproducible, etc., would be worthy of a bug report

#

but let me see what you just asked

#

ideas I have if you cannot use control+c (which I think is still an issue to figure out)

#

there is an option where it hangs up due to silence

#

but that might trigger the same thing on the back side if that makes sense

#

worth a try?

livid spear
#

so should I turn it off?

#

end_call needs to be turned on

#

the sys prompt can't turn it off

chilly timber
#

well i'd say figuring out why it gets stuck is probably the "correct" way, but.... might take a while and is hard to do like this over the internet on discord.

#

if you use the silence criteria to end call from the 11labs side, then that might still lead to the same issue as end_call option

#

i'm gonna try to reproduce what you said but for me, control+c anytime just didn't work I think...

livid spear
#

so ur suggestion is turn it off?

chilly timber
#

if you can turn it off and set the silence to whatever you are okay with and test it... that could work for you

#

but like I said, it might still lead to the same problem in your code. you could just try. or wait for someone else's opinion. not trying to waste your time

livid spear
#

no waste at all

#

ya I will turn off silence

#

and experiment

chilly timber
#

the thing is, my control+c is stuck the whole time on windows console

livid spear
#

ya mine was too

chilly timber
#

not just when the call ends

livid spear
#

had to make new terminal

chilly timber
#

gonna try to use a debugger.

#
```python
    def wait_for_session_end(self) -> Optional[str]:
        """Waits for the conversation session to end.

        You must call `end_session` before calling this method, otherwise it will block.```
#

soo that's a problem

#

that we are feeling

#

wait_... is clearly called before end_session is called, because the latter is only triggered by the control+c signal handling...

#

i guess block until what..... maybe that signal. so perhaps it isn't implying an issue
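One way to picture the "call `end_session` first, otherwise it will block" contract is a toy class built on `threading.Event`. This is only a sketch of the general pattern and is NOT how the ElevenLabs library is implemented. `ToySession` is hypothetical, and its `timeout` parameter is an addition for the demo (the real method shown above takes no arguments).

```python
import threading
from typing import Optional


class ToySession:
    """Hypothetical stand-in, not the ElevenLabs implementation."""

    def __init__(self, session_id: str) -> None:
        self._session_id = session_id
        self._ended = threading.Event()

    def end_session(self) -> None:
        # Marks the session as ended and wakes up any waiter.
        self._ended.set()

    def wait_for_session_end(self, timeout: Optional[float] = None) -> Optional[str]:
        # Blocks until end_session() has been called or the timeout expires.
        if self._ended.wait(timeout):
            return self._session_id
        return None


if __name__ == "__main__":
    session = ToySession("toy-1")
    # Nothing has ended the session yet, so this gives up after the timeout.
    print(session.wait_for_session_end(timeout=0.1))
    # Something else (here a timer thread) ends the session while we wait.
    threading.Timer(0.1, session.end_session).start()
    print(session.wait_for_session_end(timeout=5))
```

In this toy version the waiter only returns once some other piece of code calls `end_session()`. If nothing ever does, a wait without a timeout never returns.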

chilly timber
#

OKAY

#

mac, example code, control+c works fine... but it just sometimes takes a second or two to get accepted and end the program

chilly timber
#

@livid spear probably my last attempt of the day. have to do other stuff sorry. if you have a chance to adapt your code to this and try, let me know how it goes:

```python
import os
import signal
import sys
import threading
import time

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

AGENT_ID="redacted"
API_KEY="redacted"

def main() -> None:
    client = ElevenLabs(api_key=API_KEY)

    conversation = Conversation(
        client,
        AGENT_ID,
        requires_auth=bool(API_KEY),
        audio_interface=DefaultAudioInterface(),
        callback_agent_response         = lambda r: print(f"Agent: {r}"),
        callback_agent_response_correction = lambda o,c: print(f"Agent: {o} -> {c}"),
        callback_user_transcript        = lambda t: print(f"User: {t}"),
    )
    conversation.start_session()

    # Run wait_for_session_end() in a daemon thread -----------------
    def _block_until_done() -> None:
        conv_id = conversation.wait_for_session_end()
        print(f"\nConversation ID: {conv_id}")

    waiter = threading.Thread(target=_block_until_done, daemon=True)
    waiter.start()
    # ----------------------------------------------------------------

    def _sigint_handler(sig, frame):
        print("\nCtrl-C detected — ending session ...")
        conversation.end_session()   # ask ElevenLabs to shut down
        waiter.join(timeout=2.0)     # give it a moment to finish
        sys.exit(0)                  # hard-exit if it’s still stuck

    signal.signal(signal.SIGINT, _sigint_handler)

    # Keep the main thread alive but idle so it can process signals.
    try:
        while waiter.is_alive():
            time.sleep(0.2)
    except KeyboardInterrupt:
        pass    # we never actually get here because we exit in handler

if __name__ == "__main__":
    main()```
livid spear
#

Yes I will try that soon

#

Fyi I tried the silence

#

And the end call turned off

chilly timber
#

sorry I couldn't help. i would still personally see if I can confirm it really doesn't work on windows -- because it's worthy of a bug report if that remains the case

livid spear
#

Nothing worked

chilly timber
#

ahhh k

livid spear
#

Do you work for 11 labs?

chilly timber
#

nah. i've been just on vacation this week and doing stuff with it like discord bots and phone call bots and stuff

livid spear
#

Wurd

chilly timber
#

so it's kind of fresh but i never used their example code

livid spear
#

You write good code

#

Ty for helping

#

Not sure what to try after this

#

I can make another bot

#

I can try doing it thru JS on the frontend instead

chilly timber
#

well. there are some benefits to this. not all to you. like if I can show the example code as it is (with just env variables in there) doesn't work on a major platform, reproducible, then that is worthy of a bug report and a fix. and that means we did a good job

#

but not sure that is the case yet

#

i did do example code + mac but that DID work

white yarrowBOT
#
dv8s has been warned

Reason: Bad word usage

chilly timber
#

lol

livid spear
chilly timber
#

during the convo, I could control+c to get it to end. in windows, during or after the convo is done, it'll totally block (part of your issue I think)

#

(even if it is not windows, probably related)

#

it's deadlocked or eating up all the SIGINT or whatever signals

#

program just keeps waiting for something

#

the hard part of this kind of stuff is not knowing their library code. stuff like race conditions and stuff are hard to debug in general, especially if we are working on the outside of the issue. but I'm not gonna blame. might be a me/you issue

#

normally a program will be killable in the terminal, either from within (keyboard signals) or with external signals (like using kill). but sometimes the program is stuck stuck. most times ctrl+c or whatever signal will kill the program because the default handling of that signal does that. but if the program has its own handler for the signal, the OS just hands the signal to that handler and lets the program keep going. so if you manually handle it and then combine that with some kind of race condition or bug --- then all the signals will be "handled" and the program will still stay stuck in some loop possibly

#

if that makes sense (sorry no idea your level of programming knowledge)
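The difference between the default Ctrl+C behavior and a handler that "handles" the signal without stopping anything can be shown in a few lines. A minimal sketch, assuming Python 3.8+ (for `signal.raise_signal`) and that it runs in the main thread. The function names are made up for the demo.

```python
import signal
import time


def interrupt_self() -> None:
    """Deliver SIGINT to this process, as Ctrl+C in a terminal would."""
    signal.raise_signal(signal.SIGINT)
    time.sleep(0.05)  # give Python a moment to run the handler


def run_with_default_handler() -> str:
    # Python's default SIGINT handler raises KeyboardInterrupt.
    signal.signal(signal.SIGINT, signal.default_int_handler)
    try:
        interrupt_self()
    except KeyboardInterrupt:
        return "interrupted"
    return "kept running"


def run_with_swallowing_handler() -> str:
    received = []
    # This handler only records the signal and does nothing to stop the program.
    signal.signal(signal.SIGINT, lambda sig, frame: received.append(sig))
    try:
        interrupt_self()
    except KeyboardInterrupt:
        return "interrupted"
    finally:
        # Put the default behavior back so Ctrl+C works again afterwards.
        signal.signal(signal.SIGINT, signal.default_int_handler)
    return f"kept running after {len(received)} signal(s)"


if __name__ == "__main__":
    print(run_with_default_handler())
    print(run_with_swallowing_handler())
```

With the default handler the signal interrupts the code. With the custom handler the signal is received but the program carries on, which looks like `^C^C^C^C` doing nothing in the terminal.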

#

i tried to debug via vscode on Windows platform but it wouldn't let me enter the library code....

chilly timber
livid spear
#

will do 1 sec

#

do you think the problem is the code is in a function?

#

and shouldn't be?

chilly timber
#

not sure. that is part of what I was talking about before re not knowing the context of that function, because clearly it isn't the full code.

#

but

#

with all that we know now, not sure that's really the issue at all

#

i'd like to see what that code I mentioned above does. and if it doesn't work, i'd like to know exactly what you're running (minus stuff you can redact, don't care about that)

livid spear
#

ok 1 sec

#

yes no difference

#

Ctrl-C detected — ending session ...
^C^C^C^C

#

but the convo got ended in actuality, just not on the terminal

chilly timber
#

can I see what the code looks like now. and if you don't want me to show the calling code, then maybe tell me what it is

#

since clearly it's not the whole program right.... no main

livid spear
#

ya sure hold on

#
```python
elevenlabs_questions = ["What day is it?"]


def main() -> None:
    client = ElevenLabs(api_key=os.getenv("ELEVENLABS_API_KEY"))

    dynamic_vars= {
        "list_of_questions": f"""{elevenlabs_questions}"""
    }
    print(f'dynamic_vars: {dynamic_vars}')
    config = ConversationInitiationData(
        dynamic_variables=dynamic_vars, 
    )

    conversation = Conversation(
        client,
        agent_id,
        config = config,
        requires_auth=bool(os.getenv("ELEVENLABS_API_KEY")),
        audio_interface=DefaultAudioInterface(),
        callback_agent_response         = lambda r: print(f"Agent: {r}"),
        callback_agent_response_correction = lambda o,c: print(f"Agent: {o} -> {c}"),
        callback_user_transcript        = lambda t: print(f"User: {t}"),
    )
    conversation.start_session()

    # Run wait_for_session_end() in a daemon thread -----------------
    def _block_until_done() -> None:
        conv_id = conversation.wait_for_session_end()
        print(f"\nConversation ID: {conv_id}")

    waiter = threading.Thread(target=_block_until_done, daemon=True)
    waiter.start()
    # ----------------------------------------------------------------

    def _sigint_handler(sig, frame):
        print("\nCtrl-C detected — ending session ...")
        conversation.end_session()   # ask ElevenLabs to shut down
        waiter.join(timeout=2.0)     # give it a moment to finish
        sys.exit(0)                  # hard-exit if it’s still stuck

    signal.signal(signal.SIGINT, _sigint_handler)

    # Keep the main thread alive but idle so it can process signals.
    try:
        while waiter.is_alive():
            time.sleep(0.2)
    except KeyboardInterrupt:
        pass    # we never actually get here because we exit in handler

if __name__ == "__main__":
    main()```
chilly timber
#

ty

livid spear
#

literally ur code except I included dynamic vars

#

so my convo ends when one question gets asked

chilly timber
#

you must have imports right

#

but nvm that's okay

livid spear
#

hold on

#
```python
import os
import signal
import sys
import threading
import time

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface
from elevenlabs.conversational_ai.conversation import Conversation, ConversationInitiationData```
chilly timber
#

can you change _sigint_handler to be ```python
def _sigint_handler(sig, frame):
    print("[DEBUG] SIGINT handler entered")
    conversation.end_session()
    print("[DEBUG] end_session() returned")
    waiter.join(timeout=2.0)
    print("[DEBUG] waiter.join done – exiting")
    sys.exit(0)```

#

and tell me what's different during runtime, if anything -- thanks

#

@livid spear

livid spear
#

Yes just @ gym will get back to you in 45

chilly timber
#

don't skip leg day

livid spear
#

I think wait_for_session_end is the culprit

#

if you have a suggestion on how to end the call immediately

#

sys.exit(), conversation.session_end, etc.

#

all ears

#

because I can retrieve transcript

chilly timber
#

hmm the interesting thing is with a recent mac, their example code with only the keys being adjusted (agent id, etc.) --- it was interruptible using the default terminal program....

#

a few hours ago

#

and you are on mac

#

but zsh but not sure that matters in this case

#

use that, adjust the 2 variables (agent id, 11labs key) -- and see if that even works?

#

to do a basic sanity check....

#

i understand you won't be able to do more things that you want, but I mean just to see if the actual example code they provide with basic adjustments (the 2 replacements in the variables) --- work in your terminal/shell/environment

#

because if it doesn't and it worked for me.... then that can help

livid spear
#

do u think I can wrap all of that in a function

#

or it needs to be outside of one

chilly timber
#

hmm. well i think a hard part is not doing too many things at once.

#

i'd just be curious about what i said - -- i'd just copy that code (demo.py) -- adjust the keys (2 of them) and just run it and see what happens

#

that's a basic sanity check

#

sometimes when i'm coding, and it's going poorly, I just check basic assumptions --- to make sure i'm not too deep in the weeds in the wrong area

#

if that makes sense

#

but if you just assume you were a new user, cloned demo.py, replaced the things that it tells you to replace, and it doesn't work ---> this goes a long way

#

vs. using a highly edited/altered code with odd terminal or shell, etc. ---> then you don't know what went wrong

#

but what I linked above is from the official repo, in examples, and demo.py is probably the example you maybe started with

#

i'm just curious on your setup, what happens when you run minimally altered code from them (alterations are just making sure the keys are right)

livid spear
#

yes when the convo is done

#

it freezes terminal

#

in that example as well

chilly timber
#

are you using default terminal program of Mac

livid spear
#

yes

#

zsh

#

I also tried on bash didn't work

#

it worked on ur end?

#

did u ctrl c?

#

fyi turning on end call

#

is terrifying

chilly timber
#

mac os, control c, that example, Yes it quits

livid spear
#

asked it to end the convo 5 times and it wouldn't hahaha

#

it control c once the entire convo is done

chilly timber
#

as of a few hours ago

livid spear
#

or before?

chilly timber
#

before

#

i don't have end_call or anything

livid spear
#

yes before works for me too

#

but I need the convo to be finished once it is actually finished

#

so I can proceed w/ rest of code

chilly timber
#

well i didn't have the bot configured to ever be able to end the call though

#

what i'm saying is WITH THAT

#

it does end, the program

#

i can kill it w/ ctrl+c

#

and thus make it move on and do stuff

#

but dude, if you've already done that and I'm just not registering -- i'm sorry

#

if you can use the example, replace those values , and it fucks up --- then that's a clear issue

chilly timber
#

hey can you open terminal, run bash and then run the code? just in case it is zsh, doubt it

livid spear
#

yes I did no difference

#

this is a problem

#

signal.signal(signal.SIGINT, lambda sig, frame: conversation.end_session())

#

I don't want to use ctrl c to end the convo

#

I want it to end automatically (through voice)

#

when I just use conversation.start_session() I can use ctrl + C to end it

chilly timber
#

the thing is, the example code with appropriate replacements of the keys SHOULD do whatever is natural/expected

livid spear
#

I tried doing ```py
conversation.start_session()
conversation.end_session()
return
conversation.start_session()
sys.exit(1)```

livid spear
chilly timber
#

the library isn't simple though. on that backend, it's probably doing a bunch of stuff

livid spear
#

if I told it to stop it wouldn't end

chilly timber
#

this doesn't have anything to do with what you are telling it

livid spear
#

ran infinitely

chilly timber
#

it's that you can't kill the program lol

livid spear
#

I understand but I'm saying I can't use demo code

chilly timber
#

that's an independent issue

#

right

#

that's weird

livid spear
#

because it won't kill it through voice

chilly timber
#

because I could

livid spear
#

oh wow

#

let me try again

chilly timber
#

i killed it fine

livid spear
#

you use voice or ctrl c?

chilly timber
#

ctrl+c

livid spear
#

ok what about voice

#

because I need to get the transcript after

chilly timber
#

it can't do it with voice

livid spear
#

I can't ctrl c in production

chilly timber
#

i never want it to die via voice end_call

livid spear
#

why not

chilly timber
#

i dunno. for me it doesn't matter. the phone call ends when the person calling hangs up

#

point is

#

i can still stop it

#

without hanging up

livid spear
#

I don't understand

chilly timber
#

the conversation RELIES on your code running

#

that IS the conversation

livid spear
#

ok but what if user says they don't wanna talk anymore or they hang up

chilly timber
#

it needs to die when you ask it to die -- that is the normal way to stop it

livid spear
#

isn't that done through voice?

chilly timber
#

unless you want the agent to self-figure it out

#

and emit those tool calls

#

(end call, silence detection, etc.)

#

okay. if you kill the program, which is an expected thing to be able to do, it will end the call/conversation

#

I hope we can agree with that

#

if yes, then the fact that you cannot kill the program on your side, is a problem

#

and not necessarily seen by others

livid spear
#

there are certain scenarios I can kill it on my side

chilly timber
#

really?

#

because I thought you couldnt

livid spear
#

ctrl c before the agent auto-ending it

#

I am saying I can't kill it once the entire convo is finished

#

so when the agent does its job

#

it won't die

chilly timber
#

but the point is it's a program on your computer

#

you should be able to kill it

livid spear
#

when it doesn't reach conversation.end_session()

#

it can die

chilly timber
#

like any other program

livid spear
#

when it does I can't

chilly timber
#

so the fact that you cannot is a serious issue

livid spear
#

is there something I can do in code

chilly timber
#

what Mac OS version are you on

livid spear
#

like sys.exit(1) or whatever

#

ventura 13.0

#

I need to get the new one

chilly timber
#

when you start a program, it will run. if you are running it in a terminal, then if you do control+c, it'll have to affect the program

#

meaning it is sent to it via some posix signal

livid spear
#

yea I think so

chilly timber
#

you cannot get the point across to your program, running on your machine

#

because you can't stop it, you are frustrated that the further programmatic lines are not working

#

or it isn't ending

#

the point of the signal handler is so that when you send SIGINT or whatever to the program, it'll capture it and do something

#

vs just killing the program outright

#

if you normally send it, program will stop

#

meaning the next lines will NOT execute

#

the point of the handler is so it intercepts it

#

says "okay"

#

user wants me to die

#

./stop

#

then does more stuff

#

and ends
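The "intercept it, do more stuff, then end" flow described above can be sketched without any ElevenLabs code. This is a minimal illustration, assuming Python 3.8+ and the main thread. The loop raises SIGINT on itself as a stand-in for the user pressing Ctrl+C, and `run`, `handle_sigint`, and `stop_requested` are names made up for the demo.

```python
import signal
import threading
import time

stop_requested = threading.Event()


def handle_sigint(sig, frame) -> None:
    # Only record the request. The actual shutdown work happens in run().
    stop_requested.set()


def run() -> list:
    log = []
    stop_requested.clear()
    signal.signal(signal.SIGINT, handle_sigint)

    ticks = 0
    while not stop_requested.is_set():
        ticks += 1
        if ticks == 3:
            signal.raise_signal(signal.SIGINT)  # stand-in for Ctrl+C
        time.sleep(0.01)
    log.append("loop exited")

    # "Then does more stuff": this is where cleanup would go,
    # for example ending a session or saving a transcript.
    log.append("cleanup done")

    # Restore the default behavior before returning.
    signal.signal(signal.SIGINT, signal.default_int_handler)
    return log


if __name__ == "__main__":
    print(run())
```

Keeping the handler this small means the cleanup runs in normal code after the loop, not inside the signal handler itself.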

#

anyways, you can't blame the example code unless you follow closely. i mean you can , but it's harder

#

but it failed for you -- sorry

#

i think probably someone from the team will have to help you- -- sorry I wasted your time. hopefully all this helps.

chilly timber
#

i was just moving to the bar and hanging out with friends -- so i probably won't be able to help

white yarrowBOT
#
dv8s has been warned

Reason: Bad word usage

chilly timber
#

@chilly timber bookmark

#

@livid spear did you try like control+c like 10 times in a row? lol

#

like hitting it hard and frequently for like 5 seconds

livid spear
#

Yes

livid spear
#

@chilly timber r u a dev?

chilly timber
livid spear
#

but in general

#

software wise wdy do

#

where are you located?

#

@chilly timber

chilly timber
#

USA, Wisconsin. I'm on vacation right now ----- I am a professional in medicine. So not in the stuff in here.

#

sorry I didn't fix your issue.

livid spear
#

that's very interesting I am working on an AI application for healthcare rn

#

I'm an AI engineer

#

but you know a lot about coding

#

if ur writing threads

#

for a non-cs person..

#

if there are any automations needed for healthcare lmk I'm making some stuff for doctors rn

chilly timber
#

@livid spear i think this is similar to prior code but it did fix the issue -- before it I had the same problem as you on Windows via the traditional command prompt and git bash and powershell --- with this, it works

```python
import os
import signal
import sys

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

# Global flag to handle interruption
interrupted = False

def signal_handler(sig, frame):
    global interrupted
    interrupted = True
    print("\nInterruption signal received. Shutting down...")

def main():
    global interrupted
    
    AGENT_ID = "redacted"  # TODO: replace with your actual agent ID
    API_KEY = "redacted"  # must exist if agent is private

    if not AGENT_ID:
        sys.stderr.write("AGENT_ID environment variable must be set\n")
        sys.exit(1)
    
    if not API_KEY:
        sys.stderr.write("ELEVENLABS_API_KEY not set, assuming the agent is public\n")

    client = ElevenLabs(api_key=API_KEY)
    conversation = Conversation(
        client,
        AGENT_ID,
        # Assume auth is required when API_KEY is set
        requires_auth=bool(API_KEY),
        audio_interface=DefaultAudioInterface(),
        callback_agent_response=lambda response: print(f"Agent: {response}"),
        callback_agent_response_correction=lambda original, corrected: print(f"Agent: {original} -> {corrected}"),
        callback_user_transcript=lambda transcript: print(f"User: {transcript}"),
        # callback_latency_measurement=lambda latency: print(f"Latency: {latency}ms"),
    )
    
    # Set up signal handler
    signal.signal(signal.SIGINT, signal_handler)
    
    conversation.start_session()

    # Check for interruption periodically instead of blocking
    import time
    try:
        while not interrupted:
            time.sleep(0.1)  # Small sleep to avoid busy waiting
    except KeyboardInterrupt:
        # Fallback in case signal handler doesn't work
        interrupted = True
    
    # Clean shutdown
    print("Ending session...")
    conversation.end_session()
    conversation_id = conversation.wait_for_session_end()
    print(f"Conversation ID: {conversation_id}")

if __name__ == '__main__':
    main()```
livid spear
#

Does that close with ctrl c or voice?

#

I don’t need a ctrl c close

#

i need a voice one

chilly timber
#

let me see

livid spear
#

So according to perplexity and docs I think

#

Signal.sigint is a ctrl c close

#

But in production no one’s clicking ctrl c

#

I need something w/ voice

#

I assigned a twilio number to it

chilly timber
#

right -- it was still an issue. but that wasn't the original goal

livid spear
#

It works

chilly timber
#

i understand

livid spear
#

So through voice it hangs up the call

#

I also built an ai from scratch but it’s like 70% the conversational abilities of elevenlabs

#

Also looking for text to text AI agent through elevenlabs conversational api btw

#

There is an option through widgets but you can’t change the widget size so it’s a little box on the corner of ur screen

#

No bueno

#

If u find that would be great

chilly timber
#

i'll test to see if it ends via the voice and get back to you soon

livid spear
#

Ty

#

If u need any code btw lmk

#

My twilio bot is great works perfectly

#

Just eleven labs is a little expensive

#

$5 for a 20 min convo

#

Actually really bad pricing..

#

But pushes applications towards use-cases where people are paid to talk

chilly timber
#

okay so this is different code -- and it works (when bot hangs up, the program proceeds/ends/isn't stuck) but more code...

#

not sure that audio interface piece was really needed but the code I sent like 20 minutes ago was throwing some errors sooo....

#

should fix your issue
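
#

A rough sketch of the idea, for reference: let the blocking `wait_for_session_end()` run in a background thread and keep the main thread polling, so the program moves on whether the agent hangs up by voice or a stop flag gets set (for example from a SIGINT handler). `wait_for_end` and `FakeConversation` are made-up names for illustration; only `end_session()` and `wait_for_session_end()` come from the code above, and this assumes `wait_for_session_end()` returns once the agent ends the call.

```python
import threading
import time


def wait_for_end(conversation, stop_requested, poll_seconds=0.1):
    """Return the conversation ID once the session ends.

    The session can end on its own (the agent hangs up) or because
    stop_requested was set, e.g. from a SIGINT handler.
    """
    result = {}

    def waiter():
        # Blocks until the session is over, however it ended.
        result["conversation_id"] = conversation.wait_for_session_end()

    thread = threading.Thread(target=waiter, daemon=True)
    thread.start()

    asked_to_end = False
    while thread.is_alive():
        if stop_requested.is_set() and not asked_to_end:
            conversation.end_session()  # ask for shutdown exactly once
            asked_to_end = True
        time.sleep(poll_seconds)  # short sleep keeps the main thread responsive
    return result.get("conversation_id")


class FakeConversation:
    """Hypothetical stand-in for Conversation so the sketch runs offline."""

    def __init__(self, hangs_up_after=None):
        self._ended = threading.Event()
        if hangs_up_after is not None:
            # Simulate the agent ending the call by voice.
            threading.Timer(hangs_up_after, self._ended.set).start()

    def end_session(self):
        self._ended.set()

    def wait_for_session_end(self):
        self._ended.wait()
        return "fake-conversation-id"
```

With the real library you would pass the actual `conversation` object and set `stop_requested` from the signal handler instead of a global flag.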

#

@livid spear

livid spear
#

Will check later tonight 🙂

chilly timber
#

kk

chilly timber
#

@livid spear did you get a chance to try it -- or maybe problem is no longer relevant/existent

livid spear
#

Hey

#

What was the problem again

#

Agent wasn’t hanging up on the terminal through voice ya?

chilly timber
#

correct

#

but it probably needs some slight changes/adaptation to fit your current code.

livid spear
#

Yes it works for me 🙂

#

Are you able to find any text to text?

#

For conversational AI

#

The docs agent said it couldn’t be done

#

I think it can

#

It’s possible through the widget but you can’t customize widget size

chilly timber
#

Sorry not sure about that — out at a restaurant

#

And haven’t done it via 11labs

#

Isn’t text to text …. Usual LLMs lol

livid spear
#

ya but the 11 conversational backend is v good

#

I've made my own version but not as good as theirs

chilly timber
#

Well you can request in text I believe via api

#

And its response is audio but also it’ll send a transcript of the response

#

So I guess you should be able to do text to text that way

livid spear
#

how is it text to text if they reply in audio?

chilly timber
#

So the reply has a message of the transcript

#

And even a corrected one if interrupted

chilly timber
#

but yeah maybe technically not text 2 text if there is some audio somewhere that you are ignoring (and just using the text) -- but I was getting to the point of you can send it a text prompt, and it will send the response back in text (as well as audio, not sure if you can ask it to not send audio back vs just ignoring it on your end)

livid spear
#

the cost will be the same but it works for the demo

#

but where are you seeing this?

chilly timber
#

that line should be where you can register something to accept the response

#

if you are instead using websockets, then there is an event sent for it

livid spear
#

it doesn't produce text from that

livid spear
#

it does produce text but does not receive text

#

audio_interface=DefaultAudioInterface(),

chilly timber
#

I do not know what else to say. I literally have a bot that prints the text to console every time it says something.

livid spear
#

no that's not what I am saying

chilly timber
#

and it receives it directly from the agent

livid spear
#

I am saying type to the agent and receive text

#

so both way text

chilly timber
#

yes I was answering where the text from the agent comes from

#

you can also send text.

livid spear
#

how did u set that up?

chilly timber
#

i'm sure there is a way using the library you are already using though let me see

livid spear
#

can u send me the code you wrote for that?

chilly timber
#

i am gonna try to make it more concise because there is much to the code that has nothing to do with what you are wanting

#

(it's a discord voice channel bot -- and it uses the elevenlabs websocket library features)

#

will get back to you in a few minutes

#

okay

#

conversation.send_user_message(text) ---- that should be how you send text to it

#
```python
    def send_user_message(self, text: str):
        """Send a text message from the user to the agent.

        Args:
            text: The text message to send to the agent.

        Raises:
            RuntimeError: If the session is not active or websocket is not connected.
        """
        if not self._ws:
            raise RuntimeError("Session not started or websocket not connected.")

        event = UserMessageClientToOrchestratorEvent(text=text)
        try:
            self._ws.send(json.dumps(event.to_dict()))
        except Exception as e:
            print(f"Error sending user message: {e}")
            raise
```
This is how the library implements that function. Its docstring says it is meant to send text from the user to the agent.
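
#

To make the mechanics concrete, here is a small offline sketch of what that method boils down to: check for a connection, serialize an event to JSON, send it. `FakeWebSocket` is a hypothetical stand-in, and the payload shape (`"type": "user_message"` plus `"text"`) is an assumption about what `event.to_dict()` produces, so check it against the library before relying on it.

```python
import json


class FakeWebSocket:
    """Hypothetical stand-in that records what would go over the wire."""

    def __init__(self):
        self.sent = []

    def send(self, payload):
        self.sent.append(payload)


def send_user_message(ws, text):
    # Same guard as the library method: refuse to send without a connection.
    if ws is None:
        raise RuntimeError("Session not started or websocket not connected.")
    # Assumed payload shape; verify against the library before relying on it.
    event = {"type": "user_message", "text": text}
    ws.send(json.dumps(event))
```

In real code you would just call `conversation.send_user_message(text)` after `start_session()`; the sketch only shows why it raises `RuntimeError` when the session has not started.
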
#

let me adapt their example, see if it works, and then send you the simple code as a proof of concept

#

working on a simple example.

#

this is what it shows me when I run it:

```console
❯ python main.py
Using Agent ID: <I redacted>
Starting conversation session...
Agent: Hello! Ask me about Pathfinder second edition.
Sending message: Hello, what services do you offer?
Waiting for agent response...
Agent: Hi there! I can answer questions about Pathfinder, including rules, character creation, spells, feats, and lore. I can also help clarify mechanics, suggest builds, and explain how different parts of the game work. If you have general questions or need help with something else, feel free to ask! What would you like to know more about?
```

#

this is the code (with keys redacted) that produced that

#
import os
import sys
import time

from elevenlabs.client import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation
from elevenlabs.conversational_ai.default_audio_interface import DefaultAudioInterface

class NoAudioInterface(DefaultAudioInterface):
    """Audio interface that ignores audio input/output."""
    
    def __init__(self):
        # Don't call super().__init__() to avoid audio setup
        pass
    
    def start(self, input_callback):
        # Don't actually start audio processing
        pass
    
    def stop(self):
        # Nothing to stop
        pass
    
    def output(self, audio_bytes):
        # Ignore audio output
        pass
    
    def interrupt(self):
        # Nothing to interrupt
        pass

def main():
    AGENT_ID = os.environ.get('ELEVENLABS_AGENT_ID', 'redacted')
    API_KEY = os.environ.get('ELEVENLABS_API_KEY', 'redacted')

    if not AGENT_ID:
        sys.stderr.write("ELEVENLABS_AGENT_ID environment variable must be set\n")
        sys.exit(1)
    
    print(f"Using Agent ID: {AGENT_ID}")
    if not API_KEY:
        print("ELEVENLABS_API_KEY not set, assuming the agent is public")

    client = ElevenLabs(api_key=API_KEY)
    
    def handle_agent_response(response):
        print(f"Agent: {response}")
    
    def handle_user_transcript(transcript):
        print(f"User transcript: {transcript}")
    
    conversation = Conversation(
        client,
        AGENT_ID,
        requires_auth=bool(API_KEY),
        audio_interface=NoAudioInterface(),
        callback_agent_response=handle_agent_response,
        callback_user_transcript=handle_user_transcript,
    )
    
    print("Starting conversation session...")
    conversation.start_session()
    
    # Wait for the session to initialize
    time.sleep(3)
    
    # Send hardcoded user message
    hardcoded_message = "Hello, what services do you offer?"
    print(f"Sending message: {hardcoded_message}")
    
    try:
        conversation.send_user_message(hardcoded_message)
    except RuntimeError as e:
        print(f"Error sending message: {e}")
        return
    
    # Wait for the agent's response; handle_agent_response prints it when it arrives
    print("Waiting for agent response...")
    time.sleep(10)
    
    # End the conversation
    print("Ending conversation...")
    conversation.end_session()
    conversation_id = conversation.wait_for_session_end()
    print(f"Conversation ID: {conversation_id}")

if __name__ == '__main__':
    main()
#

the agent needs the user_transcript, agent_response, etc. events enabled under Advanced in its settings, otherwise it probably won't work

#

in this case it is just a hard coded user message to prove it -- of course in real life it can be dynamically/run-time set like from a conversation, etc.

#

but it's just meant to show it can be done. I'm sure there are better ways to structure it for more meaningful use

livid spear
#

not working for me

chilly timber
#

Will take a look after work today. Thanks for trying it out

livid spear
#

thanks for helping!

chilly timber
#

using this exact code with my api key and agent, I get this (I redacted some basic things like agent ID (my own agent)):

#
```console
❯ python another_try.py
Using Agent ID: (REDACTED, THIS IS MY AGENT)
ElevenLabs client initialized. Starting SDK conversation.
Dynamic variables: {'starting_important_points': "Welcome! I'm here to help you with your inquiries.", 'list_of_questions': '\n    1. What is your name?\n    2. What brings you here today?\n    3. How can I assist you?\n    ', 'ending_important_points': "Thank you for the conversation. Is there anything else you'd like to know?"}

Starting conversation session...
Initializing session...

Agent: Hello! How can I help you today?

==================================================
Conversation started! Type 'exit' or 'quit' to end.
==================================================


Your message: are you there?

Waiting for agent response...

Agent: Yes, I'm here. How can I help you today?

Your message: how old are you?

Waiting for agent response...

Agent: I don't have an age like a person does. I'm an AI designed to help you. How can I assist you?

Your message: ^C

Interrupted by user. Ending conversation...

Conversation ended. ID: REDACTED

Conversation completed successfully!
```
#

as I have said, for the agent used, under advanced, if you don't have the right events selected, it will of course not work.
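
#

For anyone wanting the interactive version, a minimal sketch of a text-in, text-out loop is below. Instead of fixed `time.sleep` calls it waits on a `threading.Event` that the agent-response callback sets, and it waits for the greeting before showing the first prompt. `TextChat` and `chat_loop` are hypothetical names; the only library pieces assumed are `send_user_message` and the `callback_agent_response` parameter already used in the code above.

```python
import threading


class TextChat:
    """Text-in, text-out helper around a Conversation-like object."""

    def __init__(self, timeout=15.0):
        self.timeout = timeout
        self._reply_ready = threading.Event()
        self._last_reply = None

    def on_agent_response(self, response):
        # Pass this as callback_agent_response when building the Conversation.
        self._last_reply = response
        self._reply_ready.set()

    def wait_for_reply(self):
        """Block until the agent replies; return None on timeout."""
        if not self._reply_ready.wait(self.timeout):
            return None
        self._reply_ready.clear()
        return self._last_reply

    def ask(self, conversation, text):
        self._reply_ready.clear()  # drop any stale reply before sending
        conversation.send_user_message(text)
        return self.wait_for_reply()


def chat_loop(chat, conversation, read_line=input, write=print):
    # Wait for the greeting first so the prompt never appears before it.
    greeting = chat.wait_for_reply()
    if greeting is not None:
        write(f"Agent: {greeting}")
    while True:
        try:
            text = read_line("Your message: ").strip()
        except (EOFError, KeyboardInterrupt):
            break
        if text.lower() in {"exit", "quit"}:
            break
        if not text:
            continue
        reply = chat.ask(conversation, text)
        write(f"Agent: {reply}" if reply is not None else "(no reply before timeout)")
```

To wire it up you would create `chat = TextChat()`, pass `callback_agent_response=chat.on_agent_response` to `Conversation(...)` along with the `NoAudioInterface()` from earlier, call `start_session()`, run `chat_loop(chat, conversation)`, and then end the session as before.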

#

@livid spear

livid spear
#

Ok good will check it out when I get back thanks

livid spear
#

@chilly timber hey just wanted to say thanks for helping me solve this

#

it works

#

just wondering how you did?

#

did you see it in docs the text example or you went into source code and tinkered?

#

what's weird is if you don't wrap the conversation initiation in a try block it shows the user input before the agent's starting message

chilly timber
# livid spear just wondering how you did?

i knew 11labs sent back the agent response via text because in my voice channel bot thing, i had seen that logged to console. and I also know from agent setup they do have events that are for their text. I also knew (from earlier code that I ended up changing after I got it working reasonably well) that I could SEND text to it. so I just had to refer to that in code and make use of it. longer answer is I've been coding since like 7th grade, non-professionally but still worked on stuff, including an open source project (reasonably large), and other small projects. I use that experience + AI now --- that combo is insanely powerful. It's not the same as someone who doesn't know coding at all. I usually know what I need the AI to do, and if something isn't working, I usually just have a feel for what is wrong -- so I can help guide it and work with it well. So often it's a mixture of me asking it to do my busy work, and me reading the code, understanding it, and when I need something done, I know well what to ask of it. I also do manual code edits too on top of that. For harder debugging I still stick to a debugger if needed.

livid spear
#

yes the sleep was necessary I believe async is being done too

livid spear
#

why not do a startup?

chilly timber
#

ahhh.. i'm 37 now. So it's been a while since I decided on all that. most of us can be happy doing many things. sometimes we don't have time to do it all professionally. I enjoy the path I took. I just took my path and then do other stuff for hobby/fun like this.

livid spear
#

haha fair enough

#

but consider a startup

#

because youre hacky

#

you understand low level stuff could probably make something really cool

chilly timber
#

thanks. glad i was of help

livid spear
#

I know ur busy but if something comes up on my end I might reach out

#

and just a heads up my agent still has a few errors (call ends 1 second early due to twilio error 31921, talking when it hears background noise)

#

but I think first can be solved through webhook, and 2nd can be solved through webhook -> voice isolator system tool. Will be doing that 👍

#

then just need to make it hipaa compliant and my prototype is done

chilly timber
#

you part of a startup?

livid spear
#

yeah I started one

livid spear
#

@chilly timber do you know if it's possible to retrieve each audio chunk?

#

in conversational ai

#

all the elevenlabs tooling right now needs audio chunks to clean audio

#

but something that can be done is using an external model or writing code and putting it as a system tool

#

but their tooling is better just not sure how to access each chunk

#
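```python
# incoming: decode the agent's audio and hand it to the audio interface
audio = base64.b64decode(event["audio_base_64"])
self.audio_interface.output(audio)
# outgoing: each mic chunk is base64-encoded into the message sent to the agent
"user_audio_chunk": base64.b64encode(audio).decode(),
```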
chilly timber
#

you should be able to inherit from AudioInterface and then customize it

#

@livid spear you can see this code (it will have the control+c issue, but that isn't the point of the code) -- of course it logs it to console but you can do with it as you wish inside the code -- just giving you an example of how you can obtain the data yourself

#

particularly, output method of that class. input is audio_bytes

#

the logging it does is just the example way to use the data vs other more meaningful ways
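
#

Roughly, the approach can be sketched as a wrapper that sits between the library and a real audio interface and hands every chunk to your own hooks. `TappedAudioInterface`, `on_output`, and `clean_input` are hypothetical names; the four methods mirror the ones overridden in the `NoAudioInterface` example earlier. This version is duck-typed so it runs without the library or audio hardware; with the real library you would likely inherit from `AudioInterface` and wrap `DefaultAudioInterface()`.

```python
class TappedAudioInterface:
    """Wraps another audio interface and exposes every chunk to hooks.

    Duck-typed sketch: with the real library you would likely inherit from
    AudioInterface and pass DefaultAudioInterface() as `inner`.
    """

    def __init__(self, inner, on_output=None, clean_input=None):
        self.inner = inner
        self.on_output = on_output      # called with each agent audio chunk
        self.clean_input = clean_input  # may return a modified mic chunk

    def start(self, input_callback):
        def tapped(audio_bytes):
            if self.clean_input is not None:
                audio_bytes = self.clean_input(audio_bytes)
            input_callback(audio_bytes)  # forward the (possibly cleaned) chunk

        self.inner.start(tapped)

    def stop(self):
        self.inner.stop()

    def output(self, audio_bytes):
        if self.on_output is not None:
            self.on_output(audio_bytes)  # e.g. save, log, or analyze
        self.inner.output(audio_bytes)

    def interrupt(self):
        self.inner.interrupt()
```

The `clean_input` hook is where a noise filter could go, since it sees each microphone chunk before it is passed on; `on_output` sees each chunk of agent audio before it is played.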

livid spear
#

ok so ur solutions are hacky

#

you rewrite the source code instead of using available tooling from elevenlabs

#

or u make ur own classes and use them?

chilly timber
#

well it's a valid way of accessing it, and it only feels more hacky because the library you are using is a wrapper around their websocket api

#

the websocket api will give you more direct access to the events, like audio, etc.

#

that is what my Discord voice bot uses, but it is in Node.js

#

let me see if using the conversational ai api has other ways of directly accessing the audio without customizing the interface

#

i do not know.

#

i do not think there is another way to access that audio within the python library known as elevenlabs-python

#

their API gives you power by allowing you to pass in the audio interface. so it's inherently something you can use/adjust. I do not think it is hacky.

livid spear
#

my goal is to clean the audio before it is passed

#

through webhook or as a 3rd party model

#

just wondering if u tried self.default_interface.output(audio_bytes)

chilly timber
#

yes. i ran the code. it runs. it might have issues at the end on Windows (the whole ctrl+c issue, but I did not address that because I did not care).

#

i wasn't trying to give you production code. just proving how to do it.

#

of course it would have to be adapted into your code if used with alterations. but I think this might be the only or least-hacky way of doing it if using that python library

livid spear
#

there is also the possibility of using a tool and telling the AI to use it to clean the audio before receiving the call

#

through elevenlabs client tools

#

but not sure how reliable that is i'm trying ur implementation tho

chilly timber
#

yeah sorry I was just trying to see if I could help with your question of accessing each piece of audio

#

i'm less aware of the other aspects of the situation

#

i'm massaging a file together to see if it can help explain the tactic I used in the code. or other ways of using it, etc.

livid spear
#

how did u make that so fast?

chilly timber
#

some of those are illustrative examples --- like where it is saying "look you can do this and this and this" -- like some of those methods would of course have to be implemented -- but the point is that -- you can make them and control what happens

#

it's not that every single one of them is something provided by 11labs

livid spear
#

nah like how did u put the document together that fast

chilly timber
#

ohhhh. I asked one AI to make it -- that has access to the official docs, search, etc. And I asked it in the right way. then I crosschecked it with two state of the art ones on top of that. and then finally I looked at it myself.

livid spear
#

you have an agentic system?

#

and how did u design the ui of that?

#

to get it in that output

chilly timber
#

If you know enough about coding, then AI right now will make you prolific.

#

if you know nothing, then it's not as great. but you know coding, so if you haven't recently used AI in coding and gotten the hang of it...

#

you are missing out

#

right now I use multiple things

#

Codex (which is in preview) -- openai, but very costly and I don't use it much

#

Cursor ($20/month plan, but I went over by a few hundred bucks this last period because I had some vacation time)

#

With cursor I mostly use Sonnet, and otherwise I use opus or o3-pro. I want to use the newest Grok but I think it's not that compatible with cursor IDE yet

#

otherwise, rarely, I would use chatgpt the app and use my same sub

livid spear
#

yes I am familiar w/ all except codex

#

codex is an api call?

#

you prompted the AI to put it in that specific format

#

so probably ran search to retrieve the docs then code to output it to that format

#

you got a github?

#

I feel like a girl asking a dude for his number after being impressed

chilly timber
#

I think codex is using a custom coding model based off of o3, and maybe you need to be on some special plans to use it, and it like spins up some VMs and stuff and has tool access, etc. and then shows you the results, and then auto-creates a PR if you want, etc. But it's in preview and .... the plan I have now supports it but I'm not sure which others do

livid spear
#

wdy make as a doctor per year?

#

like 350k ish ya?

#

I think u should try to make something useful at scale..

#

you have the chops

#

not lying

chilly timber
#

thanks. but I'm hoping what I sent above will help your use case

livid spear
#

it will ya

#

so to confirm you just had an agentic system and asked for that output format?

chilly timber
#

so I was using Cursor. I have my own rules I provide it that aren't that special probably but i'll post them here

#
```
Verify, don’t guess. Be intentional in your actions and avoid wishful thinking. You must be prudent.
Never assume how a dependency, API, CLI, or build tool behaves — either be certain already or check additional docs and other resources (e.g., web) first and cite evidence.
Use Internet search, local files, debugging, tests, and REPLs instead of guessing.
If it is a complex problem, summarize/restate the problem, list unknowns as well as possibilities and probabilities, make a plan, then execute and verify.
Summarize the root cause and update documentation once an issue is fixed.
Let me know if you have any questions about what I'm requesting before you proceed.
If you are using Python, prefer uv and remember that there might already be a virtual environment.
If the context7 MCP is available, make sure to use it to find docs whenever appropriate.
```
#

honestly they might be bad. not sure

#

then I have an MCP server for it

#

context7

#

it is free, publicly available

#

and then

#

the models I use are Sonnet, sometimes opus, o3-pro, etc. and then the rest is just how I talk to it

#

and make my requests

#

so sometimes asking it the right way helps is what I mean

#

it has access to internet which I auto-allow

#

the MCP when it wants to use it -- I have to say yes

#

and yes in that sense it is agentic

#

sorry for the long winded answer.

#

i asked it for .md format. and then I just converted to pdf on my end externally to that

#

though honestly if I had asked it to do it... it woulda done it just fine

livid spear
#

ya word I can fix ur system prompt if u want

#

you need less prompt engineering if ur using the big models

chilly timber
#

ohhhh i'd be happy to hear your suggestions. I'm loving how it is treating me thus far with the models I use though.

livid spear
#

the smarter the model the less you have to prompt engineer and the fewer examples you should give it

#

but for reference i automated google forms entry from the phone call we were just working on and gemini and openai couldn't do it, but claude 3.5 and above could

chilly timber
#

Thanks I’ll look at it

#

once this work stretch ends then I'll look into it more and maybe give it a try -- thank you for offering

livid spear
#

My code

chilly timber
#

have you tried running your code and it has failed? or just want to know if it looks good before trying to run it?

chilly timber
#

?

livid spear
# chilly timber ?

hey sorry bro I never got notifications for this, hence me not replying - you should @ me next time

#

will try ur code tonight ty for trying