#What's the difference between client-side and server-side tools in Conversational AI?
The difference between client-side and server-side tools in Conversational AI lies in where the processing and functionality occur:
- Client-Side Tools: These operate directly on the user's device or browser. They are typically used for tasks like capturing user input (e.g., microphone access for voice input), rendering the AI's responses, or managing the user interface. Client-side tools are essential for real-time interactions and ensure a seamless user experience.
- Server-Side Tools: These run on a backend server and handle more complex or resource-intensive tasks. Examples include processing API requests, managing databases, or executing external function calls (e.g., querying a knowledge base or booking system). Server-side tools are crucial for integrating external systems and ensuring secure, scalable operations.
Both types of tools are complementary and often work together to deliver a complete Conversational AI solution.
Give me a more comprehensive example if I set up an agent that handles inbound calls for a healthcare clinic.
To set up a Conversational AI agent for handling inbound calls at a healthcare clinic, you can leverage ElevenLabs' tools to create a customized, interactive voice agent. Here's a more comprehensive example of how this could work:
1. Agent Setup
- Voice Selection: Choose a voice from ElevenLabs' Default Voices or create a custom voice clone to match the clinic's branding and tone.
- System Prompt: Configure the agent's behavior and style. For example:
"You are a friendly and professional receptionist for a healthcare clinic. You assist callers with scheduling appointments, providing clinic hours, and answering general inquiries. If the caller has a medical emergency, direct them to call 911."
2. Knowledge Base
- Add a knowledge base with frequently asked questions, such as:
- Clinic hours and location.
- Insurance plans accepted.
- Instructions for new patient registration.
- Non-enterprise users can upload up to 5 files or links, with a total limit of 20MB and 300,000 characters.
3. Call Flow
- Greeting: The agent greets the caller and asks how it can assist.
- Dynamic Responses: Based on the caller's input, the agent can:
- Schedule appointments by integrating with the clinic's booking API.
- Provide directions or clinic hours.
- Answer insurance-related questions using the knowledge base.
- Emergency Handling: If the caller mentions an emergency, the agent advises them to call 911 immediately.
4. Integration with Tools
- Speech-to-Text and Text-to-Speech: The agent transcribes the caller's speech, passes the text to an LLM, and speaks the LLM's reply back to the caller in real time.
- API Integration: Connect the agent to the clinic's scheduling system or patient database for real-time updates.
- Follow-Up Actions: Use webhooks or APIs to send appointment confirmations via SMS or email.
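As a rough illustration of the API integration step, the sketch below shows the kind of validation a booking endpoint might run before writing to the scheduling system. The payload fields (`patientName`, `slot`), the `bookAppointment` function, and the in-memory `bookedSlots` set are all hypothetical stand-ins; a real clinic would call its own booking API instead.

```javascript
// Hypothetical in-memory stand-in for the clinic's scheduling system.
const bookedSlots = new Set(['2025-03-10T09:00']);

// Validate a booking request and reserve the slot if it is free.
// Field names (patientName, slot) are assumptions for illustration.
function bookAppointment(payload, slots = bookedSlots) {
  const { patientName, slot } = payload ?? {};
  if (typeof patientName !== 'string' || patientName.trim() === '') {
    return { ok: false, error: 'Missing patient name' };
  }
  // Expect a slot like "2025-03-10T09:30" (date plus 24-hour time).
  if (typeof slot !== 'string' || !/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}$/.test(slot)) {
    return { ok: false, error: 'Invalid slot format' };
  }
  if (slots.has(slot)) {
    return { ok: false, error: 'Slot already booked' };
  }
  slots.add(slot);
  return { ok: true, confirmation: `${patientName.trim()} booked for ${slot}` };
}

console.log(bookAppointment({ patientName: 'Ada Smith', slot: '2025-03-10T09:30' }));
console.log(bookAppointment({ patientName: 'Bo Lee', slot: '2025-03-10T09:00' }));
```

Returning a structured `{ ok, error }` result rather than throwing lets the agent relay a specific reason to the caller, such as offering another time when a slot is taken.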
5. Success Criteria and Data Collection
- Define success criteria, such as:
- "The agent successfully scheduled an appointment without human intervention."
- "The caller received accurate information about clinic hours."
- Collect data from call transcripts to analyze common inquiries and improve the agent's performance.
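One way to start on the data-collection step is to tally which topics come up most often. The sketch below assumes transcripts are available as plain strings and uses a hypothetical keyword list (`TOPIC_KEYWORDS`); neither reflects a specific ElevenLabs export format.

```javascript
// Hypothetical topic keywords; tune these to the clinic's real inquiries.
const TOPIC_KEYWORDS = {
  scheduling: ['appointment', 'schedule', 'reschedule'],
  hours: ['hours', 'open', 'close'],
  insurance: ['insurance', 'coverage', 'plan'],
};

// Count how many transcripts mention each topic at least once.
function countInquiryTopics(transcripts, keywords = TOPIC_KEYWORDS) {
  const counts = {};
  for (const topic of Object.keys(keywords)) counts[topic] = 0;
  for (const transcript of transcripts) {
    const text = transcript.toLowerCase();
    for (const [topic, words] of Object.entries(keywords)) {
      // Simple substring match; a transcript counts once per topic.
      if (words.some((word) => text.includes(word))) counts[topic] += 1;
    }
  }
  return counts;
}

const sample = [
  'Hi, I need to schedule an appointment for next week.',
  'What are your hours on Saturday?',
  'Do you accept my insurance plan, and can I book an appointment?',
];
console.log(countInquiryTopics(sample));
```

Substring matching is deliberately crude; it is enough to spot the most common inquiry types before investing in anything more precise.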
6. Testing and Deployment
- Test the agent within the ElevenLabs dashboard to ensure it handles various scenarios effectively.
- Deploy the agent using a widget on the clinic's website or integrate it with a phone system like Twilio for inbound calls.
This setup ensures the agent can handle routine inquiries, freeing up human staff for more complex tasks while providing a seamless experience for patients.
How would a server-side vs. client-side tool work for an inbound receptionist agent?
In a conversational AI setup for an inbound receptionist agent, such as one handling calls for a healthcare clinic, server-side and client-side tools play distinct roles. Here's how they would work in this context:
Client-Side Tools
Client-side tools operate on the user's device or interface, such as a web browser or mobile app. These tools are responsible for handling interactions directly with the user.
Example for a Healthcare Clinic Receptionist Agent:
- Microphone Permission Handling: The client-side application ensures the user's microphone is accessible for capturing voice input. For example, when a patient starts a conversation from the clinic's website or app, the application requests microphone access so it can capture their voice. (A caller on an ordinary phone line has no browser client, so this step applies only to web or app sessions.)

    // Ask the browser for microphone access; resolves to true if granted.
    async function requestMicrophonePermission() {
      try {
        await navigator.mediaDevices.getUserMedia({ audio: true });
        return true;
      } catch (error) {
        console.error('Microphone permission denied:', error);
        return false;
      }
    }

- Real-Time Feedback: The client-side interface provides visual indicators, such as showing whether the agent is "listening" or "speaking." This helps the caller understand the agent's current state.

    // Reflect the agent's current mode in the #speakingStatus element.
    function updateSpeakingStatus(mode) {
      const statusElement = document.getElementById('speakingStatus');
      const isSpeaking = mode.mode === 'speaking';
      statusElement.textContent = isSpeaking ? 'Agent Speaking' : 'Agent Silent';
      statusElement.classList.toggle('speaking', isSpeaking);
    }

- User Interaction: Buttons to start or end the conversation are part of the client-side interface. For example, a receptionist agent might allow the user to press a button to confirm an appointment.
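The button-driven confirmation described above can be modelled as a small table of client-side handlers. This is a sketch only: the tool names (`showAppointmentConfirmation`, `hideAppointmentConfirmation`), their parameters, and the `uiState` object are hypothetical, and the step of registering the handlers with a conversation SDK is not shown.

```javascript
// Hypothetical UI state; a real page would update DOM elements instead.
const uiState = { banner: '', confirmVisible: false };

// Client-side handlers keyed by tool name (names are assumptions).
const clientToolHandlers = {
  // Show appointment details and reveal a confirm button.
  showAppointmentConfirmation({ date, time }) {
    uiState.banner = `Confirm appointment on ${date} at ${time}?`;
    uiState.confirmVisible = true;
    return 'Confirmation prompt displayed';
  },
  // Hide the prompt once the user confirms or cancels.
  hideAppointmentConfirmation() {
    uiState.banner = '';
    uiState.confirmVisible = false;
    return 'Confirmation prompt hidden';
  },
};

// Run a handler by name, reporting unknown tools instead of throwing.
function runClientTool(name, params = {}) {
  if (!Object.hasOwn(clientToolHandlers, name)) {
    return `Unknown client tool: ${name}`;
  }
  return clientToolHandlers[name](params);
}

console.log(runClientTool('showAppointmentConfirmation', { date: '2025-03-10', time: '09:30' }));
console.log(uiState);
```

Each handler returns a short string so the agent can be told what happened on screen; whether and how that result is passed back depends on the SDK in use.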
Server-Side Tools
Server-side tools handle the backend processing, such as managing the AI logic, securely communicating with APIs, and storing or retrieving data.
Example for a Healthcare Clinic Receptionist Agent:
- Secure API Communication: The server generates a signed URL to securely connect the client-side application to the conversational AI API without exposing sensitive credentials like the API key.

    // Assumes `app` is an Express app and Node 18+ (built-in fetch).
    app.get('/api/signed-url', async (req, res) => {
      try {
        // The API key stays on the server; only the signed URL reaches the client.
        const response = await fetch(
          `https://api.elevenlabs.io/v1/convai/conversation/get_signed_url?agent_id=${process.env.AGENT_ID}`,
          {
            method: 'GET',
            headers: {
              'xi-api-key': process.env.XI_API_KEY,
            },
          }
        );
        if (!response.ok) {
          throw new Error('Failed to get signed URL');
        }
        const data = await response.json();
        res.json({ signedUrl: data.signed_url });
      } catch (error) {
        console.error('Error:', error);
        res.status(500).json({ error: 'Failed to get signed URL' });
      }
    });

- Dynamic Agent Instantiation: The server can dynamically create or configure agents based on the type of call. For example, if a patient calls to check appointment availability, the server-side logic routes the request to an agent configured for scheduling.
- Data Integration: The server connects to the clinic's database to retrieve or update patient records. For instance, when a patient provides their name, the server-side tool fetches their appointment history and sends it to the conversational AI for context.
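The data-integration step might look like the sketch below, which looks up a patient's appointment history and turns it into a short context string for the agent. The `patientRecords` map, its record shape, and `buildPatientContext` are hypothetical; a real deployment would query the clinic's database instead.

```javascript
// Hypothetical stand-in for the clinic's patient database.
const patientRecords = new Map([
  ['ada smith', { appointments: ['2025-01-14 check-up', '2025-02-20 follow-up'] }],
]);

// Build a context string for the agent from a caller's name.
function buildPatientContext(name, records = patientRecords) {
  // Normalise the spoken name so "Ada Smith " matches "ada smith".
  const key = String(name).trim().toLowerCase();
  const record = records.get(key);
  if (!record) return `No record found for "${name}".`;
  if (record.appointments.length === 0) {
    return `${name} has no past appointments.`;
  }
  const count = record.appointments.length;
  return `${name} has ${count} past appointment(s): ${record.appointments.join('; ')}.`;
}

console.log(buildPatientContext('Ada Smith'));
console.log(buildPatientContext('Bo Lee'));
```

Matching on a name alone is only for illustration; the lookup key in practice would be whatever identifier the clinic uses to verify callers.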
How They Work Together
- Client-Side: Captures the patient's voice, provides real-time feedback, and sends the audio data to the server.
- Server-Side: Processes the audio data, interacts with the conversational AI API, retrieves relevant information (e.g., appointment details), and sends the response back to the client.
By combining client-side and server-side tools, the receptionist agent can handle tasks like verifying patient information, scheduling appointments, and answering common questions efficiently.