# Access multimodal capabilities outside of ChatGPT

peak violet
Ideally this would be released in the same way Whisper was, but if that's not happening, I'm wondering if we can at least have custom setups with this new multimodal model with programmatic access to its hearing and sight through the API.

crystal bear (BOT)
Hi @peak violet. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

According to the knowledge sources, GPT-4's multimodal capabilities, such as vision and voice, are currently available within ChatGPT. However, the sources give no indication that these capabilities are accessible outside of ChatGPT or through the API for custom setups (source: https://community.openai.com/t/506972).

Vision and voice capabilities can be used in the same conversation within ChatGPT (source: https://help.openai.com/en/articles/8983381-can-i-use-chatgpt-vision-capabilities-and-voice-conversations-in-the-same-conversation).

As for a Whisper-style release of these capabilities, the provided sources contain nothing to suggest that one is planned or possible.

For the most accurate and up-to-date information, I recommend checking OpenAI's official documentation and announcements.

Note: you can ask me a follow-up question by @-mentioning me again :speech_balloon:
