Hi everyone! I've just released a new mini-project!
Described: A system to automatically describe imaging on various social media platforms:
- Automatically takes images sent in chat and describes what's happening in the images using gpt-4-vision
- Supports multiple images
- Great for the visually impaired, TTS support coming soon for described images
- Ability to enable/disable it on a per-channel basis
- Currently only supports Discord but more social media platforms like Messenger, Telegram, etc will be added soon!
I would appreciate it a lot if you could check out the repo and also give it a star if you think it's cool! https://github.com/Kav-K/Described.
The repo also has an example of it in action, and the repo itself is a great example / foundation for how you can use GPT-4-Vision in your own applications, this project is also meant to be a baseline repo for people to learn from!