#Kiwi 🥝 learn more from videos with GPT3-powered summaries, questions, and quizzes!


bitter palm
#

Hey there! We are excited to share https://kiwi.video/ with you 🚀

kiwi 🥝 allows you to learn from videos like never before with AI-powered summaries, questions, and quizzes!

It's very simple:

  • Upload a YouTube video (up to 4hr long!)
  • Get a summary
  • Ask any question, and get answers from the video
  • Generate quizzes to test your understanding!

We built this using OpenAI's Whisper model for transcription (hosted on Banana), and the GPT-3 API for inference. We originally built this for the lablab OpenAI hackathon, and won first place 🏆 🕺

We're currently in alpha, and would love to get your thoughts as we continue building! You can leave feedback in this thread or DM us directly @tulip remnant @harsh badger

kiwi

Ask Questions. Get a summary. Quiz yourself.

arctic leaf
#

Hi

arctic leaf
#

@bitter palm @tulip remnant @harsh badger

frigid wraith
#

This is awesome @bitter palm! Are you planning to monetize this in the future? Would you be willing to share some of the backend insight on how you're summarizing the video? Are you converting the video's audio into text and then feeding it to GPT3 in chunks?

bitter palm
#

Hey @arctic leaf! Thanks for trying out kiwi 🙂 Part of our logic is that we only want GPT to answer questions relevant to the video, so in this case asking for timestamps is not returning a response because that is not part of the video. We'll fix the response message to make that clearer.

All of our responses include timestamps by default that reference the video. Feel free to try the summarize button and you will see that in action 🙂

bitter palm
#

Hey @frigid wraith! Thanks for trying out kiwi 🙂 Monetization is definitely something we are considering down the line (especially since there are a lot of costs associated with running it), but for now we are focused on getting it into the hands of more people and iterating on feedback.

Sure, happy to chat a bit about what is going on in the backend! We use OpenAI's Whisper model to generate a transcript of the video's audio. We then take the text and break it down into chunks based on OpenAI's token limitations before feeding it to GPT3. Your instinct is correct 😉
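For readers curious what "chunks based on token limitations" could look like, here is a minimal sketch. It is not kiwi's actual code. The names `count_tokens` and `chunk_transcript` and the `max_tokens` default are hypothetical, and whitespace-separated words stand in for real model tokens, which a production system would count with a proper tokenizer.

```python
import re


def count_tokens(text: str) -> int:
    # Assumption: approximate tokens by whitespace-separated words.
    return len(text.split())


def chunk_transcript(transcript: str, max_tokens: int = 1000) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_tokens."""
    # Split after sentence-ending punctuation and drop empty pieces.
    sentences = [
        s for s in re.split(r"(?<=[.!?])\s+", transcript.strip()) if s
    ]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for sentence in sentences:
        n = count_tokens(sentence)
        # Close the current chunk if this sentence would exceed the budget.
        if current and current_len + n > max_tokens:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        # A single sentence longer than max_tokens becomes its own chunk.
        current.append(sentence)
        current_len += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    for chunk in chunk_transcript("A b c. D e f. G h.", max_tokens=5):
        print(chunk)
```

Splitting on sentence boundaries rather than at a fixed word offset keeps each chunk readable on its own, at the cost of chunks that are slightly uneven in size.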

frigid wraith
#

I'm very curious on what the computational complexity is like for you to run Whisper. What is your server architecture? @bitter palm

bitter palm
# frigid wraith I'm very curious on what the computational complexity is like for you to run whi...

Our server architecture is simple at this point: we use https://www.banana.dev to host Whisper transcription. Once we have the transcript, we store it in S3 and then use it to call into the GPT API.

Banana provides inference hosting on serverless GPUs for machine learning models. Deploy on Banana in three easy steps and a single line of code.

harsh badger
# wanton sonnet Since the transcript is broken into chunks, is it difficult to get responses whi...

We are currently experimenting with a new approach that could potentially improve our answers for super long videos. It'd work something like this:

  1. chunk the text
  2. rank the chunks based on how relevant they are to the query (embed + cosine similarity?)
  3. summarize the top N chunks and ask the question on them

Since the answer would be retrieved from a summary of only the N most relevant chunks rather than all chunks, I expect it to be more accurate.
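Steps 1 and 2 above could be sketched roughly as follows. This is an illustration under stated assumptions, not the approach kiwi shipped. The helper names `embed`, `cosine_similarity`, and `top_n_chunks` are hypothetical, and a toy bag-of-words vector stands in for a learned embedding model. Step 3 (summarizing the selected chunks and asking the question) would need a call to a language model and is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Assumption: toy bag-of-words "embedding" (word counts), used here
    # only so the example runs without an external embedding model.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared words, divided by the product of the norms.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_chunks(chunks: list[str], query: str, n: int = 3) -> list[str]:
    """Return the n chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(embed(c), query_vec), c) for c in chunks]
    # sort() is stable, so chunks with equal scores keep transcript order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:n]]


if __name__ == "__main__":
    chunks = [
        "the cat sat on the mat",
        "whisper transcribes audio to text",
        "gpt answers questions about text",
    ]
    print(top_n_chunks(chunks, "how does whisper handle audio", n=1))
```

With a real embedding model, only `embed` would change; the ranking logic stays the same.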

wet fulcrum
#

you have plans to add more languages?

harsh badger
# wet fulcrum you have plans to add more languages?

Yes, we plan on supporting more languages.

We added a disclaimer today to clarify that we currently only support English (we are using the Whisper base English model for transcription). I'm going to explore the multilingual approach sometime in the next few days. I don't think using the larger multilingual models will be sufficient for a good experience, as our prompts are in English, so we might need to do prompt translation or something along those lines.

Today we were discussing to what extent we should support more languages and these are the options that were mentioned:

  1. Any language video -> transcription (English), interaction layer (English)
  2. Any language video -> transcription (Native lang), interaction layer (Native lang)
  3. Any language video -> transcription (Native lang), interaction layer (Pick lang)
  4. Any language video -> transcription (Pick lang), interaction layer (Pick lang)

I think the most intuitive and desired use-case is #2 (maybe?). The most interesting use-case is #4, as that will allow you to watch videos in languages that you never had access to. I'm thinking of exploring #2 and #4, testing how reliable they can be, and then releasing something accordingly.

I'm curious

  • which one of the 4 options would you use the most and for what use-case?
  • which languages other than English would you want supported?

runic falcon
#

Oh wow, this is great! I just wrote an article today on extracting transcripts and generating summaries. Will follow!

jaunty needle
#

First of all, congratulations! This was really useful for me. I do have one question: is there a way to get the transcript with timestamps?

peak fossil
#

This is super awesome!
I am working on developing an immersive education site. Would it be possible to incorporate this?

trail jasper
#

This is great! I especially love the feature where I can ask questions about the video and get answers. It's super simple: after getting a summary of the video and reading it over, I can ask kiwi to expand on a point in the summary.

I also really liked the hyperlinked sections in the summary to the relevant portion in the video. I was worried about not having something like that when I first tried it out, but there it was, which made skipping around the video super easy.

The summary also finished faster than the estimated time limit, which I was impressed by as well. For example, it summarized a 13 min video in a little over a minute when I think the estimated wait time was 3 minutes. That blew my mind.

This is an incredibly useful tool. Thank you. Now I can digest those 3 hr podcasts I've been putting off because I don't even want to listen to them at 2x speed because I don't have the time. You just gave me back a ton more time.

Will it be possible to export the summary/chat history in the future? Will you be expanding upon the 4 hr time limit? Will there be a log of the videos I've uploaded in my account so I can go back and reference them? And any idea what the price point would be when you monetize? I'd gladly pay a subscription fee for this.

trail jasper
#

Also, will you be integrating a text-to-speech feature for people to listen to the video summaries?

manic swallow
#

this is an amazing project, thanks for making it

bitter palm
# trail jasper This is great! I especially love the feature where I can ask questions about the...

Hey Josh! Appreciate the love and the thorough questions 🙌

Will it be possible to export the summary/chat history in the future?
We are considering adding the ability to export chat out of kiwi. I'm curious: what use cases would you want that feature for?

Will you be expanding upon the 4 hr time limit?
We definitely want to go beyond 4-hour videos, but this is currently not a high priority for us, as most videos we have seen on kiwi are under 4 hours. This is something we will keep an eye on.

Will there be a log of the videos I've uploaded in my account so I can go back and reference them?
Yes, you might have noticed that we have recently added accounts to kiwi! This will allow us to preserve history for our users so that they can go back and reference videos they have uploaded. We will be adding that soon 🚀

any idea what the price point would be when you monetize? I'd gladly pay a subscription fee for this.
Glad to hear that you are enjoying kiwi this much 🙂 Monetization is still something we are discussing as we want to make sure that we do it at the right time. I am curious if you have a range in mind for what you would be comfortable paying for a monthly subscription?

will you be integrating a text-to-speech feature for people to listen to the video summaries?
I love this idea! We definitely think that would be useful and something we will probably incorporate in the future. Do you prefer audio over text when consuming content?

restive trout
#

Awesome work!

coral hemlock
#

Super useful. Beyond the chat and knowledge-extraction function, I can see this being helpful for the video creators themselves to generate titles, descriptions, clips and even written articles/blogs/social media posts based on the video content.

Seems there's also a pretty natural synergy to expand this into non-vid podcasts given that Kiwi is embedded audio, apart from the embedded video linked to the transcript. True?

fading harness
#

ggit

coral hemlock
#

Not sure if it's possible, but being able to work with both desktop and mobile aspect ratios for video would be helpful. That way you can quickly find the sections and iterate on shorter clips across distribution channels.

trail jasper
#

Just wondering if there is something wrong with the app. I haven't been able to get it to summarize for the past week

slow cypress
#

Hey

#

Guys, why is kiwi not working?

#

Should I pay for it, or is it not going to work anymore?

#

I mean the kiwi.video website