#Library of Ladev - a Neuro transcription archive

solemn grove
proven vortex
#

Recently I've been trying to find a way to quickly search for certain sentences in Neuro's streams to make clips, and this website really comes in handy. Thank you for the hard work and the nice website!

By the way, what's the range of streams that are included? It seems results from the subathon are not shown. And will there be a fuzzy search function in the future?

Edit: Sorry I didn't see the first item in the TODO list

proud fable
#

oh someone is actually doing it since i first made my prototype lol

#

are you using whisper to transcribe the streams?

solemn grove
proud fable
#

no, I made a prototype version a while ago, but haven't made the whole thing and transcribed everything

#

I used whisper turbo v3, and auto-generated yt subtitles where available

solemn grove
#

I'm using large-v2 since it supposedly has the highest accuracy. Admittedly, even after I finish setting up an automatic pipeline, it'll probably take months to transcribe the backlog of streams, so I'll likely revisit this topic

proud fable
#

have you already started transcribing the backlog?

solemn grove
#

Very slowly yeah

proud fable
#

do you have a spreadsheet of the progress with all vods listed? I could help by donating some compute

#

also you could skip some vods where youtube auto generated captions are available

solemn grove
#

I've got ~15 handpicked streams in the db, and will upload the 2024 subathon (minus asleep segments) in a few days. Not in a hurry since this is a hobby project, but dumping the yt auto generated subs before replacing them w/ whisper transcriptions could be a nice thing to have, I'll look into it

proud socket
solemn grove
solemn grove
#

The entire 2024 subathon (minus asleep segments) is now uploaded! Going to look into improving search next...

west plover
#

great work, this will be useful for wiki editors

solemn grove
#

Advanced search is now available! Though it's still pretty janky. Still figuring out what configuration makes the most sense

wintry herald
solemn grove
#

Added more search filters!
Archive is steadily growing, should have 2024-present complete by the end of next week

solemn grove
#

Archive now includes everything from the first subathon to now (will probably filter out the subathon asleep segments from the transcription at some point)

west plover
#

is there any plan to add a "page 1 of x" feature to the database to load more results on demand, but otherwise save resources if not requested? or is that a budget limitation? I know when you search "the" for example only the most recent vod comes up, so there's clearly a parse limit.
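
For illustration, the "page 1 of x" idea can be sketched with a LIMIT/OFFSET query. This is only a sketch under assumptions: the thread never says how the archive stores or queries its transcripts, so the SQLite table `lines`, its columns, and the helpers `build_demo_db` and `search_page` are all hypothetical.

```python
import sqlite3


def build_demo_db(lines):
    """Create an in-memory table of transcript lines (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lines (id INTEGER PRIMARY KEY, video_id TEXT, text TEXT)"
    )
    conn.executemany("INSERT INTO lines (video_id, text) VALUES (?, ?)", lines)
    return conn


def search_page(conn, term, page=1, page_size=20):
    """Return one page of matches (newest rows first) and the total page count."""
    pattern = f"%{term}%"  # note: % and _ inside term act as LIKE wildcards
    total = conn.execute(
        "SELECT COUNT(*) FROM lines WHERE text LIKE ?", (pattern,)
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT video_id, text FROM lines WHERE text LIKE ? "
        "ORDER BY id DESC LIMIT ? OFFSET ?",
        (pattern, page_size, (page - 1) * page_size),
    ).fetchall()
    total_pages = -(-total // page_size)  # ceiling division
    return rows, total_pages
```

Only the requested page is fetched, so a very common word like "the" costs one small query per page instead of returning every match at once.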

#

i'd also like to suggest a calendar date picker ui to autofill the date range in Search Options. it's not necessary but it's a nice quality of life feature

#

whether or not my suggestions are feasible/something you want to implement, appreciate the work you've done neuroHeart

solemn grove
west plover
#

is there any way to make searching numbers as the number and the word bring up the same result? like in your example on the home page searching “ten tin cans” brings up the miyune minecraft collab, but “10 tin cans” does not. ReallyInnocent
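
One possible way to get this behavior, sketched here as an assumption rather than a description of how the site works: normalize digits to words in both the query and the transcript text before matching. The helper names `number_to_words` and `normalize_query` are hypothetical, and the sketch only covers whole numbers from 0 to 99.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def normalize_query(text: str) -> str:
    """Lowercase the text and replace standalone digit runs below 100 with words."""
    def spell(match: re.Match) -> str:
        n = int(match.group())
        return number_to_words(n) if n < 100 else match.group()

    return re.sub(r"\b\d+\b", spell, text.lower())
```

With this, "10 tin cans" and "ten tin cans" both normalize to the same string, so they would hit the same transcript lines. Hyphenated spellings such as "forty-two" would need an extra step (for example replacing hyphens with spaces) to line up.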

solemn grove
west plover
#

appreciated

solemn grove
#

I'm excited to announce that the archive is now complete with 422 videos transcribed from the official and unofficial VOD channels!

Thought I'd share some interesting findings with the transcriptions:

  • The model refuses to recognize the word Vedal (even with hot word priming), which I've bandaided with a quick and dirty find + replace for the 20+ ways it is being misspelled
  • Hallucinations are worse for older VODs, seems to be correlated to Neuro's schizo level (if you do come across them feel free to DM me)
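
A find + replace pass like the one described in the first bullet could look roughly like the sketch below. The variant spellings are made-up placeholders, since the actual list of 20+ misspellings is not given in the thread, and `fix_name` is a hypothetical helper.

```python
import re

# Placeholder misspellings for illustration only; the real list is not in the thread.
VARIANTS = ["Vidal", "Veedle", "Vettel"]

# Match any variant as a whole word, ignoring case.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(v) for v in VARIANTS) + r")\b",
    re.IGNORECASE,
)


def fix_name(text: str) -> str:
    """Replace every known misspelling with the canonical spelling."""
    return PATTERN.sub("Vedal", text)
```

The word boundaries keep the pattern from rewriting longer words that merely start with one of the variants.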

I hope to keep the archive up to date as long as there are active users, lest it be... abandoned.

west plover
#

i'll use it as long as I am active on the wiki and need sources

#

it's very useful for that

#

i'd have no way to prove mini was still neuro's childhood friend in her new life without rewatching every mini collab if this project didn't exist, it's a lifesaver