To practice writing some Gleam (my first project), I wrote a WebVTT parser and payload tokenizer! There are still quite a few things to be done, but I wanted to get early feedback, as I'm used to writing Elixir and wasn't sure about some patterns.
#glubs - Subtitle parser and tokenizer
@short plover I just looked at your TOML parser and noticed that you worked with string.graphemes and then matched on the list elements! What's the reason for that? Do you think that would also be better for the WebVTT parser/tokenizer? I worked a lot with binary split and pattern matching: https://github.com/philipgiuliani/glubs/blob/main/src/glubs/webvtt.gleam#L129
String matching like that is currently slow on JS runtimes due to how strings are implemented, and I wanted it to be faster there, even if that means it's slower on Erlang.
What's best I'm not sure. I haven't done extensive research or any proper benchmarking
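For anyone following along, here is a minimal sketch of the two styles being compared. The function names are hypothetical and not taken from glubs or the TOML parser, and it assumes the stdlib function meant above is `string.to_graphemes`. It only shows the shape of each approach and makes no claim about which is faster on a given target.

```gleam
import gleam/string

// Style 1: pattern match on a string prefix directly.
// (Hypothetical helper, not part of glubs.)
pub fn strip_header_prefix(line: String) -> Result(String, Nil) {
  case line {
    "WEBVTT" <> rest -> Ok(rest)
    _ -> Error(Nil)
  }
}

// Style 2: split the string into graphemes once, then match on list elements.
// (Hypothetical helper, not part of the TOML parser.)
pub fn strip_header_graphemes(line: String) -> Result(List(String), Nil) {
  case string.to_graphemes(line) {
    ["W", "E", "B", "V", "T", "T", ..rest] -> Ok(rest)
    _ -> Error(Nil)
  }
}
```

With the second style, the rest of the parser keeps working on the `List(String)` of graphemes rather than converting back to a string at every step.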
Ok thanks, sounds reasonable!
In my company we are working a lot with livestreaming and live subtitle transcription/editing! So the first thing that came to my mind was making a WebVTT parser 😄
I'm thinking about writing a WebVTT editor using Lustre afterwards 🤔
That's super cool!
I've just added support for the SRT format and published it on Hex! https://hexdocs.pm/glubs/index.html
WebVTT and SRT parser and serializer.