#Tool to evaluate LLMs on summarization


vast tapir Feb 14, 2024, 9:26 AM

I need a tool to evaluate LLMs on a document summarization dataset. Do you know of a tool that takes the dataset as JSON, runs the models on it, and calculates BLEU scores?
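
For reference, here is a rough sketch of what such a tool would do, written in plain Python so nothing needs installing. Everything named here is an assumption for illustration: the JSON is assumed to be a list of records with hypothetical `document` and `summary` fields, `summarize` is a placeholder for whatever calls the model, and tokenization is plain whitespace splitting. It computes an unsmoothed corpus-level BLEU with one reference per example, so scores will not match an established BLEU implementation exactly.

```python
import json
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(candidates, references, max_n=4):
    """Unsmoothed corpus BLEU, one reference per candidate (token lists)."""
    matches = [0] * max_n
    totals = [0] * max_n
    cand_len = ref_len = 0
    for cand, ref in zip(candidates, references):
        cand_len += len(cand)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            cand_counts = ngrams(cand, n)
            ref_counts = ngrams(ref, n)
            # Clipped matches: an n-gram counts at most as often as in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
            totals[n - 1] += sum(cand_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if cand_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes candidates shorter than the references.
    brevity = 1.0 if cand_len > ref_len else math.exp(1 - ref_len / cand_len)
    return brevity * math.exp(log_precision)


def evaluate(dataset_json, summarize):
    """Run `summarize` (hypothetical model wrapper) over the dataset and score it.

    Assumes records shaped like {"document": ..., "summary": ...}.
    """
    records = json.loads(dataset_json)
    candidates = [summarize(r["document"]).lower().split() for r in records]
    references = [r["summary"].lower().split() for r in records]
    return corpus_bleu(candidates, references)
```

Usage would look like `evaluate(open("data.json").read(), my_model_fn)`, where both the file name and `my_model_fn` are placeholders. BLEU only measures n-gram overlap with the reference, so it is a coarse signal for summaries.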

vast hearth Feb 15, 2024, 6:23 AM

@turbid shadow do you think we can use embeddings here to evaluate summaries? What if we build a dataset containing 10 summaries of one piece of content, each worse than the last, and plot the cosine similarity? Do you think that could yield a result?

Especially with contextualised token embeddings?
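
A minimal sketch of that experiment, to make the idea concrete. The big assumption: `embed` here is only a bag-of-words stand-in so the snippet runs with the standard library, not a contextualised embedding. In a real test it would be replaced by a call to an actual embedding model, and `similarity_curve` is a hypothetical helper name.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding (ASSUMPTION): sparse word-count vector.

    Replace with a real sentence or token embedding model for the actual test.
    """
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_curve(content, summaries):
    """Similarity of each summary to the source content, in the given order."""
    source_vec = embed(content)
    return [cosine(source_vec, embed(s)) for s in summaries]
```

If the summaries are passed in from best to worst, the returned list is what would go on the plot. A curve that drops steadily would suggest the similarity tracks quality; a flat or noisy one would suggest it does not.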