#TinyStories: How Small Can Language Models Be and Still Speak Coherent English?

normal flame

https://arxiv.org/pdf/2305.07759.pdf

In the current environment, May 2023 is a long time ago 😂, but I've been pondering this paper for a while. There are some (I think) unsupported claims, but also some really fascinating insights into what transformers are learning and why, along with an interesting dataset. Is anyone who read this still thinking about it? I'd be interested in everybody's take.

tough pebble
normal flame

Oh yeah, there are a number of issues with the dataset, premise, and conclusions. I think the main takeaways for me were the concept of the task itself (i.e., short stories in very basic English, where completion is the goal) and the analysis of what transformers learn across different model sizes.