#Is anyone building with non-Transformer models (e.g. SSMs)?
Probably a lot of smaller startups. Smaller, more efficient models are a use case anyone can get behind.
Agreed! I'm trying to see what the next "transformer" is going to be. It's crazy that the architecture is relatively new and now it's everywhere. I wonder if the next big thing is going to grow to an even larger scale.
SSMs seem interesting. Consider implementing vision and language models with SSMs that are on par with existing foundational transformer models (GPT-2, BERT, ViT, T5, etc.).