#Titan: New Architecture by Google

tulip grail
#

https://arxiv.org/abs/2501.00663
This seems like a big step at first glance!

"We observe that our Titan architecture outperforms all modern recurrent models as well as their hybrid variants (combining with sliding-window attention) across a comprehensive set of benchmarks"

#

gulp!

nocturne relic
#

Sounds like this could be big. It seems to perform better with larger models. But we've also seen some earlier architectures (like Samba) that didn't really scale that well in the end. I wonder how this performs on truly large models.

nocturne relic
#

Who thinks this is part of the magic behind Gemini's insane context length?

reef egret
#

Quite possible

#

Man, we need good news

tulip grail
#

Yeah, that's been bugging me too. Then again, they also released the Transformer paper. There you could argue they weren't aware of its importance, though, which shouldn't apply now.

tough radish
#

Google isn't a single coherent organization. Teams can operate with various degrees of autonomy, so if there wasn't an order from above stating that specific types of research cannot be published, the research teams want to publish.