Myna: Small Music Representation Model | EleutherAI | Page 1

rigid oxide Oct 28, 2023, 11:05 PM

If anyone wants to join the project - write something here!!

twilit ruin Oct 28, 2023, 11:32 PM

I don't think I'm in a position to pick up more projects at this moment, but I'm interested in seeing this move forward

It seems you and the other collaborator have already done some work - maybe you could share more details about the project plan at a high level, including concrete things potential contributors could look at/do?

rigid oxide Oct 30, 2023, 5:07 PM

I would love to share some details! We are training with a contrastive objective, and the model would probably benefit a lot just from scaling up and a few architecture improvements. We are currently using a variant of Minz Won's self-attention architecture (https://arxiv.org/abs/1906.04972) as the base model. I am a bit hesitant to share more details publicly, since we've been working on this for roughly a year (I don't know how this usually goes in the research world).
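
For anyone unfamiliar with contrastive training, here is a minimal sketch of what such an objective can look like. It assumes an InfoNCE-style loss over pairs of embeddings from two views of the same clip. The thread does not say which loss the project actually uses, so the function names, the temperature value, and the pairing scheme are all illustrative assumptions rather than the project's method.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss (hypothetical helper, not from the thread).

    anchors[i] and positives[i] are assumed to be embeddings of two views
    of the same audio clip. Every other row in `positives` acts as a
    negative for anchors[i].
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, a_i in enumerate(a):
        # Cosine similarity of anchor i with every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(a_i, p_j)) / temperature for p_j in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching pair (index i) as the target.
        total += log_denom - logits[i]
    return total / len(a)


if __name__ == "__main__":
    views_a = [[1.0, 0.0], [0.0, 1.0]]
    matched = [[1.0, 0.0], [0.0, 1.0]]
    mismatched = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:   ", info_nce_loss(views_a, matched))
    print("mismatched pairs:", info_nce_loss(views_a, mismatched))
```

The loss is small when each anchor is closest to its own positive and large when it is closer to another clip's embedding. In this formulation larger batches supply more negatives per anchor, which is one common reason contrastive setups tend to benefit from more GPUs.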

As for contributors and what you could do:

- We could definitely benefit from some architecture changes to make the model more capable
- There are definitely parts in my methodology that could be improved
- We should evaluate on the MARBLE benchmark (https://marble-bm.shef.ac.uk/) - we still need to write the evaluation code for this
- General ideas to improve performance. Everyone else in the field is training on at least 16 GPUs right now and I have an RTX 4080. While results are good for only a single GPU, I wonder if scaling it up correctly will yield a competitive model that is an order of magnitude smaller and faster.

Would love to hear your thoughts!

arXiv.org

Toward Interpretable Music Tagging with Self-Attention

