Model Adaptation with Diffusion | EleutherAI

gloomy crater Mar 13, 2024, 12:36 AM

#

ohhh

#

@latent dagger I forget, are you a jax user at all?

#

if so, I'm willing to help out on code for this

#

as in give advice/debugging support, I don't have bandwidth to do the bulk of the work

#

I think it's pretty simple to implement this in jax if you have a diffusion model w/ the appropriate output shape

latent dagger Mar 13, 2024, 12:53 AM

#

gloomy crater @latent dagger I forget, are you a jax user at all?

I am a PyTorch user.

#

I'm also preparing papers for NeurIPS and have some other deadlines coming up, but I can start later.

silk jasper Jun 7, 2024, 7:24 PM

#

Is this project still active? I’d love to contribute to this.

latent dagger Jun 7, 2024, 8:06 PM

#

Now that I am done with NeurIPS, I'll start working on this.

silk jasper Jun 9, 2024, 5:26 PM

#

Great, where can we discuss? Here? And is there any code that I can start with?

kindred leaf Jun 11, 2024, 5:47 PM

#

I am also interested in contributing to this project if it's still active.

dreamy comet Jun 11, 2024, 7:42 PM

#

@latent dagger is your goal to only reproduce their results and extend to a wider variety of tasks? What are your expected outcomes from doing this?

#

In any case, I have a set of close to a thousand checkpoints of different architectures for fine-grained image recognition on a wide variety of datasets. I expect to release them along with a paper, hopefully in the next few weeks. Do you think this would help with your task?

latent dagger Jun 11, 2024, 8:15 PM

#

dreamy comet @latent dagger is your goal to only reproduce their results and extend to...

We aren't interested in reproducing their results. Instead, we want to use diffusion models as hypernetworks for model adaptation.

We can approach this problem in different ways. One challenge is generating the weight data for the diffusion model (which you might be able to help with). However, I think we can avoid the requirement of weight data and train these models end-to-end. Another thing to consider is which architectures/problems to test this on.

Since there appears to be some interest, I will write a more detailed plan this weekend—if anyone has any ideas or suggestions, please share them here.
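To make the contrast in the message above concrete: training on weight data needs a dataset of trained checkpoints, whereas end-to-end training only needs task data, because the task loss is backpropagated through the generated weights into the generator. Below is a minimal NumPy sketch of the end-to-end route under loud assumptions. A plain linear map `G` stands in for the diffusion model, the target model is a linear regressor, and the two tasks and their embeddings `z` are made up for illustration. None of this comes from the proposal discussed in this thread.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 3, 2, 64  # target-weight dim, task-embedding dim, samples per task

# Two hypothetical tasks: (task embedding z, ground-truth weights).
# The true weights are used only to synthesize (X, y); the generator never sees them.
task_specs = [
    (np.array([1.0, 0.0]), np.array([1.0, -2.0, 0.5])),
    (np.array([0.0, 1.0]), np.array([-1.0, 0.5, 2.0])),
]
tasks = []
for z, w_true in task_specs:
    X = rng.normal(size=(n, d))
    tasks.append((z, X, X @ w_true))

# Stand-in generator (an assumption, NOT a diffusion model): w = G @ z
G = np.zeros((d, k))

def task_loss(G, z, X, y):
    w = G @ z  # generated weights of the target model
    return np.mean((X @ w - y) ** 2)

def grad_G(G, z, X, y):
    w = G @ z
    dL_dw = 2.0 * X.T @ (X @ w - y) / len(y)  # gradient w.r.t. the generated weights
    return np.outer(dL_dw, z)                 # chain rule through w = G @ z

initial = sum(task_loss(G, *t) for t in tasks)
for step in range(500):
    for t in tasks:
        G -= 0.05 * grad_G(G, *t)  # only task data drives the update
final = sum(task_loss(G, *t) for t in tasks)
```

The point of the sketch is only the gradient path: the generator is updated from the downstream task loss, so no checkpoint dataset is required. A real diffusion-based generator would replace `G @ z` with a sampling procedure, which this example does not attempt to model.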

kindred leaf Jun 18, 2024, 11:36 AM

#

@latent dagger if you are still planning to continue with this project, then I guess, if everyone agrees, we can schedule a meeting to discuss your idea further?

latent dagger Jun 18, 2024, 11:57 AM

#

kindred leaf @latent dagger if you are still planning to continue with this project, ...

Yes. I wrote a mini proposal; I will post it when I have time, but I am happy to schedule a meeting or discuss here.

kindred leaf Jul 1, 2024, 6:54 PM

#

@latent dagger any updates?

rugged warren Nov 24, 2024, 11:56 PM

#

This has some surface similarities to ideas I had a few years ago about building an artificial analogue of the LC-NE complex in the human brain for rapid task adaptation. I chose the system as a relatively tractable-looking case where we might be able to build artificial neurotransmitters. My interpretation was that the LC-NE optimizes the covariance of errors in subcircuits of tasks in order to balance between exploration and exploitation.

#

But I was being pretty hasty at the time and I'm not confident my understanding of the neuroscience was right; I couldn't find anyone using the word covariance. But the idea is that you want to hedge so that if subcircuit 1 is wrong, subcircuit 2 is wrong in a way that counterbalances it, and how aggressively you want to exploit the covariance changes depending on whether you're already good at a task or bad at it.

rugged warren Nov 25, 2024, 12:21 AM

#

📎 lcne05arn.pdf

#

https://pmc.ncbi.nlm.nih.gov/articles/PMC6334975/

PubMed Central (PMC)

Locus Coeruleus tracking of prediction errors optimises cognitive f...

The locus coeruleus (LC) in the pons is the major source of noradrenaline (NA) in the brain. Two modes of LC firing have been associated with distinct cognitive states: changes in tonic rates of firing are correlated with global levels of arousal ...

dusk fossil Jan 10, 2025, 7:17 AM

#

https://arxiv.org/abs/2402.18153

arXiv.org

Diffusion-Based Neural Network Weights Generation

Transfer learning has gained significant attention in recent deep learning research due to its ability to accelerate convergence and enhance performance on new tasks. However, its success is often contingent on the similarity between source and target data, and training on numerous datasets can be costly, leading to blind selection of pretrained...

#

related paper that came out around the same time

#

i wonder if you could condition the diffusion model on, say, 1-10 images, and have it output a LoRA of the character in the images

#

could be a way to generate much more data-efficient LoRAs
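As a rough illustration of what "output a LoRA" would mean mechanically: the generator would emit one flat vector per adapted layer, which is then reshaped into two low-rank factors and added to the frozen base weight. The sketch below uses NumPy, with a random vector standing in for the diffusion model's output. The helper names `unpack_lora` and `apply_lora`, the shapes, the rank, and the `alpha / rank` scaling are illustrative assumptions, not anything specified in this thread.

```python
import numpy as np

def unpack_lora(flat, d_out, d_in, rank):
    """Split a flat generated vector into LoRA factors B (d_out x rank) and A (rank x d_in)."""
    expected = rank * (d_out + d_in)
    if flat.shape != (expected,):
        raise ValueError(f"expected {expected} values, got shape {flat.shape}")
    B = flat[: d_out * rank].reshape(d_out, rank)
    A = flat[d_out * rank:].reshape(rank, d_in)
    return B, A

def apply_lora(W, B, A, alpha=1.0):
    """Return the adapted weight W + (alpha / rank) * B @ A; W itself is not modified."""
    rank = A.shape[0]
    return W + (alpha / rank) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, rank = 8, 16, 2
W = rng.normal(size=(d_out, d_in))             # frozen base weight
flat = rng.normal(size=rank * (d_out + d_in))  # placeholder for the generator's output
B, A = unpack_lora(flat, d_out, d_in, rank)
W_adapted = apply_lora(W, B, A)

n_generated = flat.size  # values the generator must produce for this layer
n_full = W.size          # values in the full weight matrix
```

In this toy layer the generator would have to produce 48 values instead of the 128 in the full matrix, which is one reason a low-rank adapter may be an easier output space for a weight-generating model than full weights.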

latent dagger Jan 10, 2025, 10:59 AM

#

Yes. This would be similar to the original idea.

dreamy comet Jan 24, 2025, 3:11 PM

#

https://weight-space-learning.github.io/
This whole workshop is based around this topic in case anyone is interested
Proposal PDF with more technical details on the topic:
https://openreview.net/pdf?id=Bz6wEdobY7

ICLR 2025 Workshop on Weight Space Learning

Overview

Neural Network Weights as a New Data Modality
