What should be a normal amount of loss per epoch for a text -> image model | Learn AI Together

Image-based generative AI has never been something I've focused on, but I started a very tangential side project after setting up the architecture for an LLM.

I did a very basic setup for the image generation portion: essentially just a basic U-Net of 2 layers that takes in patches and outputs an image (I'm trying to avoid a diffusion model right now). After "plateauing" (if I were to call it a plateau), the loss only goes down by 0.01 per epoch. My dataset size is 100, and I've tested the model in several situations: predicting the image using its own image as input, increasing the number of parameters, and both increasing and decreasing the number of generated text tokens. When I change the image size, it still hits the same plateau and the loss still only drops by 0.01 per epoch.
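For anyone comparing numbers: an absolute drop of 0.01 per epoch is hard to judge without knowing the scale of the loss, so one option is to look at the relative decrease instead. Below is a minimal sketch in plain Python, not tied to any framework. The function names, the `window` and `tol` defaults, and the loss values are all hypothetical and only there for illustration.

```python
def relative_improvements(losses):
    """Return the fractional loss decrease between each pair of consecutive epochs."""
    rel = []
    for prev, curr in zip(losses, losses[1:]):
        # Guard against division by zero if the loss ever reaches exactly 0.
        rel.append((prev - curr) / prev if prev != 0 else 0.0)
    return rel


def has_plateaued(losses, window=5, tol=0.01):
    """True if the mean relative improvement over the last `window` epochs is below `tol`.

    `window` and `tol` are arbitrary illustrative defaults, not recommended values.
    """
    rel = relative_improvements(losses)
    if len(rel) < window:
        return False  # not enough history to decide
    recent = rel[-window:]
    return sum(recent) / window < tol


# Hypothetical loss histories: both drop by 0.01 per epoch in absolute terms.
high_loss = [2.00 - 0.01 * i for i in range(10)]  # starts around 2.0
low_loss = [0.10 - 0.01 * i for i in range(6)]    # starts around 0.1

print(has_plateaued(high_loss))  # about 0.5% relative drop per epoch
print(has_plateaued(low_loss))   # 10% or more relative drop per epoch
```

With these made-up histories, the same 0.01 step counts as a plateau when the loss sits near 2.0 but as steady progress when it sits near 0.1, which is why the loss function and its current value matter when asking whether the rate is normal.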

Does anyone have any insight into whether this rate of loss decrease per epoch is somewhat normal for generative AI models?
