#ways to make the most out of data for training


lime locust

I've been researching ways to make the most of limited training data, since this matters a lot for niche tasks where you have to collect your own data and there usually isn't much of it. When searching on this topic, I found there isn't much dedicated to it, since most researchers work with plentiful data.

So I'm hoping someone here knows some good tricks, or has read something about how to train models effectively on little data.
So far I know that, in general:

  1. you need to use fewer parameters
  2. more regularization, especially dropout (dropping whole layers, not individual elements)
  3. when it comes to images, you can augment them in many different ways
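
As a rough illustration of point 3, here is a minimal sketch of two common image augmentations (random horizontal flip and random crop after padding). It assumes a small square image stored as a plain list of rows; the helper names `hflip`, `random_crop`, and `augment` are made up for this example and do not come from any library.

```python
import random

def hflip(img):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in img]

def random_crop(img, size, rng, pad=1, fill=0):
    """Pad the image by `pad` pixels on every side, then cut a random size x size window."""
    width = len(img[0])
    blank = [fill] * (width + 2 * pad)
    padded = [blank[:] for _ in range(pad)]
    padded += [[fill] * pad + list(row) + [fill] * pad for row in img]
    padded += [blank[:] for _ in range(pad)]
    top = rng.randint(0, len(padded) - size)      # randint is inclusive on both ends
    left = rng.randint(0, len(padded[0]) - size)
    return [row[left:left + size] for row in padded[top:top + size]]

def augment(img, rng):
    """Flip with probability 0.5, then take a random crop of the original size.
    Assumes a square image, so len(img) is used as the crop size."""
    out = hflip(img) if rng.random() < 0.5 else [list(row) for row in img]
    return random_crop(out, size=len(img), rng=rng)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
views = [augment(image, rng) for _ in range(4)]  # four augmented views of one sample
```

Each training epoch would see a slightly different view of the same sample, which is the whole point when the dataset is small. In practice you would probably reach for an existing augmentation library rather than hand-rolling this.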
lime locust

GPT-4 also briefly mentioned the following, but didn't cite any sources:

  • share parameters, since shared parameters get used more often and so have to generalize better
  • train on multiple tasks at the same time using the same data

Would be great to see some actual papers dedicated to low-data tasks, with results.
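
To make the two suggestions above concrete, here is a toy sketch of multi-task training with a shared parameter. Everything in it is an assumption for illustration: one shared weight `w` feeds two task heads `a` and `b`, the targets are synthetic (`2x` and `-3x`), and the gradients are written out by hand so it runs without any ML library.

```python
# Same inputs are reused for both tasks.
xs = [-1.0, -0.5, 0.5, 1.0]
task1 = [2.0 * x for x in xs]    # synthetic targets for task 1
task2 = [-3.0 * x for x in xs]   # synthetic targets for task 2

def joint_loss(w, a, b):
    """Mean of the summed squared errors of both tasks."""
    total = 0.0
    for x, t1, t2 in zip(xs, task1, task2):
        h = w * x                              # shared representation
        total += (a * h - t1) ** 2 + (b * h - t2) ** 2
    return total / len(xs)

def train(steps=500, lr=0.05):
    """Plain full-batch gradient descent on the joint loss."""
    w, a, b = 1.0, 0.1, -0.1
    n = len(xs)
    for _ in range(steps):
        gw = ga = gb = 0.0
        for x, t1, t2 in zip(xs, task1, task2):
            h = w * x
            e1 = a * h - t1
            e2 = b * h - t2
            ga += 2 * e1 * h                   # head 1 only sees task 1
            gb += 2 * e2 * h                   # head 2 only sees task 2
            gw += 2 * (e1 * a + e2 * b) * x    # shared weight gets gradient from both tasks
        w -= lr * gw / n
        a -= lr * ga / n
        b -= lr * gb / n
    return w, a, b

initial_loss = joint_loss(1.0, 0.1, -0.1)
w, a, b = train()
final_loss = joint_loss(w, a, b)
print(initial_loss, final_loss)
```

The line that updates `gw` is the relevant part: the shared weight is pushed by both tasks on every sample, so it effectively sees twice as much supervision as either head. Whether that actually helps on a real low-data problem depends on how related the tasks are.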
lusty dock

ECG suffers from low-data regimes quite a bit


using contrastive learning (especially with multiple modalities) or adversarial learning also helps
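
For anyone unfamiliar with the contrastive part, here is a minimal sketch of an InfoNCE-style loss between two modalities. It is only an illustration under assumptions: the embeddings are hand-written toy vectors, the names `ecg_emb` and `report_emb` are hypothetical, and only one direction (first modality to second) is computed, whereas real setups often average both directions.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(za, zb, temperature=0.1):
    """Pair i in za should match pair i in zb; every other zb row is a negative."""
    za = [normalize(v) for v in za]
    zb = [normalize(v) for v in zb]
    total = 0.0
    for i, a in enumerate(za):
        logits = [sum(p * q for p, q in zip(a, b)) / temperature for b in zb]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[i]   # cross-entropy with the matching pair as the label
    return total / len(za)

# Toy embeddings: row i of each list is assumed to come from the same recording.
ecg_emb = [[1.0, 0.0], [0.0, 1.0]]
report_emb = [[0.9, 0.1], [0.2, 0.8]]

aligned = info_nce(ecg_emb, report_emb)
shuffled = info_nce(ecg_emb, report_emb[::-1])  # deliberately mismatched pairs
print(aligned, shuffled)
```

The loss is low when matching pairs are closer to each other than to the other samples in the batch, and high when the pairing is wrong. No labels are needed beyond knowing which two views belong together, which is why this is attractive when annotated data is scarce.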


check out Lemda