reward shaping | Learn AI Together | Page 1

Hello guys, I'm training an agent on https://gymnasium.farama.org/environments/classic_control/mountain_car/

As reward shaping, I use:

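@staticmethod
def potential(s):
    # s[0] is the car position, s[1] is the car velocity
    return 10 * abs(s[0] + 0.5) + 1000 * s[1] ** 2

def shape_reward(self, reward, sp, s):
    # potential-based shaping: r + gamma * phi(s') - phi(s), with sp = next state
    return reward + self.gamma * self.potential(sp) - self.potential(s)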

The problem is that it doesn't work at all, and using only the env reward makes training impossible.
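
For anyone who wants to check the shaping term by hand, here is a minimal standalone sketch of the same potential, outside any agent class. The `GAMMA` value, the `terminated` flag, and the two sample states are assumptions made for illustration and are not from my training code. Setting the potential of a terminal state to zero is a common convention for potential-based shaping, not something the snippet above does.

```python
GAMMA = 0.99  # assumed discount factor; the actual value is not given in the post


def potential(s):
    """Potential from the post: s[0] is position, s[1] is velocity."""
    return 10 * abs(s[0] + 0.5) + 1000 * s[1] ** 2


def shape_reward(reward, s, sp, terminated=False, gamma=GAMMA):
    """Potential-based shaping: r + gamma * phi(s') - phi(s).

    Using a zero potential for terminal next states is an assumption
    added here; the version in the post does not special-case termination.
    """
    next_phi = 0.0 if terminated else potential(sp)
    return reward + gamma * next_phi - potential(s)


# Hand-picked (position, velocity) pairs, not real environment output.
s = (-0.5, 0.0)    # at rest near the bottom of the valley
sp = (-0.4, 0.01)  # slightly up the right slope, moving right
shaped = shape_reward(-1.0, s, sp)
print(shaped)
```

Printing the shaped reward for a few transitions like this can show whether the bonus is the same order of magnitude as the env's -1 per step, or much larger or smaller.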


#reward shaping