Hello guys, I'm training an agent on https://gymnasium.farama.org/environments/classic_control/mountain_car/
For reward shaping I use:
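    @staticmethod
    def potential(s):
        # s[0] is the position, s[1] is the velocity
        return 10 * abs(s[0] + 0.5) + 1000 * s[1]**2

    def shape_reward(self, reward, sp, s):
        # potential-based shaping: r + gamma * phi(s') - phi(s)
        return reward + self.gamma * self.potential(sp) - self.potential(s)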
The problem is that this shaping doesn't work at all, and using only the env reward makes training impossible.
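
In case it helps to reproduce, here is a minimal standalone sketch of the same shaping outside my agent class. The value `GAMMA = 0.99` and the example transition are assumptions made only for illustration; the state is taken to be `(position, velocity)`, as in the Mountain Car observation.

```python
GAMMA = 0.99  # assumed discount factor, chosen only for this example


def potential(s):
    """Potential from the post; s = (position, velocity)."""
    return 10 * abs(s[0] + 0.5) + 1000 * s[1] ** 2


def shape_reward(reward, sp, s, gamma=GAMMA):
    """Potential-based shaping: r + gamma * phi(s') - phi(s)."""
    return reward + gamma * potential(sp) - potential(s)


# Hypothetical transition: the car leaves the valley bottom and gains speed.
s = (-0.5, 0.0)
sp = (-0.4, 0.02)
shaped = shape_reward(-1.0, sp, s)  # env reward is -1 per step
print(shaped)
```

With these numbers the shaped reward comes out positive even though the env reward is -1, so moving away from position -0.5 or gaining speed is rewarded at that step.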