# Need a place to ask general ML-Agents related questions but cannot find one
This is the place
Have you tried to make a flying AI bot?
I am struggling with how to make it. @glad rose my AI fails to learn to navigate; whenever it finds food, it always ends up just going straight to it
but it fails to "search around" the maze
like if it sees a food, it just runs directly to it
but it never actually searches the maze
Have you tried including some negative rewards for finding things too quickly when training? Or adding a reward for exploring a given distance of the maze? Or is it just ignoring the maze and skipping to the end? Could be an issue with colliders or your observations in that case.
A few questions: how have you set up your agent, what is its action space, what does its observation space look like, which training algorithm are you using (PPO, SAC, etc.), how large is your network, what are your buffer and batch sizes, etc.? You might want to try posting your config file as well.
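A rough sketch of the "reward for exploring" idea, in plain Python rather than the ML-Agents API: the agent gets a small bonus the first time it enters each grid cell of the maze in an episode. The class name `ExplorationBonus`, the cell size, and the bonus value are all made up for illustration; in a real project the same logic would live in the agent script and feed whatever reward call it already uses.

```python
from typing import Set, Tuple


class ExplorationBonus:
    """Pays a small bonus the first time each grid cell is visited in an episode."""

    def __init__(self, cell_size: float = 1.0, bonus: float = 0.01) -> None:
        self.cell_size = cell_size  # width of one grid cell in world units (assumed)
        self.bonus = bonus          # reward for a newly visited cell (assumed)
        self.visited: Set[Tuple[int, int]] = set()

    def reset(self) -> None:
        """Call at the start of every episode so cells can be rewarded again."""
        self.visited.clear()

    def reward(self, x: float, z: float) -> float:
        """Return the bonus if (x, z) falls in an unvisited cell, otherwise 0."""
        cell = (int(x // self.cell_size), int(z // self.cell_size))
        if cell in self.visited:
            return 0.0
        self.visited.add(cell)
        return self.bonus


tracker = ExplorationBonus(cell_size=1.0, bonus=0.01)
first_visit = tracker.reward(0.2, 0.3)   # new cell, pays the bonus
same_cell = tracker.reward(0.8, 0.9)     # same cell again, pays nothing
```

Keeping the bonus well below the food reward is probably wise, otherwise the agent may prefer wandering over eating.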
Have you tried mixing Curiosity-based learning into your rewards? If your agent is rewarded purely on Extrinsic Reward coming from obtaining the food, then your agent has no incentive to explore the environment. You might try adding Curiosity (https://unity.com/blog/engine-platform/solving-sparse-reward-tasks-with-curiosity) or you can increase the Beta value of your PPO configuration - this increases the entropy of the policy and leads to more varied exploration, but I don't think you'll get as much exploration occurring as if you added an intrinsic motivation for novelty.
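To make the Beta point concrete, here is a small illustration (not ML-Agents code) of what policy entropy measures for a discrete action space. The function names and the example probabilities are invented for this sketch; the only assumption about the trainer is that Beta scales an entropy term added to the objective, so a larger Beta pushes harder against a policy that always picks the same action.

```python
import math
from typing import Sequence


def policy_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_bonus(probs: Sequence[float], beta: float) -> float:
    """Entropy term scaled by beta, as it would be added to the objective."""
    return beta * policy_entropy(probs)


# A policy that almost always goes straight vs. one that still tries all four moves.
go_straight = [0.97, 0.01, 0.01, 0.01]
undecided = [0.25, 0.25, 0.25, 0.25]

low = entropy_bonus(go_straight, beta=0.005)
high = entropy_bonus(undecided, beta=0.005)
```

The bonus is larger for the undecided policy, which is why raising Beta keeps actions more varied. It only randomizes action choice, though; it does not reward reaching new parts of the maze the way a curiosity signal does.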
Your environment sounds inherently sparse, so it also depends on how much reward you are giving the agent and on whether you are penalizing it for acting (in which case it had better get to the reward, and fast). Ultimately it sounds like your agent is learning, but you haven't incentivized the policy network to behave any differently, especially if you're expecting exploration to occur.
I would begin with an agent who has only Curiosity reward at a Strength of 1 and then try adding back in Extrinsic Reward values starting very low - around 0.001. Curiosity Reward doesn't get reported in the End of Episode Logs during training, so it can be hard to know the actual magnitude of the reward. (Keep in mind that a Curious Agent won't collect the food necessarily, but it should cover more area of the maze as long as the maze is not randomized each episode or something.)
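The staged setup described above could be sketched as a small Python helper that builds the reward-signal settings for each training run. This is an assumption-heavy illustration: the helper `build_reward_signals` is hypothetical, the key names (`curiosity`, `extrinsic`, `strength`, `gamma`) mirror the `reward_signals` section of recent ML-Agents YAML trainer configs and may differ in your version, the `gamma` of 0.99 is a placeholder, and reading the 0.001 as an extrinsic strength is one interpretation of the advice.

```python
from typing import Dict


def build_reward_signals(
    curiosity_strength: float = 1.0,
    extrinsic_strength: float = 0.0,
    gamma: float = 0.99,  # placeholder discount factor, not from the thread
) -> Dict[str, Dict[str, float]]:
    """Return reward-signal settings; extrinsic is omitted while its strength is 0."""
    if curiosity_strength < 0 or extrinsic_strength < 0:
        raise ValueError("strengths must be non-negative")
    signals = {"curiosity": {"strength": curiosity_strength, "gamma": gamma}}
    if extrinsic_strength > 0:
        signals["extrinsic"] = {"strength": extrinsic_strength, "gamma": gamma}
    return signals


# Stage 1: curiosity only. Stage 2: add a very small extrinsic signal back in.
stage_one = build_reward_signals(curiosity_strength=1.0)
stage_two = build_reward_signals(curiosity_strength=1.0, extrinsic_strength=0.001)
```

The resulting dictionaries would still need to be copied into the trainer config by hand or written out with a YAML library.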