I am trying to use the PPO algorithm to train a novel robotic manipulator to reach a target position in its workspace. What should I include in the observation vector that serves as the input to the control policy? Obviously I should include the relevant states, such as the current manipulator configuration (joint angles).
However, I have concerns about including the following two states in the observation vector:
1) The position of the end effector, which can be readily calculated from the joint angles. This is confusing because the end-effector position is an important state: it is used to compute the distance between the end effector and the goal position, to determine the reward, and to terminate the episode on success. But can I simply exclude the end-effector position from the observation vector, since it is fully determined by the joint angles? Does including both the joint angles and the joint-angle-dependent end-effector position create redundancy?
2) The position of the obstacle. The obstacle position is also an important state: it is used to detect collisions between the manipulator and the obstacle, to apply a penalty when a collision is detected, and to terminate the episode on collision. But can I simply exclude the obstacle position from the observation vector, since the obstacle stays fixed throughout the entire learning process? I will not change its position at all. Is including the obstacle in the observation vector necessary?
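To make the question concrete, here is a minimal sketch of the two observation variants I am comparing. It assumes a planar 2-link arm purely for illustration; the link lengths, the `forward_kinematics` helper, and the flag names are all hypothetical, not my actual robot:

```python
import numpy as np

def forward_kinematics(joint_angles, link_lengths=(1.0, 1.0)):
    """End-effector (x, y) for a hypothetical planar 2-link arm."""
    l1, l2 = link_lengths
    q1, q2 = joint_angles
    x = l1 * np.cos(q1) + l2 * np.cos(q1 + q2)
    y = l1 * np.sin(q1) + l2 * np.sin(q1 + q2)
    return np.array([x, y])

def build_observation(joint_angles, goal, obstacle,
                      include_ee=True, include_obstacle=True):
    """Assemble the policy input; the flags toggle the two debated parts."""
    parts = [np.asarray(joint_angles), np.asarray(goal)]
    if include_ee:
        # Redundant with joint_angles: derivable via forward kinematics.
        parts.append(forward_kinematics(joint_angles))
    if include_obstacle:
        # Constant across the whole training run in my setup.
        parts.append(np.asarray(obstacle))
    return np.concatenate(parts)

q = np.array([0.3, -0.5])
goal = np.array([1.2, 0.4])
obstacle = np.array([0.8, -0.2])

obs_full = build_observation(q, goal, obstacle)        # 8-dimensional
obs_min = build_observation(q, goal, obstacle,
                            include_ee=False,
                            include_obstacle=False)    # 4-dimensional
```

So the question is essentially whether the policy should see `obs_full` or `obs_min`, given that the extra entries are either derivable or constant.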
Lastly, if I keep the observation vector as small as possible (dropping the derivable and the fixed information), does that make my training process easier or more efficient?
Thank you in advance.