$\begingroup$

I am trying to use the PPO algorithm to train a novel robotic manipulator to reach a target position in its workspace. What should I include in the observation vector, which serves as the input to the control policy? Of course, I should include the relevant states, such as the current manipulator configuration (joint angles).

But I have concerns about the following two states/information for their inclusion in the observation vector:

1) Position of the end effector, which can be readily calculated from the joint angles. This is confusing because the end-effector position is important information: it is used to calculate the distance between the end effector and the goal position, to determine the reward, and to terminate the episode on success. But can I exclude the end-effector position from the observation vector, since it can be determined from the joint angles? Is including both the joint angles and the end-effector position that depends on them redundant?

2) Position of the obstacle. This is also important information: it is used to detect collisions between the manipulator and the obstacle, to apply a penalty when a collision is detected, and to terminate the episode in that case. But can I exclude the obstacle position from the observation vector, since the obstacle stays fixed throughout the learning process? I will not change the position of the obstacle at all. Is it necessary to include the obstacle in the observation vector?

Lastly, if I keep the observation vector as small as possible (dropping the dependent information and the fixed information), does that make the training process easier or more efficient?

Thank you in advance.

$\endgroup$

1 Answer

$\begingroup$

Every variable added to the state increases the input dimension, and with it the chances of running into the curse of dimensionality. So in general it behooves us to reduce the input size where possible.

One argument for including seemingly redundant information is that omitting it forces the network to discover the transformation itself (here, the forward kinematics from joint angles to end-effector position). It may never do this, or having to do so may slow down learning.
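To make the point about the hidden transformation concrete, here is a minimal sketch that assumes a planar two-link arm with unit link lengths. That arm, the link lengths, and the function names `forward_kinematics` and `observation_with_ee` are illustrative assumptions, not the manipulator from the question. The end-effector position is a nonlinear (trigonometric) function of the joint angles, and that is the mapping the policy network would have to learn implicitly if only the angles were observed.

```python
import numpy as np


def forward_kinematics(q, link_lengths=(1.0, 1.0)):
    """End-effector (x, y) of a hypothetical planar 2-link arm.

    q holds the two joint angles in radians; the second angle is
    measured relative to the first link.
    """
    l1, l2 = link_lengths
    x = l1 * np.cos(q[0]) + l2 * np.cos(q[0] + q[1])
    y = l1 * np.sin(q[0]) + l2 * np.sin(q[0] + q[1])
    return np.array([x, y])


def observation_with_ee(q):
    """Joint angles followed by the (redundant) end-effector position."""
    q = np.asarray(q, dtype=float)
    return np.concatenate([q, forward_kinematics(q)])
```

Supplying the precomputed position costs two extra inputs in this toy case; whether that trade is worthwhile for a real manipulator is something to check empirically.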

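For a totally static obstacle, including its position in the state may not be necessary: a constant input does not help distinguish one state from another, and the obstacle's presence can be learned from its effects (collision penalties and episode terminations). Alternatively, there may be an encoding of the information that is more useful to the learner, for example the obstacle's position relative to the end effector, or the distance between them, instead of its position in the world frame.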
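One way such an encoding could look is sketched below. The function name `build_observation`, its argument names, and the choice of features are hypothetical; the sketch assumes the end-effector, goal, and obstacle positions are already available as arrays in the same frame. Goal and obstacle are expressed as offsets from the end effector rather than as world-frame positions, and the obstacle term is optional so both variants can be compared in training.

```python
import numpy as np


def build_observation(joint_angles, ee_pos, goal_pos, obstacle_pos=None):
    """Assemble a flat observation vector (illustrative layout only).

    Layout: [joint angles, goal - ee, (obstacle - ee if given)].
    """
    joint_angles = np.asarray(joint_angles, dtype=float)
    ee_pos = np.asarray(ee_pos, dtype=float)
    goal_pos = np.asarray(goal_pos, dtype=float)

    parts = [joint_angles, goal_pos - ee_pos]
    if obstacle_pos is not None:
        # Relative offset changes as the arm moves, unlike the fixed
        # world-frame obstacle position, which is constant every step.
        parts.append(np.asarray(obstacle_pos, dtype=float) - ee_pos)
    return np.concatenate(parts)
```

For example, `build_observation([0.1, 0.2], [1.0, 0.0], [1.5, 0.5], obstacle_pos=[0.0, 1.0])` yields a six-element vector for a planar arm; leaving out `obstacle_pos` yields four elements.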

$\endgroup$
  • $\begingroup$ Thank you so much! Regarding your last point about a more useful encoding of the information (for example, distance to the end effector instead of position relative to the world): are you suggesting I include the position of the obstacle relative to the end effector as part of the observation vector? This sounds very interesting. $\endgroup$ Commented Mar 4 at 17:55
