@dchichkov

Does it make sense?

@danijar
Contributor

danijar commented Jan 12, 2019

Yes, this seems reasonable. Did you train an agent like this to see if it affects performance?

@dchichkov
Author

I've seen in my environment that _penalty does go to exactly zero, and as a result the "increase penalty" logic doesn't increase it. I haven't done enough runs to tell whether this affects performance.

It may well be that it doesn't, in which case a sensible change would be to stop wasting time computing the KL term once _penalty is zero!
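To illustrate why a penalty that has reached exactly zero stays there, here is a minimal sketch in plain Python. It assumes the "increase penalty" step is multiplicative; the factor 1.5, the function names, and the floor value are hypothetical and chosen only for illustration. The floored variant is just one possible guard, not necessarily what this change does.

```python
def increase_penalty(penalty, factor=1.5):
    """Assumed multiplicative update: scale the current penalty by `factor`."""
    return penalty * factor


def increase_penalty_with_floor(penalty, factor=1.5, floor=1e-8):
    """Same update, clamped from below so it can recover from zero.

    `floor` is a hypothetical value, not taken from the code under discussion.
    """
    return max(penalty * factor, floor)


stuck = 0.0       # penalty that has already reached exactly zero
recovering = 0.0  # same starting point, using the floored update

for _ in range(10):
    stuck = increase_penalty(stuck)                        # 0.0 * 1.5 is still 0.0
    recovering = increase_penalty_with_floor(recovering)   # floor lets it grow again

print("multiplicative only:", stuck)
print("with floor:", recovering)
```

With the purely multiplicative update, the penalty never leaves zero no matter how many times it is "increased", which matches the behaviour described above.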

