Timeline for Alternative approach for Q-Learning

Current License: CC BY-SA 4.0

17 events
when what by license comment
Dec 12, 2023 at 18:44 comment added BitTickler There are also use cases where it is hard to come up with a reward for every step, but where it is natural to "know" the reward at the end of an episode. Having had that situation just the other day, I kept browsing the internet for pointers, and it did not yield much. So I hope the answers here will be valuable for anyone who runs into this situation.
Dec 9, 2022 at 16:07 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Aug 7, 2022 at 8:02 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Apr 3, 2022 at 4:07 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Dec 2, 2021 at 11:05 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Jul 31, 2021 at 14:03 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Mar 31, 2021 at 20:07 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Nov 30, 2020 at 4:03 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Aug 2, 2020 at 4:03 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Apr 3, 2020 at 21:02 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Dec 5, 2019 at 16:00 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Nov 5, 2019 at 11:12 answer added Abdul Rehman timeline score: 1
Jul 22, 2019 at 12:03 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Jun 22, 2019 at 11:55 history edited maurock CC BY-SA 4.0
added 472 characters in body
Jun 22, 2019 at 11:24 answer added Neil Slater timeline score: 1
Jun 22, 2019 at 10:45 comment added Djib2011 So basically like an off-policy Monte-Carlo approach?
Jun 22, 2019 at 8:23 history asked maurock CC BY-SA 4.0