Timeline for Alternative approach for Q-Learning

Current License: CC BY-SA 4.0

3 events
when what by license comment
Dec 12, 2023 at 18:48 comment added BitTickler IMHO (99% sure), you do not even need the Bellman equation and complicated updates for tic-tac-toe kinds of problems. If you stay table-based (5000-ish states for tic-tac-toe), you can simply update the Q-table from back to front (from the last move to the first move of the episode) with a simple Q(st,a) = MAX(Q(st-1,a),R).
Nov 5, 2019 at 11:15 review First posts
Nov 5, 2019 at 13:46
Nov 5, 2019 at 11:12 history answered Abdul Rehman CC BY-SA 4.0
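
The back-to-front update described in BitTickler's comment could be sketched roughly as follows. This is only one reading of the formula in the comment: it assumes R is the single reward received at the end of the episode, and that each visited (state, action) entry keeps the larger of its stored value and R. The names `backward_update`, `q_table`, and `episode` are hypothetical and do not come from the original answer or comment.

```python
from collections import defaultdict


def backward_update(q_table, episode, reward):
    """Walk the episode from the last move to the first and keep,
    for each visited (state, action) pair, the larger of the stored
    value and the final reward (an assumed reading of the comment)."""
    for state, action in reversed(episode):
        key = (state, action)
        q_table[key] = max(q_table[key], reward)
    return q_table


# Hypothetical episode: (state, action) pairs in the order they were played.
q_table = defaultdict(float)  # unseen entries default to 0.0
episode = [("s0", 4), ("s1", 0), ("s2", 8)]
backward_update(q_table, episode, 1.0)
```

Because unseen entries default to 0.0 in this sketch, a negative final reward would leave them unchanged under the `max` rule, so the default value would need to be chosen with the reward scheme in mind.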