Timeline for Alternative approach for Q-Learning
Current License: CC BY-SA 4.0
3 events
| when | what | action | by | license | comment |
|---|---|---|---|---|---|
| Dec 12, 2023 at 18:48 | comment | added | BitTickler | | IMHO (99% sure), you do not even need the Bellman equation and complicated updates for tic-tac-toe-style problems. If you stay table-based (roughly 5,000 states for tic-tac-toe), you can simply update the Q-table from back to front (from the last move of the episode to the first) with a simple Q(s_t, a) = max(Q(s_{t-1}, a), R). |
| Nov 5, 2019 at 11:15 | review | First posts | | | Completed Nov 5, 2019 at 13:46 |
| Nov 5, 2019 at 11:12 | history | answered | Abdul Rehman | CC BY-SA 4.0 | |
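The table-based backward update BitTickler's comment describes might be sketched as follows. This is only one plausible reading of the comment, not code from the source: the Q-table, the `(state, action)` episode representation, and the discount factor `gamma` are all illustrative assumptions, and the max keeps earlier estimates from being lowered, as the comment's `Q(s_t, a) = max(Q(s_{t-1}, a), R)` suggests.

```python
# Hedged sketch of the backward, table-based update suggested in the
# comment: after an episode ends with reward R, walk the (state, action)
# pairs from the last move to the first, propagating a discounted value
# backward and taking a max so existing entries are never decreased.
# All names (Q, episode, gamma) are illustrative, not from the source.

def backward_update(Q, episode, final_reward, gamma=0.9):
    """Update a dict-based Q-table in place, last move first."""
    value = final_reward
    for state, action in reversed(episode):
        Q[(state, action)] = max(Q.get((state, action), 0.0), value)
        value *= gamma  # discount as we move toward the first move

# usage: one won tic-tac-toe episode (states abbreviated as labels)
Q = {}
episode = [("s0", 4), ("s1", 0), ("s2", 8)]  # (state, action) pairs
backward_update(Q, episode, final_reward=1.0)
print(Q[("s2", 8)])  # → 1.0 (the last move receives the full reward)
```

With a small state space like tic-tac-toe's, a plain dictionary suffices as the table, and no learning rate or bootstrapped target is needed: each episode's outcome is written back directly.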