Questions tagged [reinforcement-learning]

Question 1

In fields such as game theory and reinforcement learning, it is standard to consider the regret-minimization strategy. I don't get the motivation for the definition. Yes, doing your best under worst-...

Question 2

I think R1 was born as an exploration of the AGI approach, and the design of the whole scheme is in line with Professor Sutton's philosophy of "searching + learning to find a meta-approach that ...

Question 3

I am studying from the MARL textbook by Albrecht, Christianos and Schäfer. They define a stochastic game in Sec 3.3 as the multi-agent version of an MDP. In Fig 3.3 (pg 50) they give an intuition for ...

Question 4

Though the reward is assigned by the environment, once the policy $\pi$ is fixed, the action probabilities $\pi(a|s)$ in each state are determined. However, this means that given different ...

Question 5

In my PhD, I will work with ML models. I will only use ready-made models as tools, but I want to delve deeper into Artificial Intelligence, not just to use ready-made models, but to ...

Question 6

Google DeepMind recently published a new paper which describes how they used reinforcement learning to discover faster sorting algorithms. A summary of the paper is here and the paper is here. It ...

Question 7

I am new to reinforcement learning, and recently came across the following issue. When implementing a multi-armed bandit algorithm, we assume we have k machines with reward probabilities [p_1,..., p_k]...
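For readers unfamiliar with the setup in this excerpt, here is a minimal sketch of a k-armed Bernoulli bandit with an epsilon-greedy learner. It is an illustration only, not the asker's code: the function name `run_epsilon_greedy`, the example probabilities, and the values of `epsilon` and `steps` are all assumptions. The true probabilities are used only by the simulated machines; the learner sees nothing but the sampled rewards.

```python
import random


def run_epsilon_greedy(probs, steps=5000, epsilon=0.1, seed=0):
    """Simulate an epsilon-greedy learner on a Bernoulli bandit.

    probs holds the true reward probabilities [p_1, ..., p_k]. They are
    hidden from the learner and used only to sample rewards.
    """
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k        # number of pulls per arm
    estimates = [0.0] * k   # running mean reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a uniformly random arm
        else:
            arm = max(range(k), key=lambda a: estimates[a])  # exploit

        # The environment pays 1 with probability probs[arm], else 0.
        reward = 1.0 if rng.random() < probs[arm] else 0.0

        # Incremental update of the sample mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


# Hypothetical example with three machines.
counts, estimates = run_epsilon_greedy([0.2, 0.5, 0.8])
```

With enough steps the entries of `estimates` should approach the hidden probabilities for arms that are pulled often, which is the sense in which the learner "discovers" them.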

Question 8

There are various meta-learning algorithms in RL that are proposed for settings where we have a (deep) neural network and the policy (or the value function) is parameterized as such. Can these methods ...

Question 9

I read this article, which mentions that either here or Stack Overflow would be the best place to ask generic machine learning questions; however, if the question isn't programming specific with a ...

Question 10

I'm writing an AI based on reinforcement learning to play Connect 4. This is my second bot and my second attempt at RNNs and AI (the first was a copy of a Snake RNN AI from YouTube), and I'm looking for some ...

Question 11

Are there any free/open-source environments, tasks, or datasets for evaluating deep RL algorithms in terms of safety? All available environments (like OpenAI's) are environments for simple games. These ...

Question 12

I am currently a Ph.D. student in the computer science department, and I was given the subject of Deep RL for Healthcare. However, after lots of research on the internet, I could not find any free dataset ...

Question 13

Recently, I had an idea for a novel Deep RL algorithm that might perform better than existing algorithms such as DQN, TRPO, PPO, etc. However, I do not know of a website or a research paper that might ...

Question 14

Note: I consider myself to be a beginner in the field of Deep RL. Deep RL has shown tremendous success in recent years, such as playing Atari and beating the Go champion. Therefore, considerable interest for ...

Question 15

I was watching a video on Reinforcement Learning by Andrew Ng, and at about minute 23 of the video he mentions that we can represent the Bellman equation as a linear system of equations. I am talking ...
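As a rough illustration of the idea mentioned in this excerpt: for a fixed policy, the Bellman equation $V = R + \gamma P V$ is linear in $V$, so it can be rearranged to $(I - \gamma P) V = R$ and solved directly. The sketch below assumes a small tabular problem where `P` is the state-transition matrix under the policy and `R` is the expected one-step reward per state; the two-state example and all names are hypothetical, and this is not necessarily the formulation used in the video. In practice a routine such as `numpy.linalg.solve` would replace the hand-written solver.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b], so row operations act on both sides.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]

    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def evaluate_policy(P, R, gamma):
    """Return V solving (I - gamma * P) V = R for a fixed policy."""
    n = len(R)
    A = [[(1.0 if i == j else 0.0) - gamma * P[i][j] for j in range(n)]
         for i in range(n)]
    return solve_linear_system(A, R)


# Hypothetical two-state chain: state 0 pays reward 1 and moves to the
# absorbing, zero-reward state 1 with probability 0.5.
P = [[0.5, 0.5],
     [0.0, 1.0]]
R = [1.0, 0.0]
V = evaluate_policy(P, R, gamma=0.9)
```

For $\gamma < 1$ and a row-stochastic `P`, the matrix $I - \gamma P$ is invertible, so the system has a unique solution.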

Why does it make sense to minimize regret?

What is the design concept behind DeepSeek-R1?

In multi-agent RL stochastic games (multi-agent version of an MDP), why does the Nash equilibrium policy only depend on the state, not history?

In reinforcement learning, does policy affect the maximization of the value?

Main subjects to learn Artificial Intelligence in CS

DeepMind AlphaDev: How did it use Reinforcement Learning to reduce the search space?

Multi-Armed Bandit - Reward Probabilities

Tabular Meta-Learning in RL

Reinforcement Learning Reward Function for Optimizing Golf Aim?

Long and short memory in reinforcement learning Connect 4 AI

Evaluating the safety of deep RL algorithms

Deep RL for healthcare: existing benchmarking datasets or environments

Where to find the current state of the art performance of Deep RL algorithms?

What makes Deep RL "fundamentally/mathematically" advantageous?

How to set up the Bellman equation as a linear system of equations
