Questions tagged [imitation-learning]

For questions related to imitation learning (IL), a reinforcement learning technique where a policy is learned from examples (represented as trajectories) of an (optimal) agent's behavior. IL is similar to inverse reinforcement learning (IRL), where a reward function is learned from examples of the (optimal) agent's behavior, which can then be used to solve the RL problem (i.e. find the policy).
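
As a rough illustration of "learning a policy from trajectories", here is a minimal sketch of behavioral cloning, one simple form of IL, in a tabular setting. It assumes each trajectory is a list of (state, action) pairs; the function name `behavior_cloning` and the corridor demonstrations are hypothetical and only serve the example.

```python
from collections import Counter, defaultdict


def behavior_cloning(trajectories):
    """Fit a tabular policy by majority vote over expert (state, action) pairs."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for state, action in trajectory:
            counts[state][action] += 1
    # For every state seen in the demonstrations, imitate the most frequent expert action.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical expert demonstrations on a 1-D corridor (states 0, 1, 2).
# The second action of the third trajectory is a deliberately "noisy" one.
expert_trajectories = [
    [(0, "right"), (1, "right"), (2, "right")],
    [(1, "right"), (2, "right")],
    [(0, "right"), (1, "left"), (0, "right"), (1, "right"), (2, "right")],
]

policy = behavior_cloning(expert_trajectories)
print(policy)
```

Unlike IRL, no reward function is recovered here: the policy is fitted directly to the expert's state-action pairs, and states that never appear in the demonstrations get no action at all.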

3 votes
3 answers
586 views

I am reading the following article: "The goal of both inverse reinforcement learning (IRL) algorithms (e.g. AIRL, GAIL) and preference comparison is to discover a reward function. In ...
desert_ranger
1 vote
1 answer
424 views

How can imitation learning data be collected? Can I use a neural network for that? It might be noisy. Should I use manual gathering?
dato nefaridze
1 vote
1 answer
86 views

On the page "Limitations on horizon length" from the Imitation library, the authors recommend that the user stick to fixed-horizon experiments because there could be "information leak" ...
aletelecomm
1 vote
1 answer
307 views

Some IL approaches train the agents by using some specific ratio of expert demonstrations to trajectories generated using the policy being optimized. In the specific paper I'm reading they say "...
Samuel Rodríguez
1 vote
0 answers
80 views

In the DAGGER algorithm, how does one determine the number of samples required for one iteration of the training loop? Looking at the picture above, I understand initially, during the 1st iteration, ...
RoyJ • 11
3 votes
1 answer
390 views

I'm less familiar with reinforcement learning compared to other neural network learning approaches, so I'm unaware of anything exactly like what I want for an approach. I'm wondering if there are any ...
Daniel S.
0 votes
1 answer
354 views

For simplicity, let's consider the discrete version of BCQ, for which the paper and the code are available. In line 5 of Algorithm 1 we have the following: $$ a' = \text{argmax}_{a'|G_{\omega}(a', s')/\...
HenDoNR • 116
1 vote
1 answer
421 views

In AlphaGo, the authors initialised a policy gradient network with weights trained from imitation learning. I believe this gives the policy gradient network a very good starting policy. The ...
calveeen • 1,331
2 votes
0 answers
31 views

I've been reading this paper that formulates invariant task-parametrized HSMMs. The task parameters are represented in $F$ coordinate systems defined by $\{A_j,b_j\}_{j=1}^F$, where $A_j$ denotes the ...
stoic-santiago
6 votes
1 answer
231 views

I just read the following points about the number of required expert demonstrations in imitation learning, and I'd like some clarifications. For the purpose of context, I'll be using a linear reward ...
stoic-santiago
2 votes
1 answer
420 views

I've been reading A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning lately, and I can't understand what they mean by the surrogate loss function. Some relevant ...
stoic-santiago
1 vote
1 answer
228 views

Is GAIL applicable if the expert's trajectories (sample data) are for the same task but come from a different environment (modified, but not completely different)? My gut feeling is, yes, ...
Sam • 205
2 votes
0 answers
96 views

Imitation learning uses experiences of an (expert) agent to train another agent, in my understanding. If I want to use an on-policy algorithm, for example, Proximal Policy Optimization, because of it'...
Khush Agrawal
6 votes
2 answers
3k views

In short, imitation learning means learning from the experts. Suppose I have a dataset with labels based on the actions of experts. I use a simple binary classifier algorithm to assess whether it is ...
user781486
7 votes
1 answer
219 views

Due to my RL algorithm having difficulties learning some control actions, I've decided to use imitation learning/apprenticeship learning to guide my RL to perform the optimal actions. I've read a few ...
Rui Nian • 443