
Why are my Soft Actor-Critic's policy and value function losses not converging?

I'm trying to implement a soft actor-critic (SAC) algorithm for financial data (stock prices), but I'm having trouble with the losses: no matter what combination of hyper-parameters I use, they do not converge, and the reward returns are poor as well. It seems the agent is not learning at all.

I have already tried tuning some hyper-parameters (the learning rate of each network and the number of hidden layers), but I always get similar results. The two plots below show the losses of my policy and of one of the value functions during the last episode of training.

[Plot: policy loss over the last training episode]

[Plot: value function loss over the last training episode]
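For context, the loss computation I am trying to implement is the standard twin-Q SAC one. Below is a minimal, simplified sketch of the update step (not my exact code: the network classes, replay buffer, and target-network updates are omitted, `policy.sample`, `q1_target`, etc. are placeholder names, and the entropy coefficient `alpha` is kept fixed):

```python
import torch
import torch.nn.functional as F

def sac_losses(batch, policy, q1, q2, q1_target, q2_target,
               alpha=0.2, gamma=0.99):
    state, action, reward, next_state, done = batch

    # --- Q (value) losses: soft Bellman backup with entropy bonus ---
    with torch.no_grad():
        next_action, next_log_prob = policy.sample(next_state)
        target_q = torch.min(q1_target(next_state, next_action),
                             q2_target(next_state, next_action))
        q_backup = reward + gamma * (1 - done) * (target_q - alpha * next_log_prob)

    q1_loss = F.mse_loss(q1(state, action), q_backup)
    q2_loss = F.mse_loss(q2(state, action), q_backup)

    # --- policy loss: maximize soft Q-value (minimize its negative) ---
    new_action, log_prob = policy.sample(state)      # reparameterized sample
    q_new = torch.min(q1(state, new_action), q2(state, new_action))
    policy_loss = (alpha * log_prob - q_new).mean()

    return policy_loss, q1_loss, q2_loss
```

The plotted losses above correspond to `policy_loss` and one of the Q losses from an update step of this form.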

My question is: could this be caused by the data itself (the nature of financial data), or is it more likely an issue with the logic of my code?
