
I am trying to implement training through an evolutionary algorithm for a neural network using TensorFlow and Keras, but I think something is wrong with my implementation, as the network doesn't seem to improve at the task.

Currently, I am using the Tic-Tac-Toe game as the task. Essentially, my algorithm does the following:

It creates 2 models with the same architecture. In my case, these models consist of a positional embedding, a transformer block comprising an attention layer and a dense layer, and another transformer block with an attention layer and an LSTM-type network. Both transformers use causal masking, and their outputs are normalized. Finally, the output passes through a dense layer before the output layer. The output type is a probability distribution for the best move.
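For reference, the architecture described above could be sketched in Keras roughly as follows. The layer sizes, head counts, and embedding dimension here are illustrative assumptions, not the author's actual values:

```python
import tensorflow as tf
from tensorflow.keras import layers

class PosEmbedding(layers.Layer):
    """Token embedding plus a learned positional embedding."""
    def __init__(self, length, vocab, dim):
        super().__init__()
        self.tok = layers.Embedding(vocab, dim)
        self.pos = layers.Embedding(length, dim)
        self.length = length

    def call(self, x):
        return self.tok(x) + self.pos(tf.range(self.length))

def build_model(board_size=9, dim=16):
    # Board arrives as 9 integer tokens: 0 = own mark, 1 = empty, 2 = opponent
    inp = layers.Input(shape=(board_size,), dtype="int32")
    x = PosEmbedding(board_size, vocab=3, dim=dim)(inp)
    # Block 1: causal self-attention + dense layer, with residual normalization
    att = layers.MultiHeadAttention(num_heads=2, key_dim=dim)(x, x, use_causal_mask=True)
    x = layers.LayerNormalization()(x + att)
    x = layers.LayerNormalization()(x + layers.Dense(dim)(x))
    # Block 2: causal self-attention + LSTM (the LSTM collapses the sequence)
    att = layers.MultiHeadAttention(num_heads=2, key_dim=dim)(x, x, use_causal_mask=True)
    x = layers.LayerNormalization()(x + att)
    x = layers.LSTM(dim)(x)
    # Final dense layer before the output layer
    x = layers.Dense(dim, activation="relu")(x)
    out = layers.Dense(board_size, activation="softmax")(x)  # move distribution
    return tf.keras.Model(inp, out)
```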

Once the "parent" models are created, they are saved as .h5 files. These models predict with the .predict function, receiving a one-dimensional array with the board positions encoded so that 0 corresponds to player 1's positions, empty positions are represented by 1, and positions occupied by the opponent are represented by 2. At the end of each round, if there is a winner, its weights are saved with .save to an .h5 file, and a randomly adjusted copy is created as the new opponent. The adjustment is conditioned by a penalty score that penalizes actions such as choosing an occupied square, but ultimately it is a random adjustment. If the models tie or exceed 20 moves, both are adjusted.
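As an illustration, the board encoding described above could be produced like this. The `encode_board` helper and its board representation are hypothetical; the source only describes the encoding scheme:

```python
import numpy as np

def encode_board(board, player):
    """Encode a Tic-Tac-Toe board for the model.

    board:  list of 9 cells, each 'X', 'O', or None (empty).
    player: the mark of the player whose turn it is.
    Returns a 1-D int array: 0 = own mark, 1 = empty, 2 = opponent.
    """
    out = np.ones(9, dtype=np.int32)  # start with everything marked empty (1)
    for i, cell in enumerate(board):
        if cell is not None:
            out[i] = 0 if cell == player else 2
    return out
```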

At first, it seems like they are improving, but almost immediately they plateau. I will leave the code I am using for the adjustment, hoping someone can tell me what is failing.

Code:

import tensorflow as tf

def training(model, penalization):
    # Variable initialization: random Gaussian noise and a base learning rate
    aleatory_value = tf.random.normal(shape=(1,), mean=0, stddev=3.0)
    learning_rate = 0.025
    if penalization <= 1:
        learning_rate = 0.0025
        aleatory_value = tf.random.normal(shape=(1,), mean=0, stddev=1.0)
    if penalization >= 2:
        learning_rate = 0.4
        aleatory_value = tf.random.normal(shape=(1,), mean=0, stddev=3.0)
    penalty = tf.cast(penalization, tf.float32)  # Convert the penalty to float32
    for layer in model.layers:
        if isinstance(layer, tf.keras.layers.Dense):
            for neuron in layer.weights:
                current_weight = neuron.numpy()
                # Shrink the weight proportionally to the penalty, then add noise
                adjusted_weight = current_weight * (1 - learning_rate * penalty) \
                    + learning_rate * aleatory_value
                neuron.assign(adjusted_weight)
    model.save("model1.h5")
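For comparison, a more typical evolutionary mutation operator perturbs only a fraction of the weights on each call instead of all of them. This is a minimal numpy sketch; `mutate_weights` and the `rate` and `sigma` values are illustrative, not part of the original code:

```python
import numpy as np

def mutate_weights(weights, rate=0.05, sigma=0.1, rng=None):
    """Gaussian mutation applied to a random subset of weights.

    weights: list of numpy arrays (e.g. from model.get_weights()).
    rate:    probability that any individual weight is mutated.
    sigma:   standard deviation of the Gaussian noise.
    """
    if rng is None:
        rng = np.random.default_rng()
    mutated = []
    for w in weights:
        mask = rng.random(w.shape) < rate          # which weights to touch
        noise = rng.normal(0.0, sigma, size=w.shape)
        mutated.append(w + mask * noise)           # untouched weights pass through
    return mutated
```

The mutated list can be written back with `model.set_weights(mutated)`.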

I'm not sure whether this method really works at all, since I built it from scratch. Also, consider that for the real task it is impossible to obtain a labeled dataset.

  • Which optimization algorithm did you choose? Commented Dec 28, 2023 at 9:06
  • @LMaxime That is the optimizer; a completely random adjustment is made conditioned by performance, and it is checked whether it improved or worsened compared to the parent by simply playing against it. The winner becomes the new parent. It is a form of evolutionary algorithm, or at least my best attempt to implement one Commented Dec 28, 2023 at 11:44
  • From your code snippet, there is no initial population, no mechanism to generate the next candidates, and no rule to choose the best ones. It looks like a random search with no preferred direction. Here is a wiki page presenting some commonly used evolutionary algorithms. Commented Dec 28, 2023 at 13:24
  • It sounds like you are assembling a complex system in a single step without first testing its parts. It will be difficult to guess what went wrong, because all the parts now influence each other's results in the training loop. Maybe try an MLP first, start with a simpler problem (can it learn XOR?), and try a simpler algorithm (just random weight initializations, maybe?) Commented Dec 28, 2023 at 20:32
  • Okay, I saw what was happening. My mutation probability is 100%, so my population becomes destabilized, and I have 1 or 2 halfway decent ones, and then everything falls apart. Commented Jan 1, 2024 at 7:14
