One idea that comes to mind is having the features (inputs) be the digits of the number rather than the number itself. You might also try converting to binary beforehand and using those digits as features, or use some kind of one-hot encoding for the decimal digits. The rationale is that finding primes is a discrete task over the integers, so you don't want your model interpreting the input as a continuous value.
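For concreteness, here is one way those feature encodings could look; the function names and fixed widths are my own choices, not anything canonical:

```python
def binary_features(n, width=16):
    """Represent n as a fixed-width list of binary digits (MSB first)."""
    return [(n >> i) & 1 for i in range(width - 1, -1, -1)]

def onehot_decimal_features(n, width=5):
    """One-hot encode each decimal digit of n, zero-padded to `width` digits."""
    digits = [int(d) for d in str(n).zfill(width)]
    feats = []
    for d in digits:
        row = [0] * 10  # one slot per decimal digit 0-9
        row[d] = 1
        feats.extend(row)
    return feats
```

Either encoding gives the network a fixed-length vector of discrete values instead of a single continuous scalar.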
You might get better results if you also include the input modulo a few small primes (or just random integers) as features.
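A sketch of such residue features; the particular moduli here are an arbitrary choice for illustration:

```python
def residue_features(n, moduli=(2, 3, 5, 7, 11)):
    """n mod p for each modulus p; these could also be one-hot encoded per modulus."""
    return [n % p for p in moduli]
```

These residues are cheap to compute and directly expose divisibility information the network would otherwise have to learn from the digits.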
As for the hidden layers, pick your favorite architecture. Standard linear layers followed by ReLU activations are acceptable. As you said, the last layer would be a sigmoid activation (preceded by a linear layer, not a ReLU). You can experiment with things like batch norm if you want, although I wouldn't expect them to help much.
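The answer doesn't name a framework, so here is a minimal NumPy sketch of that architecture (in PyTorch this would just be a stack of `nn.Linear` and `nn.ReLU` ending in `nn.Sigmoid`); the initialization scale is an arbitrary placeholder:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random weights for a stack of linear layers, e.g. sizes=(16, 32, 32, 1)."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n)) for m, n in zip(sizes, sizes[1:])]

def forward(params, x):
    """Linear + ReLU for every hidden layer; final layer is linear + sigmoid."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)          # ReLU
    W, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(x @ W + b)))   # sigmoid keeps output in (0, 1)
```

Note the final linear layer feeds the sigmoid directly, with no ReLU in between, matching the setup described above.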
With this setup your training data would have the binary digits of a positive integer as features, and the target would be whether or not that integer is prime (0 or 1). Your loss would be binary cross-entropy. Train with stochastic gradient descent or a similar optimizer until convergence. Note that your training data is very imbalanced, as most numbers are not prime, so you may want to weight the positive class or oversample the primes.
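Putting the pieces together, a sketch of the dataset construction and the loss in plain Python; the `pos_weight` knob is my addition for the imbalance (it mirrors the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`):

```python
import math

def is_prime(n):
    """Trial division; fine for generating training labels."""
    if n < 2:
        return 0
    return int(all(n % d for d in range(2, int(n ** 0.5) + 1)))

def binary_digits(n, width=10):
    return [(n >> i) & 1 for i in range(width - 1, -1, -1)]

# Features: binary digits of 1..1000; targets: 0/1 primality labels.
N = 1000
X = [binary_digits(n) for n in range(1, N + 1)]
y = [is_prime(n) for n in range(1, N + 1)]

def bce(p, t, pos_weight=1.0):
    """Binary cross-entropy for one prediction p in (0, 1), target t in {0, 1}.
    pos_weight > 1 upweights primes to counter the class imbalance."""
    return -(pos_weight * t * math.log(p) + (1 - t) * math.log(1 - p))

# Only 168 of the first 1000 integers are prime, hence the imbalance.
prime_fraction = sum(y) / N
```

With roughly one positive per six examples at this scale (and the ratio worsening as the range grows), an unweighted model can score well by predicting "not prime" everywhere, which is why the weighting matters.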