I've been experimenting to test the limits of what a neural network can actually learn.
So far I've only been able to get something with broad categorical variables working in an acceptable amount of time, and in essence it was linear, since each part of it could be reduced to a linear function.
E.g., converting to polar coordinates with a fixed angle just reduces to multiplying by a real constant, because plugging that angle into cos/sin for the conversion back to Cartesian gives constants, and so on.
It would have been different if I had been solving for the angle, I think.
Point is, now I'm trying something simple: a NN with 2 inputs (x and y), both real numbers, where the target is a polynomial function of those two variables.
It has a single hidden layer of 1024 neurons, which I picked because someone claimed to be able to approximate sin(x) that way.
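For reference, the setup is roughly like this (a PyTorch sketch; the polynomial below is a stand-in for my actual target, and the ReLU is just a placeholder since part of my question is which activation to use):

```python
import torch
import torch.nn as nn

# Stand-in target: the real polynomial I'm fitting is different,
# this is just a two-variable polynomial of the same general shape.
def target(x, y):
    return x**2 + x*y + y**2

# 2 inputs -> 1024 hidden neurons -> 1 output
model = nn.Sequential(
    nn.Linear(2, 1024),
    nn.ReLU(),          # placeholder activation
    nn.Linear(1024, 1),
)
```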
How long should it take to become accurate? What activation functions should I use? I can't really find any guidelines on how long training takes before the outputs stop looking random.
Initially I trained with lr/momentum of 0.001/0.9, generating training samples by drawing each input randomly from the domain 0-100 and normalizing the training data.
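Concretely, the training loop looks roughly like this (continuing from the sketch above; the batch size, the MSE loss, and dividing by fixed constants to normalize are assumptions on my part):

```python
import torch
import torch.nn as nn

optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
loss_fn = nn.MSELoss()
Z_SCALE = 30_000.0   # rough max of the stand-in polynomial on [0, 100]^2

for step in range(50_000):
    # draw a random batch of (x, y) pairs uniformly from [0, 100]
    xy = torch.rand(64, 2) * 100.0
    z = target(xy[:, 0], xy[:, 1]).unsqueeze(1)

    # normalize inputs to [0, 1] and targets to roughly [0, 1]
    pred = model(xy / 100.0)
    loss = loss_fn(pred, z / Z_SCALE)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```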
After somewhere around 40,000 runs the output began to closely resemble the slope of the surface you get by plotting the equation on 3D axes; I checked by drawing 100 random points from the domain and running inference on them.
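The check itself is roughly this (a sketch, same assumptions as above):

```python
# Spot-check: 100 random points from the domain, compare NN output to the true surface.
with torch.no_grad():
    xy = torch.rand(100, 2) * 100.0
    pred = model(xy / 100.0).squeeze(1)
    true = target(xy[:, 0], xy[:, 1]) / Z_SCALE
    print("mean abs error:", (pred - true).abs().mean().item())
```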
Around 50,000 training batches, all the values plateaued back to a flat, elevated plane. What happened here?