nitin_ch asked . 2021-07-15
How can I find the right neural network architecture?
I am trying to learn how to use neural networks to fit functions. I have read a little about the subject, but I am still not sure how to find the right architecture (the number of neurons in the hidden layer). I use networks with 1 hidden layer and my training algorithms are 'trainlm' and 'trainbr'. Currently I am aware of 4 problems that can occur:
+ The algorithm reaches a local minimum: the best training performance (tr.best_perf) is too large?
+ Overfitting: the best validation performance (tr.best_vperf) is much larger than the best training performance (tr.best_perf)?
+ Underfitting: the best validation performance (tr.best_vperf), the best training performance (tr.best_perf) and the best test performance (tr.best_tperf) are all of similar size, but still too large.
+ Extrapolation: the best test performance (tr.best_tperf) is much larger than the other two (see the sketch after this list).
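For example, here is a minimal sketch of how I check these symptoms from the training record tr returned by train, assuming x and t are my input/target matrices; the goal MSEgoal and the factor 2 are only illustrative thresholds, not fixed rules:
[net, tr] = train(fitnet(20, 'trainlm'), x, t);
MSEgoal = 0.01*mean(var(t', 1));                       % assumed definition of "small enough"
if tr.best_vperf > 2*tr.best_perf                      % factor 2 is illustrative
    disp('Possible overfitting: validation error >> training error.')
end
if tr.best_tperf > 2*max(tr.best_perf, tr.best_vperf)
    disp('Possible extrapolation problem: test error >> the other two.')
end
if max([tr.best_perf, tr.best_vperf, tr.best_tperf]) > MSEgoal
    disp('Underfitting or a poor local minimum: all three errors are still too large.')
end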
Currently, I have written a loop that examines networks with 1 to 50 neurons. Each network (e.g. a network with 20 neurons) is trained 10 times and the one with the lowest training performance (tr.best_perf) is chosen in order to avoid local minima. Afterwards, I store tr.best_tperf, tr.best_vperf and tr.best_perf of that network in an array. Finally, I compare those 50 networks to each other and take the one with the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]).
The other way to go would be to train each network (e.g. a network with 20 neurons) 10 times and keep the lowest error, with error = max([tr.best_tperf, tr.best_vperf, tr.best_perf]). Then I store this error for each network size in a vector. Finally, I choose the network with the lowest element of that vector (see the sketch below).
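A minimal sketch of that second variant, assuming x and t are my input/target matrices, one hidden layer and 'trainlm':
% For each size H, keep the best of Ntrials runs, where "best" means the
% smallest worst-case error over the training/validation/test subsets.
Ntrials  = 10;
Hrange   = 1:50;
bestErrs = inf(numel(Hrange), 1);
for k = 1:numel(Hrange)
    for trial = 1:Ntrials
        net       = fitnet(Hrange(k), 'trainlm');
        [net, tr] = train(net, x, t);
        err = max([tr.best_perf, tr.best_vperf, tr.best_tperf]);
        bestErrs(k) = min(bestErrs(k), err);
    end
end
[~, kbest] = min(bestErrs);
Hbest = Hrange(kbest);   % chosen number of hidden neurons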
Can someone tell me which way is the correct way? I really appreciate any help you can provide.
neural network , levenberg-marquard... , baysian regularisa...
Prashant Kumar answered . 2024-12-20 18:18:10
Search the NEWSREADER and ANSWERS using
fitnet Hmin Hmax Ntrials
Minimize the number of hidden nodes subject to the training-subset MSE upper bound
MSEtrn <= 0.01*mean(var(targettrn',1))
( <= 0.01*var(targettrn,1) for a 1-dimensional target ).
Since Rsquaretrn = 1 - MSEtrn/mean(var(targettrn',1)), this yields a training-subset Rsquaretrn exceeding 0.99.
Many of the posts don't have the training-subset subscript trn and/or may have used t instead of target, so there are probably a jillion variations posted, including
MSEgoal = 0.01*vart1
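In code, that goal is simply (a sketch; here t stands for the full target matrix, and targettrn would be the training-subset columns only):
vart1   = mean(var(t', 1));   % average per-output target variance (N-divisor)
MSEgoal = 0.01 * vart1;       % meeting this goal corresponds to Rsquare >= 0.99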
The best way I have found to obtain relatively unbiased results is to use 2 loops (see the sketch below).
1. An outer loop over the number of hidden nodes, H = Hmin:dH:Hmax, with Hmax <= Hub, the upper bound for not having more unknown weights, Nw, than training equations, Ntrneq.
2. An inner loop over Ntrials >= 10 different random distributions of initial weights.
Nets are initially ranked by their validation-subset performance. Unbiased estimates of generalization performance are then obtained from the test-subset performance.
However, I usually rank the nets by their combined nontraining (validation AND test) subset performance.
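A minimal sketch of the whole procedure, assuming x is I-by-N, t is O-by-N, the default 0.7/0.15/0.15 data division, and ranking by validation performance only:
[I, N] = size(x);
[O, ~] = size(t);
Ntrn   = N - 2*round(0.15*N);                % approximate training-subset size
Ntrneq = Ntrn * O;                           % number of training equations
% Nw = (I+1)*H + (H+1)*O unknown weights; require Nw <= Ntrneq:
Hub    = floor((Ntrneq - O) / (I + O + 1));  % upper bound on hidden nodes
Hmin = 1;  dH = 1;  Hmax = Hub;  Ntrials = 10;
MSEgoal = 0.01*mean(var(t', 1));
bestVperf = Inf;
for H = Hmin:dH:Hmax                         % outer loop: number of hidden nodes
    for trial = 1:Ntrials                    % inner loop: random initial weights
        net = fitnet(H, 'trainlm');
        net.trainParam.goal = MSEgoal;
        [net, tr] = train(net, x, t);
        if tr.best_vperf < bestVperf         % rank candidates by validation MSE
            bestVperf = tr.best_vperf;
            bestNet   = net;
            bestTr    = tr;
        end
    end
end
MSEtest = bestTr.best_tperf;                 % unbiased estimate from the test subset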
Again, I have jillions of examples posted in the NEWSREADER and ANSWERS. The best search words are probably
Hmin Hmax Ntrials