Architectural scaling in online learning process
Abstract
Finding the optimal network structure for information processing remains a challenge for any available learning algorithm. Here we study the optimal architectural scaling of artificial neural networks (ANNs) for efficient online learning using parity-N tests. We examine the learning performance of different architectures of a network with four layers: input, entry, hidden, and output. We show that the network learns more accurately when the layer sizes decrease by an order of magnitude from the entry layer to the output layer. This scaling compresses information and thereby avoids redundant processing by extra nodes. The same architecture is seen in the neural network for vision in the primate eye. While information compression is observed when the network optimizes the correctness of learning, information expansion is observed when the network optimizes speed and correctness simultaneously. In that case, counter-intuitively, a hidden layer twice the size of the entry layer yields the most efficient learning. This scaling property is similar to the architecture of the human olfactory network.
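As an illustration of the two scaling regimes described above, the following minimal Python sketch builds a parity-N data set and the corresponding layer sizes. The function names `parity_dataset` and `layer_sizes`, the entry-layer size of 100, and the single output node are assumptions made for this example only; they are not taken from the experiments reported here.

```python
from itertools import product


def parity_dataset(n):
    """Return all 2**n binary inputs of length n with their parity label.

    The label is 1 when the input contains an odd number of ones, else 0.
    """
    return [(bits, sum(bits) % 2) for bits in product((0, 1), repeat=n)]


def layer_sizes(n_inputs, entry, ratio):
    """Return sizes of the input, entry, hidden, and output layers.

    The hidden layer is `ratio` times the entry layer. A single output
    node is assumed here because parity is a one-bit answer.
    """
    hidden = max(1, round(entry * ratio))
    return [n_inputs, entry, hidden, 1]


# Compression: sizes fall by an order of magnitude from entry to output.
compressing = layer_sizes(4, 100, 0.1)

# Expansion: the hidden layer is twice the size of the entry layer.
expanding = layer_sizes(4, 100, 2)

if __name__ == "__main__":
    for bits, label in parity_dataset(3):
        print(bits, label)
    print(compressing, expanding)
```

With an assumed entry layer of 100 nodes, the compressing architecture has layer sizes 100, 10, and 1 from entry to output, while the expanding architecture has a hidden layer of 200 nodes.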