Accuracy of a neural network differential equation solver
Abstract
Modeling complex systems is a two-fold process. The first step is to formulate the dynamical relations, which usually take the form of differential equations (DEs); the second is to determine the relationship between the dynamical variables by solving those equations. Each task is significant and often equally difficult. For example, once the relevant DEs of an interacting system have been written down, say the Lotka-Volterra equations for predator-prey models or the hydrodynamical equations for fluid motion, solving them is equally formidable, since in many cases the DE has no known analytic solution. One way of solving a DE is to transform its coordinates so that it takes a form with a known analytic solution. Such a process is generally tedious and, at worst, cannot work because no such transformation exists. A more common approach is to use numerical techniques (e.g. finite-difference, Runge-Kutta, etc.), whose accuracy generally depends on the sampling interval. However, numerical methods build the solution iteratively, so the error made at one step propagates to later iterations. Another way is to use neural networks (NNs). Recently we have shown that an unsupervised NN trained using the modified backpropagation method can solve differential equations that: 1) model the propagation of pulses in nonlinear media, 2) replicate the arrangement of competing biological entities in a given space, and 3) determine the optimum design for a nuclear reactor. In this study we analyze formally how accurately an NN can solve linear DEs, and we propose an approach for increasing the accuracy of the solution obtained by an NN for general types of DEs. We use a three-layer unsupervised NN with inputs $\{x_1, x_2, \ldots, x_n\}$ to solve the DE given by $F(x_1, x_2, \ldots, x_n) = D\Psi(x_1, x_2, \ldots, x_n) = 0$, where $D$ is a differential operator and the $x_i$ are the dynamical parameters. The $s$th output $\Psi_s(k)$, corresponding to the $k$th set of inputs of the NN, is given by $\Psi_s(k) = f_0\left(\sum_{m=1}^{H} d_{sm}(k)\, y_m(k)\right)$ with $y_m(k) = f_H\left(\sum_{j=1}^{L} w_{mj}(k)\, r_j\right)$, where $r_j$ is the $j$th input, $y_m(k)$ is the output of the $m$th hidden node ($m = 1, 2, \ldots, H$), and $f_H(z) = \tanh(z)$ and $f_0(z) = z$ are the hidden and output activation functions, respectively. The parameter $w_{mj}(k)$ is the interconnection weight between the $j$th input and the $m$th hidden node, while $d_{sm}(k)$ is the synaptic strength between the $m$th hidden node and the $s$th output.
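The network equations above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`forward`, `dpsi_dr`, `residual`), the random weights, and the test equation dΨ/dx + Ψ = 0 are assumptions chosen only to show how the output and the DE residual could be evaluated for one fixed set of weights. The training step (the modified backpropagation method) is not shown, since the abstract does not specify it.

```python
import numpy as np

def forward(r, w, d):
    """Forward pass of a three-layer network with tanh hidden units.

    r : (L,)   input vector r_j
    w : (H, L) input-to-hidden weights w_mj
    d : (S, H) hidden-to-output weights d_sm
    Returns the outputs Psi_s and the hidden outputs y_m.
    """
    y = np.tanh(w @ r)   # y_m = f_H(sum_j w_mj r_j), f_H = tanh
    psi = d @ y          # Psi_s = f_0(sum_m d_sm y_m), f_0(z) = z
    return psi, y

def dpsi_dr(r, w, d):
    """Analytic Jacobian dPsi_s/dr_j obtained with the chain rule."""
    y = np.tanh(w @ r)
    # d tanh(z)/dz = 1 - tanh(z)^2, applied per hidden node
    return (d * (1.0 - y**2)) @ w   # shape (S, L)

def residual(x, w, d):
    """Residual of the illustrative linear DE dPsi/dx + Psi = 0 at a scalar x.

    The choice of DE is an assumption made for this example only.
    """
    r = np.array([x])
    psi, _ = forward(r, w, d)
    return dpsi_dr(r, w, d)[0, 0] + psi[0]

# Hypothetical sizes and untrained random weights, for illustration only.
rng = np.random.default_rng(0)
H = 5
w = rng.normal(size=(H, 1))
d = rng.normal(size=(1, H))
print(residual(0.5, w, d))
```

In an unsupervised setting, a residual of this kind, evaluated over a set of sample points, is the quantity a training procedure would drive toward zero; with untrained weights it is generally nonzero.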