Neural Network Software Sensor

Neural Networks

Biological systems are difficult to model, largely because of their inherent non-linearity. Neural networks can help to describe and model phenomena that are too complex for analytical methods or empirical rules. Because they can approximate non-linear functions, neural networks can be used efficiently to forecast process values in fermentation processes. For instance, a neural net can forecast the substrate concentration from inputs such as culture volume, pH, pO2, and current substrate feed rate. In large-scale fermentation systems, neural networks help the supervisor keep the batch in good shape, thus maximising product output. Since this is also the main goal in research laboratories, a neural network software sensor module has been incorporated into the Fermentor Control program.

As with the fuzzy logic control module, some knowledge of neural networks and their use is an advantage. This page gives only a short overview of the implemented technology. To fully understand and use the neural network software sensor, please have a look at the information about neural networks given below:

The papers below are concerned with the application of neural networks for bioprocess control:

Artificial Neural Networks in Bioprocess State Estimation. Karim, M.N. and Rivera, S.L. (1992) Adv. Biochem. Eng./Biotech. 46, 1-33.

Neural Networks as 'Software Sensors' in Enzyme Production. Linko, S., Luopa, J., and Zhu, Y.-H. (1997) J. Biotechnol. 52, 257-266.

Control of Fermentors - A Review. Yamuna Rani, K. and Ramachandra Rao, V.S. (1999) Bioprocess Eng. 21, 77-88.


There are several very good tutorials and bibliographies on the Internet, if you are interested in learning more about neural networks:

Neural Network FAQ by Warren S. Sarle, Cary, NC, USA.

Backpropagator's Review by Donald R. Tveter.

Books about neural networks:

'Neural Networks for Pattern Recognition' by Christopher M. Bishop (1995) Oxford University Press.


Network Layout

One of the initial steps towards a working neural network software sensor is to decide on the network layout. On the basis of information from many different sources, a two-layered, fully connected backpropagation network has been chosen: besides the input units, the sensor network consists of one hidden layer and an output layer (see Figure 1). A sigmoid activation function is used, and a threshold term is added to the neuron output function to prevent the network from being trapped in a local minimum. To enhance training speed, input data are rescaled to values in the range -1 to 1 and output values to between 0.1 and 0.9. Currently 7 inputs are used for the software sensor, as these parameters can be derived from data collected by the main fermentor control program (see process parameters). The single output of the net gives the estimated substrate concentration.

Figure 1
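
The layout described above can be illustrated with a minimal sketch of one forward pass through a 7-input, single-output network with a sigmoid activation and a threshold (bias) term per neuron. This is not the code used by the Fermentor Control program: the 6 hidden units follow the rule of thumb given later on this page, the weights are random placeholders rather than trained values, and the scaled input values and the 0 to 50 concentration range are assumptions chosen only for illustration.

```python
import math
import random

def rescale(value, lo, hi, new_lo, new_hi):
    """Linearly map value from the range [lo, hi] to [new_lo, new_hi]."""
    return new_lo + (value - lo) * (new_hi - new_lo) / (hi - lo)

def sigmoid(x):
    """Sigmoid activation function; the result lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> hidden layer -> single output."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

N_INPUTS, N_HIDDEN = 7, 6  # 6 hidden units is an assumption (rule of thumb)

# Placeholder weights and threshold terms; a real sensor would load trained values.
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
            for _ in range(N_HIDDEN)]
b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
b_out = rng.uniform(-0.5, 0.5)

# Hypothetical inputs, already rescaled to the range -1 to 1.
scaled_inputs = [0.2, -0.4, 0.1, 0.0, -0.1, 0.05, -0.3]
net_output = forward(scaled_inputs, w_hidden, b_hidden, w_out, b_out)

# Map the network output from the 0.1 to 0.9 training range back to
# concentration units (the 0 to 50 range is an assumed example).
estimate = rescale(net_output, 0.1, 0.9, 0.0, 50.0)
print(f"network output {net_output:.3f}, estimated substrate {estimate:.2f}")
```

Keeping the output targets between 0.1 and 0.9 keeps them away from the flat ends of the sigmoid, where the gradient is very small.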


Training the Network

Before a neural network can be used in process control, it needs to be trained with historical process data obtained from previous fermentation runs. Thus, to implement the neural network software substrate sensor successfully, data must first be collected and the network then trained. From version 3.0 of the Fermentor Control program onwards, the data can be collected semi-automatically. The software sensor menu contains an entry called Sample Training Data. When you use this function, you are prompted to enter the measured substrate concentration corresponding to the time point at which the sample was taken. This value is added to a training data file together with seven other parameters.

When you have sampled enough data to start training (see below), you are ready to implement neural network control of your fermentation process. The sensor training tool that accompanies the fermentor program can be used to train the network and to produce the parameter file (Sensor.nnf), which is used by the fermentor control program. We use the Pichia pastoris expression system, which in our case uses glucose as substrate. Various process parameters and off-line glucose measurements were collected from several fermentation runs. These data were then used as input for the training program, resulting in a network that modelled the training data.

Figure 2

The newest version of the SensorTrainer tool can be downloaded from here.

Last Updated: 01-04-2004


Using the Software Sensor Output

Once you have trained the network and saved the parameter file (Sensor.nnf), copy this file into the Fermentor Control program directory to start using the software sensor. Once a fermentation run has been started, you can turn on the sensor and watch the results in the fermentation tank and graph windows.

Application of predicted substrate concentration:

If you decide that the performance of the software sensor is good, you can use the fuzzy logic control module to adjust the substrate feed rate to optimise product formation. The neural network output must be integrated with the rule-based fuzzy logic module since, as stated above, no reliable model exists by which the predicted substrate concentration can be used to reach the desired concentration.

The output can also be used to validate other sensor readings such as off-line substrate measurements.


How Much Data Do I Need?

There is no definitive answer to this question. As a rule of thumb, however, training a neural network for process value forecasting requires at least 5 times as many data points as there are weights in the net (number of inputs * number of hidden units + number of hidden units * number of outputs). To estimate the number of hidden units, use the general rule (number of inputs + number of outputs) * (2/3). For 7 inputs and 1 output this gives 5.3, rounded up to 6 hidden units, and therefore 7 * 6 + 6 * 1 = 48 weights. For training you thus need at least 5 * 48 = 240, or roughly 250, data points containing process values and measured substrate concentration. It is advisable to try different numbers of hidden units to obtain the best results.
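
The rule of thumb above can be written out as a short calculation. This is only an illustrative sketch: the function names are made up for this example, and rounding the hidden-unit estimate up to the next whole number is an assumption that matches the 6 hidden units quoted on this page.

```python
import math

def hidden_units(n_inputs, n_outputs):
    """Rule of thumb: (inputs + outputs) * 2/3, rounded up (assumed rounding)."""
    return math.ceil((n_inputs + n_outputs) * 2 / 3)

def weight_count(n_inputs, n_hidden, n_outputs):
    """Connection weights in a fully connected net with one hidden layer.

    Threshold terms are not counted, following the formula in the text.
    """
    return n_inputs * n_hidden + n_hidden * n_outputs

def min_training_points(n_inputs, n_outputs, factor=5):
    """At least `factor` data points per weight."""
    n_hidden = hidden_units(n_inputs, n_outputs)
    return factor * weight_count(n_inputs, n_hidden, n_outputs)

print(hidden_units(7, 1))         # 6
print(weight_count(7, 6, 1))      # 48
print(min_training_points(7, 1))  # 240
```

The result of 240 is the strict minimum under this rule; the figure of about 250 data points quoted on this page leaves a small margin above it.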

Besides the training data set, you need a smaller validation data set that is representative of the training data set. This set is not used for training but to check the network during training. A central goal of network training is not to memorise the training data but to model the underlying generator of the data (the fermentation process). A problem with neural nets is that they will fit almost any data set given enough training, so noise and errors in the data set are fitted as well if training is prolonged. To avoid this 'overfitting', the validation data set is used to check whether the network is overtrained or needs more training cycles.

Validation is done by following the root-mean-square (RMS) error of the substrate concentration predictions (predicted minus real substrate concentration). The trained network predicts the substrate concentration from the process data in the validation data set, and the output value is compared with the real substrate concentration value in the validation data set. Once the RMS error on the validation data set starts to increase during training, the network is automatically saved. This network may be sufficiently trained, unless the result reflects a local minimum in the process data space.

In the sensor training tool that accompanies the fermentor program (see Figure 2), a graph shows the RMS errors on the training data set and the validation data set. After training has ended, you can test the different nets. Stop training when the RMS error on the validation data set increases slowly while the error on the training data decreases slowly.
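
The validation check described above can be sketched as follows. This is not the SensorTrainer implementation: the function names, the example error values, and the `patience` parameter (how many training cycles without improvement are tolerated before stopping) are assumptions made for illustration.

```python
import math

def rms_error(predicted, measured):
    """Root-mean-square of (predicted - measured) over a data set."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))

def early_stopping_epoch(validation_errors, patience=3):
    """Return the training cycle with the lowest validation RMS error.

    Scanning stops once the error has not improved for `patience`
    consecutive cycles, which is taken as a sign of overfitting.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# Hypothetical validation RMS errors recorded after each training cycle.
validation_errors = [0.90, 0.60, 0.40, 0.35, 0.37, 0.40, 0.45]
print(early_stopping_epoch(validation_errors))  # 3
```

In this made-up series the validation error is lowest after cycle 3 and rises afterwards, so the network saved at that point would be the candidate to test.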


Current Input Parameters

The currently supported inputs for the software substrate sensor are:

  • Feed Rate (% pump action) (time = t)

  • Volume Substrate feed (mL)(t)

  • Total culture volume (mL)(t)

  • Offset pO2 (pO2 setpoint - pO2 current)(t)

  • ΔpO2 (over the last 5 min.)(t)

  • Offset pH (pH setpoint - pH current)(t)

  • Previous substrate concentration (t-1)

For training, the measured substrate concentration at time = t is also needed.
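
As an illustration of how the seven inputs listed above might be assembled from raw process readings, here is a minimal sketch. The class, function, and field names are hypothetical and do not come from the Fermentor Control program, and computing the pO2 change as the current reading minus the reading 5 minutes earlier is an assumption about how that input is defined.

```python
from dataclasses import dataclass, astuple

@dataclass
class SensorInputs:
    """The seven software sensor inputs at time t (hypothetical names)."""
    feed_rate_percent: float         # feed rate, % pump action
    substrate_feed_volume_ml: float  # volume of substrate fed so far, mL
    culture_volume_ml: float         # total culture volume, mL
    po2_offset: float                # pO2 setpoint - current pO2
    po2_change_5min: float           # change in pO2 over the last 5 minutes
    ph_offset: float                 # pH setpoint - current pH
    previous_substrate: float        # substrate concentration at t-1

def build_inputs(feed_rate_percent, substrate_feed_volume_ml, culture_volume_ml,
                 po2_setpoint, po2_now, po2_5min_ago,
                 ph_setpoint, ph_now, previous_substrate):
    """Derive the offset and change inputs from raw readings."""
    return SensorInputs(
        feed_rate_percent=feed_rate_percent,
        substrate_feed_volume_ml=substrate_feed_volume_ml,
        culture_volume_ml=culture_volume_ml,
        po2_offset=po2_setpoint - po2_now,
        po2_change_5min=po2_now - po2_5min_ago,  # assumed sign convention
        ph_offset=ph_setpoint - ph_now,
        previous_substrate=previous_substrate,
    )

# Made-up readings, for illustration only.
inputs = build_inputs(25.0, 120.0, 1500.0, 30.0, 27.5, 28.0, 5.0, 4.75, 4.2)
print(astuple(inputs))
```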

The training data files have a tabulated layout, and lines starting with an asterisk are ignored. More information about values and file layout is given in the legend of the above files (see manual Appendix).
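
A reader for such a file could look roughly like the sketch below. Only the rule that lines starting with an asterisk are ignored comes from this page. The column order (seven inputs followed by the measured substrate concentration), the whitespace or tab separation, and the sample values are assumptions; the actual layout is defined in the manual Appendix.

```python
def parse_training_data(text, n_inputs=7):
    """Parse training data text into (inputs, measured_substrate) pairs.

    Assumed layout: n_inputs numeric columns followed by the measured
    substrate concentration, separated by tabs or spaces.
    """
    samples = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines starting with an asterisk.
        if not stripped or stripped.startswith("*"):
            continue
        values = [float(field) for field in stripped.split()]
        if len(values) != n_inputs + 1:
            raise ValueError(f"expected {n_inputs + 1} columns, got {len(values)}")
        samples.append((values[:n_inputs], values[n_inputs]))
    return samples

# Hypothetical file content, for illustration only.
SAMPLE_FILE = (
    "* feed%\tfeed_mL\tvol_mL\tpO2_off\tdpO2\tpH_off\tprev_sub\tsub\n"
    "25.0\t120.0\t1500.0\t2.5\t-0.5\t0.25\t4.2\t4.0\n"
    "\n"
    "30.0\t150.0\t1530.0\t1.0\t0.3\t0.10\t4.0\t3.6\n"
)

samples = parse_training_data(SAMPLE_FILE)
print(len(samples))  # 2
```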



Last modified April 10, 2004 08:21