Document retrieved from a search engine cache. Original document address: http://rswater.phys.msu.ru/Assets/Water/OMNN_2010.pdf
ISSN 1060-992X, Optical Memory and Neural Networks (Information Optics), 2010, Vol. 19, No. 2, pp. 140-148. © Allerton Press, Inc., 2010.

Application of Artificial Neural Networks to Solve Problems of Identification and Determination of Concentration of Salts in Multi-Component Water Solutions by Raman Spectra¹
S. A. Burikov (a), S. A. Dolenko (b), T. A. Dolenko (a), and I. G. Persiantsev (b)

(a) Department of Physics, M.V. Lomonosov Moscow State University, Leninsky Gory 1/2, Moscow, 119991 Russia
e-mail: burikov@lid.phys.msu.ru, tdolenko@lid.phys.msu.ru
(b) D.V. Skobeltsyn Institute of Nuclear Physics, M.V. Lomonosov Moscow State University, Leninsky Gory 1/2, Moscow, 119991 Russia
e-mail: dolenko@srd.sinp.msu.ru, ipers@srd.sinp.msu.ru
Received March 31, 2010; in final form, April 15, 2010

Abstract. This paper presents the results of the elaboration and comparative analysis of approaches to applying neural network algorithms for the effective solution of a pattern recognition problem (an inverse problem with discrete output) together with an inverse problem with continuous output. The approaches are considered using the example of identification and determination of concentrations of inorganic salts in multi-component water solutions by Raman spectra. The studied approach solves both problems (classification and determination of concentrations) using a single neural network trained on experimental or quasi-model data.

Keywords: neural networks, inverse problems, identification, multi-component solution, Raman spectroscopy.

DOI: 10.3103/S1060992X10020049

INTRODUCTION

It is well known that artificial neural networks (ANN) are a class of mathematical algorithms that are highly efficient in solving various problems of approximation, prediction, evaluation, classification, pattern recognition, etc. ANN are also used for solving inverse problems (IP), where their properties, such as training by examples, high noise stability, and stability to contradictory data, play a special role (see, for example, [1]).

Most often, the IP under consideration are of regression type, i.e. those with continuous output (for example, determination of the temperature of plasma from its own emission spectrum). However, more complicated situations can occur, which require the simultaneous solution of a classification or pattern recognition problem (determination of the components contributing to the observed properties of an object, for example, to its spectrum), i.e. an IP with discrete output, and of a traditional regression-type IP for every component found. We call this type of IP a complex inverse problem.

In this study, a method for solving such problems is elaborated using the example of the complex IP of identification and determination of partial concentrations of salts in a multi-component water solution by Raman spectra. Determination of the concentrations of substances dissolved in water is very important for oceanology, ecological monitoring, and control of technical waters. The methods most in demand are express non-contact methods that can be implemented in a remote express mode.

It is very promising to use Raman spectra for express remote determination of concentrations of inorganic substances dissolved in water. In [2, 3] it was suggested to use Raman spectra of complex ions to determine the type and concentrations of salts in water solution (including in remote sensing mode). In Raman spectra near 1000 cm⁻¹, narrow bands of valence vibrations of the anions NO3⁻, SO4²⁻, PO4³⁻, CO3²⁻ are present. The type of anion was determined by the position of the corresponding band, and its concentration by the band intensity. This method has a disadvantage: it is suitable only for analysis of substances that have their own Raman bands, i.e. for salts with complex anions. However, the concentration of such salts in natural waters is much lower than, for example, the concentration of metal halides.

The authors of [4-11] elaborated a method for determining the concentration of dissolved salts by the water Raman valence band. The possibility in principle of using the spectral Raman bands of water for diagnostics of solutions arises from the high sensitivity of their characteristics to the type and concentration of salts dissolved in water. In [4, 5], the concentration of one salt in the solution was determined using the water Raman valence band. The authors of [8-11] suggested and elaborated a method for determining the partial concentrations of several salts in multi-component solutions by the water Raman valence band. This became possible through the application of ANN to this IP. The authors of this paper are unaware of other studies in which concentrations of several salts in a solution were measured by optical methods.

Nevertheless, the problem already solved by the authors of the present paper, namely identification of the type of salts and determination of their partial concentrations by the water Raman valence band using the traditional NN approach in a three-component solution [9-11], does not meet the urgent needs of oceanology and ecology. It is necessary to identify a larger number of salts in the solution and at the same time to retain, or better to improve, the accuracy of determination of their concentrations. Solution of such a multi-parameter inverse problem requires elaboration of optimal methodical approaches to the application of NN algorithms.

¹ The article is published in the original.

1. METHODICAL APPROACHES TO SOLUTION OF IP USING NEURAL NETWORKS

Training of ANN for solution of IP requires a representative dataset, i.e. a set of data that reflects all characteristic aspects of the behavior of the object. This dataset can be obtained in different ways. Accordingly, one can distinguish three different methodical approaches to solution of IP using ANN.

(a) "Model-based" approach. If an adequate analytical model of solution of the direct problem is available, it can be used to generate data arrays with the necessary representativity. The main disadvantage of this approach is that elaboration and realization of an adequate model is often impossible or very difficult. Unfortunately, this is the case for the studied IP of determination of concentrations of components using Raman spectra. Because of the extraordinary complexity of the object, there is no adequate physically grounded model that would give the dependence of the water Raman spectrum (i.e. of the intensity in every spectral channel) on the concentrations of dissolved salts, especially taking into account their non-linear interaction. That is why this approach is unacceptable in this case.

(b) "Experiment-based" approach. The data used for ANN training are obtained in experiment. (In this paper, the data are Raman spectra of various solutions with different combinations and concentrations of components, obtained using a laser Raman spectrometer.) This approach does not require a model to be available, and it allows the non-linear properties of the object to be taken into account. However, obtaining a representative dataset can be a non-trivial experimental problem. In this study, 8695 experimental spectra for 4268 different solutions were obtained for realization of the "experiment-based" approach.

(c) "Quasi-model" approach. If no adequate physically grounded analytical model of solution of the direct problem is available, one can replace it by a parametrical "quasi-model" based on experimental data. This model formally describes the dependence of the observed data on the sought-for parameters. In the present case, a quasi-model (a set of quasi-models) describes the dependence of the intensity in every spectral channel on the concentrations of the components.

The simplest quasi-model is a linear one, in which the intensity in each spectral channel is assumed to be a linear combination of the partial concentrations of the components. To construct more complicated quasi-models that describe the desired dependence better, one can use more efficient adaptive methods for construction of models, for example, the group method of data handling (GMDH) or some types of ANN.

In the earlier studies of one-, two-, and three-component solutions, several approaches were used: "experiment-based", "linear quasi-model", and "GMDH quasi-model". It has been demonstrated [14] that the "linear quasi-model" approach is unusable for determination of concentrations of salts in multi-component solutions. This fact confirms the conclusion about the non-linear influence of salts on the Raman valence band in multi-component solutions. For this reason, the "linear quasi-model" approach was not used in this study.

2. BASIC METHODS OF SOLUTION OF PROBLEM AND CRITERIA OF MODEL EVALUATION

The considered complex IP naturally breaks down into two problems: the problem of identification of the components of the solution and the problem of determination of their partial concentrations. The method applied in this paper solves these two problems simultaneously with a single ANN. The values at the outputs of the ANN are treated as estimates of the concentrations of the components; if the value at some output is lower than a pre-defined threshold, the corresponding component is considered to be absent from the solution. This method was also used in all the above-mentioned earlier experiments with one-, two-, and three-component solutions.

The NN architectures used in this study to solve the complex IP were perceptrons with one and three hidden layers trained by the error back-propagation method, the General Regression NN (GRNN) [12], and the Group Method of Data Handling (GMDH) [13]. For construction of quasi-models, perceptrons and GRNN were tested.

Four basic criteria were used in this study to evaluate the quality of models. In their description, the following notation is used: $y$ is the estimate of the value of an output variable made by the model (neural network); $d$ is the desired value of this output variable; $\bar{d}$ is the average value of the desired output over the whole dataset concerned; $N$ is the number of patterns in the dataset concerned. Summation is carried out over all the patterns of the dataset for which the criterion is calculated (from 1 to $N$).

1. The coefficient of multiple determination $R^2$ is calculated according to the following formula:

$$R^2 = 1 - \frac{\sum (d - y)^2}{\sum (d - \bar{d})^2}. \qquad (1)$$

This criterion compares the error of the constructed model with the error of a trivial reference model (whose estimate is the mean value of the estimated variable over all the patterns of the dataset). When the estimate is absolutely accurate, $R^2$ equals 1. If the accuracy of the estimate is worse than that of the trivial model, $R^2$ is negative. For many kinds of problems, $R^2$ is the most substantial universal criterion of model quality. This criterion is dimensionless.

2. Mean squared error (MSE), taken under the square root so that it has the dimension of the estimated variable, is calculated according to the following formula:

$$\mathrm{MSE} = \sqrt{\frac{\sum (d - y)^2}{N}}. \qquad (2)$$

This criterion has the same dimension as the estimated variable. In our case (determination of concentration), it is M (molarity, i.e. the number of moles of substance per liter of solution).

3. Mean absolute error (MAE) is calculated according to

$$\mathrm{MAE} = \frac{\sum |d - y|}{N}. \qquad (3)$$

This criterion also has the same dimension as the estimated variable.

4. Mean relative error (MRE) is calculated according to

$$\mathrm{MRE} = \frac{1}{N} \sum \left| \frac{d - y}{d} \right| \times 100\%. \qquad (4)$$
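The four criteria (1)-(4) can be illustrated with a short numerical sketch. This is not the authors' code. It assumes NumPy, takes criterion (2) under the square root (as implied by its stated dimension), and skips patterns with d = 0 when computing MRE, since the relative error is undefined there; how the paper treats such patterns is not stated. All function names are hypothetical.

```python
import numpy as np

def r_squared(d, y):
    """Criterion (1): 1 - SS_residual / SS_total."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    return float(1.0 - np.sum((d - y) ** 2) / np.sum((d - d.mean()) ** 2))

def mse_root(d, y):
    """Criterion (2), taken under the square root (same units as d)."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    return float(np.sqrt(np.mean((d - y) ** 2)))

def mae(d, y):
    """Criterion (3): mean absolute error."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    return float(np.mean(np.abs(d - y)))

def mre_percent(d, y):
    """Criterion (4) in percent; patterns with d == 0 are skipped (assumption)."""
    d, y = np.asarray(d, float), np.asarray(y, float)
    mask = d != 0
    return float(np.mean(np.abs((d[mask] - y[mask]) / d[mask])) * 100.0)

# Made-up desired and estimated concentrations, in M.
desired = [0.5, 1.0, 1.5, 2.0]
estimated = [0.6, 0.9, 1.5, 2.2]
print(r_squared(desired, estimated), mse_root(desired, estimated),
      mae(desired, estimated), mre_percent(desired, estimated))
```

For these made-up values the sketch gives R² = 0.952, MAE = 0.1 M and MRE = 10%. Feeding the mean of the desired values as every estimate gives R² = 0, which is the trivial reference model described for criterion (1).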

The MRE criterion is dimensionless and is usually expressed in percent. Because of the limited size of the article, no comparative analysis of the areas of application, advantages and disadvantages of these criteria is presented here.

3. EXPERIMENT

The scheme of the laser Raman spectrometer is presented in Fig. 1. Raman spectra were excited by an argon laser (wavelength 488 nm, output power 350 mW). An edge filter (Semrock) was used to remove the elastic scattering signal; it allowed approaching the laser line to within 200 cm⁻¹. Spectra were registered by a monochromator (Acton 2500i, grating 900 lines/mm, focal length 500 mm) and a CCD camera (Synapse 1024 × 128 BIUV, Jobin Yvon). For every sample, spectra were measured in two regions: 200-2300 cm⁻¹ and 2300-4000 cm⁻¹. The practical resolution of the spectrometer was 2 cm⁻¹, and the accumulation time of one spectrum was 1 s. The temperature of the samples during the experiment was stabilized at 22.0 ± 0.2°C. Spectra were normalized to laser power, duration of registration of the spectrum, and spectral sensitivity of the detector.

Fig. 1. Experimental setup for Raman spectroscopy of liquid water and water solutions: 1, argon laser (488 nm); 2, beam splitter; 3, indicator of laser power; 4, lens; 5, cuvette; 6, focusing system; 7, polarizer; 8, monochromator; 9, photomultiplier (PMT); 10, CCD camera; 11, computer.

The objects of research were water solutions of the salts NaCl, NH4Br, Li2SO4, KNO3, CsI. These salts are present in natural waters at significant concentrations. The concentration of every salt in the solution was varied from 0 to 2.5 M (with increment 0.2-0.25 M). The range of concentrations was chosen based on the following considerations: a seawater salinity of 35‰ corresponds to a 0.5 M concentration of the most widespread salt, NaCl. In mineral waters the concentration of certain salts can reach 1 M, and in waste waters up to 1-2 M. Bidistilled water and analytically pure reagents were used to prepare the experimental solutions.

Some of the obtained Raman spectra of water and water solutions of salts are presented in Fig. 2. In the low-frequency region of the Raman spectrum (200-1800 cm⁻¹) one can observe bands of valence vibrations of the anions SO4²⁻ (3) and NO3⁻ (2). Their intensities depend on the concentrations of the corresponding salts. As the experimental results demonstrate, with increasing concentration of salts the water Raman valence band (2600-4000 cm⁻¹) shifts towards high frequencies, its half-width decreases, the intensity of its high-frequency part increases, and the intensity of its low-frequency part decreases. The changes in position and shape of the water valence band depend on the type and concentration of salt. Moreover, anions influence the behavior of the water valence band more strongly than cations [6].

Fig. 2. Raman spectra (intensity, arb. units, versus wavenumber, cm⁻¹): 1, distilled water; 2, water solution with salt concentrations of 0.4 M for NaCl, NH4Br, Li2SO4 and 0.6 M for KNO3 and CsI; 3, water solution with salt concentrations of 0.6 M for NaCl and Li2SO4 and 0.2 M for NH4Br, KNO3 and CsI.

4. DATA PRE-PROCESSING AND PREPARATION OF DATASETS

During the experiment, two bands of the Raman spectrum were recorded, in the frequency regions 200...2300 cm⁻¹ (low-frequency band) and 2300...4000 cm⁻¹ (valence band). Preparation of the experimental spectra for work with ANN included the following steps:



(1) For work with ANN, narrower ranges of the spectrum were used: 766 channels in the range 281...1831 cm⁻¹ for the low-frequency region and 769 channels in the range 2700...3900 cm⁻¹ for the valence band.

(2) For each of the bands, the pedestal caused by non-controlled elastic scattering in the cuvette with the sample was subtracted. Then the spectra were normalized to the area of the valence band in the indicated region.

(3) For the data array obtained after this processing (1535 features, i.e. intensities in every channel; 8695 patterns), the average value and standard deviation were calculated for every feature. Then the low-informative features with standard deviation less than 2.5 × 10⁻⁵ and the features corresponding to the low-frequency region below 950 cm⁻¹ (see Fig. 2) were excluded from further consideration. Variability of the features in the region below 950 cm⁻¹ is caused mainly by non-controlled changes of light scattering in the cuvette with the sample ("clutter"). The total number of remaining features was 704: 185 channels in the range 950...1673 cm⁻¹ for the low-frequency region and 519 channels in the range 2870...3682 cm⁻¹ for the valence band.

(4) It is well known [6, 7] that for single-component solutions of salts, the dependence of the intensity of the Raman spectrum on the concentration of salt in the range 0...2.5 M is linear. That is why, for single-component solutions, spectra were measured for only 5 different concentrations of each salt. This was enough to make sure that the dependence of intensity on concentration was linear, and to use this dependence to generate quasi-model linearly interpolated spectra of single-component solutions for another 45 intermediate concentrations of each salt.

As a result, the total number of patterns in the data array was 9144: 160 spectra of distilled water (measured at different times, simultaneously with measurement of spectra of different salts); 100 spectra (real and interpolated) for single-component solutions of each of 5 salts; 240 spectra for each of 10 two-component and each of 10 three-component combinations of salts in solutions; 420 spectra for each of 5 four-component combinations of salts; 1584 spectra of five-component solutions.

(5) The total data array thus obtained (704 features, 9144 patterns) was used to calculate the minimal and maximal values of each feature for normalization of input and output variables.

(6) Since two spectra (A and B) were measured for each solution with a given combination of concentrations, partitioning into the datasets necessary for use of NN was performed not by separate spectra but by combinations of concentrations (excluding spectra of distilled water). To provide equal conditions for every combination of salts ("combinational class"), the same proportion of training, test and examination sets had to be observed for all combinations of salts. That is why partitioning into datasets was performed in the following way:

(a) The whole data array was divided into 63 sets: two (containing A spectra and B spectra) for each combinational class with solutions and one for the class containing spectra of distilled water.

(b) Each of these 63 sets was randomly divided into training, test and examination sets (in the ratio 70 : 20 : 10). For each set, the same random seed was used for the random number generator. This procedure guaranteed that spectra with the same concentrations (A and B) were placed in the same set, and therefore that the proportion among the three datasets was equal for all the combinational classes. This provided equality and the necessary independence of the sets and made it possible to avoid illusory estimates in evaluation of the results.

(c) By appending the corresponding parts A and B to each other, 31 training, test and examination sets were obtained, one for each combinational class. The corresponding sets for the class with distilled water were obtained at stage (b).

(d) To work with the whole data array without separation into combinational classes, the training sets for all combinational classes were merged to obtain the common training set. A similar routine was performed for the test and examination sets.

Information about the number of patterns in the different datasets obtained as a result of the described procedure is summarized in Table 1.

5. RESULTS OF SOLUTION OF THE PROBLEM USING THE "EXPERIMENT-BASED" APPROACH

The inputs of the ANN were fed with all the 704 features chosen according to the routine described in the previous section, without any additional selection, compression or other pre-processing aimed at further reduction of the input dimensionality of the problem. Each feature was separately normalized into the 0...1 range over the whole data array (see step 5 in the preceding section).
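Steps (3)-(5) of the pre-processing in Section 4 can be sketched in a few lines. This is an illustration under assumptions, not the authors' implementation: it uses NumPy, the population standard deviation (the text does not say which estimator was used), a channel-wise least-squares straight line for the single-component spectra (the text only says the dependence is linear), and hypothetical function names.

```python
import numpy as np

STD_THRESHOLD = 2.5e-5   # step (3): minimum standard deviation of a kept feature
MIN_WAVENUMBER = 950.0   # step (3): channels below this wavenumber (cm^-1) are dropped

def informative_mask(spectra, wavenumbers,
                     std_threshold=STD_THRESHOLD, min_wavenumber=MIN_WAVENUMBER):
    """Boolean mask of channels kept after step (3)."""
    spectra = np.asarray(spectra, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    return (spectra.std(axis=0) >= std_threshold) & (wavenumbers >= min_wavenumber)

def interpolate_single_salt(conc_measured, spectra_measured, conc_new):
    """Step (4): fit I(c) = slope * c + intercept in every channel and
    evaluate the fitted line at new concentrations."""
    slope, intercept = np.polyfit(conc_measured, spectra_measured, deg=1)
    return np.outer(conc_new, slope) + intercept

def minmax_scale(spectra):
    """Step (5): scale each feature into the 0...1 range over the whole array."""
    spectra = np.asarray(spectra, dtype=float)
    lo, hi = spectra.min(axis=0), spectra.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
    return (spectra - lo) / span

# Toy data: 3 spectra with 3 channels at 900, 1000 and 3000 cm^-1.
toy = np.array([[0.1, 0.5, 0.2], [0.3, 0.5, 0.4], [0.5, 0.5, 0.6]])
print(informative_mask(toy, [900.0, 1000.0, 3000.0]))
```

In the toy data the first channel is dropped because it lies below 950 cm⁻¹ and the second because it does not vary, so only the third channel is kept.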


Table 1. Number of patterns in different datasets (rows 0-5 give counts per class; the Total row sums over all classes)

Number of salts   Number of   Total number   Training   Test       Examination
in solution       classes     of patterns    patterns   patterns   patterns
0                 1           160            112        32         16
1                 5           100            70         20         10
2                 10          240            168        48         24
3                 10          240            168        48         24
4                 5           420            294        84         42
5                 1           1584           1110       316        158
Total             32          9144           6402       1828       914
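The partitioning of Section 4, step (6), can be sketched as follows. This is an illustration only: NumPy's random generator stands in for whatever generator the authors used, and the rounding rule (test and examination shares rounded down, the remainder assigned to training) is an assumption that happens to reproduce the counts in Table 1. The numbers of concentration combinations per class are obtained by halving the spectra counts in Table 1, since two spectra (A and B) were measured per solution; the function names are hypothetical.

```python
import numpy as np

def split_combinations(n_combinations, seed=0, test_percent=20, exam_percent=10):
    """Split indices of concentration combinations into train/test/exam parts.
    Calling it with the same seed for the A and B sets of a class yields the
    same partition, so both spectra of a solution land in the same dataset."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_combinations)
    n_test = n_combinations * test_percent // 100   # rounded down (assumption)
    n_exam = n_combinations * exam_percent // 100   # rounded down (assumption)
    n_train = n_combinations - n_test - n_exam
    return order[:n_train], order[n_train:n_train + n_test], order[n_train + n_test:]

def pattern_counts(n_combinations, spectra_per_combination=2, seed=0):
    """Numbers of training, test and examination patterns for one class."""
    parts = split_combinations(n_combinations, seed)
    return tuple(spectra_per_combination * len(part) for part in parts)

# (combinations per class, spectra per combination, number of classes)
CLASSES = [(160, 1, 1), (50, 2, 5), (120, 2, 10), (120, 2, 10), (210, 2, 5), (792, 2, 1)]

totals = np.zeros(3, dtype=int)
for n_comb, per_comb, n_classes in CLASSES:
    totals += n_classes * np.array(pattern_counts(n_comb, per_comb))
print(totals)  # training, test and examination totals over all classes
```

Under the stated rounding assumption, the per-class counts and the totals agree with Table 1, for example 1110, 316 and 158 patterns for the five-component class.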

Table 2. Comparison of the results of determination of salt concentrations by Raman spectra in multi-component solutions (MAE on examination set, M)

Method       NaCl    NH4Br   Li2SO4  KNO3    CsI
Perceptron   0.029   0.024   0.020   0.019   0.023
GRNN         0.102   0.047   0.064   0.058   0.050
GMDH         0.059   0.046   0.032   0.033   0.046
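As a rough illustration of the network shape reported in this section (704 inputs, hidden layers of 40, 20 and 10 logistic neurons, five linear outputs) and of the thresholding rule from Section 2, the following sketch runs a forward pass and converts the outputs into an identification result. The weights are random and untrained, the threshold of 0.05 M is hypothetical (the paper does not give its value), the outputs are treated directly as concentrations in M, one bias per neuron is assumed, and all names are illustrative.

```python
import numpy as np

SALTS = ("NaCl", "NH4Br", "Li2SO4", "KNO3", "CsI")
LAYER_SIZES = (704, 40, 20, 10, 5)  # inputs, three hidden layers, outputs

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(rng, sizes=LAYER_SIZES):
    """Random (untrained) weights and zero biases, for shape checking only."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, params):
    """Logistic activation in hidden layers, linear activation in the output layer."""
    a = np.asarray(x, dtype=float)
    for weights, bias in params[:-1]:
        a = logistic(a @ weights + bias)
    weights, bias = params[-1]
    return a @ weights + bias

def identify(outputs, threshold=0.05):
    """Salts whose estimated concentration reaches the threshold (hypothetical
    value); salts below the threshold are treated as absent."""
    return {salt: float(c) for salt, c in zip(SALTS, outputs) if c >= threshold}

params = init_params(np.random.default_rng(0))
n_parameters = sum(w.size + b.size for w, b in params)
print(forward(np.zeros(704), params).shape, n_parameters)
print(identify(np.array([0.5, 0.01, 0.0, 1.2, -0.02])))
```

With one bias per neuron, such a network has 29 285 adjustable parameters. The last line shows how a hypothetical output vector would be read: NaCl and KNO3 are reported as present, and the other three salts as absent.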

The ANN had five outputs, according to the maximal number of components in a solution. Each output corresponded to one of the considered salts, and its desired value corresponded to the concentration of this salt in the solution. Three models were used to solve the considered problem: perceptrons with three hidden layers, GRNN and GMDH. The results (MAE on the examination set, M) are presented in Table 2.

As can be seen from Table 2, the best results were demonstrated by the perceptron with three hidden layers. The hidden layers contained 40, 20 and 10 neurons. A linear activation function was used in the output layer, and a logistic activation function was used in the hidden layers. The following values of training parameters were used: learning rate 0.01; momentum 0.5; stop criterion 1000 epochs after the minimum of error on the test dataset.

Table 3 compares the results obtained earlier by the authors of this study [11] for diagnostics of three-component solutions using only the valence band with the results obtained in this study using both bands of the spectrum and the valence band only.

Table 3. Mean absolute error (MAE) for determination of concentrations of salts by Raman spectra in multi-component solutions, M (on examination dataset)

Experiment            Band      Range, M   NaCl    NH4Br   Li2SO4  KNO3    CsI     KI
[10, 11], 3 comp.     Valence   0...0.7    0.07    0.06    -       -       -       0.05
[10, 11], 3 comp.     Valence   0...1      0.07    0.11    -       -       -       0.12
This study, 5 comp.   Valence   0...2.5    0.047   0.029   0.040   0.046   0.032   -
This study, 5 comp.   Both      0...2.5    0.029   0.024   0.020   0.019   0.023   -

The significantly better results obtained in this study compared to the results of [10, 11], in a wider range of concentrations and for five-component (instead of three-component) solutions, can be explained by the fact that in the present study the low-frequency region of the spectrum (where bands of complex ions are present) was taken into consideration.

6. RESULTS OF CONSTRUCTION OF QUASI-MODELS

One of the main reasons for the decrease in accuracy of solution of this inverse problem using the "experiment-based" approach is the unfavorable ratio of the number of input variables of the problem (704) to the number of patterns in the training set (8229 in the training and test sets together). One can overcome it in one of two ways: by further reduction of the number of input variables (this way was not considered in this study) or by increasing the number of patterns.

When additional experiments cannot be conducted, one can try to obtain additional patterns by spectra interpolation using "quasi-models", i.e. parametrical or adaptive models of solution of the direct problem. In our case this is the model of the dependence of the intensity in each channel of the spectrum on the concentrations of the components in the solution.

However, one should understand that if a "quasi-model" is not adequate enough, using the "quasi-model" approach can lead not to improvement but to degradation of the quality of solution of the IP. In particular, for this reason, a linear "quasi-model" is unsuitable for modeling of multi-component mixtures [10, 11].

In this study, two types of quasi-models were considered: based on perceptron and based on GRNN. To determine the applicability of the "quasi-model" approach to this problem, it was first necessary to choose the best of the constructed quasi-models (i.e. the quasi-model providing the smallest error of solution of the direct problem on the examination set). In Table 4, statistics calculated over all 704 modeled channels are presented for the quasi-models based on perceptron and on GRNN. From this table, one can see that the average statistics were close, so at this stage there was no reason to prefer either of these two quasi-models.

Table 4. Statistics for construction of quasi-models based on perceptron and based on GRNN (on examination dataset)

Index, model          min     max     mean    standard deviation
R², perceptron        0.681   0.983   0.933   0.048
R², GRNN              0.768   0.974   0.933   0.041
MRE, %, perceptron    0.5     272.9   9.9     27.0
MRE, %, GRNN          0.4     75.0    7.1     11.6

7. RESULTS OF USE OF "QUASI-MODEL" APPROACH FOR SOLUTION OF THE INVERSE PROBLEM

The data for the "quasi-model" approach were prepared in the following way. To create the "quasi-model" datasets, we used a grid with 0.15 M increment for all salts; only spectra with total concentration of salts less than 2.5 M were considered. There were 53 130 such spectra with different concentrations of salts, i.e. more than 10 times as many as experimental spectra with different concentrations.

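The size of the quasi-model grid can be checked with simple combinatorics. The number of grid nodes for n salts whose integer step indices sum to at most S is the binomial coefficient C(S + n, n). The reported 53 130 equals C(25, 5), i.e. five salts with index sum up to 20, whereas a literal reading of the text (0.15 M step, total below 2.5 M, hence index sum up to 16) gives C(21, 5) = 20 349. The sketch below only verifies this arithmetic and the resulting dataset sizes; which grid the authors actually used is not stated in the text, and the function names are hypothetical.

```python
from math import comb

def grid_nodes(n_salts, max_index_sum):
    """Yield tuples of non-negative integer step indices (one per salt)
    whose sum does not exceed max_index_sum."""
    if n_salts == 0:
        yield ()
        return
    for k in range(max_index_sum + 1):
        for rest in grid_nodes(n_salts - 1, max_index_sum - k):
            yield (k,) + rest

def quasi_model_sets(n_quasi, exp_train, exp_test):
    """70 : 20 : 10 split of quasi-model spectra; the experimental training
    and test sets are then added to the first two parts."""
    q_train = n_quasi * 70 // 100
    q_test = n_quasi * 20 // 100
    q_exam = n_quasi - q_train - q_test
    return q_train + exp_train, q_test + exp_test, q_exam

n_reported = sum(1 for _ in grid_nodes(5, 20))   # index sum up to 20
n_literal = sum(1 for _ in grid_nodes(5, 16))    # 16 * 0.15 M = 2.4 M < 2.5 M
print(n_reported, comb(25, 5), n_literal, comb(21, 5))
print(quasi_model_sets(53130, exp_train=6402, exp_test=1828))
```

With 6402 training and 1828 test patterns from the experimental array (Table 1), the split of 53 130 quasi-model spectra gives 43 593 training, 12 454 test and 5313 quasi-model examination patterns, matching the dataset sizes quoted in this section.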
The obtained array of 53 130 "quasi-model" spectra was randomly divided into training, test, and examination datasets in the ratio 70 : 20 : 10. After this division, the resulting training and test sets were supplemented with the training and test datasets from the experimental array. The examination datasets were kept separate. Thus, the following datasets were used for the "quasi-model" approach: training set (43 593 patterns), test set (12 454 patterns), examination quasi-model set (5313 patterns) and examination experimental set (914 patterns). Two such complete sets were obtained: one with the "quasi-model" spectra generated by the perceptron-based "quasi-model", the other by the GRNN-based "quasi-model".

The training sets thus obtained were used to train identical perceptrons, with architecture and training parameters identical to those used to solve the IP within the "experiment-based" approach (Section 5).

Table 5 compares the results obtained within the "experiment-based" approach on the examination (experimental examination) dataset with the results obtained within the "quasi-model" approach on the quasi-model examination dataset and on the experimental examination dataset, for the two kinds of "quasi-model" (based on perceptron and based on GRNN).

The provided results lead to the following conclusions.

(1) The "quasi-model" approach failed to meet expectations compared to the "experiment-based" approach. In all cases, the results on the experimental examination dataset obtained within the "quasi-model" approach turned out to be worse or significantly worse than the results obtained within the "experiment-based" approach. This is evidence of the low adequacy of the "quasi-models" used.

(2) The "quasi-model" based on GRNN provided results on the experimental examination dataset that outperformed the results provided by the "quasi-model" based on perceptron.

Table 5. Some statistics for NN determination of concentrations of salts in multi-component solutions by Raman spectra within the "experiment-based" and "quasi-model" approaches. Notation for the datasets: EE, experimental examination; QE, quasi-model examination. Notation for the approaches: Exp, "experiment-based" approach; QMP and QMGR, "quasi-model" approach with "quasi-model" based on perceptron and GRNN, respectively

Criterion   Approach   Dataset   NaCl    NH4Br   Li2SO4  KNO3    CsI
R²          Exp        EE        0.990   0.993   0.996   0.996   0.994
R²          QMP        EE        0.969   0.985   0.983   0.986   0.987
R²          QMGR       EE        0.980   0.986   0.992   0.991   0.989
R²          QMP        QE        0.989   0.995   0.990   0.991   0.996
R²          QMGR       QE        0.990   0.992   0.991   0.991   0.992
MSE, M      Exp        EE        0.041   0.036   0.029   0.031   0.036
MSE, M      QMP        EE        0.070   0.051   0.057   0.054   0.051
MSE, M      QMGR       EE        0.057   0.050   0.040   0.042   0.047
MSE, M      QMP        QE        0.041   0.027   0.037   0.038   0.026
MSE, M      QMGR       QE        0.039   0.034   0.037   0.037   0.035
MAE, M      Exp        EE        0.029   0.024   0.020   0.019   0.023
MAE, M      QMP        EE        0.050   0.037   0.041   0.035   0.037
MAE, M      QMGR       EE        0.042   0.034   0.029   0.029   0.031
MAE, M      QMP        QE        0.030   0.021   0.027   0.028   0.019
MAE, M      QMGR       QE        0.029   0.025   0.027   0.027   0.025
MRE, %      Exp        EE        9.2     5.8     5.6     4.3     5.9
MRE, %      QMP        EE        17.4    9.3     10.4    8.2     9.3
MRE, %      QMGR       EE        12.5    8.8     7.7     6.8     7.5
MRE, %      QMP        QE        10.4    7.2     8.6     8.9     6.2
MRE, %      QMGR       QE        11.3    9.5     9.5     9.5     9.1

(3) The results obtained within the "quasi-model" approach on the "quasi-model" examination dataset are comparable to those obtained within the "experiment-based" approach on the experimental examination set, sometimes outperforming them. This is a manifestation of the higher representativity of the datasets in the "quasi-model" approach.

CONCLUSIONS

(1) Unique experimental material has been obtained. This material is an array of Raman spectra (in the frequency range from 200 cm⁻¹ to 4000 cm⁻¹) of water solutions of inorganic salts (NaCl, NH4Br, Li2SO4, KNO3, CsI) in the range of total concentrations from 0 to 2.5 M (moles per liter of solution).

(2) Based on the obtained array of spectra, a special procedure providing optimal representativity of datasets was used to form the training, test, and examination datasets for subsequent work.

(3) The complex IP of identification of salts and determination of their partial concentrations in a five-component water solution by Raman spectra was solved within the "experiment-based" approach using both bands of the Raman spectrum (low-frequency and valence bands), as well as using only the water Raman valence band, and also within the "quasi-model" approach using both bands of the spectrum. When the experimental Raman spectra used as input data included, besides the valence Raman band of water (2700-3900 cm⁻¹), also the low-frequency part of the Raman spectrum of water and ions of the dissolved salts (280-1830 cm⁻¹), the obtained values of the error of determination of concentration on the examination dataset were low enough: the mean absolute error was from 0.019 to 0.029 M in the concentration range from 0 to 2.5 M. These results significantly outperform the results obtained within the same experiment by the water Raman valence band only, and they outperform several-fold the results obtained earlier for three-component solutions in a narrower concentration range (Table 3).

(4) Quasi-models calculated from experimental spectra were built based on perceptron and on GRNN. The quality of approximation of the sought-for dependence was similar for both quasi-models. Quasi-models based on perceptron and GRNN were used to form training, test, and examination datasets with the necessary representativity for the "quasi-model" approach to the IP solution.

(5) The "quasi-model" approach failed to meet expectations compared to the "experiment-based" approach. In all cases, the results on the experimental examination dataset obtained within the "quasi-model" approach turned out to be worse or significantly worse than the results obtained within the "experiment-based" approach (Table 5). This is evidence that it is necessary to elaborate more complex "quasi-models", adequate to the complexity of the direct problem being solved.

ACKNOWLEDGMENTS

This study has been performed with financial support from the Human Capital Foundation, grant no. 186 IT for 2009.

REFERENCES
1. Dolenko, S.A., Dolenko, T.A., Persiantsev, I.G., Fadeev, V.V., and Burikov, S.A., Solution of Inverse Problems of Optical Spectroscopy Using Neural Networks, Neurocomputers: Elaboration, Application, 2005, nos. 1-2, pp. 89-97 [in Russian].
2. Baldwin, S.F. and Brown, C.W., Detection of Ionic Water Pollutants by Laser Excited Raman Spectroscopy (short communication), Water Research, 1972, vol. 6, pp. 1601-1604.
3. Rudolph, W.W. and Irmer, G., Raman and Infrared Spectroscopic Investigation on Aqueous Alkali Metal Phosphate Solutions and Density Functional Theory Calculations of Phosphate-Water Clusters, Applied Spectroscopy, 2007, vol. 61, no. 12, pp. 274A-292A.
4. Georgiev, G.M., Kalkanjiev, T.K., Petrov, V.P., and Nickolov, Zh., Determination of Salts in Water Solutions by a Skewing Parameter of the Water Raman Band, Applied Spectroscopy, 1984, vol. 38, no. 4, pp. 593-595.
5. Furic, K., Ciglenecki, I., and Cosovic, B., Raman Spectroscopic Study of Sodium Chloride Water Solutions, J. Molecular Structure, 2000, vol. 6, pp. 225-234.
6. Dolenko, T.A., Churina, I.V., Fadeev, V.V., and Glushkov, S.M., Valence Band of Liquid Water Raman Scattering: Some Peculiarities and Applications in the Diagnostics of Water Media, J. Raman Spectroscopy, 2000, vol. 31, pp. 863-870.
7. Bekkiev, A.Yu., Gogolinskaya, T.A., and Fadeev, V.V., Simultaneous Determination of Temperature and Salinity of Sea Water Using the Laser Raman Spectroscopy Method, Sov. Phys. Dokl., 1983, vol. 271, no. 4, pp. 849-853.
8. Burikov, S.A., Dolenko, T.A., Fadeev, V.V., and Sugonyaev, A.V., New Opportunities in the Determination of Inorganic Compounds in Water by the Method of Laser Raman Spectroscopy, Laser Physics, 2005, vol. 15, no. 8, pp. 1-5.
9. Burikov, S.A., Dolenko, T.A., Fadeev, V.V., and Sugonyaev, A.V., Identification of Inorganic Salts and Determination Their Concentrations in Water Solutions Above the Water Raman Valence Band Using Artificial Neural Networks, Pattern Recognition and Image Analysis, 2005, vol. 15, no. 2, pp. 520-523.
10. Burikov, S.A., Dolenko, T.A., and Fadeev, V.V., Identification of Inorganic Salts and Determination Their Concentrations in Multi-components Water Solutions Over Water Raman Valence Band Using Artificial Neural Networks, Neurocomputers: Elaboration, Application, 2007, no. 5, pp. 62-72 [in Russian].
11. Burikov, S.A., Dolenko, T.A., and Fadeev, V.V., Identification of Inorganic Salts and Determination of Their Concentrations in Water Solutions from the Raman Valence Band Using Artificial Neural Networks, Pattern Recognition and Image Analysis, 2007, vol. 17, no. 4, pp. 554-559.
12. Specht, D., A General Regression Neural Network, IEEE Trans. on Neural Networks, 1991, vol. 2, no. 6, pp. 568-576.
13. Madala, H.R. and Ivakhnenko, A.G., Inductive Learning Algorithms for Complex Systems Modeling, CRC Press, 1994.