
Neural networks. Statistica Neural Networks

INTRODUCTION TO MODERN NEURAL NETWORKS

Laboratory work No. 1

SOFTWARE PRODUCT STATISTICA NEURAL NETWORKS (SNN) VERSION “SNN 7.0”

Goal of the work: to become acquainted with the Statistica Neural Networks (SNN) software product and to build a neural network using the Solution Wizard.

1. Open the data file Fan.stw (Table A.1) using the File → Open command. This file contains data on two classes, 1 and 2, corresponding to the presence and absence of overheating.

2. Select the Neural Networks command on the Analysis menu to open the STATISTICA Neural Networks launch pad.

Fig. 4. Tool selection

3. On the Quick tab of the Neural Networks launch pad, select a task type from the list (in this case, Classification) and a solution method (in this case, the Solution Wizard), then press OK (Fig. 4). After this, the standard variable selection dialog will be displayed.

4. Select the dependent (output) variable (in this case, the CLASS variable) (Fig. 5).

Fig. 5. Input data

5. To display the Solution Wizard, press OK on the launch pad.

On the Quick tab (Fig. 6), deselect the option Select a subset of independent variables: only two independent variables are defined here, so both will be used as inputs for all neural networks being tested. The Duration of analysis group contains options that determine how long the Solution Wizard will spend searching for an effective neural network; the longer the Solution Wizard works, the better the solution found will be. For this example, set it to test 25 networks.

Based on the results of the analysis, you can save neural networks of various types with different performance and complexity indicators so that you can ultimately choose the best network yourself.

6. Enter the number 10 in the networks-to-save field so that the Solution Wizard retains only the 10 best network options.

The Solution Wizard's Quick tab will then look as shown in Fig. 6.

Fig. 6. Settings for analysis

Press OK so that the Solution Wizard starts building neural networks. After this, the Training in progress (Solution Wizard) dialog will be displayed. Each time an improved neural network is discovered, a new row is added to the information table; the elapsed time and the percentage of the task completed are shown at the bottom of the window. If no improvement occurs over a long period, press the Finish button in the Training in progress dialog to end the network search. After the search completes, the Results dialog is displayed, containing information about the networks found for further analysis (Fig. 7).



Fig. 7. Learning outcomes

7. Press the Descriptive stats. button on the Quick tab of the Results dialog to display two summary tables: Classification and Error Matrix.

The classification table (Fig. 8) provides complete information on the solution of the problem. The table contains a column for each output class predicted by each model; for example, the column labeled CLASS.1.11 corresponds to the predictions of model 1 for the OVERHEAT class of the variable CLASS. The first row gives the number of observations of each type of overheating in the data file. The second (third) row shows, for each class, the number of correctly (incorrectly) classified observations. The fourth row lists the "unknown" observations. The error matrix is usually used in problems with several classes.

8. To display the final statistics, reopen the analysis (the Results button in the Analysis line, or the Continue command on the Analysis menu). In the group Selections for displaying results, select the option All (separately), then press the Descriptive Statistics button. The final classification table is divided into four parts. The column headings carry different prefixes: O, K, T and I, which correspond to the training, validation, test and ignored samples, respectively. By default, observations are divided into three subsets in a 2:1:1 ratio; thus 50 training, 25 validation and 25 test observations were allocated. The network's results on these sets are almost the same, so the quality of the neural network can be considered acceptable.

Fig. 8. Classification table

9. To complete the analysis, press OK in the Results dialog. Pressing Cancel on the launch pad deletes all constructed neural networks. Saving is necessary so that a trained network can be reused without repeating the search: first find the network with the best performance, then save the constructed networks for further use. To save a neural network, select the Networks/Ensembles tab and press the button Save network file as... (the file has the extension .snn).
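For readers who want to reproduce the idea outside of SNN, the wizard's behavior described in the steps above (try many candidate networks, score each on held-out data, keep only the best few) can be sketched in plain Python. Everything below is an illustrative assumption, not part of SNN: the candidate list stands in for real architectures, and the scoring stub stands in for actual training plus validation error.

```python
# A minimal sketch of what the Solution Wizard automates: try many candidate
# networks, score each on held-out data, and keep only the best few.
# The candidate list and the scoring stub are illustrative assumptions.

def search_networks(candidates, evaluate, keep=10):
    """Score every candidate architecture and return the `keep` best."""
    scored = [(evaluate(c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0])          # lower error is better
    return [c for _, c in scored[:keep]]

# Candidate architectures: hidden-layer sizes for a hypothetical MLP.
candidates = list(range(1, 26))                    # "25 networks", as in step 5

# Stand-in for real training + validation error (here: a made-up error
# curve with a minimum at 8 hidden units, purely for demonstration).
def evaluate(hidden_units):
    return (hidden_units - 8) ** 2 / 100 + 0.05

best = search_networks(candidates, evaluate, keep=10)
print(best[0])   # prints 8: the architecture with the lowest validation error
```

Keeping the 10 best candidates rather than only the winner mirrors step 6 above: it lets you weigh performance against complexity yourself afterwards.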

Tasks

1. Build and train a neural network using the Solution Wizard to automate vehicle diagnostics: determine the need for an engine overhaul based on the following parameters: engine compression, oil pressure, and gasoline consumption.

2. Enter the initial data in accordance with Table 1; obtain specific values of the variables from the teacher.

3. Build a neural network in accordance with the settings:

Problem type: classification;

Tool: Solution Wizard;

Number of networks: 25;

4. Analyze the construction of the neural network and reflect it in the report.

5. Prepare a report on the work performed.

STATISTICA Automated Neural Networks is the only neural network software product in the world that is fully translated into Russian!

Neural network methodologies are becoming increasingly widespread in a variety of fields, from fundamental research to practical applications in data analysis, business, industry, etc.

STATISTICA Automated Neural Networks is one of the most advanced and effective neural network products on the market. It offers many unique benefits and rich features. For example, the unique capabilities of the automatic neural network search tool, Automated Network Search (ANS), allow the system to be used not only by experts in neural networks but also by beginners in neural network computing.

What are the advantages of using STATISTICA Automated Neural Networks?

    Pre- and post-processing, including data selection, nominal coding, scaling, normalization, and handling of missing data, with interpretation for classification, regression and time-series problems;

    Exceptional ease of use plus unrivaled analytical power; for example, the unique automatic network search tool, Automated Neural Network (ANN), guides the user through all the stages of creating various neural networks and selects the best one (a task otherwise solved through a long process of trial and error that requires serious knowledge of theory);

    The most modern, optimized and powerful network training algorithms (including conjugate gradient methods, Levenberg-Marquardt algorithm, BFGS, Kohonen algorithm); full control over all parameters affecting network quality, such as activation and error functions, network complexity;

    Support for ensembles of neural networks and neural network architectures of almost unlimited size;

    Rich graphical and statistical capabilities that facilitate interactive exploratory analysis;

    Full integration with the STATISTICA system; all results, graphs, reports, etc. can be further modified using the powerful graphical and analytical tools of STATISTICA (for example, to analyze predicted residuals, create a detailed report, etc.);

    Seamless integration with the powerful automated tools of STATISTICA: recording full-fledged macros for any analysis; creating your own neural network analyses and applications using STATISTICA Visual Basic; calling STATISTICA Automated Neural Networks from any application that supports COM technology (for example, automatic neural network analysis in an MS Excel spreadsheet, or combining several custom applications written in C, C++, C#, Java, etc.).

  • A selection of the most popular network architectures, including Multilayer Perceptrons, Radial Basis Functions, and Self-Organizing Feature Maps.
  • The Automatic Network Search tool, which builds various neural network architectures in automatic mode and regulates their complexity.
  • Retention of the best neural networks.

    Support for various types of statistical analysis and construction of predictive models, including regression, classification, time series with continuous and categorical dependent variables, and cluster analysis for dimensionality reduction and visualization.

    Support for loading and analyzing multiple models.

  • Optional ability to generate source code in C, C++, C#, Java, PMML (Predictive Model Markup Language), which can be easily integrated into an external environment to create your own applications.

Code generator

The STATISTICA Automated Neural Networks code generator can produce source code for neural network models in C, Java and PMML (Predictive Model Markup Language). The code generator is an add-on to STATISTICA Automated Neural Networks that allows users, based on a neural network analysis, to generate a C or Java file with the source code of the models and integrate it into independent external applications.

    The code generator requires STATISTICA Automated Neural Networks.

    Generates a version of the neural network source code (as a file in C, C++, C# or Java).

    The C or Java code file can then be embedded in external programs.
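To make the idea concrete, here is a sketch (in Python rather than the C or Java the generator actually emits) of the kind of standalone scoring routine such a generator produces: the trained weights are frozen as constants, so the function needs no SNN runtime. The tiny 2-input, 2-hidden, 1-output architecture and all weight values below are invented for illustration.

```python
# Sketch of generator-style output: a trained network frozen as constants.
# All weights and biases here are hypothetical, for illustration only.
import math

W1 = [[0.8, -0.4], [0.3, 0.9]]     # input -> hidden weights
B1 = [0.1, -0.2]                   # hidden biases
W2 = [1.2, -0.7]                   # hidden -> output weights
B2 = 0.05                          # output bias

def score(x1, x2):
    """Feed-forward pass: logistic hidden units, linear output."""
    hidden = []
    for j in range(2):
        s = W1[0][j] * x1 + W1[1][j] * x2 + B1[j]
        hidden.append(1.0 / (1.0 + math.exp(-s)))   # logistic activation
    return W2[0] * hidden[0] + W2[1] * hidden[1] + B2

out = score(0.5, -1.0)
print(round(out, 4))
```

Because everything is plain arithmetic on frozen constants, the same routine translates line for line into C, C#, or Java for embedding in external programs, which is exactly the point of the generator.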

STATISTICA Automated Neural Networks in neural network computing

The use of neural networks involves much more than just processing data using neural network methods.

STATISTICA Automated Neural Networks (SANN) provides a variety of functionality for working with very complex tasks, including not only the latest neural network architectures and learning algorithms but also new approaches to constructing neural network architectures, with the ability to enumerate various activation and error functions, which makes the results easier to interpret. In addition, software developers and users experimenting with application settings will appreciate that, after conducting experiments in the simple and intuitive interface of STATISTICA Automated Neural Networks (SANN), neural network analyses can be combined in a custom application. This can be achieved either with the STATISTICA COM function library, which fully reflects all the functionality of the program, or with the C/C++ code generated by the program for running a fully trained neural network.

The STATISTICA Automated Neural Networks module is fully integrated with the STATISTICA system; thus, a huge selection of tools for editing and preparing data for analysis (transformations, conditions for selecting observations, data verification tools, etc.) is available.

Like all analyses in STATISTICA, the program can be "attached" to a remote database using in-place processing tools, or linked to live data so that models are trained or run (for example, to calculate predicted values or classifications) automatically every time the data change.

Data scaling and nominal value conversion

Before data are fed into the network, they must be prepared in a certain way, and it is equally important that the output data can be interpreted correctly. STATISTICA Automated Neural Networks (SANN) can automatically scale input and output data; variables with nominal values can also be automatically recoded (for example, Gender = {Male, Female}), including by the 1-of-N coding method. SANN also contains tools for working with missing data, as well as data preparation and interpretation tools specifically designed for time-series analysis. A wide variety of similar tools is also implemented in STATISTICA.
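The two preprocessing steps just mentioned, scaling of numeric inputs and 1-of-N coding of nominal values, are simple enough to sketch directly; the sample data below are invented for illustration.

```python
# A minimal sketch of the preprocessing SANN performs automatically:
# min-max scaling of numeric inputs and 1-of-N (one-hot) coding of
# nominal values. The sample values are invented for illustration.

def minmax_scale(values):
    """Rescale a list of numbers to the [0, 1] interval."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_of_n(value, categories):
    """Encode a nominal value as a 1-of-N indicator vector."""
    return [1 if value == c else 0 for c in categories]

pressures = [2.0, 4.0, 6.0]
print(minmax_scale(pressures))                    # [0.0, 0.5, 1.0]
print(one_of_n("Female", ["Male", "Female"]))     # [0, 1]
```

The 1-of-N vector gives each category its own input neuron, which is why nominal recoding changes the effective input dimension of the network.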

In classification problems it is possible to set confidence thresholds that STATISTICA Automated Neural Networks (SANN) then uses to assign observations to one class or another. In combination with the Softmax activation function and the cross-entropy error functions implemented in SANN, this provides a principled, probability-theoretic approach to classification problems.
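The softmax and cross-entropy pairing mentioned above is compact enough to write out; this is a generic sketch of the standard definitions, with invented scores, not SANN's internal code.

```python
# Softmax turns raw output scores into class probabilities; cross-entropy
# penalizes the model by the negative log-probability of the true class.
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]   # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[target_index])

probs = softmax([2.0, 1.0, 0.1])     # invented output scores for 3 classes
print(round(sum(probs), 6))          # 1.0: probabilities sum to one
```

Because the outputs sum to one, they can be read directly as class membership probabilities, which is what makes threshold-based class assignment meaningful.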

Selecting a neural network model, ensembles of neural networks

The variety of neural network models and the many parameters that need to be set (network sizes, learning-algorithm parameters, etc.) can confuse some users. This is why there is the automatic network search tool, Automated Network Search (ANS), which can automatically find a suitable network architecture of any complexity (see below). The STATISTICA Automated Neural Networks (SANN) system implements all the main types of neural networks used in solving practical problems, including:

    multilayer perceptrons (feed-forward networks);

    networks based on radial basis functions;

    self-organizing Kohonen maps.

The above architectures are used in regression, classification, time series (with continuous or categorical dependent variable) and clustering problems.

In addition, STATISTICA Automated Neural Networks (SANN) implements network ensembles, formed from random (but significant) combinations of the above networks. This approach is especially useful for noisy and low-dimensional data.
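How an ensemble combines its members can be sketched in two lines each: average the member outputs for regression, or take a majority vote of predicted classes for classification. The member predictions below are invented for illustration, and the combination rules shown are the common generic ones, not necessarily SANN's exact ones.

```python
# A sketch of two standard ways a network ensemble combines its members.
# Member predictions are invented for illustration.

def ensemble_average(member_outputs):
    """Regression ensemble: mean of the members' outputs."""
    return sum(member_outputs) / len(member_outputs)

def ensemble_vote(member_classes):
    """Classification ensemble: the class predicted most often."""
    return max(set(member_classes), key=member_classes.count)

print(ensemble_average([1.0, 2.0, 3.0]))   # 2.0
print(ensemble_vote([1, 2, 1]))            # 1
```

Averaging tends to cancel the uncorrelated errors of individual members, which is why ensembles help most on noisy data.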

The STATISTICA Automated Neural Networks (SANN) package provides numerous tools to help the user select an appropriate network architecture. The statistical and graphical tools of the system include histograms, matrices and error graphs for the entire population and for individual observations, and summary data on correct/incorrect classification; all important statistics (for example, the explained proportion of variance) are calculated automatically.

For data visualization, the SANN package implements scatterplots and 3D response surfaces that help the user understand the network's "behavior".

Of course, any information obtained from the listed sources can be used for further analysis with other STATISTICA tools, as well as for subsequent inclusion in reports or for customization.

STATISTICA Automated Neural Networks (SANN) automatically remembers the best network option among those obtained while experimenting with the task, and you can refer to it at any time. The usefulness of the network and its predictive ability are automatically tested on a special test set of observations, as well as by estimating the size of the network, its efficiency, and the cost of misclassification. The automatic cross-validation and weight-regularization procedures implemented in SANN allow you to quickly determine whether your network is under- or over-complicated for a given task.

To improve performance, the STATISTICA Automated Neural Networks package offers numerous network configuration options. For example, you can specify a linear output layer in regression problems, or a softmax activation function in probabilistic estimation and classification problems. The system also implements cross-entropy error functions based on information-theoretic models, and a number of special activation functions, including the identity, exponential, hyperbolic, logistic (sigmoid) and sine functions, for both hidden and output neurons.

Automated neural network (automatic search and selection of various neural network architectures)

Part of the STATISTICA Automated Neural Networks (SANN) package is the automatic network search tool, Automated Network Search (ANS), which evaluates many neural networks of varying architecture and complexity and selects the best architecture for a given task.

When creating a neural network, considerable time is spent selecting appropriate variables and optimizing the network architecture by heuristic search. STATISTICA Automated Neural Networks (SANN) takes over this work and automatically conducts the heuristic search for you. This procedure takes into account the input dimension, network type, network size, activation functions, and even the required output error functions.

It is an extremely effective tool when using complex techniques, allowing the best network architecture to be found automatically. Instead of spending hours in front of the computer, let STATISTICA Automated Neural Networks (SANN) do this work for you.

The success of your experiments in finding the best network type and architecture depends significantly on the quality and speed of the network learning algorithms. STATISTICA Automated Neural Networks (SANN) implements the best training algorithms available to date.

STATISTICA Automated Neural Networks (SANN) implements two fast second-order algorithms: conjugate gradient methods and the BFGS algorithm. The latter is an extremely powerful modern nonlinear optimization algorithm, and experts highly recommend using it. There is also a simplified version of the BFGS algorithm that requires less memory, which the system uses when the computer's RAM is limited. These algorithms tend to converge faster and produce a more accurate solution than first-order algorithms such as gradient descent.

The iterative network-training process in STATISTICA Automated Neural Networks (SANN) is accompanied by an automatic display of the current training error and of the error calculated independently on the test set; a graph of the total error is also shown. You can interrupt training at any time by pressing a button. It is also possible to set stopping conditions under which training will be interrupted: for example, reaching a certain error level, or a steady increase in the test error over a given number of passes ("epochs"), which indicates so-called overfitting of the network. If overfitting occurs, the user need not worry: SANN automatically remembers the best network obtained during training, and that network can always be accessed by clicking the corresponding button. After training is complete, you can check the quality of the network's work on a separate test set.
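The stopping rule just described (halt when the held-out error keeps rising, and keep the network from the best epoch) can be sketched as follows; the error sequence and patience value are invented for illustration.

```python
# A sketch of early stopping: halt when the test-set error has not improved
# for `patience` epochs, and report the best epoch seen. Errors invented.

def train_with_early_stopping(test_errors, patience=3):
    """Return (best_epoch, best_error) given per-epoch test-set errors."""
    best_epoch, best_error = 0, test_errors[0]
    for epoch, err in enumerate(test_errors):
        if err < best_error:
            best_epoch, best_error = epoch, err
        elif epoch - best_epoch >= patience:
            break                  # test error rising: likely overfitting
    return best_epoch, best_error

# Test error falls, then climbs as the network starts to overfit.
errors = [0.50, 0.40, 0.32, 0.30, 0.31, 0.34, 0.38, 0.45]
print(train_with_early_stopping(errors))   # (3, 0.3)
```

Tracking the best epoch separately from the current one is what lets the procedure "remember the best network" even though training continues past it.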

After the network is trained, you need to check the quality of its work and determine its characteristics. For this purpose, the STATISTICA Automated Neural Networks (SANN) package provides a set of on-screen statistics and graphical tools.

If several models (networks and ensembles) are specified, then (where possible) STATISTICA Automated Neural Networks (SANN) will display comparative results (for example, plotting the response curves of several models on one graph, or presenting the predictions of several models in one table). This property is very useful for comparing different models trained on the same data set.

All statistics are calculated separately for the training, validation and test sets or in any combination of them, at the discretion of the user.

The following summary statistics are calculated automatically: the root mean square error of the network, the so-called confusion matrix for classification problems (which tallies all cases of correct and incorrect classification), and correlations for regression problems. The Kohonen network has a Topological Map window in which the activations of the network elements can be observed visually.
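The two headline statistics above are easy to compute by hand; the labels and predictions in this sketch are invented for illustration.

```python
# A sketch of the summary statistics listed above: the confusion matrix
# for classification and the root mean square error for regression.

def confusion_matrix(actual, predicted, classes):
    """matrix[i][j] = number of cases of class i predicted as class j."""
    idx = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[idx[a]][idx[p]] += 1
    return matrix

def rmse(actual, predicted):
    """Root mean square error between two numeric sequences."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5

actual    = [1, 1, 2, 2, 2]
predicted = [1, 2, 2, 2, 1]
print(confusion_matrix(actual, predicted, classes=[1, 2]))  # [[1, 1], [1, 2]]
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Reading the matrix: the diagonal holds correct classifications, the off-diagonal cells the errors, which is exactly the tally the Classification and Error Matrix tables summarize.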

Ready-made solutions (custom applications using STATISTICA Automated Neural Networks)

The simple and convenient interface of STATISTICA Automated Neural Networks (SANN) allows you to quickly create neural network applications to solve your problems.

There may be a situation where it is necessary to integrate these solutions into an existing system, for example, to make them part of a wider computing environment (these may be procedures developed separately and built into the corporate computing system).

Trained neural networks can be applied to new data sets in several ways. You can save the trained networks and then apply them to a new data set (for prediction, classification or forecasting). You can use the code generator to automatically produce program code in C (C++, C#) or Visual Basic and then use it to make predictions on new data in any Visual Basic or C++ (C#) programming environment, i.e. embed a fully trained neural network in your own application. Finally, all the functionality of the STATISTICA system, including STATISTICA Automated Neural Networks (SANN), can be used as COM (Component Object Model) objects in other applications (for example, Java, MS Excel, C#, VB.NET, etc.). For example, you can embed an automated analysis created with SANN in MS Excel tables.

List of learning algorithms

    Gradient Descent

    Conjugate gradients

    Kohonen training

    K-Means Method for Radial Basis Function Network

Network size restrictions

A neural network can be of almost any size (that is, its dimensions can be taken many times larger than is actually necessary or reasonable); for a multilayer perceptron network, one hidden layer of neurons is allowed. In practice, the program is limited only by the hardware capabilities of the computer.

e-Manual

The STATISTICA Automated Neural Networks (SANN) system includes a well-illustrated textbook that provides a complete and clear introduction to neural networks, together with examples. Detailed, context-sensitive help is available from any dialog box.

Source code generator

The source code generator is an additional product that allows users to easily create their own applications based on the STATISTICA Automated Neural Networks (SANN) system. This add-on creates the source code of the neural network model (as a file in C, C++, C# or Java), which can be compiled separately and integrated into your program for free distribution. It is designed specifically for developers of enterprise systems, as well as for users who need to turn highly optimized procedures created in SANN into external applications for solving complex analytical problems. (Note that to obtain permission, users must inform StatSoft about the distribution of programs using the generated code.)

Edited by V. P. Borovikov

2nd edition, revised and expanded

2008

Print run: 1000 copies

Format 70x100/16 (170x240 mm)

Binding: paperback

ISBN 978-5-9912-0015-8

BBK 32.973

UDC 004.8.032.26

Annotation

This book presents neural network methods of data analysis based on the STATISTICA Neural Networks package (produced by StatSoft), fully adapted for the Russian user. The basics of the theory of neural networks are given, with much attention paid to solving practical problems; the methodology and technology of conducting research with the STATISTICA Neural Networks package, a powerful data-analysis and forecasting tool with wide applications in business, industry, management and finance, are considered comprehensively. The book contains many examples of data analysis and practical recommendations for analysis, forecasting, classification, pattern recognition and production-process control using neural networks.

For a wide range of readers engaged in research in banking, industry, economics, business, geological exploration, management, transport and other areas.

Preface to the second edition

Introduction. Invitation to neural networks

Chapter 1. BASIC CONCEPTS OF DATA ANALYSIS

Chapter 2. INTRODUCTION TO PROBABILITY THEORY

Chapter 3. INTRODUCTION TO NEURAL NETWORK THEORY

Chapter 4. GENERAL OVERVIEW OF NEURAL NETWORKS
Parallels from biology
Basic artificial model
Application of neural networks
Pre- and post-processing
Multilayer perceptron
Radial basis function
Probabilistic neural network
Generalized regression neural network
Linear network
Kohonen Network
Classification problems
Regression problems
Time series forecasting
Variable selection and dimensionality reduction

Chapter 5. FIRST STEPS IN STATISTICA NEURAL NETWORKS
Let's start work
Creating a Dataset
Creating a new network
Creating a dataset and network
Network training
Running a neural network
Carrying out classification

Chapter 6. FURTHER POSSIBILITIES OF NEURAL NETWORKS
Classic example: Fisher's irises
Cross-validation training
Stop conditions
Solving regression problems
Radial basis functions
Linear models
Kohonen networks
Probabilistic and generalized regression networks
Network designer
Genetic algorithm for input data selection
Time series

Chapter 7. PRACTICAL TIPS FOR SOLVING PROBLEMS
Data presentation
Isolating useful input variables.
Dimensionality Reduction
Selecting a Network Architecture
Custom network architectures
Time series

Chapter 8. CASE STUDIES
Example 1. Dimensionality reduction in geological research
Example 2. Pattern recognition
Example 3. Nonlinear classification of two-dimensional sets
Example 4. Segmentation of various fuel samples according to laboratory research data
Example 5. Building a behavioral scoring model
Example 6. Function approximation
Example 7. Forecasting oil sales
Example 8. Monitoring and prediction of the temperature regime at an installation
Example 9. Determining the validity of a digital signature

Chapter 9. QUICK GUIDE
Data
Networks
Network training
Other network types
Networking
Sending results to the STATISTICA system

Chapter 10. CLASSICAL METHODS ALTERNATIVE TO NEURAL NETWORKS
Classic discriminant analysis in STATISTICA
Classification
Logit regression
Factor analysis in STATISTICA

Chapter 11. DATA MINING IN STATISTICA

Appendix 1. Code generator

Appendix 2. Integration of STATISTICA with ERP systems

Bibliography

Subject index

Annotation: Neural networks and statistics. Neural networks and fuzzy logic. Neural networks and expert systems. Neural networks and statistical physics.

Animals are divided into:

  1. belonging to the Emperor,
  2. embalmed,
  3. tame,
  4. sucking pigs,
  5. sirens,
  6. fabulous,
  7. stray dogs,
  8. included in this classification,
  9. frenzied,
  10. innumerable,
  11. drawn with a very fine camel-hair brush,
  12. others,
  13. having just broken the flower vase,
  14. that from a long way off look like flies.

J. L. Borges, "The Analytical Language of John Wilkins"

Neurocomputing has numerous points of contact with other disciplines and their methods. In particular, the theory of neural networks uses the apparatus of statistical mechanics and optimization theory. The areas of application of neurocomputing sometimes strongly overlap or almost coincide with the areas of application of mathematical statistics, fuzzy set theory and expert systems. The connections and parallels of neurocomputing are extremely diverse and indicate its universality. In this lecture, which can be considered additional, since it requires somewhat more mathematical preparation, we will talk only about the most important of them.

Neural networks and statistics

Since neural networks are now successfully used for data analysis, it is appropriate to compare them with older, well-developed statistical methods. In the statistics literature one can sometimes come across the claim that the most commonly used neural network approaches are nothing more than inefficient regression and discriminant models. We have already noted that multilayer neural networks can indeed solve regression and classification problems. However, firstly, data processing by neural networks is much more diverse: recall, for example, active classification by Hopfield networks or Kohonen feature maps, which have no statistical analogues. Secondly, many studies on the use of neural networks in finance and business have revealed their advantages over previously developed statistical methods. Let us take a closer look at the results of comparing the methods of neural networks and mathematical statistics.

Are neural networks a description language?

As noted, some statisticians argue that neural network approaches to data processing are simply rediscovered and reformulated, but well-known, statistical methods of analysis. In other words, neurocomputing simply uses a new language to describe old knowledge. As an example, here is a quotation from Warren Sarle:

Many neural network researchers are engineers, physicists, neuroscientists, psychologists, or computer scientists who know little about statistics and nonlinear optimization. Neural network researchers are constantly rediscovering methods that have been known in the mathematical and statistical literature for decades and centuries, but often find themselves unable to understand how these methods work.

This point of view, at first glance, may seem reasonable. The formalism of neural networks can truly claim to be a universal language. It is no coincidence that already in the pioneering work of McCulloch and Pitts it was shown that a neural network description is equivalent to a description of propositional logic.

I actually found that with the technique I developed in the 1961 paper (...), I could easily answer all the questions that brain scientists (...) or computer scientists asked me. As a physicist, however, I knew well that a theory that explains everything actually explains nothing: at best it is a language. (Eduardo Caianiello)

It is not surprising, therefore, that statisticians often discover that concepts familiar to them have analogues in neural network theory. Warren Sarle has compiled a small glossary of terms used in these two areas.

Table 11.1. Glossary of similar terms

Neural networks | Statistical methods
features | variables
inputs | independent variables
outputs | predicted values
target values | dependent variables
error | residual
training, adaptation, self-organization | estimation
error function, Lyapunov function | estimation criterion
training patterns (pairs) | observations
network parameters: weights, thresholds | estimated parameters
high-order neurons | interaction
functional links | transformation
supervised learning, heteroassociation | regression and discriminant analysis
unsupervised learning, autoassociation | data compression
competitive learning, adaptive vector quantization | cluster analysis
generalization | interpolation and extrapolation
What is the difference between neural networks and statistics?

What are the similarities and differences between the languages of neurocomputing and statistics in data analysis? Let us look at a simple example.

Let us assume that we have made observations and experimentally measured N pairs of points (x_i, y_i) representing a functional dependence. If we try to draw the best straight line through these points, which in the language of statistics means using the linear model y = ax + b + e (where e denotes the observation noise) to describe the unknown dependence, then solving the corresponding linear regression problem reduces to finding estimates of the parameters a and b that minimize the sum of squared residuals E = Σ_i (y_i − a x_i − b)².

Once the parameters a and b are found, the value of y can be estimated for any value of x, that is, the data can be interpolated and extrapolated.

The same problem can be solved using a single-layer network with a single input and a single linear output neuron. The connection weight a and the threshold b can be obtained by minimizing the same sum of residuals (which in this case is called the root mean square error) during network training, for example by the backpropagation method. The generalization property of the neural network is then used to predict the output value from the input value.
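The equivalence described above can be checked numerically. The sketch below, using invented noise-free data, fits y = ax + b both by the closed-form least-squares formulas (the statistician's route) and by gradient descent on the same squared error, as a single linear neuron would be trained; both routes recover the same parameters.

```python
# Fitting y = a*x + b two ways: closed-form least squares, and gradient
# descent on the same squared error (a single linear neuron). Data invented.

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # exactly y = 2x + 1, no noise
n = len(xs)

# Closed-form least squares.
mx, my = sum(xs) / n, sum(ys) / n
a_ls = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        / sum((x - mx) ** 2 for x in xs))
b_ls = my - a_ls * mx

# Gradient descent on the mean squared residual (the neuron's route).
a, b = 0.0, 0.0
for _ in range(5000):
    grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
    a -= 0.05 * grad_a
    b -= 0.05 * grad_b

print(round(a_ls, 3), round(b_ls, 3))   # 2.0 1.0
print(round(a, 3), round(b, 3))         # 2.0 1.0
```

Both routes minimize the same quantity, so they agree; as the text goes on to note, the difference between the two approaches lies in the language and the role assigned to the minimization procedure, not in the fitted model.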


Fig. 11.1.

When comparing these two approaches, what immediately catches the eye is that statistics describes its methods with formulas and equations, while neurocomputing relies on graphical descriptions of neural architectures.

If we recall that the left hemisphere operates with formulas and equations, and the right hemisphere with graphic images, then the "right-hemisphere" nature of the neural network approach, in comparison with statistics, again becomes apparent.

Another significant difference is that for statistical methods it does not matter how the discrepancy is minimized: the model remains the same in any case, whereas in neurocomputing it is the training method that plays the main role. In other words, unlike the neural network approach, the estimation of model parameters in statistics does not depend on the minimization method. At the same time, statisticians regard a change in the form of the residual, say from the squared error sum_i (y_i - a*x_i - b)^2 to the absolute error sum_i |y_i - a*x_i - b|, as a fundamental change of the model.

Unlike the neural network approach, where most of the time is spent training networks, in the statistical approach that time goes into a thorough analysis of the problem: the statistician's expertise is used to select a model based on the data and on information specific to the field. Neural networks, these universal approximators, are usually applied without a priori knowledge, although in some cases such knowledge is very useful. For example, for the linear model under consideration, using the mean squared error yields an optimal estimate of the parameters when the noise has a normal distribution with the same variance for all training pairs. If, however, the variances s_i^2 are known to differ, then using the weighted error function E = sum_i (y_i - a*x_i - b)^2 / s_i^2 can give significantly better parameter values.
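The weighted error function can be illustrated as follows. The helper and the per-observation variances are made up; the point is that weighting by 1/s_i^2 makes an unreliable observation barely influence the fit:

```python
# Weighted least squares: if observation i has noise variance s_i^2,
# minimizing sum_i (y_i - a*x_i - b)^2 / s_i^2 down-weights unreliable
# points. Helper and per-point variances are made up for illustration.

def fit_line_weighted(xs, ys, variances):
    ws = [1.0 / v for v in variances]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    a = (sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - mx) ** 2 for w, x in zip(ws, xs)))
    b = my - a * mx
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 1.0, 2.1, 9.0]                      # last point is very noisy
a, b = fit_line_weighted(xs, ys, [0.1, 0.1, 0.1, 25.0])
print(round(a, 2), round(b, 2))                # close to the clean y = x trend
```

With equal variances this reduces to the ordinary least-squares fit, which is why the unweighted estimate is optimal only under equal noise variance.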

Beyond this simplest model, one can give examples of other, in a certain sense equivalent, pairs of statistical models and neural network paradigms.

The Hopfield network has an obvious connection with data clustering and factor analysis.

Factor analysis is used to study the structure of data. Its main premise is the assumption that there exist features, called factors, that cannot be observed directly but can be estimated from several observable primary features. For example, characteristics such as production volume and the cost of fixed assets may determine a factor such as the scale of production. Unlike neural networks, which require training, factor analysis can work with a relatively small number of observations. Although in principle the number of observations need only exceed the number of variables by one, it is recommended to use at least three times as many; even this is usually less than the training-sample size needed for a neural network. Statisticians therefore point to an advantage of factor analysis: it needs less data and hence produces a model faster, which also means its methods require less powerful computing resources. Another advantage is that factor analysis is a white-box method, i.e. completely open and understandable: the user can easily see why the model produces a particular result. The connection between factor analysis and the Hopfield model can be seen by recalling the minimal basis vectors for a set of observations (the memory images; see Lecture 5). It is these vectors that are the analogues of factors, uniting the various components of the memory vectors, the primary features.
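The production-scale example above can be sketched numerically. As a simplification, the first principal component of the correlation matrix stands in for the common factor (real factor analysis also models feature-specific variances); all data here are synthetic:

```python
# A toy illustration of the factor idea: two observed features driven
# by one hidden factor ("production scale") are summarized by a single
# component. As a simplification, the first principal component of the
# correlation matrix stands in for the factor; real factor analysis
# also models feature-specific variances. Data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
scale = rng.normal(size=200)                         # hidden factor
volume = 2.0 * scale + 0.1 * rng.normal(size=200)    # observed feature 1
assets = 1.5 * scale + 0.1 * rng.normal(size=200)    # observed feature 2

X = np.column_stack([volume, assets])
X = (X - X.mean(axis=0)) / X.std(axis=0)             # standardize
corr = X.T @ X / len(X)
eigvals = np.linalg.eigh(corr)[0]                    # ascending order
share = eigvals[-1] / eigvals.sum()
print(f"variance explained by one factor: {share:.2f}")
```

Because both observed features are driven by the same hidden variable, one factor captures nearly all of their common variance.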


Neural Networks. STATISTICA Neural Networks: Methodology and Technologies of Modern Data Analysis / edited by V. P. Borovikov. 2nd edition, revised and expanded. Moscow: Hot Line Telecom.

Neural network methods of data analysis are described, based on the STATISTICA Neural Networks package (produced by StatSoft), fully adapted for the Russian user. The basics of neural network theory are given; much attention is paid to solving practical problems, and the methodology and technology of research with STATISTICA Neural Networks, a powerful data analysis and forecasting tool with wide applications in business, industry, management, and finance, are comprehensively considered. The book contains many examples of data analysis and practical recommendations for analysis, forecasting, classification, pattern recognition, and production-process control using neural networks. For a wide range of readers engaged in research in banking, industry, economics, business, geological exploration, management, transport, and other areas.

Corrector: V. N. Mikhin. Layout: E. V. Kormakov. Cover by artist V. G. Sitnikova. Published by LLC Scientific and Technical Publishing House "Hot Line Telecom"; printed by "Til-2004". STATISTICA Neural Networks (SNN), 2008; V. P. Borovikov, 2008; design of the publishing house "Hot Line Telecom", 2008.

Preface to the Second Edition

The second edition of this well-known book has been significantly expanded and revised. New chapters have been written introducing data analysis, probability theory, and neural network theory; the material in these chapters allows the reader to gain an in-depth understanding of the methodology of using neural networks.

Neural networks are now used intensively in banking, industry, marketing, economics, medicine, and other areas where forecasting and in-depth understanding of data are required. It is generally accepted that neural networks are a natural complement to classical methods of analysis and are used where standard procedures do not give the desired effect.

STATISTICA Neural Networks is the only software product in the world for conducting neural network research that is fully translated into Russian: the entire interface (dozens of dialog boxes and research scripts) and the STATISTICA Neural Networks help system are available to the user in Russian in a single environment. We have included an additional chapter on classical methods of analysis, which allows the reader to compare the different approaches, and a separate chapter of the book is devoted to data mining methods and modern data analysis technologies that combine classical and neural network models.

Employees of StatSoft Russia took part in the work on the book: B.C. Pactunkov, A.K. Petrov, V.A. Panov. We express our sincere gratitude to all of them. Our special gratitude goes to Lyudmila Ekatova for the complex and painstaking work of preparing the manuscript for publication.

V. P. Borovikov, Scientific Director of StatSoft Russia

Introduction: An Invitation to Neural Networks

Over the past few years interest in neural networks has grown significantly: they are used in finance, business, medicine, industry, engineering, geological exploration, and other fields. Neural networks are applied wherever forecasting, classification, or control problems need to be solved, since they work in almost any situation where there is a relationship between predictor (input) variables and predicted (output) variables, even when that relationship is complex and difficult to express in the usual terms of correlations or differences between groups.

Neural network methods can be used on their own or as an excellent complement to traditional data analysis methods. Most statistical methods rely on models built under certain assumptions and theoretical conclusions (for example, that the sought relationship is linear or that the variables have a normal distribution). The neural network approach is free of such model restrictions: it suits linear and complex nonlinear dependencies equally well, and it is especially effective in exploratory data analysis, when the goal is to find out whether any dependencies between variables exist at all. The power of neural networks lies in their ability to learn; the training procedure consists of adjusting the synaptic weights to minimize a loss function.

This book uses the STATISTICA Neural Networks package to build neural networks. It has a convenient interface and allows research to be conducted interactively. All dialog boxes and tips, including the electronic help system, are fully translated into Russian; STATISTICA Neural Networks is the only software product in the world for neural network research that is fully translated into Russian. A significant advantage of STATISTICA Neural Networks is that it is naturally integrated into STATISTICA's powerful arsenal of analytical tools; it is the combination of classical and neural network methods that gives the desired effect.

This book consists of eleven chapters. The first chapter describes the basic concepts of data analysis, and the second gives an introduction to probability theory. The third chapter contains a theoretical introduction to neural networks; note that probability theory is the basis of neural networks, and this chapter is necessary for an in-depth understanding of their methods and principles. In it we describe the famous Bayes formula and the optimal Bayes classification rule. The fourth chapter gives a general overview of the neural networks implemented in STATISTICA Neural Networks, introduces the reader to the program's interface and options, and helps to understand the main areas of analysis. In chapter five the reader takes his first steps in STATISTICA Neural Networks. The sixth chapter describes further capabilities of neural networks: networks based on radial basis functions are considered in detail, and multilayer perceptrons, self-organizing maps, and probabilistic and generalized probabilistic models are described. It also shows how to build a network with the Solution Wizard, a convenient tool for conducting neural network analysis for novice users, and gives an idea of genetic algorithms for dimensionality reduction. The seventh chapter presents practical advice on solving problems with neural networks. The eighth chapter contains solutions of specific problems (case studies); it is especially interesting to a wide range of readers, as it shows neural network technology in action, with examples spanning applications from geology and industry to finance and covering classification, pattern recognition, forecasting, and production-process management. In the ninth chapter the reader will find a brief guide to using the STATISTICA Neural Networks package. The tenth chapter is devoted to statistical methods that are alternatives to neural networks: discriminant analysis, factor analysis, and logistic regression are described. Obviously, the user should be able to compare methods and choose the most adequate one. In the eleventh chapter we briefly describe modern data mining technologies that combine neural network methods with classical analysis methods.

Here are typical examples of the use of neural networks.
In industry, the task of controlling production processes (a production installation) is relevant. For example, in the gas industry one can set up a neural network to change parameters automatically and so control the quality of the output product. Similar problems arise in oil refining: the quality of gasoline can be controlled from its spectral characteristics by measuring the spectrum and assigning the product to a certain class; since the dependencies are nonlinear, neural networks are a suitable classification tool. In the financial sector, consumer lending is a pressing issue. In recent years consumer lending has developed intensively and has become one of the fastest-growing sectors of the banking business. The number of financial institutions providing goods and services on credit is growing

day after day. The risk of these institutions depends on how well they can distinguish "good" loan applicants from "bad" ones: by analyzing a borrower's credit history one can predict his behavior and decide whether to issue or refuse a loan. Other interesting tasks include discriminating electronic signatures, voice recognition, and various problems in geological exploration; neural networks can be used to solve all of them.

Next we present the chain of dialog boxes in the STATISTICA Neural Networks package and show how the dialogue with the user is organized. Note the user-friendly interface and the Solution Wizard and Network Designer tools, which let users design their own networks and select the best ones. So, first of all, let us launch the STATISTICA Neural Networks package.

Step 1. We start with the start panel (Fig. 1).

Fig. 1. Start panel of neural networks

In this panel you can select the kind of analysis to be performed: regression, classification, time series forecasting, or cluster analysis. Select, for example, time series if you want to build a forecast. Next, select the solution tool in the Tool section: novice users are advised to choose the Solution Wizard, while advanced users can use the Network Designer. We

select the Solution Wizard.

Step 2. Click the Data button to open the data file (if the file is already open, there is no need to click this button). Clicking the Advanced button opens a window with additional tools, in particular dimensionality-reduction procedures, a code generator, etc. (Fig. 2).

Fig. 2. Launch pad of STATISTICA Neural Networks

Step 3. From the open file, select the variables for analysis. Variables can be continuous or categorical; in addition, observations may belong to different samples (Fig. 3).

Fig. 3. Window for selecting variables

Step 4. Set the duration of the analysis by indicating the number of networks to be tested or the solution time (Fig. 4).

Fig. 4. Solution Wizard, Quick tab

Step 5. Select the types of networks offered by the program with which we will work: linear network, probabilistic network, network based on radial basis functions, multilayer perceptron. You can select any type of network or any combination (Fig. 5).

Fig. 5. Solution Wizard, Network Type tab

Step 6. Set the format for presenting the final results (Fig. 6).

Fig. 6. Solution Wizard, Feedback tab

Step 7. Start the neural network training procedure by clicking OK (Fig. 7).

Fig. 7. Display of the learning process

Step 8. In the results window you can analyze the solutions obtained; the program selects the best networks and shows the quality of each solution (Fig. 8).

Fig. 8. Results window, Quick tab

Step 9. Select a specific network (Fig. 9).

Fig. 9. Model selection dialog box

Step 10. One way to check a network is to compare the observed values with the predicted results. Such a comparison for the selected network is presented in Fig. 10.

Fig. 10. Table of observed and predicted values

Step 11. Save the best networks for further use, for example for automatic forecasting (Fig. 11 and 12).

Fig. 11. Launch pad for selecting and saving networks/ensembles

Fig. 12. Standard window for saving a network file

This is a typical research scenario in the STATISTICA Neural Networks package. A more systematic presentation is contained in the remaining chapters of the book.

Chapter 9. QUICK GUIDE

In this chapter you will find a quick guide to working with STATISTICA Neural Networks. The package implements all types of neural networks currently used to solve practical problems, as well as the most modern algorithms for fast learning, automatic construction, and selection of significant predictors.

DATA

Introduction

Let us recall once again that neural networks learn from examples and build a model from training data. The training data consist of a certain number of observations (samples), for each of which the values of several variables are given. Most of these variables are specified as inputs, and the network learns to find a correspondence between the values of the input and output variables (most often there is one output variable), using the information contained in the training data. Once the network is trained, it can be used to predict unknown output values from given input values.

Thus, the first stage of working with a neural network is forming the data set. You can create a data table in STATISTICA (Neural Networks) with the Create command on the File menu (or the corresponding toolbar button), specifying the number of variables and observations. The new data file will initially contain only empty cells, and the values of all variables in it will be marked as missing (Fig. 9.1).

Fig. 9.1

The selection of input/output variables, as well as the sets into which the data are divided, is made inside the Neural Networks module (after the data table has been prepared). Usually, however, the data are not typed in by hand: a data file is imported from some other package with the Open command (you will need to specify the data format) or with the External Data command on the File menu, which allows complex queries against various databases (Fig. 9.2 and 9.3).

Fig. 9.2

Fig. 9.3

In the Neural Networks module it is possible to read data files of the STATISTICA system directly; nominal variables (that is, variables taking one of several specified text values, for example Gender = {Male, Female}) are recognized automatically, and data types such as dates and times are converted to a numeric representation (input

only numerical data can be supplied to the neural network). If the data were produced in some other program (for example, a spreadsheet), they must first be imported into the STATISTICA system. Besides the import function, STATISTICA offers other ways of accessing external sources of information: the Windows clipboard (STATISTICA understands the clipboard data formats used by applications such as Excel and Lotus) and access to various databases with the STATISTICA Query tool. Tab- and comma-delimited text files can be imported directly into STATISTICA; if desired, the first line of the file can hold variable names and the first column observation names (Fig. 9.4).

Fig. 9.4

Once a data file is opened or newly created, its contents can be edited as an ordinary table in the STATISTICA environment. STATISTICA implements the basic data operations typical of spreadsheet processors, including editing, selecting a block of cells,

copying to the clipboard, and so on. In addition, there are special operations for specifying the type and names of variables and observations and for adding, deleting, moving, and copying them.

Types of variables and observations

In STATISTICA Neural Networks all observations in a data file are divided into four groups (sets): training, control, test, and ignored. The training set is used to train the neural network, the control set to independently monitor the progress of training, and the test set for a final evaluation after a series of experiments. The ignored set is not used at all (it may be needed if some of the data are corrupted or unreliable, or if there are simply too many of them). Similarly, all variables are divided into input, output, input/output (for example, in time series analysis), and ignored; the latter are usually "candidate input variables" whose usefulness for the forecast is not clear in advance, so some of them are switched off in the course of experimentation.

The types of variables and observations are specified in the Neural Networks module. The numbers of input and output variables, and of training, control, and test observations, are displayed in the corresponding fields at the top of the STATISTICA Neural Networks start window. The proportions between the types can be changed by editing these fields; this does not add or remove cases or variables, it only changes the type of existing ones. A similar operation is used to form an unbiased control set: first indicate the size of this set (usually half of the entire data set is allocated to it and the other half to the training set; if a test set is also needed, the file must be divided into three parts), and then, with the Random selection option, all available cases are randomly assigned to the different types.
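The random assignment of cases to sets can be sketched as follows (a hypothetical helper, not part of the package; SNN does this internally via the Random selection option, and the 50/25/25 counts are just an example):

```python
# Random assignment of cases to training / control / test sets,
# mirroring the "Random selection" option described in the text
# (hypothetical helper; counts 50/25/25 are just an example).
import random

def split_cases(n_cases, n_train, n_control, n_test, seed=42):
    assert n_train + n_control + n_test == n_cases
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)   # seeded for reproducibility
    return (idx[:n_train],
            idx[n_train:n_train + n_control],
            idx[n_train + n_control:])

train, control, test = split_cases(100, 50, 25, 25)
print(len(train), len(control), len(test))   # prints: 50 25 25
```

Shuffling before slicing is what makes the control set unbiased: each observation is equally likely to land in any of the three sets.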
When you first read a data file in STATISTICA Neural Networks you need to determine which variables will be inputs and which outputs; similarly, for observations you must set the sample assignments for training, verification, and testing. Settings related to variables are made in the start window of the Neural Networks module, and settings related to observations with the Selections tool in the analysis-parameters dialog box (reached from the start window). Note, however, that if a sample identification variable is used, it must be specified in the start window together with the input/output variables.

Naming variables and observations

It is possible to assign names to individual variables and/or

observations. This is done with the Variable Specifications, All Variable Specifications, and Observation Name Manager commands of the Data menu; you can also simply double-click a name in a row or column header field and enter the name directly in the table. STATISTICA Neural Networks does not require names for observations or variables; if a name is not specified, the table displays a default conditional name.

Defining a variable (nominal values)

STATISTICA Neural Networks has special capabilities for working with categorical (nominal) variables. Nominal variables have special value-transformation methods, and the type of the output variables distinguishes classification problems (which use nominal variables) from regression problems (which use numeric variables). A variable can be either numeric or nominal, but not both. To define a nominal variable in STATISTICA Neural Networks, select the variable as categorical (either on the Quick tab via the Variables button, or on the Advanced tab via the Variable Type button). When you import tab- or comma-delimited files containing nominal values (represented as text), STATISTICA Neural Networks automatically recognizes them and determines the correct nominal values.

Adding and deleting cases and variables

You can add, delete, copy, and move cases and variables via the Data menu or directly in the table. The commands of the Data Variables and Data Observations menus help achieve greater efficiency, while the tools for working directly with the table are more convenient to use. You can add new observations in two ways: 1) select an observation, left-click its header, and choose Add Observations (or use the Data Observations Add menu); 2) paste observations from the clipboard.
To do this, left-click the name of the observation or variable at which the data should be inserted. To delete an observation or a group of observations, select them in the usual way through the row headers and press Ctrl+X; this actually places the observations on the clipboard, so if you move the cursor

to another location and press Ctrl+V, the observations will be placed at the cursor position; with Ctrl+C and Ctrl+V observations can be copied and pasted. Variables are moved and copied in the same way.

Missing data

The STATISTICA Neural Networks module has special tools for handling missing data, similar to those used in other STATISTICA modules. Although STATISTICA Neural Networks can work with missing data, substituting reasonable estimates in their place, it is nevertheless best to avoid missing values, both when training the network and during its operation, whenever possible. Sometimes, however, the number of available training observations is too small and we are forced to use every observation we have. STATISTICA Neural Networks can automatically mark variables or cases containing missing data as ignored (so that they are not used in the analysis); what exactly is declared ignored is determined by the user's choice. If a variable has too many missing values, it may be worth excluding it from consideration; if it is missing only a few values, it makes more sense to declare the corresponding observations ignored. A reasonable sequence of actions: first declare the variable ignored and see how many values are actually missing; if there are few such rows, make the variable an input again and declare those observations ignored. In an imported tab- or comma-delimited file, missing data may be indicated by a space.

NETWORKS

Introduction

Once a data set has been created or imported, you can begin building and training neural networks.
A network in the STATISTICA Neural Networks package can contain layers for pre- and post-processing, in which the source data are converted to a form suitable for input to the network and the output data to a form convenient for interpretation: nominal values are converted to numerical form, numerical values are scaled to a suitable range, missing values are substituted, and in time series problems blocks of sequential observations are formed. The pre- and post-processing data include the set of input and output variables, each with its name and type, as in the original data set.
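The pre-processing conversions described here can be sketched as follows. These are illustrative helpers with made-up names, not the package's internal routines:

```python
# Sketch of typical pre-processing: numeric values are rescaled to
# [0, 1] by their min/max, and a nominal value such as
# Gender = {Male, Female} is encoded numerically (1-of-N style).
# Illustrative helpers only; SNN performs such conversions internally.

def minmax_scale(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def one_of_n(value, categories):
    return [1.0 if value == c else 0.0 for c in categories]

print(minmax_scale([10.0, 15.0, 20.0]))        # [0.0, 0.5, 1.0]
print(one_of_n("Female", ["Male", "Female"]))  # [0.0, 1.0]
```

Post-processing simply inverts these maps, turning the network's numeric outputs back into the original units or class labels.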

A note about input and output variables

The set of input and output variables in the STATISTICA Neural Networks package exists separately from the data file. To simplify network construction, STATISTICA Neural Networks automatically copies the variable names and definitions from the data set into the network being created, and then separates the network and the data from each other. In this way the network can be used to analyze new data without going back to the original file: since the network remembers the names and types of its variables, it knows what to do.

Building a network

To create a new network, use the Solution Wizard or Network Designer tool. The sequence of Solution Wizard and Network Designer dialog boxes provides tools for setting and editing the parameters of pre- and post-processing variables. First, of course, the variables must be defined and an appropriate transformation method selected for them, along with the network architecture. To go to the analysis-parameters dialog, click OK in the start window. Depending on the tool used, the network-type option is on the Network Type tab (Solution Wizard) or on the Quick tab (Network Designer). If you are solving a time series modeling problem, the Time Series tab is available with either tool: in the Solution Wizard it sets the bounds for the forecast window (the number of observations used to forecast one observation ahead), while in the Network Designer it provides options for the exact value of the forecast window and the number of steps ahead. In tasks not related to time series these options are unavailable.
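The forecast window can be viewed as a data transformation. The following hypothetical helper (not part of SNN) turns a series into training pairs, assuming a window of k previous values and a forecast one step ahead:

```python
# The forecast window as a data transformation: with a time window of
# k previous values and a forecast one step ahead, each training pair
# is (series[i-k:i] -> series[i]). Hypothetical helper, not SNN code.

def make_windows(series, window, steps_ahead=1):
    pairs = []
    for i in range(window, len(series) - steps_ahead + 1):
        pairs.append((series[i - window:i], series[i + steps_ahead - 1]))
    return pairs

series = [1, 2, 3, 4, 5, 6]
for inputs, target in make_windows(series, window=3):
    print(inputs, "->", target)
# [1, 2, 3] -> 4
# [2, 3, 4] -> 5
# [3, 4, 5] -> 6
```

This also shows why the same variable is selected as both input and output in time series problems: the windowed past values serve as inputs and the next value as the target.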
In time series problems the number of steps ahead is 1 or more (most often 1, corresponding to a forecast one step ahead), and the time window is the number of previous values of the series from which its next value is predicted. In addition, before running the corresponding tool in a time series problem, the variable containing the series values should be selected as both input and output, since the next values of the variable are predicted from its previous values. If a multilayer perceptron is being built, the number of layers in the network can be changed; for other network types this parameter is fixed (with one exception: a probabilistic network can consist of three or four layers, depending on whether it includes a loss matrix). The Edit option (available for the Network Designer in the network-parameters dialog) provides information about the pre- and post-processing variables, including their names and definitions, as well as the function

transformations used to prepare data for input to the neural network. You can change how missing values are imputed and the transformation settings, although the defaults are usually quite suitable. The same dialog shows the current parameters of the network architecture: the number of elements in each layer and (if you scroll the table to the right) the widths of the layers. The numbers of input and output elements are strictly tied to the numbers of input and output pre- and post-processing variables, the transformation function, and (in time series problems) the size of the time window; STATISTICA Neural Networks determines these parameters itself and displays them in gray, indicating that they cannot be edited. The number of intermediate layers can be changed at your discretion, but the program usually offers heuristically determined, reasonable defaults. The layer width has no functional meaning except for the output layer of a Kohonen network and is, as a rule, ignored.

To create a network, once a training data set has been loaded, it is usually enough to: 1) set the variable types (Input or Output) in the start window; 2) select the network type and, if appropriate, time series mode; 3) set the Time window and Forecast parameters (time series problems only); 4) set the number of layers (multilayer perceptrons only); 5) set the number of hidden elements (if using the Network Designer); 6) set the number of elements and the width of the output layer (Kohonen networks only); 7) click OK.

Editing networks

Once a network has been built, its design can be modified with the Model Editor tool. You can change all the parameters used in its construction, as well as a number of additional characteristics.
The tool also allows you to change the names and definitions of input and output variables, their transformation functions and parameters, and the methods for replacing missing values; there are also options for adding new variables, deleting existing ones, and changing the time series parameters (Time Window and Forecast Ahead). These capabilities are rarely used. In addition, the pre- and post-processing editor makes it possible to change classification parameters that are not specified when the network is built but may need adjusting during operation. The classification parameters are used only in classification problems, i.e. when at least one output variable is nominal. When the network runs, STATISTICA Neural Networks makes the classification decision from the values of these output variables. So, if there is a nominal output variable with three possible

values and 1-of-N encoding is applied, the program must decide whether, for example, the output vector (0.03; 0.98; 0.02) should be treated as belonging to the second class (Fig. 9.5).

Fig. 9.5

This question is resolved by setting acceptance and rejection thresholds. With 1-of-N encoding, a classification decision is made if one of the N output values exceeds the acceptance threshold and the rest fall below the rejection threshold; if this condition is not met, the result is considered undefined (and returned as a missing value). With the program's default thresholds of 0.95 for acceptance and 0.05 for rejection, the example above is indeed classified as the second class. Less stringent thresholds yield fewer undefined results but may lead to a higher error rate. How the Accept and Reject values are interpreted depends on the network type: for some networks (for example, Kohonen networks), large output values correspond to large errors, and the classification decision is made if the output value is below the acceptance threshold (Fig. 9.6 and 9.7).
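The acceptance/rejection rule for 1-of-N encoding can be sketched in code. This is a minimal illustration; the 0.95 and 0.05 thresholds are the program defaults quoted in the text:

```python
# The 1-of-N decision rule: accept a class only if its output exceeds
# the acceptance threshold and every other output falls below the
# rejection threshold; otherwise the result is undefined.
# Thresholds 0.95 / 0.05 are the defaults quoted in the text.

def classify(outputs, accept=0.95, reject=0.05):
    winners = [i for i, v in enumerate(outputs) if v > accept]
    if len(winners) == 1 and all(v < reject
                                 for i, v in enumerate(outputs)
                                 if i != winners[0]):
        return winners[0]            # index of the accepted class
    return None                      # undefined: returned as missing

print(classify([0.03, 0.98, 0.02]))  # 1  (the second class, as in the text)
print(classify([0.60, 0.55, 0.02]))  # None (no confident winner)
```

Lowering the acceptance threshold or raising the rejection threshold makes the rule return None less often, at the cost of more misclassifications.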

Fig. 9.6

Fig. 9.7

The network editor allows you to change a number of other network parameters. For example, you can change the type of error function used to train the network and to evaluate the quality of its work. You can also select individual network layers and change their activation functions and post-synaptic potential (PSP) functions. It is also possible to add or remove network elements. Typically this can only be done in the intermediate layers, since the input and output elements are bound to the pre- and post-processing variables (adding or removing variables adds or removes the corresponding elements). The exception is Kohonen networks, where output elements can be added and removed. To add or remove hidden elements, go to the Layers tab and edit the hidden layer. You can also cut, copy and paste columns of the weight table, which is edited on the Weights tab. All this allows you to experiment with different network architectures without rebuilding the network each time. A layer can also be removed from the network entirely. This is needed only in rare cases, for example to split off the preprocessing half of an autoassociative network when reducing dimensionality. The weights table shows all weights and thresholds either for a selected layer or for the entire network. If desired, the weights and thresholds can be edited directly, but this is rarely done (weight values are set by the training algorithms). This data is displayed primarily so that the weight values can be exported to another program for further analysis.

TRAINING NETWORKS

After the network is built, it needs to be trained on the available data. The STATISTICA Neural Networks package has special training algorithms for networks of each type, grouped by type in the Training menu (these options are available only when using the Network Designer).
Multilayer perceptron

To train multilayer perceptrons, the STATISTICA Neural Networks package implements five different training algorithms: the well-known backpropagation algorithm, the fast second-order conjugate gradient descent and Levenberg-Marquardt methods, and the quick propagation and delta-bar-delta methods (variations of backpropagation that are faster in some cases). All these methods are iterative, and the ways they are applied are largely similar. In most situations you should choose the conjugate gradient method, since training proceeds much faster (sometimes by

an order of magnitude) than with backpropagation. Backpropagation should be preferred only when a very complex problem requires a satisfactory solution quickly, or when there is a great deal of data (on the order of tens of thousands of observations) or even a known excess of it. The Levenberg-Marquardt method can be much more effective than the conjugate gradient method for some problems, but it can only be used in networks with a single output, a mean-square error function, and a fairly small number of weights, so in practice its scope is limited to small-scale regression problems.

Iterative learning

An iterative training algorithm runs through a series of so-called epochs; at each epoch the entire training set is fed to the network, errors are computed, and the network weights are adjusted based on them. Algorithms of this class are subject to the undesirable phenomenon of overfitting, in which the network learns to reproduce the output values of the training set well but is unable to generalize the pattern to new data. Therefore the quality of the network should be checked at each epoch on a separate verification set (cross-validation). You can monitor the progress of training in the Training Error Graph window, where the graph shows the mean square error on the training set at each epoch. If cross-validation is enabled, the mean square error on the verification set is also displayed. The controls below the graph let you change the scale, and if the entire graph does not fit in the window, scroll bars appear beneath it (Fig. 9.8).

Fig. 9.8

If you want to compare the results of different training runs, click the Advanced button in the training window and then press the Train button again (pressing Train again without reinitializing simply continues training the network from the point where it was interrupted). Upon completion of training, the graph can be sent to the STATISTICA system using the buttons above the legend field. Importantly, the onset of overfitting is easy to see on the graph. At first both the training error and the verification error decrease. Once overfitting begins, the training error continues to fall while the verification error starts to rise. A growing verification error signals the onset of overfitting and indicates that further training is becoming destructive (and, at the same time, that a smaller network might be more suitable). If overfitting is observed, training can be interrupted by pressing the Stop button in the training window or the Esc key. You can also have STATISTICA Neural Networks stop automatically by setting stopping conditions. Stopping conditions are set in the window of the same name, reached through the Training menu (End of analysis). In addition to the maximum number of epochs allotted for training (set on the Fast tab), here you can require that training stop when a certain error level is reached or when the error stops decreasing by a given amount. The target value and the minimum improvement can be set separately for the training error and the verification error. The surest way to combat overfitting is to set the minimum improvement to zero (i.e., not to allow the slightest deterioration). However, since training is noisy, it is usually unwise to stop just because the error worsened in a single epoch.
For this reason the system provides a special Window parameter, which specifies the number of consecutive epochs over which deterioration must be observed before training is stopped. In most cases a value of 5 works well for this setting.

Keeping the best network

Whether or not you use early stopping, you may end up with a network that has already degraded as a result of overfitting. In that case you can restore the best network configuration obtained during training with the Best Network command (Training menu, Advanced) (Fig. 9.9).
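The stopping rule and best-network bookkeeping described above can be sketched as follows; `train_epoch` and `validate` are hypothetical callbacks standing in for one training pass and the verification-set error, and are not part of SNN:

```python
import copy

def early_stopping_loop(train_epoch, validate, net, max_epochs=100, window=5):
    """Sketch of the stopping rule described above: training halts after
    `window` consecutive epochs without improvement on the verification
    set, and the best network seen so far is kept (the Best Network
    behaviour)."""
    best_err, best_net, bad_epochs = float("inf"), copy.deepcopy(net), 0
    for epoch in range(max_epochs):
        train_epoch(net)            # one pass over the training set
        err = validate(net)         # error on the verification set
        if err < best_err:
            # improvement: remember this configuration, reset the counter
            best_err, best_net, bad_epochs = err, copy.deepcopy(net), 0
        else:
            bad_epochs += 1         # no improvement this epoch
            if bad_epochs >= window:
                break               # deterioration persisted long enough
    return best_net, best_err
```

Note that the loop returns the best configuration, not the last one, so a network that has already started to overfit is never what you end up with.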

Fig. 9.9

If the Best Network function is enabled, STATISTICA Neural Networks automatically saves the best network obtained during training (as measured by the verification error). All training runs are taken into account, so STATISTICA Neural Networks automatically keeps the best result of all your experiments. You can also set a Unit Penalty to penalize networks with many units during the comparison (the best network is usually a trade-off between verification performance and network size).

Backpropagation

Before using the backpropagation algorithm you need to set a number of control parameters. The most important are the learning rate, the momentum, and the shuffling of observations during training (note that an advantage of the conjugate gradient method is not only its speed but also its small number of control parameters) (Fig. 9.10).

Fig. 9.10

The learning rate sets the step size when the weights are changed: if the rate is too low, the algorithm converges slowly; if it is too high, the algorithm is unstable and prone to oscillation. Unfortunately, the best rate depends on the specific problem; for quick, rough training, values from 0.1 to 0.6 are suitable, while much smaller values are needed for accurate convergence (e.g., 0.01 or even 0.001 if there are many thousands of epochs). It is sometimes useful to reduce the rate as training progresses. In STATISTICA Neural Networks you can set initial and final rate values, in which case the rate is interpolated between them as training proceeds. The initial rate is set in the left field, the final rate in the right (Fig. 9.11).
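A minimal backpropagation-style update loop tying these controls together (the learning-rate interpolation just described, plus the momentum and shuffle options discussed below) might look like this. `grad` is a hypothetical callback returning the error gradient for one observation; the linear interpolation scheme is an assumption, since SNN does not document its exact formula:

```python
import random

def train(weights, grad, data, epochs=100,
          eta_start=0.1, eta_end=0.01, momentum=0.3, shuffle=True):
    """Sketch of iterative training: the learning rate is interpolated
    from its initial to its final value, a momentum term keeps a fraction
    of the previous step, and the observation order is reshuffled every
    epoch (the Shuffle observations mode)."""
    velocity = [0.0] * len(weights)
    for epoch in range(epochs):
        t = epoch / max(epochs - 1, 1)
        eta = eta_start + t * (eta_end - eta_start)   # interpolated rate
        order = list(range(len(data)))
        if shuffle:
            random.shuffle(order)                     # new order each epoch
        for i in order:
            g = grad(weights, data[i])
            for j in range(len(weights)):
                # momentum carries over a fraction of the previous step
                velocity[j] = momentum * velocity[j] - eta * g[j]
                weights[j] += velocity[j]
    return weights
```

With `momentum=0` and `eta_start == eta_end` this reduces to plain gradient descent; the extra terms are exactly the knobs the text describes.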

Fig. 9.11

The momentum coefficient (Moment) helps the algorithm avoid getting stuck in local minima. It can take values between zero and one, and some authors recommend varying it during training. Unfortunately, here too the "correct" value depends on the problem and can only be found experimentally. When using backpropagation it is usually recommended to change the order of observations from epoch to epoch, since this reduces the likelihood of the algorithm getting stuck in a local minimum and also reduces the effect of overfitting. To use this feature, enable the Shuffle observations mode.

Assessing the quality of the network

Once the network has been trained, you should check how well it works. The mean square error reported in the Training Error Graph window gives only a rough measure of performance. More useful characteristics are reported in the Classification Statistics and Regression Statistics windows (both reached through the Analysis Results window). The Classification Statistics window applies to nominal output variables. It reports how many observations of each class in the data file (each class corresponding to a nominal value) were classified correctly, how many incorrectly, and how many were left unclassified, along with details of the classification errors. After training the network, simply open the Descriptive Statistics window (Fig. 9.12).

Fig. 9.12

Statistics can be obtained separately for the training, verification and test sets. The top of the table gives summary statistics (the total number of observations in each class, and the numbers classified correctly, incorrectly and left unclassified), and the bottom gives the cross-classification results (how many observations from a given column were assigned to a given row) (Fig. 9.13).

Fig. 9.13

If this table contains many Unknown answers but few or no Wrong answers, you should probably relax the acceptance and rejection thresholds (Edit menu, Pre/Post Processing) (Fig. 9.14).

Fig. 9.14

The Regression Statistics window is used with numeric output variables. It summarizes the accuracy of the regression estimates. The most important statistic is the standard deviation ratio (S.D. ratio), shown at the bottom of the table: the ratio of the standard deviation of the forecast errors to the standard deviation of the original data. If we had no input data at all, the best forecast for the output variable would be its mean value over the available sample, and the error of that forecast would equal the standard deviation of the sample. If a neural network works effectively, its mean error on the available observations should be close to zero and the standard deviation of its errors should be less than the standard deviation of the sample values (otherwise the network would do no better than simple guessing). Thus an S.D. ratio significantly less than one indicates an effective network. One minus the ratio of standard deviations equals the proportion of model variance explained.

Kohonen networks

The training algorithm for Kohonen networks is in some respects similar to those for multilayer perceptrons: it is iterative and proceeds in epochs, and the mean square training error can be plotted (although it is in fact the mean square of an entirely different error measure than in multilayer perceptrons). However, the Kohonen algorithm has a number of distinctive features. The most significant is that learning here is unsupervised: the data need not contain any output values at all, and if there are any, they are ignored. The behavior of the algorithm is determined by two parameters: the Learning Rate and the Neighborhood.
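As a footnote to the Regression Statistics discussion above, the S.D. ratio is straightforward to compute directly; this sketch uses plain lists and the standard library:

```python
from statistics import stdev

def sd_ratio(actual, predicted):
    """S.D. ratio: standard deviation of the forecast errors divided by
    the standard deviation of the data. A value well below 1 means the
    model beats the naive forecast of always predicting the sample
    mean (whose error spread equals the data's own spread)."""
    errors = [a - p for a, p in zip(actual, predicted)]
    return stdev(errors) / stdev(actual)

y     = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
print(sd_ratio(y, y_hat))   # well below 1: the fit is informative
```

The numbers here are made up purely for illustration; in SNN the statistic is read off the Regression Statistics table rather than computed by hand.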
Training proceeds as follows: the next observation is fed to the network input and processed, the winning (most active) radial element (i.e., element of the second layer of the network) is selected, and then it and its nearest neighbors are adjusted so as to reproduce the training observation better. The learning rate controls the degree of adjustment, and the neighborhood determines

the number of elements adjusted at each step. The operation of the Kohonen algorithm is typically divided into two stages, ordering and fine tuning, in each of which the learning rate and the neighborhood size gradually change from their initial values to their final ones. In STATISTICA Neural Networks you can specify initial and final values for both the learning rate and the neighborhood size. The neighborhood is a square centered on the winning element: a "size" of zero corresponds to the winning element alone, size 1 to a 3×3 square centered on it, size 2 to a 5×5 square, and so on. If the winning element lies close to an edge, the neighborhood is clipped at the edge (rather than wrapping around to the opposite side). Although by nature this parameter is an integer, it can be specified as a real number for finer control as the algorithm shrinks the neighborhood; in that case STATISTICA Neural Networks first adjusts the number and then rounds it to the nearest integer. After the Kohonen training algorithm completes, the radial elements need to be labeled with the icons of their corresponding classes (see the "Topological map" section).

OTHER TYPES OF NETWORKS

Training the other network types is quite simple; in each case there are only a few adjustable training parameters, all of which are described below.

Radial basis functions (RBF)

Training consists of three stages: placing the centers of the radial elements, choosing their deviations, and optimizing the linear output layer. For the first two stages there are several algorithm variants, selected in the Radial Basis Function window (reached through the Training menu); the most popular combination is the K-means method for the first stage and the K-nearest neighbors method for the second. The linear output layer is optimized with the classical pseudoinverse matrix (singular value decomposition) algorithm.
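The winner-plus-neighborhood update with edge clipping described above for Kohonen training can be sketched as a toy implementation (this is an illustration of the idea, not SNN's internal algorithm; in SNN the rate and neighborhood also shrink over epochs, which is omitted here):

```python
import math

def kohonen_epoch(grid, data, eta=0.1, size=1):
    """One epoch of the Kohonen update described above: find the winning
    element (the one with the closest weight vector), then adjust it and
    every element inside a square neighborhood of the given size, clipping
    the square at the map edges rather than wrapping around.
    `grid` is a rows x cols list of weight vectors."""
    rows, cols = len(grid), len(grid[0])
    for x in data:
        # winner = most active (closest) radial element
        wr, wc = min(((r, c) for r in range(rows) for c in range(cols)),
                     key=lambda rc: math.dist(grid[rc[0]][rc[1]], x))
        # square neighborhood of side 2*size+1, clipped at the edges
        for r in range(max(0, wr - size), min(rows, wr + size + 1)):
            for c in range(max(0, wc - size), min(cols, wc + size + 1)):
                w = grid[r][c]
                # move the neighbor's weights toward the observation
                grid[r][c] = [wi + eta * (xi - wi) for wi, xi in zip(w, x)]
    return grid
```

With `size=0` only the winner moves; `size=1` adjusts the 3×3 square around it, matching the neighborhood sizes in the text.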
The STATISTICA Neural Networks program also lets you build hybrid RBF networks by choosing other activation functions for the output layer (for example, logistic ones); in that case the output layer can be trained with any of the multilayer perceptron training algorithms, such as the conjugate gradient method.

Linear networks

Here, in the guise of a two-layer network, an ordinary linear model is implemented, optimized with the pseudoinverse matrix algorithm in the Radial Basis Function window.

A linear network can also be used for principal component analysis, to try to reduce the number of variables before the data are processed by a network of another type.

Probabilistic and generalized regression neural networks (PNN/GRNN)

Probabilistic (PNN) and generalized regression (GRNN) neural networks are based on statistical kernel-based probability density estimation and are intended for classification and regression problems, respectively. They have simple and fast training algorithms, but the resulting neural network models are large and relatively slow.

Automatic network designer

Choosing an appropriate network type and architecture can be a long and unproductive process of trial and error. Moreover, since training is noisy and the algorithm can get stuck in local minima, each experiment must be repeated several times. This tedious work can be minimized with the automatic network construction capabilities of the STATISTICA Neural Networks package, which uses fairly sophisticated optimization algorithms to run large series of experiments automatically and select the best network architecture and size. Automatic network design is invoked by selecting the Solution Wizard tool. You need only specify the architecture types to consider, set the number of iterations (or the analysis time) that determines the length of the search (since the algorithm can run for a long time, it makes sense to start with a small number of iterations to estimate how long a full search might take), and choose the Saved network selection criterion, which penalizes networks with an unreasonably large number of elements. The algorithm performs the required series of experiments and reports the best of the resulting networks. If this algorithm is used for a time series analysis task, you must additionally set the Time window parameter.
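The kernel-based estimate behind the GRNN networks mentioned above amounts to a Gaussian-weighted average of the stored training outputs. A minimal sketch (the Nadaraya-Watson form, with `sigma` playing the role of the smoothing parameter; the function name and interface are illustrative):

```python
import math

def grnn_predict(x, train_x, train_y, sigma=0.5):
    """Minimal sketch of a GRNN prediction: a Gaussian-kernel weighted
    average of the training outputs. Training amounts to little more
    than storing the data, which is why such models are quick to build
    but large and relatively slow to run."""
    weights = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi))
                        / (2 * sigma ** 2))
               for xi in train_x]
    # outputs of nearby training points dominate the weighted average
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)
```

Every stored observation contributes one kernel evaluation per prediction, which makes the cost of a single forecast grow with the size of the training set; this is the large-and-slow trade-off the text describes.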
Genetic algorithm for input variable selection

One of the hardest questions in applying neural networks is which input variables to use (it is rarely known in advance which of them matter for the problem and which do not). Using the Reduce Dimension tool,

accessible from the Advanced tab of the start menu, you can find a suitable set of input variables automatically. By building and testing a large number of PNN or GRNN networks (for classification and regression problems, respectively) with different sets of input variables, a genetic algorithm (as well as forward-inclusion and backward-elimination algorithms) tries combinations of inputs in search of the best one. As with the automatic network designer, this procedure can be time-consuming, but it is often the only way to solve the problem. During operation the genetic algorithm generates a large number of trial bit strings (their number is set by the Population parameter) and artificially "breeds" them over a given number of generations, using the selection operations of mutation and crossover, whose intensity can be controlled. A PNN or GRNN network is trained with a given smoothing parameter (it is wise to run a few tests to determine an appropriate smoothing coefficient before applying the genetic algorithm), and a Penalty per element parameter can be specified to favor small sets of input variables. The algorithm considers all input variables present in the data set. To run the algorithm, click OK. When it finishes, the table at the bottom of the window shows Yes next to useful variables and a dash next to useless ones. To use the results, select Run Network Designer for selected variables or Run Solution Wizard for selected variables in the Dimensionality Reduction menu on the End of Analysis tab.

WORKING WITH THE NETWORK

Obtaining output values

Once the network is trained, it can be used for data analysis: you can run the network on individual observations from the current data set, on the entire data set, or on arbitrary user-defined observations.
The network can process any other compatible data set that has input variables with the same names and definitions as those the network was built with. This means that once the network is built, we are no longer tied to the training set. If the data set being analyzed has compatible output values in addition to inputs, STATISTICA Neural Networks will also compute error values. When you open a network or a data set, STATISTICA Neural Networks checks whether the data set contains variables compatible with the network's input variables. If it does, their type in the data set is set automatically as needed, and all other variables are ignored. In this way it is possible to have multiple networks (as files) working with

