Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and brain biological architecture is debated.
A subject of current research in theoretical neuroscience is the degree of complexity and the properties that individual neural elements must have in order to reproduce something resembling animal intelligence.
Historically, computers evolved from the von Neumann architecture, which is based on sequential processing and execution of explicit instructions. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of 'sensory' input from external sources. In other words, at its very heart a neural network is a complex statistical processor, as opposed to a machine tasked with sequentially processing and executing explicit instructions.
An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms, neural networks are non-linear statistical data modeling or decision-making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
What is a Neural Network?
An Artificial Neural Network (ANN) is an information-processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurons. This is true of ANNs as well.
Historical background
Neural network simulations appear to be a recent development. However, this field was established before the advent of computers, and has survived several eras. Many important advances have been boosted by the use of inexpensive computer emulations. The first artificial neuron was produced in 1943 by the neurophysiologist Warren McCulloch and the logician Walter Pitts.
First Attempts: There were some initial simulations using formal logic. McCulloch and Pitts (1943) developed models of neural networks based on their understanding of neurology. These models made several assumptions about how neurons worked. Their networks were based on simple neurons, which were considered to be binary devices with fixed thresholds.
Promising & Emerging Technology: Not only neuroscientists, but also psychologists and engineers contributed to the progress of neural network simulations. Rosenblatt (1958) stirred considerable interest and activity in the field when he designed and developed the Perceptron. The Perceptron had three layers, with the middle layer known as the association layer. This system could learn to connect or associate a given input to a random output unit. Another system was the ADALINE (Adaptive Linear Element), developed in 1960 by Widrow and Hoff (of Stanford University). The ADALINE was an analogue electronic device made from simple components. Its learning method differed from that of the Perceptron: it employed the Least-Mean-Squares (LMS) learning rule.
Today: Progress during the late 1970s and early 1980s was important to the re-emergence of interest in the neural network field. Significant progress has been made in the field of neural networks, enough to attract a great deal of attention and to fund further research. Neurally based chips are emerging, and applications to complex problems are developing. Clearly, today is a period of transition for neural network technology.
Why use neural networks?
Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained
neural network can be thought of as an "expert" in the category of information it has been given to analyze. This expert can then be used to provide projections given new situations of interest and answer "what if" questions. Other advantages include:
1. Adaptive learning: An ability to learn how to do tasks based on the data
given for training or initial experience.
2. Self-Organisation: An ANN can create its own organization or representation of the information it receives during learning time.
3. Real Time Operation: ANN computations may be carried out in parallel,
and special hardware devices are being designed and manufactured which take advantage of this capability.
4. Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads to the corresponding degradation of performance. However, some network capabilities may be retained even with major network damage.
Why would anyone want a `new' sort of computer?
What are (everyday) computer systems good at... and not so good at?
HUMAN AND ARTIFICIAL NEURONS - INVESTIGATING THE SIMILARITIES
How Does the Human Brain Learn?
Much is still unknown about how the brain trains itself to process information, so
theories abound. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity down its axon.
Learning occurs by changing the effectiveness of the synapses so that the
influence of one neuron on another changes.
Typically, brain cells, i.e., neurons, are five to six orders of magnitude slower than silicon logic gates: events in a silicon chip happen in the nanosecond range, whereas neural events happen in the millisecond range. However, the brain makes up for the slow rate of operation of a neuron through the massive interconnection between neurons. It is estimated that the human brain consists of about one hundred billion neural cells, about the same number as the stars in our galaxy.
From Human Neurons to Artificial Neurons
Neural networks are realized by first trying to deduce the essential features of neurons and their interconnections.
Inputs, xi:
Typically, these values are external stimuli from the environment or come from the outputs of other artificial neurons. They can be discrete values from a set, such as {0,1}, or real-valued numbers.
Weights, wi:
These are real-valued numbers that determine the contribution of each input to the neuron's weighted sum and eventually its output. The goal of neural network training algorithms is to determine the best possible set of weight values for the problem under consideration. Finding the optimal set is often a trade-off between computation time and minimizing the network error.
Threshold, u:
The threshold is also referred to as a bias value. In this case, this real number is added to the weighted sum. For simplicity, the threshold can be regarded as another input/weight pair, where w0 = u and x0 = -1.
Activation Function, f: The activation function for the original McCulloch-Pitts neuron was the unit step function. However, the artificial neuron model has since been expanded to include other functions such as the sigmoid, piecewise linear, and Gaussian.
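As a sketch of how these pieces fit together, the following folds the inputs, weights, threshold-as-bias trick (w0 = u, x0 = -1), and activation function into a single artificial neuron. The function names and the AND example are illustrative choices, not from the original text.

```python
import math

def neuron(inputs, weights, threshold, activation="step"):
    """A single artificial neuron: weighted sum of inputs, offset by the
    threshold, passed through an activation function."""
    # The threshold is treated as an extra input/weight pair: w0 = u, x0 = -1.
    total = sum(w * x for w, x in zip(weights, inputs)) + threshold * (-1)
    if activation == "step":       # the original McCulloch-Pitts unit step
        return 1 if total >= 0 else 0
    if activation == "sigmoid":    # a smooth alternative
        return 1.0 / (1.0 + math.exp(-total))
    raise ValueError("unknown activation")

# A neuron computing logical AND with a step activation:
print(neuron([1, 1], [1.0, 1.0], 1.5))  # both inputs on -> fires (1)
print(neuron([1, 0], [1.0, 1.0], 1.5))  # one input on  -> stays off (0)
```

With threshold 1.5 and unit weights, only the input (1, 1) drives the weighted sum above the threshold, which is exactly the binary, fixed-threshold behavior McCulloch and Pitts assumed.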
INTERCONNECTION LAYERS
The most common neural network model is the multilayer perceptron (MLP). This type of neural network is known as a supervised network because it requires a desired output in order to learn. The goal of this type of network is to create a model that correctly maps the input to the output using historical data so that the model can then be used to produce the output when the desired output is unknown. A graphical representation of an MLP is shown below
Block diagram of a two hidden layer multilayer perceptron (MLP). The inputs are fed into the input layer and get multiplied by interconnection weights as they are passed from the input layer to the first hidden layer. Within the first hidden layer, they get summed then processed by a nonlinear function (usually the hyperbolic tangent). As the processed data leaves the first hidden layer, again it gets
multiplied by interconnection weights, then summed and processed by the second hidden layer. Finally the data is multiplied by interconnection weights then processed one last time within the output layer to produce the neural network output.
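The forward pass just described can be sketched as follows: multiply by interconnection weights, sum, and apply a nonlinearity at each hidden layer, with a linear output layer. The weight values here are arbitrary placeholders, not trained values.

```python
import math

def layer(inputs, weights, use_tanh=True):
    """One layer: each unit sums its weighted inputs, then (optionally)
    applies the hyperbolic tangent nonlinearity."""
    outputs = []
    for unit_weights in weights:
        s = sum(w * x for w, x in zip(unit_weights, inputs))
        outputs.append(math.tanh(s) if use_tanh else s)
    return outputs

def mlp_forward(x, w_h1, w_h2, w_out):
    """Input -> first hidden layer -> second hidden layer -> linear output."""
    h1 = layer(x, w_h1)
    h2 = layer(h1, w_h2)
    return layer(h2, w_out, use_tanh=False)

# Two inputs, two units per hidden layer, one output (placeholder weights):
w_h1 = [[0.5, -0.3], [0.8, 0.2]]
w_h2 = [[1.0, -1.0], [0.4, 0.6]]
w_out = [[0.7, -0.2]]
print(mlp_forward([1.0, 0.5], w_h1, w_h2, w_out))
```

Each row of a weight matrix holds the interconnection weights feeding one unit of the next layer, mirroring the block diagram above.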
BACK-PROPAGATION ALGORITHM
The MLP and many other neural networks learn using an algorithm called
backpropagation. With backpropagation, the input data is repeatedly presented to the neural network. With each presentation the output of the neural network is compared to the desired output and an error is computed. This error is then fed back (backpropagated) to the neural network and used to adjust the weights such that the error decreases with each iteration and the neural model gets closer and closer to producing the desired output.
Demonstration of a neural network learning to model the exclusive-or (Xor) data. The Xor data is repeatedly presented to the neural network. With each presentation, the error between the network output and the desired output is
computed and fed back to the neural network. The neural network uses this error to adjust its weights such that the error will be decreased. This sequence of events is usually repeated until an acceptable error has been reached or until the network no longer appears to be learning. In order to train a neural network to perform some task, we must adjust the weights of each unit in such a way that the error between the desired output and the actual output is reduced. This process requires that the neural network compute the error derivative of the weights (EW).
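The loop described above can be sketched for the Xor data with a tiny 2-2-1 sigmoid network. The architecture, learning rate, and starting weights are illustrative assumptions (chosen only to break symmetry), not details of any particular published system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# The Xor data: each presentation pairs an input with its desired output.
DATA = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

# A 2-2-1 network. Each unit has two weights plus a bias (the last element).
w_h = [[1.0, -1.0, -0.5], [-1.0, 1.0, -0.5]]  # two hidden units
w_o = [1.0, 1.0, -0.5]                        # one output unit

def forward(x):
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def present_all(lr=0.5):
    """One presentation of every pattern: compare output to the desired
    output, backpropagate the error, and adjust the weights."""
    total_error = 0.0
    for x, target in DATA:
        h, y = forward(x)
        err = target - y
        total_error += err * err
        # Error derivative at the output, then propagated back to the hidden layer.
        d_o = err * y * (1 - y)
        d_h = [d_o * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]
        # Adjust weights so the error decreases on later presentations.
        for i in range(2):
            w_o[i] += lr * d_o * h[i]
            w_h[i][0] += lr * d_h[i] * x[0]
            w_h[i][1] += lr * d_h[i] * x[1]
            w_h[i][2] += lr * d_h[i]
        w_o[2] += lr * d_o
    return total_error

first = present_all()
for _ in range(10000):
    last = present_all()
print(first, last)  # the error shrinks as training proceeds
```

This is exactly the cycle in the text: present, compare, backpropagate, adjust, repeat until the error is acceptable.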
APPLICATIONS
Given this description of neural networks and how they work, what real-world applications are they suited for? Neural networks have broad applicability to real-world problems. In fact, they have already been successfully applied in many industries, across a broad spectrum of data-intensive applications, such as:
Voice Recognition - Transcribing spoken words into ASCII text.
Target Recognition - A military application that uses video and/or infrared image data to determine whether an enemy target is present.
Medical Diagnosis - Assisting doctors with their diagnoses by analyzing reported symptoms and/or image data such as MRIs or X-rays.
Process Modeling and Control - Creating a neural network model of a physical plant, then using that model to determine the best control settings for the plant.
Credit Rating - Automatically assigning a company's or individual's credit rating based on its financial condition.
Targeted Marketing - Finding the set of demographics that has the highest response rate for a particular marketing campaign.
Financial Forecasting - Using the historical data of a security to predict the future movement of that security.
Now we shall look at a few interesting applications developed across the world.
NETTALK
The most famous example of a neural-network pattern classifier is the NETtalk system developed by Terry Sejnowski and Charles Rosenberg, used to generate synthetic speech.
Once the network is trained to produce the correct phonemes in the 5000-word training set, it performs quite reasonably when presented with words that it was not explicitly taught to recognize. The data-representation scheme employed allows a temporal pattern sequence to be represented spatially, while simultaneously providing the network with a means of easily extracting the important features of the input pattern.
NETtalk Data Representation
Anyone who has learned to read the English language knows that for every
pronunciation rule, there is an exception. For example, consider the English
pronunciation of the following words:
FIND FIEND FRIEND FEINT
While these four words are very similar in their form and structure, the
pronunciation of each is vastly different. In each case, the pronunciation of the vowel(s) is dependent on a learned relationship between the vowel and its neighboring characters. The NETtalk system captures the implicit relationship between text and sounds by using a BPN to learn these relationships through experience. Sejnowski and Rosenberg adopted a sliding window technique for representing words as patterns. Essentially, the window is nothing more than a fixed-width representation of characters that form the complete input pattern for the network. The window “slides” across a word, from left to right, each time capturing (and simultaneously losing) one character. Sejnowski and Rosenberg used a window of seven characters, with the third position (middle) designated as the focus character. According
to the language studies three characters were adequate to exert the proper influence on the pronunciation of any one character in an English word. Sejnowski and Rosenberg chose to represent the input characters as pattern vectors composed of 29 binary elements-one for each of the 26 upper-case English alphabet characters, and one for each of the punctuation characters that influence pronunciation.
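The sliding-window encoding can be sketched as follows. The exact punctuation set is an illustrative assumption (the text says only that 29 - 26 = 3 punctuation characters are encoded); here blanks also pad the word boundaries so every character gets a full window.

```python
# 26 upper-case letters plus three punctuation-style characters = 29 symbols.
# The specific punctuation characters chosen here are an assumption.
ALPHABET = [chr(c) for c in range(ord('A'), ord('Z') + 1)] + [' ', '.', ',']

def encode_char(ch):
    """One 29-element binary vector per character (one-hot encoding)."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(ch)] = 1
    return vec

def windows(word, width=7):
    """One input pattern per character: the 7-character window centered on
    it, with blanks padding the word boundaries."""
    pad = ' ' * (width // 2)
    padded = pad + word.upper() + pad
    patterns = []
    for i in range(len(word)):
        window = padded[i:i + width]
        # Concatenate the per-character vectors: 7 x 29 = 203 elements.
        patterns.append([bit for ch in window for bit in encode_char(ch)])
    return patterns

pats = windows("FRIEND")
print(len(pats), len(pats[0]))  # 6 203: one pattern per character, 203 elements each
```

This makes concrete how a temporal sequence (the word read left to right) is laid out spatially as a fixed-size input vector.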
NETtalk Training
The training data for the NETtalk application consist of 5000 common English words, together with the corresponding phonetic sequence for each word. For each word, a set of n input patterns is defined such that each input pattern contains one instance of the seven-character sliding window, with each character represented as a 29-element vector, where n is the number of characters in the word. Using this scheme, the dimension of each input pattern was 203 elements (7 characters x 29 elements per character). The desired output for each pattern was a 26-element vector, and in all the training words yielded 30,000 exemplars. A three-layer BPN was used, with 80 sigmoidal units on the hidden layer, completely interconnected with all elements on the input and output layers.
NETtalk Results
Training the NETtalk BPN from the exemplar data was a substantial undertaking, requiring 10 hours of computer time on a VAX 11/780-class computer system. While the network was training, Sejnowski periodically stopped the process and allowed the network to simply produce whatever classifications it could, given a partial set of the training words as input. The classification produced by the network was used to drive a speech synthesizer to produce the sounds, which were recorded on audiotape. Before the training started, the network produced random sounds, freely mixing
consonants and vowel sounds. After 100 epochs, the network had begun to separate words, recognizing the role of the blank character in text. After 500 epochs, the network was making clear distinctions between the consonant sounds and the vowel sounds. After 1000 epochs, the words that the network was classifying had become distinguishable, although not phonetically correct. After 1500 epochs, the network had clearly captured the phonetic rules, as the sounds produced by the BPN were nearly perfect, albeit somewhat mechanical. Training was stopped after epoch 1500, and the network state was frozen. At that point, the NETtalk system was asked to pronounce 2000 words that it had not been explicitly trained to recognize. Using the relationships that the network had found during training, the NETtalk system had only minor problems "reading" these new words aloud. Virtually all of the words were recognisable to the researchers, and are also easily recognised by people not familiar with the system when they hear the audiotape recording. Sejnowski and Rosenberg reported that NETtalk can read English text with an accuracy of "about 95%".
RADAR-SIGNATURE CLASSIFIER
The primary application of pulse Doppler radar is to detect an airborne target and determine the range and velocity of the target relative to the radar station. Pulse Doppler radar operates on two very simple principles of physics: first, electromagnetic radiation (EMR) travels at a constant speed; second, EMR waves reflected from a moving body are frequency-shifted in the direction of travel. Usually, the radar system provides a digital readout of these parameters for each target acquired, leaving the chore of interpreting the information to the radar operator. The operator identifies the target based on the electronic signature of the radar return. As we indicated previously, radar-signature recognition is currently a strictly human activity; no automatic means of identifying a target from its radar signature has been incorporated into radar systems.
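The two physical principles above yield the radar's measurements directly: range follows from the pulse's round-trip time (R = c*t/2), and radial velocity from the Doppler shift of the reflected wave (v = f_d * c / (2 * f0)). These are the standard pulse-Doppler relations; the 10 GHz carrier frequency below is an illustrative assumption.

```python
C = 3.0e8  # speed of electromagnetic radiation, m/s (constant, per principle 1)

def target_range(round_trip_s):
    """Range from pulse round-trip time: the pulse travels out and back,
    so the one-way distance is half the total path."""
    return C * round_trip_s / 2.0

def radial_velocity(doppler_shift_hz, carrier_hz=10.0e9):
    """Closing speed from the Doppler frequency shift of the reflected
    wave (per principle 2); carrier frequency is an assumed 10 GHz."""
    return doppler_shift_hz * C / (2.0 * carrier_hz)

# A pulse returning after 200 microseconds, shifted up by 2 kHz:
print(target_range(200e-6))     # 30 km to the target
print(radial_velocity(2000.0))  # 30 m/s toward the radar
```

These are the range and velocity readouts the system hands to the operator; classifying the target from the shape of the return is the part that remains a human task.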