Future Hearing Prosthetics - Seminar Paper

Millions of people are reported to suffer from some kind of hearing impairment, and the number is climbing as the share of elderly people in the world’s population grows. Because hearing loss is usually caused by permanent mechanical damage to the ear, no medicine can reverse it, and surgery offers little help.
The electronic hearing aid or prosthesis is therefore the best known solution for patients. A neurofuzzy approach, in which devices combine neural networks and fuzzy logic to meet each patient's gain requirements, is a promising direction for future hearing prosthetics.

This article describes an endeavor to help the hearing impaired through the use of neurofuzzy methodologies to tune prosthetic hearing devices in an efficient and tractable manner. It describes the integration of a graphical user interface, a hearing aid emulator module and a fuzzy inference engine into an intelligent tool for tuning prosthetic hearing devices. The graphical user interface permits the extraction of perceptual information about the patient's aural response to test stimuli in the form of speech patterns. This interface could significantly reduce the role of the acoustician in fine-tuning the hearing aid, thereby reducing possible human error and involving the patient more directly in the tuning procedure. A hearing aid emulator makes it possible to test the entire tool in a device-independent fashion: the test speech patterns are passed through a filter bank that mimics the frequency response of the hearing aid. Testing, simulation results and possible future work form the remainder of the paper.
As we grow older, we lose the ability to hear very weak sounds. Ear aging differs widely between individuals, but hearing loss for high-frequency sounds typically starts in the 40s and progresses rapidly in the 60s. One of the main causes is damage to the cochlea. People affected in this way cannot hear speech sounds properly and find it difficult to understand what others say, especially when speech is presented against a noisy background.
Research on digital hearing aids
In hearing-impaired listeners, the sound pressure level above which a sound becomes audible (the threshold of hearing) is higher than in normal listeners. However, the sound pressure level at which a sound becomes unbearably loud is almost the same for normal and hearing-impaired listeners. Older hearing aids simply amplified input sounds using analog circuits. With such aids, a loud sound is often amplified to an intolerable level, and for this reason they were not broadly accepted. The digital hearing aid CLAIDHA (Compensating Loudness by Analyzing the Input-signal, Digital Hearing Aid) was developed to address this problem. It applies digital signal processing based on an analysis of the input sounds and amplifies them so that a hearing-impaired listener perceives them as nearly as loud as a normal listener would. In our laboratory, we are also researching a new speech enhancement algorithm that uses the newest generation of digital signal processors and techniques. We intend to apply this algorithm in a new digital hearing aid that has relatively low hardware requirements and processes sounds in real time.
Conceptually, the hearing aid is nothing more than an amplifier that collects and amplifies environmental sounds in order to compensate for hearing loss. However, the hearing process is much too complex to be fully compensated by current hearing aids. Hearing is, by its nature, an intelligent process: people try to understand the sounds they receive, which means that the brain is involved in the process.

This makes the design of a hearing aid much more complicated than finding a set of linear amplifiers. Hearing aid technology has progressed tremendously in the past couple of decades. There are three main categories of hearing aid technology today: analog, programmable, and digital. The most basic is the analog system. These hearing aids are adjusted for a person using a fine screwdriver, a method of fitting that has been around for a long time. For many people with hearing loss, this type of technology works just fine.

A person with a harder-to-fit type of hearing loss, who requires more precise tuning, may be fitted with a programmable hearing aid. The hearing professional programs the hearing aid using a computer, which allows finer adjustments and a more individualized fit. It is easier to make small changes in the program with this technology than with the analog system. This technology is very beneficial for people who rely heavily on their hearing in their daily lives.
The newest technology in the hearing aid world is the digital hearing aid. With this technology, acoustic signals are transformed at high speed and with great precision into a binary code. This allows much more complex calculations and adjustments of the amplified signal than is possible with the other two technologies, and it gives greater flexibility in providing individualized solutions to hearing loss. This paper also deals with digital processing of speech as it pertains to the hearing impaired. At present, the available hearing aids lag behind the technology curve in terms of both algorithm research and available hardware.
A digital hearing aid normally consists of a microphone, an analogue-to-digital (A/D) converter, a microprocessor, a digital-to-analogue (D/A) converter and a loudspeaker. The microprocessor replaces the amplifier circuit used in an analogue hearing aid and is designed for special purposes such as providing optimal speech intelligibility. Because sound processing is digitized, it is possible to refine the sound signal, for instance by reducing noise and improving speech signals, or by amplifying only the frequencies that the user needs amplified. The digital circuit itself contributes essentially no internal noise. The central part of a digital hearing aid is the microprocessor, or digital signal processor (DSP).

The mapping of the audiogram into the target is a process that preselects the required gains of the hearing aid for the user. It is desirable to make this guess as close to the real requirement as possible, to save effort and time in the fine-tuning process that follows. One preselection strategy is the comparative technique, which was introduced by Carhart in the 1940s. In this approach, the audiologist has a set of established targets obtained from previous experience, and the user is asked to try them and select the best match. This is a time-consuming process, which makes it increasingly unattractive. In contrast to the comparative technique, another paradigm, called the prescriptive procedure, is widely used. The prescriptive procedure employs a designated formula to assess the gain requirement of the user. The input to the formula is the audiological data from the patient. Many prescriptive procedures exist. The one-half gain rule, in which the gain at each frequency is taken as roughly half of the hearing loss shown on the audiogram, is one of the simplest. Others, such as the prescription of gain and output (POGO), developed by McCandless and Lyregaard, and the National Acoustics Laboratory (NAL) procedures, are also available. However, the existence of so many procedures reveals that there is no general agreement on which one is best. In order to be easy to apply, the formulae oversimplify the hearing process to some extent. Since an unknown and nonlinear relationship exists between the hearing loss of the patient (audiogram) and the desired gain (target), it is very difficult to derive a general formula manually.
The neural network, which is a model-free architecture capable of fitting virtually any function, is a promising way to derive the mapping. A neural network includes many neurons, and all the information is stored in the form of connection strengths (weights) between neurons. The weights can initially be set to random numbers that evolve during the learning process. Learning is the process through which the network adjusts its weights by comparing the actual outputs with the desired outputs for given inputs.
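As a point of reference for the prescriptive procedures mentioned above, the one-half gain rule can be written in a few lines. This is a minimal sketch, assuming the rule is applied independently at each audiogram frequency; the function name and the audiogram values are made up for illustration and are not patient data.

```python
# Sketch of the one-half gain rule: prescribed gain is about half the loss.
# The audiogram below is invented illustration data, not a real measurement.

def half_gain_targets(audiogram_db):
    """Map {frequency_hz: hearing_loss_db} to {frequency_hz: gain_db}."""
    return {freq: 0.5 * loss for freq, loss in audiogram_db.items()}


example_audiogram = {250: 30, 500: 35, 1000: 40, 2000: 55, 4000: 70}
targets = half_gain_targets(example_audiogram)

for freq in sorted(targets):
    print(f"{freq} Hz: {targets[freq]:.1f} dB gain")
```

A formula this simple is easy to apply, which is exactly why it cannot capture the nonlinear relationship between audiogram and target that motivates the neural network.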
            The target generated by the neural network reflects the average requirement of a large population of patients, so it cannot be expected to be optimal for a particular user. The user is therefore asked to take part in a fine-tuning step and evaluate the performance of the current target. Some minor changes are expected based on the user’s evaluation. However, the user’s impressions are fuzzy and imprecise. What users express are not quantified gains but linguistic terms: bad, good, OK, and so on. Recovering, from these linguistic terms, the quantified corrections needed to tune the target is a difficult task. The major obstacle is that there are too many degrees of freedom (too many adjustable variables), and sometimes the adjustments are contradictory. To deal with this difficulty, we turn to fuzzy logic, which is good at handling fuzzy and contradictory information. Fuzzy logic uses linguistic terms to describe a problem and operates on them. The fuzzy logic output can be defuzzified to obtain crisp numbers.
            In this section, we will introduce a methodology combining neural networks and fuzzy logic to be used in next generation hearing aids.

Systematic view
            The whole picture of our approach is shown in the figure. In the first step, hearing loss data is collected by an audiologist; the data drive a trained neural net that generates the initial targets. The initial targets are the first guess at the real gain requirements of the user. It is worth pointing out that the response of humans to environmental sounds is not a simple linear function of the sound level. A patient with hearing loss may complain that he/she finds a soft sound too soft to perceive but does not have any problems with normal or loud sounds.

            Therefore he/she needs more gain for soft sounds and little or even no gain for normal and loud sounds, which implies that the target (gains) should depend on the input sound level. We employ three sets of targets that correspond to soft, normal, and loud sounds respectively. The initial targets are programmed into the hearing device by hearing aid parameter-setting algorithms. The user is then asked to listen to specific speech and sound stimuli with the programmed hearing aid and to evaluate the performance on six subjective indexes: loudness, tone, clarity, comfort, distortion and noise. The user gives a quantitative measurement (between 0 and 10, for example) for each index. If the user is satisfied with all these aspects, the process is done. Otherwise, a fine-tuning algorithm is invoked to adjust the target in the direction the user desires. This new, finer target is then programmed and evaluated by the user, and further tuning is performed based on that evaluation. The process continues until the user is fully satisfied.
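The evaluate-and-tune loop described above can be sketched as follows. This is only an illustration under stated assumptions: the satisfaction threshold of 8, the round limit, the simulated user and the toy tuning rule are all hypothetical stand-ins, not part of the system described in this paper.

```python
# Sketch of the fitting loop: evaluate the target, stop if the user is
# satisfied on all six indexes, otherwise fine-tune and try again.
INDEXES = ("loudness", "tone", "clarity", "comfort", "distortion", "noise")


def fit_hearing_aid(initial_target, evaluate, fine_tune,
                    satisfied_score=8, max_rounds=20):
    """Return (final_target, rounds_used). satisfied_score is an assumption."""
    target = list(initial_target)
    for round_no in range(1, max_rounds + 1):
        scores = evaluate(target)  # dict: index name -> score in 0..10
        if all(scores[name] >= satisfied_score for name in INDEXES):
            return target, round_no
        target = fine_tune(target, scores)
    return target, max_rounds


# Hypothetical stand-ins so the sketch runs without a real user or device.
PREFERRED = [20, 25, 30]  # gains (dB) this imaginary user would like


def simulated_user(target):
    """Score every index as 10 minus the mean gain error, floored at 0."""
    error = sum(abs(t - p) for t, p in zip(target, PREFERRED)) / len(target)
    score = max(0.0, 10.0 - error)
    return {name: score for name in INDEXES}


def lower_all_bands(target, scores):
    """Toy tuning rule: reduce every band by 1 dB per round."""
    return [t - 1 for t in target]


final_target, rounds = fit_hearing_aid([26, 31, 36], simulated_user, lower_all_bands)
print(final_target, rounds)
```

In the real tool the `fine_tune` step would be the fuzzy engine and `evaluate` would be the patient's answers entered through the graphical user interface.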

Neural Network for Target Generation
            A neural network is used in this approach to generate the initial target for the user. As stated earlier, the comparative approach for selecting the target is inefficient because numerous alternatives are available, and the prescriptive procedures are oversimplified. They are developed for generic purposes, which implies that they are not adaptive. It is difficult for physicians to modify these formulae to fit their specific situations, which is a fatal weakness of the prescriptive procedures. A neural network, on the other hand, is a universal function-fitting tool; it can fit virtually any function as long as sufficient examples are provided. A three-layer neural net is used in our approach.
            The input layer has five neurons that correspond to the hearing loss at 250 Hz, 500 Hz, 1 kHz, 2 kHz and 4 kHz respectively. The output layer has eight neurons that correspond to the target gains at 250 Hz, 500 Hz, 750 Hz, 1 kHz, 1.5 kHz, 2 kHz, 3 kHz and 4 kHz respectively. The number of neurons in the hidden layer may vary from 20 to 40, depending on the actual situation. The activation function in the hidden layer is a nonlinear sigmoid function, and the output layer uses a linear function.
            The neural network operates in two modes: learning and simulation. In the learning mode, the network is given a set of training data abstracted from previous successful cases, and the weights inside the network are adjusted to minimize the error between the desired and actual output. The training data consists of the patient’s hearing loss as input and the actual target as desired output. This data is obtained either from a public medical database or from a physician’s routine practice. In this way, it becomes easy for a physician to incorporate his/her experience into the system by training the neural network with his/her successful cases. After a certain number of epochs of learning, the configuration of the network should reach an optimal state for the given training data. The network is then ready to compute the target for a specific hearing loss, that is, the initial target for the user to evaluate. Three targets are generated, for soft, normal, and loud sound respectively.
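The network described above (five inputs, a sigmoid hidden layer, eight linear outputs) and its two modes can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the training rule (per-sample gradient descent on squared error), the learning rate, the scaling of dB values by 100, and the synthetic training data (a half-gain target interpolated to eight frequencies) are all assumptions made so that the example is self-contained.

```python
import math
import random

IN_FREQS = [250, 500, 1000, 2000, 4000]


def make_network(n_in=5, n_hidden=20, n_out=8, seed=0):
    """Random initial weights; the last entry of each row is the bias."""
    rng = random.Random(seed)

    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols + 1)] for _ in range(rows)]

    return {"hidden": layer(n_hidden, n_in), "output": layer(n_out, n_hidden)}


def forward(net, x):
    """Simulation mode: sigmoid hidden layer followed by a linear output layer."""
    h = [1.0 / (1.0 + math.exp(-(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))))
         for w in net["hidden"]]
    y = [w[-1] + sum(wi * hi for wi, hi in zip(w, h)) for w in net["output"]]
    return h, y


def train_step(net, x, target, lr=0.05):
    """Learning mode: one gradient-descent update; returns the squared error."""
    h, y = forward(net, x)
    err = [yi - ti for yi, ti in zip(y, target)]
    # Back-propagate the output error through the sigmoid hidden units.
    delta_h = [hj * (1 - hj) * sum(err[k] * net["output"][k][j] for k in range(len(err)))
               for j, hj in enumerate(h)]
    for k, w in enumerate(net["output"]):
        for j, hj in enumerate(h):
            w[j] -= lr * err[k] * hj
        w[-1] -= lr * err[k]
    for j, w in enumerate(net["hidden"]):
        for i, xi in enumerate(x):
            w[i] -= lr * delta_h[j] * xi
        w[-1] -= lr * delta_h[j]
    return sum(e * e for e in err) / len(err)


def synthetic_target(loss):
    """Invented training target: half the loss, interpolated to 8 frequencies."""
    l250, l500, l1k, l2k, l4k = loss
    full = [l250, l500, (l500 + l1k) / 2, l1k, (l1k + l2k) / 2, l2k, (l2k + l4k) / 2, l4k]
    return [0.5 * v for v in full]


rng = random.Random(1)
samples = []
for _ in range(30):
    loss = [rng.uniform(20, 90) for _ in IN_FREQS]
    samples.append(([v / 100 for v in loss], [g / 100 for g in synthetic_target(loss)]))

net = make_network()
history = []
for epoch in range(200):
    history.append(sum(train_step(net, x, t) for x, t in samples) / len(samples))

print(f"mean squared error: first epoch {history[0]:.4f}, last epoch {history[-1]:.4f}")
```

In practice the synthetic samples would be replaced by a physician's successful fittings, and a separate network (or a separate output set) would be trained for each of the soft, normal, and loud targets.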

What is fuzzy logic?
Fuzzy logic is a superset of conventional (Boolean) logic that has been extended to handle the concept of partial truth, that is, truth values between "completely true" and "completely false". Dr. Lotfi Zadeh of UC Berkeley introduced it in the 1960s as a means to model the uncertainty of natural language. Zadeh says that rather than regarding fuzzy theory as a single theory, we should regard the process of "fuzzification" as a methodology to generalize any specific theory from a crisp (discrete) to a continuous (fuzzy) form (see "extension principle" in [2]). Thus researchers have more recently also introduced "fuzzy calculus", "fuzzy differential equations", and so on (see [7]).
Fuzzy subsets: Just as there is a strong relationship between Boolean logic and the concept of a subset, there is a similar strong relationship between fuzzy logic and fuzzy subset theory. In classical set theory, a subset U of a set S can be defined as a mapping from the elements of S to the elements of the set {0, 1}, U: S --> {0, 1}. This mapping may be represented as a set of ordered pairs, with exactly one ordered pair present for each element of S. The first element of the ordered pair is an element of the set S, and the second element is an element of the set {0, 1}. The value zero is used to represent non-membership, and the value one is used to represent membership. The truth or falsity of the statement "x is in U" is determined by finding the ordered pair whose first element is x. The statement is true if the second element of the ordered pair is 1, and false if it is 0.
Similarly, a fuzzy subset F of a set S can be defined as a set of ordered pairs, each with the first element from S and the second element from the interval [0, 1], with exactly one ordered pair present for each element of S. This defines a mapping between elements of the set S and values in the interval [0, 1]. The value zero is used to represent complete non-membership, the value one is used to represent complete membership, and values in between are used to represent intermediate degrees of membership. The set S is referred to as the universe of discourse for the fuzzy subset F. Frequently, the mapping is described as a function, the membership function of F. The degree to which the statement "x is in F" is true is determined by finding the ordered pair whose first element is x; the degree of truth of the statement is the second element of that ordered pair. In practice, the terms "membership function" and "fuzzy subset" are used interchangeably.
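The difference between a crisp subset and a fuzzy subset can be made concrete with two membership functions over the same universe of discourse. This is an illustrative sketch only: the set "loud sounds" and the 60, 80 and 90 dB breakpoints are hypothetical values chosen for the example.

```python
# Crisp versus fuzzy membership for the (hypothetical) set of "loud" sounds.

def crisp_loud(level_db):
    """Classical subset: maps a sound level to exactly 0 or 1."""
    return 1 if level_db >= 80 else 0


def fuzzy_loud(level_db):
    """Fuzzy subset: degree of membership rises linearly from 60 to 90 dB."""
    if level_db <= 60:
        return 0.0
    if level_db >= 90:
        return 1.0
    return (level_db - 60) / 30


for level in (50, 75, 95):
    print(level, crisp_loud(level), fuzzy_loud(level))
```

A 75 dB sound is simply "not loud" in the crisp set, while the fuzzy set assigns it an intermediate degree of truth.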

Many decision-making and problem-solving tasks are too complex to be understood quantitatively; however, people succeed by using knowledge that is imprecise rather than precise. Fuzzy set theory, originally introduced by Lotfi Zadeh in the 1960s, resembles human reasoning in its use of approximate information and uncertainty to generate decisions. It was specifically designed to represent uncertainty and vagueness mathematically and to provide formalized tools for dealing with the imprecision intrinsic to many problems. By contrast, traditional computing demands precision down to each bit. Since knowledge can be expressed in a more natural way by using fuzzy sets, many engineering and decision problems can be greatly simplified.

Fuzzy set theory implements classes or groupings of data with boundaries that are not sharply defined (i.e., fuzzy). Any methodology or theory implementing "crisp" definitions, such as classical set theory, arithmetic, and programming, may be "fuzzified" by generalizing the concept of a crisp set to a fuzzy set with blurred boundaries. The benefit of extending crisp theory and analysis methods to fuzzy techniques is their strength in solving real-world problems, which inevitably entail some degree of imprecision and noise in the variables and parameters measured and processed for the application. Accordingly, linguistic variables are a critical aspect of some fuzzy logic applications, where general terms such as "large," "medium," and "small" are each used to capture a range of numerical values. While similar to conventional quantization, fuzzy logic allows these stratified sets to overlap (e.g., an 85-kilogram man may be classified in both the "large" and "medium" categories, with varying degrees of belonging or membership to each group). Fuzzy set theory encompasses fuzzy logic, fuzzy arithmetic, fuzzy mathematical programming, fuzzy topology, fuzzy graph theory, and fuzzy data analysis, though the term fuzzy logic is often used to describe all of these.

Fuzzy Logic in the Fine Tuning Process

            Fuzzy logic tuning plays an important role in the fitting process because success on the first try is nearly impossible. The inputs to the tuning process are the user's evaluation and the target being evaluated. The output is a finer target for further evaluation. Several factors favor fuzzy logic in the tuning process over other approaches. First, the user’s evaluations are fuzzy. Second, the rules used to fine-tune are described using linguistic terms. Third, several adjustments that are required at the same time might be contradictory.

            The user is first exposed to three different sound environments: soft, normal, and loud sounds separately. He/she is then asked to give quantitative evaluations on loudness, tone, intelligibility, comfort, distortion, and noise. These evaluations, normalized between -1 and 1 or 0 and 1, are then fuzzified using the membership functions.   

            The output variables are modifications to the target at three different frequencies: high frequency (HF), medium frequency (MF), and low frequency (LF). The crisp outputs, which are the actual modifications to the targets, are obtained by defuzzification using the corresponding output membership functions.

            The rules that are used to fine-tune the target are from theoretical considerations or clinical experiences. Samples of these rules include:
- If (loudness is LOUD), then (DECREASE LF, DECREASE MF, DECREASE HF).
- If (tone is BRIGHT), then (DECREASE HF).
- If (tone is DULL), then (INCREASE LF).
- If (loudness is NORMAL) and (intelligibility is BAD), then (INCREASE HF SLIGHTLY).
- If (distortion is DISTURBING) and (intelligibility is BAD), then (INCREASE LF SLIGHTLY).
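To show how rules of this kind turn fuzzy evaluations into crisp corrections, the first three sample rules can be sketched as a small inference routine. This is a simplified illustration, not the inference engine described in this paper: the ramp membership functions, the 3 dB singleton outputs, the implicit "no change" rule, and the weighted-average defuzzification (a zero-order Sugeno-style scheme) are all assumptions made for the example.

```python
# Minimal fuzzy inference sketch for three of the sample rules.
# Inputs are assumed normalized to [-1, 1]:
#   loudness > 0 means "too loud"; tone > 0 means "bright", tone < 0 "dull".

def ramp(x, lo, hi):
    """Membership rising linearly from 0 at lo to 1 at hi."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def fine_tune_adjustment(loudness, tone, step_db=3.0):
    """Return crisp gain changes (dB) for the LF, MF and HF bands."""
    loud = ramp(loudness, 0.0, 1.0)
    bright = ramp(tone, 0.0, 1.0)
    dull = ramp(-tone, 0.0, 1.0)

    # (firing strength, {band: singleton output in dB})
    rules = [
        (loud, {"LF": -step_db, "MF": -step_db, "HF": -step_db}),
        (bright, {"HF": -step_db}),
        (dull, {"LF": +step_db}),
    ]

    changes = {}
    for band in ("LF", "MF", "HF"):
        fired = [(s, out[band]) for s, out in rules if band in out]
        # Assumed implicit rule: "otherwise, leave the band unchanged".
        no_change = 1.0 - max(s for s, _ in fired)
        numerator = sum(s * value for s, value in fired)
        denominator = sum(s for s, _ in fired) + no_change
        changes[band] = numerator / denominator  # weighted-average defuzzification
    return changes


print(fine_tune_adjustment(loudness=1.0, tone=0.0))
print(fine_tune_adjustment(loudness=0.0, tone=0.5))
```

When the "too loud" and "dull" rules fire together, their opposite requests for the LF band are blended by the weighted average rather than applied one after the other, which is how contradictory adjustments are reconciled in this sketch.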

            Most rules will be supplied by the manufacturer, but the physician remains free to add his/her own rules to the rule base.

            One important issue is the convergence of the rule base. Because of shortcomings in the rules and inaccuracy in the user’s evaluations, the fine-tuning process may not converge to an optimal result. To address this problem, we also designed a set of rules that guide the tuning process by adjusting the tuning step adaptively. Some samples include:

- If (gain is CURRENTLY INCREASED) and (gain was PREVIOUSLY INCREASED), then (SLIGHTLY INCREASE adjustment step).
- If (gain is CURRENTLY INCREASED) and (gain was PREVIOUSLY INCREASED) and (adjustment step was PREVIOUSLY SLIGHTLY INCREASED), then (MODERATELY INCREASE adjustment step).
- If (gain is CURRENTLY INCREASED) and (gain was PREVIOUSLY DECREASED), then (DECREASE adjustment step).
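The idea behind these step-adaptation rules can be illustrated with a crisp (non-fuzzy) simplification. This sketch is an assumption-laden stand-in: the multipliers 1.25, 1.5 and 0.5 are invented values for "slightly increase", "moderately increase" and "decrease", and the rules are generalized to any pair of equal or opposite directions, whereas the samples above only state the case where the gain is currently increased.

```python
# Crisp simplification of the step-adaptation rules (hypothetical factors).
# Directions are +1 (gain increased) or -1 (gain decreased).

def adapt_step(step_db, current_dir, previous_dir, step_grew_last_time):
    """Return (new_step_db, step_grew) following the three sample rules."""
    if current_dir == previous_dir:
        # Same direction twice: the target is still far away, so speed up.
        factor = 1.5 if step_grew_last_time else 1.25
        return step_db * factor, True
    # Direction reversed: the target was overshot, so slow down.
    return step_db * 0.5, False


step, grew = 2.0, False
for current, previous in [(+1, +1), (+1, +1), (+1, -1)]:
    step, grew = adapt_step(step, current, previous, grew)
    print(f"direction {current:+d} after {previous:+d}: step = {step} dB")
```

Growing the step while the user keeps asking for the same change and shrinking it after a reversal is what lets the tuning settle instead of oscillating around the preferred target.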

Simulation and Results
            In order to model the effect of a hearing aid, the input signal is passed through a filter that emulates the frequency response of the hearing aid. This removes the device-dependent nature of the evaluation of the tool and permits testing without a prosthetic hearing device. The filter parameters are determined from the data that the user enters while characterizing the speech sample that he/she hears.
            The emulator is essentially a finite impulse response (FIR) filter constructed from the desired output given by the fuzzy engine. The filter is shaped to avoid sharp gradients, which is close to how a real hearing aid performs. A smooth response curve prevents the sound from being suddenly distorted at some frequencies.
            Once the filter shape is determined, the speech sample (with the loss imposed) is passed through this filter and played to the user using a set of speakers.
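One way to build such an emulator filter is sketched below. The paper does not state which FIR design method was used, so this example swaps in a standard technique: frequency-sampling design of a linear-phase filter, followed by a Hamming window to smooth the response. The function names, the tap count and the band gains are hypothetical.

```python
import math


def interp_gain_db(gains_db, frac):
    """Linearly interpolate gains_db over frac in [0, 1] (0 = DC, 1 = Nyquist)."""
    pos = frac * (len(gains_db) - 1)
    i = min(int(pos), len(gains_db) - 2)
    t = pos - i
    return (1 - t) * gains_db[i] + t * gains_db[i + 1]


def design_fir(gains_db, num_taps=31):
    """Frequency-sampling design of a linear-phase FIR with a Hamming window.

    gains_db lists the desired gains (dB) at evenly spaced frequencies from
    DC to Nyquist; at least two values and an odd num_taps are required.
    """
    if num_taps % 2 == 0 or len(gains_db) < 2:
        raise ValueError("need an odd tap count and at least two gain values")
    mid = (num_taps - 1) // 2
    mags = [10 ** (interp_gain_db(gains_db, 2 * k / num_taps) / 20)
            for k in range(mid + 1)]
    taps = []
    for n in range(num_taps):
        # Inverse DFT of a real, symmetric magnitude response.
        h = mags[0] + 2 * sum(
            mags[k] * math.cos(2 * math.pi * k * (n - mid) / num_taps)
            for k in range(1, mid + 1))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(window * h / num_taps)
    return taps


def apply_fir(signal, taps):
    """Filter a list of samples by direct convolution (same length as input)."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


# Hypothetical target: more gain at high frequencies than at low ones.
taps = design_fir([5.0, 10.0, 20.0, 30.0], num_taps=31)
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(200)]
filtered = apply_fir(tone, taps)
print(len(taps), max(abs(v) for v in filtered))
```

The window trades exact agreement with the requested gains for a smoother response curve, which matches the stated goal of avoiding sharp gradients.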
            Tests were performed to determine the functionality of the system. Testing showed that a variable rate of approach to the final state may be necessary since, typically, large changes need to be made in the first few iterations while progressively smaller changes are needed as the number of iterations increases. To this end, a velocity factor a (0 < a < 1) has been incorporated into the target update process. Its effect may be illustrated by considering
                                                            T_{k+1} = T_k + a \Delta T_k
            Here, T_k represents the target vector at iteration k, and \Delta T_k represents the change in the target vector at iteration k as given by the fuzzy engine. The value of a may be kept static through all the iterations or varied as the number of iterations increases. Decreasing a as the number of iterations increases should yield faster convergence, since this reduces the probability of overshooting the final state.
            Also, a constant multiplicative gain factor C is used to change the effect of the target on the hearing aid emulator frequency response. This effect is illustrated in

                                                            G_k = C T_k

            Here, G_k is the gain vector of the hearing aid at iteration k, T_k is the target vector at iteration k and C is the multiplicative constant.
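The two update equations above translate directly into code. In this sketch the velocity schedule (a halved every iteration from a starting value of 0.8), the stand-in for the fuzzy engine's output (the remaining distance to an assumed preferred target) and the value C = 1.0 are all hypothetical choices made for illustration.

```python
# Sketch of the target update T_{k+1} = T_k + a * dT_k and gain G_k = C * T_k.

def update_target(target, delta, a):
    """Apply one update step with velocity factor a (0 < a < 1)."""
    return [t + a * d for t, d in zip(target, delta)]


def emulator_gain(target, c):
    """Scale the target by the constant C to get the emulator gain vector."""
    return [c * t for t in target]


preferred = [20.0, 30.0, 40.0]   # assumed final state, for the demo only
target = [10.0, 10.0, 10.0]
a = 0.8
for k in range(5):
    # Stand-in for the fuzzy engine: request the remaining difference.
    delta = [p - t for p, t in zip(preferred, target)]
    target = update_target(target, delta, a)
    gain = emulator_gain(target, 1.0)
    print(k, [round(g, 2) for g in gain])
    a *= 0.5  # hypothetical schedule: shrink the velocity factor each round
```

With a fixed a the same loop applies; only the last line changes.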
Frequency measures the number of waves that pass a point in each unit of time. It is usually measured in hertz; 1 hertz (Hz) is 1 cycle per second, and 1 kHz = 1000 Hz. The range of human hearing is roughly 20 Hz to 20,000 Hz.
Intensity is a measure of the power in a sound as it contacts an area such as the eardrum, and is directly proportional to the square of the amplitude of the waveform. Intensity is expressed as power per unit area and measured in watts per square meter. The decibel scale is logarithmic and provides the most convenient physical measure of intensity. One bel (named after Alexander Graham Bell) is defined as the level difference between two sounds whose intensities have a ratio of 10:1, and a decibel (dB) is one tenth of a bel. Note that the decibel is a unit for the level difference between two sounds. In acoustics, dB is often used as an absolute measure of intensity; in this case, it is a measurement relative to the threshold of hearing, 10^{-12} watts per square meter. Here are some sample intensity levels in decibels:

Sound                              Level (dB)
Threshold of hearing                   0
Leaves rustling in the breeze         20
A quiet restaurant                    50
Busy traffic                          70
Vacuum cleaner                        80
Threshold of pain                    120
Jet at takeoff                       140
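The relationship between intensity and decibel level described above can be checked with a short conversion routine. This sketch assumes the reference intensity of 10^{-12} watts per square meter given in the text; the function names are illustrative.

```python
import math

REFERENCE_INTENSITY = 1e-12  # threshold of hearing, in watts per square meter


def intensity_to_db(intensity_w_m2):
    """Sound level in dB relative to the threshold of hearing."""
    return 10 * math.log10(intensity_w_m2 / REFERENCE_INTENSITY)


def db_to_intensity(level_db):
    """Inverse conversion: intensity in watts per square meter."""
    return REFERENCE_INTENSITY * 10 ** (level_db / 10)


for name, level in [("Leaves rustling", 20), ("Busy traffic", 70), ("Threshold of pain", 120)]:
    print(f"{name}: {level} dB = {db_to_intensity(level):.1e} W/m^2")
```

Because the scale is logarithmic, every 10 dB step corresponds to a tenfold change in intensity, so the threshold of pain carries about a trillion times the intensity of the threshold of hearing.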

            Neural networks and fuzzy logic are powerful tools for next generation hearing prosthetics. A neural network, used as a function fitter that maps the hearing loss to the desired gain requirements, provides many benefits over other approaches. The network is able to learn dynamically through experience. It is open and expandable: a physician can easily incorporate new knowledge into the system. Fuzzy logic, on the other hand, is an indispensable tool for the tuning process. It builds a direct and reasonable link between a user’s subjective evaluation and the modifications actually required to the gain targets. Again, physicians are free to add new rules to the rule base to reflect specific needs and patterns.
            The presented neurofuzzy approach helps hearing prosthetic devices not only in the offline fitting process, but also in online operation. Next generation hearing prosthetics will differ from current devices in that they will be more than mechanical devices. The effects of the brain should be taken into account in order to design a successful hearing aid. For example, the hearing aid should be situation dependent; it should be capable of evolving or adapting. The neurofuzzy approach presented makes these features possible.
