Medical Electronics Guide: The development of Chinese speech processing in digital hearing aids

A current hotspot in hearing aid research and development abroad is China: studying the Chinese language and its speech, and developing related speech recognition technologies and products.

Chinese audiology is no exception. Hearing science, the study of human hearing, is a rapidly developing discipline whose knowledge is quickly updated. Here we present and discuss a question that increasingly concerns scientists and audiologists: how to apply hearing science to the hearing and speech of Chinese speakers.

Chinese is a tone language, and its phonetics differ markedly from those of alphabet-based languages such as the Slavic languages.

The difference is clear not only in general language features; in phonetics in particular it is very large. Whether the differing speech characteristics of different languages affect how hearing-impaired patients understand speech, and whether research results based on the speech of one language carry over to speakers of another, has recently become a popular topic in academic and scientific research. For example, cochlear implant research in China designs algorithms that take the features of Chinese speech into account. Hearing aid manufacturers abroad have recently introduced hearing aids with algorithms tailored to the characteristics of Chinese speech. After years of research and experimentation, a Canadian speech laboratory in China built Chinese speech algorithms into its digital hearing aids as early as 2000, using leading digital signal processing (DSP) technology, and applied for a patent at the same time. It has now taken the lead in launching Intelligia, a new digital hearing aid with Chinese speech processing technology at its core. Clinical trials with subjects give preliminary evidence that this new type of hearing aid helps Chinese-speaking patients.

Current research shows that languages such as Chinese and English each have their own characteristics and differ greatly in how they are perceived by the auditory system.

English and Chinese differ in important ways in speech and language. Tsai, Ming-yi et al. (2000) hold that Chinese and English differ in the structural characteristics of their speech. In Chinese, words, characters, syllables, vowels and acoustic segments carry information at different levels and maintain complex relationships with one another. In spoken Mandarin, pronunciation varies widely from one conversation to another, depending on the information at these different structural levels.

Chinese speech recognition and Chinese tones in cochlear implant algorithms.

Speech processing strategies are the core technology that helps cochlear implant patients understand language, and they have been studied extensively. But research based on Chinese speech sounds, especially on tone and intonation, is still very scarce. In a recent experiment, researchers used an Australian cochlear implant to observe its influence on the understanding of Chinese speech. The results indicate that some speech processing strategies give better understanding of Chinese than others. The researchers also believe that raising the stimulation rate can strengthen the understanding of speech and tone, and that different speech processing strategies affect the understanding of Chinese differently. The research shows once again that Chinese should have speech processing designed for the language itself, which is particularly important for the hearing-impaired.

Michael Qin, a researcher at MIT in the United States, tested the identification of sounds and tones against background noise to study the relationship between Mandarin tone recognition and noise.

He notes that different languages use different types of spoken tones to give words different meanings, and that these meaningful tones are affected in noisy environments, so he set out to find how Mandarin speakers identify different tones in noise. In the test he used six consonant-vowel syllables in the four tones (yinping, yangping, shangsheng and qusheng). The results indicate that a lower signal-to-noise ratio has a large effect on the recognition of Chinese tones and vowels, and so lowers speech understanding. The signal-to-noise ratio therefore matters for understanding Chinese. The test has implications for auditory rehabilitation and for the design of targeted hearing aids.

Meanwhile, a comprehensive expert research team was recently established in the United States and has started to develop hearing aids suited to Chinese speech.

The group's members include the world-famous House Ear Institute and a university ENT department. Their view is similar to the studies above. They believe that in languages where tone carries word recognition and meaning, such as Mandarin, Cantonese and Thai, listeners may rely more heavily on relative fundamental frequency information to understand speech, which sets these languages apart from others. The development of hearing aids should therefore take the characteristics of these patients into account.

What interested me most, of course, is a recent study sponsored by the Wellcome Trust and reported under the title "Understanding Mandarin Chinese conversation takes more of the brain than English conversation". It used imaging techniques to observe and compare brain activity in native speakers of Chinese and of English.

The psychologist who led the study, Dr. Sophie Scott, found that when English-speaking volunteers heard English, the left temporal lobe became active; researchers think this area assembles speech sounds into separate words. But when Mandarin-speaking subjects heard Mandarin, the left and right temporal lobes were active at the same time. Evidently, speakers of different languages use different areas of the brain to decode speech. This finding has had a large impact on our understanding of these theories. The researchers further believe that in Chinese subjects the left temporal lobe processes the speech signal while the right temporal lobe processes tone. Speech is a very complex sound, and understanding it depends on how it is delivered; in such cases the brain makes full use of the speaker's modulated tones to decode the spoken words into a meaningful signal.

The auditory areas of the brain are very vulnerable to external influences, which change their ability to resolve speech.

Once hearing is damaged and rehabilitation is required, the brain needs to reconnect and re-encode. The brain is highly plastic. Understanding how the brain responds to different languages can effectively help hearing-impaired patients regain language understanding. On the basis of this research, the importance of developing hearing rehabilitation equipment matched to the phonetic characteristics of Chinese is clear. I remember that in 2002, at the opening ceremony of the speech and hearing centre founded by Peking University and the CDPF, Deng said in his address that this was the first time he had heard of Chinese speech processing features for hearing aid users; he considered it an important issue requiring a great deal of work, and said that developing auditory rehabilitation equipment with Chinese phonetic characteristics would be of great significance. By the internationally accepted prevalence of hearing loss, about 10 per cent of the population, some 130 million people in China have some degree of hearing loss. Using Chinese speech processing technologies to help these patients more effectively has a very important role to play.

1. Principle of Chinese speech processing technologies

In English, Chinese speech processing strategies are described by terms such as "Chinese speech processing strategy", "Mandarin speech recognition" and "hearing aids algorithm".

Of these, the word "algorithm" is used most, especially in the development of digital hearing aids, where the "algorithm" represents a core proprietary technology. An algorithm may be seen simply as an instruction sequence that realizes specific signal processing functions. The characteristics of Chinese speech can be handled through the algorithm. The digital signal processor and the algorithms together form the DSP line of digital hearing aids. Covering multichannel dynamic range compression, noise attenuation and other processing, the main goal of hearing aid algorithm design is to use Chinese speech processing technologies to ensure that speech is audible and listening is comfortable in all listening environments. At the same time, digital hearing aids are used to improve the intelligibility of Chinese, so that Chinese patients with hearing loss can understand Chinese more easily.

Chinese is a tone language, and tone is one of its important phonetic features.

Tone is realized primarily as a pattern of change in the fundamental frequency of the voice over time. Eady (1982) studied how the fundamental frequency patterns of a tone language, Chinese, differ from those of a stress language, English. In Chinese, tone serves to distinguish meaning. In everyday practice we find that correct tones help us understand what others say, while speech in a mixed accent (nanqiang beidiao) is often hard to understand and unpleasant to hear.

For continuous speech, the long-term averages of the rising (positive) and falling (negative) fundamental frequency fluctuation factors are much the same across languages and between men and women.

The falling fluctuation, however, is always larger than the rising one, and its frequency is higher. Eady's measurements also show that Chinese is spoken more slowly than English. This may be because a speaker of Chinese has to put more effort into each syllable: in a tone language, the laryngeal movements that control the voice carry a greater linguistic load on every syllable and so take more time. The result is slower speech.

In summary, tone information resides mainly in the change of fundamental frequency over time; intensity variation plays a compensating role for tone, and the presence of voiceless consonants also has some effect on speech clarity.

• Principles

This article presents a speech processing method that may be applied to digital hearing aids to improve the intelligibility of Chinese. Its goal is to let native Chinese speakers with residual hearing understand speech more easily.

The idea of enhancing speech intelligibility comes from practical experience. Recall how you make yourself easier to understand when talking to a hearing-impaired person: you not only raise your voice but also change the way you speak, talking more slowly and more clearly. Some studies indicate that reading nonsense sentences clearly improves word intelligibility by about 17% compared with conversational speech. Speaking more clearly here means emphasizing certain cues in the speech signal. Such cues take many forms, such as the duration of specific sound segments, the location of vowel formants, or the transitions between phonemes.

Not everyone can simply and easily produce "clear" speech for patients with hearing loss.

Speech enhancement therefore aims to place a processing model between speaker and listener that emphasizes and highlights specific elements of an utterance so that it sounds clearer.

Speech sounds can express meaning because the various phonemes differ from one another.

These differences arise from differences in manner and place of articulation, which are determined by the activity of the vocal organs and the muscles that control them, and they show up as differences in the acoustic characteristics of speech. The speech enhancement method in this article reinforces these differences by restructuring the speech signal. Restructuring means identifying the parts of the speech signal that differ in nature, processing each of them specifically, and highlighting the features that matter for human perception, so as to enhance speech intelligibility. The method can be summarized simply as: amplify consonants, strengthen stress and highlight tone.

Perceptual characteristics of the Chinese speech signal

Tone

• Tones in Minnan dialect

• Perception of tone

• Changes mainly with the fundamental frequency

• Changes in tone pitch are likely to affect duration and intensity as well

Stress

• Acoustic properties of stress

• Closely related to actual sound intensity, but not equal to it

• Also constrained by pitch and duration

• Perceptual characteristics: for stress, intensity is often not the decisive factor

1) Consonant amplification (Consonant Amplification)

Statements

Psychological experiments on speech perception have confirmed the following: listeners differ in how well they perceive the distinguishing information that the speech signal carries about manner of articulation and place of articulation. Overall, people distinguish manner of articulation better than place of articulation, although the clarity with which different manners are perceived is very similar. In the perception of Chinese consonants, the features rank in importance from strong to weak as follows: voiceless versus voiced, aspirated versus plosive, fricative versus non-fricative. Research has shown that strengthening consonants relative to the rest of the signal helps to improve speech intelligibility.

Kates describes a method of amplifying consonants; Figure 1 shows a widely used model.

The system decomposes the signal into several bands, measures the short-time spectrum in each band, distinguishes vowels from consonants by the shape of the spectrum, and applies gain to the consonants. It should be noted that, drawing on the concept of Chinese articulatory features, detection parameters calculated from the acoustic information can provide a secondary matching structure of the kind used in automatic speech recognition systems.

Figure 1 Consonant enhancement system
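The consonant amplification idea can be illustrated with a minimal sketch. It does not reproduce the Kates system: instead of classifying frames by spectral shape in several bands, it swaps in a much simpler cue, the zero-crossing rate, to flag consonant-like (noisy, high-frequency) frames. The function names, the 320-sample frame, the 0.3 threshold and the 6 dB gain are all illustrative assumptions, not values from the source.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def emphasize_consonants(samples, frame_len=320, zcr_threshold=0.3, gain_db=6.0):
    """Boost frames that look consonant-like; leave other frames unchanged.

    Assumption: a high zero-crossing rate stands in for the spectral-shape
    classifier described in the text. Returns (processed samples, frame labels).
    """
    gain = 10 ** (gain_db / 20)          # dB to linear amplitude factor
    out, labels = [], []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        consonant_like = zero_crossing_rate(frame) > zcr_threshold
        labels.append(consonant_like)
        factor = gain if consonant_like else 1.0
        out.extend(s * factor for s in frame)
    out.extend(samples[len(out):])       # pass an incomplete last frame through
    return out, labels
```

A real system would also need to keep background noise from being classified as a consonant, which this sketch makes no attempt to do.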

2) Accent (Stress)

The syllables that make up a stream of speech are not all equally loud.

Syllables that sound louder than the others in the speech stream are stressed syllables. Some stress is closely tied to semantics and syntax, such as word stress in Mandarin. Word stress occurs within a word, and the position of the stressed syllable differs with the meaning of the word. For example, in the words translated here as "technical" and "count", the stress falls on the first and the second syllable respectively. This semantic difference is expressed through suprasegmental features.

In Chinese, the effect of stress on prosodic feature parameters deserves attention.

The prosodic features of the speech stream are formed by changes in pitch, duration and intensity; these are the suprasegmental features. Observation of speech shows that the pitch range expands markedly on stressed syllables. Gao Mingming studied the acoustic realization of emphatic stress in Mandarin sentences and noted that: (1) a raised pitch is the important prosodic feature of emphatic stress in Mandarin sentences; (2) pitch and duration play equally important roles in realizing emphatic stress. The relationship between them is one of trade-off and complementation.

Experience with speech synthesis has taught us that pitch is the most effective means of adjusting stress, so the main method of strengthening stress is to raise the pitch.
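As a small illustration of what "raising the pitch" means numerically, the sketch below scales the fundamental frequency contour of a stressed syllable by a number of semitones. The function name and the convention that 0 marks an unvoiced frame are assumptions made for this example; the source does not say how far the pitch is raised, and resynthesizing audio from the modified contour is outside the scope of the sketch.

```python
def raise_pitch(f0_contour_hz, semitones):
    """Scale every voiced F0 value by the given number of semitones.

    Assumption: a value of 0 marks an unvoiced frame and is left at 0.
    """
    ratio = 2 ** (semitones / 12)        # one semitone is a factor of 2^(1/12)
    return [f * ratio if f > 0 else 0.0 for f in f0_contour_hz]


# Hypothetical contour of a stressed syllable, raised by two semitones.
stressed = raise_pitch([180.0, 200.0, 0.0, 210.0], semitones=2)
```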

3) Tone (Tone and Intonation)

Besides the vowels and consonants that follow one another in time as units of sound quality, a syllable must also have a certain pitch, duration and intensity.

In some languages, pitch plays a role in the syllable that is as important as that of consonants and vowels; pitch that distinguishes the meaning of syllables is called tone. According to whether they have tone, the world's languages can be divided into tone languages and non-tone languages. Tone is the most prominent feature of Chinese and of the Sino-Tibetan languages.

In Mandarin, tone serves to form words and distinguish meaning.

Syllables with the same phonetic makeup can have different meanings when spoken with different tones. A single Mandarin syllable has four tone patterns, and in terms of speech parameters the different tones are reflected in the trajectory of the pitch frequency. Under defined rules, when a parameter of the pitch frequency trajectory exceeds a predetermined threshold, the syllable can be assigned to a tone type. On this basis, the pattern recognition method of Huang Zezhen and Yang Xingjun uses the slope, second slope, valley point and flatness of the pitch trajectory curve, which distinguish the four tones strongly; experiments show that the algorithm reaches a recognition rate of up to 99%.
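The threshold idea can be sketched with a deliberately simplified rule set. This is not the cited recognition method: it looks only at the normalized start, end and lowest point of the contour, and the two tolerance values are assumptions chosen for illustration. It assumes the usual description of the four Mandarin tones in isolation (level, rising, dipping, falling).

```python
def classify_tone(f0_hz, flat_tol=0.08, dip_tol=0.08):
    """Guess the Mandarin tone (1-4) of one syllable from its voiced F0 contour.

    Assumptions: the contour has no unvoiced gaps, and the thresholds are
    illustrative rather than tuned on data.
    """
    mean = sum(f0_hz) / len(f0_hz)
    contour = [f / mean for f in f0_hz]          # remove the speaker's pitch level
    start, end = contour[0], contour[-1]
    valley_idx = min(range(len(contour)), key=contour.__getitem__)
    valley = contour[valley_idx]
    interior = 0 < valley_idx < len(contour) - 1
    # Tone 3 (dipping): a clear valley strictly inside the syllable.
    if interior and start - valley > dip_tol and end - valley > dip_tol:
        return 3
    # Tone 1 (level): start and end are nearly the same height.
    if abs(end - start) <= flat_tol:
        return 1
    # Otherwise rising is tone 2, falling is tone 4.
    return 2 if end > start else 4
```

For instance, `classify_tone([200, 170, 150, 165, 195])` finds a valley in the middle of the syllable and returns 3.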

Lin Chan pointed out that tone information resides mainly in the main vowel (and its acoustic transitions).

Changes in tone pitch are also likely to affect duration and intensity: the falling tone (qusheng) is the shortest and strongest, the dipping tone (shangsheng) is the longest and weakest, yinping and yangping lie in between, and yangping is often longer than yinping. Tone enhancement therefore cannot simply amplify the main vowel; the different tones must be treated differently in pitch and intensity. In practice we adopt the following strategies: (1) enhance the sound intensity of the qusheng; (2) increase the duration of the shangsheng; (3) leave yinping and yangping unchanged. Figure 2 shows four acoustic curves describing the frequency characteristics of the four tones over time.

Figure 2 Acoustic characteristics of the Chinese tones

2. Method (Methodology)

The core of a digital hearing aid is the gain calculation. Based on frequency-domain processing, it establishes gain as a function of the instantaneous energy of the input in each frequency band, as shown in Figure 3. For each frequency band, the instantaneous energy is accumulated into a short-term energy and a slow long-term average, which provide the data needed for signal identification and classification.

Where:

(1) E_j(n) = a E_j(n-1), where a is a time constant.

(2) The fundamental frequency is extracted with a cepstral algorithm: 512-point FFT, 40 ms Hamming window, 10 ms window shift.

(3) The fundamental frequency measurements of each syllable are smoothed with a simple moving average, excluding points that deviate too far from the mean over the smoothing period.

(4) Pitch and tone duration are normalized.

(5) The pitch trajectory is approximated by a quadratic curve in the minimum mean square error sense.

The slope, second slope, valley point and flatness of the curve are then calculated.
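Steps (3) to (5) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the implementation described in the article: the outlier exclusion of step (3) is omitted, the slope is taken at the syllable midpoint, and "flatness" is taken to be the range of the normalized contour, since the source does not define it. All function names are hypothetical. At least three contour points are required.

```python
def moving_average(values, width=3):
    """Simple centred moving average; the window shrinks at the edges."""
    half = width // 2
    return [sum(values[max(0, i - half): i + half + 1])
            / len(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def fit_quadratic(t, y):
    """Least-squares fit y ~ a*t^2 + b*t + c via the 3x3 normal equations."""
    s = [sum(ti ** k for ti in t) for k in range(5)]            # sums of t^0..t^4
    r = [sum(yi * ti ** k for ti, yi in zip(t, y)) for k in range(3)]
    m = [[s[0], s[1], s[2], r[0]],                              # unknowns: c, b, a
         [s[1], s[2], s[3], r[1]],
         [s[2], s[3], s[4], r[2]]]
    for col in range(3):                                        # Gauss-Jordan elimination
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [x - factor * p for x, p in zip(m[row], m[col])]
    c, b, a = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def trajectory_features(f0_hz):
    """Smooth, normalize and fit one syllable's F0 contour; return its features."""
    smoothed = moving_average(f0_hz)
    mean = sum(smoothed) / len(smoothed)
    y = [f / mean for f in smoothed]                   # pitch normalized by its mean
    t = [i / (len(y) - 1) for i in range(len(y))]      # duration normalized to 0..1
    a, b, c = fit_quadratic(t, y)
    return {
        "slope": a + b,                                # dy/dt at the midpoint t = 0.5
        "second_slope": 2 * a,                         # constant second derivative
        "valley": -b / (2 * a) if a > 0 else None,     # position of the minimum, if any
        "flatness": max(y) - min(y),                   # assumed definition
    }
```

Normalizing time to the range 0 to 1 makes the features comparable across syllables of different length, which is one reading of step (4).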

The algorithm is implemented in assembly language based on the TOCCATA instruction set.

The A/D converter is 14-bit, and the sampling rate is set to 32 kHz.
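A few quantities follow directly from these settings and from the 40 ms window and 10 ms shift quoted for the pitch analysis. The short calculation below is only a worked example; the dynamic range figure is the ideal value for a 14-bit quantizer, not a measured property of the hardware.

```python
import math

SAMPLE_RATE_HZ = 32_000
ADC_BITS = 14

nyquist_hz = SAMPLE_RATE_HZ / 2                          # highest representable frequency
window_samples = round(0.040 * SAMPLE_RATE_HZ)           # 40 ms analysis window
hop_samples = round(0.010 * SAMPLE_RATE_HZ)              # 10 ms window shift
ideal_dynamic_range_db = 20 * math.log10(2 ** ADC_BITS)  # about 6.02 dB per bit

print(nyquist_hz, window_samples, hop_samples, round(ideal_dynamic_range_db, 1))
```

At the full 32 kHz rate a 40 ms window holds more samples than a 512-point FFT, so the pitch analysis presumably runs on a shortened or downsampled frame; the source does not say which.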

Figure 3. Chinese speech enhancement system diagram

1) Speech segmentation (Classifications of Phonemes)

Speech sound is made up of four components: sound quality (timbre), pitch, intensity and duration. The four play different roles in speech but occur together at the same time.

• Segmental (sound quality) components: the vowels and consonants that make up syllables

• Suprasegmental components: pitch, intensity and duration, which attach to a syllable or segment.

In acoustic terms, pitch is determined by the fundamental frequency, intensity by the amplitude, and duration by how long the sound lasts.

2) Processing principle (Algorithm Principles)

Chinese speech processing is mainly reflected in the following:

• Fitting process

In the fitting process, the long-term average spectrum of Chinese speech is taken into account as a weighting, raising the target curve in the speech frequency range, which strengthens speech understanding.

• Hearing aid processing

In the hearing aid's signal processing, the compression controller is specially set so that the attack and release times of high-frequency signal compression are very short, which makes consonants clearer and improves the user's understanding of speech.

• Noise reduction strategy

For noise reduction, a strategy optimized for Chinese speech was developed from sampling and analysis of Chinese speech in noisy environments.

Experiments confirm that this strategy can improve the SNR by 18 dB.
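The role of the attack and release times in the compression controller can be illustrated with a generic single-band compressor. This is a textbook-style sketch, not the Intelligia algorithm: the threshold, ratio and time constants are assumed values, and the envelope is tracked with a first-order recursive average whose coefficient switches between attack and release.

```python
import math


def smoothing_coeff(time_ms, sample_rate_hz):
    """Convert a time constant in milliseconds into a one-pole coefficient."""
    return math.exp(-1.0 / (time_ms * 1e-3 * sample_rate_hz))


def compress(samples, sample_rate_hz=32_000, threshold_db=-30.0, ratio=3.0,
             attack_ms=1.0, release_ms=20.0):
    """Apply level-dependent gain reduction above the threshold (assumed settings)."""
    a_att = smoothing_coeff(attack_ms, sample_rate_hz)
    a_rel = smoothing_coeff(release_ms, sample_rate_hz)
    env = 0.0
    out = []
    for x in samples:
        level = abs(x)
        a = a_att if level > env else a_rel       # rise quickly, fall more slowly
        env = a * env + (1 - a) * level           # recursive envelope estimate
        level_db = 20 * math.log10(max(env, 1e-9))
        over_db = max(0.0, level_db - threshold_db)
        gain_db = -over_db * (1 - 1 / ratio)      # static compression curve
        out.append(x * 10 ** (gain_db / 20))
    return out
```

With short attack and release times the gain recovers quickly after a loud vowel, so a weak consonant that follows is not left attenuated, which is the effect on consonant clarity that the text describes.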

3. Application of Chinese speech processing technologies in hearing aids

The following is a specific example of Chinese speech technology applied to hearing aid design.

This technology uses the world's most advanced DSP technology, including a low-power digital chip.

TOCCATA digital signal processing system

The Toccata TM system is a miniature, ultra-low-power, high-efficiency digital signal processing system.

It includes a high-fidelity weighted overlap-add (WOLA) filter bank, a 16-bit DSP core, two 14-bit A/D converters, a 14-bit D/A converter and other peripherals. Toccata TM technology provides a standard software-programmable DSP development platform and is manufactured as a miniature VLSI device in a 0.18 µm process. It serves not only manufacturers of audio processing systems but also the development of other miniature, low-power DSP products. [4]

A. Hardware architecture (Hardware Structure)

Figure 4 Hardware system diagram

The TOCCATA system consists of three chips: an analog chip (ALPHA), a digital chip (DELTA), and an E2PROM chip for non-volatile storage.

ALPHA chip

The ALPHA chip includes the input and output amplifiers, two A/D converters, a D/A converter, and the main clock and power supply system.

DELTA chip

The DELTA chip includes a 16-bit software-programmable DSP core, a WOLA filter coprocessor, a DMA controller (the input/output processor, or IOP) and memory (RAM and ROM).

The combination of a programmable core and a flexible filter bank allows the signal processing to be changed in software. The structure can thus run the processing schemes of traditional audio systems (such as dual-channel compression) and, through the DSP core, more powerful schemes as well (such as compression in 16 or more channels, noise reduction and feedback suppression).

DSP core and instruction system (DSP Core)

RCORE is a flexible DSP core with a Harvard architecture, single-cycle multiply-accumulate operation and a 40-bit accumulator.

Peripherals are accessed through extended registers, memory-mapped registers and a shared memory system.

Signal path

Figure 5. Signal path provided by the Toccata system

• Structure of the Intelligia digital hearing aid

The Intelligia fully digital hearing aid is designed around the technical characteristics of the chip set introduced above; its structure is shown in Figure 6.

Like analog hearing aids, digital hearing aids use a microphone and a receiver as transducers, but the digital signal processor, through A/D sampling, converts the electrical signal into digital code. The digital code can be used very flexibly to provide gain, shape the frequency response, or carry out other processing that the patient's listening needs call for. When the DSP algorithm has finished, the digital code is converted back to an electrical signal by the D/A converter, and the receiver converts it to sound.

The key to a digital hearing aid lies in its information processing system. Building on the Toccata TM digital signal processing system, we developed the Intelligia digital hearing aid with unique Chinese speech processing features.

In the Intelligia design, the signal is decomposed by the filter bank into 16 bands, which are grouped into 10 channels. Each channel independently performs input automatic gain control (AGCi) and signal compression. Each channel uses two time detectors: a fast detector monitors rapid signal changes and a slow detector tracks slow ones. The compression attack and release time constants are chosen to match syllable changes and the variations of Chinese speech, to achieve a better auditory effect.
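One common way to combine a fast and a slow level detector is sketched below. The source does not describe how Intelligia combines its two detectors, so the switching rule here (follow the fast detector only when it exceeds the slow one by a margin) is an assumption, as are the time constants and the 6 dB margin.

```python
import math


def one_pole_envelope(samples, time_ms, sample_rate_hz):
    """Track the signal magnitude with a first-order recursive average."""
    a = math.exp(-1.0 / (time_ms * 1e-3 * sample_rate_hz))
    env, out = 0.0, []
    for x in samples:
        env = a * env + (1 - a) * abs(x)
        out.append(env)
    return out


def dual_detector_level(samples, sample_rate_hz=32_000,
                        fast_ms=2.0, slow_ms=200.0, jump_db=6.0):
    """Use the slow detector normally and the fast one on sudden level jumps."""
    fast = one_pole_envelope(samples, fast_ms, sample_rate_hz)
    slow = one_pole_envelope(samples, slow_ms, sample_rate_hz)
    jump = 10 ** (jump_db / 20)
    return [f if f > s * jump else s for f, s in zip(fast, slow)]
```

The resulting level estimate would then drive the gain calculation of the channel: steady speech is governed by the slow detector, while a sudden loud sound is caught by the fast one.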

Full technical characteristics of the Intelligia ™ digital hearing aid

• Chinese speech signal processing: Based on an in-depth study of the perceptual characteristics of Chinese and other tone languages, we built our original Chinese speech processing technology into Intelligia ™, which significantly increases listening intelligibility in a Chinese language environment.

• Faster: Intelligia ™ uses TOCCATA, a third-generation digital processing system designed specifically for digital hearing aids, whose powerful computing capacity lets Intelligia ™ handle a wide variety of speech signals.

• Energy saving: Intelligia ™ draws less than 1 mA of operating current and automatically enters power-saving mode when there is no input signal; this low energy consumption spares the wearer from frequent battery changes.

• Fully programmable: Using its programmability, Intelligia ™ can be configured with the listening compensation programs and parameters best suited to the hearing-impaired user, ensuring that the wearer gets the best listening result.

• Multi-channel independent compression: Intelligia ™ divides incoming sound by frequency into several bands and channels and processes the signal in each band and channel differently, so that the wearer hears clearer and more realistic sound.

• Noise reduction: Intelligia ™ effectively suppresses environmental noise and improves speech recognition, so that the wearer can hear voices clearly whether in a noisy street or a noisy supermarket.

• Directional processing: Intelligia ™ can be fitted with a directional microphone system and the appropriate software for better noise reduction, so that the wearer hears clearer and more natural sound.

• Acoustic feedback suppression: A hearing aid may whistle in use, a phenomenon known as acoustic feedback. Intelligia ™ uses acoustic feedback suppression techniques to suppress it effectively, so that the sound the wearer hears is more comfortable.

• Easily upgraded: Because Intelligia ™ is built on a fully open digital signal processing (DSP) technology platform, TOCCATA provides programmability, full adaptability and upgrade capability, so the wearer can enjoy the latest features immediately simply by using our software.

The following is the specification of the Chinese speech processing:

Table 1 Comparison of Intelligia ™ Chinese speech technology hearing aids with other hearing aid technologies

In the laboratory, preliminary experimental results with the Intelligia digital hearing aid with Chinese speech enhancement show that Chinese speech processing technologies help native Chinese-speaking patients understand speech better and improve their level of rehabilitation.

In clinical use, patients wearing the Intelligia hearing aid report that it works well, especially in enhancing speech intelligibility in noisy environments. In a sense, patients feel that their ability to understand speech has improved. Of course, we must be aware that the application of Chinese speech processing technologies in digital hearing aids is still at an early research stage. The author suggests that hearing aid scientists and audiology experts pursue further study in the following areas:

• Conduct in-depth comparative studies of English-based and Chinese-based speech processing technologies, especially in noisy environments, observing how the two technologies handle the speech of the two languages. Ideally, bilingual volunteers should take part in the experiments.

• Study Chinese speech processing technologies in combination with the non-linear hearing aid fitting methods now in use, observing whether fitting formulas developed for English, when supported by Chinese speech processing technologies, are more effective in helping native Chinese-speaking patients improve speech understanding in daily life.

• Chinese speech processing is currently one of the leading topics in man-machine dialogue, and its algorithms are complex and diverse. We should research hearing aid algorithms with Chinese characteristics in greater depth and make full use of the enormous potential of the digital chip.

Applying Chinese speech processing technologies to listening devices is just the beginning; it is a very complex topic involving many unresolved technical issues.

However, I believe that only hearing aids developed for the phonetic characteristics of Chinese can more effectively help the many native Chinese speakers with residual hearing.

References

1. Picheny, M., Durlach, N., and Braida, L. (1985). Speaking clearly for the hard of hearing. I: Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research 28:96-103.

2. Chang, Reflections on speech perception, Chinese Science, 1978; 5: 519-530.

3. Chang, Chi Shih-Chang, and Lu Shinan, The perceptual structure of Chinese consonants, Psychology Press, 1981; 1: 76-85.

4. Kates, J.M. (1984). Speech intelligibility enhancement. U.S. Patent 4,454,609.

5. Du Limin et al., "Automatic extraction of Chinese stop consonant features by a selective wavelet transform method", Acta Acustica, Vol. 21, No. 6.

6. Xu Jieping, Chu Min, He Lin, and Lu Shinan, "The effect of sentence stress on pitch and duration in Chinese", Acta Acustica, Vol. 25, No. 4.

7. Yang Xingjun and Zhao Kaijiang, "Recognition of the four tones of Mandarin isolated words", third collection of papers on speech, communication and image processing.

8. Huang Zezhen and Yang Xingjun, "A pattern recognition method for the four tones of Mandarin isolated words", Acta Acustica, Vol. 15, No. 1.

9. Lin Chan, "A perceptual study of tone distribution areas in the Beijing dialect", journal, Vol. 20, No. 6.

10. Wang, Chao (2000). Prosodic Modeling for Improved Speech Recognition and Understanding. Doctoral dissertation.

11. Tsai, Ming-yi, Chou, Fu-chiang, and Lee, Lin-shan (2000). Pronunciation Variation Analysis with respect to Various Linguistic Levels and Contextual Conditions for Mandarin Chinese.

12. Bo Xu, Bing Ma, Shuwu Zhang, Fei Qu, and Taiyi Huang (2000). Speaker-independent Dictation of Chinese Speech with 32K Vocabulary.


Wednesday, December 15, 2010
