DIARCA: A Component Approach to Voice Recognition

Juan-Carlos Díaz-Martín, Juan-Luis García Zapata, José Manuel Rodríguez García, José F. Álvarez Salgado, Pablo Espada Bueno, Pedro Gómez Vilda

2001

Abstract

Current voice recognition systems tend to be implemented as a PC desktop facility. This model is not suited to the growing complexity of present and future developments: it is single-user, it is not portable, and it assumes the workstation model, in which all CPU resources are supposed to be locally available. This work investigates how a high-performance speech recognition system can be redesigned and implemented as a time-critical network service shared through ordinary data transmission media, with three main design goals: scalability, predictability and POSIX portability. The idea has been tested by rebuilding IVORY, a well-known robust desktop voice recognition methodology, as a distributed component.

1. Introduction

While speech processing and recognition is a field experiencing rapid and promising expansion, the operating-system environments for the desktop PC still typically lack true real-time support. To overcome this limitation, current speech recognition systems rely on the workstation principle: all CPU resources are always available to the application in which they are embedded. This approach has one main limitation: its growing computational complexity. IVORY ([1], [7]), a stand-alone speech recognition system for isolated words, has a computational complexity of around 21 Mflop/s. Though this load is easily handled by current CPUs, continuous speech can raise the computing power demand by one order of magnitude. Noise cancellation demands up to five or six times the power of the recognition itself. Furthermore, new applications of speech processing demand much more computing power. For instance, tracking a single speaker with the microphone array technique has a computational complexity near 166 Mflop/s ([6]). Though today's PC microprocessors claim peak execution rates exceeding 1 Gflop/s, regular DSP algorithms rarely achieve such high performance.
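The figures quoted above can be combined into a rough compute budget. The following is only a back-of-the-envelope sketch: the constants come from the text, but the way they are combined is an assumption of this example, not a measurement. In particular, it assumes that noise cancellation costs five to six times whichever recognition load it accompanies, and that the two loads simply add. The names `total_load` and `budget` are hypothetical helpers introduced for illustration.

```python
# Figures quoted in the text (Mflop/s unless noted).
ISOLATED_WORD_MFLOPS = 21.0        # IVORY, isolated-word recognition
CONTINUOUS_FACTOR = 10.0           # "one order of magnitude" for continuous speech
NOISE_CANCEL_FACTORS = (5.0, 6.0)  # noise cancellation: 5x to 6x the recognition load
MIC_ARRAY_MFLOPS = 166.0           # single-speaker tracking with a microphone array
PEAK_MFLOPS = 1000.0               # claimed peak rate of a PC microprocessor (1 Gflop/s)


def total_load(recognition_mflops, noise_factor):
    """Recognition plus noise cancellation.

    ASSUMPTION: the two loads are additive and noise cancellation
    scales with the recognition load it accompanies.
    """
    return recognition_mflops * (1.0 + noise_factor)


def budget():
    """Return {scenario: (low, high)} load estimates in Mflop/s."""
    scenarios = {
        "isolated": ISOLATED_WORD_MFLOPS,
        "continuous": ISOLATED_WORD_MFLOPS * CONTINUOUS_FACTOR,
    }
    low_factor, high_factor = NOISE_CANCEL_FACTORS
    return {
        name: (total_load(rec, low_factor), total_load(rec, high_factor))
        for name, rec in scenarios.items()
    }


if __name__ == "__main__":
    for name, (low, high) in budget().items():
        share = high / PEAK_MFLOPS
        print(f"{name}: {low:.0f} to {high:.0f} Mflop/s "
              f"(up to {share:.0%} of the claimed peak)")
```

Under these assumptions, continuous speech with noise cancellation already exceeds the claimed 1 Gflop/s peak, before any microphone array processing is added and before accounting for the gap between peak and sustained DSP performance noted above.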
In our view, desktop speech processing is, and will always be, strongly limited by its computational complexity, which today is constrained to the computing power of the average personal computer. Distributed computing should change this scenario. Ongoing developments in component-based software engineering make it possible to envision a remote service of DSP computing power for speech processing. Such a service would bring the most advanced developments in the field both to the current desktop PC and to future internet appliances. This work investigates the distribution of speech recognition in the context of DIARCA, a research project whose aim is twofold: first, to distribute IVORY with three design goals (scalability, predictability and POSIX portability); second, to extend the results in order to support microphone array developments. This work addresses the first goal.

Figure 1. IVORY: a Robust Voice Recognition system

This work was funded by CICYT and Junta de Extremadura under the TIC99-0609 (DIARCA) and CICYTEX IPRR98A039 projects respectively.
