Evaluation of a Silent Speech Interface Based on Magnetic Sensing and Deep Learning for a Phonetically Rich Vocabulary.
2017
To help people who have lost their voice following total laryngectomy,
we present a speech restoration system that produces
audible speech from articulator movement. The speech articulators
are monitored by sensing changes in magnetic field caused
by movements of small magnets attached to the lips and tongue.
Then, articulator movement is mapped to a sequence of speech
parameter vectors using a transformation learned from simultaneous
recordings of speech and articulatory data. In this work,
this transformation is performed using a type of recurrent neural
network (RNN) with fixed latency, which is suitable for real-time
processing. The system is evaluated on a phonetically rich
database with simultaneous recordings of speech and articulatory
data made by non-impaired subjects. Experimental results
show that our RNN-based mapping obtains more accurate
speech reconstructions (evaluated using objective quality metrics
and a listening test) than articulatory-to-acoustic mappings
using Gaussian mixture models (GMMs) or deep neural networks
(DNNs). Moreover, our fixed-latency RNN architecture
provides comparable performance to an utterance-level batch
mapping using bidirectional RNNs (BiRNNs).
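To illustrate the fixed-latency idea described above, here is a minimal sketch of a unidirectional RNN that emits each speech-parameter vector a constant number of frames after the corresponding articulatory input, rather than waiting for the whole utterance as a bidirectional RNN would. All dimensions, weights, and the `LOOKAHEAD` parameter are illustrative assumptions, not the authors' trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper):
# sensor channels in, speech-parameter coefficients out.
N_IN, N_HID, N_OUT = 9, 16, 25
LOOKAHEAD = 5  # fixed latency: emit frame t after seeing input t + LOOKAHEAD

# Randomly initialised weights stand in for a trained mapping.
W_xh = rng.normal(0, 0.1, (N_HID, N_IN))
W_hh = rng.normal(0, 0.1, (N_HID, N_HID))
W_hy = rng.normal(0, 0.1, (N_OUT, N_HID))

def fixed_latency_rnn(frames):
    """Map articulatory frames to speech-parameter vectors.

    The output for frame t is produced once input frame
    t + LOOKAHEAD has been consumed, so the delay is a constant
    LOOKAHEAD frames instead of growing with utterance length.
    """
    h = np.zeros(N_HID)
    outputs = []
    for t, x in enumerate(frames):
        h = np.tanh(W_xh @ x + W_hh @ h)
        if t >= LOOKAHEAD:
            # the state now reflects LOOKAHEAD frames of context
            # beyond frame t - LOOKAHEAD, whose output is emitted here
            outputs.append(W_hy @ h)
    # flush the last LOOKAHEAD outputs once the utterance ends
    for _ in range(min(LOOKAHEAD, len(frames))):
        h = np.tanh(W_hh @ h)
        outputs.append(W_hy @ h)
    return np.array(outputs)

utt = rng.normal(size=(100, N_IN))  # 100 articulatory frames
y = fixed_latency_rnn(utt)
print(y.shape)  # one speech-parameter vector per input frame
```

In a streaming setting this loop would run frame by frame as sensor data arrives, which is what makes the architecture suitable for the real-time processing the abstract mentions.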