Measuring Vibrations from Video Feeds
2017
By using a high-speed camera, researchers at MIT in 2014 were able to recover human speech from videos of minute vibrations of objects in a room. For example, in one experiment a 2,200 fps camera was positioned outside a room behind sound-proof glass, videoing an empty crisp packet on the floor inside the room, while a researcher shouted “Mary had a little lamb” at the crisp packet. By detecting minute oscillations of the crisp packet of 1 μm (0.001 mm), and using hours of computer processing, a ten-second audio clip could be produced that was recognisably “Mary had a little lamb” in an American accent.
The purpose of this study group was to investigate whether this technique could be used in practice, with emphasis on the recovery of intelligible speech from a video feed of a room. During the week, the group investigated several aspects of the problem, including:
• how much an object vibrates due to sound;
• what can be done to maximize the vibration;
• how the MIT technique detects minute vibrations in videos;
• what affects the quality of the resulting recording; and
• how good a recording is needed for intelligible speech.
It was discovered that the MIT experiments would not have recovered intelligible speech from an ordinary conversation; their success depended on loud sounds and prior knowledge of “Mary had a little lamb”. Camera vibrations were also ignored by MIT; these are expected to be significant, but the technique could be adapted to be resilient to them. Other possibilities for enhancing their technique, by exploiting resonances or reflections, are discussed in the report. A high-speed low-noise camera is essential, and any existing video footage (such as from CCTV) is unlikely to be of sufficient quality. Further experiments with high-end high-speed cameras are needed to assess the feasibility of the technique in practice.
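As a rough illustration of why frame rate matters here, the following minimal sketch compares the Nyquist limit (half the sampling rate) of the 2,200 fps camera mentioned above with that of an ordinary video feed. It assumes one uniform sample per frame, which is a simplification and not the report's own analysis. The 25 fps CCTV rate and the 300 to 3,400 Hz telephone-style speech band are illustrative assumptions, not figures from the report, and the function names are hypothetical.

```python
def nyquist_hz(frame_rate_fps: float) -> float:
    """Highest vibration frequency (Hz) representable without aliasing,
    assuming exactly one uniform sample per video frame."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return frame_rate_fps / 2.0


# 2,200 fps is the rate quoted for the experiment described above.
# 25 fps is an ASSUMED, illustrative CCTV rate, not taken from the report.
CAMERAS_FPS = {
    "high-speed camera (2,200 fps)": 2200.0,
    "assumed CCTV feed (25 fps)": 25.0,
}

# ASSUMED telephone-style speech band in Hz, used for comparison only.
SPEECH_BAND_HZ = (300.0, 3400.0)


def covered_fraction(frame_rate_fps: float, band=SPEECH_BAND_HZ) -> float:
    """Fraction of the given frequency band lying below the Nyquist limit."""
    low, high = band
    limit = nyquist_hz(frame_rate_fps)
    covered = max(0.0, min(limit, high) - low)
    return covered / (high - low)


if __name__ == "__main__":
    for name, fps in CAMERAS_FPS.items():
        print(f"{name}: Nyquist limit {nyquist_hz(fps):.1f} Hz, "
              f"covers {covered_fraction(fps):.0%} of the assumed speech band")
```

Under these assumptions the 25 fps feed reaches none of the assumed speech band, which is consistent with the remark that existing footage such as CCTV is unlikely to be of sufficient quality. Sensor noise and camera vibration, which the report also identifies as important, are not modelled in this sketch.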