A speech corpus (or spoken corpus) is a database of speech audio files and text transcriptions. In speech technology, speech corpora are used, among other things, to create acoustic models (which can then be used with a speech recognition engine). In linguistics, spoken corpora are used for research in phonetics, conversation analysis, dialectology and other fields.
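
As a rough illustration of the audio-plus-transcription pairing described above, the following Python sketch loads a corpus from disk. It assumes a hypothetical layout in which each `.wav` file sits next to a `.txt` file of the same name holding its transcription. Real corpora vary widely in layout and metadata, and the names `Utterance` and `load_corpus` are invented for this example.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Utterance:
    audio_path: Path   # speech audio file
    transcript: str    # text transcription of that audio


def load_corpus(root: Path) -> list[Utterance]:
    """Pair every .wav file under root with a same-named .txt transcript.

    Assumed layout (hypothetical): foo.wav and foo.txt side by side.
    """
    utterances = []
    for audio_path in sorted(root.rglob("*.wav")):
        text_path = audio_path.with_suffix(".txt")
        if text_path.exists():  # skip audio that has no transcript
            transcript = text_path.read_text(encoding="utf-8").strip()
            utterances.append(Utterance(audio_path, transcript))
    return utterances
```

A list of such pairs is the kind of input an acoustic-model training pipeline or a phonetic study would start from, although in practice corpora usually also carry speaker, timing and annotation metadata.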