Persian Language Model based on BiLSTM Model on COVID-19 Corpus

Moein Salimi Sartakhti, Mohammad Javad Maleki Kahaki, Seyed Vahid Moravvej, Maedeh Javadi Joortani, Alireza Bagheri

2021

Coronavirus disease 2019 (COVID-19) appeared in China in 2019. It spread rapidly across the world and caused many deaths, and it has recently become one of the most active research areas. In this paper, we build a language model (LM), which estimates the probability of a given sequence of words occurring in a sentence. Applications of LMs include machine translation, question answering, and spell checking. We apply long short-term memory (LSTM) networks to language modeling of Persian, using both unidirectional and bidirectional LSTM (BiLSTM) models to capture contextual information, and we compare their results. Our experiments demonstrate how different LSTM language models perform. A two-layer BiLSTM is the best language model for Persian COVID-19 news. The corpus contains 10,000 news articles about COVID-19 and more than 2,100,000 words, and was provided by the Lobkalam system.
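The idea of a language model assigning a probability to a word sequence can be illustrated with a minimal sketch. This is not the paper's method: as a stated assumption, a toy add-one-smoothed bigram model stands in for the LSTM, and the English toy corpus, the helper names (`train_bigram`, `sentence_log_prob`, `perplexity`), and the `<s>`/`</s>` boundary tokens are all hypothetical. An LSTM LM would replace the count-based conditional probability with a learned one, but the chain-rule scoring and the perplexity measure have the same shape.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (toy stand-in for an LSTM LM)."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]  # hypothetical boundary tokens
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def sentence_log_prob(tokens, context_counts, bigram_counts, vocab):
    """Chain rule under a bigram assumption: sum of log P(w_i | w_{i-1}), add-one smoothed."""
    padded = ["<s>"] + tokens + ["</s>"]
    v = len(vocab)
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        prob = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
        total += math.log(prob)
    return total

def perplexity(tokens, context_counts, bigram_counts, vocab):
    """Exponentiated average negative log-probability per predicted token."""
    n_predictions = len(tokens) + 1  # every word plus the end-of-sentence token
    log_prob = sentence_log_prob(tokens, context_counts, bigram_counts, vocab)
    return math.exp(-log_prob / n_predictions)

# Toy corpus, invented for illustration only (not from the Lobkalam data).
corpus = [
    ["the", "virus", "spread", "quickly"],
    ["the", "virus", "caused", "many", "deaths"],
    ["the", "vaccine", "spread", "hope"],
]
model = train_bigram(corpus)

seen = ["the", "virus", "spread", "quickly"]
shuffled = ["quickly", "spread", "virus", "the"]
seen_score = sentence_log_prob(seen, *model)
shuffled_score = sentence_log_prob(shuffled, *model)
```

In this sketch the word order seen in training receives a higher log-probability, and therefore a lower perplexity, than the shuffled order, which is the property that makes such a model useful for tasks like spell checking.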
