Towards a hybrid NLG system for Data2Text in Portuguese

J. C. R. Pereira,António J. S. Teixeira,Joaquim Sousa Pinto

Towards a hybrid NLG system for Data2Text in Portuguese

2015

In many new interactions with machines, such as dialogue or output using voice, there is the need to convert information internal to a system into sentences, using Data2Text systems. Trying to avoid the limitations of template-based and classical NLG methods, systems based on automatic translation have been proposed in recent years. Despite providing sentences with the important variability needed for a better interaction, this doesn't come without a cost. Contrary to template-based, these systems produce sentences with heterogeneous quality. In this paper we proposed to combine a translation based NLG system with a classifier module capable of providing information on the Intelligibility or Quality of the sentences. Sentences marked as unacceptable are replaced by template-based generated ones. This classifier module is the main focus of the paper and combines extraction of linguistic features with a classifier trained in a manually annotated corpus. Results suggest that our approach is valid as best results obtained have false positives below 8% and this metric can be even lower in practical applications, decreasing to around 3%, as the generation module produces low quality sentences at a rate lower than 30%.

Keywords:

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations