A Hybrid Bi-LSTM-CRF Model for Sequence Labeling Applied to the Sourcing Domain

2020 
In many sectors, companies must process large volumes of textual customer requests. Automating the extraction of information such as key phrases from these requests can speed up their processing. Silex France currently faces this challenge in the context of processing sourcing requests. In this article, we share our sequence labeling results obtained with a hybrid Bi-LSTM-CRF method in an industrial context. This work was integrated into the B2B Silex platform for service provider recommendation. Experiments on B2B Silex platform data show that, with a good choice of extracted features and well-tuned hyper-parameters, the combination of Bi-LSTM and CRF achieves good results even with a small amount of data. The textual content processed consists of complete sentences written by users and is therefore subject to typing errors. To handle this type of data, we combine several types of extracted features describing the textual content: (i) semantics, (ii) syntax, (iii) word characters, and (iv) word position.
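To make the feature types listed above more concrete, the following is a minimal Python sketch of per-token feature extraction for sequence labeling. It is an illustration only, not the feature set used in the paper: the function names `token_features` and `sentence_features`, the specific features chosen, and the example sentence are all assumptions. Character-level and positional features are computed directly; syntactic features are assumed to arrive as optional part-of-speech tags from an external tagger, and semantic features (for example, word embeddings) are left out because they depend on a pretrained model.

```python
from typing import Dict, List, Optional, Sequence, Union

Feature = Union[str, int, float, bool]


def token_features(
    tokens: Sequence[str],
    i: int,
    pos_tags: Optional[Sequence[str]] = None,
) -> Dict[str, Feature]:
    """Build a feature dictionary for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    n = len(tokens)
    feats: Dict[str, Feature] = {
        # Word-character features: partially robust to typing errors,
        # since prefixes and suffixes often survive a misspelling.
        "lower": word.lower(),
        "prefix3": word[:3].lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "has_digit": any(c.isdigit() for c in word),
        # Positional features: absolute and relative position in the sentence.
        "index": i,
        "rel_pos": i / (n - 1) if n > 1 else 0.0,
        "is_first": i == 0,
        "is_last": i == n - 1,
    }
    # Syntactic feature: only added when tags from an external tagger are supplied.
    if pos_tags is not None:
        feats["pos"] = pos_tags[i]
    return feats


def sentence_features(
    tokens: Sequence[str],
    pos_tags: Optional[Sequence[str]] = None,
) -> List[Dict[str, Feature]]:
    """Extract one feature dictionary per token of a sentence."""
    return [token_features(tokens, i, pos_tags) for i in range(len(tokens))]


if __name__ == "__main__":
    # Invented example request, not taken from the Silex data.
    for feats in sentence_features(["Need", "3", "welders", "in", "Lyon"]):
        print(feats)
```

In a Bi-LSTM-CRF pipeline, features of this kind would typically be encoded numerically and concatenated with word embeddings before being fed to the recurrent layer; how the paper combines them is not specified in this abstract.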