Stochastic modelling of scientific terms distribution in publications

2006 
In this paper, we address the problem of automatic keywords assignment to scientific publications. The idea to use textual traces learned from training data in a supervised manner to identify appropriate keywords is considered. We introduce the transparent concept of identification cloud as a means to represent the semantics of scientific terms. This concept is mathematically defined by models of scientific terms stochastic distributions over publication texts. Characteristics of models as well as procedures for both non-parametric and parametric estimation of probability distributions are presented.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    20
    References
    2
    Citations
    NaN
    KQI
    []