Rank-frequency distribution of natural languages: a difference of probabilities approach.
2018
The time variation of the rank $k$ of words in six Indo-European languages is obtained using data from Google Books. At low ranks the languages behave differently, possibly because of syntactic rules, whereas for $k>50$ the law of large numbers predominates. The dynamics of $k$ are described stochastically through a master equation governing the time evolution of its probability density, which is approximated by a Fokker-Planck equation that is solved analytically. The difference between the data and the asymptotic solution is identified with the transient solution, and good agreement is obtained.
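
As a rough illustration of the step from a master equation to a Fokker-Planck equation, the sketch below uses a generic one-step (nearest-neighbour) process. The abstract does not state the transition rates, so the upward and downward rates $w_+(k)$ and $w_-(k)$, the drift $A(k)$, the diffusion $B(k)$, and the symbols $P_s$ and $P_{tr}$ are all assumptions introduced here for illustration only, not the paper's own model.

```latex
% Assumed one-step master equation for the probability P(k,t) of rank k
% (rates w_+ and w_- are hypothetical, not taken from the source):
\frac{\partial P(k,t)}{\partial t}
  = w_+(k-1)\,P(k-1,t) + w_-(k+1)\,P(k+1,t)
  - \bigl[w_+(k) + w_-(k)\bigr]\,P(k,t)

% Treating k as continuous and Taylor-expanding to second order
% gives a Fokker-Planck equation:
\frac{\partial P(k,t)}{\partial t}
  = -\frac{\partial}{\partial k}\bigl[A(k)\,P(k,t)\bigr]
  + \frac{1}{2}\,\frac{\partial^2}{\partial k^2}\bigl[B(k)\,P(k,t)\bigr],
\qquad
A(k) = w_+(k) - w_-(k), \quad B(k) = w_+(k) + w_-(k)

% Splitting into asymptotic (stationary) and transient parts:
P(k,t) = P_s(k) + P_{tr}(k,t),
\qquad
P_{tr}(k,t) \to 0 \ \text{as } t \to \infty
```

Under these assumptions, the difference between the data and $P_s(k)$ plays the role of the transient term $P_{tr}(k,t)$, which is the identification the abstract describes.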