Temporal Correlation Analysis of Programming Language Popularity

2019 
Based on the premise that programming languages interact with one another while their popularities change over time, we describe a technique for extracting latent features from the popularities of programming languages. We constructed a matrix in which each column consisted of a time series of partial correlation coefficients between the popularities of different languages. For the analysis, we used non-negative matrix factorization (NMF) to factorize the matrix into matrices of temporal modes and mixture components. We found that the matrix was optimally factorized with three temporal modes, and that the factorization results were largely independent of the factorization algorithm. Consistent with NMF learning a parts-based representation of the matrix, the sparse temporal modes revealed distinct patterns of correlation strength over time. By analyzing the NMF results, we show that the most popular languages, Java, C, and C++, become more correlated as time passes, and that the recent similar trends in the popularities of Java and C can be explained by the positive correlation between the two in the later period. These and other characteristics of the popularity explained by NMF may provide clues to understanding the evolution of the popularity of programming languages.
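As an illustration of the factorization step only, the following minimal sketch factorizes a non-negative matrix whose rows are time points and whose columns are language pairs into temporal modes and mixture components. Everything in it is an assumption rather than the method of the paper: the input is synthetic stand-in data instead of partial correlations of real popularity series, the input is taken to be non-negative already (the abstract does not say how negative coefficients were handled), and the Lee-Seung multiplicative update rule is just one common NMF algorithm. The names nmf_multiplicative, V, W, and H are hypothetical.

```python
import numpy as np


def nmf_multiplicative(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (time x pairs) as W @ H.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    W holds the temporal modes (time x rank); H holds the mixture
    components (rank x pairs). This is an illustrative choice of algorithm.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # eps in the denominators guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for the correlation matrix (assumption: not real data).
rng = np.random.default_rng(42)
n_times, n_pairs, n_modes = 60, 10, 3
t = np.linspace(0.0, 1.0, n_times)
true_modes = np.column_stack([
    np.exp(-5.0 * t),                 # strength concentrated early
    np.exp(-50.0 * (t - 0.5) ** 2),   # strength concentrated mid-period
    t ** 2,                           # strength concentrated late
])
true_mix = rng.random((n_modes, n_pairs))
V = true_modes @ true_mix             # non-negative by construction

W, H = nmf_multiplicative(V, rank=n_modes)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print("temporal modes:", W.shape, "mixture components:", H.shape)
print("relative reconstruction error:", rel_err)
```

In a sketch like this, the number of modes would typically be chosen by comparing reconstruction error or stability across several values of rank, and agreement between different NMF algorithms could be checked by swapping in another solver and comparing the recovered modes.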