An Approximation Polynomial Algorithm for a Problem of Searching for the Longest Subsequence in a Finite Sequence of Points in Euclidean Space

2018 
The following problem is considered. Given a finite sequence of Euclidean points, find a subsequence of the longest length (size) such that the sum of squared distances between the elements of this subsequence and its unknown centroid (geometrical center) is at most a given percentage of the sum of squared distances between the elements of the input sequence and its centroid. This problem models, in particular, one of the data analysis problems, namely, search for the maximum subset of elements close to each other in the sense of the bounded from above the total quadratic scatter in the set of time-ordered data. It can be treated as a data editing problem aimed at the removal of extraneous (dissimilar) elements. It is shown that the problem is strongly NP-hard. A polynomial time approximation algorithm is proposed. It either finds out that the problem has no solutions or outputs a 1/2-approximate solution if the length \(M^*\) of an optimal subsequence is even, or it outputs a \((M^* - 1)/2M^*\)-approximate solution if \(M^*\) is odd. Some examples of numerical experiments illustrating the algorithm suitability are presented.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    15
    References
    3
    Citations
    NaN
    KQI
    []