High performance GPU primitives for graph-tensor learning operations

2021 
Abstract

Graph-tensor learning operations extend tensor operations by taking the graph structure into account and have been applied in diverse domains such as image processing and machine learning. However, the running time of graph-tensor operations grows rapidly with the number of nodes and the dimension of the data on each node, making them impractical for real-time applications. In this paper, we propose cuGraph-Tensor, a GPU library for high-performance graph-tensor learning operations that provides eight key operations: graph shift (g-shift), graph Fourier transform (g-FT), inverse graph Fourier transform (inverse g-FT), graph filter (g-filter), graph convolution (g-convolution), graph-tensor product (g-product), graph-tensor SVD (g-SVD), and graph-tensor QR (g-QR). cuGraph-Tensor supports scalar, vector, and matrix data on each graph node. We propose optimizations for computation, memory access, and CPU–GPU communication that significantly improve the performance of the graph-tensor learning operations. On top of the optimized operations, cuGraph-Tensor builds a graph data completion application for fast and accurate reconstruction of incomplete graph data. In our experiments, the proposed graph learning operations achieve up to 142.12x speedups over the CPU-based GSPBOX and CPU MATLAB implementations running on two Xeon CPUs. The graph data completion application achieves up to 174.38x speedups over the CPU MATLAB implementation, and up to 3.82x speedups with better accuracy over the GPU-based tensor completion in the cuTensor-tubal library.
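To make the operations named in the abstract concrete, the sketch below illustrates the standard graph-signal-processing definition of the graph Fourier transform (g-FT) and its inverse, using the combinatorial Laplacian as the shift operator. This is a minimal NumPy illustration of the underlying math only, under the standard definitions; it is an assumption for exposition and is not the cuGraph-Tensor API.

```python
import numpy as np

def graph_fourier_transform(L, x):
    """g-FT of a node signal x (standard GSP definition, assumed here).

    For a symmetric Laplacian L, the eigenvectors form an orthonormal
    basis: L = V diag(w) V^T, and the spectral coefficients are V^T x.
    """
    w, V = np.linalg.eigh(L)   # graph frequencies w, Fourier basis V
    return V.T @ x, V          # spectral coefficients and the basis

def inverse_graph_fourier_transform(V, x_hat):
    """Inverse g-FT: reconstruct the vertex-domain signal x = V x_hat."""
    return V @ x_hat

# Toy example: a path graph on 4 nodes with one scalar value per node.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian D - A
x = np.array([1.0, 2.0, 3.0, 4.0])

x_hat, V = graph_fourier_transform(L, x)
x_rec = inverse_graph_fourier_transform(V, x_hat)
assert np.allclose(x, x_rec)            # the round trip recovers x
```

In this formulation a g-filter is simply a function applied to the spectral coefficients before the inverse transform; the paper's contribution is performing such transforms efficiently on the GPU when each node carries vector or matrix data rather than a scalar.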