Improving an autotuning engine for 3D Fast Wavelet Transform on manycore systems

2014 
This paper presents an enhanced auto-optimization method for running the 3D Fast Wavelet Transform (3D-FWT) on the different computing units of a system (GPU, MIC, CPU). The proposed method automatically selects a set of parameter values (block size, number of streams, and number of threads) that reduces the total execution time, achieving performance close to the optimum while decreasing the number of evaluations needed.
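To make the trade-off between tuning quality and number of evaluations concrete, the following is a minimal sketch and not the paper's engine. It compares an exhaustive search with a simple one-parameter-at-a-time search over block size, number of streams, and number of threads. The parameter values, the `synthetic_time` cost function, and both search routines are assumptions introduced for illustration; a real tuner would time actual 3D-FWT runs on the target device.

```python
from itertools import product

# Hypothetical search space; the values are illustrative, not taken from the paper.
SPACE = {
    "block_size": [16, 32, 64, 128],
    "streams": [1, 2, 4, 8],
    "threads": [1, 2, 4, 8, 16],
}


def synthetic_time(cfg):
    """Stand-in for a timed 3D-FWT run (assumption: a real tuner measures wall-clock time)."""
    return (abs(cfg["block_size"] - 64) / 16
            + abs(cfg["streams"] - 4)
            + abs(cfg["threads"] - 8) / 2
            + 10.0)


def exhaustive(space, cost):
    """Evaluate every configuration; returns (best config, best time, evaluations)."""
    names = list(space)
    best, best_t, evals = None, float("inf"), 0
    for values in product(*(space[n] for n in names)):
        cfg = dict(zip(names, values))
        t = cost(cfg)
        evals += 1
        if t < best_t:
            best, best_t = cfg, t
    return best, best_t, evals


def coordinate_search(space, cost):
    """Tune one parameter at a time with the others fixed, until no change helps."""
    cfg = {name: vals[0] for name, vals in space.items()}
    cache = {}  # each distinct configuration is timed at most once

    def timed(c):
        key = tuple(sorted(c.items()))
        if key not in cache:
            cache[key] = cost(c)
        return cache[key]

    improved = True
    while improved:
        improved = False
        for name, vals in space.items():
            current = timed(cfg)
            for v in vals:
                trial = dict(cfg, **{name: v})
                if timed(trial) < current:
                    cfg, current, improved = trial, timed(trial), True
    return cfg, timed(cfg), len(cache)


if __name__ == "__main__":
    for search in (exhaustive, coordinate_search):
        cfg, t, evals = search(SPACE, synthetic_time)
        print(f"{search.__name__}: {cfg} time={t:.2f} evaluations={evals}")
```

On this synthetic cost, whose terms are independent per parameter, the coordinate search reaches the same configuration as the exhaustive search with fewer evaluations. Real timing surfaces usually have interactions between parameters, so a guided search of this kind may only get close to the optimum.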