Privacy-Preserving String Search on Encrypted Genomic Data using a Generalized Suffix Tree

2021 
Abstract
Background and Objective: Efficient sequencing technologies generate a plethora of genomic data that is available to researchers. Computing on a massive genomic dataset often requires outsourcing the data to the cloud, which also relieves data owners of local storage management. Before outsourcing, data owners encrypt sensitive data to ensure confidentiality. Because genomic data is large in volume, executing researchers' queries on it both securely and efficiently is challenging.
Methods: In this paper, we propose a method to securely perform substring search and set-maximal search on a single-nucleotide polymorphism (SNP) dataset using a generalized suffix tree. The proposed method guarantees (1) data privacy, (2) query privacy, and (3) output privacy. It adopts the semi-honest adversary model, and the security of the data is guaranteed through encryption and garbled circuits.
Results: Our experimental results demonstrate that the proposed method can compute a secure substring search and a secure set-maximal search against an SNP dataset of 2184 records (each record contains 10000 SNPs) in 2.3 and 2 seconds, respectively. Furthermore, we compared our results with existing techniques for secure substring and set-maximal search [1, 2], achieving 400-fold and 2-fold improvements, respectively (Table 5).
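
To make the underlying data structure concrete, the following is a minimal plaintext sketch of substring search over a generalized suffix structure. It is not the proposed protocol: encryption, garbled circuits, and set-maximal search are omitted. For brevity it uses an uncompressed suffix trie rather than a compact suffix tree. The names `Node`, `build_generalized_suffix_trie`, and `substring_search`, the binary encoding of SNPs, and the toy records are all assumptions made for illustration.

```python
from typing import Dict, List, Set


class Node:
    """One node of a naive (uncompressed) generalized suffix trie."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        # Ids of the records that have a substring spelled by the path to this node.
        self.records: Set[int] = set()


def build_generalized_suffix_trie(records: List[str]) -> Node:
    """Insert every suffix of every record, tagging nodes with record ids."""
    root = Node()
    for rid, seq in enumerate(records):
        root.records.add(rid)  # the empty string occurs in every record
        for start in range(len(seq)):
            node = root
            for symbol in seq[start:]:
                node = node.children.setdefault(symbol, Node())
                node.records.add(rid)
    return root


def substring_search(root: Node, query: str) -> List[int]:
    """Return the sorted ids of all records that contain `query`."""
    node = root
    for symbol in query:
        child = node.children.get(symbol)
        if child is None:
            return []  # the query falls off the trie: no record contains it
        node = child
    return sorted(node.records)


# Hypothetical toy dataset: three records, each a short binary SNP string.
records = ["0110100", "1101001", "0010110"]
root = build_generalized_suffix_trie(records)

print(substring_search(root, "1010"))  # [0, 1]
print(substring_search(root, "111"))   # []
```

A query is answered by walking at most one node per query symbol, independent of the number of records, which is the property that makes suffix structures attractive for this task. The naive construction above takes quadratic time and space per record, so it is suitable only for illustration at this scale.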