An Adaptive Algorithm for Splitting Large Sets of Strings and Its Application to Efficient External Sorting

DOI Open Access

Description

In this paper, we study the problem of sorting a large collection of strings in external memory. Based on the adaptive construction of a summary data structure called an adaptive synopsis trie, we present a practical string sorting algorithm, DistStrSort, which is suitable for sorting large string collections in external memory and also for more complex string processing problems in text and semi-structured databases, such as counting, aggregation, and statistics. Case analyses of the algorithm and experiments on real datasets show the efficiency of our algorithm in realistic settings.
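The abstract does not give the construction itself, so the following is only an illustrative sketch of the general idea of distribution sorting guided by a count trie, not the paper's DistStrSort. It assumes a simple splitting rule (a leaf is expanded once it has seen more than `threshold` strings), keeps buckets as in-memory lists where an external-memory version would use files, and expects `strings` to be a list so it can be scanned twice. The names `Node`, `build_synopsis`, `bucket_key`, `dist_sort`, and `threshold` are all hypothetical.

```python
from collections import defaultdict


class Node:
    """Trie node holding a visit count; children is None while it is a leaf."""

    def __init__(self):
        self.count = 0
        self.children = None


def build_synopsis(strings, threshold):
    """First pass: grow a count trie, expanding a leaf once it is too heavy.

    Assumed rule (not from the paper): a leaf becomes internal after it has
    seen more than `threshold` strings.
    """
    root = Node()
    for s in strings:
        node, depth = root, 0
        while True:
            node.count += 1
            if node.children is None:
                # Leaf: expand it for later strings if it is overloaded.
                if node.count > threshold and depth < len(s):
                    node.children = {}
                break
            if depth == len(s):
                break  # the string ends at an internal node
            node = node.children.setdefault(s[depth], Node())
            depth += 1
    return root


def bucket_key(root, s):
    """Second pass helper: map a string to the prefix naming its bucket."""
    node, depth = root, 0
    while node.children is not None and depth < len(s):
        child = node.children.get(s[depth])
        depth += 1
        if child is None:
            break  # unseen branch: this one-symbol extension is its own bucket
        node = child
    return s[:depth]


def dist_sort(strings, threshold=2):
    """Split strings into prefix buckets, sort each bucket, concatenate."""
    root = build_synopsis(strings, threshold)
    buckets = defaultdict(list)  # stand-in for on-disk bucket files
    for s in strings:
        buckets[bucket_key(root, s)].append(s)
    out = []
    for key in sorted(buckets):  # bucket prefixes in lexicographic order
        out.extend(sorted(buckets[key]))
    return out


data = ["banana", "apple", "band", "ban", "apricot", "cherry", "app", "b", ""]
result = dist_sort(data, threshold=2)
```

Because every bucket is named by a prefix shared by all of its strings (or holds only the string equal to that prefix), visiting buckets in lexicographic order of their prefixes and sorting each one independently yields a globally sorted sequence. In an external-memory setting the threshold would be chosen so that each bucket fits in main memory.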
