NGNC: A Flexible and Efficient Framework for Error-Tolerant Query Autocompletion
Description
Query autocompletion (QAC) is an important feature that automatically completes a query and saves users' keystrokes. It has been widely adopted in Web search engines, desktop search, input method editors, etc. In some applications, especially on mobile devices, typing accurately is laborious and error-prone, so advanced QAC methods tolerate errors while users are typing. Some data integration tasks also adopt this feature to process string similarity searches. Most existing work uses edit distance to measure the similarity between the input and the correct strings. These methods overlook the quality of the suggested completions, and their efficiency leaves room for improvement. In this paper, we present NGNC, a framework that supports error-tolerant QAC in a flexible and efficient way. The framework is built on a noisy channel model, which separates query prediction into two estimations, one by a language model and the other by an error model. Many QAC ranking methods and spelling correction methods can be easily plugged into the framework. To address the efficiency issue, we devise a neighborhood generation method paired with a trie index to quickly find candidates for the error model, as well as a fast top-\(k\) retrieval method based on caching and pruning. We develop a QAC system based on NGNC. It can evaluate combinations of various ranking and spelling correction methods using query logs and automatically choose the best combination for online query workloads. We highlight research challenges, present our solutions, give an overview of the system architecture, and perform an experimental evaluation on a real dataset to showcase how NGNC improves the state of the art of error-tolerant QAC.
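To make the noisy channel decomposition concrete, the following is a minimal sketch rather than the NGNC implementation. It assumes a toy language model (relative query frequency in a log) and a toy error model (an exponential penalty on the prefix edit distance between the typed input and the candidate), and it scans every candidate by brute force instead of using neighborhood generation, a trie index, caching, or pruning. The names `prefix_edit_distance`, `suggest`, `max_errors`, and `error_penalty` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def prefix_edit_distance(typed: str, candidate: str) -> int:
    """Smallest edit distance between `typed` and any prefix of `candidate`."""
    # prev[j] holds the edit distance between typed[:i] and candidate[:j].
    prev = list(range(len(candidate) + 1))
    for i, typed_char in enumerate(typed, start=1):
        curr = [i]
        for j, cand_char in enumerate(candidate, start=1):
            cost = 0 if typed_char == cand_char else 1
            curr.append(min(prev[j] + 1,          # delete from typed
                            curr[j - 1] + 1,      # insert into typed
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The last row covers all of `typed` against every candidate prefix.
    return min(prev)


def suggest(typed, query_log, k=3, max_errors=1, error_penalty=2.0):
    """Rank completions by log P(query) + log P(typed | query).

    Assumed toy models (not from the paper):
      language model: relative frequency of the query in `query_log`
      error model:    exp(-error_penalty * prefix edit distance)
    """
    counts = Counter(query_log)
    total = sum(counts.values())
    scored = []
    for query, freq in counts.items():
        errors = prefix_edit_distance(typed, query)
        if errors > max_errors:
            continue  # too far from the typed input to be a candidate
        log_language = math.log(freq / total)
        log_error = -error_penalty * errors
        scored.append((log_language + log_error, query))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [query for _, query in scored[:k]]


if __name__ == "__main__":
    log = ["new york"] * 5 + ["news"] * 3 + ["netflix"] * 2 + ["weather"]
    print(suggest("nwe", log))
```

Because the two estimations are only combined by adding log-probabilities, either one could be swapped for a different ranking or spelling correction method without touching the other, which is the kind of flexibility the abstract describes.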