Paper on Learned Index Selection accepted at VLDB 2023

Congratulations to Shi Jiachen for publishing the paper “Learned Index Benefits: Machine Learning Based Index Performance Estimation” at VLDB 2023.

Index selection remains one of the most challenging problems in relational database management systems. To find an optimum index configuration for a workload, accurately and efficiently quantifying the benefits of each candidate index configuration is indispensable. As materializing each index configuration candidate and physically executing queries are infeasible, most of index tuners rely on the cost estimations from optimizer with “what-if” API. However, “what-if” based index benefit estimations have the following two limitations. Firstly, they generate significant errors, which compromise index recommendation quality. Secondly, generating query plans and benefit estimations for each candidate index configuration takes a considerable amount of time. To address the two challenges in index selection, we propose an effective end-to-end machine learning based index benefit estimator. In particular, we propose novel feature extraction and encoding techniques that do not rely on “what-if” call to generate query plan for each index configuration candidate. In addition, we design an attention mechanism to address index interaction issue and aggregate the impacts of different query operations. Finally, we leverage transfer learning technique to improve the estimator’s learning ability for adaption to new database. Comprehensive experiments are conducted on different workloads, and extensive experimental results show that our proposed method outperforms “what-if” based index benefit estimations in terms of accuracy and efficiency. In addition, integrating our method into existing index selection algorithms can significantly improve index recommendation quality.


My research interests include data mangement and data mining.