High Cardinality Column Placement in Composite Indexes with Range Queries
When querying a table with a composite index involving a range condition, the placement of columns within the index can significantly impact performance.
Consider the table files with a primary key (did, filename) and two composite indexes: fe, defined as INDEX(filetime, ext), and ef, defined as INDEX(ext, filetime). Both indexes contain the same two columns in opposite order, and filetime has higher cardinality than ext.
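For readers who want to reproduce the comparison, the table might be declared roughly as follows. This is only a sketch: the column types and lengths and the storage engine are assumptions, since the text gives only the column names, the primary key, and the two index definitions. The index names fe and ef follow the names used with FORCE INDEX later in the text.

```sql
-- Sketch only: types, lengths, and engine are assumed, not taken from the original schema.
CREATE TABLE files (
    did      INT UNSIGNED    NOT NULL,  -- assumed type
    filename VARCHAR(255)    NOT NULL,  -- assumed type
    ext      VARCHAR(10)     NOT NULL,  -- low cardinality, e.g. 'gif'
    filetime DATETIME        NOT NULL,  -- high cardinality
    fsize    BIGINT UNSIGNED NOT NULL,  -- assumed type; read by AVG(fsize)
    PRIMARY KEY (did, filename),
    INDEX fe (filetime, ext),           -- range column first
    INDEX ef (ext, filetime)            -- equality column first
) ENGINE=InnoDB;
```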
The query:
WHERE ext = '...' AND filetime BETWEEN ... AND ...
requires accessing data based on both ext and filetime. The question arises: which index is optimal for such a query?
Analysis
To determine the optimal index, we can use FORCE INDEX and examine the execution plans:
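-- Force range on filetime first
SELECT COUNT(*), AVG(fsize) FROM files FORCE INDEX(fe) WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < ...;
-- Same query, forcing the index with ext first
SELECT COUNT(*), AVG(fsize) FROM files FORCE INDEX(ef) WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < ...;
The output shows that INDEX(ext, filetime) (ef) has a significantly lower row count, indicating a more efficient scan.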
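One way to see the row counts for each index is to prefix the forced queries with EXPLAIN and, optionally, to read the session's handler counters after actually running a query. The sketch below assumes the index names fe and ef and uses a hypothetical one-day range purely for illustration; the original upper bound is not given. Note that the rows column of EXPLAIN is an estimate, while the Handler_read% counters reflect rows actually read.

```sql
-- Hypothetical date range, chosen only for illustration.
EXPLAIN
SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX (fe)
WHERE ext = 'gif'
  AND filetime >= '2015-01-01' AND filetime < '2015-01-02';

EXPLAIN
SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX (ef)
WHERE ext = 'gif'
  AND filetime >= '2015-01-01' AND filetime < '2015-01-02';

-- Optional: count the rows actually read by one variant.
-- FLUSH STATUS resets the session counters and needs the appropriate privilege.
FLUSH STATUS;
SELECT COUNT(*), AVG(fsize)
FROM files FORCE INDEX (ef)
WHERE ext = 'gif'
  AND filetime >= '2015-01-01' AND filetime < '2015-01-02';
SHOW SESSION STATUS LIKE 'Handler_read%';
```

Repeating the last three statements with FORCE INDEX (fe) gives the counters for the other index, so the two can be compared side by side.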
Optimizer Trace
To further analyze the optimizer's behavior, we can use the optimizer trace:
SET optimizer_trace = 'enabled=on';
SELECT COUNT(*), AVG(fsize) FROM files WHERE ext = 'gif' AND filetime >= '2015-01-01' AND filetime < ...;
The trace reveals that the optimizer chooses INDEX(ext, filetime) because both columns can bound the index scan: the equality on ext fixes the first key part, and the range on filetime narrows the second. In contrast, with INDEX(filetime, ext) the range on the leading column filetime ends the usable key prefix, so only filetime limits the scan and ext cannot narrow it further.
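A minimal sketch of capturing and reading the trace in MySQL is shown below. The date range is hypothetical, and the part of the trace worth inspecting is an assumption about where the comparison typically appears: in recent MySQL versions the range_analysis section lists each candidate index with its estimated rows and cost.

```sql
-- Enable tracing for this session only.
SET optimizer_trace = 'enabled=on';

-- Run the query without FORCE INDEX so the optimizer chooses freely.
-- Hypothetical date range, chosen only for illustration.
SELECT COUNT(*), AVG(fsize)
FROM files
WHERE ext = 'gif'
  AND filetime >= '2015-01-01' AND filetime < '2015-01-02';

-- The trace of the most recent statement is exposed as a JSON document.
SELECT TRACE FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE;

-- Turn tracing off again to avoid the overhead.
SET optimizer_trace = 'enabled=off';
```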
Conclusions
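Based on the analysis, for a query that combines an equality condition on ext with a range condition on filetime, INDEX(ext, filetime) is the better choice. The equality column belongs first and the range column last, even though filetime has the higher cardinality.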