」工欲善其事,必先利其器。「—孔子《論語.錄靈公》
首頁 > 程式設計 > Mysql資料庫索引初學者詳解

Mysql資料庫索引初學者詳解

發佈於2024-08-01
瀏覽:242

Core Concepts

  • Primary Key Index / Secondary Index
  • Clustered Index / Non-Clustered Index
  • Table Lookup / Index Covering
  • Index Pushdown
  • Composite Index / Leftmost Prefix Matching
  • Prefix Index
  • Explain

1. [Index Definition]

1. Index Definition

Besides the data itself, the database system also maintains data structures that satisfy specific search algorithms. These structures reference (point to) the data in a certain way, allowing advanced search algorithms to be implemented on them. These data structures are indexes.

2. Data Structures of Indexes

  • B-tree / B tree (MySQL's InnoDB engine uses B tree as the default index structure)
  • HASH table
  • Sorted array

3. Why Choose B Tree Over B Tree

  • B-tree structure: Records are stored in the tree nodes.

Mysql Database Index Explained for Beginners

  • B tree structure: Records are stored only in the leaf nodes of the tree.

Mysql Database Index Explained for Beginners

  • Assuming a data size of 1KB and an index size of 16B, with the database using disk data pages, and a default disk page size of 16K, the same three I/O operations will yield:
  1. B-tree can fetch 16*16*16=4096 records.

  2. B tree can fetch 1000*1000*1000=1 billion records.

2. [Index Types]

1. Primary Key Index and Secondary Index

  • Primary Key Index: The leaf nodes of the index are data rows.
  • Secondary Index: The leaf nodes of the index are KEY fields plus primary key index. Therefore, when querying through a secondary index, it first finds the primary key value, and then InnoDB finds the corresponding data block through the primary key index.
  • In InnoDB, the primary index file directly stores the data row, called clustered index, while secondary indexes point to the primary key reference.
  • In MyISAM, both primary and secondary indexes point to physical rows (disk positions).

Mysql Database Index Explained for Beginners

2. Clustered Index and Non-Clustered Index

  • A clustered index reorganizes the actual data on the disk to be sorted by one or more specified column values. The characteristic is that the storage order of the data and the index order are consistent. Generally, the primary key will default to creating a clustered index, and a table only allows one clustered index (reason: data can only be stored in one order). As shown in the image, InnoDB's primary and secondary indexes are clustered indexes.
  • Compared to the leaf nodes of a clustered index being data records, the leaf nodes of a non-clustered index are pointers to the data records. The biggest difference is that the order of data records does not match the index order.

3. Advantages and Disadvantages of Clustered Index

  • Advantage: When querying entries by primary key, it does not need to perform a table lookup (data is under the primary key node).
  • Disadvantage: Frequent page splits can occur with irregular data insertion.

3. [Extended Index Concepts]

1. Table Lookup

The concept of table lookup involves the difference between primary key index and non-primary key index queries.

  • If the query is select * from T where ID=500, a primary key query only needs to search the ID tree.
  • If the query is select * from T where k=5, a non-primary key index query needs to first search the k index tree to get the ID value of 500, then search the ID index tree again.
  • The process of moving from the non-primary key index back to the primary key index is called table lookup.

Queries based on non-primary key indexes require scanning an additional index tree. Therefore, we should try to use primary key queries in applications. From the perspective of storage space, since the leaf nodes of the non-primary key index tree store primary key values, it is advisable to keep the primary key fields as short as possible. This way, the leaf nodes of the non-primary key index tree are smaller, and the non-primary key index occupies less space. Generally, it is recommended to create an auto-increment primary key to minimize the space occupied by non-primary key indexes.

2. Index Covering

  • If a WHERE clause condition is a non-primary key index, the query will first locate the primary key index through the non-primary key index (the primary key is located at the leaf nodes of the non-primary key index search tree), and then locate the query content through the primary key index. In this process, moving back to the primary key index tree is called table lookup.
  • However, when our query content is the primary key value, we can directly provide the query result without table lookup. In other words, the non-primary key index has already "covered" our query requirement in this query, hence it is called a covering index.
  • A covering index can directly obtain query results from the auxiliary index without table lookup to the primary index, thereby reducing the number of searches (not needing to move from the auxiliary index tree to the clustered index tree) or reducing IO operations (the auxiliary index tree can load more nodes from the disk at once), thereby improving performance.

3. Composite Index

A composite index refers to indexing multiple columns of a table.

Scenario 1:

A composite index (a, b) is sorted by a, b (first sorted by a, if a is the same then sorted by b). Therefore, the following statements can directly use the composite index to get results (in fact, it uses the leftmost prefix principle):

  • select … from xxx where a=xxx;
  • select … from xxx where a=xxx order by b;

The following statements cannot use composite queries:

  • select … from xxx where b=xxx;

Scenario 2:

For a composite index (a, b, c), the following statements can directly get results through the composite index:

  • select … from xxx where a=xxx order by b;
  • select … from xxx where a=xxx and b=xxx order by c;

The following statements cannot use the composite index and require a filesort operation:

  • select … from xxx where a=xxx order by c;

Summary:

Using the composite index (a, b, c) as an example, creating such an index is equivalent to creating indexes a, ab, and abc. Having one index replace three indexes is certainly beneficial, as each additional index increases the overhead of write operations and disk space usage.

4. Leftmost Prefix Principle

  • From the above composite index example, we can understand the leftmost prefix principle.
  • Not just the full definition of the index, as long as it meets the leftmost prefix, it can be used to speed up retrieval. This leftmost prefix can be the leftmost N fields of the composite index or the leftmost M characters of the string index. Use the "leftmost prefix" principle of the index to locate records and avoid redundant index definitions.
  • Therefore, based on the leftmost prefix principle, it is crucial to consider the field order within the index when defining composite indexes! The evaluation criterion is the reusability of the index. For example, when there is already an index on (a, b), there is generally no need to create a separate index on a.

5. Index Pushdown

MySQL 5.6 introduced the index pushdown optimization, which can filter out records that do not meet the conditions based on the fields included in the index during index traversal, reducing the number of table lookups.

  • Create table
CREATE TABLE `test` (
   `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Auto-increment primary key',
   `age` int(11) NOT NULL DEFAULT '0',
   `name` varchar(255) CHARACTER SET utf8 NOT NULL DEFAULT '',
   PRIMARY KEY (`id`),
   KEY `idx_name_age` (`name`,`age`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
  • SELECT * from user where name like 'Chen%' Leftmost prefix principle, hitting idx_name_age index
  • SELECT * from user where name like 'Chen%' and age=20
    • Before version 5.6, it would first match 2 records based on the name index (ignoring the age=20 condition at this point), find the corresponding 2 IDs, perform table lookups, and then filter based on age=20.
    • After version 5.6, index pushdown is introduced. After matching 2 records based on name, it will not ignore the age=20 condition before performing table lookups, filtering based on age before table lookup. This index pushdown can reduce the number of table lookups and improve query performance.

6. Prefix Index

When an index is a long character sequence, it can take up a lot of memory and be slow. In this case, prefix indexes can be used. Instead of indexing the entire value, we index the first few characters to save space and achieve good performance. Prefix index uses the first few letters of the index. However, to reduce the index duplication rate, we must evaluate the uniqueness of the prefix index.

  • First, calculate the uniqueness ratio of the current string field: select 1.0*count(distinct name)/count(*) from test
  • Then, calculate the uniqueness ratio for different prefixes:
    • select 1.0*count(distinct left(name,1))/count(*) from test for the first character of the name as the prefix index
    • select 1.0*count(distinct left(name,2))/count(*) from test for the first two characters of the name as the prefix index
    • ...
  • When left(str, n) does not significantly increase, select n as the prefix index cut-off value.
  • Create the index alter table test add key(name(n));

4. [Viewing Indexes]

After adding indexes, how do we view them? Or, if statements are slow to execute, how do we troubleshoot?

Explain is commonly used to check if an index is effective.

After obtaining the slow query log, observe which statements are slow. Add explain before the statement and execute it again. Explain sets a flag on the query, causing it to return information about each step in the execution plan instead of executing the statement. It returns one or more rows of information showing each part of the execution plan and the execution order.

Important fields returned by explain:

  • type: Shows the search method (full table scan or index scan)
  • key: The index field used, null if not used

Explain's type field:

  • ALL: Full table scan
  • index: Full index scan
  • range: Index range scan
  • ref: Non-unique index scan
  • eq_ref: Unique index scan
版本聲明 本文轉載於:https://dev.to/coder_world/mysql-database-index-explained-for-beginners-3heg?1如有侵犯,請聯絡[email protected]刪除
最新教學 更多>
  • PHP陣列鍵值異常:了解07和08的好奇情況
    PHP陣列鍵值異常:了解07和08的好奇情況
    PHP數組鍵值問題,使用07&08 在給定數月的數組中,鍵值07和08呈現令人困惑的行為時,就會出現一個不尋常的問題。運行print_r($月份)返回意外結果:鍵“ 07”丟失,而鍵“ 08”分配給了9月的值。 此問題源於PHP對領先零的解釋。當一個數字帶有0(例如07或08)的前綴時,PHP...
    程式設計 發佈於2025-04-01
  • 如何正確使用與PDO參數的查詢一樣?
    如何正確使用與PDO參數的查詢一樣?
    在pdo 中使用類似QUERIES在PDO中的Queries時,您可能會遇到類似疑問中描述的問題:此查詢也可能不會返回結果,即使$ var1和$ var2包含有效的搜索詞。錯誤在於不正確包含%符號。 通過將變量包含在$ params數組中的%符號中,您確保將%字符正確替換到查詢中。沒有此修改,PD...
    程式設計 發佈於2025-04-01
  • 如何使用Depimal.parse()中的指數表示法中的數字?
    如何使用Depimal.parse()中的指數表示法中的數字?
    在嘗試使用Decimal.parse(“ 1.2345e-02”中的指數符號表示法時,您可能會出現錯誤。這是因為默認解析方法無法識別指數符號。 成功解析這樣的字符串,您需要明確指定它代表浮點數。您可以使用numbersTyles.Float樣式進行此操作,如下所示:[&& && && &&華氏度D...
    程式設計 發佈於2025-04-01
  • 如何從PHP中的數組中提取隨機元素?
    如何從PHP中的數組中提取隨機元素?
    從陣列中的隨機選擇,可以輕鬆從數組中獲取隨機項目。考慮以下數組:; 從此數組中檢索一個隨機項目,利用array_rand( array_rand()函數從數組返回一個隨機鍵。通過將$項目數組索引使用此鍵,我們可以從數組中訪問一個隨機元素。這種方法為選擇隨機項目提供了一種直接且可靠的方法。
    程式設計 發佈於2025-04-01
  • 如何在GO編譯器中自定義編譯優化?
    如何在GO編譯器中自定義編譯優化?
    在GO編譯器中自定義編譯優化 GO中的默認編譯過程遵循特定的優化策略。 However, users may need to adjust these optimizations for specific requirements.Optimization Control in Go Compi...
    程式設計 發佈於2025-04-01
  • Java是否允許多種返回類型:仔細研究通用方法?
    Java是否允許多種返回類型:仔細研究通用方法?
    在Java中的多個返回類型:一種誤解類型:在Java編程中揭示,在Java編程中,Peculiar方法簽名可能會出現,可能會出現,使開發人員陷入困境,使開發人員陷入困境。 getResult(string s); ,其中foo是自定義類。該方法聲明似乎擁有兩種返回類型:列表和E。但這確實是如此嗎...
    程式設計 發佈於2025-04-01
  • 如何在php中使用捲髮發送原始帖子請求?
    如何在php中使用捲髮發送原始帖子請求?
    如何使用php 創建請求來發送原始帖子請求,開始使用curl_init()開始初始化curl session。然後,配置以下選項: curlopt_url:請求 [要發送的原始數據指定內容類型,為原始的帖子請求指定身體的內容類型很重要。在這種情況下,它是文本/平原。要執行此操作,請使用包含以下標頭...
    程式設計 發佈於2025-04-01
  • 如何從Google API中檢索最新的jQuery庫?
    如何從Google API中檢索最新的jQuery庫?
    從Google APIS 問題中提供的jQuery URL是版本1.2.6。對於檢索最新版本,以前有一種使用特定版本編號的替代方法,它是使用以下語法:獲取最新版本:未壓縮)While these legacy URLs still remain in use, it is recommended ...
    程式設計 發佈於2025-04-01
  • 您可以使用CSS在Chrome和Firefox中染色控制台輸出嗎?
    您可以使用CSS在Chrome和Firefox中染色控制台輸出嗎?
    在javascript console 中顯示顏色是可以使用chrome的控制台顯示彩色文本,例如紅色的redors,for for for for錯誤消息? 回答是的,可以使用CSS將顏色添加到Chrome和Firefox中的控制台顯示的消息(版本31或更高版本)中。要實現這一目標,請使用以下...
    程式設計 發佈於2025-04-01
  • 如何從PHP中的Unicode字符串中有效地產生對URL友好的sl。
    如何從PHP中的Unicode字符串中有效地產生對URL友好的sl。
    為有效的slug生成首先,該函數用指定的分隔符替換所有非字母或數字字符。此步驟可確保slug遵守URL慣例。隨後,它採用ICONV函數將文本簡化為us-ascii兼容格式,從而允許更廣泛的字符集合兼容性。 接下來,該函數使用正則表達式刪除了不需要的字符,例如特殊字符和空格。此步驟可確保slug僅包...
    程式設計 發佈於2025-04-01
  • 如何在鼠標單擊時編程選擇DIV中的所有文本?
    如何在鼠標單擊時編程選擇DIV中的所有文本?
    在鼠標上選擇div文本單擊帶有文本內容,用戶如何使用單個鼠標單擊單擊div中的整個文本?這允許用戶輕鬆拖放所選的文本或直接複製它。 在單個鼠標上單擊的div元素中選擇文本,您可以使用以下Javascript函數: function selecttext(canduterid){ if(d...
    程式設計 發佈於2025-04-01
  • 如何處理PHP文件系統功能中的UTF-8文件名?
    如何處理PHP文件系統功能中的UTF-8文件名?
    在PHP的Filesystem functions中處理UTF-8 FileNames 在使用PHP的MKDIR函數中含有UTF-8字符的文件很多flusf-8字符時,您可能會在Windows Explorer中遇到comploreer grounder grounder grounder gro...
    程式設計 發佈於2025-04-01
  • 如何限制動態大小的父元素中元素的滾動範圍?
    如何限制動態大小的父元素中元素的滾動範圍?
    在交互式接口中實現垂直滾動元素的CSS高度限制問題:考慮一個佈局,其中我們具有與用戶垂直滾動一起移動的可滾動地圖div,同時與固定的固定sidebar保持一致。但是,地圖的滾動無限期擴展,超過了視口的高度,阻止用戶訪問頁面頁腳。 映射{} 因此。我們不使用jQuery的“ .aimimate...
    程式設計 發佈於2025-04-01
  • 如何將PANDAS DataFrame列轉換為DateTime格式並按日期過濾?
    如何將PANDAS DataFrame列轉換為DateTime格式並按日期過濾?
    將pandas dataframe列轉換為dateTime格式示例:使用column(mycol)包含以下格式的以下dataframe,以自定義格式:})指定的格式參數匹配給定的字符串格式。轉換後,MyCol列現在將包含DateTime對象。 date date filtering > = ...
    程式設計 發佈於2025-04-01
  • 如何使用Regex在PHP中有效地提取括號內的文本
    如何使用Regex在PHP中有效地提取括號內的文本
    php:在括號內提取文本在處理括號內的文本時,找到最有效的解決方案是必不可少的。一種方法是利用PHP的字符串操作函數,如下所示: 作為替代 $ text ='忽略除此之外的一切(text)'; preg_match('#((。 &&& [Regex使用模式來搜索特...
    程式設計 發佈於2025-04-01

免責聲明: 提供的所有資源部分來自互聯網,如果有侵犯您的版權或其他權益,請說明詳細緣由並提供版權或權益證明然後發到郵箱:[email protected] 我們會在第一時間內為您處理。

Copyright© 2022 湘ICP备2022001581号-3