1. Use EXPLAIN to Optimize SQL Statements
The result returned by the EXPLAIN statement provides details of how TiDB executes a SQL query:
EXPLAIN can be used together with SELECT, DELETE, INSERT, REPLACE, and UPDATE statements;
When EXPLAIN is executed, TiDB returns the final physical execution plan that the optimizer produces for the explained SQL statement. In other words, EXPLAIN shows complete information about how TiDB executes the statement: in what order, how two tables are joined, what the expression tree looks like, and so on. For details, see EXPLAIN Output format;
TiDB does not yet support EXPLAIN [options] FOR CONNECTION connection_id. Support is planned for the future; for details, see #4351;
By examining the EXPLAIN results, you can work out how to index a table so that the execution plan can use indexes to speed up the SQL statement. You can also use EXPLAIN to check whether the optimizer has chosen the best order in which to join tables.
2. EXPLAIN Output format
Currently, TiDB's EXPLAIN outputs 6 columns: id, parents, children, task, operator info, and count. Every operator in the execution plan is described by these 6 column attributes, and each row in the EXPLAIN result describes one operator.
3.1 Task Overview
Currently, TiDB has two types of computation task: cop task and root task. A cop task is a computation task that is pushed down to the KV side and executed in a distributed manner; a root task is a computation task executed at a single point on the TiDB side. One goal of SQL optimization is to push as much computation as possible down to the KV side.
3.2 Table data and index data
Table data in TiDB is the raw data of a table, stored in TiKV. For each row of table data, the key is a 64-bit integer called the Handle ID. If a table has an int-type primary key, the primary key value is used as the Handle ID of the table data; otherwise the system generates the Handle ID automatically. The value of the table data is encoded from all the data in the row. When table data is read, rows can be returned in ascending Handle ID order.
Index data in TiDB, like table data, is also stored in TiKV. Its key is an order-preserving byte sequence encoded from the index columns, and its value is the Handle ID of the row that the index entry belongs to. With the Handle ID, the non-index columns of that row can be read. When index data is read, entries are returned in ascending order of the index columns. If there are multiple index columns, the 1st column is guaranteed to be ascending, and whenever the i-th columns are equal, the (i + 1)-th column is guaranteed to be ascending.
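The ordering rules above can be illustrated with a small in-memory model. This is only a sketch: the table contents and the build_index helper are hypothetical, and Python tuples stand in for TiKV's order-preserving byte encoding.

```python
# Hypothetical table: Handle ID -> row. Real TiKV stores encoded bytes.
rows = {
    1: {"name": "bob", "age": 30, "city": "Oslo"},
    2: {"name": "alice", "age": 25, "city": "Rome"},
    3: {"name": "alice", "age": 22, "city": "Lima"},
}

def build_index(table, columns):
    """Return index entries (index_key, handle_id), sorted by index_key.

    Tuples compare column by column, which mirrors the rule that
    column i + 1 is ascending whenever column i is equal.
    """
    entries = [(tuple(row[c] for c in columns), handle)
               for handle, row in table.items()]
    return sorted(entries)

# Composite index on (name, age).
index = build_index(rows, ["name", "age"])

# Table data is read in ascending Handle ID order.
table_scan = [rows[handle]["name"] for handle in sorted(rows)]

# A non-index column (city) is reached through the Handle ID in the index value.
cities = [rows[handle]["city"] for _, handle in index]
```

Here the two "alice" entries are ordered by age because their first index column is equal, while the table scan order depends only on the Handle ID.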
3.3 Range query
In the WHERE/HAVING/ON conditions, TiDB analyzes the ranges queried on the primary key or index key. This covers comparison operators on numeric and date types, such as greater than, less than, equal to, greater than or equal to, and less than or equal to, as well as LIKE on character types.
Note that only comparisons with a column on one side and a constant (or an expression that can be evaluated to a constant) on the other side are supported. A condition such as year(birth_day) < 1992 cannot use an index. Also compare values of the same type wherever possible, so that an extra cast does not prevent index use. For example, in user_id = 123456, if user_id is a string, 123456 should also be written as a string constant.
When range conditions on the same column are combined with AND and OR, the result is the intersection or union of the ranges. For a multi-column composite index, conditions can be written on several columns. For example, for a composite index (a, b, c): when a has an equality condition, the range on b can also be used; when b also has an equality condition, the range on c can also be used. Conversely, if the condition on a is not an equality, only the range on a can be used.
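The two rules above (AND/OR as intersection/union, and the leading-column rule for a composite index) can be sketched as follows. This is an illustration, not TiDB's range-building code: the function names are hypothetical, ranges are modeled as half-open integer intervals (lo, hi), and each condition is reduced to the label "eq" or "range".

```python
def intersect(r1, r2):
    """AND of two ranges on the same column; None if they do not overlap."""
    lo, hi = max(r1[0], r2[0]), min(r1[1], r2[1])
    return (lo, hi) if lo < hi else None

def union(r1, r2):
    """OR of two ranges on the same column, as a sorted list of ranges."""
    a, b = sorted([r1, r2])
    if b[0] <= a[1]:  # overlapping or touching: merge into one range
        return [(a[0], max(a[1], b[1]))]
    return [a, b]

def usable_index_columns(index_columns, conditions):
    """Return the leading index columns whose conditions can bound the scan.

    conditions maps a column name to "eq" or "range". The usable prefix
    continues past a column only if that column has an equality condition.
    """
    used = []
    for col in index_columns:
        kind = conditions.get(col)
        if kind is None:
            break
        used.append(col)
        if kind != "eq":
            break  # a range condition ends the usable prefix
    return used
```

For the index (a, b, c), usable_index_columns(["a", "b", "c"], {"a": "eq", "b": "eq", "c": "range"}) uses all three columns, while {"a": "range", "b": "eq"} uses only a.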
4.1 TableReader and TableScan
TableScan represents a scan of table data on the KV side, and TableReader represents reading that data from the TiKV side on the TiDB side; the two operators together make up one function. table is the table name in the SQL statement; if the table is given an alias, the alias is displayed. range is the range of data scanned: if the query specifies no WHERE/HAVING/ON condition, a full table scan is chosen; if there is a range condition on the int-type primary key, a range scan is chosen. keep order indicates whether the table scan returns rows in order.
4.2 IndexReader and IndexLookUp
TiDB reads an index in one of two ways. IndexReader reads the index columns directly from the index, and applies when the SQL statement references only columns in the index or the primary key. IndexLookUp filters part of the data through the index, returns only the Handle IDs, and then looks up the table data again by Handle ID; this method requires a second fetch from TiKV. The optimizer chooses the index read method automatically.
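The difference between the two read paths can be sketched with plain Python structures. This is a simplified model under stated assumptions: the table, the age_index, and both function names are hypothetical, and dictionary lookups stand in for requests to TiKV.

```python
# Hypothetical table: Handle ID -> row.
table = {
    1: {"id": 1, "age": 30, "name": "bob"},
    2: {"id": 2, "age": 25, "name": "alice"},
    3: {"id": 3, "age": 41, "name": "carol"},
}

# Index on age: entries (age, handle_id) kept in ascending key order.
age_index = sorted((row["age"], handle) for handle, row in table.items())

def index_reader(index, lo, hi):
    """Single read: the query needs only the indexed column,
    so the answer comes from the index alone."""
    return [age for age, _ in index if lo <= age < hi]

def index_lookup(index, rows, lo, hi):
    """Double read: collect Handle IDs from the index first,
    then fetch the full rows by Handle ID."""
    handles = [handle for age, handle in index if lo <= age < hi]
    return [rows[handle] for handle in handles]

ages = index_reader(age_index, 25, 35)
names = [row["name"] for row in index_lookup(age_index, table, 25, 35)]
```

A query that selects only age can stop after the first function; a query that also selects name has to go back to the table data for every matching Handle ID.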
IndexScan is the KV-side operator that reads index data; its function is similar to TableScan. table is the table name in the SQL statement; if the table is given an alias, the alias is displayed. index is the index name. range is the range of data scanned. out of order indicates whether the index scan returns rows in order. Note that in TiDB, a multi-column primary key or a primary key on a non-int column is treated as a unique index.
Selection represents the selection conditions in the SQL statement, which usually appear in the WHERE/HAVING/ON clauses.
Projection corresponds to the SELECT list in the SQL statement. Its function is to map each input row to a new output row.
Aggregation corresponds to a GROUP BY clause in the SQL statement, or to aggregate functions such as count or sum used without GROUP BY. TiDB supports two aggregation algorithms: Hash Aggregation and Stream Aggregation (to be added). Hash Aggregation is a hash-based aggregation algorithm. If Hash Aggregation is directly adjacent to a Table or Index read operator, the aggregation operator pre-aggregates on the TiKV side to improve parallelism and reduce network overhead.
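The idea of pre-aggregating near the data and merging the partial results afterwards can be sketched as below. This is a minimal model, not TiDB's implementation: the regions list is a hypothetical stand-in for data spread across TiKV nodes, and only count is shown.

```python
from collections import defaultdict

def partial_count(rows, key):
    """Pre-aggregation on one data partition: count rows per group
    using a hash table keyed by the grouping column."""
    counts = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return dict(counts)

def final_count(partials):
    """Final aggregation: merge the small per-partition hash tables."""
    total = defaultdict(int)
    for partial in partials:
        for group, count in partial.items():
            total[group] += count
    return dict(total)

# Hypothetical partitions of one table.
regions = [
    [{"city": "Oslo"}, {"city": "Rome"}, {"city": "Oslo"}],
    [{"city": "Rome"}, {"city": "Lima"}],
]

partials = [partial_count(region, "city") for region in regions]
result = final_count(partials)
```

Each partition sends back one entry per group instead of one entry per row, which is where the reduction in network traffic comes from, and the partitions can be processed in parallel.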
TiDB supports Inner Join and Left/Right Outer Join, and automatically converts outer joins that can be simplified into Inner Join.
TiDB supports three join algorithms: Hash Join, Sort Merge Join, and Index Look Up Join. Hash Join preloads the small table of the join into memory, then reads all the data of the large table and joins it. Sort Merge Join makes use of the ordering of the input data, reading both tables at the same time and comparing rows in turn. Index Look Up Join reads the data of the outer table and uses it to query the primary key or index key of the inner table.
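The three algorithms can be compared with a textbook-style sketch of an inner equi-join on a single column. These are generic formulations rather than TiDB's code; the function names, the users and orders data, and the dictionary used as an inner-table index are all hypothetical.

```python
from collections import defaultdict

def hash_join(small, large, key):
    """Build a hash table on the small input, then probe it with the large one."""
    table = defaultdict(list)
    for row in small:
        table[row[key]].append(row)
    return [{**s, **l} for l in large for s in table.get(l[key], [])]

def merge_join(left, right, key):
    """Both inputs must already be sorted by key; advance two cursors together."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key, then pair it
            # with every left row that has the same key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out

def index_lookup_join(outer, inner_index, key):
    """For each outer row, look up matching inner rows through an index."""
    return [{**o, **inner} for o in outer for inner in inner_index.get(o[key], [])]

users = [{"uid": 1, "name": "alice"}, {"uid": 2, "name": "bob"}]
orders = [{"uid": 1, "item": "pen"}, {"uid": 1, "item": "ink"}, {"uid": 3, "item": "cup"}]

# Hypothetical index on orders.uid: key -> rows.
orders_by_uid = defaultdict(list)
for order in orders:
    orders_by_uid[order["uid"]].append(order)

joined_hash = hash_join(users, orders, "uid")
joined_merge = merge_join(users, orders, "uid")
joined_lookup = index_lookup_join(users, orders_by_uid, "uid")
```

All three produce the same rows here; they differ in what they need from the inputs: memory for the small table, sorted inputs, or an index on the inner table.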
Apply is the operator TiDB uses to describe subqueries. Its behavior is similar to a Nested Loop: it takes one row from the outer table at a time, substitutes it into the correlated columns of the inner table, executes the inner side, and finally computes the join according to the join algorithm embedded in Apply.
Note that the optimizer automatically converts Apply into a Join operation where it can. When writing SQL, users should try to avoid producing the Apply operator.
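The nested-loop behavior of Apply can be sketched as follows, which also shows why it is costly: the inner side runs once per outer row. This is a simplified model with hypothetical names (apply, employees_in, semi_join) and data, roughly corresponding to a correlated EXISTS subquery.

```python
def apply(outer_rows, inner_query, combine):
    """For each outer row, re-run the inner side with that row bound
    to the correlated columns, then combine the two sides."""
    out = []
    for outer in outer_rows:
        inner_result = inner_query(outer)
        out.extend(combine(outer, inner_result))
    return out

depts = [{"dept": "eng"}, {"dept": "ops"}, {"dept": "hr"}]
emps = [{"name": "ann", "dept": "eng"}, {"name": "raj", "dept": "hr"}]

inner_runs = 0  # counts how many times the inner side is executed

def employees_in(dept_row):
    """Inner side: employees whose dept matches the current outer row."""
    global inner_runs
    inner_runs += 1
    return [e for e in emps if e["dept"] == dept_row["dept"]]

def semi_join(dept_row, matches):
    """Keep the outer row only if the inner side returned something."""
    return [dept_row] if matches else []

depts_with_staff = apply(depts, employees_in, semi_join)
```

With three outer rows the inner side is executed three times, whereas a join over the same two tables can read each input once.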