1. Rational use of index

Index is an important data structure in database , Its basic purpose is to improve the query efficiency . Now most database products adopt IBM First proposed ISAM index structure .

Use the index just right , Its use principle is as follows :

* Connect frequently , However, there is no index on the column specified as a foreign key , Fields that are not frequently joined are automatically indexed by the optimizer .
* To sort or group frequently ( Immediately group by or order by operation ) Index on the column of .
In the conditional expression often used in the different values of the column to establish a search , Do not index columns with fewer values . For example, in the employee table “ Gender ” There are only “ male ” And “ female ” Two different values , So there's no need to index . If the index is built, it will not improve the query efficiency , It will seriously slow down the update speed .
* If there are more than one column to sort , Composite indexes can be built on these columns (compound index).
* Using system tools . as Informix The database has one tbcheck tool , You can check on suspicious indexes . On some database servers , The index may be invalid or due to frequent operations
It reduces the reading efficiency , If a query using an index slows down , You can try it tbcheck The tool checks the integrity of the index , Repair if necessary . in addition , When a large number of database tables are updated
After data , Deleting and rebuilding indexes can improve query speed .
(1) In the next two select Statement :

SELECT * FROM table1 WHERE field1<=10000 AND field1>=0;
SELECT * FROM table1 WHERE field1>=0 AND field1<=10000;

If the data in the data table field1 all >=0, The first one select The statement is better than the second one select The sentence efficiency is much higher , Because of the second one select The first condition of the statement consumes a lot of system resources .

First principle : stay where The most restrictive condition should be put first in the clause .

(2) Below select Statement :

SELECT * FROM tab WHERE a=… AND b=… AND c=…;

If there is an index index(a,b,c), be where The order of the fields in the clause should be consistent with the order of the fields in the index .

The second principle :where The order of the fields in the clause should be consistent with the order of the fields in the index .

The following assumptions are field1 There is a unique index on I1, stay field2 Non unique index on I2.
(3) SELECT field3,field4 FROM tb WHERE field1='sdf' fast
SELECT * FROM tb WHERE field1='sdf' slow [/cci]

Because the latter is one step more after index scanning ROWID Table access .

(4) SELECT field3,field4 FROM tb WHERE field1>='sdf' fast
SELECT field3,field4 FROM tb WHERE field1>'sdf' slow

Because the former can quickly locate the index .

(5) SELECT field3,field4 FROM tb WHERE field2 LIKE 'R%' fast
SELECT field3,field4 FROM tb WHERE field2 LIKE '%R' slow ,

Because the latter does not use indexes .

(6) Use functions such as :
SELECT field3,field4 FROM tb WHERE upper(field2)='RMN' Do not use index .

If a table has 20000 records , Functions are not recommended ; If a table has more than 50000 records , Functions are strictly prohibited ! There is no limit to 20000 records .

(7) Null values are not stored in the index , therefore
SELECT field3,field4 FROM tb WHERE field2 IS[NOT] NULL Do not use index .

(8) Inequalities such as
SELECT field3,field4 FROM tb WHERE field2!='TOM' Do not use index .
Similarly ,
SELECT field3,field4 FROM tb WHERE field2 NOT IN('M','P') Do not use index .

(9) Multi column index , Only if the first column of the index in the query is used for a condition , The index can only be used .

(10) MAX,MIN Equifunction , Use index .
SELECT max(field2) FROM tb therefore , If the max,min,sum etc. , It should be quoted .

Use only one aggregate function at a time , as :
SELECT “min”=min(field1), “max”=max(field1) FROM tb
Not as good as :SELECT “min”=(SELECT min(field1) FROM tb) , “max”=(SELECT max(field1) FROM

(11) Indexes with too many duplicate values are not used by the query optimizer . And because it's indexed , When you change the value of the field, you also change the index , So updating the field is slower than not having an index .

(12) The index value is too large ( As in a char(40) Index on the field of ), Will cause a lot of I/O expenses ( It will even exceed the table scan I/O expenses ). therefore , Try to use integer index .
Sp_estspace Table and index overhead can be calculated .

(13) For multi column indexes ,ORDER BY Must be in the same order as the fields in the index .

(14) stay sybase in , If ORDER BY The fields of a cluster form a cluster index , Then there's no need to do it ORDER BY. The order of records is consistent with the cluster index .

(15) Multi table join ( The specific query scheme needs to be tested )
where Use the associated fields as much as possible for the qualification conditions in the clause , And try to put the associated fields first .
SELECT a.field1,b.field2 FROM a,b WHERE a.field3=b.field3

* field3 If there is no index on the :
Yes a Perform a full table scan , Sorting results
Yes b Perform a full table scan , Sorting results
Results consolidation .
It is suitable for very small tables or large tables .
* field3 Index on
According to the order of table join ,b Is the driver table ,a Is the driven table
Yes b Perform a full table scan
Yes a Scan index range
If it matches , adopt a Of rowid visit
(16) Avoid one to many join. as :
SELECT tb1.field3,tb1.field4,tb2.field2 FROM tb1,tb2 WHERE
tb1.field2=tb2.field2 AND tb1.field2=‘BU1032’ AND tb2.field2= ‘aaa’
Not as good as :
declare @a varchar(80)
SELECT @a=field2 FROM tb2 WHERE field2=‘aaa’
SELECT tb1.field3,tb1.field4,@a FROM tb1 WHERE field2= ‘aaa’

(16) Subquery
use exists/not exists replace in/not in operation
compare :
SELECT a.field1 FROM a WHERE a.field2 IN(SELECT b.field1 FROM b WHERE
SELECT a.field1 FROM a WHERE EXISTS( SELECT 1 FROM b WHERE a.field2=b.field1
AND b.field2=100)
SELECT field1 FROM a WHERE field1 NOT IN( SELECT field2 FROM b)
SELECT field1 FROM a WHERE NOT EXISTS( SELECT 1 FROM b WHERE b.field2=a.field1)

(17) main , Foreign keys are mainly used for data constraints ,sybase The index is automatically created when the primary key is created in , Foreign keys have nothing to do with indexes , To improve performance, index must be rebuilt .

(18) char The field of type is not indexed int It's worse not to index fields of type . The performance of the index is only slightly worse .

(19) use count(*) Don't use it count(column_name), Avoid use count(DISTINCT column_name).

(20) Try not to use field names to the right of the equal sign , as :
SELECT * FROM tb WHERE field1 = field3

(21) Avoid use or condition , because or Do not use index .

2. Avoid use order by and group by Words and sentences .

Because the use of these two clauses will take up a lot of temporary space (tempspace), If you have to use , Available views , The method of generating temporary table manually .
If you have to use , Check first memory,tempdb Size of .
Test certificate , In particular, avoid using both in a query join Use it again group by, It's going to be very slow !

3. Try to use less subqueries , Especially the related subquery . Because this will lead to a decline in efficiency .

The label of a column is in both the main query and the where Appears in the query in clause , It is likely that when the column values in the main query change , Subquery must be queried again . More levels of query nesting , The lower the efficiency , Therefore, subqueries should be avoided as much as possible . If subqueries are unavoidable , Filter out as many rows as possible in the subquery .

4. Eliminating sequential access to large table row data

stay Nested query , Sequential access to tables can have a fatal impact on query efficiency .
For example, sequential access strategy is adopted , A nest 3 Layer query , If every layer queries 1000 That's ok , Then this query is to query 10 Billion data .
The main way to avoid this is to index the joined columns .
for example , Two tables : Student list ( Student number , full name , Age ……) And course schedule ( Student number , Course number , achievement ). If two Tables should be connected , It's going to be in “ Student number ” Index this join field .
Union can also be used to avoid sequential access . Although there are indexes on all the check columns , But some forms of where Clause forces the optimizer to use sequential access .
The following query forces the orders Table execution sequence operation :
SELECT * FROM orders WHERE (customer_num=104 AND order_num>1001) OR

Although in the customer_num and order_num Index built on , However, in the above statement, the optimizer still uses the sequential access path to scan the entire table . Because this statement is retrieving a collection of detached rows , So it should be changed to the following statement :
SELECT * FROM orders WHERE customer_num=104 AND order_num>1001
SELECT * FROM orders WHERE order_num=1008
This allows the query to be processed using the index path .

5. Avoiding difficult normal expressions

MATCHES and LIKE Keywords support wildcard matching , Technically, it's called a normal expression . But this matching is particularly time-consuming . for example :SELECT * FROM customer WHERE
zipcode LIKE “98_ _ _”
Even in zipcode Field is indexed , In this case, sequential scanning is also used . If you change the sentence to SELECT * FROM customer WHERE
zipcode >“98000”, The index is used to query when the query is executed , Obviously, it will greatly increase the speed .
in addition , Also avoid non starting substrings . For example, statements :SELECT * FROM customer WHERE zipcode[2,3] >“80”
, stay where A non starting substring is used in clause , Therefore, the statement does not use an index .

6. Using temporary tables to speed up queries

Sort a subset of the table and create a temporary table , Sometimes it can speed up the query . It helps to avoid multiple sort operations , And there are other ways to simplify the work of the optimizer . for example :
SELECT cust.name,rcvbles.balance,……other COLUMNS
FROM cust,rcvbles
WHERE cust.customer_id = rcvlbes.customer_id
AND rcvblls.balance>0
AND cust.postcode>“98000”
ORDER BY cust.name
If the query is to be executed more than once, but not more than once , You can find all outstanding customers and put them in a temporary file , And sort by customer's name :
SELECT cust.name,rcvbles.balance,……other COLUMNS
FROM cust,rcvbles
WHERE cust.customer_id = rcvlbes.customer_id
AND rcvblls.balance>;0
ORDER BY cust.name
INTO TEMP cust_with_balance
Then query in the temporary table in the following way :
SELECT * FROM cust_with_balance
WHERE postcode>“98000”
There are fewer rows in the temporary table than in the main table , And the physical order is the order required , Reduced disk I/O, Therefore, the query workload can be greatly reduced .
be careful : Changes to the main table will not be reflected after the temporary table is created . In the case of frequent data changes in the main table , Be careful not to lose data .

7. Replacing non sequential access with sorting

Non sequential disk access is the slowest operation , This is reflected in the back and forth movement of the disk access arm .SQL The statement hides this situation , This makes it easy to write queries that require access to a large number of non sequential pages when writing applications .

©2019-2020 Toolsou All rights reserved,
Final review of database : Summary of comprehensive application questions Laplance operator ( Second derivative ) Simple learning of computer composition principle pyqt Button call python program _PyQt: Link button to function in program How much can you go up once you change jobs ? Today, I saw the ceiling of job hopping python in str Function usage _python in str Usage Summary of built-in functions MySQL trigger web The server nginx---linux Installation and deployment C++ Chapter V polymorphism exercises :( It's coming to an end )python Check built-in functions , How to check python Built in function