preface : stay mysql When designing tables ,mysql It is officially recommended not to use uuid Or discontinuous and non repeated snowflakes id(long Shape and unique ), Instead, continuous self incrementing primary keys are recommended id, The official recommendation is auto_increment, So why not recommend it uuid, use uuid What are the disadvantages ? In this blog, let's analyze this problem , Discuss the internal reasons .

<> one :mysql And program instances

1.1: To explain this problem , Let's create three tables first , namely user_auto_key,user_uuid,user_random_key, Indicates the primary key of automatic growth ,uuid As primary key , random key As primary key , Other things remain the same . According to the control variable method , We only use different strategies to generate the primary key of each table , The other fields are exactly the same , Then test the insertion speed and query speed of the table :

notes : Random here key In fact, it refers to the discontinuity and non repetition calculated by the snowflake algorithm Irregular id: a string 18 Bit length long value

id Auto generate table :

user uuid surface

Random primary key table :

<>1.2: Theory alone is not enough , Go directly to the program , use spring of jdbcTemplate To realize the additional check test :

Technical framework :springboot+jdbcTemplate+junit+hutool, The principle of the program is to connect its own test database , Then write the same amount of data in the same environment , Let's analyze it insert Insert the time to integrate its efficiency , In order to achieve the most realistic effect , All data are generated randomly , Like name , mailbox , Addresses are randomly generated , Program uploaded from gitee, The address is at the bottom of the text .
package com.wyq.mysqldemo; import cn.hutool.core.collection.CollectionUtil;
import com.wyq.mysqldemo.databaseobject.UserKeyAuto; import
com.wyq.mysqldemo.databaseobject.UserKeyRandom; import
com.wyq.mysqldemo.databaseobject.UserKeyUUID; import
com.wyq.mysqldemo.diffkeytest.AutoKeyTableService; import
com.wyq.mysqldemo.diffkeytest.RandomKeyTableService; import
com.wyq.mysqldemo.diffkeytest.UUIDKeyTableService; import
com.wyq.mysqldemo.util.JdbcTemplateService; import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired; import
org.springframework.boot.test.context.SpringBootTest; import
org.springframework.util.StopWatch; import java.util.List; @SpringBootTest
class MysqlDemoApplicationTests { @Autowired private JdbcTemplateService
jdbcTemplateService; @Autowired private AutoKeyTableService
autoKeyTableService; @Autowired private UUIDKeyTableService
uuidKeyTableService; @Autowired private RandomKeyTableService
randomKeyTableService; @Test void testDBTime() { StopWatch stopwatch = new
StopWatch(" implement sql Time consumption "); /** * auto_increment key task */ final String insertSql =
"INSERT INTO user_key_auto(user_id,user_name,sex,address,city,email,state)
VALUES(???????)"; List<UserKeyAuto> insertData =
autoKeyTableService.getInsertData(); stopwatch.start(" Automatic generation key Table task start "); long
start1 = System.currentTimeMillis(); if (CollectionUtil.isNotEmpty(insertData))
{ boolean insertResult = jdbcTemplateService.insert(insertSql, insertData,
false); System.out.println(insertResult); } long end1 =
System.currentTimeMillis(); System.out.println("auto key Time consumed :" + (end1 -
start1)); stopwatch.stop(); /** * uudID of key */ final String insertSql2 =
"INSERT INTO user_uuid(id,user_id,user_name,sex,address,city,email,state)
VALUES(????????)"; List<UserKeyUUID> insertData2 =
uuidKeyTableService.getInsertData(); stopwatch.start("UUID of key Table task start "); long
begin = System.currentTimeMillis(); if (CollectionUtil.isNotEmpty(insertData))
{ boolean insertResult = jdbcTemplateService.insert(insertSql2, insertData2,
true); System.out.println(insertResult); } long over =
System.currentTimeMillis(); System.out.println("UUID key Time consumed :" + (over -
begin)); stopwatch.stop(); /** * Random long value key */ final String insertSql3 =
"INSERT INTO user_random_key(id,user_id,user_name,sex,address,city,email,state)
VALUES(????????)"; List<UserKeyRandom> insertData3 =
randomKeyTableService.getInsertData(); stopwatch.start(" Random long value key Table task start ");
Long start = System.currentTimeMillis(); if
(CollectionUtil.isNotEmpty(insertData)) { boolean insertResult =
jdbcTemplateService.insert(insertSql3, insertData3, true);
System.out.println(insertResult); } Long end = System.currentTimeMillis();
System.out.println(" random key Task consumption time :" + (end - start)); stopwatch.stop(); String
result = stopwatch.prettyPrint(); System.out.println(result); }
<>1.3: Program write results

user_key_auto Write results :

user_random_key Write results :

user_uuid Table write results :

<>1.4: Efficiency test results

When the existing data volume is 130W When : Let's test the insertion again 10w data , See what happens :

It can be seen from the data volume 100W Around the time ,uuid The insertion efficiency is at the bottom , And it is added in the following order 130W Data of ,uudi The time has plummeted again . The overall efficiency ranking of time occupation is :auto_key>random_key>uuid,uuid Lowest efficiency , In case of large amount of data , Efficiency plummets . So why does this happen ? With questions , Let's discuss this problem :

<> two : use uuid And self increase id Index structure comparison of

<>2.1: Use self increment id Internal structure of

The values of self incremented primary keys are sequential , therefore Innodb Store every record behind a record . When the maximum fill factor of the page is reached (innodb The default maximum fill factor is page size 15/16, Will set aside 1/16 Space for future
modify ):

① The next record will be written to the new page , Once the data is loaded in this order , The primary key page will be almost filled with records in sequence , Increased the maximum fill rate of the page , There will be no waste of pages

② The newly inserted row must be next to the original maximum data row ,mysql Fast positioning and addressing , No additional consumption will be made for calculating the location of new rows

③ Reduce page splitting and fragmentation

<>2.2: use uuid Index internal structure of

because uuid Self increase of relative order id It's irregular for me , The value of the new row does not have to be greater than the value of the previous primary key , therefore innodb It is impossible to always insert new rows at the end of the index , Instead, we need to find a new suitable location for the new line to allocate new space . This process requires a lot of extra operations , Disordered data will lead to scattered data distribution , Will cause the following problems :

①: The written target page is likely to have been flushed to disk and removed from the cache , Or it has not been loaded into the cache ,innodb You have to find and read the target page from disk to memory before inserting , This will lead to a large number of random IO

②: Because the writing is out of order ,innodb Have to do page splitting operations frequently , To allocate space for new rows , Page splitting causes large amounts of data to be moved , At least three pages need to be modified for one insertion

③: Due to frequent page splitting , Pages become sparse and filled irregularly , Eventually, the data will be fragmented

Putting random values (uuid And snowflakes id) Load into clustered index (innodb Default index type ) in the future , Sometimes it needs to be done once OPTIMEIZE
TABLE To rebuild the table and optimize the filling of pages , This will take some time .

conclusion : use innodb As far as possible, the primary key should be inserted in the self increasing order , And try to insert new rows by monotonically increasing the value of clustering key

<>2.3: Use self increment id Disadvantages of

Then use self increasing id Is there no harm at all ? Not at all , Self accretion id There will also be the following problems :

①: Once someone crawls your database , According to the self increment of the database id Get your business growth information , It's easy to analyze your business situation

②: For highly concurrent loads ,innodb When inserting by primary key, it will cause obvious lock contention , The upper bound of the primary key will become the hot spot of contention , Because all the insertions happen here , Concurrent inserts can cause gap lock contention

③:Auto_Increment The lock mechanism will cause the preemption of self incrementing locks , There is a certain performance loss

attach :Auto_increment Lock contention problem of , If you want to improve, you need to tune innodb_autoinc_lock_mode Configuration of

<> three : summary

This blog first asks questions from the beginning , Form creation to use jdbcTemplate To test different id The performance of the generation strategy in the large amount of data insertion , Then analyzed id The mechanism is different in mysql Index structure and advantages and disadvantages of , Explain in depth why uuid And random non repetition id Performance loss in data insertion , Explained this problem in detail . In actual development, it is still based on mysql It is best to use self augmentation as the official recommendation of id,mysql broad and profound , There are still many points worth optimizing inside that we need to learn .

©2019-2020 Toolsou All rights reserved,
Solve in servlet The Chinese output in is a question mark C String function and character function in language MySQL management 35 A small coup optimization Java performance —— Concise article Seven sorting algorithms (java code ) use Ansible Batch deployment SSH Password free login to remote host according to excel generate create Build table SQL sentence Spring Source code series ( sixteen )Spring merge BeanDefinition Principle of Virtual machine installation Linux course What are the common exception classes ?