- 2020-06-04 00:34
*views 3*- C++( Data structure and algorithm )

One , Index sequential access method

* When the dictionary is long enough , When it can reside in memory ,AVL Tree and red black tree can guarantee good time performance .

For large dictionaries that must be stored on disk （ External dictionary or file ）, Higher degree search tree is needed to improve the performance of dictionary operation . Before looking at a search tree like this , First look at the index order access method of the external dictionary

（ISAM）. This method has good time performance for both sequential and random access

The concept of block

* stay ISAM In the method , The available disk space is divided into many blocks

* A block is the smallest unit of disk space used to store data

* Blocks generally have the same length as tracks , And can complete input or output in one search and delay

* Dictionary elements are stored in blocks in ascending order . And the blocks are organized in one order , Minimize latency from one block to another

Sequential access

* On sequential access , Block input in sequence , Elements in each block are searched in ascending order

* If each block contains m Elements , The number of disk accesses required to search each element is 1/m

Random access （ISAM Index technology ）

* To support random access , We need to maintain an index . Index contains the maximum key for each block . thus , The number of keywords in the index is the same as the number of blocks

, And each block can store many elements （ Namely m Values are usually large ）, So the index is enough to reside in memory

* For random access, the keywords are k Elements of , First, look up the index table , Determine the block to which the element belongs , And then take the corresponding fast out of the disk for internal search , To determine the element

* result , One disk access is enough to complete one random access

ISAM Application in big dictionary

* This technique can be extended to a larger dictionary , The dictionary is stored on several disks . The elements are assigned to different disks in ascending order , Elements in one disk are allocated to different blocks in ascending order

* Each disk has a block index , It holds the largest key for each block in the disk . in addition , There is also a disk index , It holds the largest keyword for each disk . Disk indexes usually reside in memory

* For random access to an element ：

* The first step is to search for disk indexes that reside in memory , To determine the disk to which the element belongs

* Then take the block index from the disk and search , To determine the block to which the element belongs

* Finally, remove the block from the disk for internal search , To determine the element

* thus , One random access requires two disk accesses （ One is to fetch the block index , The other is to take out the block ）

* ISAM Method is essentially an array description method , therefore , When inserting and deleting , There will be big problems . In order to partially relieve the pressure of the problem , You can reserve some space in each block ：

* When inserting a small number of elements , No need to move elements between blocks

* allied , After deletion , Can keep the empty space , You don't have to move elements between blocks to save space

* use ISAM Methods to solve these problems , But sequential access costs more

. Elements stored in any order , The index of each keyword should be checked , All elements are accessed through this index . For data stored on disk ,B- Tree is a data structure suitable for index method

Two ,m Cross search tree

* m A cross search tree can be an empty tree . If not empty , Then the following characteristics must be satisfied ：

* ① In the corresponding extended search tree （ That is, the search tree obtained after replacing the null pointer with an external node ）, Each internal node can have at most m Children and 1~m-1 Elements （ External nodes do not contain elements and children ）

* ② Each contains p All nodes of elements have p+1 Children

* ③ For any one containing p Nodes of elements ：

* set up K1,K2,......Kp The keywords of these elements , Order these elements , Namely K1<K2<......<Kp

* set up C0,C1,......Cp Is the node's p+1 Children

* In order to C0 The keywords of the elements in the subtree for the root are less than K1

* In order to Ci In the root subtree ,Ki< Key of element <K(i+1). among 1<=i<p

* In order to Cp The key of the element in the subtree of the root is greater than Kp

* In definition m When searching the tree , It's useful to include external nodes , But in the actual code , External nodes do not need to be specifically described , Just replace it with a null pointer

* Below is a seven prong search tree , Where black represents black node , The others are internal nodes . For example, the root node has two elements and three children

m Search of cross search tree

* method ： Start at root , Compare with keyword in node , If it is larger or smaller than the keywords in the node , Search in the corresponding subtree . Until it's found or can't be found

* For example, find keywords 31：

* Starting from the root node ,31 be located 10 and 80 between , Then go to the 2 Search among children

* From the 2 When children search , Based on the child , find 31 be located 30 And 40 between , So I went to the first 3 Search among children

* Come to No 3 After a child , Discovery ratio 32 Still small , Then go to the 1 Among the children , Found empty , therefore 31 non-existent

m Insertion of cross search tree

* method ：

Insert from root , If a node has free space and is larger than all the child nodes of the space , Then insert it at the corresponding position of the node ; If it is smaller than the element in a child node , Then move down ... and so on

* For example, insert keyword 31：

* Insert root node , Found unable to insert 10 and 80 between , Because it's smaller than some child nodes

* Then move to No 2 Children's Office , But it can't be inserted in 30 and 40 between , Because it is smaller than the element value in the child node

* Then come to the child node again , Discovery can be found in 32 Insert left of , So it's inserted in the 32 To the left of

* For example, insert keyword 65：

* Start at root , Found unable to insert 10 and 80 between , Because it's smaller than some child nodes

* Then move to No 2 Children's Office , Found can be inserted in 60 and 70 between , Then insert successfully

m Deletion of cross search tree

* method ：

* If there is no child node , Then delete it directly

* If there are children （ Left child / Right child ）： Instead of deleting this node, you can use the largest element in the left child or the smallest element in the right child （ If the child node element exists ）

* For example, delete keyword 20：

* No child found for this element , Then you can delete it directly . At this time, the node becomes [30,40,50,60,70]

* For example, delete keyword 84：

* No child found for this element , It can also be deleted directly . At this time, the node becomes [82,86,88]

* For example, delete keyword 5：

* After node 1 Elements , Then the delete node becomes empty

* But the left child is not empty , Then you can use the largest element of the left child （4） To replace the deleted node

* For example, delete keyword 10：

* This element has two children

* Then you can use the largest element of the left child （5） To replace

* You can also use the smallest element in the right child （20） To replace

* But if the child node moves up, it also needs to recursively consider the deletion of the child node

m Height of cross search tree

* A height of h Of m Cross search tree （ Without external nodes ） It's better to have h Elements （ One node per layer , Each node contains an element ）, Up to elements

* This is how the upper limit is calculated ： from 1 reach h-1 layer , Each node contains m Children , The first h The nodes of the layer have no children , The number of nodes is , And each node has at most m-1 element , So the number of elements is

* Because at a height of h Of m In the cross search tree , Number of elements in h To , So a tree n Element's m The height of the cross search tree is from to n between

* for example , A height of 5 Of 200 How many elements can be contained in the cross search tree , At least 5. same , A tree with elements 200 Cross search tree , Its height is 5 To

* When the search tree is stored on disk , search , insert , The time of deletion depends on the number of accesses to the disk （ Assume that the size of each node is no larger than one disk block ）. When h When it's the height of the tree , This number is O(h), therefore ,

To reduce disk access , Make sure the height of the tree is close to , So we need to make use of it m Cross search tree

Three ,m rank B- tree

* m rank B- A tree is a tree m Cross search tree . If B- Tree is not empty , Then there are the following rules ：

* ① The root node has at least 2 Children

* ② Except root node , All internal nodes have at least m/2 Children

* ③ All external nodes are on the same layer

Full binary tree

* In the second order B- In the tree , Each internal node will not have 2 More than children , And each internal node has at least 2 Children , So the second order B- All internal nodes of the tree happen to have 2 Children

* Because all external nodes are on the same layer , So the second order B- The tree is full of binary trees

* therefore , To an integer h, Only if the number of elements is , Such a tree exists

* Next, a seven step tree B- tree ：

* The root node has 3 Children ： Meet rules ①

* Internal nodes have at least 3, Meet rules ②

* All external nodes are on the same layer , Meet rules ③

* second order B- tree （man Binary tree ）

*

Technology

- Java377 articles
- Python197 articles
- Linux110 articles
- MySQL83 articles
- Vue78 articles
- SpringBoot68 articles
- Spring61 articles
- javascript56 articles
- more...

Daily Recommendation

views 4

©2019-2020 Toolsou All rights reserved,

TP6 Application examples of verifier and correct verification data Unity Scene loading asynchronously （ Implementation of loading interface ） Gude Haowen serial - You deserve to be an engineer （ Preface ） Bitcoin in ten years ,VDS Opportunity or fraud Huawei certification HCIA-AI artificial intelligence Image explanation of over fitting and under fitting Python Basic knowledge and notes ESP8266/ESP32 System : Optimize system startup time First knowledge python Skills summary use C++ I want to talk to you “ Prototype mode ” （ copy / copy constructor ）