Hello everyone, long time no see! Today's installment of "Database Systems: Teaching and Learning" covers storage management in a database system.
Never seen such a cool title for a database tutorial? You're right, the title really is that cool.

My sister Xiaoburi is 18 years old and is regarded as the campus goddess: excellent grades, good at every sport, gentle, honest, and kind. However, only I know that the Xiaoburi who shines in everyone's eyes used to put on a hamster cape and roll around all over the place, doing nothing but eating, sleeping, and playing. The transformation all started that night.

Since then, Xiaoburi has often asked me to help her with her coursework. Today she wants to learn about storage management in a database system. This tutorial covers storage management through my conversation with Xiaoburi.

<> Storage Media

<> The Memory Hierarchy

<> Primary Storage

* Data in primary storage can be operated on directly by CPUs using load/store
* Registers
* Cache
* Main memory
Main memory is byte-addressable.

* Secondary Storage
CPUs cannot operate directly on data in secondary storage; it must first be copied into primary storage using read/write
* Magnetic disks / hard disk drives (HDD)
* Flash memory / solid-state drives (SSD)
Secondary storage is block-addressable and online.

* Tertiary Storage
Data in tertiary storage must first be copied into secondary storage
* Magnetic tape
* Optical disks
* Network storage
Tertiary storage is block-addressable and offline.

* Access Time

* Data Transfer Between Levels
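
Because secondary storage is block-addressable, reaching a single byte means copying the whole block that contains it into primary storage first. Here is a minimal sketch of that idea. The 4096-byte `BLOCK_SIZE`, the helper names, and the in-memory `io.BytesIO` object standing in for a disk are all assumptions made for illustration.

```python
import io

BLOCK_SIZE = 4096  # assumed block size; real devices and systems vary


def read_block(device, block_no):
    """Copy one whole block from the (simulated) device into memory."""
    device.seek(block_no * BLOCK_SIZE)
    return device.read(BLOCK_SIZE)


def read_byte(device, byte_addr):
    """Fetch a single byte: the entire enclosing block is transferred first."""
    block_no, offset = divmod(byte_addr, BLOCK_SIZE)
    block = read_block(device, block_no)
    return block[offset]


# A fake 4-block "disk" held in memory, filled with a repeating 0..255 pattern.
disk = io.BytesIO(bytes(range(256)) * 64)
value = read_byte(disk, 5000)  # lives in block 1, at offset 904
```

Even though only one byte is wanted, `read_block` moves 4096 bytes, which is why the DBMS organizes its data in block-sized units.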

<>Categories of Storage Media

* Volatile Storage
Data in volatile storage is lost when the computer restarts (after a shutdown or crash)

- Primary storage

* Non-volatile Storage
Data in non-volatile storage is retained when the computer restarts

- Secondary and tertiary storage

* Non-volatile main memory is emerging!

<>Phase-Change Memory (PCM)

* PCM stores data using phase-change materials
* Amorphous phase: high resistivity → 0
* Crystalline phase: low resistivity → 1
* The phase is set by a current pulse
* Fast cooling → amorphous
* Slow cooling → crystalline
* Characteristics of NVM

* Magnetic Disks
A magnetic disk consists of two components:
Disk assembly: sectors ⊂ tracks ⊂ cylinders

Head assembly: disk heads and disk arms

<>Database Pages

* A page is a fixed-size block of data (512 B–16 KB)
* It can contain tuples, metadata, indexes, log records, …
* Most DBMSs do not mix page types
* Some DBMSs require a page to be self-contained
* Each page is given a unique identifier, its page ID
* The DBMS uses an indirection layer to map page IDs to physical locations
* Page Layout
Every page contains a header of metadata about the page's contents:
- Page size
- DBMS version
- Transaction visibility
- Compression information

* Tuple-Oriented Page Layout
The most common page layout scheme is called slotted pages.
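
To make slotted pages more concrete, here is a minimal sketch. It assumes the usual arrangement in which a slot array grows forward from the page header while tuple data grows backward from the end of the page. The 4096-byte page size, the two-field header, and the `SlottedPage` class are illustrative assumptions, not the format of any particular DBMS.

```python
import struct

PAGE_SIZE = 4096                 # assumed page size
HEADER = struct.Struct("<HH")    # (number of slots, offset where free space ends)
SLOT = struct.Struct("<HH")      # (tuple offset, tuple length)


class SlottedPage:
    def __init__(self):
        self.data = bytearray(PAGE_SIZE)
        HEADER.pack_into(self.data, 0, 0, PAGE_SIZE)

    def insert(self, payload: bytes) -> int:
        """Store a tuple and return its slot number."""
        n_slots, free_end = HEADER.unpack_from(self.data, 0)
        slot_pos = HEADER.size + n_slots * SLOT.size   # where the new slot entry goes
        new_free_end = free_end - len(payload)         # tuples grow from the page end
        if new_free_end < slot_pos + SLOT.size:
            raise ValueError("page full")
        self.data[new_free_end:free_end] = payload
        SLOT.pack_into(self.data, slot_pos, new_free_end, len(payload))
        HEADER.pack_into(self.data, 0, n_slots + 1, new_free_end)
        return n_slots

    def read(self, slot_no: int) -> bytes:
        """Follow the slot entry to the tuple bytes."""
        n_slots, _ = HEADER.unpack_from(self.data, 0)
        if not 0 <= slot_no < n_slots:
            raise IndexError("no such slot")
        offset, length = SLOT.unpack_from(self.data, HEADER.size + slot_no * SLOT.size)
        return bytes(self.data[offset:offset + length])


page = SlottedPage()
first = page.insert(b"alice")
second = page.insert(b"bob")
```

The slot number returned by `insert` stays valid even if the tuple bytes are later moved inside the page, because only the slot entry needs updating.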

<>Record ID

To keep track of individual tuples, the DBMS assigns each tuple a unique record identifier.

The most common record ID is (PageID, Slot#):

* PageID: the ID of the page containing the tuple
* Slot#: the number of the slot in which the tuple is stored within that page
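
A record ID can be pictured as a small two-field value. The sketch below is illustrative only: `RecordID`, `fetch`, and the dictionary of pages are assumed names, and each page is simplified to a plain list of slots.

```python
from typing import NamedTuple


class RecordID(NamedTuple):
    page_id: int   # which page holds the tuple
    slot: int      # which slot inside that page


# Hypothetical storage: page ID -> list of tuples, indexed by slot number.
pages = {
    7: [b"alice", b"bob"],
    8: [b"carol"],
}


def fetch(pages, rid: RecordID) -> bytes:
    """Resolve a record ID in two steps: find the page, then the slot."""
    return pages[rid.page_id][rid.slot]


bob = fetch(pages, RecordID(page_id=7, slot=1))
```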

<>Large Values

Most DBMSs don’t allow a tuple to exceed the size of a single page

To store values that are larger than a page, the DBMS uses separate overflow
storage pages
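
One way to picture overflow pages is as a chain: the value is cut into page-sized chunks, and each overflow page records the ID of the next one. This is only a sketch under assumptions. The tiny 64-byte `PAGE_SIZE`, the `OverflowStore` class, and the in-memory dictionary of pages are invented for illustration, and real systems differ in how they link and locate overflow pages.

```python
PAGE_SIZE = 64  # deliberately tiny so the example needs several pages


class OverflowStore:
    def __init__(self):
        self.pages = {}      # page ID -> (chunk, ID of next overflow page or None)
        self._next_id = 0

    def write(self, value: bytes):
        """Split a large value across overflow pages; return the first page's ID."""
        chunks = [value[i:i + PAGE_SIZE] for i in range(0, len(value), PAGE_SIZE)]
        next_id = None
        # Build the chain back to front so every page knows its successor.
        for chunk in reversed(chunks):
            page_id = self._next_id
            self._next_id += 1
            self.pages[page_id] = (chunk, next_id)
            next_id = page_id
        return next_id

    def read(self, first_page_id) -> bytes:
        """Follow the chain and reassemble the value."""
        out = bytearray()
        page_id = first_page_id
        while page_id is not None:
            chunk, page_id = self.pages[page_id]
            out += chunk
        return bytes(out)


store = OverflowStore()
big_value = b"x" * 150
head = store.write(big_value)   # the tuple itself would keep only this page ID
```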

<>Log-Structured Page Layout

Instead of storing tuples in pages, the DBMS only appends log records to the
file of how the database was modified

Inserts store the entire tuple

Deletes mark the tuple as deleted

Updates contain the delta of just the attributes that were modified

To read a tuple, the DBMS scans the log backwards and “recreates” the tuple to
find what it needs

Build indexes to allow the DBMS to jump to locations in the log

Periodically compact the log

* Log size
* System load
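
The log-structured layout described above can be sketched in a few lines. The `LogStore` class, its record format `(operation, key, payload)`, and the simple `compact` method are assumptions for illustration. The sketch keeps the log in a Python list rather than a file and omits the index.

```python
class LogStore:
    def __init__(self):
        self.records = []   # append-only log of (operation, key, payload)

    def insert(self, key, tup):
        self.records.append(("INSERT", key, dict(tup)))     # entire tuple

    def update(self, key, delta):
        self.records.append(("UPDATE", key, dict(delta)))   # changed attributes only

    def delete(self, key):
        self.records.append(("DELETE", key, None))          # deletion marker

    def read(self, key):
        """Scan the log backwards and rebuild the tuple from newest to oldest."""
        result = {}
        for op, k, payload in reversed(self.records):
            if k != key:
                continue
            if op == "DELETE":
                return None
            for attr, value in payload.items():
                result.setdefault(attr, value)   # newer values win over older ones
            if op == "INSERT":
                return result                    # reached the original tuple
        return None

    def compact(self):
        """Rewrite the log with a single INSERT per live tuple."""
        keys = dict.fromkeys(k for _, k, _ in self.records)
        live = [(k, self.read(k)) for k in keys]
        self.records = [("INSERT", k, t) for k, t in live if t is not None]


db = LogStore()
db.insert(1, {"name": "Xiaoburi", "age": 17})
db.update(1, {"age": 18})
db.insert(2, {"name": "temp", "age": 0})
db.delete(2)
current = db.read(1)
```

Reads get slower as the log grows because more records must be scanned, which is why compaction and indexes matter for this layout.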

<> Disk File Organization

<>Heap File Organization

* A heap file is an unordered collection of pages in which tuples are
stored in random order
Create/get/write/delete pages

Must also support iterating over all pages

Need meta-data to keep track of what pages exist and which ones have free space

Two ways to represent a heap file

Approach #1: Linked lists

Approach #2: Page directory
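
The page directory approach can be sketched as a small table that remembers where each page lives and how much free space it has. Everything named here (`HeapFile`, its methods, the 4096-byte page size, and the `io.BytesIO` object standing in for a file on disk) is an assumption for illustration. In a real system the directory itself would also be stored in pages.

```python
import io

PAGE_SIZE = 4096  # assumed page size


class HeapFile:
    def __init__(self, file):
        self.file = file
        self.directory = {}       # page ID -> [file offset, free bytes]
        self._next_page_id = 0

    def create_page(self) -> int:
        page_id = self._next_page_id
        self._next_page_id += 1
        offset = self.file.seek(0, io.SEEK_END)   # append at the end of the file
        self.file.write(bytes(PAGE_SIZE))
        self.directory[page_id] = [offset, PAGE_SIZE]
        return page_id

    def get_page(self, page_id) -> bytes:
        offset, _ = self.directory[page_id]
        self.file.seek(offset)
        return self.file.read(PAGE_SIZE)

    def write_page(self, page_id, data: bytes, free_bytes: int):
        if len(data) != PAGE_SIZE:
            raise ValueError("a page must be exactly PAGE_SIZE bytes")
        entry = self.directory[page_id]
        self.file.seek(entry[0])
        self.file.write(data)
        entry[1] = free_bytes

    def delete_page(self, page_id):
        # This sketch only forgets the page; it does not reclaim the file space.
        del self.directory[page_id]

    def page_ids(self):
        """Support iterating over all pages."""
        return list(self.directory)

    def find_page_with_space(self, needed: int):
        for page_id, (_, free) in self.directory.items():
            if free >= needed:
                return page_id
        return None


heap = HeapFile(io.BytesIO())
a = heap.create_page()
b = heap.create_page()
heap.write_page(a, b"\x01" * PAGE_SIZE, free_bytes=0)   # page a is now full
target = heap.find_page_with_space(100)
```

The directory is also the indirection layer mentioned earlier: callers use page IDs, and only the directory knows the physical offsets.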

<> System Catalogs

A DBMS stores meta-data about databases in its internal catalogs

* Tables, columns, indexes, views
* Users, permissions
* Internal statistics
Almost every DBMS stores a database's catalog in the database itself

You can query the DBMS's internal INFORMATION_SCHEMA catalog to get
info about the database

* ANSI standard set of read-only views that provide info about all of the
tables, views, columns, and procedures in a database
DBMSs also have non-standard shortcuts to retrieve this information
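
As a small runnable illustration, the sketch below uses Python's built-in `sqlite3` module. SQLite does not provide the INFORMATION_SCHEMA views, so this example shows the non-standard route instead: the `sqlite_master` catalog table and the `PRAGMA table_info` command. The `student` table and its index are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_student_name ON student(name)")

# The catalog is itself a table that can be queried with ordinary SQL.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Column metadata; each row is (cid, name, type, notnull, dflt_value, pk).
columns = [row[1] for row in conn.execute("PRAGMA table_info(student)")]
```

On systems that do implement the standard views, such as PostgreSQL or MySQL, the equivalent would be a query along the lines of `SELECT table_name FROM information_schema.tables`.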

<> Buffer Management

<> Buffer Pool

The available memory region is partitioned into an array of fixed-size pages,
which are collectively called the buffer pool

The pages in the buffer pool are called frames

* Buffer Manager
The buffer manager is responsible for bringing pages into the buffer pool as needed
* The buffer manager decides which existing page in the buffer pool to replace
to make space for the new page (if the buffer pool is full)
* Buffer Pool Internals: Page Table
The page table keeps track of pages that are currently in the buffer pool.

* Page Requests
1 Check the page table to see if some frame contains the requested page P
2 If P is in the buffer pool, pin page P, i.e. increment the pin count of the frame containing P
3 Return the pointer of the frame containing P
Example: Request page #2

* Buffer Pool Internals: Frame's Meta-Data
The buffer manager maintains two variables for each frame:
pin count: the number of times that the page currently in the frame has been requested but not released, i.e. the number of current users of the page

dirty: whether the page in the frame has been modified since it was brought into the buffer pool

<>Page Requests (Cont’d)

If the requested page is not in the buffer pool:

* Choose a frame with pin count = 0 for replacement, using the replacement policy, and increment its pin count
* If the dirty bit of the replacement frame is set, write the page it contains to disk
* Read the requested page into the replacement frame
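
Putting the page table, pin counts, dirty bits, and both page-request cases together, a buffer pool might be sketched as follows. The `Frame` and `BufferPool` classes, the dictionary standing in for the disk, and the "first unpinned frame" victim choice are all assumptions for illustration. The text does not name a replacement policy, and real systems typically use something like LRU or clock.

```python
class Frame:
    def __init__(self):
        self.page_id = None
        self.data = None
        self.pin_count = 0    # number of current users of the page
        self.dirty = False    # modified since it was brought into the pool?


class BufferPool:
    def __init__(self, disk, n_frames):
        self.disk = disk                          # page ID -> bytes (simulated disk)
        self.frames = [Frame() for _ in range(n_frames)]
        self.page_table = {}                      # page ID -> frame index

    def request(self, page_id) -> Frame:
        idx = self.page_table.get(page_id)
        if idx is None:                           # page is not in the buffer pool
            idx = self._choose_victim()
            frame = self.frames[idx]
            if frame.page_id is not None:
                if frame.dirty:                   # write back before reuse
                    self.disk[frame.page_id] = frame.data
                del self.page_table[frame.page_id]
            frame.page_id = page_id
            frame.data = self.disk[page_id]       # read the requested page
            frame.dirty = False
            self.page_table[page_id] = idx
        frame = self.frames[idx]
        frame.pin_count += 1                      # pin the page
        return frame

    def release(self, page_id, dirty=False):
        frame = self.frames[self.page_table[page_id]]
        frame.pin_count -= 1
        frame.dirty = frame.dirty or dirty

    def _choose_victim(self) -> int:
        # Placeholder policy: take the first frame whose pin count is 0.
        for idx, frame in enumerate(self.frames):
            if frame.pin_count == 0:
                return idx
        raise RuntimeError("all frames are pinned")


disk = {1: b"one", 2: b"two", 3: b"three"}
pool = BufferPool(disk, n_frames=2)
f1 = pool.request(1)
f2 = pool.request(2)
f1.data = b"ONE"                  # modify page 1 while it is pinned
pool.release(1, dirty=True)       # unpin and mark dirty
f3 = pool.request(3)              # evicts page 1 and writes it back first
```

A pinned frame is never chosen as a victim, so a caller must release a page once it is done with it, otherwise the pool eventually runs out of replaceable frames.
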
<> Summary

Jokes aside, learning is no laughing matter.

This article introduced storage management in database systems, including storage media, file organization and page representation, and the buffer pool. Keep these key points in mind as you study.
