Main Components of HDFS
1. Data Block (Block)
Files in HDFS are stored as data blocks; the default storage unit is a 128 MB block. In other words, every file stored in HDFS is split into 128 MB chunks for storage. If a file (or its final chunk) is smaller than 128 MB, it is stored at its actual size and does not occupy a full block's worth of space.
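To make the splitting concrete, here is a small Python sketch (not part of HDFS itself; the function name is invented for illustration) that computes how a file of a given size is divided into 128 MB blocks:

```python
BLOCK_SIZE = 128 * 1024 * 1024  # HDFS default block size: 128 MB

def split_into_blocks(file_size: int) -> list[int]:
    """Return the sizes of the blocks a file of `file_size` bytes occupies."""
    if file_size == 0:
        return []
    full_blocks = file_size // BLOCK_SIZE
    remainder = file_size % BLOCK_SIZE
    sizes = [BLOCK_SIZE] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block stores only the actual remainder
    return sizes

# A 300 MB file occupies three blocks: 128 MB + 128 MB + 44 MB
print(split_into_blocks(300 * 1024 * 1024))
```

Note that the 44 MB tail block consumes only 44 MB on disk, matching the point above.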
Why is the HDFS block size set so large? The purpose is to reduce addressing overhead: the more blocks a file is split into, the more time is spent locating those blocks. Of course, the block size is not set arbitrarily large either, since a file with too few blocks limits how much of it can be processed in parallel.
Each HDFS data block has 3 replicas by default, stored on different DataNodes to achieve fault tolerance, so the loss of one replica does not make the block inaccessible. Both the block size and the number of replicas can be changed in the configuration file.
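As an example of that configuration, both settings live in `hdfs-site.xml` (property names `dfs.blocksize` and `dfs.replication`; the values below are illustrative overrides, not recommendations):

```xml
<!-- hdfs-site.xml: example overrides (values are illustrative) -->
<configuration>
  <property>
    <name>dfs.blocksize</name>
    <value>268435456</value> <!-- 256 MB, overriding the 128 MB default -->
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value> <!-- 2 replicas instead of the default 3 -->
  </property>
</configuration>
```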
2. NameNode
The NameNode is where HDFS stores its metadata (metadata being a file's name, size, and location). It holds the metadata of every file and folder in the file-system directory tree, and any metadata change is recorded by the NameNode. Each file in HDFS is split into multiple data blocks, and this file-to-block correspondence is also kept in the directory tree and maintained by the NameNode.
The NameNode also stores the mapping between data blocks and DataNodes: which DataNodes a given block is stored on, and which blocks each DataNode holds.
The NameNode monitors each DataNode through its "heartbeat" and "block report"; DataNodes keep in communication with the NameNode via these heartbeats.
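The two mappings and the heartbeat tracking described above can be sketched as a toy Python model (this is not Hadoop's actual implementation; every class, field, and method name here is invented for illustration):

```python
import time

class ToyNameNode:
    """Toy model of the NameNode's in-memory metadata (illustrative only)."""

    def __init__(self, heartbeat_timeout: float = 30.0):
        self.file_to_blocks = {}      # file path -> ordered list of block IDs
        self.block_to_datanodes = {}  # block ID -> set of DataNode IDs holding it
        self.last_heartbeat = {}      # DataNode ID -> time of last heartbeat
        self.heartbeat_timeout = heartbeat_timeout

    def heartbeat(self, datanode_id: str) -> None:
        # Heartbeats keep the DataNode marked as alive.
        self.last_heartbeat[datanode_id] = time.monotonic()

    def block_report(self, datanode_id: str, block_ids: list[str]) -> None:
        # A block report tells the NameNode which blocks this DataNode stores.
        for block_id in block_ids:
            self.block_to_datanodes.setdefault(block_id, set()).add(datanode_id)

    def add_file(self, path: str, block_ids: list[str]) -> None:
        # File-to-block correspondence, kept in the directory tree.
        self.file_to_blocks[path] = list(block_ids)

    def locate(self, path: str) -> dict:
        # Which DataNodes hold each block of this file?
        return {b: self.block_to_datanodes.get(b, set())
                for b in self.file_to_blocks[path]}

    def live_datanodes(self) -> list[str]:
        # DataNodes whose heartbeat arrived within the timeout window.
        now = time.monotonic()
        return [dn for dn, t in self.last_heartbeat.items()
                if now - t < self.heartbeat_timeout]

nn = ToyNameNode()
nn.heartbeat("dn1"); nn.heartbeat("dn2")
nn.block_report("dn1", ["blk_1", "blk_2"])
nn.block_report("dn2", ["blk_2"])
nn.add_file("/data/log.txt", ["blk_1", "blk_2"])
print(nn.locate("/data/log.txt"))
```

The sketch shows why losing one DataNode need not lose a block: `blk_2` is locatable through either `dn1` or `dn2`.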
3. DataNode
The DataNode is where data is actually stored in HDFS. Clients can request a DataNode to write or read data blocks; the DataNode creates, deletes, and replicates blocks under instruction from the NameNode, and periodically reports its block information to the NameNode.
4. SecondaryNameNode
The SecondaryNameNode is an auxiliary tool that helps the NameNode manage metadata (by periodically merging the edit log into the file-system image checkpoint), so that the NameNode can work quickly and efficiently. It is therefore not a second NameNode.
I am now a junior in a big data program at an ordinary ("double non", i.e. neither 985 nor 211) university. For personal reasons I do not plan to take the postgraduate entrance exam, so I am preparing to focus on the Hadoop ecosystem and on big data algorithms such as machine learning. I bought Zhang Weiyang's book "Hadoop Big Data Technology Development Practice" and hope to record my daily learning progress by blogging. Anyone interested is welcome to learn along with me and make progress together!