Main components of HDFS

1. Data block (Block)

Files in HDFS are stored as data blocks. The default basic storage unit is a 128 MB block; in other words, every file stored in HDFS is split into 128 MB pieces. If a file (or its last piece) is smaller than 128 MB, it is stored at its actual size and does not occupy the space of a full block.
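To make the splitting rule concrete, here is a minimal sketch that computes the block sizes a file would occupy. The helper name `split_into_blocks` is my own and is not part of any HDFS API; only the 128 MB default comes from the text above.

```python
MB = 1024 * 1024
DEFAULT_BLOCK_SIZE = 128 * MB  # default block size stated in the text


def split_into_blocks(file_size, block_size=DEFAULT_BLOCK_SIZE):
    """Return the sizes (in bytes) of the blocks a file would be split into.

    Hypothetical helper for illustration only; not an HDFS API.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block holds only the leftover bytes, not a full block.
        sizes.append(remainder)
    return sizes


if __name__ == "__main__":
    # A 300 MB file: two full blocks plus a 44 MB tail.
    print([s // MB for s in split_into_blocks(300 * MB)])  # [128, 128, 44]
    # A 10 MB file: a single block of its actual size.
    print([s // MB for s in split_into_blocks(10 * MB)])   # [10]
```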

Why are HDFS data blocks so large? The purpose is to reduce addressing overhead: the more blocks a file is split into, the more time is spent locating them. The block size should not be set too large either, since very large blocks leave fewer pieces that can be processed in parallel.
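A rough back-of-the-envelope calculation shows why larger blocks reduce addressing overhead. The numbers below (10 ms to locate a block, 100 MB/s transfer rate) are assumptions chosen for illustration, not measurements of any real cluster, and the function name is hypothetical.

```python
import math


def addressing_overhead(file_mb, block_mb, seek_ms=10.0, transfer_mb_per_s=100.0):
    """Estimate the share of read time spent locating blocks.

    seek_ms and transfer_mb_per_s are assumed, illustrative values.
    Returns (number of blocks, fraction of total time spent addressing).
    """
    num_blocks = math.ceil(file_mb / block_mb)
    seek_s = num_blocks * seek_ms / 1000.0     # one lookup per block
    transfer_s = file_mb / transfer_mb_per_s   # time to stream the data itself
    return num_blocks, seek_s / (seek_s + transfer_s)


if __name__ == "__main__":
    for block_mb in (1, 16, 128):
        blocks, fraction = addressing_overhead(1024, block_mb)
        print(f"{block_mb:>4} MB blocks: {blocks:>5} blocks, "
              f"{fraction:.1%} of time spent addressing")
```

Under these assumptions, a 1 GB file in 1 MB blocks spends about half of its read time on addressing, while 128 MB blocks bring that below one percent.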

Each HDFS data block has 3 replicas by default, stored on different DataNodes for fault tolerance, so losing one replica of a block does not affect access to that block. Both the block size and the number of replicas can be changed in the configuration file (the dfs.blocksize and dfs.replication properties in hdfs-site.xml).
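The fault-tolerance idea can be sketched with a toy placement function. This is a simplified round-robin placement that I made up for illustration; real HDFS placement also takes racks into account, and none of these function names exist in Hadoop.

```python
def place_replicas(block_ids, datanodes, replication=3):
    """Assign each block to `replication` distinct DataNodes, round-robin.

    Simplified, hypothetical placement; not the actual HDFS policy.
    """
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        # Consecutive offsets modulo the node count give distinct nodes.
        placement[block] = [
            datanodes[(i + r) % len(datanodes)] for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return [
        block for block, nodes in placement.items()
        if any(node not in failed for node in nodes)
    ]


if __name__ == "__main__":
    nodes = ["dn1", "dn2", "dn3", "dn4"]
    placement = place_replicas(["blk_1", "blk_2", "blk_3"], nodes)
    print(placement)
    # With 3 replicas, every block survives the loss of any two DataNodes.
    print(readable_blocks(placement, failed_nodes=["dn1", "dn2"]))
```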


2. NameNode

NameNode is where HDFS stores its metadata (metadata here means file names, sizes, and storage locations). It keeps the metadata of all files and directories in a file system directory tree and records every metadata change. Each file in HDFS is split into multiple data blocks, and this correspondence between files and blocks is also stored in the directory tree and maintained by the NameNode.

The NameNode also stores the mapping between data blocks and DataNodes. This mapping records which DataNodes each block is stored on and which blocks each DataNode holds.
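The metadata described above can be pictured as a few lookup tables. The class below is only an in-memory sketch with names of my own choosing (`ToyNameNode`, `record_replica`, `locate`); it is not how the real NameNode is implemented.

```python
from collections import defaultdict


class ToyNameNode:
    """In-memory sketch of the metadata described above (hypothetical)."""

    def __init__(self):
        self.file_to_blocks = {}                     # file path -> ordered block IDs
        self.block_to_datanodes = defaultdict(set)   # block ID -> DataNodes holding it
        self.datanode_to_blocks = defaultdict(set)   # DataNode -> block IDs it holds

    def add_file(self, path, block_ids):
        """Record which blocks make up a file, in order."""
        self.file_to_blocks[path] = list(block_ids)

    def record_replica(self, block_id, datanode):
        """Record that a DataNode holds a replica, updating both directions."""
        self.block_to_datanodes[block_id].add(datanode)
        self.datanode_to_blocks[datanode].add(block_id)

    def locate(self, path):
        """Return [(block_id, sorted DataNodes)] for a file, in block order."""
        return [
            (block, sorted(self.block_to_datanodes[block]))
            for block in self.file_to_blocks[path]
        ]


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.add_file("/logs/a.txt", ["blk_1", "blk_2"])
    nn.record_replica("blk_1", "dn1")
    nn.record_replica("blk_1", "dn2")
    nn.record_replica("blk_2", "dn2")
    print(nn.locate("/logs/a.txt"))
    print(sorted(nn.datanode_to_blocks["dn2"]))
```

Keeping the mapping in both directions makes both questions cheap to answer: where a block lives, and what a given DataNode holds.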

The NameNode monitors each DataNode's "heartbeat" and "block report"; DataNodes stay in communication with the NameNode through these heartbeats.
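Heartbeat monitoring can be sketched as remembering when each DataNode was last heard from. The class name and the 30-second timeout below are assumptions for illustration, not HDFS defaults; timestamps are passed in explicitly so the example stays deterministic.

```python
class HeartbeatMonitor:
    """Track the last heartbeat time of each DataNode (hypothetical sketch)."""

    def __init__(self, timeout_s=30.0):
        self.timeout_s = timeout_s  # assumed value, not an HDFS default
        self.last_seen = {}         # DataNode name -> time of last heartbeat

    def heartbeat(self, datanode, now):
        """Record a heartbeat received from `datanode` at time `now` (seconds)."""
        self.last_seen[datanode] = now

    def live_nodes(self, now):
        """DataNodes heard from within the timeout window."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout_s
        )

    def dead_nodes(self, now):
        """DataNodes whose last heartbeat is older than the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=30.0)
    monitor.heartbeat("dn1", now=0)
    monitor.heartbeat("dn2", now=0)
    monitor.heartbeat("dn1", now=25)  # dn2 has gone silent
    print(monitor.live_nodes(now=40))  # ['dn1']
    print(monitor.dead_nodes(now=40))  # ['dn2']
```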


3. DataNode

DataNode is where HDFS data is actually stored. Clients can ask a DataNode to write or read data blocks. DataNodes also create, delete, and replicate blocks under the NameNode's instructions, and periodically report their block information to the NameNode.
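The DataNode's duties listed above map naturally onto a small class. This is a toy model that keeps blocks in a dictionary; the class and method names are hypothetical and do not correspond to Hadoop's real interfaces.

```python
class ToyDataNode:
    """Toy model of a DataNode that keeps blocks in memory (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block ID -> block contents as bytes

    def write_block(self, block_id, data):
        """Store a block, as requested by a client or during replication."""
        self.blocks[block_id] = bytes(data)

    def read_block(self, block_id):
        """Return the contents of a stored block."""
        return self.blocks[block_id]

    def delete_block(self, block_id):
        """Remove a block; a no-op if the block is not present."""
        self.blocks.pop(block_id, None)

    def copy_block_to(self, block_id, other):
        """Replicate one of this node's blocks to another DataNode."""
        other.write_block(block_id, self.blocks[block_id])

    def block_report(self):
        """List the block IDs this node holds, as it would report them."""
        return sorted(self.blocks)


if __name__ == "__main__":
    dn1, dn2 = ToyDataNode("dn1"), ToyDataNode("dn2")
    dn1.write_block("blk_1", b"hello")
    dn1.write_block("blk_2", b"world")
    dn1.copy_block_to("blk_1", dn2)
    dn1.delete_block("blk_2")
    print(dn1.block_report(), dn2.block_report())
```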


4. SecondaryNameNode

SecondaryNameNode is really an auxiliary tool. It helps the NameNode manage metadata so that the NameNode can work quickly and efficiently; it is not a second NameNode.
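The text does not spell out how this help works. In Hadoop, the SecondaryNameNode periodically merges the NameNode's log of metadata changes (the edit log) into a snapshot of the namespace (the fsimage), which is called a checkpoint. The sketch below shows only that merge idea; the data layout and the function name `checkpoint` are simplifications of my own.

```python
def checkpoint(fsimage, edit_log):
    """Apply logged edits to a copy of the namespace snapshot and return it.

    fsimage:  dict mapping file path -> list of block IDs (simplified snapshot)
    edit_log: list of (operation, path, block_ids) tuples (simplified log)
    Hypothetical sketch; the real on-disk formats are very different.
    """
    merged = dict(fsimage)  # leave the original snapshot untouched
    for operation, path, block_ids in edit_log:
        if operation == "create":
            merged[path] = list(block_ids)
        elif operation == "delete":
            merged.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return merged


if __name__ == "__main__":
    old_image = {"/a.txt": ["blk_1"]}
    edits = [
        ("create", "/b.txt", ["blk_2", "blk_3"]),
        ("delete", "/a.txt", None),
    ]
    print(checkpoint(old_image, edits))  # {'/b.txt': ['blk_2', 'blk_3']}
```

After a checkpoint, the merged snapshot can replace the old one and the applied edits can be discarded, so the log that has to be replayed stays short.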

I am currently a junior majoring in big data at a "double-non" (non-elite) university. For personal reasons I do not plan to take the postgraduate entrance exam, so I am going to focus on the Hadoop ecosystem and on big data algorithms such as machine learning. I bought Mr. Zhang Weiyang's book "Hadoop Big Data Technology Development Practice" and hope to record my daily learning progress by writing this blog. Anyone interested is welcome to learn and improve together!

©2019-2020 Toolsou. All rights reserved.