Preface

In the previous article, we learned how to perform common operations on the HDFS file system with `hdfs` commands from the command line. This article shares how to operate the HDFS file system through the Java API.

Prerequisites

* Make sure the Hadoop services are started on the server.
* If your local machine is a Windows environment, you need to configure the Hadoop environment variables locally.

Configuring the Hadoop environment variables locally:

1. Download a Hadoop package of the same version as the one running on the server.

2. Add its path to the system environment variables.
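As a sketch, the variables usually end up looking like this on Windows (the install path below is a hypothetical example, not one from this article):

```
HADOOP_HOME=D:\hadoop-3.1.3
PATH=%PATH%;%HADOOP_HOME%\bin
```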

Setting up the coding environment

Use IDEA to quickly create a Spring Boot project.

1. Import the Maven dependencies:

```xml
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>3.1.3</version>
</dependency>
<dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.12</version>
</dependency>
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-log4j12</artifactId>
    <version>1.7.30</version>
</dependency>
```
2. To make it easier to see log output, add a log4j.properties file under the resources directory of the Spring Boot project. Note that the root logger only produces output if the referenced appender is fully defined, so the appender class and layout lines were added below to complete the configuration:

```properties
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
# Pattern for an optional file appender; to use it, also define
# log4j.appender.logfile (class and File) and add "logfile" to the root logger.
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n
```
With all the preparation done and the coding environment ready, let's start operating on HDFS files through the specific APIs.

API usage

1. Create an HDFS directory

```java
public class HdfsClientTest {

    static Configuration configuration = null;
    static FileSystem fs = null;

    static {
        configuration = new Configuration();
        configuration.set("dfs.client.use.datanode.hostname", "true");
        try {
            fs = FileSystem.get(new URI("hdfs://IP:9000"), configuration, "hadoop");
        } catch (IOException | InterruptedException | URISyntaxException e) {
            e.printStackTrace();
        }
    }

    /**
     * Create a directory
     */
    public static void mkDir(String dirName) {
        try {
            fs.mkdirs(new Path(dirName));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        // Create the directory
        mkDir("/songguo");
        fs.close();
    }
}
```
Run this program, then check on the web UI whether the directory was created successfully.

2. Upload a file to an HDFS directory

```java
/**
 * Upload a file to HDFS
 */
public static void uploadFile(String localPath, String hdfsPath) {
    try {
        fs.copyFromLocalFile(new Path(localPath), new Path(hdfsPath));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

public static void main(String[] args) throws Exception {
    // Create the directory
    //mkDir("/songguo");
    // Upload a file to HDFS
    uploadFile("E:\\haha.txt", "/songguo");
    fs.close();
}
```

Run the program and check on the web UI whether the haha.txt file now exists under the /songguo directory.

3. Download a file from HDFS to the local machine

```java
/**
 * Download a file from HDFS to the local file system
 */
public static void loadFileFromDfs(String localPath, String hdfsPath) {
    try {
        fs.copyToLocalFile(false, new Path(hdfsPath), new Path(localPath), false);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

public static void main(String[] args) throws Exception {
    // Create the directory
    //mkDir("/songguo");
    // Upload a file to HDFS
    //uploadFile("E:\\haha.txt", "/songguo");
    // Download a file from HDFS to the local machine
    loadFileFromDfs("E:\\haha_1.txt", "/songguo/haha.txt");
    fs.close();
}
```
Run this program and check whether haha_1.txt was downloaded to the E: drive.

4. Delete an HDFS file

```java
/**
 * Delete an HDFS file
 * @param hdfsPath   file path
 * @param recuDelete whether to delete recursively
 */
public static void deleteFile(String hdfsPath, boolean recuDelete) {
    try {
        fs.delete(new Path(hdfsPath), recuDelete);
    } catch (IOException e) {
        e.printStackTrace();
    }
}

public static void main(String[] args) throws Exception {
    // Delete the file
    deleteFile("/songguo/haha.txt", false);
    fs.close();
}
```
Run the program and check on the web UI whether the file under the /songguo directory has been deleted.

5. Rename an HDFS file

```java
/**
 * Rename a file
 * @param sourceFilePath
 * @param targetFilePath
 */
public static void renameFile(String sourceFilePath, String targetFilePath) {
    try {
        fs.rename(new Path(sourceFilePath), new Path(targetFilePath));
    } catch (IOException e) {
        e.printStackTrace();
    }
}

public static void main(String[] args) throws Exception {
    // Rename the file
    renameFile("/qinguo/haha.txt", "/qinguo/haha_rename.txt");
    fs.close();
}
```
There is a haha.txt file under the /qinguo directory; we rename it by running the code above.

6. Move and rename an HDFS file at the same time

This uses the same API as above: rename still does the job. For example, to move /qinguo/haha_rename.txt to the /sanguo directory, you only need to change the parameters passed in:

```java
// Rename while moving
renameFile("/qinguo/haha_rename.txt", "/sanguo/haha.txt");
```

The above covers renaming and moving individual files; the same applies to HDFS directories.

7. Viewing file information

List the files under a directory:

```java
public static void main(String[] args) throws Exception {
    // Recursively list the files under the root directory
    RemoteIterator<LocatedFileStatus> files = fs.listFiles(new Path("/"), true);
    while (files.hasNext()) {
        // Detailed information of each file under the directory
        LocatedFileStatus fileStatus = files.next();
        System.out.println("============== file information ==============");
        System.out.println("Path: " + fileStatus.getPath());
        System.out.println("Name: " + fileStatus.getPath().getName());
        System.out.println("Permissions: " + fileStatus.getPermission());
        System.out.println("Owner: " + fileStatus.getOwner());
        System.out.println("Group: " + fileStatus.getGroup());
        System.out.println("Size: " + fileStatus.getLen());
        System.out.println("Modification time: " + fileStatus.getModificationTime());
        System.out.println("Replication: " + fileStatus.getReplication());
        System.out.println("Block size: " + fileStatus.getBlockSize());
    }
    fs.close();
}
```
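Note that getModificationTime() returns epoch milliseconds and getBlockSize() returns bytes, which are hard to read raw. A small pure-Java sketch of the conversion (the two values below are made-up examples, not output from a real cluster):

```java
import java.time.Instant;

public class HdfsValueFormatting {
    public static void main(String[] args) {
        // Hypothetical value as returned by fileStatus.getModificationTime()
        long modificationTime = 1_700_000_000_000L;
        System.out.println(Instant.ofEpochMilli(modificationTime)); // 2023-11-14T22:13:20Z

        // Hypothetical value as returned by fileStatus.getBlockSize() (the HDFS default)
        long blockSize = 134_217_728L;
        System.out.println(blockSize / (1024 * 1024) + " MB"); // 128 MB
    }
}
```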

Of course, HDFS exposes richer file information than this; interested readers can consult the official documentation.

8. Distinguishing HDFS files from directories

From our understanding of HDFS, files and folders are different things. Let's see how to tell them apart with the API:

```java
public static void main(String[] args) throws Exception {
    FileStatus[] fileStatuses = fs.listStatus(new Path("/"));
    for (FileStatus fileStatus : fileStatuses) {
        boolean directory = fileStatus.isDirectory();
        if (directory) {
            System.out.println(fileStatus.getPath().getName() + " is a directory");
        }
        boolean file = fileStatus.isFile();
        if (file) {
            System.out.println(fileStatus.getPath().getName() + " is a file");
        }
    }
    fs.close();
}
```
As you can see, my root directory contains 3 directories and one file; run the program to check whether it classifies them correctly.

Through the above, we have covered the common operations on the HDFS file system via the Java API, which are also what we deal with most often at work; further topics can be studied on this foundation.

Pitfalls encountered while integrating the Java client

In fact, implementing this in IDEA was not so smooth; I ran into quite a few pitfalls. Here are some of them so that you can steer around them.

1. The program fails immediately with a connection error

* Make sure the address in the Configuration is consistent with the DataNode configuration of your HDFS deployment.

* If you are on Alibaba Cloud or Tencent Cloud, fill in the intranet address in hdfs-site [this is not recommended in production environments]; that is what I did anyway.

* For the user configured in the hdfs-site configuration file, it is better to also pass that user in the program (the third argument of FileSystem.get in the code above).

2. Files can be uploaded to the HDFS directory, but they arrive empty, and the console reports an error

The key part of the error output is the following:

```
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
/songguo/haha.txt could only be written to 0 of the 1 minReplication nodes.
There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
```

There are many answers online, but they mostly say the same thing: re-run the format command, delete the NameNode and DataNode data, and restart the service. In fact, all that is needed is to add the following line to the Configuration in the code:

```java
configuration.set("dfs.client.use.datanode.hostname", "true");
```
Here is the analysis of the root cause:

The NameNode stores the file directory tree, that is, the folders and file names, and it can be reached from the local machine over the public network, which is why creating folders works. When uploading a file, however, the client must write the data to a DataNode. The NameNode and DataNode communicate over the LAN, so the address the NameNode returns is the DataNode's private IP, which is unreachable from the local machine.

Since the returned address cannot be the public IP, the NameNode can only return the hostname instead; the client can then reach the DataNode through a mapping from that hostname to the public address, and the problem is solved.
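One way to provide that hostname-to-public-address mapping on the client machine is a hosts-file entry; the IP address and hostname below are placeholders, not values from this article:

```
# /etc/hosts on Linux/macOS, or C:\Windows\System32\drivers\etc\hosts on Windows
203.0.113.10   node1    # hypothetical DataNode hostname mapped to its public IP
```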

Because settings made in code have the highest priority, we simply set it directly in code.

©2019-2020 Toolsou All rights reserved,