HDFS file names without the rest of the file details
Get just the list of file names, with their absolute paths, from HDFS.
A typical HDFS listing looks like this:
$ hdfs dfs -ls
-rw-r--r-- 3 foo bar 6346268 2016-12-28 02:52 /user/foo/data/file007.csv
-rw-r--r-- 3 foo bar 4397850 2016-12-28 02:52 /user/foo/data/file014.csv
-rw-r--r-- 3 foo bar 13297361 2016-12-28 02:52 /user/foo/data/file020.csv
-rw-r--r-- 3 foo bar 10400852 2016-12-28 02:53 /user/foo/data/file118.csv
-rw-r--r-- 3 foo bar 10184639 2016-12-28 02:52 /user/foo/data/file205.csv
-rw-r--r-- 3 foo bar 5542293 2016-12-28 02:53 /user/foo/data/file214.csv
-rw-r--r-- 3 foo bar 6085128 2016-12-28 02:53 /user/foo/data/file307.csv
But shell scripts that perform CRUD operations often need just the absolute HDFS file paths, like:
/user/foo/data/file007.csv
/user/foo/data/file014.csv
/user/foo/data/file020.csv
/user/foo/data/file118.csv
/user/foo/data/file205.csv
/user/foo/data/file214.csv
/user/foo/data/file307.csv
One way to achieve this is with awk, printing only the last whitespace-separated field of each line:
$ hdfs dfs -ls | awk '{print $NF}'
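Printing the last field has two rough edges: when a directory is listed, the output normally starts with a "Found N items" header line, which would come out as "items", and a path containing spaces would be cut down to its last word. The following is a minimal sketch of a more defensive filter, not a definitive implementation. The function name hdfs_paths_only is hypothetical, and the sketch assumes the eight-column layout shown above (permissions, replication, owner, group, size, date, time, path).

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): reads `hdfs dfs -ls` output on
# stdin and prints only the path column.
# Assumption: each entry line has 7 metadata fields followed by the path.
hdfs_paths_only() {
    awk '/^[-d]/ {
        # Only entry lines start with "-" (file) or "d" (directory),
        # so the "Found N items" header is skipped.
        line = $0
        # Strip the 7 leading metadata fields; whatever remains is the
        # path, including any spaces it contains.
        for (i = 1; i <= 7; i++) sub(/^[ \t]*[^ \t]+[ \t]+/, "", line)
        print line
    }'
}

# Usage (requires a working HDFS client):
#   hdfs dfs -ls /user/foo/data | hdfs_paths_only
```

Recent Hadoop releases also document a -C option for -ls that prints paths only (hdfs dfs -ls -C); whether it is available depends on the Hadoop version installed, so check the help output of your client before relying on it.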