Category Archives: Hadoop – Commands

Tracking YARN logs

Create a script to fetch YARN logs:

$ vim hadoop_logs.sh

#!/bin/bash
APPLICATION_ID=$1
CONTAINER_ID=$2
NODE_ADDRESS=$3
if [ $# -eq 1 ]; then
  yarn logs -applicationId ${APPLICATION_ID}
elif [ $# -eq 3 ]; then
  yarn logs -applicationId ${APPLICATION_ID} -containerId ${CONTAINER_ID} -nodeAddress ${NODE_ADDRESS}
else
  echo "you must specify 1 or 3 arguments <hlogs applicationId containerId nodeAddress>"
fi

Create a… Read More »
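The same 1-or-3-argument dispatch can be sketched in Python. This is a hypothetical helper, not part of the post: it only builds the `yarn logs` argument list (the `yarn_logs_cmd` name is mine), so the logic can be checked without a cluster.

```python
# Hypothetical sketch: build a `yarn logs` command line from either
# 1 argument (applicationId) or 3 (applicationId, containerId, nodeAddress),
# mirroring the bash script's if/elif/else dispatch.
def yarn_logs_cmd(args):
    if len(args) == 1:
        return ["yarn", "logs", "-applicationId", args[0]]
    if len(args) == 3:
        app_id, container_id, node_address = args
        return ["yarn", "logs", "-applicationId", app_id,
                "-containerId", container_id, "-nodeAddress", node_address]
    # same contract as the script's error branch
    raise ValueError("specify 1 or 3 arguments: applicationId [containerId nodeAddress]")
```

The returned list can be handed to `subprocess.run` on a node where the `yarn` CLI is installed.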

Search for a pattern in HDFS files – python script

Problem: search for a pattern in HDFS files and return the names of the files that contain it. For example, below are our input files:

$ vim log1.out

[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
[Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied by server configuration: /export/home/live/ap/htdocs/test
[Wed Oct 11… Read More »
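One way to approach this (a sketch under my own assumptions, not the post's actual script): fetch each file's contents with `hadoop fs -cat` and do the matching in plain Python. The function and parameter names here (`files_matching`, `read_file`) are illustrative; the reader hook makes the matching logic testable without a cluster.

```python
# Hypothetical sketch: return the HDFS paths whose contents contain `pattern`.
# By default file contents are fetched by shelling out to `hadoop fs -cat`,
# which assumes the Hadoop CLI is on PATH; `read_file` can be overridden.
import subprocess

def files_matching(pattern, paths, read_file=None):
    if read_file is None:
        read_file = lambda p: subprocess.run(
            ["hadoop", "fs", "-cat", p],
            capture_output=True, text=True, check=True,
        ).stdout
    # keep a path if the pattern appears anywhere in its contents
    return [p for p in paths if pattern in read_file(p)]
```

For the example above, `files_matching("client denied", ["/logs/log1.out", "/logs/log2.out"])` would report only the files containing that error line.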

Hadoop / HDFS Commands

A few useful Hadoop commands.

Uncompress a .gz file from HDFS to HDFS:

$hadoop fs -text /hdfs_path/compressed_file.gz | hadoop fs -put - /hdfs_path/uncompressed-file.txt

To uncompress while copying from local to HDFS directly:

$gunzip -c filename.txt.gz | hadoop fs -put - /user/dc-user/filename.txt

Hadoop commands for reporting purposes:

$hdfs fsck /hdfs_path
$hdfs fsck /hdfs_path -files -locations
$hadoop… Read More »
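The pipelines above can also be driven from Python. This is a hypothetical sketch, not from the post: `pipe` is a generic two-process pipeline runner (testable with ordinary commands), and `hdfs_uncompress` merely assembles the same `hadoop fs -text … | hadoop fs -put - …` pipeline, assuming the `hadoop` CLI is on PATH.

```python
# Hypothetical sketch of `producer | consumer` in Python, used to mirror
# `hadoop fs -text src.gz | hadoop fs -put - dst` from the post.
import subprocess

def pipe(producer, consumer):
    """Run `producer | consumer`; return the consumer's exit code."""
    p1 = subprocess.Popen(producer, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(consumer, stdin=p1.stdout)
    p1.stdout.close()  # so the producer sees SIGPIPE if the consumer exits early
    p2.communicate()
    p1.wait()
    return p2.returncode

def hdfs_uncompress(src_gz, dst_txt):
    # `hadoop fs -text` decompresses .gz on the fly; `-put -` reads from stdin
    return pipe(["hadoop", "fs", "-text", src_gz],
                ["hadoop", "fs", "-put", "-", dst_txt])
```

`pipe` works for any two commands, e.g. `pipe(["gunzip", "-c", "filename.txt.gz"], ["hadoop", "fs", "-put", "-", "/user/dc-user/filename.txt"])` reproduces the local-to-HDFS variant.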