Hadoop Monitoring

 

Hadoop is an open source software framework designed for distributed storage and distributed processing of big data (very large data sets). Hadoop's architecture consists of a storage part and a processing part. Hadoop splits files into large blocks and distributes them among the nodes in the cluster. The processing part transfers tasks to the nodes so that they run in parallel, taking advantage of data locality (nodes manipulating the data they have on hand) for faster, more efficient processing. Applications Manager's Hadoop Monitor supports both Hadoop 1.x and Hadoop 2.x. It helps you maintain the overall health and availability of your distributed Hadoop cluster so that tasks are processed quickly and accurately.

 

Hadoop Server - Monitored Parameters

Go to the Monitors Category View by clicking the Monitors tab. Click on Hadoop under the Services table. Displayed is the Hadoop bulk configuration view distributed into three tabs:

Click on the monitor name to see all the server details listed under the following tabs.


 

  • Hadoop Version 1.x
  • Hadoop Version 2.x

Hadoop 1.x

Overview:

 

SAFEMODE
Safemode status Current safemode status. Possible values:
  • Operational
  • Safemode
DFS
Total DFS Capacity (in GB) Total storage capacity of HDFS.
NonDFS Used Space (in GB) Storage space used by data that was not written using DFS commands (non-HDFS data).
DFS Used Space (in GB) Storage space used by data written using DFS commands (HDFS data).
DFS Used (in %) Percentage of HDFS storage space used.
DFS Free Space (in GB) Free storage space in HDFS.
DFS Free (in %) Percentage of free storage space in HDFS.
BLOCKS
Block Capacity Total block capacity of Hadoop.
Total Blocks Total number of blocks in Hadoop.
Missing Blocks Number of missing blocks in Hadoop.
Corrupt Blocks Number of corrupt blocks in Hadoop.
Excess Blocks Number of excess blocks in Hadoop.
UnderReplicated Blocks Number of under-replicated blocks in Hadoop.
Pending Deletion Blocks Number of blocks pending deletion in Hadoop.
Pending Replication Blocks Number of blocks pending replication in Hadoop.
FILES
Total Files and Directories Total number of files and directories in HDFS.
Files and Directories created per sec Number of files and directories created per second.
LOAD
Total Load Total load over the Hadoop service.
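
The percentage rows in the DFS group follow from the capacity figures. The following is a minimal sketch, assuming that DFS Used (in %) and DFS Free (in %) are both taken relative to Total DFS Capacity. The function name `dfs_percentages` and the sample figures are illustrative and are not part of Applications Manager.

```python
def dfs_percentages(capacity_gb, dfs_used_gb, dfs_free_gb):
    """Return (used %, free %) relative to total DFS capacity.

    Assumption: both percentages use Total DFS Capacity as the denominator.
    """
    if capacity_gb <= 0:
        # No capacity reported (for example, no datanodes); avoid dividing by zero.
        return 0.0, 0.0
    used_pct = round(dfs_used_gb / capacity_gb * 100, 2)
    free_pct = round(dfs_free_gb / capacity_gb * 100, 2)
    return used_pct, free_pct


# Hypothetical figures: 1000 GB capacity, 250 GB DFS used, 50 GB non-DFS used.
capacity, dfs_used, non_dfs_used = 1000.0, 250.0, 50.0
dfs_free = capacity - dfs_used - non_dfs_used  # space still available to HDFS
print(dfs_percentages(capacity, dfs_used, dfs_free))
```

With these sample figures the two percentages do not add up to 100, because the space taken by non-DFS data counts toward neither.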

 


HDFS:

 

NameNode JVM
NonHeap Memory Committed Non-heap memory currently committed for use.
NonHeap Memory Used Non-heap memory currently in use.
Heap Memory Committed Heap memory currently committed for use.
Heap Memory Used Heap memory currently in use.
NameNode OS
Total Physical Memory (in GB) Total RAM of the namenode.
Free Physical Memory (in GB) Free RAM of the namenode.
Total Swap Space (in GB) Total swap space in the namenode OS.
Free Swap Space (in GB) Free swap space in the namenode OS.
Maximum File Descriptor Count Maximum number of file descriptors that can be open.
Open File Descriptor Count Number of file descriptors currently open.
Average System Load Average system load of the namenode OS.
DataNodes
Node Name Name of the datanode.
State Current state of the datanode:
  • Live
  • Dead
  • Decommissioned
Used Space (in GB) Space used in HDFS.


MapReduce:

 

Tracker Summary
Total TaskTracker Total number of tasktrackers.
Alive TaskTracker Number of tasktrackers in the alive state.
Blacklisted TaskTracker Number of tasktrackers in the blacklisted state.
Graylisted TaskTracker Number of tasktrackers in the graylisted state.
Total Number of Jobs Total number of jobs executed in MapReduce.
Slots Summary
Total Map Slots Total map slots capacity in mapreduce.
Used Map Slots Number of map slots used currently.
Total Reduce Slots Total reduce slots capacity in mapreduce.
Used Reduce Slots Number of reduce slots used currently.
TaskTrackers
TaskTracker Name Name of the tasktracker.
State Current state of the tasktracker:
  • Alive
  • Blacklisted
  • Graylisted
  • Dead
Health Current health state of the tasktracker:
  • OK
  • <health error message>
Failure Count Number of failures in the tasktracker.
Queue
Queue Name Name of the queue.
State Current state of queue.
Info Any error information that is thrown from queue. 


Job:

 

Jobs Summary


Jobs Submitted Number of jobs in submitted state.
Jobs Preparing Number of jobs in preparing state.
Jobs Running Number of jobs in running state.
Jobs Failed Number of jobs in failed state.
Jobs Killed Number of jobs in killed state.
Jobs Completed Number of jobs in completed state.
Completed Percent (in %) Percentage of completed jobs.
Killed Percent (in %) Percentage of killed jobs.
Failed Percent (in %) Percentage of failed jobs.
Jobs Stats (in last polling interval)
Submitted jobs count Number of jobs submitted in the last polling interval.
Failed jobs count Number of jobs that failed in the last polling interval.
Killed jobs count Number of jobs killed in the last polling interval.
Completed jobs count Number of jobs completed in the last polling interval.
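
One way to picture the "in last polling interval" counts is as the difference between two consecutive polls of running totals. The sketch below assumes the underlying counters are cumulative. How Applications Manager actually derives these figures is not described here, and `interval_counts` and the sample values are hypothetical.

```python
def interval_counts(previous, current):
    """Return how much each cumulative counter grew since the previous poll."""
    deltas = {}
    for name, value in current.items():
        # A counter missing from the previous poll is treated as starting at 0.
        delta = value - previous.get(name, 0)
        # A negative delta suggests the counter was reset (for example, after a
        # service restart); fall back to the current value in that case.
        deltas[name] = delta if delta >= 0 else value
    return deltas


# Hypothetical totals from two consecutive polls.
previous_poll = {"submitted": 120, "failed": 4, "killed": 1, "completed": 110}
current_poll = {"submitted": 131, "failed": 5, "killed": 1, "completed": 119}
print(interval_counts(previous_poll, current_poll))
```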



Hadoop 2.x


Overview:

 

SAFEMODE
Safemode status Current safemode status. Possible values:
  • Operational
  • Safemode
DFS
Total DFS Capacity (in GB) Total storage capacity of HDFS.
NonDFS Used Space (in GB) Storage space used by data that was not written using DFS commands (non-HDFS data).
DFS Used Space (in GB) Storage space used by data written using DFS commands (HDFS data).
DFS Used (in %) Percentage of HDFS storage space used.
DFS Free Space (in GB) Free storage space in HDFS.
DFS Free (in %) Percentage of free storage space in HDFS.
BLOCKS
Block Capacity Total block capacity of Hadoop.
Total Blocks Total number of blocks in Hadoop.
Missing Blocks Number of missing blocks in Hadoop.
Corrupt Blocks Number of corrupt blocks in Hadoop.
Excess Blocks Number of excess blocks in Hadoop.
UnderReplicated Blocks Number of under-replicated blocks in Hadoop.
Pending Deletion Blocks Number of blocks pending deletion in Hadoop.
Pending Replication Blocks Number of blocks pending replication in Hadoop.
FILES
Total Files and Directories Total number of files and directories in HDFS.
Files and Directories created per sec Number of files and directories created per second.
LOAD
Total Load Total load over the Hadoop service.


HDFS:

DataNode Summary
Live Datanodes Number of datanodes in the live state.
Dead Datanodes Number of datanodes in the dead state.
Live-Decommissioned Datanodes Number of datanodes that are live but decommissioned.
Dead-Decommissioned Datanodes Number of datanodes that are dead and decommissioned.
Decommissioning Datanodes Number of datanodes currently being decommissioned.
Stale Datanodes Number of datanodes in the stale state.
Live Datanode Percent (in %) Percentage of datanodes in the live state.
Dead Datanode Percent (in %) Percentage of datanodes in the dead state.
DataNodes
Node Name Name of the datanode.
State Current state of the datanode:
  • Live
  • Decommission In Progress
  • Live - Decommissioned
  • Dead - Decommissioned
  • Dead
Total Capacity (in GB) Total capacity of the HDFS.
NonDFS Used (in GB) Storage space used by non-HDFS data.
DFS Used (in GB) Storage space used by HDFS data.
DFS Used Percent (in %) Percentage of storage space used by HDFS data.
DFS Free (in GB) Free storage space in HDFS.
DFS Free Percent (in %) Percentage of free storage space in HDFS.
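
The DataNode Summary figures can be pictured as a roll-up of the State column in the DataNodes table. The sketch below assumes that the percentages are taken over all listed datanodes and that decommissioned nodes are counted apart from plain Live and Dead nodes. Neither assumption is confirmed by this page, and `datanode_summary` is a hypothetical helper.

```python
from collections import Counter


def datanode_summary(states):
    """Summarize a list of datanode state strings.

    Assumption: percentages are taken over all listed datanodes, and
    decommissioned nodes are counted separately from plain Live/Dead nodes.
    """
    counts = Counter(states)
    total = len(states)

    def percent(n):
        # Guard against an empty list so we never divide by zero.
        return round(n / total * 100, 2) if total else 0.0

    return {
        "live": counts["Live"],
        "dead": counts["Dead"],
        "live_percent": percent(counts["Live"]),
        "dead_percent": percent(counts["Dead"]),
    }


# Hypothetical State column of a five-node cluster.
sample_states = ["Live", "Live", "Live", "Dead", "Decommission In Progress"]
print(datanode_summary(sample_states))
```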



YARN:

NodeManager Summary
Active NodeManagers Number of nodemanagers in the active state.
Decommissioned NodeManagers Number of nodemanagers in the decommissioned state.
Lost NodeManagers Number of nodemanagers in the lost state.
UnHealthy NodeManagers Number of nodemanagers in the unhealthy state.
Rebooted NodeManagers Number of nodemanagers in the rebooted state.
Active NodeManager Percent (in %) Percentage of nodemanagers in the active state.
Lost NodeManager Percent (in %) Percentage of nodemanagers in the lost state.
UnHealthy NodeManager Percent (in %) Percentage of nodemanagers in the unhealthy state.
NodeManager
HostName Hostname of the nodemanager.
Rack Rack to which the nodemanager belongs.
State Current state of the nodemanager:
  • Running
  • Unhealthy
  • Dead
Memory used (in %) Percentage of main memory used by the nodemanager.
Version Version of the nodemanager.
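
The per-node rows lend themselves to simple health checks. The following is a minimal sketch of one such check, which flags nodemanagers that are not Running or whose memory use exceeds a limit. The dictionary keys, the 90% threshold, and the function `nodes_needing_attention` are all hypothetical and do not reflect how Applications Manager evaluates alarms.

```python
def nodes_needing_attention(nodes, memory_threshold_pct=90.0):
    """Return (hostname, reasons) pairs for nodemanagers that look unhealthy."""
    flagged = []
    for node in nodes:
        reasons = []
        if node["state"] != "Running":
            reasons.append("state is " + node["state"])
        if node["memory_used_pct"] > memory_threshold_pct:
            reasons.append("memory used above threshold")
        if reasons:
            flagged.append((node["hostname"], reasons))
    return flagged


# Hypothetical rows shaped like the NodeManager table above.
sample_nodes = [
    {"hostname": "nm1", "state": "Running", "memory_used_pct": 42.0},
    {"hostname": "nm2", "state": "Unhealthy", "memory_used_pct": 10.0},
    {"hostname": "nm3", "state": "Running", "memory_used_pct": 95.5},
]
for hostname, reasons in nodes_needing_attention(sample_nodes):
    print(hostname, "-", "; ".join(reasons))
```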



Applications:

Applications

Apps Submitted Number of applications in submitted state.
Apps Completed Number of applications in completed state.
Apps Pending Number of applications in pending state.
Apps Running Number of applications in running state.
Apps Failed Number of applications in failed state.
Apps Killed Number of applications in killed state.
Percent Completed (in %) Percentage of completed applications.
Percent Killed (in %) Percentage of killed applications.
Percent Failed (in %) Percentage of failed applications.
Applications Stats (in last polling interval)
Submitted apps count Number of applications submitted in the last polling interval.
Failed apps count Number of applications that failed in the last polling interval.
Killed apps count Number of applications killed in the last polling interval.
Completed apps count Number of applications completed in the last polling interval.
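
The percentage rows in the Applications table can be derived from the counts above them. The sketch below assumes that each percentage is taken relative to the number of submitted applications, which this page does not state explicitly. The field names in the sample mirror those returned by the YARN ResourceManager cluster metrics REST resource (`appsSubmitted`, `appsCompleted`, `appsFailed`, `appsKilled`), but whether Applications Manager reads that resource is also an assumption.

```python
def application_percentages(metrics):
    """Return completed/killed/failed percentages of submitted applications.

    Assumption: the denominator is the number of submitted applications.
    """
    submitted = metrics.get("appsSubmitted", 0)

    def percent(field):
        # With nothing submitted there is nothing to take a percentage of.
        if submitted <= 0:
            return 0.0
        return round(metrics.get(field, 0) / submitted * 100, 2)

    return {
        "percent_completed": percent("appsCompleted"),
        "percent_killed": percent("appsKilled"),
        "percent_failed": percent("appsFailed"),
    }


# Hypothetical counts: 200 submitted, 1 pending and 3 running are left over.
sample_metrics = {
    "appsSubmitted": 200,
    "appsCompleted": 180,
    "appsPending": 1,
    "appsRunning": 3,
    "appsFailed": 12,
    "appsKilled": 4,
}
print(application_percentages(sample_metrics))
```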
