2015-09-22 19:25:04

The problem: in /usr/local/hadoop-2.7.1 I can run ./sbin/start-dfs.sh, and the namenode and datanode processes are started on all nodes. jps shows that DataNode, NameNode and SecondaryNameNode are clearly running. But they are not binding to any port: neither netstat nor "lsof -i" shows any TCP port in use besides the SSH port.

Running any hdfs command gets me a "connection refused" error.

Apache Hadoop has a nice wiki page about how any networking problems with connection refused are not their problem, but mine. Okaaaay.

I searched for hours: was it my DNS / IP settings, was it one of the configuration files, was it running on Raspbian?

No. In the end, I found a posting on StackOverflow:

http://stackoverflow.com/questions/17392531/namenode-appears-to-hang-on-start

The reason for the Hadoop services not binding to any port / refusing all connections was: the process was hanging because of an old version of the Google Guava jar.
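
Before replacing anything, it can help to see which Guava jars an installation actually bundles. The following is only a minimal sketch: the helper name list_guava_jars is made up for illustration, and the default path is simply the one from my setup.

```shell
#!/bin/bash
# list_guava_jars ROOT: print every bundled guava-*.jar below ROOT, sorted.
# (Hypothetical helper name, not part of Hadoop.)
list_guava_jars() {
    local root="$1"
    find "$root" -type f -name 'guava-*.jar' | sort
}

# Assumed install path from this post; override HADOOP_SHARED if yours differs.
HADOOP_SHARED="${HADOOP_SHARED:-/usr/local/hadoop-2.7.1/share/hadoop}"
if [ -d "$HADOOP_SHARED" ]; then
    list_guava_jars "$HADOOP_SHARED"
fi
```

On my nodes this would be expected to list the guava-11.0.2.jar copies under the common, hdfs, httpfs, kms, tools and yarn directories.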

So I wrote a fix to download and install a more current version:

#!/bin/bash
# Abort on the first error, so the old jars are not removed if the download fails.
set -e

cd /tmp
wget http://central.maven.org/maven2/com/google/guava/guava/18.0/guava-18.0.jar
HADOOP_SHARED=/usr/local/hadoop-2.7.1/share/hadoop

# Remove the bundled Guava 11.0.2 jars.
rm -f "${HADOOP_SHARED}/common/lib/guava-11.0.2.jar"
rm -f "${HADOOP_SHARED}/hdfs/lib/guava-11.0.2.jar"
rm -f "${HADOOP_SHARED}/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/guava-11.0.2.jar"
rm -f "${HADOOP_SHARED}/kms/tomcat/webapps/kms/WEB-INF/lib/guava-11.0.2.jar"
rm -f "${HADOOP_SHARED}/tools/lib/guava-11.0.2.jar"
rm -f "${HADOOP_SHARED}/yarn/lib/guava-11.0.2.jar"

# Put Guava 18.0 in their place.
cp guava-18.0.jar "${HADOOP_SHARED}/common/lib/guava-18.0.jar"
cp guava-18.0.jar "${HADOOP_SHARED}/hdfs/lib/guava-18.0.jar"
cp guava-18.0.jar "${HADOOP_SHARED}/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/guava-18.0.jar"
cp guava-18.0.jar "${HADOOP_SHARED}/kms/tomcat/webapps/kms/WEB-INF/lib/guava-18.0.jar"
cp guava-18.0.jar "${HADOOP_SHARED}/tools/lib/guava-18.0.jar"
cp guava-18.0.jar "${HADOOP_SHARED}/yarn/lib/guava-18.0.jar"
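
The same swap could also be written as a loop, so that it does not depend on the exact list of directories. This is a sketch under assumptions rather than what I actually ran: replace_guava is a hypothetical helper name, and it treats every guava-*.jar below the given root as one to be replaced.

```shell
#!/bin/bash
# replace_guava ROOT NEW_JAR: swap every guava-*.jar below ROOT for NEW_JAR.
# (Hypothetical helper; ROOT would be the Hadoop share directory.)
replace_guava() {
    local root="$1" new_jar="$2" old_jars old dir
    [ -f "$new_jar" ] || { echo "missing $new_jar" >&2; return 1; }
    # Collect the matches first so the freshly copied jars are not revisited.
    old_jars="$(find "$root" -type f -name 'guava-*.jar')"
    printf '%s\n' "$old_jars" | while IFS= read -r old; do
        [ -n "$old" ] || continue
        dir="$(dirname "$old")"
        rm -f "$old"
        cp "$new_jar" "$dir/"
    done
}

# Example: replace_guava /usr/local/hadoop-2.7.1/share/hadoop /tmp/guava-18.0.jar
```

The function refuses to delete anything when the new jar is missing, which avoids ending up with no Guava at all after a failed download.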

And now Hadoop (at least the HDFS part) starts and listens on the default ports:

hduser@node1 ~ $ sudo lsof -i tcp
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
sshd 2169 root 3u IPv4 6612 0t0 TCP *:ssh (LISTEN)
sshd 2292 root 3u IPv4 6663 0t0 TCP node1:ssh->turtle.local:55560 (ESTABLISHED)
sshd 2296 hduser 3u IPv4 6663 0t0 TCP node1:ssh->turtle.local:55560 (ESTABLISHED)
java 4395 hduser 196u IPv4 13062 0t0 TCP *:50090 (LISTEN)
java 5029 hduser 190u IPv4 22738 0t0 TCP *:50070 (LISTEN)
java 5029 hduser 202u IPv4 21275 0t0 TCP node1:8020 (LISTEN)
java 5029 hduser 212u IPv4 21877 0t0 TCP node1:8020->node4:36701 (ESTABLISHED)
java 5029 hduser 213u IPv4 21878 0t0 TCP node1:8020->node2:46195 (ESTABLISHED)
java 5029 hduser 214u IPv4 21879 0t0 TCP node1:8020->node5:44667 (ESTABLISHED)
java 5029 hduser 215u IPv4 21880 0t0 TCP node1:8020->node3:46794 (ESTABLISHED)
java 5029 hduser 216u IPv4 21882 0t0 TCP node1:8020->node1:36134 (ESTABLISHED)
java 5130 hduser 192u IPv4 21626 0t0 TCP *:50010 (LISTEN)
java 5130 hduser 196u IPv4 21632 0t0 TCP localhost:57456 (LISTEN)
java 5130 hduser 250u IPv4 18108 0t0 TCP *:50075 (LISTEN)
java 5130 hduser 251u IPv4 21851 0t0 TCP *:50020 (LISTEN)
java 5130 hduser 262u IPv4 23654 0t0 TCP node1:36134->node1:8020 (ESTABLISHED)
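
To avoid eyeballing this listing on every node, the check could be scripted. The following is a rough sketch, not something from the Hadoop distribution: missing_ports is a made-up helper that reads lsof-style output on stdin and prints each port from its arguments that has no "(LISTEN)" line. The port numbers in the example are the ones visible in the output above.

```shell
#!/bin/bash
# missing_ports PORT...: read `lsof -i tcp` style output on stdin and print
# each PORT that has no "(LISTEN)" line. Hypothetical helper for illustration.
missing_ports() {
    local listing port
    listing="$(cat)"
    for port in "$@"; do
        # Look for ":PORT (LISTEN)" at the end of the NAME column.
        if ! printf '%s\n' "$listing" | grep -Eq ":${port} \(LISTEN\)"; then
            echo "$port"
        fi
    done
}

# Example (-P keeps port numbers numeric instead of service names):
#   sudo lsof -i tcp -P | missing_ports 8020 50070 50090 50010 50020 50075
```

An empty result means every listed port has a listener; any port that is printed is one where the service is still not bound.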