Setting Up a Hadoop Platform on Ubuntu

I finally got both standalone mode and pseudo-distributed mode working, so I am recording the steps here.

1. Passwordless SSH configuration

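In pseudo-distributed mode the DataNode and the NameNode both run on the local machine, and the start scripts launch each daemon over SSH, so passwordless SSH login to localhost must be configured.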

Step 1: install and start the SSH server:

~$ sudo apt-get install openssh-server

~$ sudo /etc/init.d/ssh start

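Step 2: generate a public/private key pair and append the public key to authorized_keys (authorized_keys holds the public keys of every user who is allowed to log in over SSH as the current user):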

~$ ssh-keygen -t rsa -P ""

~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
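If ssh localhost still asks for a password after the two commands above, overly permissive file modes are a common cause: with sshd's default StrictModes setting, keys in a group- or world-writable directory are ignored. A minimal sketch of tightening the modes follows; fix_ssh_perms is a hypothetical helper name, not part of OpenSSH.

```shell
# fix_ssh_perms is a hypothetical helper (not an OpenSSH command).
# It restricts the SSH directory and the authorized_keys file to the owner.
# The directory argument defaults to ~/.ssh.
fix_ssh_perms() {
    dir="${1:-$HOME/.ssh}"
    chmod 700 "$dir"                    # only the owner may enter the directory
    chmod 600 "$dir/authorized_keys"    # only the owner may read or write the keys
}
```

Run fix_ssh_perms with no argument and then ssh localhost; the login should complete without a password prompt.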

2. Install Java:

~$ sudo apt-get install openjdk-6-jdk

3. Install Hadoop

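Step 1: download Hadoop from the Apache mirror at http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/ (I downloaded version 1.1.2). Extract it, move it to /usr/local, and give the hadoop user ownership of the directory: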

~$ sudo tar -xzf hadoop-1.1.2.tar.gz

~$ sudo mv hadoop-1.1.2 /usr/local/hadoop

~$ sudo chown -R hadoop:hadoop /usr/local/hadoop
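The chown command above only succeeds if a user and a group named hadoop already exist, which the post does not show being created. The sketch below checks for the account and prints one possible way to create it on Ubuntu; user_exists is a hypothetical helper name, and creating a dedicated account this way is an assumption about the intended setup.

```shell
# user_exists is a hypothetical helper: it succeeds if the named account exists.
user_exists() {
    id "$1" >/dev/null 2>&1
}

# Only print a hint; nothing is created automatically.
if ! user_exists hadoop; then
    echo "user 'hadoop' not found; create it first, for example:"
    echo "  sudo addgroup hadoop && sudo adduser --ingroup hadoop hadoop"
fi
```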

Step 2: configure the Java environment in /usr/local/hadoop/conf/hadoop-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386

export HADOOP_HOME=/usr/local/hadoop

export PATH=$PATH:/usr/local/hadoop/bin
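The JAVA_HOME value above is specific to a 32-bit install; on a 64-bit system the directory name usually ends in amd64 instead of i386. Rather than guessing, the path can be derived from the javac binary that is actually on the PATH. This is a minimal sketch; java_home_from is a hypothetical helper name.

```shell
# java_home_from is a hypothetical helper: given the path of a javac binary,
# resolve all symlinks and strip the trailing /bin/javac to get the JDK root.
java_home_from() {
    real=$(readlink -f "$1")
    dirname "$(dirname "$real")"
}
```

For example, java_home_from "$(command -v javac)" prints the directory to use as JAVA_HOME in hadoop-env.sh.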

Step 3: configure core-site.xml, hdfs-site.xml, and mapred-site.xml:

core-site.xml:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>

        <name>fs.default.name</name>

        <value>hdfs://localhost:9000</value>

    </property>

    <property>

        <name>hadoop.tmp.dir</name>

        <value>/usr/local/hadoop/tmp</value>

    </property>

</configuration>

hdfs-site.xml:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>

        <name>dfs.replication</name>

        <value>1</value>

    </property>

    <property>

        <name>dfs.name.dir</name>

        <value>/usr/local/hadoop/hdfs/name</value>

    </property>

    <property>

        <name>dfs.data.dir</name>

        <value>/usr/local/hadoop/hdfs/data</value>

    </property>

</configuration>

mapred-site.xml:

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

    <property>

        <name>mapred.job.tracker</name>

        <value>localhost:9001</value>

    </property>

</configuration>
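The three files above point hadoop.tmp.dir, dfs.name.dir, and dfs.data.dir at directories under /usr/local/hadoop. Creating them up front, as the user that will run Hadoop, can surface permission problems before the daemons start. The sketch below is one way to do that; make_hdfs_dirs is a hypothetical helper name, and the 755 mode reflects what Hadoop 1.x DataNodes typically expect for the data directory.

```shell
# make_hdfs_dirs is a hypothetical helper: it creates the directories named in
# core-site.xml and hdfs-site.xml. The base argument defaults to /usr/local/hadoop.
make_hdfs_dirs() {
    base="${1:-/usr/local/hadoop}"
    mkdir -p "$base/tmp" "$base/hdfs/name" "$base/hdfs/data"
    # Assumption: the DataNode expects rwxr-xr-x on its data directory.
    chmod 755 "$base/hdfs/data"
}
```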

Step 4: apply the environment variables and format HDFS:

~$ source /usr/local/hadoop/conf/hadoop-env.sh

~$ hadoop namenode -format
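Formatting wipes the NameNode metadata, so it should only be run once on a fresh install. A minimal guard, assuming the dfs.name.dir value from hdfs-site.xml and that a formatted name directory contains current/VERSION, could look like this; namenode_formatted is a hypothetical helper name.

```shell
# namenode_formatted is a hypothetical helper: it succeeds if the name directory
# already holds a current/VERSION file, i.e. HDFS appears to have been formatted.
# The directory argument defaults to the dfs.name.dir value used in this post.
namenode_formatted() {
    [ -f "${1:-/usr/local/hadoop/hdfs/name}/current/VERSION" ]
}
```

A cautious invocation would be: namenode_formatted || hadoop namenode -format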

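Step 5: start Hadoop and list the running daemons to check that the installation succeeded (the relative bin/ paths from here on assume the current directory is /usr/local/hadoop):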

~$ bin/start-all.sh

~$ jps
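In a working Hadoop 1.x pseudo-distributed setup, jps should list five daemons: NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker. The sketch below reports any that are missing; missing_daemons is a hypothetical helper name that reads jps output on standard input.

```shell
# missing_daemons is a hypothetical helper: it reads `jps` output on stdin and
# prints the name of each expected Hadoop 1.x daemon that is not listed.
missing_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # -w matches whole words, so "NameNode" does not match "SecondaryNameNode".
        printf '%s\n' "$out" | grep -qw "$d" || echo "$d"
    done
}
```

Usage: jps | missing_daemons. Empty output means all five daemons are running; if one is missing, its log file under /usr/local/hadoop/logs is the place to look.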

Step 6: test the environment:

~$ bin/hadoop dfs -mkdir input

~$ hadoop dfs -copyFromLocal conf/* input

~$ hadoop jar hadoop-examples-1.1.2.jar wordcount input output

~$ hadoop dfs -cat output/*
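To sanity-check the wordcount result, the same count can be computed locally on a small file and compared by eye. The sketch below assumes the example job splits words on whitespace and emits tab-separated word and count pairs sorted by key; local_wordcount is a hypothetical helper name, not part of Hadoop.

```shell
# local_wordcount is a hypothetical helper: it counts whitespace-separated words
# from stdin and prints "word<TAB>count" lines in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' \
        | grep -v '^$' \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2 "\t" $1}'
}
```

For example, cat conf/core-site.xml | local_wordcount gives counts that should agree with a wordcount job run on that file alone. Note also that the job refuses to start if the output directory already exists in HDFS, so remove it with hadoop dfs -rmr output before running the job again.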

Step 7: stop the Hadoop daemons:

~$ bin/stop-all.sh
