Setting up Hadoop with Spark on macOS

Instructions

  1. Prepare the environment
    If you do not have Homebrew (brew), look up how to install it first.
    Then remove any old versions of Hadoop:

    brew cleanup hadoop

    Then update the Homebrew formulae:

    brew update
    brew upgrade
    brew cleanup

    Check the version information:

    brew info hadoop
    brew info apache-spark
    brew info sbt
    brew info scala

    If any of the above is not installed, install it with brew install <formula>.

  2. Install the software
    Install Hadoop:

    brew install hadoop

    Install Spark:

    brew install apache-spark scala sbt

  3. Set environment variables
    Edit ~/.bash_profile with vim and append the following:

    # set environment variables
    export JAVA_HOME=$(/usr/libexec/java_home)
    export HADOOP_HOME=/usr/local/Cellar/hadoop/2.5.1
    export HADOOP_CONF_DIR=$HADOOP_HOME/libexec/etc/hadoop
    export SCALA_HOME=/usr/local/Cellar/apache-spark/1.1.0

    # set path variables
    export PATH=$PATH:$HADOOP_HOME/bin:$SCALA_HOME/bin

    # set alias start & stop scripts (quoted so both commands belong to the alias)
    alias hstart="$HADOOP_HOME/sbin/start-dfs.sh;$HADOOP_HOME/sbin/start-yarn.sh"
    alias hstop="$HADOOP_HOME/sbin/stop-dfs.sh;$HADOOP_HOME/sbin/stop-yarn.sh"
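
As an optional sanity check after reloading the profile (source ~/.bash_profile), the sketch below reports whether the exported directories actually exist. The helper name check_dir is hypothetical and not part of Hadoop or Homebrew; the version-specific paths above are assumed to match what brew info reports on your machine.

```shell
# check_dir NAME VALUE: report whether VALUE is an existing directory.
# (check_dir is an illustrative helper, not a Hadoop or Homebrew command.)
check_dir() {
    if [ -d "$2" ]; then
        echo "ok: $1=$2"
    else
        echo "missing: $1=$2"
        return 1
    fi
}

# Example calls; "|| true" keeps the shell going when a directory is absent.
check_dir HADOOP_HOME "${HADOOP_HOME:-}" || true
check_dir HADOOP_CONF_DIR "${HADOOP_CONF_DIR:-}" || true
```

A "missing" line usually means the version number in the path differs from the installed one.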

  4. Hadoop requires working SSH access, so set up SSH

    • Configuration file path:

      /etc/sshd_config (on newer macOS releases: /etc/ssh/sshd_config)

    • Generate the key pair:

      sh-3.2# sudo ssh-keygen -t rsa

      Generating public/private rsa key pair.
      Enter file in which to save the key (/var/root/.ssh/id_rsa): [enter /var/root/.ssh/id_rsa]
      Enter passphrase (empty for no passphrase): [press Enter]
      Enter same passphrase again: [press Enter]
      Your identification has been saved in /var/root/.ssh/id_rsa.
      Your public key has been saved in /var/root/.ssh/id_rsa.pub.
      The key fingerprint is:
      97:e9:5a:5e:91:52:30:63:9e:34:1a:6f:24:64:75:af root@cuican.local
      The key's randomart image is:
      +--[ RSA 2048]----+
      | .=.X . |
      | . X B . |
      | . = . . |
      | . + o |
      | S = E |
      | o . . |
      | o . |
      | + . |
      | . . |
      +-----------------+
    • Edit the configuration file:

      sudo vim /etc/ssh/sshd_config

      Port 22
      #AddressFamily any
      #ListenAddress 0.0.0.0
      #ListenAddress ::
      # The default requires explicit activation of protocol 1
      Protocol 2
      # HostKey for protocol version 1
      #HostKey /etc/ssh/ssh_host_key
      # HostKeys for protocol version 2
      #HostKey /etc/ssh/ssh_host_rsa_key
      #HostKey /etc/ssh/ssh_host_dsa_key
      #HostKey /etc/ssh/ssh_host_ecdsa_key
      HostKey /var/root/.ssh/id_rsa

      # Lifetime and size of ephemeral version 1 server key
      KeyRegenerationInterval 1h
      ServerKeyBits 1024

      # Logging
      # obsoletes QuietMode and FascistLogging
      SyslogFacility AUTHPRIV
      #LogLevel INFO

      # Authentication:
      LoginGraceTime 2m
      PermitRootLogin yes
      StrictModes yes
      #MaxAuthTries 6
      #MaxSessions 10

      RSAAuthentication yes
      PubkeyAuthentication yes
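    • Start the ssh service:

      which sshd    # locate sshd

      On the Mac, sshd is located at /usr/sbin/sshd.

      Run sudo /usr/sbin/sshd in a terminal to start the sshd service.

      Then generate a key pair for your own user and authorize it, so that Hadoop can ssh to localhost without a password:

      ssh-keygen -t rsa
      cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys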
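
Running the cat ... >> line more than once appends the same key again each time. A minimal sketch of an idempotent variant follows; the function name authorize_key is hypothetical, and it assumes the public key file holds a single line, as ssh-keygen writes it. It also tightens the file permissions, which matters when StrictModes is enabled.

```shell
# authorize_key PUBKEY_FILE AUTH_FILE
# Append the public key to AUTH_FILE only if that exact line is not there yet.
# (authorize_key is an illustrative helper, not an OpenSSH command.)
authorize_key() {
    pub=$1
    auth=$2
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    grep -qxF -f "$pub" "$auth" || cat "$pub" >> "$auth"
    chmod 600 "$auth"
}

# Typical call (not executed here):
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
#   ssh localhost    # should now log in without asking for a password
```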

  5. Configure Hadoop
    Go to the Hadoop installation path:

    cd /usr/local/Cellar/hadoop/2.5.1/libexec/

    Edit etc/hadoop/hadoop-env.sh:

    # this fixes the "scdynamicstore" warning
    export HADOOP_OPTS="$HADOOP_OPTS -Djava.security.krb5.realm= -Djava.security.krb5.kdc="

    Edit etc/hadoop/core-site.xml:

    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>

    Edit etc/hadoop/hdfs-site.xml:

    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>

    Edit etc/hadoop/mapred-site.xml:

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
    </configuration>

    Edit etc/hadoop/yarn-site.xml:

    <configuration>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
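
To confirm that the four edits landed, a small lookup like the one below can print the value of a property from one of these files. This is only a sketch: get_prop is a hypothetical helper, and it assumes each <value> sits on the line directly after its <name>, as in the snippets above. It is not a general XML parser.

```shell
# get_prop NAME FILE: print the <value> that follows <name>NAME</name> in FILE.
# Assumes <name> and <value> are on consecutive lines (illustrative helper only).
get_prop() {
    grep -F -A1 "<name>$1</name>" "$2" |
        sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Typical calls from the libexec directory (not executed here):
#   get_prop fs.defaultFS etc/hadoop/core-site.xml
#   get_prop dfs.replication etc/hadoop/hdfs-site.xml
```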

  6. Start Hadoop
    Move to the Hadoop root directory:

    cd /usr/local/Cellar/hadoop/2.5.1

    Format HDFS:

    ./bin/hdfs namenode -format

    Start the NameNode and DataNode daemons:

    ./sbin/start-dfs.sh

    View the NameNode in the browser:

    http://localhost:50070/

    Start the ResourceManager and NodeManager daemons:

    ./sbin/start-yarn.sh

    Check that all the daemons are running:

    jps

    View the ResourceManager in the browser:

    http://localhost:8088/

    Create your HDFS home directory (replace {username} with your user name):

    ./bin/hdfs dfs -mkdir -p /user/{username}

    Run a MapReduce example:

    # calculate pi
    ./bin/hadoop jar libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar pi 10 100
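
The jps check in this step can be scripted. The sketch below reads jps output on standard input and prints the daemons that are missing. The function name missing_daemons is hypothetical, and the list of expected daemons is an assumption for a single-node setup started with start-dfs.sh and start-yarn.sh.

```shell
# missing_daemons: read `jps` output on stdin and print each expected
# daemon that does not appear in it (illustrative helper only).
missing_daemons() {
    out=$(cat)
    for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
        echo "$out" | grep -qw "$d" || echo "$d"
    done
}

# Typical call (not executed here):
#   jps | missing_daemons
```

Empty output means every expected daemon was found.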

  7. Start Spark

    Go to the Spark installation directory:

    cd /usr/local/Cellar/apache-spark/1.1.0

    Run a Spark example:

    ./bin/run-example SparkPi

    View the Spark job in the browser (the page is only available while the job is running):

    http://localhost:4040/

    You can also submit jobs with spark-submit:

    # pattern to launch an application in yarn-cluster mode
    ./bin/spark-submit --class <path.to.class> --master yarn-cluster [options] <app.jar> [options]

    # run example application (calculate pi)
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster libexec/lib/spark-examples-*.jar
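
If the spark-examples-*.jar glob matches nothing, spark-submit receives the literal pattern and fails with a confusing error. A minimal sketch of resolving the jar first follows; find_examples_jar is a hypothetical helper, and the libexec/lib location is taken from the command above and may differ in other Spark versions.

```shell
# find_examples_jar SPARK_DIR: print the first spark-examples jar under
# SPARK_DIR/libexec/lib, or fail with a message (illustrative helper only).
find_examples_jar() {
    for jar in "$1"/libexec/lib/spark-examples-*.jar; do
        # An unmatched glob stays literal, so check that the file really exists.
        [ -e "$jar" ] && { echo "$jar"; return 0; }
    done
    echo "no spark-examples jar under $1/libexec/lib" >&2
    return 1
}

# Typical call from the Spark installation directory (not executed here):
#   jar=$(find_examples_jar .) &&
#       ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster "$jar"
```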

  8. Done
