Maven dependencies:

<dependency>
    <groupId>jdk.tools</groupId>
    <artifactId>jdk.tools</artifactId>
    <version>1.6</version>
    <scope>system</scope>
    <systemPath>${JAVA_HOME}/lib/tools.jar</systemPath>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.6.5</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.5</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.5</version>
</dependency>
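
With these artifacts on the classpath, the three classes below compile against the new-style org.apache.hadoop.mapreduce API. For completeness (the original listing omits them), these are the imports they rely on:

import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;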

Mapper class:

public class WordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        for (String word : line.split(" ")) {
            context.write(new Text(word), new IntWritable(1));
        }
    }
}
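
The mapper above allocates a new Text and IntWritable for every word. A common refinement (not in the original post) is to reuse the Writable objects across calls, which reduces garbage-collection pressure on large inputs; a minimal sketch:

public class WordcountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Reused across map() calls; safe because Hadoop serializes values at write time.
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        for (String token : value.toString().split("\\s+")) {
            if (!token.isEmpty()) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
}

Splitting on \\s+ also tolerates tabs and repeated spaces, which the single-space split in the original does not.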

Reducer class:

public class WordcountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        for (IntWritable value : values) {
            count += value.get();
        }
        context.write(key, new IntWritable(count));
    }
}
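
Because summing counts is associative and commutative, this same class can double as a combiner, pre-aggregating counts on the map side and shrinking the shuffle. That step is optional and not in the original driver; if you want it, one extra line in the job setup is enough:

// Optional: reuse the reducer as a combiner to pre-aggregate counts before the shuffle.
job.setCombinerClass(WordcountReducer.class);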

Driver (launcher) class:

public class WordcountLancher {
    public static void main(String[] args) throws Exception {
        String inputPath = args[0];
        String outputPath = args[1];
        Job job = Job.getInstance();
        job.setJarByClass(WordcountLancher.class); // lets the cluster locate the jar containing these classes
        job.setMapperClass(WordcountMapper.class);
        job.setReducerClass(WordcountReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.setInputPaths(job, new Path(inputPath));
        FileOutputFormat.setOutputPath(job, new Path(outputPath));
        boolean success = job.waitForCompletion(true);
        System.exit(success ? 0 : 1);
    }
}
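
One practical note: FileOutputFormat fails the job if the output directory already exists. For repeated test runs it is convenient to delete it up front; a sketch of such a pre-flight step, placed before waitForCompletion (it uses org.apache.hadoop.fs.FileSystem and is not part of the original driver):

// Delete a stale output directory so repeated runs do not fail; use with care outside of testing.
FileSystem fs = FileSystem.get(job.getConfiguration());
Path out = new Path(outputPath);
if (fs.exists(out)) {
    fs.delete(out, true); // recursive delete
}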

Prepare the input data in HDFS:

hadoop fs -mkdir -p /wordcount/input

hadoop fs -put LICENSE.txt /wordcount/input

Remember to start YARN:

start-yarn.sh

Run the MapReduce program:

 hadoop jar wordcount.jar me.huqiao.hadoop.mr.WordcountLancher /wordcount/input /wordcount/output
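
If you later want to pass generic options on this command line (for example -D mapreduce.job.reduces=2), the driver can be reworked around Tool/ToolRunner, which parses those options before your code sees the remaining arguments. A sketch of that variant, not what the original post uses:

public class WordcountLancher extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf()); // getConf() carries any -D overrides from the command line
        // ... same job setup as above ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new WordcountLancher(), args));
    }
}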

View the result:

hadoop fs -cat /wordcount/output/part-r-* | more
