序:本以为今天花点时间将WordCount例子完全理解到,但高估自己了,更别说我只是在大学选修一学期的java,之后再也没碰过java语言了

总的来说,从宏观上能理解具体的程序思路,但具体到每个代码有什么作用,什么原理,那还需要花点时间,毕竟需要一点java基础和hadoop的运行机制的知识

首先启动hadoop;

[hadoop@hadoop01 eclipse]$ cd ~/hadoop-3.2.0
[hadoop@hadoop01 hadoop-3.2.0]$ sbin/start-all.sh
WARNING: Attempting to start all Apache Hadoop daemons as hadoop in 10 seconds.
WARNING: This is not a recommended production deployment configuration.
WARNING: Use CTRL-C to abort.
Starting namenodes on [hadoop01]
Starting datanodes
Starting secondary namenodes [hadoop01]
Starting resourcemanager
Starting nodemanagers
[hadoop@hadoop01 hadoop-3.2.0]$ jps
8497 NameNode
9121 ResourceManager
8868 SecondaryNameNode
9268 NodeManager
9630 Jps

然后,进入root权限打开eclipse;

[hadoop@hadoop01 hadoop-3.2.0]$ su root
Password:
[root@hadoop01 hadoop-3.2.0]# cd ..
[root@hadoop01 hadoop]# cd eclipse
[root@hadoop01 eclipse]# ./eclipse

在eclipse的window里面show view打开terminal;

在eclipse中点击打开open a terminal,在终端中输入命令:gedit input.txt;

在文档中任意输入内容;

在终端中输入命令:hadoop fs -put /home/hadoop/input.txt /test/;

最后,file--new--project--MapReduce project并取项目名“Wordcount”,再从创建的文件下src中new--package并为包取名“com.hadoop”,又在src下new--class并为类取名“Wordcount”,然后将下面的代码粘贴进去。

然后可以run as hadoop,成功运行得到计算结果

注:若package下无log4j.properties,会报错,需在该文件下手动添加该文件。
内容 如下:

# Configure logging for testing: optionally with log file

#log4j.rootLogger=debug,appender
log4j.rootLogger=info,appender
#log4j.rootLogger=error,appender #\u8F93\u51FA\u5230\u63A7\u5236\u53F0
log4j.appender.appender=org.apache.log4j.ConsoleAppender
#\u6837\u5F0F\u4E3ATTCCLayout
log4j.appender.appender.layout=org.apache.log4j.TTCCLayout

附代码:

/**
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.hadoop.examples; import java.io.IOException;
import java.util.StringTokenizer; import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser; public class WordCount { public static class TokenizerMapper
extends Mapper<Object, Text, Text, IntWritable>{ private final static IntWritable one = new IntWritable(1);
private Text word = new Text(); public void map(Object key, Text value, Context context
) throws IOException, InterruptedException {
StringTokenizer itr = new StringTokenizer(value.toString());
while (itr.hasMoreTokens()) {
word.set(itr.nextToken());
context.write(word, one);
}
}
} public static class IntSumReducer
extends Reducer<Text,IntWritable,Text,IntWritable> {
private IntWritable result = new IntWritable(); public void reduce(Text key, Iterable<IntWritable> values,
Context context
) throws IOException, InterruptedException {
int sum = 0;
for (IntWritable val : values) {
sum += val.get();
}
result.set(sum);
context.write(key, result);
}
} public static void main(String[] args) throws Exception {
Configuration conf = new Configuration();
String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
if (otherArgs.length < 2) {
System.err.println("Usage: wordcount <in> [<in>...] <out>");
System.exit(2);
}
Job job = Job.getInstance(conf, "word count");
job.setJarByClass(WordCount.class);
job.setMapperClass(TokenizerMapper.class);
job.setCombinerClass(IntSumReducer.class);
job.setReducerClass(IntSumReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
for (int i = 0; i < otherArgs.length - 1; ++i) {
FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
}
FileOutputFormat.setOutputPath(job,
new Path(otherArgs[otherArgs.length - 1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
}
}

最新文章

  1. EXCEL中对1个单元格中多个数字求和
  2. R可视化lend_club 全球最大的P2P平台数据75W条
  3. AC日记——有趣的跳跃 openjudge 1.6 07
  4. openwrt的环境搭建、定制和编译
  5. vsphere vcenter server下安装ubuntu的vmwaretools
  6. 杂乱无章之Oracle(二)
  7. Python实战:Python爬虫学习教程,获取电影排行榜
  8. 用Ajax读取XML格式的数据
  9. javascript创建类的6种方式
  10. 1!到n!的和
  11. 详解 Vue 2.4.0 带来的 4 个重大变化
  12. iOS程序闪退的原因以及处理办法
  13. lvs 负载均衡 NAT模式
  14. linux 服务器网络有关的内核参数
  15. python学习日记(OOP——@property)
  16. 分布式监控系统开发【day38】:监控trigger表结构设计(一)
  17. adv生成控制器手腕位置倾斜原因以及解决方案
  18. kettle用mysql创建资源库执行sql代码报错
  19. 通过XMLHttpRequest和jQuery两种方式实现ajax
  20. 2.0CNN

热门文章

  1. js控制input框只能输入数字和一位小数点和小数点后面两位小数
  2. C#中的断言(Assert)
  3. rabbitmq - 消息接收,解析xml格式数据时异常:ERROR not well-formed (invalid token): line 4, column 46
  4. Pytorch 类别平衡化处理
  5. LODOP问答部分链接
  6. LODOP表格table简短问答及相关博文
  7. Linq查询连接guid与varchar字段
  8. ADB 常用命令及详解
  9. 第1/7Beta冲刺
  10. springboot整合mybatis,mongodb,redis