Goal:

Get a first feel for Hadoop MapReduce.

Environment:

Hadoop 2.6.4

1 Prepare the input file

paper.txt should contain English text, such as an article; any content will do.
hadoop@ssmaster:~$ hadoop fs -mkdir /input
hadoop@ssmaster:~$ ls
Desktop Documents Downloads examples.desktop hadoop-2.6..tar.gz Music paper.txt Pictures Public Templates Videos
hadoop@ssmaster:~$ hadoop fs -put paper.txt /input
hadoop@ssmaster:~$ hadoop fs -ls /input
Found items
-rw-r--r-- hadoop supergroup -- : /input/paper.txt

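Note: the output directory /output does not need to be created in advance; the job creates it automatically. The job fails at submission if the output directory already exists.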

2 Run the job

hadoop@ssmaster:~$ hadoop jar /opt/hadoop-2.6./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6..jar  wordcount /input /output
// :: INFO client.RMProxy: Connecting to ResourceManager at ssmaster/192.168.249.144:
// :: INFO input.FileInputFormat: Total input paths to process :
// :: INFO mapreduce.JobSubmitter: number of splits:
// :: INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1477208120905_0001
// :: INFO impl.YarnClientImpl: Submitted application application_1477208120905_0001
// :: INFO mapreduce.Job: The url to track the job: http://ssmaster:8088/proxy/application_1477208120905_0001/
// :: INFO mapreduce.Job: Running job: job_1477208120905_0001
// :: INFO mapreduce.Job: Job job_1477208120905_0001 running in uber mode : false
16/10/23 00:51:38 INFO mapreduce.Job: map 0% reduce 0%
16/10/23 00:52:17 INFO mapreduce.Job: map 100% reduce 0%
16/10/23 00:52:39 INFO mapreduce.Job: map 100% reduce 100%
16/10/23 00:52:41 INFO mapreduce.Job: Job job_1477208120905_0001 completed successfully
16/10/23 00:52:41 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=2061
FILE: Number of bytes written=217797
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1863
HDFS: Number of bytes written=1425
HDFS: Number of read operations=6
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Launched reduce tasks=1
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=35792
Total time spent by all reduces in occupied slots (ms)=18540
Total time spent by all map tasks (ms)=35792
Total time spent by all reduce tasks (ms)=18540
Total vcore-milliseconds taken by all map tasks=35792
Total vcore-milliseconds taken by all reduce tasks=18540
Total megabyte-milliseconds taken by all map tasks=36651008
Total megabyte-milliseconds taken by all reduce tasks=18984960
Map-Reduce Framework
Map input records=11
Map output records=303
Map output bytes=2969
Map output materialized bytes=2061
Input split bytes=101
Combine input records=303
Combine output records=158
Reduce input groups=158
Reduce shuffle bytes=2061
Reduce input records=158
Reduce output records=158
Spilled Records=316
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=1093
CPU time spent (ms)=5550
Physical memory (bytes) snapshot=442781696
Virtual memory (bytes) snapshot=1448112128
Total committed heap usage (bytes)=276299776
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1762
File Output Format Counters
Bytes Written=1425
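
The counters above follow the phases of a word count: Map output records=303 is the number of (word, 1) pairs emitted by the mapper, and Combine output records=158 equals Reduce input groups=158, the number of distinct words. Below is a minimal Python sketch of those phases. It is not the Hadoop implementation: the function names are hypothetical, the input lines are made up rather than taken from paper.txt, and splitting on whitespace only approximates the tokenizer used by the bundled example.

```python
from collections import Counter
from itertools import groupby


def map_phase(lines):
    """Emit one (word, 1) pair per whitespace-separated token."""
    return [(word, 1) for line in lines for word in line.split()]


def combine_phase(pairs):
    """Pre-aggregate the pairs produced by a single map task."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return sorted(counts.items())


def reduce_phase(pairs):
    """Sort by word, group equal words, and sum their counts."""
    ordered = sorted(pairs)
    return [
        (word, sum(n for _, n in group))
        for word, group in groupby(ordered, key=lambda pair: pair[0])
    ]


# Made-up sample input (assumption, not the real paper.txt).
lines = ["a dream is a wish", "always dream"]

mapped = map_phase(lines)          # compare with "Map output records"
combined = combine_phase(mapped)   # compare with "Combine output records"
reduced = reduce_phase(combined)   # compare with "Reduce output records"

print(len(mapped), len(combined), len(reduced))  # prints "7 5 5"
print(reduced)
```

With a single map task, as in this run, the combiner already produces the final counts, which is why the combine and reduce record counts are equal.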

The job status can also be checked on the web monitoring page:

http://ssmaster:8088/cluster

Cluster Metrics

Apps Submitted: 1
Apps Pending: 0
Apps Running: 1
Apps Completed: 0
Containers Running: 2
Memory Used: 3 GB
Memory Total: 8 GB
Memory Reserved: 0 B
VCores Used: 2
VCores Total: 8
VCores Reserved: 0
Active Nodes: 1
Decommissioned Nodes: 0
Lost Nodes: 0
Unhealthy Nodes: 0
Rebooted Nodes: 0
 
ID: application_1477208120905_0001
User: hadoop
Name: word count
Application Type: MAPREDUCE
Queue: default
StartTime: Sun, 23 Oct 2016 07:51:13 GMT
FinishTime: N/A
State: RUNNING
FinalStatus: UNDEFINED
Progress:
Tracking UI: ApplicationMaster
Blacklisted Nodes: 0

3 View the output

hadoop@ssmaster:~$ hadoop fs -ls /output
Found items
-rw-r--r-- hadoop supergroup -- : /output/_SUCCESS
-rw-r--r-- hadoop supergroup -- : /output/part-r-
hadoop@ssmaster:~$ hadoop fs -cat /output/part-r-
Always
Dream
There
a
all
along
always
...........
...........
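
The capitalized words (Always, Dream, There) come before the lowercase ones in the listing above, most likely because the keys are ordered by their raw bytes rather than alphabetically, and uppercase ASCII letters have smaller byte values than lowercase ones. A small Python sketch of that ordering, assuming UTF-8 encoded keys; the word list is taken from the visible part of the output:

```python
# Words from the visible part of the output, deliberately shuffled.
words = ["always", "a", "There", "along", "Dream", "all", "Always"]

# Sort by encoded bytes: 'A' (65) sorts before 'a' (97).
ordered = sorted(words, key=lambda word: word.encode("utf-8"))

print(ordered)
# prints ['Always', 'Dream', 'There', 'a', 'all', 'along', 'always']
```

A case-insensitive listing would need the words normalized (for example lowercased) before counting, which also merges Always and always into one entry.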

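Q Summary

Very simple; nothing much to remark on so far.

Next steps:

  • Write my own MapReduce wordcount program
  • Set up a fully distributed cluster, run the same program on a large file, and observe the speed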
