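I have long wanted to write C++ programs on Hadoop, since I still have a soft spot for C++. Following the Hadoop Pipes example in "Hadoop: The Definitive Guide" (Chinese 2nd edition), I ran an experiment and it tested successfully.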

#include <algorithm>
#include <limits.h>
#include <stdint.h>
#include <string>

#include "Pipes.hh"
#include "TemplateFactory.hh"
#include "StringUtils.hh"

// Mapper: emits (year, air temperature) for every valid reading.
class MaxTemperatureMapper : public HadoopPipes::Mapper {
public:
  MaxTemperatureMapper(HadoopPipes::TaskContext& context) {
  }
  void map(HadoopPipes::MapContext& context) {
    std::string line = context.getInputValue();
    std::string year = line.substr(15, 4);
    std::string airTemperature = line.substr(87, 5);
    std::string q = line.substr(92, 1);  // quality code
    // Skip missing readings ("+9999") and unaccepted quality codes.
    if (airTemperature != "+9999" &&
        (q == "0" || q == "1" || q == "4" || q == "5" || q == "9")) {
      context.emit(year, airTemperature);
    }
  }
};

// Reducer: emits the maximum temperature seen for each year.
class MapTemperatureReducer : public HadoopPipes::Reducer {
public:
  MapTemperatureReducer(HadoopPipes::TaskContext& context) {
  }
  void reduce(HadoopPipes::ReduceContext& context) {
    int maxValue = INT_MIN;
    while (context.nextValue()) {
      maxValue = std::max(maxValue, HadoopUtils::toInt(context.getInputValue()));
    }
    context.emit(context.getInputKey(), HadoopUtils::toString(maxValue));
  }
};

int main(int argc, char *argv[]) {
  return HadoopPipes::runTask(HadoopPipes::TemplateFactory<MaxTemperatureMapper,
                              MapTemperatureReducer>());
}

Note: the one difference from the book's listing is the limits.h header, which declares INT_MIN.
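The mapper's field extraction can be tried out without a Hadoop installation. The following is only a minimal sketch, assuming the same fixed column offsets as the listing above; the `Reading` struct and the `parseRecord` function are hypothetical names introduced for illustration and are not part of the Hadoop Pipes API. Unlike the mapper above, it also checks the line length first, because `std::string::substr` throws `std::out_of_range` when the start position lies beyond the end of the string.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration; not part of Hadoop Pipes.
struct Reading {
  std::string year;         // characters 15..18 of the record
  std::string temperature;  // characters 87..91, including the sign
};

// Fills `out` and returns true when the mapper above would emit a pair.
// Returns false when the record is too short, the temperature is the
// missing-value marker "+9999", or the quality code is not accepted.
bool parseRecord(const std::string& line, Reading& out) {
  const std::size_t kMinLength = 93;  // the last index read is 92
  if (line.size() < kMinLength) {
    return false;
  }
  const std::string temperature = line.substr(87, 5);
  const char q = line[92];  // quality code
  const bool accepted =
      (q == '0' || q == '1' || q == '4' || q == '5' || q == '9');
  if (temperature == "+9999" || !accepted) {
    return false;
  }
  out.year = line.substr(15, 4);
  out.temperature = temperature;
  return true;
}
```

Calling `parseRecord` on each input line and printing `out.year` and `out.temperature` for the accepted ones shows which pairs would reach the reducer.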

The Makefile (with my own modifications):

.SUFFIXES:.h .c .cpp .o

CC=g++
CPPFLAGS = -m64
RM = rm
SRCS = max_temperature.cpp
PROGRAM = max_temperature
INC_PATH = -I$(HADOOP_DEV_HOME)/include
LIB_PATH = -L$(HADOOP_DEV_HOME)/lib/native
LIBS = -lhadooppipes -lcrypto -lhadooputils -lpthread

$(PROGRAM):$(SRCS)
	$(CC) $(CPPFLAGS) $(INC_PATH) $< -Wall $(LIB_PATH) $(LIBS) -g -O2 -o $@

.PHONY:clean
clean:
	$(RM) $(PROGRAM)

Source data file:

0067011990999991950051507004+68750+023550FM-12+038299999V0203301N00671220001CN9999999N9+00001+99999999999
0043011990999991950051512004+68750+023550FM-12+038299999V0203201N00671220001CN9999999N9+00221+99999999999
0043011990999991950051518004+68750+023550FM-12+038299999V0203201N00261220001CN9999999N9-00111+99999999999
0043012650999991949032412004+62300+010750FM-12+048599999V0202701N00461220001CN0500001N9+01111+99999999999
0043012650999991949032418004+62300+010750FM-12+048599999V0202701N00461220001CN0500001N9+00781+99999999999
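To sanity-check the reduce logic on these five records before submitting a job, the grouping and maximum can be simulated locally. This is only a sketch under stated assumptions: `maxPerYear` is a hypothetical stand-in for the shuffle and reduce phases, not Hadoop code, and it uses `std::strtol` for the string-to-integer conversion in place of `HadoopUtils::toInt`.

```cpp
#include <algorithm>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, std::string> YearTemp;

// Hypothetical local stand-in for shuffle + reduce; not part of Hadoop.
// Input: the (year, temperature) pairs a mapper would emit.
// Output: the maximum temperature per year, ordered by year.
std::map<std::string, int> maxPerYear(const std::vector<YearTemp>& emitted) {
  std::map<std::string, int> result;
  for (std::size_t i = 0; i < emitted.size(); ++i) {
    const std::string& year = emitted[i].first;
    // strtol accepts a leading sign and leading zeros ("+0022", "-0011").
    const int value =
        static_cast<int>(std::strtol(emitted[i].second.c_str(), NULL, 10));
    std::map<std::string, int>::iterator it = result.find(year);
    if (it == result.end()) {
      result[year] = value;  // first reading seen for this year
    } else {
      it->second = std::max(it->second, value);
    }
  }
  return result;
}
```

Fed the pairs extracted from the five records above, ("1950", "+0000"), ("1950", "+0022"), ("1950", "-0011"), ("1949", "+0111") and ("1949", "+0078"), this local sketch returns 111 for 1949 and 22 for 1950.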

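Upload it to HDFS (with no destination given, the file goes to the current user's HDFS home directory, /user/root in this run): hdfs dfs -put sample.txt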

Running make produces the executable, which is then uploaded to HDFS: hdfs dfs -put max_temperature /bin

To run the job: hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input /user/root/sample.txt -output /output -program /bin/max_temperature

Output data:
