Hadoop 2.2 Programming: Tool, ToolRunner, GenericOptionsParser, Configuration
Inheritance hierarchy:
1. java.util
Interface Map.Entry<K,V>
description:
public static interface Map.Entry<K,V>
methods:
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object o): Compares the specified object with this entry for equality. |
K | getKey(): Returns the key corresponding to this entry. |
V | getValue(): Returns the value corresponding to this entry. |
int | hashCode(): Returns the hash code value for this map entry. |
V | setValue(V value): Replaces the value corresponding to this entry with the specified value (optional operation). |
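Map.Entry matters here because Configuration implements Iterable<Map.Entry<String,String>>, so a Configuration can be walked with a for-each loop exactly like the entry set of a map. The following is a minimal sketch using only java.util; the class name EntryDemo and the sample keys are illustrative assumptions and are not part of Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class EntryDemo {
    // Renders each entry as "key=value" on its own line, in iteration order.
    static String render(Map<String, String> map) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : map.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("color", "yellow");
        props.put("size", "10");

        // setValue is an optional operation; LinkedHashMap supports it and
        // writes the new value through to the backing map.
        for (Map.Entry<String, String> entry : props.entrySet()) {
            if (entry.getKey().equals("size")) {
                entry.setValue("20");
            }
        }
        System.out.print(render(props));
    }
}
```

The loop in render has the same shape as a loop that prints every property of a Configuration.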
2. org.apache.hadoop.conf
Class Configuration
java.lang.Object
|__ org.apache.hadoop.conf.Configuration
description:
public class Configuration extends Object implements Iterable<Map.Entry<String,String>>, Writable
3. org.apache.hadoop.util
Class ToolRunner
java.lang.Object
|__ org.apache.hadoop.util.ToolRunner
description:
public class ToolRunner extends Object
ToolRunner can be used to run classes implementing the Tool interface. It works in conjunction with GenericOptionsParser to parse the generic hadoop command line arguments and modifies the Configuration of the Tool. The application-specific options are passed along without being modified.
methods:
Modifier and Type | Method and Description |
---|---|
static int | run(Configuration conf, Tool tool, String[] args): Runs the given Tool by Tool.run(String[]), after parsing with the given generic arguments. |
static int | run(Tool tool, String[] args): Runs the Tool with its Configuration. |
4. org.apache.hadoop.util
Interface Tool
description:
public interface Tool extends Configurable
methods:
Modifier and Type | Method and Description |
---|---|
int | run(String[] args): Execute the command with the given arguments. |
5. org.apache.hadoop.conf
Interface Configurable
description:
public interface Configurable
methods:
Modifier and Type | Method and Description |
---|---|
Configuration | getConf(): Return the configuration used by this object. |
void | setConf(Configuration conf): Set the configuration to be used by this object. |
6. org.apache.hadoop.conf
Class Configured
java.lang.Object
|__ org.apache.hadoop.conf.Configured
description:
public class Configured extends Object implements Configurable
constructors:
Constructor and Description |
---|
Configured(): Construct a Configured. |
Configured(Configuration conf): Construct a Configured. |
methods:
Modifier and Type | Method and Description |
---|---|
Configuration | getConf(): Return the configuration used by this object. |
void | setConf(Configuration conf): Set the configuration to be used by this object. |
Code 1 (the resource added to Configuration is a String):

    import java.util.Map.Entry;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class ConfigurationPrinter extends Configured implements Tool {

        static {
            // A resource given by name (String) is looked up on the classpath.
            Configuration.addDefaultResource("config.xml");
        }

        @Override
        public int run(String[] args) throws Exception {
            // getConf() returns the Configuration that ToolRunner set on this Tool.
            Configuration conf = getConf();
            for (Entry<String, String> entry : conf) {
                System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
            }
            return 0;
        }

        public static void main(String[] args) throws Exception {
            int exitCode = ToolRunner.run(new ConfigurationPrinter(), args);
            System.exit(exitCode);
        }
    }
Note: the static method Configuration.addDefaultResource(String name) accepts only a resource name. When a resource is given as a String, as with "config.xml" in the code above, Hadoop looks for the file on the classpath. When a resource is given as a Path, Hadoop looks for the file on the local filesystem:

    Configuration conf = new Configuration();
    conf.addResource(new Path("config.xml"));
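The difference between the two lookups can be sketched in plain Java without Hadoop on the classpath. This is only a rough stand-in for the behaviour described in the note, not Hadoop's own code; the class ResourceLookupSketch and its method names are hypothetical.

```java
import java.io.File;
import java.net.URL;

class ResourceLookupSketch {
    // Classpath lookup: roughly what happens for a resource given by name.
    // Returns null when no classpath entry contains the resource.
    static URL findOnClasspath(String name) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(name);
    }

    // Local-filesystem lookup: roughly what happens for a resource given as a path.
    // A relative path is resolved against the current working directory.
    static File findOnLocalFs(String path) {
        File file = new File(path).getAbsoluteFile();
        return file.exists() ? file : null;
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "config.xml";
        System.out.println("classpath:  " + findOnClasspath(name));
        System.out.println("local file: " + findOnLocalFs(name));
    }
}
```

Running it from a directory that contains config.xml but is not on the classpath should report the file under the second lookup only, which is why a file found by one lookup can be missed by the other.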
Steps to run Code 1:

    # Put the custom configuration file config.xml in Hadoop's $HADOOP_CONF_DIR
    mv config.xml $HADOOP_HOME/etc/hadoop/

Suppose the resource we add looks like this:

    <!-- cat $HADOOP_HOME/etc/hadoop/config.xml -->
    <configuration>
      <property>
        <name>color</name>
        <value>yellow</value>
      </property>
      <property>
        <name>size</name>
        <value>10</value>
      </property>
      <property>
        <name>weight</name>
        <value>heavy</value>
        <final>true</final>
      </property>
    </configuration>
Run the code:

    mkdir class
    source $HADOOP_HOME/libexec/hadoop-config.sh
    javac -d class ConfigurationPrinter.java
    jar -cvf ConfigurationPrinter.jar -C class ./
    export HADOOP_CLASSPATH=ConfigurationPrinter.jar:$CLASSPATH
    # Check whether the resource we just added was read.
    # config.xml contains the property <name>color</name>, so run:
    yarn ConfigurationPrinter | grep "color"
    color=yellow
    # The output shows that the code works.
Alternatively, pass the configuration file on the command line with the generic -conf option, for example:

    yarn ConfigurationPrinter -conf config.xml | grep color
    color=yellow

This works as well.
Code 2 (the resource added to Configuration is a Path):

    import java.util.Map.Entry;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class ConfigurationPrinter extends Configured implements Tool {

        @Override
        public int run(String[] args) throws Exception {
            // Use getConf() rather than new Configuration() so that generic
            // options parsed by ToolRunner (such as -D) are not discarded.
            Configuration conf = getConf();
            // A resource given as a Path is read from the local filesystem.
            conf.addResource(new Path("config.xml"));
            for (Entry<String, String> entry : conf) {
                System.out.printf("%s=%s\n", entry.getKey(), entry.getValue());
            }
            return 0;
        }

        public static void main(String[] args) throws Exception {
            int exitCode = ToolRunner.run(new ConfigurationPrinter(), args);
            System.exit(exitCode);
        }
    }
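Here the resource is added as a Path, so Hadoop looks for config.xml on the local filesystem. The file does not need to be placed in the configuration directory; it is enough to give its path on the local filesystem in the code, for example new Path("../others/config.xml").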
Steps to run:

    mkdir class
    source $HADOOP_HOME/libexec/hadoop-config.sh
    javac -d class ConfigurationPrinter.java
    jar -cvf ConfigurationPrinter.jar -C class ./
    export HADOOP_CLASSPATH=ConfigurationPrinter.jar:$CLASSPATH
    # Check whether the resource we just added was read.
    # config.xml contains the property <name>color</name>, so run:
    yarn ConfigurationPrinter | grep "color"
    color=yellow
    # The output shows that the code works.
Note: GenericOptionsParser supports setting individual properties:

    Generic Options
    The supported generic options are:
    -conf <configuration file>                      specify a configuration file
    -D <property=value>                             use value for given property
    -fs <local|namenode:port>                       specify a namenode
    -jt <local|jobtracker:port>                     specify a job tracker
    -files <comma separated list of files>          specify comma separated files to be copied to the map reduce cluster
    -libjars <comma separated list of jars>         specify comma separated jar files to include in the classpath.
    -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.
You can try the following (note that the option is an uppercase -D):

    yarn ConfigurationPrinter -D fruit=apple | grep fruit
    # Output:
    fruit=apple
To recap: ToolRunner and GenericOptionsParser work together to parse the generic Hadoop command-line arguments and apply them to the Configuration of the Tool, while application-specific options are passed along without being modified. The generic arguments are the ones in the [genericOptions] position of a command line such as:

    yarn command [genericOptions] [commandOptions]
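The split between generic options and application options can be pictured with a small plain-Java sketch. It is a deliberately simplified stand-in, not Hadoop's implementation: it handles only the "-D key=value" form, uses a Map in place of Configuration, and the class name GenericArgsSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GenericArgsSketch {
    // Moves every "-D key=value" pair into conf and returns the remaining
    // arguments, in their original order, for the application to handle.
    static String[] parse(String[] args, Map<String, String> conf) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].equals("-D") && i + 1 < args.length) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    conf.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
                // A pair without a key or '=' is dropped in this sketch.
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        String[] rest = parse(args, conf);
        System.out.println("conf = " + conf);
        System.out.println("rest = " + String.join(" ", rest));
    }
}
```

Called with the arguments -D color=red input output, the sketch stores color=red in the map and hands input and output back untouched, which mirrors the way a Tool's run(String[] args) sees only the application-specific arguments.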