Thrift JDBC Server描述

Thrift JDBC Server使用的是HIVE0.12的HiveServer2实现。能够使用Spark或者hive0.12版本的beeline脚本与JDBC Server进行交互使用。Thrift JDBC Server默认监听端口是10000。

使用Thrift JDBC Server前需要注意:

1、将hive-site.xml配置文件拷贝到$SPARK_HOME/conf目录下;

2、需要在$SPARK_HOME/conf/spark-env.sh中的SPARK_CLASSPATH添加jdbc驱动的jar包

export SPARK_CLASSPATH=$SPARK_CLASSPATH:/home/hadoop/software/mysql-connector-java-5.1.27-bin.jar

Thrift JDBC Server命令使用帮助:

cd $SPARK_HOME/sbin
start-thriftserver.sh --help
Usage: ./sbin/start-thriftserver [options] [thrift server options]
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
--deploy-mode DEPLOY_MODE Whether to launch the driver program locally ("client") or
on one of the worker machines inside the cluster ("cluster")
(Default: client).
--class CLASS_NAME Your application's main class (for Java / Scala apps).
--name NAME A name of your application.
--jars JARS Comma-separated list of local jars to include on the driver
and executor classpaths.
--py-files PY_FILES Comma-separated list of .zip, .egg, or .py files to place
on the PYTHONPATH for Python apps.
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. --conf PROP=VALUE Arbitrary Spark configuration property.
--properties-file FILE Path to a file from which to load extra properties. If not
specified, this will look for conf/spark-defaults.conf. --driver-memory MEM Memory for driver (e.g. 1000M, 2G) (Default: 512M).
--driver-java-options Extra Java options to pass to the driver.
--driver-library-path Extra library path entries to pass to the driver.
--driver-class-path Extra class path entries to pass to the driver. Note that
jars added with --jars are automatically included in the
classpath. --executor-memory MEM Memory per executor (e.g. 1000M, 2G) (Default: 1G). --help, -h Show this help message and exit
--verbose, -v Print additional debug output Spark standalone with cluster deploy mode only:
--driver-cores NUM Cores for driver (Default: ).
--supervise If given, restarts the driver on failure. Spark standalone and Mesos only:
--total-executor-cores NUM Total cores for all executors. YARN-only:
--executor-cores NUM Number of cores per executor (Default: ).
--queue QUEUE_NAME The YARN queue to submit to (Default: "default").
--num-executors NUM Number of executors to launch (Default: ).
--archives ARCHIVES Comma separated list of archives to be extracted into the
working directory of each executor. Thrift server options:
--hiveconf <property=value> Use value for given property

master的描述与Spark SQL CLI一致

beeline命令使用帮助:

cd $SPARK_HOME/bin
beeline --help
Usage: java org.apache.hive.cli.beeline.BeeLine
-u <database url> the JDBC URL to connect to
-n <username> the username to connect as
-p <password> the password to connect as
-d <driver class> the driver class to use
-e <query> query that should be executed
-f <file> script file that should be executed
--color=[true/false] control whether color is used for display
--showHeader=[true/false] show column names in query results
--headerInterval=ROWS; the interval between which heades are displayed
--fastConnect=[true/false] skip building table/column list for tab-completion
--autoCommit=[true/false] enable/disable automatic transaction commit
--verbose=[true/false] show verbose error messages and debug info
--showWarnings=[true/false] display connection warnings
--showNestedErrs=[true/false] display nested errors
--numberFormat=[pattern] format numbers using DecimalFormat pattern
--force=[true/false] continue running script even after errors
--maxWidth=MAXWIDTH the maximum width of the terminal
--maxColumnWidth=MAXCOLWIDTH the maximum width to use when displaying columns
--silent=[true/false] be more silent
--autosave=[true/false] automatically save preferences
--outputformat=[table/vertical/csv/tsv] format mode for result display
--isolation=LEVEL set the transaction isolation level
--help display this message

Thrift JDBC Server/beeline启动

启动Thrift JDBC Server:默认端口是10000

cd $SPARK_HOME/sbin
start-thriftserver.sh

如何修改Thrift JDBC Server的默认监听端口号?借助于--hiveconf

start-thriftserver.sh  --hiveconf hive.server2.thrift.port=

HiveServer2 Clients 详情参见:https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients

启动beeline

cd $SPARK_HOME/bin
beeline -u jdbc:hive2://hadoop000:10000/default -n hadoop

sql脚本测试

SELECT track_time, url, session_id, referer, ip, end_user_id, city_id FROM page_views WHERE city_id = -1000 limit 10;
SELECT session_id, count(*) c FROM page_views group by session_id order by c desc limit 10;

最新文章

  1. [WPF系列] 高级 调试
  2. 第二天 django apache
  3. select count(*)和select count(1)哪个性能高
  4. MS SQL Server 如何得到执行最耗时的前N条T-SQL语句-
  5. 在CentOS 6.3中安装与配置JDK-7
  6. 利用TreeSet给纯数字字符串排序
  7. memcached学习笔记——连接模型
  8. 单目录下多文件 makefile编写
  9. 【转载】ADO.NET与ORM的比较(2):NHibernate实现CRUD
  10. sparksql中行转列
  11. TI AM335X处理器介绍
  12. Linux显示工作路径
  13. Thymeleaf3语法详解和实战
  14. 2018-2019 ICPC, NEERC, Southern Subregional Contest
  15. nodejs 动态创建二维码
  16. Appium1.9.1 部署及结果检验
  17. 洛谷 P1880 [NOI1995] 石子合并(区间DP)
  18. myisamchk命令修复表操作
  19. 通过DataTable获得表的主键
  20. linux命令学习之:chown

热门文章

  1. 动态代理到基于动态代理的AOP
  2. html简介
  3. C#本地时间和GMT(UTC)时间的转换
  4. jQ的toggle()方法示例
  5. 【Redis】使用Redis Sentinel实现Redis HA
  6. python命令行下tab键补全命令
  7. RPM Fusion on CentOS7
  8. 纯真IP根据IP地址获得地址
  9. [hibernate]基本值类型映射之日期类型
  10. Java方法总结与源码解析(未完待续)