SQL syntax supported by Spark SQL

select [distinct] [column names]|[wildcard]
from tableName
[join clause tableName on join condition]
[where condition]
[group by column name]
[having conditions]
[order by column names [asc|desc]]
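As an illustration of the template above, the following sketch exercises most of the clauses in one query. It assumes it is run inside spark-shell (where `spark` is already defined) and that the Hive table emp, with the job and sal columns shown in the sample output later in this section, is available. The thresholds and the aliases cnt and avg_sal are arbitrary choices made for the example.

```scala
// Run inside spark-shell: `spark` is the SparkSession the shell creates.
// Assumption: the emp table (columns job, sal) exists in the current database.
spark.sql("""
  select job, count(*) as cnt, avg(sal) as avg_sal
  from emp
  where sal > 1000
  group by job
  having count(*) > 1
  order by avg_sal desc
""").show()
```

Each clause maps onto one line of the template: where filters rows before grouping, having filters the groups, and order by sorts the final result.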

When a query uses only a join, the supported syntax is:

select statement
from statement
[join | inner join | left join | left semi join | left outer join | right join | right outer join | full join | full outer join]
on join condition
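A join following this form could look like the sketch below. It is meant for spark-shell and joins the emp table to itself, on the assumption (not stated in the original) that the mgr column holds the eid of each employee's manager. The aliases e, m and manager are illustrative.

```scala
// Run inside spark-shell: `spark` is the SparkSession the shell creates.
// Assumption: emp.mgr stores the eid of the employee's manager.
spark.sql("""
  select e.ename, m.ename as manager
  from emp e
  left join emp m
  on e.mgr = m.eid
""").show()
```

With left join, employees that have no matching manager row are kept and get null in the manager column. Swapping in inner join would drop them instead.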

The SQL framework of Spark SQL

Integration with the Hive Metastore

(1) Spark must be able to find the HDFS and Hive configuration files

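  • Method 1: copy core-site.xml, hdfs-site.xml, and hive-site.xml directly into the conf directory under the Spark installation directory. The drawback is that whenever the HDFS or Hive configuration changes, the copies in Spark's conf directory must be updated by hand.
  • Method 2: specify the Hadoop configuration directory in Spark's configuration file (for example, by setting HADOOP_CONF_DIR in conf/spark-env.sh).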

(2) Once Spark SQL is integrated with the Hive Metastore, Hive tables can be queried directly with spark.sql("select … from table where …")
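In spark-shell the `spark` session is created for you. In a standalone application the session has to be built explicitly, and a minimal sketch might look like the following. The object name and application name are hypothetical, and the sketch assumes the spark-hive module is on the classpath and that hive-site.xml is visible to Spark as described in (1).

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical object name, chosen only for this example.
object HiveMetastoreExample {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() asks Spark to use the Hive Metastore as its catalog.
    // It relies on hive-site.xml being found by Spark, as described in (1).
    val spark = SparkSession.builder()
      .appName("HiveMetastoreExample")
      .enableHiveSupport()
      .getOrCreate()

    // List the tables the metastore knows about in the current database.
    spark.sql("show tables").show()

    spark.stop()
  }
}
```

Without enableHiveSupport(), Spark falls back to its own in-memory catalog, so tables defined in Hive would not be listed.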

15.4 Examples

(1) spark-shell

[root@node1 ~]# spark-shell
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Spark context Web UI available at http://192.168.80.131:4040
Spark context available as 'sc' (master = local[*], app id = local-).
Spark session available as 'spark'.
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 2.2.
/_/ Using Scala version 2.11. (Java HotSpot(TM) -Bit Server VM, Java 1.8.0_112)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.sql("show databases").show
+------------+
|databaseName|
+------------+
| default|
| test|
+------------+

scala> spark.sql("show tables").show
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default| copyemp| false|
| default| demo| false|
| default| dept| false|
| default| dual| false|
| default| emp| false|
| default| empbak| false|
| default|employees| false|
| default| mytb| false|
| default| users| false|
+--------+---------+-----------+

scala> spark.sql("select * from emp").show
+----+------+---------+----+----------+------+------+----+
| eid| ename| job| mgr| hiredate| sal| comm| did|
+----+------+---------+----+----------+------+------+----+
|| CLARK| MANAGER||--|2450.0| 0.0| |
|| KING|PRESIDENT| |--|5000.0| 0.0| |
||MILLER| CLERK||--|1300.0| 0.0| |
|| SMITH| CLERK||--| 800.0| 0.0| |
|| JONES| MANAGER||--|2975.0| 0.0| |
|| FORD| ANALYST||--|3000.0| 0.0| |
|| ALLEN| SALESMAN||--|1600.0| 300.0| |
|| WARD| SALESMAN||--|1250.0| 500.0| |
||MARTIN| SALESMAN||--|1250.0|1400.0| |
|| BLAKE| MANAGER||--|2850.0| 0.0| |
||TURNER| SALESMAN||--|1500.0| 0.0| |
|| JAMES| CLERK||--| 950.0| 0.0| |
||HADRON| null|null|--|6666.0| null|null|
+----+------+---------+----+----------+------+------+----+

(2) spark-sql

[root@node1 ~]# spark-sql
// :: WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
// :: WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.
// :: WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
spark-sql> show databases;
default
test
Time taken: 3.93 seconds, Fetched row(s)
spark-sql> show tables;
default copyemp false
default demo false
default dept false
default dual false
default emp false
default empbak false
default employees false
default mytb false
default users false
Time taken: 0.145 seconds, Fetched row(s)
spark-sql> select * from emp;
CLARK MANAGER -- 2450.0 0.0
KING PRESIDENT -- 5000.0 0.0
MILLER CLERK -- 1300.0 0.0
SMITH CLERK -- 800.0 0.0
JONES MANAGER -- 2975.0 0.0
FORD ANALYST -- 3000.0 0.0
ALLEN SALESMAN -- 1600.0 300.0
WARD SALESMAN -- 1250.0 500.0
MARTIN SALESMAN -- 1250.0 1400.0
BLAKE MANAGER -- 2850.0 0.0
TURNER SALESMAN -- 1500.0 0.0
JAMES CLERK -- 950.0 0.0
HADRON NULL NULL -- 6666.0 NULL NULL
Time taken: 3.266 seconds, Fetched row(s)
spark-sql>
