Hive Architecture

Figure 1 shows how a typical query flows through the Hive system.

The UI calls the execute interface of the Driver (step 1 in Figure 1).

The Driver creates a session handle for the query and sends the query to the compiler, which generates an execution plan (step 2).

The compiler fetches the necessary metadata from the metastore (steps 3 and 4).

This metadata is used to type-check the expressions in the query tree and to prune partitions based on the query predicates.
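Partition pruning can be illustrated with a small sketch. This is not Hive's implementation: the `Partition` class, the `prune_partitions` helper, the table name, and the paths are all hypothetical, and the predicate is modeled as a plain Python function over partition column values.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Hypothetical stand-in for the partition metadata kept in the metastore."""
    table: str
    values: dict      # partition column -> value, e.g. {"ds": "2024-01-02"}
    location: str     # illustrative directory path, not a real layout


def prune_partitions(partitions, predicate):
    """Keep only the partitions whose column values satisfy the predicate."""
    return [p for p in partitions if predicate(p.values)]


partitions = [
    Partition("page_views", {"ds": "2024-01-01"}, "/warehouse/page_views/ds=2024-01-01"),
    Partition("page_views", {"ds": "2024-01-02"}, "/warehouse/page_views/ds=2024-01-02"),
    Partition("page_views", {"ds": "2024-01-03"}, "/warehouse/page_views/ds=2024-01-03"),
]

# Models a query predicate such as: WHERE ds >= '2024-01-02'
selected = prune_partitions(partitions, lambda v: v["ds"] >= "2024-01-02")
```

Only the directories of the selected partitions would need to be scanned, which is the point of pruning before the plan is executed.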

The plan generated by the compiler (step 5) is a DAG of stages, where each stage is a map/reduce job, a metadata operation, or an operation on HDFS.
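To make the "DAG of stages" idea concrete, here is a minimal sketch that models a plan as a dependency graph and derives an order in which the stages could run. The stage names, stage kinds, and the `execution_order` helper are assumptions made for illustration; they do not reflect Hive's internal classes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: stage name -> (stage kind, stages it depends on)
plan = {
    "stage-1": ("mapreduce", set()),
    "stage-2": ("mapreduce", {"stage-1"}),
    "stage-0": ("hdfs_move", {"stage-2"}),
    "stage-3": ("metadata", {"stage-0"}),
}


def execution_order(plan):
    """Return the stages in an order that respects every dependency."""
    graph = {name: deps for name, (_kind, deps) in plan.items()}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(plan)
for name in order:
    kind, _deps = plan[name]
    print(f"run {name} ({kind})")
```

A stage becomes runnable only once all the stages it depends on have finished, so a topological order is one valid schedule for such a plan.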

For map/reduce stages, the plan contains map operator trees (operator trees that are executed on the mappers) and a reduce operator tree (for operations that need reducers). The execution engine submits these stages to the appropriate components (steps 6, 6.1, 6.2 and 6.3).

In each task (mapper or reducer), the deserializer associated with the table or with the intermediate outputs is used to read rows from HDFS files, and these rows are passed through the associated operator tree. Once the output is generated, it is written to a temporary HDFS file through the serializer (this happens in the mapper if the operation does not need a reduce phase).
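The read, transform, write path inside a single task can be sketched as follows. This is a simplified illustration: rows are tab-separated text lines, and `deserialize`, `serialize`, `filter_op`, `select_op`, and `run_task` are hypothetical helpers, not Hive's SerDe or operator APIs.

```python
def deserialize(line):
    """Turn one tab-separated text line into a row (a list of fields)."""
    return line.rstrip("\n").split("\t")


def serialize(row):
    """Turn a row back into one tab-separated text line."""
    return "\t".join(row) + "\n"


def filter_op(predicate):
    """Operator that keeps only the rows matching the predicate."""
    def apply(rows):
        return (row for row in rows if predicate(row))
    return apply


def select_op(indexes):
    """Operator that projects each row onto the given column indexes."""
    def apply(rows):
        return ([row[i] for i in indexes] for row in rows)
    return apply


def run_task(lines, operators):
    """Deserialize the input, push rows through the operators, serialize the output."""
    rows = (deserialize(line) for line in lines)
    for op in operators:
        rows = op(rows)
    return [serialize(row) for row in rows]


input_lines = ["alice\t30\tUS\n", "bob\t17\tDE\n", "carol\t45\tUS\n"]
output_lines = run_task(
    input_lines,
    [filter_op(lambda r: int(r[1]) >= 18), select_op([0, 2])],
)
```

Here the operators form a simple chain rather than a tree, which is enough to show where the deserializer and serializer sit relative to the operators.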

The temporary files are used to provide data to subsequent map/reduce stages of the plan. For DML operations the final temporary file is moved to the table's location.

Writing to a temporary file and then moving it into place ensures that dirty (partially written) data is never read, because file rename is an atomic operation in HDFS.
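The write-then-rename pattern can be tried out on a local filesystem. The sketch below is an analogy only: it uses Python's `os.replace` on local files instead of an HDFS client, and `publish_atomically` and the file names are hypothetical.

```python
import os
import tempfile


def publish_atomically(rows, final_path):
    """Write rows to a temporary file, then rename it to final_path in one step."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temporary file in the target directory so the rename
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for row in rows:
                tmp.write(row + "\n")
        # Readers see either the old file or the complete new one,
        # never a half-written file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as warehouse:
    target = os.path.join(warehouse, "part-00000")
    publish_atomically(["a\t1", "b\t2"], target)
    with open(target, encoding="utf-8") as f:
        published = f.read()
    leftovers = [n for n in os.listdir(warehouse) if n.startswith(".tmp-")]
```

The design choice is the same as in the text: all the slow work happens on a file that no reader looks at, and the only step visible to readers is the rename.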

For queries, the contents of the temporary file are read by the execution engine directly from HDFS as part of the fetch call from the Driver (steps 7, 8 and 9).

Hive Data Model

Metastore

Hive Query Language

References

Hive official documentation: Design
