
tidb/index_lookup_hash_join.go at master · pingcap/tidb

Hash Join: Basic Steps

The optimizer uses the smaller data source to build a hash table on the join key in memory, and then scans the larger table to find the joined rows.

The basic steps are as follows:

  1. The database performs a full scan of the smaller data set, called the build table, and then applies a hash function to the join key in each row to build a hash table in the Program Global Area (PGA).

    In pseudocode, the algorithm might look as follows:

    FOR small_table_row IN (SELECT * FROM small_table) LOOP
      slot_number := HASH(small_table_row.join_key);
      INSERT_HASH_TABLE(slot_number, small_table_row);
    END LOOP;

  2. The database probes the second data set, called the probe table, using whichever access mechanism has the lowest cost.

    Typically, the database performs a full scan of both the smaller and the larger data sets. The algorithm in pseudocode might look as follows:

    FOR large_table_row IN (SELECT * FROM large_table) LOOP
      slot_number := HASH(large_table_row.join_key);
      small_table_row := LOOKUP_HASH_TABLE(slot_number, large_table_row.join_key);
      IF small_table_row FOUND THEN
        output small_table_row + large_table_row;
      END IF;
    END LOOP;

    For each row retrieved from the larger data set, the database does the following:

    1. Applies the same hash function to the join column or columns to calculate the number of the relevant slot in the hash table.

      For example, to probe the hash table for department ID 30, the database applies the hash function to 30, which generates the hash value 4.

    2. Probes the hash table to determine whether rows exist in the slot.

      If no rows exist, then the database processes the next row in the larger data set. If rows exist, then the database proceeds to the next step.

    3. Checks the join column or columns for a match. If a match occurs, then the database either reports the rows or passes them to the next step in the plan, and then processes the next row in the larger data set.

      If multiple rows exist in the hash table slot, the database walks through the linked list of rows, checking each one. For example, if department 30 hashes to slot 4, then the database checks each row until it finds 30.
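
The build and probe steps above can be sketched in Python. This is only an illustration, not the database's implementation: the slot count, the helper names (slot_for, build_hash_table, probe), and the sample rows are all assumptions. Department 38 is a made-up row chosen because it lands in the same slot as department 30 under this toy hash function, which shows why the key comparison in step 3 is still needed.

```python
from collections import defaultdict

NUM_SLOTS = 8  # hypothetical slot count, kept tiny so that collisions occur


def slot_for(key):
    """Map a join key to a slot number (stand-in for the database's hash function)."""
    return hash(key) % NUM_SLOTS


def build_hash_table(build_rows, key):
    """Build phase: scan the build table and file each row under its slot."""
    table = defaultdict(list)  # slot number -> chain of rows
    for row in build_rows:
        table[slot_for(row[key])].append(row)
    return table


def probe(table, probe_rows, build_key, probe_key):
    """Probe phase: hash each probe row, walk the chain in its slot, emit matches."""
    for probe_row in probe_rows:
        chain = table.get(slot_for(probe_row[probe_key]), [])
        for build_row in chain:
            # Rows with different keys can share a slot, so compare the keys too.
            if build_row[build_key] == probe_row[probe_key]:
                yield {**build_row, **probe_row}


# Hypothetical sample data.
departments = [
    {"department_id": 30, "department_name": "Purchasing"},
    {"department_id": 38, "department_name": "Made-up department"},
]
employees = [
    {"employee_id": 114, "department_id": 30},
    {"employee_id": 200, "department_id": 10},
    {"employee_id": 115, "department_id": 30},
]

hash_table = build_hash_table(departments, "department_id")
joined = list(probe(hash_table, employees, "department_id", "department_id"))
for row in joined:
    print(row["employee_id"], row["department_name"])
```

The employee in department 10 hashes to an empty slot and is skipped without any key comparison. The two employees in department 30 walk a chain of two rows and match only the first.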

Example 9-4 Hash Joins

An application queries the oe.orders and oe.order_items tables, joining on the order_id column.

SELECT o.customer_id, l.unit_price * l.quantity
FROM orders o, order_items l
WHERE l.order_id = o.order_id;

The execution plan is as follows:

--------------------------------------------------------------------------
| Id | Operation          | Name        | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |             |  665 | 13300 |     8  (25)|
|* 1 |  HASH JOIN         |             |  665 | 13300 |     8  (25)|
|  2 |   TABLE ACCESS FULL| ORDERS      |  105 |   840 |     4  (25)|
|  3 |   TABLE ACCESS FULL| ORDER_ITEMS |  665 |  7980 |     4  (25)|
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - access("L"."ORDER_ID"="O"."ORDER_ID")

Because the orders table is small relative to the order_items table, which is about 6 times larger (an estimated 665 rows versus 105), the database hashes orders. In a hash join, the data set for the build table always appears first in the list of operations (Step 2). In Step 3, the database performs a full scan of the larger order_items table, probing the hash table for each row.
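
As a rough illustration of the build-side choice described above, the following Python sketch hashes whichever input has fewer rows and probes with the other. It rests on several assumptions: the function name hash_join and the sample rows are hypothetical, a Python dict stands in for the hash table, and comparing len() is a simplification of the optimizer's decision, which relies on cost estimates rather than exact row counts.

```python
def hash_join(left, right, key):
    """Join two row lists on `key`, hashing whichever input is smaller."""
    # Simplification: a real optimizer decides from statistics, not exact counts.
    build, probe = (left, right) if len(left) <= len(right) else (right, left)

    table = {}  # join key -> list of build rows (the dict does the hashing)
    for row in build:
        table.setdefault(row[key], []).append(row)

    for probe_row in probe:
        for build_row in table.get(probe_row[key], []):
            yield {**build_row, **probe_row}


# Hypothetical sample rows shaped like the orders and order_items tables.
orders = [
    {"order_id": 1, "customer_id": 101},
    {"order_id": 2, "customer_id": 102},
]
order_items = [
    {"order_id": 1, "unit_price": 10.0, "quantity": 2},
    {"order_id": 1, "unit_price": 5.0, "quantity": 1},
    {"order_id": 2, "unit_price": 7.5, "quantity": 4},
]

# Same projection as the query: customer_id and unit_price * quantity.
result = [
    (row["customer_id"], row["unit_price"] * row["quantity"])
    for row in hash_join(order_items, orders, "order_id")
]
print(result)
```

Even though order_items is passed first, the sketch builds the hash table from orders because it has fewer rows.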

How Hash Joins Work When the Hash Table Does Not Fit in the PGA

The database must use a different technique when the hash table does not fit entirely in the PGA. In this case, the database uses a temporary space to hold portions (called partitions) of the hash table, and sometimes portions of the larger table that probes the hash table.

The basic process is as follows:

  1. The database performs a full scan of the smaller data set, and then builds an array of hash buckets in both the PGA and on disk.

    When the PGA hash area fills up, the database finds the largest partition within the hash table and writes it to temporary space on disk. The database stores any new row that belongs to this on-disk partition on disk, and all other rows in the PGA. Thus, part of the hash table is in memory and part of it on disk.

  2. The database takes a first pass at reading the other data set.

    For each row, the database does the following:

    1. Applies the same hash function to the join column or columns to calculate the number of the relevant hash bucket.

    2. Probes the hash table to determine whether rows exist in the bucket in memory.

      If the hashed value points to a row in memory, then the database completes the join and returns the row. If the value points to a hash partition on disk, however, then the database stores this row in the temporary tablespace, using the same partitioning scheme used for the original data set.

  3. The database reads each on-disk temporary partition one by one.

  4. The database joins each partition row to the row in the corresponding on-disk temporary partition.
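
The four steps above can be simulated with a small Python sketch. It is a toy model under stated assumptions, not how the database manages the PGA or temporary space: the partition count, the row-count memory budget, the helper names, and the use of temporary files holding JSON lines are all hypothetical choices made for illustration.

```python
import json
import tempfile

NUM_PARTITIONS = 4  # hypothetical partition count
MEMORY_LIMIT = 3    # hypothetical budget: maximum build rows held in memory


def partition_of(key):
    return hash(key) % NUM_PARTITIONS


def spill(file, row):
    file.write(json.dumps(row) + "\n")


def read_back(file):
    file.seek(0)
    return [json.loads(line) for line in file]


def hybrid_hash_join(build_rows, probe_rows, key):
    in_memory = {p: [] for p in range(NUM_PARTITIONS)}  # partition -> build rows
    build_spill = {}  # partition -> temp file of spilled build rows
    probe_spill = {}  # partition -> temp file of deferred probe rows
    results = []

    # Step 1: build. When the budget is exceeded, write the largest
    # in-memory partition to disk; later rows for it go straight to disk.
    for row in build_rows:
        p = partition_of(row[key])
        if p in build_spill:
            spill(build_spill[p], row)
            continue
        in_memory[p].append(row)
        if sum(len(rows) for rows in in_memory.values()) > MEMORY_LIMIT:
            victim = max(in_memory, key=lambda q: len(in_memory[q]))
            build_spill[victim] = tempfile.TemporaryFile("w+")
            probe_spill[victim] = tempfile.TemporaryFile("w+")
            for evicted in in_memory.pop(victim):
                spill(build_spill[victim], evicted)

    # Step 2: first pass over the probe input.
    for row in probe_rows:
        p = partition_of(row[key])
        if p in probe_spill:
            spill(probe_spill[p], row)  # its build partition is on disk: defer
        else:
            results.extend({**b, **row} for b in in_memory[p] if b[key] == row[key])

    # Steps 3 and 4: read each on-disk partition pair and join it.
    for p, build_file in build_spill.items():
        table = {}
        for b in read_back(build_file):
            table.setdefault(b[key], []).append(b)
        for row in read_back(probe_spill[p]):
            results.extend({**b, **row} for b in table.get(row[key], []))
        build_file.close()
        probe_spill[p].close()
    return results


# Hypothetical sample data: six build rows against a budget of three.
orders = [{"order_id": i, "customer_id": 100 + i} for i in range(1, 7)]
items = [{"order_id": oid, "line": n} for n, oid in enumerate([1, 2, 3, 4, 5, 6, 2])]
joined = hybrid_hash_join(orders, items, "order_id")
print(len(joined))
```

Rows whose partition stays in memory are joined during the first pass, and the rest are joined only after their partitions are read back, so the output order differs from the probe order.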

Hash Join Controls

The USE_HASH hint instructs the optimizer to use a hash join when joining two tables together.

