DATABASE SYSTEM CONCEPTS, SIXTH EDITION
11.1 Basic Concepts
An index for a file in a database system works in much the same way as the index
in this textbook. If we want to learn about a particular topic (specified by a word
or a phrase) in this textbook, we can search for the topic in the index at the back
of the book, find the pages where it occurs, and then read the pages to find the
information for which we are looking. The words in the index are in sorted order,
making it easy to find the word we want. Moreover, the index is much smaller
than the book, further reducing the effort needed.
Database-system indices play the same role as book indices in libraries. For
example, to retrieve a student record given an ID, the database system would look
up an index to find on which disk block the corresponding record resides, and
then fetch the disk block, to get the appropriate student record.
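
The lookup just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "disk" is a list of in-memory blocks, the sample records are made up, and a plain dictionary stands in for whatever index structure a real system would use. The names DISK_BLOCKS, build_index, and fetch_student are hypothetical.

```python
# Illustrative sample data (an assumption, not from the text):
# each inner list plays the role of one disk block holding (ID, name) records.
DISK_BLOCKS = [
    [("00128", "Zhang"), ("12345", "Shankar")],
    [("19991", "Brandt"), ("23121", "Chavez")],
    [("44553", "Peltier"), ("45678", "Levy")],
]


def build_index(blocks):
    """Map each student ID to the number of the block that holds its record."""
    index = {}
    for block_no, block in enumerate(blocks):
        for student_id, _name in block:
            index[student_id] = block_no
    return index


def fetch_student(student_id, index, blocks):
    """Consult the index first, then read only the one block it points to."""
    block_no = index.get(student_id)
    if block_no is None:
        return None  # no record with this ID
    for record in blocks[block_no]:
        if record[0] == student_id:
            return record
    return None


index = build_index(DISK_BLOCKS)
print(fetch_student("23121", index, DISK_BLOCKS))  # ('23121', 'Chavez')
```

The point of the sketch is that only one block is scanned per lookup, rather than every block in the file.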
Keeping a sorted list of students’ IDs would not work well on very large
databases with thousands of students, since the index would itself be very big;
further, even though keeping the index sorted reduces the search time, finding a
student can still be rather time-consuming. Instead, more sophisticated indexing
techniques may be used. We shall discuss several of these techniques in this
chapter.
There are two basic kinds of indices:


Ordered indices. Based on a sorted ordering of the values.


Hash indices. Based on a uniform distribution of values across a range of
buckets. The bucket to which a value is assigned is determined by a function,
called a hash function.
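
To make the two kinds concrete, here is a minimal Python sketch of each. It is a toy under stated assumptions, not one of the techniques discussed later in the chapter: the ordered index is a sorted in-memory list searched by binary search, the hash index uses a deliberately simple byte-sum hash function, and the class names OrderedIndex and HashIndex are hypothetical.

```python
import bisect


class OrderedIndex:
    """Keeps (key, block_no) entries sorted by key; lookup uses binary search."""

    def __init__(self, entries):
        self._entries = sorted(entries)
        self._keys = [key for key, _ in self._entries]

    def lookup(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._entries[i][1]
        return None


class HashIndex:
    """Distributes (key, block_no) entries across a fixed number of buckets."""

    def __init__(self, entries, num_buckets=4):
        self._buckets = [[] for _ in range(num_buckets)]
        for key, block_no in entries:
            self._buckets[self._hash(key)].append((key, block_no))

    def _hash(self, key):
        # Toy hash function (an assumption for illustration): sum of the
        # key's bytes modulo the bucket count. It is not uniform in general.
        return sum(key.encode()) % len(self._buckets)

    def lookup(self, key):
        # Only the single bucket chosen by the hash function is searched.
        for k, block_no in self._buckets[self._hash(key)]:
            if k == key:
                return block_no
        return None


# Made-up sample entries: (student ID, block number).
entries = [("23121", 1), ("00128", 0), ("44553", 2), ("12345", 0)]
ordered = OrderedIndex(entries)
hashed = HashIndex(entries)
print(ordered.lookup("44553"), hashed.lookup("44553"))  # 2 2
```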

We shall consider several techniques for both ordered indexing and hashing.
No one technique is the best. Rather, each technique is best suited to particular
database applications. Each technique must be evaluated on the basis of these
factors:


Access types: The types of access that are supported efficiently. Access types
can include finding records with a specified attribute value and finding
records whose attribute values fall in a specified range.

Access time: The time it takes to find a particular data item, or set of items,
using the technique in question.

Insertion time: The time it takes to insert a new data item. This value includes
the time it takes to find the correct place to insert the new data item, as well
as the time it takes to update the index structure.

Deletion time: The time it takes to delete a data item. This value includes
the time it takes to find the item to be deleted, as well as the time it takes to
update the index structure.

Space overhead: The additional space occupied by an index structure. Provided
that the amount of additional space is moderate, it is usually worthwhile
to sacrifice the space to achieve improved performance.

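
As a rough illustration of why access types matter, the following Python sketch contrasts a range query on a sorted key list with the same query on hashed buckets. It is a simplified model under stated assumptions: keys are fixed-width ID strings held in memory, the bucket contents are made up, and the helper names range_ordered, range_hashed, and insert_ordered are hypothetical.

```python
import bisect


def range_ordered(sorted_keys, low, high):
    """Range query on a sorted list: two binary searches, then one slice."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


def range_hashed(buckets, low, high):
    """Range query on hashed buckets: hashing scatters neighboring keys,
    so every bucket has to be examined."""
    return sorted(k for bucket in buckets for k in bucket if low <= k <= high)


def insert_ordered(sorted_keys, key):
    """Insertion must first find the correct place, then update the structure."""
    bisect.insort(sorted_keys, key)


# Made-up sample data: the same five keys, stored two ways.
keys = ["00128", "12345", "19991", "23121", "44553"]
buckets = [["44553", "00128"], ["19991"], ["12345", "23121"]]

print(range_ordered(keys, "12000", "20000"))    # ['12345', '19991']
print(range_hashed(buckets, "12000", "20000"))  # ['12345', '19991']
```

Both calls return the same answer, but the ordered version touches only the matching run of keys, while the hashed version scans everything. That is the sense in which a technique supports an access type efficiently or not.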
We often want to have more than one index for a file. For example, we may
wish to search for a book by author, by subject, or by title.
An attribute or set of attributes used to look up records in a file is called a
search key. Note that this definition of key differs from that used in primary key,
candidate key, and superkey. This duplicate meaning for key is (unfortunately) well
established in practice. Using our notion of a search key, we see that if there are
several indices on a file, there are several search keys.
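
The idea of several indices on one file, each with its own search key, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the book records are made up, the "file" is an in-memory list, and build_index is a hypothetical helper. Note that a search key may consist of several attributes and need not identify a record uniquely.

```python
from collections import defaultdict

# Made-up sample file of book records.
books = [
    {"title": "Book One", "author": "Author A", "subject": "Databases"},
    {"title": "Book Two", "author": "Author B", "subject": "Databases"},
    {"title": "Book Three", "author": "Author A", "subject": "Networks"},
]


def build_index(records, search_key):
    """Build an index on `search_key`, a tuple of one or more attribute names.

    Each search-key value maps to the positions of all matching records,
    since several records may share the same value.
    """
    index = defaultdict(list)
    for position, record in enumerate(records):
        key_value = tuple(record[attr] for attr in search_key)
        index[key_value].append(position)
    return index


# Three indices on the same file, hence three search keys.
by_author = build_index(books, ("author",))
by_subject = build_index(books, ("subject",))
by_author_subject = build_index(books, ("author", "subject"))

print(by_author[("Author A",)])                     # [0, 2]
print(by_author_subject[("Author A", "Networks")])  # [2]
```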
