Algorithms

Sorting Algorithms

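  • Stable sort

    Elements that compare equal keep their original relative order after sorting.
  • Unstable sort

    Elements that compare equal may end up in a different relative order after sorting.
  • Degree of order

    The number of element pairs in the sequence that are already in sorted order.
  • Degree of inversion

    The number of element pairs in the sequence that are out of order.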
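
To make the last two terms concrete, here is a minimal sketch that counts both kinds of pairs by brute force. The function name countPairs is made up for this illustration. For a sequence of n elements the two counts always add up to n(n-1)/2, the total number of pairs.

```javascript
// Brute-force O(n^2) count of ordered pairs and inversions (illustration only).
function countPairs(arr) {
  let ordered = 0;
  let inversions = 0;
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] <= arr[j]) ordered++; // pair already in sorted order
      else inversions++;               // pair out of order
    }
  }
  return { ordered, inversions };
}

const result = countPairs([2, 4, 3, 1, 5, 6]);
console.log(result); // { ordered: 11, inversions: 4 }, and 11 + 4 = 6 * 5 / 2
```

A fully sorted array has zero inversions, so sorting can be seen as driving the inversion count down to 0.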

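1. Insertion sort

  • For each element to insert, scan the already-sorted part of the array and compare to find its position, shift the elements from that position onward one slot to the right, then place the element there.
  • Time complexity: O(n^2)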
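
The description above can be sketched as follows. This is a minimal version that assumes the elements can be compared with `>`, and it works on a copy so the input stays untouched.

```javascript
// Insertion sort: grow a sorted prefix one element at a time.
function insertionSort(arr) {
  const a = arr.slice(); // work on a copy
  for (let i = 1; i < a.length; i++) {
    const value = a[i];
    let j = i - 1;
    // Shift larger elements of the sorted prefix one slot to the right.
    while (j >= 0 && a[j] > value) {
      a[j + 1] = a[j];
      j--;
    }
    a[j + 1] = value; // drop the element into the gap
  }
  return a;
}

console.log(insertionSort([5, 2, 4, 6, 1, 3])); // [1, 2, 3, 4, 5, 6]
```

Because only strictly larger elements are shifted, equal elements never pass each other, so this version is stable.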

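2. Selection sort

  • Split the array into a sorted region and an unsorted region; on each pass, select the smallest element of the unsorted region and place it at the end of the sorted region.
  • Time complexity: O(n^2)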
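
A minimal sketch of the same idea, assuming elements comparable with `<`. The sorted region is a[0..i-1] and the unsorted region is a[i..].

```javascript
// Selection sort: repeatedly move the minimum of the unsorted region
// to the end of the sorted region.
function selectionSort(arr) {
  const a = arr.slice(); // work on a copy
  for (let i = 0; i < a.length - 1; i++) {
    let min = i;
    for (let j = i + 1; j < a.length; j++) {
      if (a[j] < a[min]) min = j;
    }
    if (min !== i) {
      const tmp = a[i]; // the swap is what makes selection sort unstable
      a[i] = a[min];
      a[min] = tmp;
    }
  }
  return a;
}

console.log(selectionSort([4, 5, 6, 3, 2, 1])); // [1, 2, 3, 4, 5, 6]
```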

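3. Merge sort

  • Split the sequence into two halves at the midpoint, sort each half recursively, then merge the two sorted halves.
  • Idea: divide and conquer
  • Time complexity: O(n log n)
  • Common implementation: recursion
  • Characteristics: not in place; it needs O(n) auxiliary storage, which becomes significant for large inputs.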
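
A short recursive sketch of the steps above, assuming elements comparable with `<=`. The merged array is the extra storage mentioned in the last bullet.

```javascript
// Merge sort: split in half, sort each half recursively, merge the results.
function mergeSort(arr) {
  if (arr.length <= 1) return arr.slice();
  const mid = arr.length >> 1;
  const left = mergeSort(arr.slice(0, mid));
  const right = mergeSort(arr.slice(mid));

  const merged = [];
  let i = 0;
  let j = 0;
  while (i < left.length && j < right.length) {
    // Taking from the left half on ties keeps equal elements in order (stable).
    if (left[i] <= right[j]) merged.push(left[i++]);
    else merged.push(right[j++]);
  }
  while (i < left.length) merged.push(left[i++]);
  while (j < right.length) merged.push(right[j++]);
  return merged;
}

console.log(mergeSort([11, 8, 3, 9, 7, 1, 2, 5])); // [1, 2, 3, 5, 7, 8, 9, 11]
```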

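4. Quicksort

  • Choose a pivot (partition point), move the elements smaller than the pivot to its left and the larger ones to its right, then sort the left and right parts recursively in the same way.
  • Idea: divide and conquer
  • Time complexity: O(n log n) on average, O(n^2) in the worst case
  • Common implementation: recursion
  • Characteristics: unstable, in-place sort. Pivot selection matters; common strategies include picking a random element or taking the median of the first, middle, and last elements.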
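
One possible sketch of an in-place quicksort, here with a random pivot and a Lomuto-style partition. Both are choices made for this example; other partition schemes work just as well.

```javascript
// In-place quicksort with a randomly chosen pivot (Lomuto partition).
function quickSort(a, lo = 0, hi = a.length - 1) {
  if (lo >= hi) return a;

  // Pick a random pivot and park it at the end of the range.
  const r = lo + Math.floor(Math.random() * (hi - lo + 1));
  swap(a, r, hi);
  const pivot = a[hi];

  // Everything left of index i is smaller than the pivot.
  let i = lo;
  for (let j = lo; j < hi; j++) {
    if (a[j] < pivot) {
      swap(a, i, j);
      i++;
    }
  }
  swap(a, i, hi); // put the pivot in its final position

  quickSort(a, lo, i - 1);
  quickSort(a, i + 1, hi);
  return a;
}

function swap(a, x, y) {
  const tmp = a[x];
  a[x] = a[y];
  a[y] = tmp;
}

console.log(quickSort([6, 11, 3, 9, 8, 1, 7])); // [1, 3, 6, 7, 8, 9, 11]
```

The long-distance swaps during partitioning are what break stability: two equal elements can easily trade places.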

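5. Bucket sort

  • Partition the data into m buckets by value range, sort each bucket, then concatenate the buckets in order.
  • Requirement: the value range of the data divides easily into m intervals (buckets)
  • Idea: divide and conquer
  • Time complexity: O(n) when the data is spread evenly across the buckets
  • Use case: external sorting. For example, with TB-scale data, set up m buckets, put each element into the bucket for its value range, then sort the buckets one at a time.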
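
An in-memory sketch of the bucketing step, assuming numeric input. The default of 5 buckets and the use of the built-in sort inside each bucket are arbitrary choices for this example.

```javascript
// Bucket sort for numbers: distribute by value range, sort each bucket,
// then concatenate the buckets in order.
function bucketSort(arr, bucketCount = 5) {
  if (arr.length === 0) return [];
  const min = Math.min(...arr);
  const max = Math.max(...arr);
  // Width of one bucket's value range; fall back to 1 when all values are equal.
  const width = (max - min) / bucketCount || 1;

  const buckets = Array.from({ length: bucketCount }, () => []);
  for (const x of arr) {
    // Clamp so that the maximum value lands in the last bucket.
    const idx = Math.min(bucketCount - 1, Math.floor((x - min) / width));
    buckets[idx].push(x);
  }

  let sorted = [];
  for (const bucket of buckets) {
    bucket.sort((p, q) => p - q);
    sorted = sorted.concat(bucket);
  }
  return sorted;
}

console.log(bucketSort([29, 25, 3, 49, 9, 37, 21, 43])); // [3, 9, 21, 25, 29, 37, 43, 49]
```

For external sorting each bucket would be a file on disk rather than an array, but the distribution logic stays the same.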

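6. Radix sort

  • Sort the data one digit at a time with a stable sorting algorithm, starting from the last (least significant) digit.
  • Requirement: the data can be split into independent 'digits' that can be compared, and the digits are ordered by significance
  • Time complexity: O(m*n), where m is the number of digits
  • Use case: phone numbers, dictionary ordering of English words, etc.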
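
A least-significant-digit sketch for non-negative integers. The per-digit pass appends to ten buckets in input order, which acts as the stable sort the method requires. Restricting the input to non-negative integers is an assumption of this example.

```javascript
// LSD radix sort for non-negative integers, base 10.
function radixSort(arr) {
  let a = arr.slice();
  const max = Math.max(0, ...a);
  // One pass per decimal digit, starting from the last digit.
  for (let exp = 1; Math.floor(max / exp) > 0; exp *= 10) {
    const buckets = Array.from({ length: 10 }, () => []);
    for (const x of a) {
      // Appending preserves the order from the previous pass (stable).
      buckets[Math.floor(x / exp) % 10].push(x);
    }
    a = [].concat(...buckets);
  }
  return a;
}

console.log(radixSort([170, 45, 75, 90, 802, 24, 2, 66])); // [2, 24, 45, 66, 75, 90, 170, 802]
```

If a pass were not stable, the ordering established by the lower digits would be lost, and the final result would be wrong.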

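7. Binary search

  • Compare the target with the middle element; if it is smaller, repeat the search recursively on the left half, and if it is larger, on the right half.
  • Requirement: the collection being searched is a sorted 'array'
  • Time complexity: O(log n)
  • Use case: the data set should be neither too large nor too small (for tiny inputs a linear scan is just as fast, and an array needs contiguous memory, which is hard to get for very large data)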
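
The halving step can be sketched like this. The text describes it recursively; this version uses a loop instead, which is equivalent and avoids deep recursion. It assumes an ascending array and returns -1 when the target is absent.

```javascript
// Iterative binary search on an ascending array.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = lo + ((hi - lo) >> 1); // middle index of the current range
    if (sorted[mid] === target) return mid;
    if (sorted[mid] < target) lo = mid + 1; // continue in the right half
    else hi = mid - 1;                      // continue in the left half
  }
  return -1;
}

console.log(binarySearch([1, 3, 5, 7, 9, 11], 7)); // 3
console.log(binarySearch([1, 3, 5, 7, 9, 11], 4)); // -1
```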

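String Search in JavaScript

Reviewing the old teaches something new. I have recently been revisiting data structures and algorithms, so let's look at how some of them are implemented in JS, using the source of V8, Chrome's JavaScript engine.

First, let's see which algorithm V8 uses for string search.

As we know, String in JS inherits from Object. The source is as follows: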

// https://github.com/v8/v8/blob/master/src/objects/string.cc
// Line 942: definition of the string search method
int String::IndexOf(Isolate* isolate, Handle<String> receiver,
                    Handle<String> search, int start_index) {
  ...
  // Remaining code omitted; the return value shows that it calls SearchString
  // Line 968
  return SearchString<const uc16>(isolate, receiver_content, pat_vector,
                                  start_index);
}

// Line 18
#include "src/strings/string-search.h"
// Follow the include to find that file:
// https://github.com/v8/v8/blob/master/src/strings/string-search.h
// Here is the key part (Line 20).
// The comments below show that string search in JS uses the Boyer-Moore (BM)
// algorithm, and that BM is only used for patterns of at least 7 characters.
class StringSearchBase {
 protected:
  // Cap on the maximal shift in the Boyer-Moore implementation. By setting a
  // limit, we can fix the size of tables. For a needle longer than this limit,
  // search will not be optimal, since we only build tables for a suffix
  // of the string, but it is a safe approximation.
  static const int kBMMaxShift = Isolate::kBMMaxShift;

  // Bad-char shift table stored in the state. Its length is the alphabet size.
  // For patterns below this length, the skip length of Boyer-Moore is too short
  // to compensate for the algorithmic overhead compared to simple brute force.
  static const int kBMMinPatternLength = 7;
};

// The most important part: Line 54
template <typename PatternChar, typename SubjectChar>
class StringSearch : private StringSearchBase {
 public:
  StringSearch(Isolate* isolate, Vector<const PatternChar> pattern)
      : isolate_(isolate),
        pattern_(pattern),
        start_(Max(0, pattern.length() - kBMMaxShift)) {
    if (sizeof(PatternChar) > sizeof(SubjectChar)) {
      if (!IsOneByteString(pattern_)) {
        strategy_ = &FailSearch;
        return;
      }
    }
    // Get the length of the pattern
    int pattern_length = pattern_.length();
    // If it is shorter than 7
    // (static const int kBMMinPatternLength = 7;)
    if (pattern_length < kBMMinPatternLength) {
      if (pattern_length == 1) {
        // A pattern of length 1 uses single-character search
        strategy_ = &SingleCharSearch;
        return;
      }
      // Otherwise use linear search
      strategy_ = &LinearSearch;
      return;
    }
    // For length 7 or more, start with InitialSearch, the entry point to
    // the Boyer-Moore family of searches
    strategy_ = &InitialSearch;
  }
  // ... (rest of the class omitted)
};
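
To show the shape of this strategy selection outside of C++, here is a rough JavaScript sketch. It is not V8's code: the function names are invented for this example, and the long-pattern branch uses the simpler Boyer-Moore-Horspool variant (bad-character rule only) in place of V8's full implementation. Only the threshold of 7 is taken from the excerpt above.

```javascript
const MIN_BM_PATTERN_LENGTH = 7; // mirrors kBMMinPatternLength in the excerpt

// Brute force: try every start position.
function linearSearch(text, pattern, start) {
  for (let i = start; i + pattern.length <= text.length; i++) {
    let k = 0;
    while (k < pattern.length && text[i + k] === pattern[k]) k++;
    if (k === pattern.length) return i;
  }
  return -1;
}

// Boyer-Moore-Horspool: compare right to left, then skip ahead based on
// the text character aligned with the last pattern position.
function horspoolSearch(text, pattern, start) {
  const m = pattern.length;
  const shift = new Map(); // bad-character table
  for (let k = 0; k < m - 1; k++) shift.set(pattern[k], m - 1 - k);

  let i = start;
  while (i + m <= text.length) {
    let k = m - 1;
    while (k >= 0 && text[i + k] === pattern[k]) k--;
    if (k < 0) return i; // full match
    i += shift.get(text[i + m - 1]) || m; // unseen character: skip the whole pattern
  }
  return -1;
}

// Hypothetical dispatcher: short patterns are not worth the table setup.
function indexOfSketch(text, pattern, start = 0) {
  if (pattern.length === 0) return Math.min(start, text.length);
  return pattern.length < MIN_BM_PATTERN_LENGTH
    ? linearSearch(text, pattern, start)
    : horspoolSearch(text, pattern, start);
}

console.log(indexOfSketch("the quick brown fox jumps", "brown fox")); // 10
console.log(indexOfSketch("hello world", "wor")); // 6
```

The point of the threshold is visible here: building the shift table costs time up front, and with a short pattern the skips are too small to win that time back.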

Array Sorting in JavaScript

First, let's check whether each browser's sort is stable:

  • IE6+: stable
  • Firefox < 3: unstable
  • Firefox >= 3: stable
  • Chrome < 70: unstable
  • Chrome >= 70: stable
  • Opera < 10: unstable
  • Opera >= 10: stable
  • Safari 4: stable
  • Edge: unstable for long arrays (>512 elements)
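
Stability of the engine you are running can be probed empirically. The sketch below sorts records by a key with many ties and checks that ties stay in their original order. The helper name isSortStable and the array size are choices made for this example. A true result does not prove stability in general, but a false result does prove instability.

```javascript
// Sort by a key with many ties, then verify that tied records
// still appear in their original (index) order.
function isSortStable(size = 1000) {
  const items = Array.from({ length: size }, (_, index) => ({
    key: index % 3, // only three distinct keys, so ties are everywhere
    index,
  }));
  items.sort((a, b) => a.key - b.key);
  for (let i = 1; i < items.length; i++) {
    const prev = items[i - 1];
    const curr = items[i];
    if (prev.key === curr.key && prev.index > curr.index) return false;
  }
  return true;
}

console.log(isSortStable());
```

The size matters on older engines: several of them only switched to an unstable algorithm above a length threshold, so a small array could pass by accident.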

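Since V8 v7.0 / Chrome 70, the source tree no longer contains the /src/js directory; the code was migrated to /src/torque.

For details on V8 Torque, see the V8 Torque documentation.

Let's first look at the implementation of Array.prototype.sort before Chrome 70: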

// https://github.com/v8/v8/blob/6.9.454/src/js/array.js
// Line 802
// As shown below, sort returns the result of InnerArraySort
DEFINE_METHOD(
  GlobalArray.prototype,
  sort(comparefn) {
    if (!IS_UNDEFINED(comparefn) && !IS_CALLABLE(comparefn)) {
      throw %make_type_error(kBadSortComparisonFunction, comparefn);
    }

    var array = TO_OBJECT(this);
    var length = TO_LENGTH(array.length);
    return InnerArraySort(array, length, comparefn);
  }
);

// Line 645
// Next, the definition of InnerArraySort
function InnerArraySort(array, length, comparefn) {
  // In-place QuickSort algorithm.
  // For short (length <= 10) arrays, insertion sort is used for efficiency.

  function InsertionSort(a, from, to) {
    for (var i = from + 1; i < to; i++) {
      var element = a[i];
      for (var j = i - 1; j >= from; j--) {
        var tmp = a[j];
        var order = comparefn(tmp, element);
        if (order > 0) {
          a[j + 1] = tmp;
        } else {
          break;
        }
      }
      a[j + 1] = element;
    }
  };

  // ... (some code omitted)

  function QuickSort(a, from, to) {
    var third_index = 0;
    while (true) {
      // Insertion sort is faster for short arrays.
      if (to - from <= 10) {
        InsertionSort(a, from, to);
        return;
      }
      // ... (some code omitted)
      if (to - high_start < low_end - from) {
        QuickSort(a, high_start, to);
        to = low_end;
      } else {
        QuickSort(a, from, low_end);
        from = high_start;
      }
    }
  };

  if (length < 2) return array;
  // num_non_undefined is computed in the omitted code
  QuickSort(array, 0, num_non_undefined);
  return array;
}

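Problems caused by quicksort

  • Quicksort is well known to be an unstable sorting algorithm, so elements that compare equal could come back in a different order, which caused many problems
  • Since 7.0, V8 sorts JS arrays with a stable algorithm (TimSort, as the code below shows)
  • The V8 blog has a detailed article about the change of sorting algorithm

Now let's look at the sort implementation since Chrome 70: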

// https://github.com/v8/v8/blob/4b9b23521e6fd42373ebbcb20ebe03bf445494f9/third_party/v8/builtins/array-sort.tq

// Line 1236
transitioning macro
ArrayTimSortImpl(context: Context, sortState: SortState, length: Smi) {
  if (length < 2) return;
  let remaining: Smi = length;

  // March over the array once, left to right, finding natural runs,
  // and extending short natural runs to minrun elements.
  let low: Smi = 0;
  const minRunLength: Smi = ComputeMinRunLength(remaining);
  while (remaining != 0) {
    let currentRunLength: Smi = CountAndMakeRun(low, low + remaining);

    // If the run is short, extend it to min(minRunLength, remaining).
    if (currentRunLength < minRunLength) {
      const forcedRunLength: Smi = SmiMin(minRunLength, remaining);
      // Extend the run using binary insertion sort
      BinaryInsertionSort(low, low + currentRunLength, low + forcedRunLength);
      currentRunLength = forcedRunLength;
    }

    // Push run onto pending-runs stack, and maybe merge.
    PushRun(sortState, low, currentRunLength);

    MergeCollapse(context, sortState);

    // Advance to find next run.
    low = low + currentRunLength;
    remaining = remaining - currentRunLength;
  }

  // Finally, merge all remaining runs (the merge sort phase)
  MergeForceCollapse(context, sortState);
  assert(GetPendingRunsSize(sortState) == 1);
  assert(GetPendingRunLength(sortState.pendingRuns, 0) == length);
}

// Line 485
// BinaryInsertionSort is the best method for sorting small arrays: it
// does few compares, but can do data movement quadratic in the number of
// elements. This is an advantage since comparisons are more expensive due
// to calling into JS.
//
// [low, high) is a contiguous range of a array, and is sorted via
// binary insertion. This sort is stable.
//
// On entry, must have low <= start <= high, and that [low, start) is
// already sorted. Pass start == low if you do not know!.
macro BinaryInsertionSort(implicit context: Context, sortState: SortState)(
    low: Smi, startArg: Smi, high: Smi) {
  assert(low <= startArg && startArg <= high);

  const workArray = sortState.workArray;

  let start: Smi = low == startArg ? (startArg + 1) : startArg;

  for (; start < high; ++start) {
    // Set left to where a[start] belongs.
    let left: Smi = low;
    let right: Smi = start;

    const pivot = workArray.objects[right];

    // Invariants:
    //   pivot >= all in [low, left).
    //   pivot  < all in [right, start).
    assert(left < right);

    // Find pivot insertion point.
    while (left < right) {
      const mid: Smi = left + ((right - left) >> 1);
      const order = sortState.Compare(pivot, workArray.objects[mid]);

      if (order < 0) {
        right = mid;
      } else {
        left = mid + 1;
      }
    }
    assert(left == right);

    // The invariants still hold, so:
    //   pivot >= all in [low, left) and
    //   pivot  < all in [left, start),
    //
    // so pivot belongs at left. Note that if there are elements equal
    // to pivot, left points to the first slot after them -- that's why
    // this sort is stable. Slide over to make room.
    for (let p: Smi = start; p > left; --p) {
      workArray.objects[p] = workArray.objects[p - 1];
    }
    workArray.objects[left] = pivot;
  }
}

// Regardless of invariants, merge all runs on the stack until only one
// remains. This is used at the end of the mergesort.
transitioning macro
MergeForceCollapse(context: Context, sortState: SortState) {
  let pendingRuns: FixedArray = sortState.pendingRuns;

  // Reload the stack size because MergeAt might change it.
  while (GetPendingRunsSize(sortState) > 1) {
    let n: Smi = GetPendingRunsSize(sortState) - 2;

    if (n > 0 &&
        GetPendingRunLength(pendingRuns, n - 1) <
            GetPendingRunLength(pendingRuns, n + 1)) {
      --n;
    }
    MergeAt(n);
  }
}
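
For readers who do not know Torque, the BinaryInsertionSort macro above translates almost line by line into plain JavaScript. The version below is a sketch for illustration: the array and the comparator are passed as ordinary parameters here, which is an assumption of this example, since the Torque macro reads them from sortState.

```javascript
// Sorts a[low, high), assuming a[low, start) is already sorted.
// Binary search finds the insertion point; elements are then slid right.
function binaryInsertionSort(a, low, start, high, compare) {
  if (start === low) start++;
  for (; start < high; start++) {
    const pivot = a[start];
    let left = low;
    let right = start;
    while (left < right) {
      const mid = left + ((right - left) >> 1);
      if (compare(pivot, a[mid]) < 0) right = mid;
      else left = mid + 1; // on ties move right, so pivot lands after its equals
    }
    for (let p = start; p > left; p--) a[p] = a[p - 1];
    a[left] = pivot;
  }
  return a;
}

const records = [
  { key: 2, tag: "a" },
  { key: 1, tag: "b" },
  { key: 2, tag: "c" },
  { key: 1, tag: "d" },
];
binaryInsertionSort(records, 0, 0, records.length, (x, y) => x.key - y.key);
console.log(records.map((r) => r.tag).join("")); // "bdac"
```

The output keeps "b" before "d" and "a" before "c", which shows the stability that the comment in the Torque source describes. The number of comparisons per element drops to O(log n), which matters because each comparison may call a user-supplied JS function.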
