算法

排序算法

稳定排序

待排序序列中相等元素在排序完成后，原有先后顺序不变。
非稳定排序
有序度

待排序序列中有序关系的元素对个数。
逆序度

1. 插入排序

遍历有序数组,对比待插入的元素大小,找到位置。把该位置后的元素依次后移。
时间复杂度: O(N2)

2. 选择排序

区分已排序区间和未排序区间,每次从未排序区间选择最小的放在已排序区间的最后。
时间复杂度: O(N2)

3. 归并排序

将待排序元素从中间分为二半,对左右分别递归排序,最后合并在一起。
思想: 分治思想
时间复杂度: O(nLogN)
常见实现: 递归
特点: 非原地排序,需要借助额外存储空间，在数据量较大时空间复杂度较高。

4. 快速排序

选择一个pivot分区点,将小于pivot的数据放到左边,大于的放到右边,在用同样方法递归排序左右。
思想: 分治思想
时间复杂度: O(nLogN)
常见实现: 递归
特点: 非稳定原地排序，另外pivot选择比较重要,常见的有随机选择,前中后三值取中值等。

5. 桶排序

将待排序数据根据值区间划分m个桶,再对桶内进行排序,最后合并
要求: 待排序数据值范围能比较轻松划分为m个区间(桶)
思想: 分治思想
时间复杂度: O(n)
场景: 外部排序, 如TB级别数据,设置m个桶,将符合值区间的元素分别放入不同桶,再对桶分别排序

6. 基数排序

使用稳定排序算法,从最后一位开始进行排序。
要求: 带排序数据可以分割出独立的'位'来进行比较,而且位之间有递进关系
时间复杂度: O(m*n)
场景: 电话号码,英文字典排序等

7. 二分查找

待查找数据与中间数对比,若小与则在左边递归二分,若大于则在右边递归二分
要求：待查找集合为有序'数组'
时间复杂度: O(LogN)
场景：数据量不能太大,也不能太小

JavaScript字符串查找算法

温故而知新，最新在复习数据结构和算法，结合Chrome的V8源码，看看JS中的一些实现。

首先我们看看字符串查找在V8中是使用的哪种算法。

我们知道JS中的String是继承于Object，源码如下:

// https://github.com/v8/v8/blob/master/src/objects/string.cc

// Line942 字符串查找方法定义

int String::IndexOf(Isolate* isolate, Handle<String> receiver,Handle<String> search, int start_index) {

...

// 省略多余代码，根据返回值可见调用SearchString方法

// Line968

  return SearchString<const uc16>(isolate, receiver_content, pat_vector,start_index);

}

// Line18

#include "src/strings/string-search.h"

// 根据头引入查找该文件

// https://github.com/v8/v8/blob/master/src/strings/string-search.h

// 重点来了

// Line 20

*根据以下注释我们可以知道, JS中的字符串查找使用的BM查找算法。模式串最小匹配长度为7

class StringSearchBase {

 protected:

  // Cap on the maximal shift in the Boyer-Moore implementation. By setting a

  // limit, we can fix the size of tables. For a needle longer than this limit,

  // search will not be optimal, since we only build tables for a suffix

  // of the string, but it is a safe approximation.

  static const int kBMMaxShift = Isolate::kBMMaxShift;

  // Bad-char shift table stored in the state. It's length is the alphabet size.

  // For patterns below this length, the skip length of Boyer-Moore is too short

  // to compensate for the algorithmic overhead compared to simple brute force.

  static const int kBMMinPatternLength = 7;

};

// 重点的重点 Line 54

template <typename PatternChar, typename SubjectChar>

class StringSearch : private StringSearchBase {

 public:

  StringSearch(Isolate* isolate, Vector<const PatternChar> pattern)

      : isolate_(isolate),

        pattern_(pattern),

        start_(Max(0, pattern.length() - kBMMaxShift)) {

    if (sizeof(PatternChar) > sizeof(SubjectChar)) {

      if (!IsOneByteString(pattern_)) {

        strategy_ = &FailSearch;

        return;

      }

    }

    // 获取模式串字符长度

    int pattern_length = pattern_.length();

    // 如果小于7

    // static const int kBMMinPatternLength = 7;

    if (pattern_length < kBMMinPatternLength) {

      if (pattern_length == 1) {

          //如果待查找字符串长度为1,使用单字符查找

        strategy_ = &SingleCharSearch;

        return;

      }

      //否则使用线性查找

      strategy_ = &LinearSearch;

      return;

    }

    // 如果大于7,使用BM查找算法

    strategy_ = &InitialSearch;

  }

JavaScript数组排序算法

我们先看看各浏览器的排序算法是否是稳定排序

IE6+: stable
Firefox < 3: unstable
Firefox >= 3: stable
Chrome < 70: unstable
Chrome >= 70: stable
Opera < 10: unstable
Opera >= 10: stable
Safari 4: stable
Edge: unstable for long arrays (>512 elements)

在V8 v7.0/Chrome70以后,源码不在包含/src/js目录,相应的迁移到了/src/torque

关于V8 Torque的详情可以参考V8 Torque

我们先看看Chrome70以前的Array.prototype.sort的实现

// https://github.com/v8/v8/blob/6.9.454/src/js/array.js

// Line 802

// 以下可以知道sort方法返回InnerArraySort结果

DEFINE_METHOD(

  GlobalArray.prototype,

  sort(comparefn) {

    if (!IS_UNDEFINED(comparefn) && !IS_CALLABLE(comparefn)) {

      throw %make_type_error(kBadSortComparisonFunction, comparefn);

    }

    var array = TO_OBJECT(this);

    var length = TO_LENGTH(array.length);

    return InnerArraySort(array, length, comparefn);

  }

);

// Line645

// 我们接下来看InnerArraySort的定义

function InnerArraySort(array, length, comparefn) {

  // In-place QuickSort algorithm.

  // For short (length <= 10) arrays, insertion sort is used for efficiency.

  // 原地快排算法

  // 如果长度小于10,使用插入排序

  function InsertionSort(a, from, to) {

    for (var i = from + 1; i < to; i++) {

      var element = a[i];

      for (var j = i - 1; j >= from; j--) {

        var tmp = a[j];

        var order = comparefn(tmp, element);

        if (order > 0) {

          a[j + 1] = tmp;

        } else {

          break;

        }

      }

      a[j + 1] = element;

    }

  };

  // ...省略部分代码

  function QuickSort(a, from, to) {

    var third_index = 0;

    while (true) {

      // Insertion sort is faster for short arrays.

      if (to - from <= 10) {

        InsertionSort(a, from, to);

        return;

      }

    // ...省略部分代码

      if (to - high_start < low_end - from) {

        QuickSort(a, high_start, to);

        to = low_end;

      } else {

        QuickSort(a, from, low_end);

        from = high_start;

      }

    }

  };

  if (length < 2) return array;

  QuickSort(array, 0, num_non_undefined);

  return array;

}

快排所带来的问题

众所周知快排是非稳定排序算法,由此带来了很多问题
V8在7.0以后将JS的数组排序更改为稳定排序算法
在V8的博客上有一篇详细的介绍关于排序算法更改的文章

再看看Chrome70以后的排序实现

// https://github.com/v8/v8/blob/4b9b23521e6fd42373ebbcb20ebe03bf445494f9/third_party/v8/builtins/array-sort.tq

// Line1236

transitioning macro

  ArrayTimSortImpl(context: Context, sortState: SortState, length: Smi) {

    if (length < 2) return;

    let remaining: Smi = length;

    // March over the array once, left to right, finding natural runs,

    // and extending short natural runs to minrun elements.

    let low: Smi = 0;

    const minRunLength: Smi = ComputeMinRunLength(remaining);

    while (remaining != 0) {

      let currentRunLength: Smi = CountAndMakeRun(low, low + remaining);

      // If the run is short, extend it to min(minRunLength, remaining).

      // 当前执行长度小于最小长度

      if (currentRunLength < minRunLength) {

        const forcedRunLength: Smi = SmiMin(minRunLength, remaining);

        //使用插入排序

        BinaryInsertionSort(low, low + currentRunLength, low + forcedRunLength);

        currentRunLength = forcedRunLength;

      }

      // Push run onto pending-runs stack, and maybe merge.

      PushRun(sortState, low, currentRunLength);

      MergeCollapse(context, sortState);

      // Advance to find next run.

      low = low + currentRunLength;

      remaining = remaining - currentRunLength;

    }

    //其他时候使用归并排序

    MergeForceCollapse(context, sortState);

    assert(GetPendingRunsSize(sortState) == 1);

    assert(GetPendingRunLength(sortState.pendingRuns, 0) == length);

  }

  // Line485

  // BinaryInsertionSort is the best method for sorting small arrays: it

  // does few compares, but can do data movement quadratic in the number of

  // elements. This is an advantage since comparisons are more expensive due

  // to calling into JS.

  //

  //  [low, high) is a contiguous range of a array, and is sorted via

  // binary insertion. This sort is stable.

  //

  // On entry, must have low <= start <= high, and that [low, start) is

  // already sorted. Pass start == low if you do not know!.

  macro BinaryInsertionSort(implicit context: Context, sortState: SortState)(

      low: Smi, startArg: Smi, high: Smi) {

    assert(low <= startArg && startArg <= high);

    const workArray = sortState.workArray;

    let start: Smi = low == startArg ? (startArg + 1) : startArg;

    for (; start < high; ++start) {

      // Set left to where a[start] belongs.

      let left: Smi = low;

      let right: Smi = start;

      const pivot = workArray.objects[right];

      // Invariants:

      //   pivot >= all in [low, left).

      //   pivot  < all in [right, start).

      assert(left < right);

      // Find pivot insertion point.

      while (left < right) {

        const mid: Smi = left + ((right - left) >> 1);

        const order = sortState.Compare(pivot, workArray.objects[mid]);

        if (order < 0) {

          right = mid;

        } else {

          left = mid + 1;

        }

      }

      assert(left == right);

      // The invariants still hold, so:

      //   pivot >= all in [low, left) and

      //   pivot  < all in [left, start),

      //

      // so pivot belongs at left. Note that if there are elements equal

      // to pivot, left points to the first slot after them -- that's why

      // this sort is stable. Slide over to make room.

      for (let p: Smi = start; p > left; --p) {

        workArray.objects[p] = workArray.objects[p - 1];

      }

      workArray.objects[left] = pivot;

    }

  }

  // Regardless of invariants, merge all runs on the stack until only one

  // remains. This is used at the end of the mergesort.

  transitioning macro

  MergeForceCollapse(context: Context, sortState: SortState) {

    let pendingRuns: FixedArray = sortState.pendingRuns;

    // Reload the stack size becuase MergeAt might change it.

    while (GetPendingRunsSize(sortState) > 1) {

      let n: Smi = GetPendingRunsSize(sortState) - 2;

      if (n > 0 &&

          GetPendingRunLength(pendingRuns, n - 1) <

              GetPendingRunLength(pendingRuns, n + 1)) {

        --n;

      }

      MergeAt(n);

    }

  }

巴特西

Web高级 JavaScript中的算法

算法