HashMap and ConcurrentHashMap: JDK 1.7 vs. 1.8

These are study notes, so they may be incomplete or imprecise in places.

I: HashMap in JDK 1.7

In JDK 1.7, HashMap is backed by an array of buckets, and each bucket holds a singly linked list of entries.

    /**
     * The default initial capacity - MUST be a power of two.
     * Default length of the HashMap array. 1 << 4 shifts 1 left by four bits,
     * giving binary 10000, i.e. 16. It is a compile-time constant, so the
     * shift costs nothing at run time; it just makes the power of two explicit.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     * The maximum capacity of the HashMap.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;        // 1,073,741,824

    /**
     * The load factor used when none specified in constructor.
     * The default load factor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * An empty table instance to share when the table is not inflated.
     */
    static final Entry<?,?>[] EMPTY_TABLE = {};

    /**
     * The table, resized as necessary. Length MUST Always be a power of two.
     */
    transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    // If table == EMPTY_TABLE then this is the initial capacity at which the
    // table will be created when inflated.
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     * Counts how many times the HashMap has been structurally modified.
     */
    transient int modCount;

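1: DEFAULT_INITIAL_CAPACITY is the default initial capacity of the HashMap. It must be a power of two.

2: MAXIMUM_CAPACITY is the largest capacity the HashMap supports.

3: DEFAULT_LOAD_FACTOR is the default load factor, 0.75. It determines how full the table may get before it is resized.

4: Entry<K,V>[] table is the bucket array. It is resized as needed, and its length is always a power of two.

5: size records the number of key-value mappings in the HashMap.

6: threshold is the size at which the next resize happens, i.e. capacity * load factor. While the table has not been allocated yet, it temporarily holds the initial capacity.

7: loadFactor is the load factor actually used by this instance. It is final and can only be chosen through the constructor.

8: modCount records how many times the structure of the HashMap has been modified.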
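To make the relationship between capacity, load factor, and threshold concrete, here is a minimal sketch. The class name CapacityDemo and the helper threshold() are made up for illustration; only the three constants come from the source above.

```java
// Hypothetical demo class; the constants mirror the JDK 1.7 fields quoted above.
class CapacityDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // binary 10000 = 16
    static final int MAXIMUM_CAPACITY = 1 << 30;        // 1,073,741,824
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Illustrative helper: the size at which the next resize is considered.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(DEFAULT_INITIAL_CAPACITY);                                  // 16
        System.out.println(threshold(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR)); // 12
        System.out.println(threshold(32, DEFAULT_LOAD_FACTOR));                       // 24
    }
}
```

With the defaults, the table has 16 buckets and a resize is considered once the map holds 12 entries.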

The HashMap source has four constructors. The initial capacity and the load factor can be specified at construction time.

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     * Does two things: (1) assigns threshold and loadFactor, (2) calls init().
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     */
    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)     // clamp to the maximum capacity
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))     // validate loadFactor
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        // all that really happens here is recording loadFactor and initialCapacity
        this.loadFactor = loadFactor;       // record loadFactor
        threshold = initialCapacity;        // threshold temporarily holds the initial capacity (16 by default)
        init();
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     */
    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     */
    public HashMap() {    // 16, 0.75
        this(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,
                      DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);
        inflateTable(threshold);
        putAllForCreate(m);
    }

Next, the put method:

    public V put(K key, V value) {
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);    // allocate the table on first use
        }
        if (key == null)        // a null key is handled separately
            return putForNullKey(value);
        int hash = hash(key);           // compute the hash
        int i = indexFor(hash, table.length);   // derive the bucket index from the hash
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {  // walk the list in bucket i
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {  // same key: overwrite and return the old value
                V oldValue = e.value;
                e.value = value;    // new value replaces the old one
                e.recordAccess(this);   // no-op in HashMap
                return oldValue;    // return the old value
            }
        }
        modCount++;         // bump the modification count, similar to a version number
        addEntry(hash, key, value, i);
        return null;
    }

When the table is still empty, put first calls this method:

    private void inflateTable(int toSize) {
        // Find a power of 2 >= toSize
        int capacity = roundUpToPowerOf2(toSize);
        // threshold now becomes capacity * loadFactor
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        table = new Entry[capacity];    // allocate the table
        initHashSeedAsNeeded(capacity);
    }

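This method allocates the table on first use (growing it later is done by resize, as seen in addEntry below). roundUpToPowerOf2 guarantees that the capacity is always a power of two.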
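The source of roundUpToPowerOf2 is not quoted in these notes. The following is a sketch of one way to round up to a power of two with Integer.highestOneBit; the class name PowerOfTwo is made up, and the body should be read as an illustration rather than as the exact JDK code.

```java
// Hypothetical demo class: rounds a requested size up to the next power of two.
class PowerOfTwo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY)
            return MAXIMUM_CAPACITY;            // never exceed the maximum capacity
        // (number - 1) << 1 lies in [number, 2 * number) rounded region, so its
        // highest set bit is the smallest power of two >= number.
        return (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    public static void main(String[] args) {
        System.out.println(roundUpToPowerOf2(13)); // 16
        System.out.println(roundUpToPowerOf2(16)); // 16
        System.out.println(roundUpToPowerOf2(17)); // 32
    }
}
```

So new HashMap(13) and new HashMap(16) both end up with a 16-slot table, while new HashMap(17) gets 32 slots.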

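When putting an element, HashMap first computes the key's hash and then derives the bucket index from the hash and the table length. The index is h & (length - 1), which is equivalent to a modulo because the length is a power of two. If the bucket already contains an entry with an equal key, the old value is overwritten; otherwise a new entry is added to the bucket's linked list (in JDK 1.7, at the head of the list).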
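The claim that the bit mask behaves like a modulo can be checked directly. This is a small sketch; IndexDemo and sameAsModulo are hypothetical names, and indexFor repeats the one-line JDK 1.7 method.

```java
// Hypothetical demo class comparing h & (length - 1) with a mathematical modulo.
class IndexDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);   // valid only when length is a power of two
    }

    // True if the mask and Math.floorMod agree for this hash and length.
    static boolean sameAsModulo(int h, int length) {
        return indexFor(h, length) == Math.floorMod(h, length);
    }

    public static void main(String[] args) {
        int[] hashes = {0, 1, 17, 2112, -1, Integer.MIN_VALUE, Integer.MAX_VALUE};
        for (int h : hashes) {
            System.out.println(h + " -> " + indexFor(h, 16) + " " + sameAsModulo(h, 16));
        }
    }
}
```

The mask also works for negative hashes, where the % operator would return a negative remainder; that is one reason the table length is kept at a power of two.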

    /**
     * Returns index for hash code h.
     * Computes the position of an entry in the table.
     */
    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

    /**
     * Adds a new entry with the specified key, value and hash code to
     * the specified bucket.  It is the responsibility of this
     * method to resize the table if appropriate.
     *
     * Subclass overrides this to alter the behavior of put method.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        if ((size >= threshold) && (null != table[bucketIndex])) {  // size has reached threshold and the target bucket is already occupied
            resize(2 * table.length);       // grow: double the array length
            hash = (null != key) ? hash(key) : 0;       // recompute the hash
            bucketIndex = indexFor(hash, table.length); // recompute the index
        }
        createEntry(hash, key, value, bucketIndex);     // create the entry
    }

    /**
     * Like addEntry except that this version is used when creating entries
     * as part of Map construction or "pseudo-construction" (cloning,
     * deserialization).  This version needn't worry about resizing the table.
     *
     * Subclass overrides this to alter the behavior of HashMap(Map),
     * clone, and readObject.
     */
    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex];      // current head of the bucket
        table[bucketIndex] = new Entry<>(hash, key, value, e);   // new entry becomes the head; its next points to the old head
        size++;         // update the number of entries in the map
    }

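With the default capacity of 16 and load factor of 0.75, the threshold is 16 * 0.75 = 12. As the source above shows, reaching the threshold is not enough to trigger a resize: addEntry resizes only when size >= threshold and the bucket that the new entry maps to is already occupied. If that bucket is still null, the entry is added without resizing. Otherwise the table is doubled, and because the table length has changed, the hash and the bucket index are recomputed before the entry is created.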
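The two-part resize condition of JDK 1.7 can be isolated in a tiny sketch. ResizeCheck and shouldResize are hypothetical names; the boolean expression mirrors the condition in addEntry.

```java
// Hypothetical helper mirroring the resize condition in JDK 1.7 addEntry.
class ResizeCheck {
    static boolean shouldResize(int size, int threshold, boolean bucketOccupied) {
        return size >= threshold && bucketOccupied;
    }

    public static void main(String[] args) {
        int threshold = (int) (16 * 0.75f);                    // 12 with the defaults
        System.out.println(shouldResize(11, threshold, true));  // false: below threshold
        System.out.println(shouldResize(12, threshold, false)); // false: target bucket is empty
        System.out.println(shouldResize(12, threshold, true));  // true: both conditions hold
    }
}
```

One consequence is that a JDK 1.7 map can briefly hold more entries than its threshold, as long as the new entries keep landing in empty buckets.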

Next, the get method:

    public V get(Object key) {
        if (key == null)
            return getForNullKey();
        Entry<K,V> entry = getEntry(key);
        return null == entry ? null : entry.getValue();
    }

    /**
     * Returns the entry associated with the specified key in the
     * HashMap.  Returns null if the HashMap contains no mapping
     * for the key.
     */
    final Entry<K,V> getEntry(Object key) {
        if (size == 0) {
            return null;
        }
        int hash = (key == null) ? 0 : hash(key);
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

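get works the same way: it computes the hash, derives the bucket index, and then walks the list in that bucket looking for an entry with an equal key.

II: HashMap in JDK 1.8

The biggest difference between HashMap in JDK 1.8 and in 1.7 is the introduction of red-black trees.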

    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K,V>[] table;

    /**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;

    /**
     * Basic hash bin node, used for most entries.  (See below for
     * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
     */
    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }

        public final K getKey()        { return key; }
        public final V getValue()      { return value; }
        public final String toString() { return key + "=" + value; }

        public final int hashCode() {
            return Objects.hashCode(key) ^ Objects.hashCode(value);
        }

        public final V setValue(V newValue) {
            V oldValue = value;
            value = newValue;
            return oldValue;
        }

        public final boolean equals(Object o) {
            if (o == this)
                return true;
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>)o;
                if (Objects.equals(key, e.getKey()) &&
                    Objects.equals(value, e.getValue()))
                    return true;
            }
            return false;
        }
    }

Now the put method:

    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    /**
     * Implements Map.put and related methods.  Adds an entry.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)     // table is null or empty
            n = (tab = resize()).length;                        // allocate it via resize()
        if ((p = tab[i = (n - 1) & hash]) == null)              // compute index i and read the node p there; the bin is empty
            tab[i] = newNode(hash, key, value, null);       // create a new node and store it in the array
        else {                  // the bin is not empty
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))     // the head node has the same key
                e = p;      // remember it; its value is replaced below
            else if (p instanceof TreeNode)     // the bin is a tree
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);     // insert into the tree
            else {                                          // different key and not a TreeNode
                for (int binCount = 0; ; ++binCount) {      // walk the list in bin i
                    if ((e = p.next) == null) {             // reached the tail
                        p.next = newNode(hash, key, value, null);       // append a node at the tail
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st    // the bin already held at least 8 nodes
                            treeifyBin(tab, hash);             // convert the list to a red-black tree
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))     // same key found: leave the loop
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

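As the source above shows, put in JDK 1.8 can turn a bin into a red-black tree. When a new node is appended to a bin that already holds TREEIFY_THRESHOLD (8) nodes, treeifyBin is called. According to the comment on MIN_TREEIFY_CAPACITY, the bin is only treeified if the table has at least 64 slots; otherwise the table is resized instead. During a resize a tree bin is split across the new table, and a part left with UNTREEIFY_THRESHOLD (6) or fewer nodes is converted back to a linked list.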
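The interplay of the three thresholds can be summarised in a small decision function. This is a sketch based on the constants and Javadoc quoted above, not the JDK code itself; TreeifyDemo, Action, onAppend, and staysTreeAfterSplit are hypothetical names.

```java
// Hypothetical decision helper built from the JDK 1.8 HashMap constants.
class TreeifyDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int UNTREEIFY_THRESHOLD = 6;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    // nodesBeforeAppend: list length of the bin before the new node is appended.
    static Action onAppend(int nodesBeforeAppend, int tableLength) {
        if (nodesBeforeAppend < TREEIFY_THRESHOLD)
            return Action.NONE;                       // the list is still short
        return tableLength < MIN_TREEIFY_CAPACITY
                ? Action.RESIZE                       // small table: grow it instead
                : Action.TREEIFY;                     // large table: build a tree
    }

    // After a resize splits a tree bin, does a part with this many nodes stay a tree?
    static boolean staysTreeAfterSplit(int nodesInPart) {
        return nodesInPart > UNTREEIFY_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(onAppend(7, 64));  // NONE
        System.out.println(onAppend(8, 16));  // RESIZE
        System.out.println(onAppend(8, 64));  // TREEIFY
    }
}
```

The gap between 8 and 6 gives some hysteresis, so a bin that hovers around the boundary does not flip between list and tree on every change.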

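III: ConcurrentHashMap in JDK 1.7

ConcurrentHashMap in JDK 1.7 differs from HashMap in JDK 1.7 in what the top-level array stores. As we know, ConcurrentHashMap is thread-safe, and the source below shows how: the array holds Segments, and each put locks only the Segment it touches.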

    public V put(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);       // compute the hash
        int j = (hash >>> segmentShift) & segmentMask;      // compute the segment index j
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);       // return the segment at j, creating it first if absent
        return s.put(key, hash, value, false);  // delegate the put to that segment
    }

    final V put(K key, int hash, V value, boolean onlyIfAbsent) {
        HashEntry<K,V> node = tryLock() ? null :
            scanAndLockForPut(key, hash, value);        // null if tryLock succeeds; otherwise scanAndLockForPut acquires the lock
        V oldValue;
        try {
            HashEntry<K,V>[] tab = table;
            int index = (tab.length - 1) & hash;        // index from the table length and the hash
            HashEntry<K,V> first = entryAt(tab, index); // head of the list at index
            for (HashEntry<K,V> e = first;;) {      // walk the list starting from first
                if (e != null) {
                    K k;
                    if ((k = e.key) == key ||
                        (e.hash == hash && key.equals(k))) {        // same key
                        oldValue = e.value;                 // read the old value
                        if (!onlyIfAbsent) {                // onlyIfAbsent is false
                            e.value = value;                // overwrite the old value
                            ++modCount;
                        }
                        break;      // found: stop walking the list
                    }
                    e = e.next;     // different key: keep walking
                }
                else {              // reached the end (e == null): the key is not present
                    if (node != null)   // link the new node in at the head of the list
                        node.setNext(first);
                    else
                        node = new HashEntry<K,V>(hash, key, value, first); // create the new entry
                    int c = count + 1;      // count tracks the number of entries in this segment
                    if (c > threshold && tab.length < MAXIMUM_CAPACITY)     // entry count exceeds threshold and the table is below maximum capacity
                        rehash(node);       // like resize: doubles the table, repacks the entries, and adds the given node to the new table
                    else        // there is still room
                        setEntryAt(tab, index, node);   // install node as the new head at index
                    ++modCount;     // bump the modification count
                    count = c;      // count is now one larger
                    oldValue = null;
                    break;
                }
            }
        } finally {
            unlock();           // release the lock when done
        }
        return oldValue;        // return the old value
    }

    private Segment<K,V> ensureSegment(int k) {
        final Segment<K,V>[] ss = this.segments;
        long u = (k << SSHIFT) + SBASE; // raw offset of slot k
        Segment<K,V> seg;
        if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u)) == null) {    // no segment at slot k
            Segment<K,V> proto = ss[0]; // use segment 0 as prototype
            int cap = proto.table.length;   // copy the capacity from the prototype
            float lf = proto.loadFactor;    // and the load factor
            int threshold = (int)(cap * lf);    // compute the threshold
            HashEntry<K,V>[] tab = (HashEntry<K,V>[])new HashEntry[cap];
            if ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                == null) { // recheck: slot k is still empty
                Segment<K,V> s = new Segment<K,V>(lf, threshold, tab);  // create the segment
                while ((seg = (Segment<K,V>)UNSAFE.getObjectVolatile(ss, u))
                       == null) {   // spin while slot k is still empty
                    if (UNSAFE.compareAndSwapObject(ss, u, null, seg = s))  // CAS succeeded: done
                        break;
                }
            }
        }
        return seg;
    }

    /**
     * The segments, each of which is a specialized hash table.
     */
    final Segment<K,V>[] segments;

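As shown, the top-level array of ConcurrentHashMap in JDK 1.7 stores Segments, and each Segment is itself a small hash table. put first computes the hash and uses it to locate the Segment, then locks that Segment and uses the same hash again to locate the bucket inside the Segment's table. The index is therefore computed twice: once into the segments array and once into the Segment's HashEntry table.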
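The two index computations can be tried out in isolation. This is a sketch under assumptions: SegmentIndexDemo is a made-up class, and the way segmentShift and segmentMask are derived from a concurrency level is not quoted in these notes, so the constructor below is an illustrative reconstruction. Only the two index expressions are taken from the put methods above.

```java
// Hypothetical demo of the two-level indexing used by the JDK 1.7 put path.
class SegmentIndexDemo {
    final int segmentShift;
    final int segmentMask;

    // Assumption: the segment count is the concurrency level rounded up to a power of two.
    SegmentIndexDemo(int concurrencyLevel) {
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        segmentShift = 32 - sshift;   // keep the top sshift bits of the hash
        segmentMask = ssize - 1;
    }

    // First index: which segment (same expression as in put).
    int segmentIndex(int hash) {
        return (hash >>> segmentShift) & segmentMask;
    }

    // Second index: which bucket inside the segment's table (same expression as in Segment.put).
    static int bucketIndex(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        SegmentIndexDemo demo = new SegmentIndexDemo(16);
        int hash = 0xF0000003;
        System.out.println(demo.segmentIndex(hash)); // 15: taken from the high bits
        System.out.println(bucketIndex(hash, 2));    // 1: taken from the low bits
    }
}
```

Using the high bits for the segment and the low bits for the bucket means the two indexes are not forced to be correlated.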

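IV: ConcurrentHashMap in JDK 1.8

ConcurrentHashMap in JDK 1.8 has the same structure as HashMap in JDK 1.8 (array, linked lists, and red-black trees). The difference is in how updates are handled.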

    public V put(K key, V value) {
        return putVal(key, value, false);
    }

    /** Implementation for put and putIfAbsent */
    final V putVal(K key, V value, boolean onlyIfAbsent) {
        if (key == null || value == null) throw new NullPointerException();
        int hash = spread(key.hashCode());      // compute the hash
        int binCount = 0;
        for (Node<K,V>[] tab = table;;) {   // retry loop
            Node<K,V> f; int n, i, fh;
            if (tab == null || (n = tab.length) == 0)       // the table is null or empty
                tab = initTable();                          // initialize it
            else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {    // the bin at index i is empty
                if (casTabAt(tab, i, null,                           // try to install the new node at i with a CAS
                             new Node<K,V>(hash, key, value, null)))
                    break;                   // no lock when adding to empty bin
            }
            else if ((fh = f.hash) == MOVED)    // the bin is not empty and f.hash == MOVED (the constant -1)
                tab = helpTransfer(tab, f);
            else {                              // an ordinary bin
                V oldVal = null;
                synchronized (f) {              // lock the head node of the bin
                    if (tabAt(tab, i) == f) {
                        if (fh >= 0) {          // the hash of the list head is >= 0
                            binCount = 1;
                            for (Node<K,V> e = f;; ++binCount) {
                                K ek;
                                if (e.hash == hash &&
                                    ((ek = e.key) == key ||
                                     (ek != null && key.equals(ek)))) {     // same key
                                    oldVal = e.val;
                                    if (!onlyIfAbsent)      // onlyIfAbsent is false
                                        e.val = value;      // the new value overwrites the old one
                                    break;
                                }
                                Node<K,V> pred = e;
                                if ((e = e.next) == null) {     // end of the list without finding the key: append a node at the tail
                                    pred.next = new Node<K,V>(hash, key,
                                                              value, null);
                                    break;
                                }
                            }
                        }
                        else if (f instanceof TreeBin) {        // the bin is a tree
                            Node<K,V> p;
                            binCount = 2;
                            if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                           value)) != null) {
                                oldVal = p.val;
                                if (!onlyIfAbsent)
                                    p.val = value;
                            }
                        }
                    }
                }
                if (binCount != 0) {
                    if (binCount >= TREEIFY_THRESHOLD)
                        treeifyBin(tab, i);
                    if (oldVal != null)
                        return oldVal;
                    break;
                }
            }
        }
        addCount(1L, binCount);
        return null;
    }

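When putting an element, ConcurrentHashMap first checks whether the target bin is empty. If it is, it tries to install the new node with a single CAS and without taking a lock. If the CAS fails because another thread got there first, the loop retries; this time the bin is no longer empty, so the thread locks the head of the linked list or red-black tree with synchronized and inserts the element into that list or tree.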
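The CAS-then-lock pattern can be sketched without Unsafe by using AtomicReferenceArray. This is a deliberately simplified illustration, not the JDK implementation: CasBinDemo is a made-up class with a fixed 16-slot table, no resizing, no trees, and no removal.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical, simplified map: CAS into an empty bin, lock the head otherwise.
class CasBinDemo<K, V> {
    static final class Node<K, V> {
        final int hash;
        final K key;
        volatile V val;
        volatile Node<K, V> next;

        Node(int hash, K key, V val, Node<K, V> next) {
            this.hash = hash; this.key = key; this.val = val; this.next = next;
        }
    }

    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    static int spread(int h) {
        return (h ^ (h >>> 16)) & 0x7fffffff;   // mix high bits in, keep the result non-negative
    }

    V put(K key, V value) {
        int hash = spread(key.hashCode());
        int i = (table.length() - 1) & hash;
        for (;;) {                                       // retry loop
            Node<K, V> f = table.get(i);
            if (f == null) {
                // Empty bin: try to publish the node with one CAS, no lock.
                if (table.compareAndSet(i, null, new Node<K, V>(hash, key, value, null)))
                    return null;
                // CAS lost a race; loop again and take the locked path.
            } else {
                synchronized (f) {                       // lock the head of the bin
                    if (table.get(i) == f) {             // head unchanged since we read it
                        for (Node<K, V> e = f;;) {
                            if (e.hash == hash && key.equals(e.key)) {
                                V old = e.val;
                                e.val = value;           // same key: overwrite
                                return old;
                            }
                            Node<K, V> pred = e;
                            if ((e = e.next) == null) {  // end of list: append
                                pred.next = new Node<K, V>(hash, key, value, null);
                                return null;
                            }
                        }
                    }
                }
            }
        }
    }

    V get(K key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key))
                return e.val;
        return null;
    }
}
```

Because the lock is taken on the head node of a single bin, threads that write to different bins do not block each other, which is a finer granularity than the per-Segment lock of JDK 1.7.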

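V: Hash collisions

Every put computes a hash and, from it, a bucket index. Different keys can end up with the same index and therefore land in the same array slot; this is a hash collision.

Two keys with equal hashes always collide, but equal hashes are rare. Equal indexes are much more common, because the index keeps only the low bits of the hash, so most collisions come from different hashes that map to the same index. HashMap resolves collisions with separate chaining, i.e. a linked list per bucket: a value is overwritten only when the keys are equal; otherwise the new entry is added to the bucket's list.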
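Both kinds of collision are easy to reproduce. In this sketch, CollisionDemo and build() are hypothetical names; "Aa" and "BB" are a well-known pair of strings with the same hashCode, and indexFor is the mask from the JDK 1.7 source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of equal-hash and equal-index collisions.
class CollisionDemo {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Puts two colliding keys, then overwrites one of them.
    static Map<String, Integer> build() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);   // same hash as "Aa", different key: chained in the same bucket
        map.put("Aa", 3);   // equal key: overwrites the value 1
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode());  // true, both are 2112
        System.out.println(indexFor(1, 16) == indexFor(17, 16)); // true, different hashes, same index
        Map<String, Integer> map = build();
        System.out.println(map.size());     // 2
        System.out.println(map.get("Aa"));  // 3
        System.out.println(map.get("BB"));  // 2
    }
}
```

The map ends up with two entries in one bucket, and only the put with an equal key replaced a value.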