HBase: setCaching, setBatch, and setMaxResultSize

Using Scan.setBatch()

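        import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
        import org.apache.hadoop.hbase.client.{Connection, ConnectionFactory, ResultScanner, Scan, Table}
        import org.apache.hadoop.hbase.util.Bytes

        val conf = HBaseConfiguration.create()
        // Keep a reference to the connection so it can be closed at the end
        val connection: Connection = ConnectionFactory.createConnection(conf)
        val table: Table = connection.getTable(TableName.valueOf("user"))

        val scan = new Scan()
        // Only one column is selected here, so each Result holds a single cell;
        // select more columns to see the batch limit take effect
        scan.addColumn(Bytes.toBytes("info"), Bytes.toBytes("gender"))
        // Return at most 2 cells per Result
        scan.setBatch(2)

        val scanner: ResultScanner = table.getScanner(scan)
        try {
          var res = scanner.next()
          while (res != null) {
            // Print the number of cells in this Result
            println(res.listCells().size())
            res = scanner.next()
          }
        } finally {
          scanner.close(); table.close(); connection.close()
        }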

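Get the HBase connection.
Get a handle to the table.
Get the scanner.
Call the scanner's next() method to fetch each Result. The number of cells in each Result is determined by the batch size:
When the batch size is smaller than the number of columns in a row, each Result holds batch cells.
When the batch size is greater than or equal to the number of columns, each Result holds all of the row's columns.
As formulas:
Cells per Result = Min(columns per row, batch size)
Number of Results = (rows * columns per row) / Min(columns per row, batch size)
The second formula is exact when the batch size divides the column count evenly. Otherwise the last Result of each row holds the leftover cells, and the count becomes rows * ceil(columns per row / batch size).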
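
The two formulas above can be checked with a few lines of plain Scala. This is a minimal sketch that needs no HBase dependency; the object and method names (BatchMath, cellsPerResult, resultsPerRow, totalResults) are made up for illustration, and it assumes every row has the same number of columns and that the batch size is positive. It rounds up per row, since a Result never spans two rows.

```scala
object BatchMath {
  // Cells in each full Result: capped by the row width, because a Result never spans rows.
  def cellsPerResult(colsPerRow: Int, batch: Int): Int = {
    require(colsPerRow > 0 && batch > 0, "colsPerRow and batch must be positive")
    math.min(colsPerRow, batch)
  }

  // Results produced for one row, rounding up so a trailing partial chunk still counts.
  def resultsPerRow(colsPerRow: Int, batch: Int): Int = {
    val chunk = cellsPerResult(colsPerRow, batch)
    (colsPerRow + chunk - 1) / chunk
  }

  // Results produced by the whole scan (hypothetical helper, assumes uniform rows).
  def totalResults(rows: Int, colsPerRow: Int, batch: Int): Int =
    rows * resultsPerRow(colsPerRow, batch)
}
```

For example, with 10 rows of 20 columns each, BatchMath.totalResults(10, 20, 10) gives 20 and BatchMath.totalResults(10, 20, 100) gives 10, because a batch larger than the row simply returns one Result per row.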

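Using Scan.setCaching()
For an HBase table with two column families and 10 rows, where each row has 10 columns in each family (200 cells in total), HBase: The Definitive Guide gives a table relating the caching and batch settings to the number of Results and RPCs [table image not available].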

As the table shows, batch determines how many Results are returned, while caching (the number of Results buffered per RPC) determines the number of RPCs.
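
To make the relationship concrete, here is a rough estimate in plain Scala of how many data-carrying RPCs a scan needs for a given batch and caching value. It is a sketch under simplifying assumptions: uniform rows, positive batch and caching values, and no size limit cutting a response short. Any extra calls for opening or closing the scanner are deliberately not counted, and the names ScanRpcEstimate, resultCount and dataRpcs are hypothetical.

```scala
object ScanRpcEstimate {
  // Number of Results: each row is split into chunks of at most `batch` cells.
  def resultCount(rows: Int, colsPerRow: Int, batch: Int): Int = {
    require(rows >= 0 && colsPerRow > 0 && batch > 0)
    val chunk = math.min(colsPerRow, batch)
    rows * ((colsPerRow + chunk - 1) / chunk)
  }

  // Data-carrying RPCs: each call returns at most `caching` Results (rounded up).
  def dataRpcs(results: Int, caching: Int): Int = {
    require(results >= 0 && caching > 0)
    (results + caching - 1) / caching
  }
}
```

For the 10-row, 20-column table described above, batch 10 with caching 10 gives 20 Results fetched in 2 data RPCs, while batch 1 with caching 1 gives 200 Results and 200 data RPCs.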

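Some blog posts say the caching value is a number of rows. That is not entirely accurate: caching counts Results, and once a batch size is set, a single row can be split across several Results. Caching also cannot be made arbitrarily large, because each HBase RPC has a time limit; if the data cannot be read within that limit, the client gets a connection exception.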
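
As a configuration sketch, the three settings named in the title can be applied to one Scan as follows. All numeric values are arbitrary placeholders rather than recommendations, the object name ScanTuning is made up, and the use of the hbase.client.scanner.timeout.period property as the relevant time limit is an assumption that should be checked against the HBase version in use. The code needs hbase-client on the classpath.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Scan

object ScanTuning {
  // Client configuration with a longer scanner timeout (value in milliseconds, placeholder).
  def tunedConf(): Configuration = {
    val conf = HBaseConfiguration.create()
    conf.setInt("hbase.client.scanner.timeout.period", 120000)
    conf
  }

  // A Scan that bounds each RPC by cell count, Result count, and bytes.
  def tunedScan(): Scan = {
    val scan = new Scan()
    scan.setBatch(10)                       // at most 10 cells per Result
    scan.setCaching(50)                     // at most 50 Results per RPC
    scan.setMaxResultSize(2L * 1024 * 1024) // byte cap on the data fetched per call (placeholder: 2 MB)
    scan
  }
}
```

Keeping caching moderate and adding a byte cap means a scan over wide rows is less likely to build a response so large that it cannot be returned within the time limit.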

https://blog.csdn.net/lidaxueh_heart/article/details/82763357

https://blog.csdn.net/weixin_37275456/article/details/89847965
