Hive Bug修复:ORC表中array数据类型长度超过1024报异常
目前HVIE里查询如下语句报错:
select * from dw.ticket_user_mtime limit 10;
错误如下:
17/07/06 16:45:38 [main]: DEBUG impl.RecordReaderImpl: merge = [{data range [22733, 19927580), size: 19904847 type: array-backed}]
Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 1024
17/07/06 16:45:38 [main]: ERROR CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 1024
java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 1024
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:517)
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:144)
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1885)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:484)
... 15 more
hive 补丁位置https://issues.apache.org/jira/browse/HIVE-14483
该BUG触发的条件是,orc表中,array数据类型,并且当array这个字段中数组的元素超过1024个。
可以通过打补丁的方式修复,重新编译hive的源码得到 hive-exec-2.1.0.jar 以及hive-orc-2.1.0.jar,
将这两个jar包更新到所有线上的hive客户端的lib,然后需要重启hive相关的服务(轮流重启hive metasotre,然后轮流重启hiveserver2)。
最新文章
- CF memsql Start[c]UP 2.0 A
- ubuntu14.04切换root用户
- 解决Android Studio 和 Android SDK Manager 无法在线更新的问题.
- 微软BI 之SSIS 系列 - 使用 Script Component Destination 和 ADO.NET 解析不规则文件并插入数据
- Linux命令学习-mpstat
- 对List
- 表单提交是ajax提交,PC提交没问题但是手机提交就会一直跳到error,并且也没状态码一直是0
- fitnesse - Variables and Symbols
- jQuery选择器(子元素过滤选择器)第七节
- kafka数据迁移实践
- EF Core 快速上手——创建应用的DbContext
- python threading 用法
- SQL Server最大内存设为0后的处置办法
- postgresql数据库3种程序(rule,trigger ,FUNCTION )
- 【读书笔记】iOS-iOS流媒体
- js有关事件驱动
- ssh-copy-id命令解析
- 接口文档管理系统mindoc安装手册
- C 语言与动态库相关基础知识
- iOS - 获取安装所有App的Bundle ID
热门文章
- error C4996: 'GetVersionExW': 被声明为已否决
- 在离线安装gazebo的时候可能在运行turtlebot_gazebo的时候会出现问题
- 解决maven update project 后项目jdk变成1.5的问题
- Web Api 2 认证与授权 2
- OneZero第三周——预完成功能点统计
- JS高级-***Function- ***OOP
- idea 设置编译的编码方式
- php-fpm超时时间设置request_terminate_timeout分析
- 2019.01.09 bzoj3697: 采药人的路径(点分治)
- ACM-ICPC 2018 徐州赛区网络预赛 A Hard to prepare