Lucene 4.6.1 java.lang.IllegalStateException: TokenStream contract violation
This exception appears when old code runs against a newer version of Lucene. The exception is as follows:
Exception in thread "main" java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:110)
at java.io.Reader.read(Reader.java:140)
at org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124)
at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:122)
at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:78)
at com.hankcs.train.IKHelper.parse(IKHelper.java:36)
at com.hankcs.train.AnalysisAdjuster.handleFile(AnalysisAdjuster.java:44)
at com.hankcs.train.AnalysisAdjuster.main(AnalysisAdjuster.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Process finished with exit code 1
Old code:

    IKAnalyzer ss = new IKAnalyzer();
    StringReader reader = new StringReader(str);
    try {
        TokenStream tokenStream = ss.tokenStream("", reader);
        // No reset() before the first incrementToken(): this is what triggers the exception
        while (tokenStream.incrementToken()) {
            CharTermAttribute termAttribute = tokenStream.getAttribute(CharTermAttribute.class);
            System.out.println(termAttribute.toString());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
According to the new API documentation, the TokenStream API must be consumed in the following order:

The workflow of the new TokenStream API is as follows:

1. Instantiation of TokenStream/TokenFilters which add/get attributes to/from the AttributeSource.
2. The consumer calls reset().
3. The consumer retrieves attributes from the stream and stores local references to all attributes it wants to access.
4. The consumer calls incrementToken() until it returns false consuming the attributes after each call.
5. The consumer calls end() so that any end-of-stream operations can be performed.
6. The consumer calls close() to release any resource when finished using the TokenStream.
So the code must call reset() once before the first call to incrementToken():

    IKAnalyzer ss = new IKAnalyzer();
    StringReader reader = new StringReader(str);
    try {
        TokenStream tokenStream = ss.tokenStream("", reader);
        // Required by the new workflow: reset() once before consuming tokens
        tokenStream.reset();
        while (tokenStream.incrementToken()) {
            CharTermAttribute termAttribute = tokenStream.getAttribute(CharTermAttribute.class);
            System.out.println(termAttribute.toString());
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
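The corrected snippet adds reset(), which is enough to make this exception go away. The workflow quoted from the Javadoc also lists end() and close(), though. The sketch below models the whole consumer loop without any Lucene dependency, so it runs on its own. ToyTokenStream, consume, and failsWithoutReset are hypothetical names invented for this illustration; they are not Lucene or IK Analyzer classes. The whitespace splitting is only a stand-in for real analysis. The top frame of the stack trace (Tokenizer$1.read) suggests that Lucene guards its reader in a similar spirit until reset() is called, but that is an assumption about its internals, not something this post verifies.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenStreamContractSketch {

    /** Hypothetical stand-in for a TokenStream; NOT a Lucene class. */
    static class ToyTokenStream implements AutoCloseable {
        private final String[] words;
        private int position = -1; // -1 means reset() has not been called yet
        private String term;

        ToyTokenStream(String text) {
            // Whitespace splitting is a placeholder for real tokenization
            this.words = text.trim().split("\\s+");
        }

        /** Step 2 of the workflow: prepare the stream for consumption. */
        void reset() {
            position = 0;
        }

        /** Step 4: advance to the next token; refuses to run before reset(). */
        boolean incrementToken() {
            if (position < 0) {
                throw new IllegalStateException("contract violation: reset() call missing");
            }
            if (position >= words.length) {
                return false;
            }
            term = words[position++];
            return true;
        }

        String term() {
            return term;
        }

        /** Step 5: end-of-stream hook; nothing to do in this toy model. */
        void end() {
        }

        /** Step 6: release resources; the stream must be reset() again before reuse. */
        @Override
        public void close() {
            position = -1;
        }
    }

    /** Consumes a stream in the documented order: reset, incrementToken loop, end, close. */
    static List<String> consume(String text) {
        List<String> terms = new ArrayList<>();
        try (ToyTokenStream stream = new ToyTokenStream(text)) { // close() runs automatically
            stream.reset();
            while (stream.incrementToken()) {
                terms.add(stream.term());
            }
            stream.end();
        }
        return terms;
    }

    /** Reproduces the mistake in the old code: incrementToken() without reset(). */
    static boolean failsWithoutReset(String text) {
        try (ToyTokenStream stream = new ToyTokenStream(text)) {
            stream.incrementToken();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(consume("hello token stream")); // prints [hello, token, stream]
        System.out.println(failsWithoutReset("hello"));    // prints true
    }
}
```

Applied to the real code, the same shape would mean calling tokenStream.end() after the loop and tokenStream.close() in a finally block (or via try-with-resources, if the TokenStream in your Lucene version is Closeable), so that the analyzer can be reused safely.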
Please credit the source when reposting: 码农场 » Lucene 4.6.1 java.lang.IllegalStateException: TokenStream contract violation