Symptom:

org.apache.lucene.queryParser.ParseException: Encountered "<EOF>" at line 1, column 0.
Was expecting one of:
<NOT> ...
"+" ...
"-" ...
"(" ...
<QUOTED> ...
<TERM> ...
<PREFIXTERM> ...
<WILDTERM> ...
"[" ...
"{" ...
<NUMBER> ...
at org.apache.lucene.queryParser.QueryParser.generateParseException(QueryParser.java:1226)
at org.apache.lucene.queryParser.QueryParser.jj_consume_token(QueryParser.java:1109)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:759)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:684)
at ch2.lucenedemo.process.Test.RunVsIndex(Test.java:142)
at ch2.lucenedemo.process.Test.main(Test.java:169)

Solution 1:

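If you see the error above, the wrong method was called: the stack trace shows Test.RunVsIndex invoking QueryParser.Query directly. Changing queryParser.Query to queryParser.parse makes the query parse successfully.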

Solution 2:

1. Question:

I am working on a classification problem to classify product reviews as positive, negative or neutral as per the training data using Lucene API.

I am using an ArrayList of Review objects - "reviewList" that stores the attributes for each review while crawling the web pages.

The review attributes, which include "polarity" and "review content", are then indexed using the indexer. Thereafter, based on the indexed objects, I need to classify the remaining review objects. But while doing so, there is a review object for which the query parser encounters an EOF in the "review content" and terminates.

The line causing error has been commented accordingly -

IndexReader reader = IndexReader.open(FSDirectory.open(new File("index")));
IndexSearcher searcher = new IndexSearcher(reader);
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_31);
QueryParser parser = new QueryParser(Version.LUCENE_31, "Review", analyzer);
int length = Crawler.reviewList.size();
for (int i = 200; i < length; i++) {
    String true_class;
    double r_stars = Crawler.reviewList.get(i).getStars();
    if (r_stars < 2.0) {
        true_class = "-1";
    } else if (r_stars > 3.0) {
        true_class = "1";
    } else {
        true_class = "0";
    }
    String[] reviewTokens = Crawler.reviewList.get(i).getReview().split(" ");
    String parsedReview = "";
    int j;
    for (j = 0; j < reviewTokens.length; j++) {
        if (reviewTokens[j] != null) {
            if (!((reviewTokens[j].contains("-")) || (reviewTokens[j].contains("!")))) {
                parsedReview += reviewTokens[j] + " ";
            }
        } else {
            break;
        }
    }
    Query query = parser.parse(parsedReview); // CAUSING ERROR!!
    TopScoreDocCollector results = TopScoreDocCollector.create(5, true);
    searcher.search(query, results);
    ScoreDoc[] hits = results.topDocs().scoreDocs;

I've parsed the text manually to remove the characters that are causing the error, apart from checking if the next string is null...but the error persists.

This is the error stack trace -

Exception in thread "main" org.apache.lucene.queryParser.ParseException: Cannot parse 'I made the choice ... be all "thumbs ': Lexical error at line 1, column 938.  Encountered: <EOF> after : "\"thumbs "
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:216)
at Sentiment_Analysis.Classification.classify(Classification.java:58)
at Sentiment_Analysis.Main.main(Main.java:17)
Caused by: org.apache.lucene.queryParser.TokenMgrError: Lexical error at line 1, column 938. Encountered: <EOF> after : "\"thumbs "
at org.apache.lucene.queryParser.QueryParserTokenManager.getNextToken(QueryParserTokenManager.java:1229)
at org.apache.lucene.queryParser.QueryParser.jj_scan_token(QueryParser.java:1709)
at org.apache.lucene.queryParser.QueryParser.jj_3R_2(QueryParser.java:1598)
at org.apache.lucene.queryParser.QueryParser.jj_3_1(QueryParser.java:1605)
at org.apache.lucene.queryParser.QueryParser.jj_2_1(QueryParser.java:1585)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1280)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1266)
at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:1313)
at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:1266)
at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:1226)
at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:206)
... 2 more
Java Result: 1

Please help me solve this problem...have been banging my head with this for hours now!

2. Answer:

You should escape the double quote and other special characters via

Query query = parser.parse(QueryParser.escape(parsedReview));

As the QueryParser.escape Javadoc states:

Returns a String where those characters that QueryParser expects to be escaped are escaped by a preceding '\'.

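Summary: pass the text through the static method QueryParser.escape(String s) so that special characters are escaped automatically, then run the keyword query on the escaped string.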
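To make the escaping step concrete, here is a minimal stand-alone sketch of what it does to the failing input. It does not use Lucene: the class EscapeSketch, the method escapeQuerySyntax and the character list SPECIAL are hypothetical names for illustration, and the list is an assumption based on the Lucene 3.x query syntax, so it may differ from your version. In real code, call QueryParser.escape as shown in the answer above.

```java
// Stand-alone illustration only; real code should call
// org.apache.lucene.queryParser.QueryParser.escape(String) instead.
class EscapeSketch {

    // Assumed set of query-syntax characters (Lucene 3.x era).
    // Check the QueryParser Javadoc of your Lucene version for the exact list.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&";

    // Prefixes every query-syntax character with a backslash.
    static String escapeQuerySyntax(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Tail of the review from the stack trace: an opening quote with no closing quote.
        String review = "be all \"thumbs ";
        System.out.println(escapeQuerySyntax(review)); // prints: be all \"thumbs
    }
}
```

The filter in the question only drops tokens containing "-" or "!", so the unmatched double quote reaches the parser, which reads it as the start of a phrase that never ends. Once the quote is escaped, it is treated as a literal character.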

Original sources:

Symptom and Solution 1:

不设限, org.apache.lucene.queryParser.ParseException: Encountered "<EOF>" at line 1, column 0. https://blog.csdn.net/tengdazhang770960436/article/details/17881671

Solution 2:

https://stackoverflow.com/questions/10259907/lucene-exception-query-parser-encountered-eof-after-some-word
