Handling Spark Program Errors in IDEA
2024-09-08 16:09:37
Error 1:
// :: ERROR Executor: Exception in task 0.0 in stage 0.0 (TID )
java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at Person.<init>(RDD_To_DataFrame.scala:)
at RDD_To_DataFrame$.$anonfun$main$(RDD_To_DataFrame.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$$$anon$.hasNext(WholeStageCodegenExec.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:)
at org.apache.spark.scheduler.Task.run(Task.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
// :: ERROR TaskSetManager: Task in stage 0.0 failed times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task in stage 0.0 failed times, most recent failure: Lost task 0.0 in stage 0.0 (TID , localhost, executor driver): java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at Person.<init>(RDD_To_DataFrame.scala:)
at RDD_To_DataFrame$.$anonfun$main$(RDD_To_DataFrame.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$$$anon$.hasNext(WholeStageCodegenExec.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:)
at org.apache.spark.scheduler.Task.run(Task.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$.apply(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$.apply(DAGScheduler.scala:)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$.apply(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$.apply(DAGScheduler.scala:)
at scala.Option.foreach(Option.scala:)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:)
at org.apache.spark.util.EventLoop$$anon$.run(EventLoop.scala:)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:)
at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:)
at org.apache.spark.sql.Dataset$$anonfun$head$.apply(Dataset.scala:)
at org.apache.spark.sql.Dataset$$anonfun$head$.apply(Dataset.scala:)
at org.apache.spark.sql.Dataset$$anonfun$.apply(Dataset.scala:)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:)
at org.apache.spark.sql.Dataset.head(Dataset.scala:)
at org.apache.spark.sql.Dataset.take(Dataset.scala:)
at org.apache.spark.sql.Dataset.showString(Dataset.scala:)
at org.apache.spark.sql.Dataset.show(Dataset.scala:)
at org.apache.spark.sql.Dataset.show(Dataset.scala:)
at org.apache.spark.sql.Dataset.show(Dataset.scala:)
at RDD_To_DataFrame$.main(RDD_To_DataFrame.scala:)
at RDD_To_DataFrame.main(RDD_To_DataFrame.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:)
at java.lang.reflect.Method.invoke(Method.java:)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:)
Caused by: java.lang.NoSuchMethodError: scala.Product.$init$(Lscala/Product;)V
at Person.<init>(RDD_To_DataFrame.scala:)
at RDD_To_DataFrame$.$anonfun$main$(RDD_To_DataFrame.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at scala.collection.Iterator$$anon$.next(Iterator.scala:)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$$$anon$.hasNext(WholeStageCodegenExec.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$.apply(SparkPlan.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$$$anonfun$apply$.apply(RDD.scala:)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:)
at org.apache.spark.scheduler.Task.run(Task.scala:)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
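Fix: change the Scala SDK configured in IDEA to version 2.10.4.
This problem mainly shows up when the Spark program uses a case class. The static method scala.Product.$init$ only exists in the Scala 2.12 (and later) standard library, so this NoSuchMethodError means the case class was compiled with Scala 2.12 while the job ran against an older Scala library, the one the Spark jars were built for. Whatever version you pick, the Scala SDK in IDEA has to match the Scala version of the Spark build.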
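To confirm which Scala library a run configuration actually loads, a small check along these lines may help. It is only a sketch: the object name `ScalaVersionCheck` and the helper `binaryVersion` are made up for illustration, and it relies on nothing beyond `scala.util.Properties` from the standard library. Compare the printed version with the Scala version the Spark jars were built for, which usually appears as a suffix in the jar name (for example `_2.11`).

```scala
import scala.util.Properties

// Hypothetical helper object, not part of Spark or of the original program.
object ScalaVersionCheck {

  // Reduce a full version such as "2.11.8" to its binary version "2.11".
  def binaryVersion(full: String): String =
    full.split("\\.").take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    // Version of the scala-library that is on the classpath at run time.
    val runtimeVersion = Properties.versionNumberString
    println(s"Scala library at run time: $runtimeVersion")
    println(s"Binary version: ${binaryVersion(runtimeVersion)}")
  }
}
```

Run it with the same module and run configuration as the failing Spark program. If the binary version printed here differs from the suffix on the Spark jars, the two are mismatched and the IDEA Scala SDK should be changed.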
Error 2:
Error:(, ) No TypeTag available for (Array[String],)
val documentDF= spark.createDataFrame(Seq(
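Fix: change the Scala SDK configured in IDEA to version 2.12.3.
This problem mainly shows up when the Spark program builds a DataFrame from a Seq.
For example: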
// The id values were lost from the original listing; 0 to 4 are placeholders.
val df = spark.createDataFrame(Seq(
  (0, Array("soyo", "spark", "soyo2", "soyo", "")),
  (1, Array("soyo", "hadoop", "soyo", "hadoop", "xiaozhou", "soyo2", "spark", "", "")),
  (2, Array("soyo", "spark", "soyo2", "hadoop", "soyo3", "")),
  (3, Array("soyo", "spark", "soyo20", "hadoop", "soyo2", "", "")),
  (4, Array("soyo", "", "spark", "", "spark", "spark", ""))
)).toDF("id", "words")